The winter school on web search and data mining is a satellite event of WSDM 2015 (the Eighth ACM International Conference on Web Search and Data Mining). The goal of this winter school is to promote research on web search and data mining and to introduce the fundamentals, as well as recent advances in the field, to people interested in it, particularly students.
Four distinguished researchers, Jianfeng Gao, Noah Smith, Jure Leskovec, and Eric Xing, will give lectures on topics related to web search and data mining, including deep learning, natural language processing, social networks, and machine learning platforms.
The winter school will be held at Shanghai Jiaotong University (Shang Yuan 500, located at 800 Dongchuan Road) on Jan 31 – Feb 1, 2015, right before the WSDM 2015 conference. Details will be announced later.
Students, researchers, and industry practitioners are welcome to attend the winter school. Note that registration must be completed in advance. Due to limited space, we will only accept the first 200 applications. Register here.
The winter school will be held on Jan 31 and Feb 1.
Saturday Jan 31, 2015
Lecture 1: Deep Learning for Web Search and Natural Language Processing (Jianfeng Gao, Microsoft Research)
Lecture 2: Natural Language Processing: Algorithms and Applications, Old and New (Noah Smith, Carnegie Mellon University)
- 08:45am – 09:00am Opening ceremony and group photo
- 09:00am – 09:50am Lecture 1 Lesson 1
- 09:50am – 10:00am Break
- 10:00am – 10:50am Lecture 1 Lesson 2
- 10:50am – 11:00am Break
- 11:00am – 12:00pm Lecture 1 Lesson 3
- 12:00pm – 02:00pm Lunch
- 02:00pm – 02:50pm Lecture 2 Lesson 1
- 02:50pm – 03:00pm Break
- 03:00pm – 03:50pm Lecture 2 Lesson 2
- 03:50pm – 04:00pm Break
- 04:00pm – 05:00pm Lecture 2 Lesson 3
Sunday Feb 01, 2015
Lecture 3: Fundamentals of Large Scale Social and Information Network (Jure Leskovec, Stanford University)
Lecture 4: Distributed Systems and Algorithms for Scalable Machine Learning (Eric Xing, Carnegie Mellon University)
- 09:00am – 09:50am Lecture 3 Lesson 1
- 09:50am – 10:00am Break
- 10:00am – 10:50am Lecture 3 Lesson 2
- 10:50am – 11:00am Break
- 11:00am – 12:00pm Lecture 3 Lesson 3
- 12:00pm – 02:00pm Lunch
- 02:00pm – 02:50pm Lecture 4 Lesson 1
- 02:50pm – 03:00pm Break
- 03:00pm – 03:50pm Lecture 4 Lesson 2
- 03:50pm – 04:00pm Break
- 04:00pm – 05:00pm Lecture 4 Lesson 3
Four active and prominent researchers will give lectures at the winter school.
- Jianfeng Gao, Microsoft Research
- Jure Leskovec, Stanford University
- Noah Smith, Carnegie Mellon University
- Eric Xing, Carnegie Mellon University
Title: Deep Learning for Web Search and Natural Language Processing
Abstract: In this talk, we first survey the latest deep learning technology, presenting both theoretical and practical perspectives that are most relevant to our topic. Next, we review general problems and tasks in text/language processing, and underline the distinct properties that differentiate language processing from other tasks such as speech and image object recognition. More importantly, we highlight the general issues of language processing, and elaborate on how newly proposed deep learning technologies fundamentally address these issues. We then place particular emphasis on several important applications: 1) web search, 2) online recommendation, and 3) machine translation. For each of these tasks, we discuss which deep learning architectures are suitable given the nature of the task, and how learning can be performed efficiently and effectively using end-to-end optimization strategies. Beyond providing a systematic review of the general theory, we also present hands-on experience in building state-of-the-art systems. We will share our practice with concrete examples drawn from our first-hand experience with major research benchmarks and with some industrial-scale applications that we have worked on extensively in recent years.
Speaker: Jianfeng Gao is a Principal Researcher at Microsoft Research, Redmond, WA, USA. His research interests include Web search and information retrieval, natural language processing, and statistical machine learning. Dr. Gao is the primary contributor to several key modeling technologies that have significantly boosted the relevance of the Bing search engine. His research has also been applied to other Microsoft products, including Windows, Office, and Ads. In benchmark evaluations, he and his colleagues developed entries that placed first in the 2008 NIST Machine Translation Evaluation in Chinese-English translation. He was an Associate Editor of ACM Transactions on Asian Language Information Processing (2007–2010) and a member of the editorial board of Computational Linguistics (2006–2008). He has also served as an area chair for ACL-IJCNLP 2015, SIGIR 2014, IJCAI 2013, ACL 2012, EMNLP 2010, ACL-IJCNLP 2009, etc. Dr. Gao recently joined the Deep Learning Technology Center at MSR-NExT, working on Enterprise Intelligence.
Title: Fundamentals of Large Scale Social and Information Network
Speaker: Jure Leskovec is an assistant professor of Computer Science at Stanford University, where he is a member of the InfoLab and the AI lab. He joined the department in September 2009. In 2008/09 he was a postdoctoral researcher at Cornell University working with Jon Kleinberg and Dan Huttenlocher. He completed his Ph.D. in the Machine Learning Department, School of Computer Science at Carnegie Mellon University, under the supervision of Christos Faloutsos in 2008. He completed his undergraduate degree in computer science at the University of Ljubljana, Slovenia, in 2004. He also works with the Artificial Intelligence Laboratory, Jozef Stefan Institute, Ljubljana, Slovenia.
Title: Natural Language Processing: Algorithms and Applications, Old and New
Abstract: As long as we’ve had computers, we’ve dreamed of algorithms that take speech or text as input and make sound inferences. In this lecture, we’ll talk about why NLP is hard, and some of the big debates in the field. We’ll attempt to give a general description of how many modern NLP algorithms work. And finally, we’ll survey applications that are enjoying success right now, and areas where NLP may have a big impact in the near future.
Speaker: Noah Smith is Associate Professor of Language Technologies and Machine Learning in the School of Computer Science at Carnegie Mellon University. In fall 2015, he will join the University of Washington as Associate Professor of Computer Science & Engineering. He received his Ph.D. in Computer Science from Johns Hopkins University in 2006 and his B.S. in Computer Science and B.A. in Linguistics from the University of Maryland in 2001. His research interests include statistical natural language processing, especially unsupervised methods, machine learning for structured data, and applications of natural language processing. His book, Linguistic Structure Prediction, covers many of these topics. He has served on the editorial board of the journals Computational Linguistics (2009–2011), Journal of Artificial Intelligence Research (2011–present), and Transactions of the Association for Computational Linguistics (2012–present), and as the secretary-treasurer of SIGDAT (2012–present). His research group, Noah’s ARK, is currently supported by the NSF, DARPA, IARPA, ARO, and gifts from Amazon and Google. Smith’s work has been recognized with a Finmeccanica career development chair at CMU (2011–2014), an NSF CAREER award (2011–2016), a Hertz Foundation graduate fellowship (2001–2006), numerous best paper nominations and awards, and coverage by NPR, BBC, CBC, New York Times, Washington Post, and Time.
Title: Distributed Systems and Algorithms for Scalable Machine Learning
Speaker: Eric Xing is a professor in the School of Computer Science at Carnegie Mellon University. His principal research interests lie in the development of machine learning and statistical methodology, and of large-scale computational systems and architectures, for solving problems involving automated learning, reasoning, and decision-making in high-dimensional, multimodal, and dynamic possible worlds in complex systems. Professor Xing received a Ph.D. in Molecular Biology from Rutgers University, and another Ph.D. in Computer Science from UC Berkeley. His current work involves 1) foundations of statistical learning, including theory and algorithms for estimating time/space varying-coefficient models, sparse structured input/output models, and nonparametric Bayesian models; 2) frameworks for parallel machine learning on big data with big models in distributed systems or in the cloud; 3) computational and statistical analysis of gene regulation, genetic variation, and disease associations; and 4) applications of statistical learning in social networks, data mining, and vision. Professor Xing has published over 200 peer-reviewed papers. He is an associate editor of the Journal of the American Statistical Association, Annals of Applied Statistics, the IEEE Transactions on Pattern Analysis and Machine Intelligence, and the PLoS Journal of Computational Biology, and an Action Editor of the Machine Learning journal and the Journal of Machine Learning Research. He is a member of the DARPA Information Science and Technology (ISAT) Advisory Group, and a recipient of the NSF Career Award, the Alfred P. Sloan Research Fellowship, the United States Air Force Young Investigator Award, and the IBM Open Collaborative Research Faculty Award.