The WSDM 2019 Industry Day will take place on Monday 11 February.
The program will feature talks from a wide range of consumer-facing brand-name companies you know of, as well as talks from sectors you did not expect to see at WSDM. The talks will cover how machine learning has been put to use in practical scenarios, how user behaviour can be observed and used to improve systems in practice, how industrial pipelines can be optimized, and how scale is a challenge in more ways than the obvious.
Keynote: Entity-Based Task Completion and its Natural Language interface in Alexa
Alexa AI
Abstract: A significant portion of user requests in Alexa is about task completion, in particular requests fulfilled by taking actions on entities, such as listening to music and ordering food from restaurants. Through the voice interface, users interact with Alexa in natural language. In this talk, we show an end-to-end system that runs from understanding users’ utterances to invoking service providers to take actions. This includes: (1) organizing entities and service providers to enable actions (knowledge); (2) generating a semantic representation of user utterances in terms of intents and entity slots (interpretation); (3) resolving intents and entities to canonical values (resolution); (4) selecting and ranking service providers and routing user requests (action). On top of this technical stack, we build self-service tools to empower both first-party and third-party developers to ingest data/knowledge and build domain-specific ML models, with interoperability and composability support to enable cross-domain complex tasks via dialogues. At the end, we will also discuss the remaining challenges in our system, such as complex user utterances with multiple intents and nested entity slots, entity category search and recommendations, and spam detection.
Bio: Xiaodong Fan is the director of applied science in Alexa AI, where he leads the team working on ontology, entity linking and resolution, and catalogue ingestion. Previously, he was the Partner Group Engineering Manager in Bing. He has worked on image and video search, web graph analysis, web index selection and static ranking, spam and junk detection, knowledge graph, entity and local search, and personalized content recommendation. Xiaodong holds a PhD in computer vision from Johns Hopkins University.
‘No Interaction’ as Indicator of Search Satisfaction: Accounting for Good Abandonment in User Success Metrics
Microsoft Bing
Abstract: At Bing, measuring user success has always been a deciding factor as to which feature or change is shipped to production. Testing such changes is carried out through randomized controlled experiments, where success metrics are used to measure the treatment effect on user satisfaction. Over the years, we have designed and refined our metrics to capture various user interactions, from search queries to clicks and hovers, and interpreted them to predict users’ satisfaction with the search engine. One of the main scenarios that is hard to interpret is search result page abandonment, where the user doesn’t click on the page or interact with any specific element. In this scenario, we need to differentiate cases where the user abandoned because they got the information they needed without clicking on any results from those where the user abandoned because of a defective and/or unsatisfactory search result page. In this talk, we outline Bing’s journey in addressing this measurement problem. We trace the path from our initial effort, which treated the presence of specific elements on the page as an indicator of success, through our offline/online hybrid approach to identifying good abandonment, to a fully online solution that relies on a user’s behavior across their search session. We also cover the pitfalls of the different approaches, how we evaluate them, and the current challenges and problems left to solve.
Bio: Widad Machmouchi is a Principal Applied Researcher at Bing, Microsoft, where she works in the AI & Research group focusing on user modeling, A/B experimentation and success measurement. She is part of the Metrics team, where online metrics are developed to measure user satisfaction on Bing and are used as the main overall evaluation criteria (OEC) in almost all A/B experiments at Bing. She holds a PhD in Theoretical Computer Science from the University of Washington, Seattle and is a co-founder of a technology hardware start-up.
Demonstrating and Trialling of an Internet of Things Solution for Real-Time Computation and Delivery of Plant KPIs
Prem Prakash Jayaraman
Swinburne University of Technology, Australia
Abstract: With the rise of Industry 4.0, manufacturing industries are exploring new digital technologies to enhance the efficiency of their production processes and maintain their competitiveness. In this talk, I will present the outcomes of an industry-funded (Australian Meat Processing Corporation) Industrial Internet of Things (IIoT) research project to evaluate the individualised performance and productivity of workers in a meat processing plant. I will provide an overview of the key Industry 4.0 opportunities that we identified in the Australian meat processing sector. I will present the technical design of the IIoT solution developed to monitor worker performance and the outcomes of a two-day in-plant trial conducted at one of Australia’s leading meat manufacturers.
Bio: Dr. Prem Prakash Jayaraman is a Senior Lecturer and Head of the Digital Innovation Lab in the Department of Computer Science and Software Engineering, Faculty of Science, Engineering and Technology at Swinburne University of Technology. Previously, he was a Postdoctoral Research Scientist at CSIRO. Dr. Jayaraman has experience leading or co-leading several industrial research projects in domains such as digital agriculture, mining, manufacturing, food and beverages and smart cities. Prem is broadly interested in emerging areas of the Internet of Things (IoT) and Mobile and Cloud Computing. He is the recipient of several industry awards, including the Black Duck Rookie of the Year Award for the Open IoT project (www.openiot.eu). He has (co-)authored 70+ journal, conference and book chapter publications that have received 1200+ Google Scholar citations (h-index: 20), including two seminal papers in IoT and Industry 4.0.
Reinforcement Multi-Units Mechanism Design: Baidu’s Vision and Practice
Ruohan (Jeff) Qian
Abstract: The annual Internet advertising revenue in China skyrocketed to $42 billion in 2017. Improving the efficiency of the ads auction system is likely to benefit both advertisers and the platform. Founded in 2000, Baidu is the largest web search engine in China and one of the biggest internet advertising service providers. Fengchao, Baidu’s online ads distribution system, processes more than 2 billion auction requests each day, generating the majority of the company's revenue. Designing and implementing new algorithms for a mature and major system like Baidu Fengchao is very challenging. In this talk, I will give an overview of the research opportunities and challenges for the Internet ads auction system, followed by our field experience of how we bring reinforcement learning and multi-unit mechanisms into large-scale implementation.
Bio: Ruohan Qian (Jeff) is the head of Baidu's mechanism design team. He is responsible for the design and implementation of Baidu’s search ads auction system. His research areas include mechanism design, recommendation systems and reinforcement learning. He earned his master’s degree from the University of Science and Technology of China (USTC).
User Response Prediction at Scale
Walmart Labs
Abstract: Millions of users browse Walmart.com each day with varying levels of intent. Many of them end up making a purchase in the same session and most, well, do not. Display retargeting channels, with ads over the open web and your favourite social media sites, are then used to reach out to the potential customers with relevant content. The ad serving comes at a cost, and optimizing these costs becomes especially important given the huge scale. Predicting a user’s purchase (or click) propensity and bidding appropriately is crucial for reaching out to the right user with the right content at the right time. We discuss how we, at WalmartLabs, build the user propensity prediction models to efficiently bid for ad impressions. We start from ground zero - understanding data nuances and formulating the problem. We delve into the finer aspects of offline data crunching and building models and pipelines on top of petabytes of user data. We further elaborate on the critical stage of deploying models into the real world, where the model scores alone are not enough! We also discuss the effect of multiple user touchpoints on these models and how ‘robust’ algorithms come to the rescue.
Bio: Priyanka Bhatt is passionate about building intelligent machines, and has worked at WalmartLabs for the past 4.5 years. Experienced in building ML models for Bidding and Recommendations in the computational advertising domain, Priyanka obtained an ME degree from IISc Bangalore with specialization in Game Theory under Prof. Y. Narahari, and has publications in a range of Game Theory and ML conferences, including AAAI'15 and WWW'18.
Lessons Learned from Characterizing and Classifying Homes in Airbnb’s Marketplace
Abstract: Airbnb is a two-sided marketplace that brings together travelers and hosts to build a world where every person can belong anywhere. To achieve that mission, the Data Science Team contributes to all parts of the business and product, from helping hosts set prices optimally to routing customer issues to the right customer experience agent. Many of these problems are instances of applied research, and data scientists must consider how to balance long-term outcomes with two different but related constraints: 1) creating impact in the near term and 2) achieving the same impact in less time and/or with fewer person-hours. In this talk, we share how we deal with these two challenges through the case study of characterizing and classifying our highly diverse inventory. This is a prototypical applied research problem where scope is large and impact can extend to multiple parts of the business but carries substantial risk of “failure” depending on the execution. We will discuss 1) the importance of order of operations -- how our choice of milestones around simple use cases (e.g. hotel vs not a hotel) satisfied business needs and maintained a path for the long term, 2) the importance of foundational investments, such as well-designed ontologies and quality standards for labeling data, and how initial underinvestment led to poor results and wasted work, and 3) how off-the-shelf, tree-based models gave us competitive results with minimal effort and provided cover to invest in more sophisticated solutions. Along the way, we hope to share interesting challenges specific to operating a global marketplace of homes, including the heterogeneity of user-generated data and regional differences in the interpretation of accommodation types.
Bio: Gary Tang is a data scientist at Airbnb and leads a team of machine learning engineers focused on characterizing and classifying listings in the Airbnb marketplace. His past research includes statistical estimation in high dimensions and uncertainty quantification of computational fluid dynamics simulations. He has a PhD from Stanford University.
A Case Study on Microsoft’s Ruuh.ai: Is User Growth a Peril to Research Progress?
Microsoft AI and Research
Abstract: Striking a balance between business goals such as user growth and deep, meaningful research is always a challenging task in an industrial research setting. In this talk, taking Microsoft’s Ruuh as a case study, we will discuss the challenges and opportunities in the industry when it comes to research. Microsoft’s Ruuh was conceptualized about 2.5 years ago, and the main product promise of Ruuh is to be able to talk to its users on any subject they choose. We realized that the promise meant thinking beyond the utilitarian notion of merely generating “relevant” responses and enabling Ruuh to comprehend and meet a wider range of user social needs, like expressing happiness when the user's favorite team wins, sharing a cute comment when shown pictures of the user's pet, and so on. At the outset, this seems an impossible task to achieve, especially when coupled with aggressive release deadlines and pressure to grow usage. However, in this talk we will discuss how our research progress helped user growth and vice versa, and also discuss scenarios where we suffered setbacks. A good-quality product leads to high usage, which in turn provides the much-needed data to improve the research and understand the flaws in the current approach. At the same time, high usage of the product forces the team to focus on efficiency, cost per query and other infrastructure-related workloads. This talk will take real-world examples and explain these tradeoffs. More details of the talk are presented in the last section.
Bio: Puneet Agrawal is the Principal Engineering Manager in the AI & Research division at Microsoft, Hyderabad, India and Founder of Microsoft’s Ruuh.ai. He has more than eleven years of experience working in the fields of Machine Learning, IR, NLP and AI. During his time at Microsoft, he has actively led and developed several AI-powered products and features that have reached millions of users. He is especially passionate about creating products with a human-like personality and was the co-creator of Cortana's personality. He also holds several patents in related fields.
Reinforcement Learning for Recommender Systems: A Case Study on YouTube
Google Brain
Abstract: While reinforcement learning (RL) has achieved impressive advances in games and robotics, it has not been widely adopted in recommender systems. Framing recommendation as an RL problem offers new perspectives, but also faces significant challenges in practice. Industrial recommender systems deal with extremely large action spaces (many millions of items to recommend) and complex user state spaces (billions of users, each unique at any point in time). In this talk, I will discuss our work on scaling up a policy-gradient-based algorithm, namely REINFORCE, to a production recommender system at YouTube. We proposed algorithms to address data biases when deriving policy updates from logged implicit feedback. I will also discuss some follow-up work and outstanding research questions in applying RL, in particular off-policy optimization, in recommender systems.
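As a rough illustration of the idea of correcting for data bias when learning from logged feedback, the sketch below computes a REINFORCE gradient for a softmax policy, reweighted by a capped importance weight. It is not the speaker's production algorithm: the function names, the single-sample setup, the `behavior_prob` input (the logging policy's probability of the logged action) and the `weight_cap` value are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution over items."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def off_policy_reinforce_grad(logits, action, reward, behavior_prob, weight_cap=10.0):
    """Gradient of the importance-weighted REINFORCE objective w.r.t. the logits.

    logits        -- current policy scores, one per candidate item
    action        -- index of the item that was actually shown in the log
    reward        -- observed feedback for that item (e.g. 1.0 for a click)
    behavior_prob -- probability the logging policy assigned to `action`
    weight_cap    -- hypothetical cap on the importance weight to limit variance
    """
    probs = softmax(logits)
    # Importance weight: how much more (or less) likely the current policy
    # is to pick the logged action than the policy that produced the log.
    weight = min(probs[action] / behavior_prob, weight_cap)
    # For a softmax policy, d log pi(action) / d logit_k = 1[k == action] - pi(k).
    return [
        weight * reward * ((1.0 if k == action else 0.0) - p)
        for k, p in enumerate(probs)
    ]
```

With uniform logits over three items and a uniform logging policy, the importance weight is 1 and the gradient pushes the logged, rewarded item up while pushing the others down by an equal total amount.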
Bio: Minmin Chen is a Research Scientist in Google Brain. Her main research interests are in machine learning, currently focusing on reinforcement learning for recommendation systems and sequence modeling. Before that, she was a research scientist at Criteo Lab, building computational models for online advertising, and at Amazon, working on the Amazon Go project. She completed her PhD at Washington University in St. Louis on representation learning and domain adaptation. She has published over 20 papers at top conferences in machine learning such as NIPS, ICML, ICLR and AISTATS.
Short-Term Growth or Long-Term Reputation? Trade-off Between Bookings and Trip Quality in Airbnb’s Marketplace
Abstract: How should we trade off short-term growth against long-term reputation based on quality experience for users? Many tech companies face tremendous pressure to show immediate growth, since the evaluation logic of the financial markets is dictated by growth numbers. However, successful companies also understand they need to build a quality experience for their users for long-term reputation and sustained growth. Sometimes these two objectives are in conflict. At a macro level, time and resources devoted to quality are time and resources taken away from direct growth efforts. At a micro level, the launch of certain product features can improve user experience but at a slight expense of immediate growth. In this talk, I will discuss a trade-off framework between bookings and trip quality in Airbnb’s Marketplace, including how we define, measure and evaluate trip quality. I will discuss (1) how we define the value of trip quality; (2) how we can break down the value of trip quality into components of direct customer support resources, guest rebooking, and reputation via reviews as well as word-of-mouth; and (3) how we estimate the different components of value using causal inference methods. I will discuss how causal inference calls for a different approach and methodology from prediction algorithms, and discuss alternative methods and identification challenges. As an outcome of this framework, there will be no conflict between growth and quality, but a unified goal of sustained growth.
Bio: Jing Xia is a data scientist at Airbnb, and has worked in the market dynamics, host, and homes platform teams. Prior to Airbnb, she worked mostly in the area of policy consulting, such as evaluating the impact of Medicaid formulary policy or the impact of hospital mergers. Her expertise is in causal inference, econometrics, and market design. Jing earned her B.A. in mathematics from Yale College and Ph.D. in economics from Harvard University.
Billion-Scale Recommender Systems
Abstract: The problem of building a recommender system from implicit feedback is well studied and has many proposed solutions, from BPR to VAE-CF. But in real-world applications it may face significant constraints: billions of products and hundreds of millions of users, biased data sampling, user cold-start problems, and more. No state-of-the-art model is able to solve all of these, and we needed to implement something reliable and working in a relatively short time. In this talk we’ll walk you through the options we explored and how we managed to solve most of these challenging problems in a “good enough” way.
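For readers unfamiliar with BPR (Bayesian Personalized Ranking), one of the implicit-feedback approaches the abstract names, the following minimal sketch shows its pairwise loss for a single (user, interacted item, non-interacted item) triple under a dot-product scoring model. It only illustrates the general technique and says nothing about the system described in the talk; the function names and the plain-list embedding representation are assumptions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length embedding vectors."""
    return sum(x * y for x, y in zip(a, b))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bpr_loss(user_vec, pos_item_vec, neg_item_vec):
    """BPR loss for one triple: -log sigmoid(score_pos - score_neg).

    The loss is small when the item the user interacted with is scored
    well above the sampled item the user did not interact with.
    """
    diff = dot(user_vec, pos_item_vec) - dot(user_vec, neg_item_vec)
    # -log sigmoid(diff) equals softplus(-diff)
    return softplus(-diff)
```

When both items receive the same score the loss equals log 2, and it shrinks toward zero as the interacted item's score pulls ahead.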
Bio: Ivan Lobov is a self-taught researcher and engineer, and the author of a billion-scale distributed randomized SVD implementation in Spark. At Criteo, Ivan works on product similarities and personalized recommendations.
Gold Industry Day Supporter