Industry Day

Session 1: 10:00-11:30



Ido Guy, eBay; Royi Ronen, Microsoft

Good Data Scientist, Bad Economist - When ML Stops and Modeling Begins


Roy Sasson, Waze, Israel

Abstract: Economics has always been considered a "colonial" social science, in the sense that economists have offered insights not only on monetary policy and commerce, but also on education, health, transportation, housing, and even individuals' marriage decisions. In recent years, technology-driven applications have been at the forefront of many of the valuable new services in these traditional industries. Many of those services became so valuable to users thanks to the insights and models developed by data scientists working closely with engineering and product teams. However, at some point ML stops and modeling begins. In this talk, I will lay out some of the advanced principles and methodologies used by economists, and how we apply them in the Data Org at Waze, in the context of Waze's mission statement "to eliminate traffic altogether".

Bio: Roy manages an org of 30 data scientists, product analysts, and data engineers at Waze, accountable for the data services the company relies on while serving more than 130 million monthly users: from the data science behind Waze's ETA estimations, Waze's Carpool product analytics, Carpool's pricing strategy, fraud detection, and the recommendation system behind Carpool ranking, all the way to the BI systems and Waze's absolute sources of truth. During his career, Roy has established from scratch various data organizations of varying sizes and areas of expertise. In his free time, Roy has served as a lecturer at Tel Aviv University and IDC Herzliya for more than a decade.

Health Personalisation: From Wellbeing to Medicine


Shlomo Berkovsky, Centre for Health Informatics, Macquarie University, Australia

Abstract: The current agenda of health personalisation research mainly revolves around lifestyle and wellbeing. A number of works on personalised technologies for physical activity, food intake, mental support, health information consumption, and more have been published at recent conferences. While these mainly address the patient as the recipient of the personalised service, strikingly little attention has been paid to personalised medical applications targeting clinical users. In this talk, we turn the spotlight to such medical use-cases and the advantages personalisation can bring there. We will overview established health care processes and highlight the touch points where personalised support can improve clinicians' decision making. We will also discuss the differences between patient- and clinician-facing personalisation, particularly focussing on risk, trust, and explainability.

Bio: Shlomo Berkovsky is a Computer Scientist, with deep theoretical and applied expertise in several areas related to human-centric applications of Artificial Intelligence and Machine Learning. His original research areas include user modelling and personalised technologies. Currently, he leads the Precision Health research stream at the Centre for Health Informatics, Macquarie University. The stream focusses on the use of Machine Learning methods to develop patient models and personalised predictions of diagnosis and care, and studies how sensors can be deployed to predict medical conditions, and how clinicians and patients interact with health technologies.

A Qualitative Approach for Handling FAQ Pages for Chatbot Usage


Michal Shmueli-Scheuer, IBM Research, Israel

Abstract: In this talk, I will present a system for automatically building and maintaining conversational chatbot data, based on FAQ pages. It was developed to provide reliable, accurate, and up-to-date information on the COVID-19 pandemic. In particular, I will focus on the technology for identifying FAQ pages and extracting question-answer pairs. I'll present our algorithms, and describe how we constructed a new, large-scale, multi-lingual dataset to evaluate the quality of the algorithms, and report our results on this dataset. These algorithms power the IBM COVID-19 Kit, which is implemented as part of an industrial chatbot platform. It allows users to generate and maintain chatbots by crawling a curated set of authoritative sources along with additional sources they specify. Finally, I'll discuss current limitations and some practical challenges.

Bio: Michal is a senior researcher in the Language and Retrieval research group at IBM Research AI. Her expertise is in NLP, including conversational bots, summarization, and affective computing. Michal received her PhD from the University of California, Irvine, USA. She has published more than 30 academic papers in leading conferences and journals, as well as book chapters. She has served as a PC member and reviewer for numerous leading conferences and journals, and was an organizer of the 1st and 2nd user2agent workshops at IUI 2019 and 2020.



Session 2: 12:00-13:40

Optimizing Recommender Systems for Multi-stakeholder Objectives


Rishabh Mehrotra, Spotify, UK

Abstract: Multi-sided marketplaces are digital platforms that facilitate efficient interactions between multiple stakeholders, with customers not only on the demand side (e.g. users) but also on the supply side (e.g. hosts, retailers, delivery partners). While traditional recommender systems focus specifically on increasing consumer satisfaction by providing relevant content, multi-sided marketplaces involve interaction between multiple stakeholders and face the interesting problem of optimizing for multiple stakeholder objectives. In this talk we discuss a number of ML problems that need to be addressed when developing a search and recommendation framework powering multi-stakeholder marketplaces. First, we discuss algorithmic techniques for multi-objective ranking and recommendation that jointly optimize the different objectives. Second, we discuss different ways in which stakeholders specify their objectives. Third, we discuss user- and content-specific characteristics that can be leveraged while jointly optimizing such models. Furthermore, we discuss the evaluation of such systems and present numerous industrial case studies. We share insights from large-scale experimentation and deployment of multi-objective methods at Spotify, and identify key directions for future research.
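As a toy illustration of jointly optimizing several stakeholder objectives in one ranking, per-item objective scores can be scalarized with stakeholder weights. This is a hedged sketch of the general idea only: the objective names, scores, and weights below are invented and are not Spotify's production method.

```python
# Toy weighted scalarization of multiple stakeholder objectives into a
# single ranking score. All names and numbers are illustrative.

def rank_items(items, weights):
    """Sort (item_id, {objective: score}) pairs by a weighted objective sum."""
    def combined(scores):
        return sum(w * scores.get(name, 0.0) for name, w in weights.items())
    return sorted(items, key=lambda item: combined(item[1]), reverse=True)

catalog = [
    ("track_a", {"relevance": 0.9, "artist_exposure": 0.1, "diversity": 0.2}),
    ("track_b", {"relevance": 0.6, "artist_exposure": 0.8, "diversity": 0.7}),
    ("track_c", {"relevance": 0.4, "artist_exposure": 0.9, "diversity": 0.9}),
]

# A consumer-only weighting ranks purely by relevance...
consumer_first = rank_items(catalog, {"relevance": 1.0})
# ...while a marketplace-aware weighting trades some relevance for supplier goals.
marketplace = rank_items(catalog, {"relevance": 0.5, "artist_exposure": 0.3, "diversity": 0.2})
```

Changing the weight vector reorders the list, which is exactly the tension between consumer satisfaction and supply-side objectives that the talk addresses.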

Bio: Rishabh Mehrotra is a Senior Research Scientist at Spotify in London. He obtained his PhD in Machine Learning and Information Retrieval from University College London, where he was partially supported by a Google Research Award. His PhD research focused on the inference of search tasks from search and conversational interaction logs. His current research focuses on machine learning for marketplaces, bandit-based recommendations, and multi-objective modeling of recommenders. Some of his recent work has been published at conferences including KDD, WWW, SIGIR, NAACL, RecSys, and WSDM. He has co-taught a number of tutorials at leading conferences including KDD, RecSys, WWW, and CIKM, and has taught courses at summer schools.

Neural Databases


Fabrizio Silvestri, Facebook / Sapienza University of Rome

Abstract: We asked ourselves a question: can we build a database management system that doesn't rely on the fundamental concept of a schema? In recent years, neural networks have shown impressive performance gains on long-standing AI problems, in particular answering queries from natural language text. These advances raise the question of whether they can be extended to the point where we can relax the fundamental assumption of database management, namely that our data is represented as fields of a predefined schema. We present a first step in answering that question and describe NeuralDB: a database system with no predefined schema. In NeuralDB, updates and (select) queries are given in natural language. We develop query processing techniques that build on the primitives offered by state-of-the-art Natural Language Processing methods. We begin by demonstrating that, at their core, recent NLP transformers powered by pre-trained language models can answer select-project-join queries if they are given the exact set of relevant facts. However, they cannot scale to non-trivial databases and cannot perform aggregation queries. Based on these findings, we describe a NeuralDB architecture that runs multiple Neural Select-Project-Join (SPJ) operators in parallel, each with a set of database sentences that can produce one of the answers to the query. The results of these operators are fed to an aggregation operator if needed. We describe an algorithm that learns how to create the appropriate sets of facts to be fed into each of the Neural SPJ operators. Importantly, this algorithm can be trained by the Neural SPJ operator itself. We experimentally validate the accuracy of NeuralDB and its components, showing that we can answer queries over thousands of sentences with very high accuracy.

Bio: Fabrizio Silvestri is a full professor at the Computer Engineering Department of Sapienza University of Rome. He was previously a Research Scientist at Facebook AI in London. His interests are in AI applied to integrity-related problems and in applications of Natural Language Processing. In the past he worked on web search research; in particular, his specialization is building systems that better interpret queries from search users. Prior to Facebook, Fabrizio was a principal scientist at Yahoo, where he worked on sponsored search and native ads within the Gemini project. Fabrizio holds a Ph.D. in Computer Science from the University of Pisa, Italy, where he studied problems related to Web Information Retrieval, with a particular focus on efficiency-related problems such as caching, collection partitioning, and distributed IR in general.

Transfer Learning with Knowledge Distillation and Its Applications to Real-world High-scale Recommender Systems


Gil Chamiel, Taboola, Israel

Abstract: Transfer learning is a popular technique in machine learning, aiming to improve a model's performance using knowledge provided by another model that was trained to solve a different but related problem. At Taboola, where we have over a hundred models trained and deployed in production each day, transfer learning is a powerful tool for making our recommendation system better. For example, our data consists of various clusters that we can exploit in order to transfer knowledge from one model to another. Additionally, we can benefit from transfer learning techniques in order to reduce computational resources or train models on fresher data. In this talk, we will discuss several transfer learning techniques, with a focus on the differences between transferring weights from pre-trained models (also known as warm-starting) and knowledge distillation (also known as the student-teacher approach). We will discuss how these techniques can be used to improve a recommendation system, and lessons learned from using them in a real-world high-scale production system.
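Since the talk contrasts warm-starting with the student-teacher approach, here is a minimal numpy sketch of a standard distillation loss. It is illustrative only: the logits, temperature, and mixing weight are invented, and this is not Taboola's implementation.

```python
import numpy as np

# Minimal student-teacher (knowledge distillation) loss sketch.
# All shapes, logits, and hyperparameters are invented for illustration.

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, hard_labels, T=2.0, alpha=0.5):
    """Combine a hard-label cross-entropy term with a soft term that pushes the
    student's temperature-smoothed distribution toward the teacher's."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student_T = np.log(softmax(student_logits, T) + 1e-12)
    soft = -(p_teacher * log_p_student_T).sum(axis=-1).mean()  # soft-target CE
    log_p_student = np.log(softmax(student_logits) + 1e-12)
    hard = -log_p_student[np.arange(len(hard_labels)), hard_labels].mean()
    return alpha * hard + (1 - alpha) * (T ** 2) * soft

teacher = np.array([[5.0, 1.0, 0.0]])        # teacher is confident in class 0
labels = np.array([0])
close_student = np.array([[4.8, 1.1, 0.1]])  # mimics the teacher
far_student = np.array([[0.0, 5.0, 1.0]])    # disagrees with the teacher
```

A student whose logits track the teacher's incurs a lower combined loss; that gradient signal, rather than copied weights, is what distinguishes distillation from warm-starting.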

Bio: Gil Chamiel is VP of Algorithms and Data Science at Taboola. He holds a PhD in Computer Science (AI) from the University of New South Wales, Australia, in the area of personalization and preference elicitation. Gil is a Taboola veteran, having worked on Taboola's core algorithmic engine for over 10 years.

Developing and Deploying a Recommender Model for Continuous Systematic Literature Review (SLR)


Omri Mendels, Microsoft, Israel

Abstract: Systematic Literature Review (SLR) is the process of systematically exploring scientific studies in the literature and identifying the most meaningful ones in answer to a specific scientific question. Once completed, maintaining an SLR with the latest relevant content is a necessary but labor-intensive process. We propose a framework for developing, evaluating, and deploying ML models that leverage textual and non-textual features to estimate the similarity between new studies and existing SLRs. We will cover the approaches for modeling, the data pipelines for data enrichment using the Microsoft Academic Graph, the deployment aspects of the ML models in the production system, and the results achieved on various datasets. This work is a collaboration between Microsoft and University College London's EPPI Centre.

Bio: Omri Mendels is a Sr. Data Scientist at Microsoft. In his current role on the Commercial Software Engineering team, he leads AI/ML engagements with strategic Microsoft customers. Before Microsoft, Omri spent 6 years at Intel working on wearable devices and big data analytics. He holds an M.Sc. from Ben-Gurion University in Israel.



Session 3: 14:10-15:50

Grounded Domain Exploration Dialogues with Dynamic Composition


Idan Szpektor, Google, Israel

Abstract: We study conversational domain exploration (CODEX), where the user's goal is to enrich her knowledge of a given domain by conversing with an informative bot. Such conversations should be well grounded in high-quality domain knowledge as well as engaging and open-ended. A CODEX bot should be proactive and introduce relevant information even if not directly asked for by the user. The bot should also appropriately pivot the conversation to undiscovered regions of the domain. While seq2seq models such as GPT-3 and Meena have achieved human-like conversation quality, their output is not grounded in factual knowledge, and these models still frequently generate untrue facts. These hallucinations prevent the use of such models in production user-facing systems. Retrieval-based methods, on the other hand, are more reliable and are used in production systems, but are constrained to the responses stored in their indexes. In this presentation we'll introduce a novel approach, termed dynamic composition, that decouples candidate content generation from the flexible composition of bot responses. This allows the bot to control the source, correctness, and quality of the offered content, while achieving flexibility via a dialogue manager that selects the most appropriate contents in a compositional manner. We implemented a CODEX bot based on dynamic composition and integrated it into the Google Assistant. We will present the domains we've experimented with, and discuss challenges and next steps.

Bio: Dr. Idan Szpektor is a staff research scientist at Google Research Israel, leading the Conversational AI team. Previously, he was a senior research scientist at Yahoo Research, where he studied the relationship between Web search and community-based question answering. Idan earned his PhD in computer science from Bar-Ilan University, Israel, in 2009. He co-authored over 50 publications and he is a recipient of the best paper honorable mention at ECIR'16 and of best paper runner-up awards at ACL'13 and CoNLL'14.

Labeling Data In House: Fast and Good Labeled Data When You can't Outsource


Tal Perry, LightTag, Berlin

Abstract: Even though AGI is right around the corner, it's not here yet. So you're still doing supervised learning and need labeled data. In this talk, we'll discuss getting good labeled data fast, specifically in scenarios where you're labeling in house. I run a text annotation platform and have seen a few thousand annotation projects. I'll share the patterns that successful projects have in common. We'll focus on productivity and data quality, asking for each how they are measured, achieved, and maintained over the course of a project. Finally, labeling data is an organizational commitment: we'll discuss how to champion annotation within an organization and align non-technical stakeholders.

Bio: Tal is the founder of LightTag - The Text Annotation Tool. He previously worked on NLP at Citi and was CTO of Superfly, where he labeled lots of data. Tal holds a B.Sc. in mathematics and lives in Berlin.

When One Model Must Fit All: Challenges and Opportunities in Multi-tenant Machine Learning


Jacopo Tagliabue, Coveo, USA

Abstract: With more than 3.9 trillion dollars spent globally in online retail, research in the eCommerce space has gained considerable traction in recent years. The investments and benefits of this wave are far from evenly distributed in the market; in particular, the majority of innovation is concentrated in a handful of companies with a direct-to-consumer channel. This imbalance creates two issues: first, research themes are heavily influenced by tech giants' incentives; second, even when use cases are transferable to mid-sized digital shops, computing constraints make findings irrelevant for a large fraction of practitioners. In this talk, we detail a "multi-tenant friendly" research agenda, showing the commercial interest and intellectual viability of focusing on under-served problems: multi-tenant providers - i.e. companies providing services to other companies - need models that work in shops of different industries and all sizes. After discussing the peculiarities of multi-tenant platforms with regard to incentives, data assets, and technological constraints, we share with the community the lessons we learned in scaling our models to hundreds of deployments.

Bio: Jacopo Tagliabue was co-founder and CTO of Tooso, an A.I. company based in San Francisco that was acquired by Coveo in 2019. Jacopo is currently the Lead A.I. Scientist at Coveo, shipping machine learning models to hundreds of customers and millions of users. When not busy building A.I. products, he explores research topics at the intersection of language, reasoning, and learning: he is a committee member for international NLP workshops, and his work has often been featured in the general press and at major A.I. venues (including WWW, ACL, SIGIR, RecSys). In previous lives, he managed to get a Ph.D., do scienc-y things for a pro basketball team, and simulate a pre-Columbian civilization.

State of the Art, Challenges, and Future Directions of Multimodal Machine Learning in Education


Zitao Liu, TAL Education Group, China

Abstract: Recently we have seen a rapid rise in the amount of education data available through the digitization of education. This huge amount of education data usually takes the form of a mixture of images, videos, speech, text, etc. It is crucial to consider data from different modalities to build successful applications in AI in education (AIED). This talk targets AI researchers and practitioners interested in applying state-of-the-art multimodal machine learning techniques to tackle some of the hard-core AIED tasks, such as automatic short answer grading, student assessment, class quality assurance, and knowledge tracing. In this talk, we will comprehensively review recent developments in applying multimodal learning approaches in AIED, with a focus on multimodal classroom data. Beyond introducing recent advances in computer vision, speech, and natural language processing in education, we will discuss how to combine data from different modalities and build AI-driven educational applications on top of these data. More specifically, we will talk about (1) representation learning; (2) algorithmic assessment & evaluation; and (3) personalized feedback. Participants will learn about recent trends and emerging challenges in this topic, representative tools and learning resources to obtain ready-to-use models, and how related models and techniques benefit real-world AIED applications.

Bio: Zitao Liu is the Head of Engineering, Xueersi iZhiKang, at TAL Education Group (NYSE: TAL), one of the largest education and technology enterprises in China. His research is in the area of machine learning, with contributions in artificial intelligence in education, multimodal knowledge representation, and user modeling. He has published his research in highly ranked conference proceedings such as NeurIPS, AAAI, WWW, and AIED, serves on the executive committee of the International AI in Education Society, and has been an organizer and program committee member for top-tier AI conferences and workshops. He won 1st place at the NeurIPS 2020 education challenge (Task 3), 1st place at the UbiComp 2020 time series classification challenge, 1st place at the CCL 2020 humor computation competition, and 2nd place at the EMNLP 2020 ClariQ challenge. He is an ACM/CCF Distinguished Speaker and a recipient of the Beijing Nova Program 2020. Before joining TAL, Zitao was a senior research scientist at Pinterest and received his Ph.D. degree in Computer Science from the University of Pittsburgh.

Long Break


Session 4: 19:00-20:45

When Data Science Met Real Estate


Ron Bekkerman, Cherre Inc., USA

Abstract: Real estate is one of the oldest and largest industries in the world -- and it is slow to change. Since the dramatic shakeup of 2008, the industry had established multiyear trends that, as of early 2020, seemed set in stone. The COVID-19 crisis has shaken the industry again: most trends have been broken, and a tremendous opportunity has emerged that might not reappear for decades to come. On the one hand, residential real estate (home sales) is experiencing a significant boost; on the other, commercial real estate (multifamily and office space rentals) is under a dark cloud of uncertainty. Both sides of the industry are rapidly changing, and both turn their attention to data to help them survive and succeed. Unfortunately, the data science community has not been aware of the challenging real estate data problems that await an immediate solution. This talk aims at closing the gap between data science and the real estate industry. We provide an introduction to real estate for data scientists, and outline a spectrum of data science problems, many of which are currently being tackled by "prop-tech" companies, while some are yet to be approached. We present concrete examples from Cherre -- a real estate data integration platform.

Bio: Ron Bekkerman is the Chief Technology Officer of Cherre Inc., an AI-powered real estate data integration platform. From 2013 to 2018, Ron was Assistant Professor and Director of the Big Data Science Lab at the University of Haifa, Israel. At UHaifa, Ron designed and taught the Data Science curriculum. Prior to UHaifa, he was the Chief Data Officer of Viola Ventures, a founding member of the Data Science team at LinkedIn, and a Research Scientist at HP Labs in the Bay Area. He received his PhD in Machine Learning from the University of Massachusetts, Amherst.

Bayesian Imputation of Missing Feature Values in Product Sort and Recommendation at Tripadvisor


Narendra Mukherjee, Tripadvisor, USA

Abstract: Missing values are an inevitable by-product of effective feature engineering in ML models; yet almost all practitioners impute them with hand-picked defaults or just feed them as-is into the ML model. Despite a large body of statistics research on the missing value problem (e.g., by Gelman and by King), its findings have largely not made it across to the ML community; as a result, ad-hoc imputation techniques abound and there is often no post-hoc analysis of their impact on model predictions. I have two goals in this talk: 1) use my work on sort algorithms at Tripadvisor to show how ad-hoc imputation of missing values severely hurts the performance of real-world ML models, and 2) cast the missing value problem as a probabilistic model that one can solve through Bayesian inference. I will spend about 30% of my allotted time on part 1 and the remaining 70% on part 2. I will end by showing that the most widely used missing value imputation technique in the statistics community (Multiple Imputation by Chained Equations, MICE, which scikit-learn implements in its IterativeImputer) can be better understood as approximate Bayesian inference in a simple probabilistic model. This talk should appeal to data and ML researchers of all skill levels. For beginning practitioners, part 1 will demonstrate why it is important to think carefully about missing values during feature engineering and how to examine their role in a model's predictive performance. For more experienced attendees, part 2 will try to build a bridge between the statistical literature on missing value imputation and the world of the machine learning practitioner, through a Bayesian lens.
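For readers who want to experiment with the chained-equations idea behind MICE before the talk, here is a minimal numpy-only sketch: repeatedly regress each column that has missing entries on the other columns, and refill the holes with the regression's predictions. This is illustrative only and not the speaker's model; real MICE draws multiple stochastic imputations, whereas this deterministic variant only mirrors the iterative refitting loop that scikit-learn's IterativeImputer also performs.

```python
import numpy as np

# Deterministic, numpy-only sketch of iterative (chained-equations) imputation.

def iterative_impute(X, n_iter=10):
    X = X.astype(float).copy()
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.where(missing)[1])  # initial fill: column means

    for _ in range(n_iter):
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue  # column is fully observed; nothing to refit
            others = np.delete(X, j, axis=1)
            A = np.hstack([others, np.ones((len(X), 1))])  # design matrix + intercept
            # Fit on rows where column j is observed, predict where it is missing.
            coef, *_ = np.linalg.lstsq(A[~rows], X[~rows, j], rcond=None)
            X[rows, j] = A[rows] @ coef
    return X

# Toy matrix: column 1 is exactly 2 * column 0 + 1, with one hole.
X = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, np.nan]])
X_imputed = iterative_impute(X)
```

Because the second column is an exact linear function of the first, the regression recovers the true value (9.0) for the hole at `X[4, 1]`, whereas a column-mean default would have filled in 4.0.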

Bio: I am a long-time Bayesian interested in the connections between statistics, causal inference, and machine learning. Currently, I am a Machine Learning Scientist at Tripadvisor, based at their global headquarters in Needham, MA. My work at Tripadvisor spans the entire range of customer-centric ML problems, from recommendation engines to building probabilistic models of user-generated content creation. Before Tripadvisor, I obtained my PhD in systems neuroscience from Brandeis University, where I developed probabilistic latent variable models of stimulus coding in the brain. I got into the world of Bayesian machine learning during my PhD, and have been in love with that world ever since! Outside of Bayes and ML, I am an avid cyclist and have explored much of the north-east US on my bike. To learn more about me, take a look at my webpage.

Using AI to Drive Query Understanding


Zhe Wu, eBay, USA

Abstract: Query understanding is at the heart of eBay Search, a powerful eCommerce Search Engine. eCommerce queries, showing up in a rich variety of forms and containing explicit and/or implicit intent, are a challenge to deal with. A solid understanding of those queries, however, is a foundation of the ability to provide a good user experience. This talk covers several key Query Understanding components including query segmentation, entity resolution, and query similarity, and the use of AI in those components. We share lessons learned and practical experience on applying AI and other cutting-edge technologies to achieve better query understanding in a large-scale eCommerce environment.

Bio: Zhe Wu is a Distinguished Engineer on the Search team at eBay. He focuses on search quality and leverages AI/ML, structured data, and knowledge graphs to improve the search engine that powers eBay's marketplace. He has extensive enterprise experience in semantic web technologies, graph databases, queries, and analytics. He has participated in the W3C OWL 2 working group, the RDF working group, and the UDDI standards committee, and has served on the program committees of ESWC, OrdRing, ISWC, RR, and OWLED. Zhe served as co-chair of JIST 2011 and on the editorial board of SWJ in 2010. He has been invited to attend the Japan-America Frontiers of Engineering. Zhe has publications in ISWC, WWW, AAAI, VLDB, ICDE, JCST, CP, ICTAI, ASAP, and more, and holds over seventeen US and international patents. He received his PhD in computer science from the University of Illinois at Urbana-Champaign in 2001.

Machine Learning for Financial Transaction Data: A Recommendation Use Case


Mahashweta Das, Visa Research, USA

Abstract: Visa is a leading global payments technology company that provides consumers, merchants, businesses, financial institutions, and governments with the best way to pay and be paid. Visa handles more than ten trillion dollars of payments annually, thereby accruing enormous amounts of transaction data that reflect how consumers around the world spend money. This gold mine of data motivates us to employ advanced machine learning (ML) and artificial intelligence (AI) techniques to solve critical real-world problems for the company, such as fraud detection. We harness the power of AI and deep learning to personalize consumer and merchant experiences, and we leverage AI to design and develop a range of behavioral biometric technologies, cross-border payment solutions, and more. In this talk, we focus on recommendation: how can we build data-driven AI solutions that learn consumers' personalized preferences and recommend restaurants, tourist spots, travel itineraries, hotels, etc.? Specifically, we consider the problem of personalized restaurant recommendation using financial transaction data with little-to-no domain knowledge. We discuss the challenges associated with building such a recommendation engine and how we address some of them. We present a novel context-aware recommendation solution and validate its effectiveness against related state-of-the-art methods.

Bio: Mahashweta Das is a Sr. Staff Research Scientist at Visa Research, where she works on challenging real-world problems at the crossroads of the tech and payment industries and leads research and development efforts in recommendation. She is also a part-time lecturer at Northeastern University's Silicon Valley campus. Previously, she was a Research Scientist at Hewlett Packard Labs, where she designed and developed big data analytics solutions for HPE's 'The Machine'. She has held summer positions at Yahoo! Research, Technicolor Research Lab, and IBM Research. Mahashweta received her Ph.D. in Computer Science from the University of Texas at Arlington in 2013. Her research interests include machine learning, deep learning, data mining, and algorithms. She has published over fifteen refereed articles at premier international research conferences and journals, and regularly serves on the program committees of these conferences. Her PhD dissertation received an Honorable Mention for the ACM SIGKDD 2014 Doctoral Dissertation Award.

Closing Remarks


Ido Guy, eBay; Royi Ronen, Microsoft