Microsoft @ WSDM 2019


Microsoft is excited to be a Silver sponsor of the 12th ACM International Conference on Web Search and Data Mining. Come by our booth to chat with our experts, see demos of our latest research, and find out about career opportunities with Microsoft.

Microsoft Attendees

Puneet Agrawal
Ahmed Awadallah
Murat Bayir
Paul Bennett
Susan Dumais
Manish Gupta
Emre Kiciman
Widad Machmouchi
Tobias Schnabel
Amit Sharma
Raghav Somani
Jaime Teevan
Kuansan Wang
Ryen White
Wei Wu
Roy Zimmermann

Demonstration Chair

Paul Thomas

Program Committee Co-Chair

Paul Bennett

Senior Program Committee Members

Nick Craswell
Emre Kiciman
Milad Shokouhi
Ryen White
Xing Xie

Program Committee Members

Omar Alonso
Jiang Bian
Yuxiao Dong
Adam Fourney
Ahmed Hassan Awadallah
Julia Kiseleva
Arnd Christian König
Jonathan Litz
Hao Ma
Elnaz Nouri
Robert Sim
Ruihua Song
Adith Swaminathan
Chi Wang
Fangzhao Wu
Chenyan Xiong
Elad Yom-Tov
Yizhe Zhang
Imed Zitouni

Workshop Organizers

Task Intelligence Workshop

Ahmed Hassan Awadallah
Ryen W. White


Tuesday, February 12, 2019
Attending to What Matters

Jaime Teevan

Accepted Papers

Tuesday, February 12, 2019 | 10:00 AM–10:30 AM | Session 1: Search and Ranking

MSA: Jointly Detecting Drug Name and Adverse Drug Reaction Mentioning Tweets with Multi-Head Self-Attention

Chuhan Wu, Fangzhao Wu, Zhigang Yuan, Junxin Liu, Yongfeng Huang, Xing Xie


Tuesday, February 12, 2019 | 11:00 AM–12:30 PM | Session 2: Knowledge Graphs and Analytics

Clustered Monotone Transforms for Rating Factorization

Raghav Somani, Gaurush Hiranandani, Oluwasanmi Koyejo, Sreangsu Acharyya


Wednesday, February 13, 2019 | 10:00 AM–10:30 PM | Session 5: Understanding Conversation, Discussion, and Opinions

Multi-Representation Fusion Network for Multi-Turn Response Selection in Retrieval-Based Chatbots

Chongyang Tao, Wei Wu, Can Xu, Wenpeng Hu, Dongyan Zhao, Rui Yan

Attitude Detection for One-Round Conversation: Jointly Extracting Target-Polarity Pairs

Zhaohao Zeng, Ruihua Song, Pingping Lin, Tetsuya Sakai


Wednesday, February 13, 2019 | 4:15 PM–5:30 PM | Session 8: Counterfactual and Causal Learning

Genie: An Open Box Counterfactual Policy Estimator for Optimizing Sponsored Search Marketplace

Murat Ali Bayir, Mingsen Xu, Yaojia Zhu, Yifan Shi


Thursday, February 14, 2019 | 9:00 AM–10:30 AM | Session 9: Recommendation

Slice: Scalable Linear Extreme Classifiers trained on 100 Million Labels for Related Searches

*Best Paper Award

Himanshu Jain, Venkatesh Balasubramanian, Bhanu Teja Chunduri, Manik Varma

Neural Tensor Factorization for Temporal Interaction Learning

Xian Wu, Baoxu Shi, Yuxiao Dong, Chao Huang, Nitesh Chawla

Shaping Feedback Data in Recommender Systems with Interventions Based on Information Foraging Theory

Tobias Schnabel, Paul Bennett, Thorsten Joachims


Thursday, February 14, 2019 | 11:00 AM–11:45 AM | Session 10: Personalization and Characterizing User Behavior

Characterizing and Predicting Email Deferral Behaviour

Bahareh Sarrafzadeh, Ahmed Hassan Awadallah, Christopher Lin, Chia-Jung Lee, Milad Shokouhi, Susan Dumais

Task Duration Estimation

Ryen W. White, Ahmed Hassan Awadallah

Neural Demographic Prediction using Search Query

Chuhan Wu, Fangzhao Wu, Junxin Liu, Shaojian He, Yongfeng Huang, Xing Xie


Thursday, February 14, 2019 | 2:00 PM–3:25 PM | Session 11: Domain Transfer and Representation Learning

Domain Adaptation for Commitment Detection in Email

Hosein Azarbonyad, Robert Sim, Ryen W. White

Demos & Tutorials


clstk: The Cross-Lingual Summarization Tool-Kit

Tuesday, February 12 | 3:30 PM–4:15 PM
Nisarg Jhaveri, Manish Gupta, Vasudeva Varma

Half-Day Tutorials

Causal Inference and Counterfactual Reasoning

Monday, February 11 | 9:00 AM–12:30 PM
Emre Kiciman, Amit Sharma

As computing systems intervene more frequently and more actively to improve people’s work and daily lives, it is critical to correctly predict and understand the causal effects of these interventions. This tutorial will introduce participants to concepts in causal inference and counterfactual reasoning, drawing on a broad literature in statistics, social sciences, and machine learning. To tackle such questions, we will introduce the key ingredient that causal analysis depends on, counterfactual reasoning, and describe the two most popular frameworks, based on Bayesian graphical models and potential outcomes. Building on this, we will cover a range of methods suitable for causal inference with large-scale online data, including randomized experiments, observational methods such as matching and stratification, and natural experiment-based methods such as instrumental variables and regression discontinuity. We will also focus on best practices for evaluating and validating causal inference techniques, drawing on our own experiences.

Fairness-Aware Machine Learning: Practical Challenges and Lessons Learned

Monday, February 11 | 1:30 PM–5:00 PM
Sarah Bird, Krishnaram Kenthapadi, Emre Kiciman, Margaret Mitchell

Researchers and practitioners from different disciplines have highlighted the ethical and legal challenges posed by the use of machine-learned models and data-driven systems, and the potential for such systems to discriminate against certain population groups because of biases in algorithmic decision-making. This tutorial presents an overview of algorithmic bias and discrimination issues observed over the last few years, the lessons learned, key regulations and laws, and the evolution of techniques for achieving fairness in machine learning systems. We will motivate the need for adopting a “fairness by design” approach (as opposed to viewing algorithmic bias and fairness considerations as an afterthought) when developing machine learning based models and systems for different consumer and enterprise applications. Then, we will focus on the application of fairness-aware machine learning techniques in practice by presenting non-proprietary case studies from different technology companies. Finally, based on our experiences working on fairness in machine learning at companies such as Facebook, Google, LinkedIn, and Microsoft, we will present open problems and research directions for the data mining and machine learning community.

Industry Day

A Case Study on Microsoft’s Ruuh: Is User Growth a Peril to Research Progress?

Monday, February 11 | 11:00 AM–12:30 PM
Puneet Agrawal

Striking a balance between business goals such as user growth and deep, meaningful research is always challenging in an industrial research setting. In this talk, taking Microsoft’s Ruuh as a case study, we will discuss the challenges and opportunities for research in industry. Microsoft’s Ruuh was conceptualized about 2.5 years ago, and its main product promise is to be able to talk to its users on any subject they choose. We realized that this promise meant thinking beyond the utilitarian notion of merely generating “relevant” responses and enabling Ruuh to comprehend and meet a wider range of users’ social needs, such as expressing happiness when a user’s favorite team wins or sharing a cute comment when shown pictures of the user’s pet. At the outset, this seemed an impossible task, especially when coupled with aggressive release deadlines and pressure to grow usage. In this talk, however, we will discuss how our research progress helped user growth and vice versa, and also discuss scenarios where we suffered setbacks. A good-quality product leads to high usage, which in turn provides the much-needed data to improve the research and understand the flaws in the current approach. At the same time, high usage of the product forces the team to focus on efficiency, cost per query, and other infrastructure-related workloads. This talk will use real-world examples to explain these tradeoffs. More details of the talk are presented in the last section.

‘No Interaction’ as Indicator of Search Satisfaction: Accounting for Good Abandonment in User Success Metrics

Monday, February 11 | 11:00 AM–12:30 PM
Widad Machmouchi

At Bing, measuring user success has always been a deciding factor in which features or changes are shipped to production. Such changes are tested through randomized controlled experiments, where success metrics are used to measure the treatment effect on user satisfaction. Over the years, we have designed and refined our metrics to capture various user interactions, from search queries to clicks and hovers, and interpreted them to predict users’ satisfaction with the search engine. One of the hardest scenarios to interpret is search result page abandonment, where the user doesn’t click on the page or interact with any specific element. In this scenario, we need to differentiate cases where the user abandoned because they got the information they needed without clicking on any result from those where the user abandoned because of a defective or unsatisfactory search result page. In this talk, we outline Bing’s journey in addressing this measurement problem. We describe our progression from an initial effort that treated the presence of specific elements on the page as an indicator of success, to an offline/online hybrid approach for identifying good abandonment, and finally to a fully online solution that relies on a user’s behavior across their search session. We also cover the pitfalls of the different approaches, how we evaluate them, and the current challenges and problems left to solve.


Half-Day Workshops

Task Intelligence Workshop

Friday, February 15 | 9:00 AM–2:00 PM

Organizers: Ahmed Hassan Awadallah, Cathal Gurrin, Mark Sanderson, Ryen W. White

Invited Speaker: Paul Bennett

Accepted Paper: Supporting Complex Tasks Using Multiple Devices (Elnaz Nouri, Adam Fourney, Robert Sim, Ryen W. White)

Tasks are defined pieces of work that range in scope from specific (sending an email) to broad (planning a wedding) and are central to all aspects of information access and use. Task intelligence spans technologies and experiences to extract, understand, and support the completion of short- and long-term tasks. Helping users complete tasks is a key capability of search systems, digital assistants, and productivity applications. It poses core challenges in data mining and knowledge representation, and draws on additional research from areas such as machine learning and natural language processing. The workshop will comprise a mixture of research paper presentations and reports from data challenge participants, including system demonstrations if available.

Job Opportunities


Artificial Intelligence
Causality and Machine Learning

Post Doc Researcher

Artificial Intelligence

Data Scientist

Office AI
Substrate Query Intelligence

Data & Applied Scientist

Bing Ads
Cloud AI

Data Scientist II

Microsoft 365 team

Sr Data Scientist

Microsoft 365 team

Principal Data Scientist

Microsoft Developer


WSDM Conference Analytics

Microsoft Academic | February 1, 2019

The Microsoft Academic Graph makes it possible to gain analytic insights about any of the entities within it: publications, authors, institutions, topics, journals, and conferences.