MSR NYC Lab Opening


May 16, 2014


Duncan Watts, Fernando Diaz, Kate Crawford, David Rothschild, Justin Rao, Jennifer Chayes, Dan Huttenlocher, Kathy McKeown, danah boyd, and David Pennock


Microsoft Research New York City, Microsoft Research New England and New York City, Cornell Tech, Columbia University


In August 2013, Microsoft Research New York City moved into permanent lab space that was built out specifically to meet the standards of the world-class research it needed to host. With strong ties to both the academic community and the technology industry, Microsoft Research is thrilled to open its doors in the heart of Silicon Alley, close to New York University and steps away from other Microsoft engineers in the Yammer and Skype divisions.

Duncan Watts – Computational Social Science: Exciting Progress and Future Directions

The past 15 years have witnessed a remarkable increase in both the scale and scope of social and behavioral data available to researchers. Over the same period, and driven by the same explosion in data, the study of social phenomena has increasingly become the province of computer scientists, physicists, and other “hard” scientists. Papers on social networks and related topics appear routinely in top science journals and computer science conferences; network science research centers and institutes are sprouting up at top universities; and funding agencies from DARPA to NSF have moved quickly to embrace what is being called “computational social science.” Against these exciting developments stands a stubborn fact: in spite of many thousands of published papers, there’s been surprisingly little progress on the “big” questions that motivated the field of computational social science—questions concerning systemic risk in financial systems, problem solving in complex organizations, and the dynamics of epidemics or social movements, among others. Of the many reasons for this state of affairs, I concentrate here on three. First, social science problems are almost always more difficult than they seem. Second, the data required to address many problems of interest to social scientists remain difficult to assemble. And third, thorough exploration of complex social problems often requires the complementary application of multiple research traditions—statistical modeling and simulation, social and economic theory, lab experiments, surveys, ethnographic fieldwork, historical or archival research, and practical experience—many of which will be unfamiliar to any one researcher. In addition to explaining the particulars of these challenges, I sketch out some ideas for addressing them.

Fernando Diaz and Kate Crawford – Social and Technical Challenges in Crisis Informatics

Over more than fifty years, information retrieval research has established a set of design principles which have been used to build information access tools for collections of legal documents, news archives, and even the Web. Crisis informatics refers to the study and development of information access tools for support during unexpected crisis events such as natural disasters and other human tragedies. These events often undermine many of the assumptions made in information retrieval research, resulting in system underperformance and catastrophic failure. We will present our approach to crisis informatics, which balances the insights from qualitative and ethnographic methodologies with engineering based on data-driven experimentation. By bringing together techniques from information retrieval and qualitative social science, we can develop more robust and critical approaches that account for algorithmic challenges and the need for local knowledge and community engagement. This allows us to avoid biased results, signal gaps, and poor policy decisions. This talk will draw on collaborative work across MSR-NYC, MSRNE, MSR Israel and MSR Cambridge UK, including Javed Aslam, Matthew Ekstrand-Abueg, Megan Finn, Qi Guo, Virgil Pavlu, Soren Preibusch, Tetsuya Sakai, and Elad Yom-Tov.

David Rothschild and Justin Rao – Medium Data: Where Small Data Meets Big Data

People increasingly consume live events with second screens distracting them; in the near future this attention will be captured by the broadcaster. The broadcaster will display information, derived from new data analytics (econometrics and machine learning) on a mix of new and traditional data streams (fundamentals, social media, online, polling, and prediction games), with the goal of increasing interaction. The raw data from interactions is processed and fed back to the broadcast, leading to more information and then more interactions. All of this happens in real time with increasingly sophisticated data infrastructure. While this is going to appear to the consumer as a “big data” revolution, in the near term, conventionally titled “big data” is going to be more of a sideshow, as the core revolution in low-latency, quantifiable, personalized experiences is driven by “medium data”: faster infrastructure and more sophisticated analytics on traditional data sources.

Jennifer Chayes, Dan Huttenlocher, Kathleen McKeown, and Clay Shirky – Panel: Interaction Between the Academy and Silicon Alley in the Age of Data Science

Data Science, broadly defined, promises to advance many aspects of science and technology through personalization of everything from the way we interact with our phones to our medical treatments. In this panel, we discuss the promises and challenges of data science, as well as the responsibility of educating a new generation of data scientists.



Prior to joining Microsoft, Duncan Watts was a Senior Principal Research Scientist at Yahoo! Research, where he directed the Human Social Dynamics group. Prior to joining Yahoo!, he was a full professor of Sociology at Columbia University, where he taught from 2000 to 2007. His research on social networks and collective dynamics has appeared in a wide range of journals, from Nature, Science, and Physical Review Letters to the American Journal of Sociology and Harvard Business Review. He is also the author of three books, most recently “Everything is Obvious (Once You Know The Answer)” (Crown Business, 2011). He holds a B.Sc. in Physics from the Australian Defence Force Academy, and a Ph.D. in Theoretical and Applied Mechanics from Cornell University.

Fernando Diaz is a researcher at Microsoft. His primary research interest is formal information retrieval models and his research experience includes distributed information retrieval approaches to web search, interaction logging and modeling, interactive and faceted retrieval, mining of temporal patterns from news and query logs, cross-lingual information retrieval, graph-based retrieval methods, and synthesizing information from multiple corpora. Fernando received his PhD from the University of Massachusetts Amherst in 2008. His work on federation won the best paper awards at the WSDM 2009, SIGIR 2009, and ECIR 2011 conferences. His work on crisis informatics has received awards at SIGIR 2011 and ISCRAM 2013. He is a co-organizer of the Temporal Summarization track and Web track at TREC 2013 and WSDM 2014.

Kate Crawford is a Principal Researcher at Microsoft Research in NYC.

David Rothschild is an economist at Microsoft Research New York City. He has a Ph.D. in applied economics from the Wharton School of Business at the University of Pennsylvania. His primary body of work is on forecasting and understanding public interest and sentiment. Related work examines how the public absorbs information. He has written extensively, in both the academic and popular press, on polling, prediction markets, and predictions of upcoming events; most of his popular work has focused on predicting elections and an economist’s take on public policy. Since joining Microsoft Research in May, he has been busy building prediction and sentiment models, as well as organizing novel and experimental polling and prediction games. In February 2012, he correctly predicted 50 of 51 Electoral College outcomes in the U.S. presidential election the following November.

Justin M. Rao is an Economic Researcher at Microsoft Research in the New York City research lab. He is an empirical microeconomist who came to Microsoft following a two-year stint with Yahoo! Research in Silicon Valley. He completed his Ph.D. in economics in 2010 at UC San Diego under the capable guidance of Jim Andreoni. More information on his research, talk videos, and more can be found at

Jennifer Tour Chayes is Managing Director of Microsoft Research New York City as well as the Microsoft Research New England lab in Cambridge. Before this, she was research area manager for Mathematics, Theoretical Computer Science and Cryptography at Microsoft Research Redmond. Chayes joined Microsoft Research in 1997, when she co-founded the Theory Group. Her research areas include phase transitions in discrete mathematics and computer science, structural and dynamical properties of self-engineered networks, and algorithmic game theory. She is the co-author of almost 100 scientific papers and the co-inventor of more than 20 patents.

Dan Huttenlocher has overall responsibility for programmatic aspects of the new campus, including the academic quality and direction of the campus’ degree programs and research. More specifically, he identifies effective strategies for working with companies and early-stage investors in New York City, and oversees faculty recruitment and the campus’ entrepreneurial initiatives. Huttenlocher has a mix of academic and industry background. He received his bachelor’s degree from the University of Michigan and both his master’s and doctorate degrees from Massachusetts Institute of Technology. He serves as a Trustee of the John D. and Catherine T. MacArthur Foundation.

Kathleen McKeown is the Henry and Gertrude Rothschild Professor of Computer Science and currently serves as the Director of Columbia’s Institute for Data Sciences and Engineering. Her research focuses on text summarization, question answering and text-to-text generation. She leads a research project involving prediction of scientific impact from a large collection of journal articles. McKeown joined Columbia in 1982, immediately after earning her Ph.D. from the University of Pennsylvania. In 1989, she became the first woman professor in the school to receive tenure, and later the first woman to serve as a department chair (1998-2003). McKeown has received numerous honors and awards for her research and teaching. She received the National Science Foundation Presidential Young Investigator Award in 1985 and a National Science Foundation Faculty Award for Women, and she was selected as an AAAI Fellow, a Founding Fellow of the Association for Computational Linguistics, and an ACM Fellow. In 2010, she won both the Columbia Great Teacher Award, an honor bestowed by the students, and the Anita Borg Woman of Vision Award for Innovation.

McKeown served as a board member of the Computing Research Association and as secretary of the board. She was president of the Association for Computational Linguistics in 1992, vice president in 1991, and secretary-treasurer for 1995-1997. She was also a member of the Executive Council of the Association for Artificial Intelligence and the co-program chair of their annual conference in 1991.

My name is danah boyd and I’m a Principal Researcher at Microsoft Research, a Research Assistant Professor in Media, Culture, and Communication at New York University, and a Fellow at Harvard’s Berkman Center. I am an academic and a scholar and my research examines social media, youth practices, tensions between public and private, social network sites, and other intersections between technology and society.

My research focuses on how young people use social media as part of their everyday practices. In recent years, I have studied Twitter, blogging, social network sites (e.g. Friendster, MySpace, Facebook…), tagging, and other forms of social media. I have written papers on a variety of different topics, from digital backchannels to social visualization design, privacy to teen drama. I also blog and tweet frequently on a wide variety of topics. Along with other members of the MacArthur Foundation-funded project on digital media and learning, I helped co-author a newly published book: Hanging Out, Messing Around, and Geeking Out: Kids Living and Learning with New Media. My new book: It’s Complicated: The Social Lives of Networked Teens will be released on February 25, 2014.

In 2008, I completed my PhD at the School of Information (iSchool) at the University of California-Berkeley. My dissertation research was funded as a part of the MacArthur Foundation’s Initiative on New Media and Learning. My research was supervised by a most astonishing committee: Mimi Ito, Annalee Saxenian, Cori Hayden, and Jenna Burrell. My beloved PhD advisor – Peter Lyman – lost his battle with brain cancer in July 2007. I miss him dreadfully.

I did my Master’s Degree at the MIT Media Lab’s Sociable Media Group with Judith Donath (supervised also by Henry Jenkins and Genevieve Bell). My master’s thesis focused on how people manage their presentation of self in relation to social contextual information in online environments. As an undergraduate, I studied computer science at Brown University, advised by Andy van Dam. My undergrad thesis focused on how prioritization of depth cues is dependent on levels of sex hormones in the body and how this affects engagement with virtual reality.

David Pennock is a Principal Researcher and Assistant Managing Director of Microsoft Research in New York City, focusing on algorithmic economics. He has over sixty academic publications relating to computational issues in electronic commerce and the web, including papers in PNAS, Science, IEEE Computer, Theoretical Computer Science, Algorithmica, AAAI, EC, KDD, UAI, SIGIR, ICML, NIPS, and WWW. He has authored three patents and thirteen patent applications. In 2005, he was named to MIT Technology Review’s list of 35 top technology innovators under age 35. Prior to his current position, David worked as a Principal Research Scientist at Yahoo! Research, a Research Scientist at NEC Laboratories America, a research intern at Microsoft Research, and in 2001 served as an adjunct professor at Pennsylvania State University. He received a Ph.D. in Computer Science from the University of Michigan, an M.S. in Computer Science from Duke University, and a B.S. in Physics from Duke. His work has been featured in Discover Magazine, New Scientist, CNN, the New York Times, the Economist, Surowiecki’s “The Wisdom of Crowds”, and other publications.