Microsoft Research Blog

The Microsoft Research blog provides in-depth views and perspectives from our researchers, scientists and engineers, plus information about noteworthy events and conferences, scholarships, and fellowships designed for academic and scientific communities.

Optimizing imperative functions in relational databases with Froid

For decades, databases have supported declarative SQL as well as imperative functions and procedures as ways for users to express data processing tasks. While the evaluation of declarative SQL has received a lot of attention resulting in highly sophisticated techniques, improvements in the efficient evaluation of imperative programs have remained elusive. Imperative User-Defined Functions (UDFs) and procedures offer several benefits over SQL, including code modularity, reusability and readability. Because of these benefits, imperative programs often…

August 2018

DoWhy – A library for causal inference

For decades, causal inference methods have found wide applicability in the social and biomedical sciences. As computing systems start intervening in our work and daily lives, questions of cause-and-effect are gaining importance in computer science as well. To enable widespread use of causal inference, we are pleased to announce a new software library, DoWhy. Its name is inspired by Judea Pearl’s do-calculus for causal inference. In addition to providing a programmatic interface for popular causal…

August 2018

Announcing Microsoft Research Open Data – Datasets by Microsoft Research now available in the cloud

The Microsoft Research Outreach team has worked extensively with the external research community to enable adoption of cloud-based research infrastructure over the past few years. Through this process, we experienced the ubiquity of Jim Gray’s fourth paradigm of discovery based on data-intensive science – that is, almost all research projects have a data component to them. This data deluge also demonstrated a clear need for curated and meaningful datasets in the research community, not only…

June 2018

Using transfer learning to address label noise for large-scale image classification

In this post, we introduce how to use transfer learning to address label noise for large-scale image classification tasks. We’ll avoid describing the approach using too much math. If you are interested in the deeper theory behind this approach, please refer to our paper, “CleanNet: Transfer learning for scalable image classifier training with label noise,” presented at CVPR 18 in Salt Lake City, Utah. One of the key factors driving recent advances in image classification…

June 2018

Microsoft Unveils FASTER – a key-value store for large state management

At SIGMOD 2018, a team from Microsoft Research will be presenting a new embedded key-value store called FASTER, described in their paper “FASTER: A Concurrent Key-Value Store with In-Place Updates”. As its name suggests, FASTER makes a major leap forward in terms of supporting fast and frequent lookups and updates of large amounts of state information – a particularly challenging problem for applications in the cloud today. For example, in scenarios such as Internet-of-Things, billions…

June 2018

Microsoft and Tsinghua University Work Together on Open Academic Data Research

In a recent collaboration, Microsoft and China’s Tsinghua University released an academic graph, named Open Academic Graph (OAG). This billion-scale academic graph integrates the current Microsoft Academic Graph (MAG) and Tsinghua’s AMiner academic graph. Specifically, it contains the metadata of 155 million academic papers from AMiner and 166 million papers from MAG. By consolidating the metadata of each, it generates nearly 65 million matching relationships between the two academic graphs [1]. The construction…

March 2018

Improving AI Systems with Human Feedback and no Heartburn

Humans play an indispensable role in many modern AI-enabled services – not just as consumers of the service, but as the actual intelligence behind the artificial intelligence. From news portals to e-commerce websites, it is people’s ratings, clicks, and other interactions which provide a teaching signal used by the underlying intelligent systems to learn. While these human-in-the-loop systems improve through user interaction over time, they must also provide enough short-term benefit to people to be…

February 2018

Microsoft researchers unlock the black box of network embedding

At the ACM Conference on Web Search and Data Mining 2018, my team will introduce research that, for the first time, provides a theoretical explanation of popular methods used to automatically map the structure and characteristics of networks, known as network embedding. We then use this theoretical explanation to present a new network embedding method that performs as well as or better than existing methods. Networks are fundamental ways of representing knowledge and relating to…

February 2018

Class of 2018-19 PhD fellows to push frontiers of AI

By Sandy Blyth. A graduate student working on technology that leverages human brain signals to accelerate robot learning and another student who is developing models of human conversations that capture what is explicitly communicated and implicitly conveyed are among the diverse group of ten fellows accepted to the Microsoft Research PhD Fellowship Program for the 2018-2019 academic year, Microsoft’s research organization announced on Tuesday. “These…

January 2018

FigureQA

FigureQA: an annotated figure dataset for visual reasoning

Reasoning about figures

Almost every scientific publication is accompanied by data visualizations in the form of graphs and charts. Figures are an intuitive aid for understanding the content of documents, so naturally, it is useful to leverage this visual information for machine reading comprehension. To enable research in this domain we built FigureQA, a new dataset composed of figure images – like bar graphs, line plots, and pie charts – and question and answer pairs about them. We…

November 2017

Transportation Data Science at Microsoft

By Vani Mandava, Director, Data Science Outreach, Microsoft Research. The National Science Foundation (NSF)-supported Big Data Innovation Hubs launched a National Transportation Data Challenge with a kickoff event in Seattle in May 2017. Microsoft Outreach, through its partnership with the Big Data Hubs, organized an Azure workshop and participated in a panel discussion on ‘How Cloud Computing Can Enable Transportation Data Science.’ The kickoff was the first in a series of events that are being…

July 2017
