2022 has seen remarkable progress in foundational technologies that have helped to advance human knowledge and create new possibilities to address some of society’s most challenging problems. Significant advances in AI have also enabled Microsoft to bring new capabilities to customers through our products and services, including GitHub Copilot, an AI pair programmer capable of turning natural language prompts into code, and a preview of Microsoft Designer, a graphic design app that supports the creation of social media posts, invitations, posters, and one-of-a-kind images.
These offerings provide an early glimpse of how new AI capabilities, such as large language models, can enable people to interact with machines in increasingly powerful ways. They build on a significant, long-term commitment to fundamental research in computing and across the sciences, and the research community at Microsoft plays an integral role in advancing the state of the art in AI, while working closely with engineering teams and other partners to transform that progress into tangible benefits.
In 2022, Microsoft Research established AI4Science, a global organization applying the latest advances in AI and machine learning toward fundamentally transforming science; added to and expanded the capabilities of the company’s family of foundation models; worked to make these models and technologies more adaptable, collaborative, and efficient; further developed approaches to ensure that AI is used responsibly and in alignment with human needs; and pursued different approaches to AI, such as causal machine learning and reinforcement learning.
We shared our advances across AI and many other disciplines during our second annual Microsoft Research Summit, where members of our research community gathered virtually with their counterparts across industry and academia to discuss how emerging technologies are being explored and deployed to bring the greatest possible benefits to humanity.
Plenary sessions at the event focused on the transformational impact of deep learning on the way we practice science, research that empowers medical practitioners and reduces inequities in healthcare, and emerging foundations for planet-scale computing. Further tracks and sessions over three days provided deeper dives into the future of the cloud; efficient large-scale AI; amplifying human productivity and creativity; delivering precision healthcare; building user trust through privacy, identity, and responsible AI; and enabling a resilient and sustainable world.
In this blog post, we look back at some of the key achievements and notable work in AI and highlight other advances across our diverse, multidisciplinary, and global organization.
Advancing AI foundations and accelerating progress
Over the past year, the research community at Microsoft made significant contributions to the rapidly evolving landscape of powerful large-scale AI models. Microsoft Research and the Microsoft Turing team unveiled a new Turing Universal Language Representation model capable of performing both English and multilingual understanding tasks. In computer vision, advancements for the Project Florence-VL (Florence-Vision and Language) team spanned still imagery and video: its GIT model was the first to surpass human performance on the image captioning benchmark TextCaps; LAVENDER showed strong performance in video question answering, text-to-video retrieval, and video captioning; and GLIP and GLIPv2 combined localization and vision-language understanding. The group also introduced NUWA-Infinity, a model capable of converting text, images, and video into high-resolution images or long-duration video. Meanwhile, the Visual Computing Group scaled up its Transformer-based general-purpose computer vision architecture, Swin Transformer, achieving applicability across more vision tasks than ever before.
Researchers from Microsoft Research Asia and the Microsoft Turing team also introduced BEiT-3, a general-purpose multimodal foundation model that achieves state-of-the-art transfer performance on both vision and vision-language tasks. In BEiT-3, researchers introduce Multiway Transformers for general-purpose modeling, where the modular architecture enables both deep fusion and modality-specific encoding. Based on the shared backbone, BEiT-3 performs masked “language” modeling on images (Imglish), texts (English), and image-text pairs (“parallel sentences”) in a unified manner. The code and pretrained models will be available on GitHub.
One of the most crucial accelerators of progress in AI is the ability to optimize training and inference for large-scale models. In 2022, the DeepSpeed team made a number of breakthroughs to improve mixture of experts (MoE) models, making them more efficient, faster, and less costly. Specifically, they were able to reduce training cost by 5x, reduce MoE parameter size by up to 3.7x, and reduce MoE inference latency by 7.3x while offering up to 4.5x faster and 9x cheaper inference for MoE models compared to quality-equivalent dense models.
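To make the efficiency argument concrete, here is a minimal sketch of sparse expert routing in plain Python. It is not DeepSpeed's implementation: the names `route_top1`, `moe_forward`, and `toy_gate`, the three toy experts, and the gate scores are all hypothetical and chosen only for illustration. The sketch shows the core idea that a mixture-of-experts layer holds several experts but evaluates only the one the gate selects for a given input, so per-input compute does not grow with the number of experts.

```python
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[float], float]


def route_top1(gate_scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring expert (top-1 routing)."""
    return max(range(len(gate_scores)), key=lambda i: gate_scores[i])


def moe_forward(x: float, experts: List[Expert],
                gate: Callable[[float], Sequence[float]]) -> Tuple[float, int]:
    """Run only the selected expert on x; return (output, expert index)."""
    idx = route_top1(gate(x))
    return experts[idx](x), idx


# Hypothetical toy experts and gate, for illustration only.
experts = [lambda x: 2.0 * x, lambda x: x + 10.0, lambda x: -x]


def toy_gate(x: float) -> List[float]:
    # Scores favor expert 0 for negative inputs, expert 1 for inputs
    # in [0, 5), and expert 2 for inputs above 5.
    return [-x, 1.0 if 0 <= x < 5 else 0.0, x - 5.0]


out_neg, idx_neg = moe_forward(-3.0, experts, toy_gate)
out_mid, idx_mid = moe_forward(2.0, experts, toy_gate)
out_big, idx_big = moe_forward(8.0, experts, toy_gate)
```

In a real model the gate is learned and the experts are neural network sublayers; the routing step itself keeps this shape.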
Transforming scientific discovery and adding societal value
Our ability to comprehend and reason about the natural world has advanced over time, and the new AI4Science organization, announced in July, represents another turn in the evolution of scientific discovery. Machine learning is already being used in the natural sciences to model physical systems using observational data. AI4Science aims to dramatically accelerate our ability to model and predict natural phenomena by creating deep learning emulators that learn by using computational solutions to fundamental equations as training data.
This new paradigm can help scientists gain greater insight into natural phenomena, right down to their smallest components. Such molecular understanding and powerful computational tools can help accelerate the discovery of new materials to combat climate change, and new drugs to help support the prevention and treatment of disease.
For instance, AI4Science’s Project Carbonix is working on globally accessible, at-scale solutions for decarbonizing the world economy, including reverse engineering materials that can pull carbon out of the environment and recycling carbon into materials. Collaborating on these efforts through the Microsoft Climate Research Initiative (MCRI) are domain experts from academia, industry, and government. Announced in June, MCRI is focused on areas such as carbon accounting, climate risk assessments, and decarbonization.
As part of the Generative Chemistry project, Microsoft researchers have been working with the global medicines company Novartis to develop and execute machine learning tools and human-in-the-loop approaches to enhance the entire drug discovery process. In April, they introduced MoLeR, a graph-based generative model for designing compounds that is more reflective of how chemists think about the process and is more efficient and practical than an earlier generative model the team developed.
While AI4Science is focused on computational simulation, we have seen with projects like InnerEye that AI can have societal value in many other ways. In March, Microsoft acquired Nuance Communications Inc., further cementing the companies’ shared commitment to outcome-based AI across industries, particularly in healthcare. Tools such as the integration of Microsoft Teams with Dragon Ambient eXperience (Nuance DAX), which helps ease the administrative burden on physicians and supports meaningful doctor-patient interactions, are already making a difference.
Making AI more adaptable, collaborative, and efficient
To help accelerate the capabilities of large-scale AI while building a landscape in which everyone can benefit from it, the research community at Microsoft aimed to drive progress in three areas: adaptability, collaboration, and efficiency.
To provide consistent value, AI systems must respond to changes in task and environment. Research in this area includes multi-task learning with task-aware routing of inputs, knowledge-infused decoding, model repurposing with data-centric ML, pruning, and cognitive science or brain-inspired AI. A good example of our work toward adaptability is GODEL, or Grounded Open Dialogue Language Model, which ushers in a new class of pretrained language models that enable chatbots to help with tasks and then engage in more general conversations.
Microsoft’s research into more collaborative AI includes AdaTest, which leverages human expertise alongside the generative power of large language models to help people more efficiently find and correct bugs in natural language processing models. Researchers have also explored expanding the use of AI in creative processes, including a project in which science fiction writer Gabrielle Loisel used OpenAI’s GPT-3 to co-author a novella and other stories.
To enable more people to make use of AI in an efficient and sustainable way, Microsoft researchers are pursuing several new architectures and training paradigms. This includes new modular architectures and novel techniques, such as DeepSpeed Compression, a composable library for extreme compression and zero-cost quantization, and Z-Code Mixture of Experts models, which boost translation efficiency and were deployed in Microsoft Translator in 2022.
In December, researchers unveiled AutoDistil, a new technique that leverages knowledge distillation and neural architecture search to improve the balance between cost and performance when generating compressed models. They also introduced AdaMix, which improves the fine-tuning of large pretrained models for downstream tasks using mixture of adaptations modules for parameter-efficient model tuning. And vision-language model compression research on the lottery ticket hypothesis showed that pretrained language models can be significantly compressed without hurting their performance.
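For readers unfamiliar with knowledge distillation, the following is a minimal sketch of the generic distillation objective, not of AutoDistil itself. The function names and the default temperature of 2.0 are assumptions made for illustration. A small student model is trained to match the temperature-softened output distribution of a larger teacher, and the loss below measures how far apart those two distributions are.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: Sequence[float],
                      teacher_logits: Sequence[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


teacher = [2.0, 1.0, 0.0]
loss_same = distillation_loss(teacher, teacher)           # student matches teacher
loss_close = distillation_loss([2.0, 1.0, 0.1], teacher)  # student nearly matches
loss_far = distillation_loss([0.0, 1.0, 2.0], teacher)    # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a training loop would minimize.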
Building and deploying AI responsibly
Building AI that maximizes its benefit to humanity, and does so equitably, requires considering both the opportunities and risks that come with each new advancement in line with our guiding principles: fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability.
Helping to put these principles into practice is Microsoft’s Responsible AI Standard, which the company made publicly available in June. The standard comprises tools and steps that AI practitioners can execute in their workflows today to help ensure that building AI responsibly is baked into every stage of development. These standards will evolve as the tools and resources to responsibly build AI evolve in response to the rapid pace of AI advancement, particularly pertaining to the growing size of AI models and the new challenges they bring.
With FedKD and InclusiveFL, researchers tackled some of the obstacles in applying federated learning, an ML method for protecting privacy, to model training. Two separate teams explored solutions for the harmful language that large generative models can reproduce—one presenting a unified framework for both detoxifying and debiasing models and another introducing methods for making content moderation tools more robust. Meanwhile, researchers sought to strengthen human-AI collaboration by giving users more insight into how models arrive at their outputs via explanations provided by the models themselves.
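As background on federated learning, here is a minimal sketch of the standard federated averaging step. It is a generic illustration, not the FedKD or InclusiveFL method, and the function name `federated_average` and the toy numbers are hypothetical. Each client trains on its own data and sends back only model weights; the server combines them, weighted by how many samples each client used, so raw data never leaves the client.

```python
from typing import List, Sequence, Tuple


def federated_average(
    client_updates: Sequence[Tuple[List[float], int]]
) -> List[float]:
    """Average client weight vectors, weighted by each client's sample count."""
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("at least one client must contribute samples")
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must share one model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


# Two hypothetical clients: (locally trained weights, number of local samples).
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 6.0], 3)])
```

The client with three samples pulls the result three times as hard as the client with one, giving `[2.5, 5.0]` here.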
The responsible development of AI also means deploying technologies that operate the way they were designed to, and the way people expect them to. In a pair of blog posts, researchers draw on their respective experiences developing two technologies: one to support social agency in children who are born blind, and another to support mental health practitioners in guiding patient treatment. Both posts stress the need for multiple measures of performance in determining the readiness of increasingly complex AI systems, and for incorporating domain experts and user research throughout the development process.
Advancing AI for decision making
Building the next generation of AI requires continuous research into fundamental new AI innovations. Two significant areas of study in 2022 were causal ML and reinforcement learning.
Identifying causal effects is an integral part of scientific inquiry. It helps us understand everything from educational outcomes to the effects of social policies to risk factors for diseases. Questions of cause and effect are also critical for the design and data-driven evaluation of many technological systems we build today.
This year, Microsoft Research continued its work on causal ML, which combines traditional machine learning with causal inference methods. To help data scientists better understand and deploy causal inference, Microsoft researchers built the DoWhy library, an end-to-end causal inference tool, in 2018. To broaden access to this critical knowledge base, DoWhy has now migrated to an independent open-source governance model in a new PyWhy GitHub organization. As part of this new collaborative model, Amazon Web Services is contributing new technology based on structural causal models.
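To illustrate why causal methods matter, the sketch below contrasts a naive comparison with a simple backdoor adjustment on toy data. It is written in plain Python and does not use the DoWhy API; the function names, the `toy_rows` data, and the assumption that a single observed confounder `z` explains all confounding are hypothetical. The toy outcome is constructed as `y = t + 5 * z`, so the true effect of treatment is 1, but treated rows are concentrated where `z = 1`.

```python
from collections import defaultdict
from typing import List, Tuple

Row = Tuple[int, int, float]  # (confounder z, treatment t in {0, 1}, outcome y)


def naive_effect(rows: List[Row]) -> float:
    """Difference in mean outcome between treated and untreated rows."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def adjusted_effect(rows: List[Row]) -> float:
    """Backdoor adjustment: average per-stratum effects, weighted by P(z).

    Assumes every stratum of z contains both treated and untreated rows.
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for z, t, y in rows:
        strata[z][t].append(y)
    effect = 0.0
    for groups in strata.values():
        n_z = len(groups[0]) + len(groups[1])
        diff = sum(groups[1]) / len(groups[1]) - sum(groups[0]) / len(groups[0])
        effect += (n_z / len(rows)) * diff
    return effect


# Hypothetical data with y = t + 5 * z; treatment is more common when z = 1.
toy_rows: List[Row] = (
    [(0, 0, 0.0)] * 3 + [(0, 1, 1.0)] * 1 +
    [(1, 0, 5.0)] * 1 + [(1, 1, 6.0)] * 3
)
naive = naive_effect(toy_rows)
adjusted = adjusted_effect(toy_rows)
```

On this data the naive difference is 3.5, while the adjusted estimate recovers the true effect of 1.0. Libraries such as DoWhy automate this kind of identification and estimation for far richer causal graphs.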
At this year’s Conference on Neural Information Processing Systems (NeurIPS), researchers presented a suite of open-source causal tools and libraries that aims to simultaneously provide core causal AI functionality to practitioners and create a platform for research advances to be rapidly deployed. This includes ShowWhy, a no-code user interface suite that empowers domain experts to become decision scientists. We hope that our work accelerates use-inspired basic research for improvement of causal AI.
Reinforcement learning (RL)
Reinforcement learning is a powerful tool for learning which behaviors are likely to produce the best outcomes in a given scenario, typically through trial and error. But this powerful tool faces some challenges. Trial and error can consume enormous resources when applied to large datasets. And for many real-time applications, there’s no room to learn from mistakes.
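The trial-and-error loop can be sketched with a small multi-armed bandit. This is a textbook epsilon-greedy learner, not any of the Microsoft methods described here, and the function name, the default `epsilon`, and the toy reward values are assumptions. Every improvement to an estimate costs one real interaction with the environment, which is the resource problem described above.

```python
import random
from typing import Callable, List


def epsilon_greedy(pull: Callable[[int], float], n_arms: int, steps: int,
                   epsilon: float = 0.1, seed: int = 0) -> List[float]:
    """Estimate each arm's mean reward by trial and error."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for step in range(steps):
        if step < n_arms:
            arm = step  # try every arm once before exploiting
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = pull(arm)
        counts[arm] += 1
        # Incremental running mean of the rewards observed for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates


# Hypothetical environment with fixed, noise-free rewards per arm.
true_rewards = [0.2, 0.8, 0.5]
pull_count = 0


def pull(arm: int) -> float:
    global pull_count
    pull_count += 1
    return true_rewards[arm]


estimates = epsilon_greedy(pull, n_arms=3, steps=50)
best_arm = max(range(3), key=lambda a: estimates[a])
```

With noisy rewards the learner would need many more pulls to separate the arms reliably, and in real-time settings each poor pull is a mistake made in production.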
To address RL’s computational bottleneck, Microsoft researchers developed Path Predictive Elimination, a reinforcement learning method that is robust enough to remove noise from continuously changing environments. Also in 2022, a Microsoft team released MoCapAct, a library of pretrained simulated models to enable advanced research on artificial humanoid control at a fraction of the compute resources currently required.
Researchers also developed a new method for using offline RL to augment human-designed strategies for making critical decisions. This team deployed game theory to design algorithms that can use existing data to learn policies that improve on current strategies.
Readers’ choice: Notable blog posts for 2022
- Microsoft has demonstrated the underlying physics required to create a new kind of qubit
- µTransfer: A technique for hyperparameter tuning of enormous neural networks
- Powering the next generation of trustworthy AI in a confidential cloud using NVIDIA GPUs
- DeepSpeed Compression: A composable library for extreme compression and zero-cost quantization
- DoWhy evolves to independent PyWhy model to help causal inference grow
- PeopleLens: Using AI to support social interaction between children who are blind and their peers
- Microsoft Research Summit 2022: What’s Next for Technology and Humanity?
- Using reinforcement learning to identify high-risk states and treatments in healthcare
- GODEL: Combining goal-oriented dialog with real-world conversations
- Swin Transformer supports 3-billion-parameter vision models that can train with higher-resolution images for greater task applicability
- FLUTE: A scalable federated learning simulation platform
- (De)ToxiGen: Leveraging large language models to build more robust hate speech detection tools
Thank you for reading
2022 was an exciting year for research, and we look forward to the future breakthroughs our global research community will deliver. In the coming year, you can expect to hear more from us about our vision, and the impact we hope to achieve. We appreciate the opportunity to share our work with you, and we hope you will subscribe to the Microsoft Research Newsletter for the latest developments.