NeMoEval: A Benchmark Tool for Natural Language-based Network Management
This is a benchmark tool to evaluate natural language-based network management using LLM-generated code.
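To make the workflow concrete, below is a minimal sketch of the kind of evaluation loop such a benchmark performs: a natural-language query is turned into code by an LLM, the code is executed against a network representation, and the result is compared to a golden answer. All names here (`build_topology`, `evaluate_task`, the stand-in LLM) are hypothetical illustrations, not the actual NeMoEval API.

```python
# Hypothetical sketch of an LLM-code-generation evaluation loop.
# None of these names come from the NeMoEval codebase.

import networkx as nx


def build_topology() -> nx.Graph:
    """Toy network topology that the generated code operates on."""
    g = nx.Graph()
    g.add_edge("spine1", "leaf1", capacity_gbps=100)
    g.add_edge("spine1", "leaf2", capacity_gbps=100)
    g.add_edge("leaf1", "host1", capacity_gbps=25)
    return g


def evaluate_task(query: str, golden_answer, generate_code):
    """Ask an LLM for code answering `query`, run it, and score the result."""
    code = generate_code(query)          # e.g., a call to an LLM API
    scope = {"graph": build_topology()}  # expose the topology to the code
    try:
        exec(code, scope)                # real benchmarks would sandbox this
    except Exception:
        return {"passed": False, "error": "execution failed"}
    return {"passed": scope.get("result") == golden_answer}


if __name__ == "__main__":
    # A stand-in "LLM" returning a fixed snippet, for demonstration only.
    fake_llm = lambda q: "result = graph.degree('spine1')"
    print(evaluate_task("How many links does spine1 have?", 2, fake_llm))
```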