NeMoEval: A Benchmark Tool for Natural Language-based Network Management
This is a benchmark tool to evaluate natural language-based network management using LLM-generated code.
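To make the workflow concrete, below is a minimal sketch of the kind of evaluation loop such a benchmark performs: a natural-language query is turned into code by an LLM, the code is executed against a network representation, and the result is compared to a golden answer. All names here (`build_topology`, `evaluate_task`, the stand-in LLM) are hypothetical illustrations, not the actual NeMoEval API.

```python
# Hypothetical sketch of an LLM-code-generation evaluation loop.
# None of these names come from the NeMoEval codebase.

import networkx as nx


def build_topology() -> nx.Graph:
    """Toy network topology that the generated code operates on."""
    g = nx.Graph()
    g.add_edge("spine1", "leaf1", capacity_gbps=100)
    g.add_edge("spine1", "leaf2", capacity_gbps=100)
    g.add_edge("leaf1", "host1", capacity_gbps=25)
    return g


def evaluate_task(query: str, golden_answer, generate_code):
    """Ask an LLM for code answering `query`, run it, and score the result."""
    code = generate_code(query)          # e.g., a call to an LLM API
    scope = {"graph": build_topology()}  # expose the topology to the code
    try:
        exec(code, scope)                # real benchmarks would sandbox this
    except Exception:
        return {"passed": False, "error": "execution failed"}
    return {"passed": scope.get("result") == golden_answer}


if __name__ == "__main__":
    # A stand-in "LLM" returning a fixed snippet, for demonstration only.
    fake_llm = lambda q: "result = graph.degree('spine1')"
    print(evaluate_task("How many links does spine1 have?", 2, fake_llm))
```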