News & features
BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI
| Chengquan Guo, Yuzhou Nie, Chulin Xie, Zinan Lin, Wenbo Guo, and Bo Li
BlueCodeAgent is an end-to-end blue-teaming framework that boosts code security by using automated red-teaming processes, data, and safety rules to guide LLMs’ defensive decisions. Its dynamic testing reduces false positives in vulnerability detection.
RedCodeAgent: Automatic red-teaming agent against diverse code agents
| Chengquan Guo, Chulin Xie, Yu Yang, Zhaorun Chen, Zinan Lin, Xander Davies, Yarin Gal, Dawn Song, and Bo Li
Code agents help streamline software development workflows, but may also introduce critical security risks. Learn how RedCodeAgent automates and improves “red-teaming” attack simulations to help uncover real-world threats that other methods overlook.
Research Focus: Week of August 26, 2024
Learn what’s next for AI at Research Forum on Sept. 3; WizardArena simulates human-annotated chatbot games; MInference speeds pre-filling for long-context LLMs via dynamic sparse attention; Reef: Fast succinct non-interactive zero-knowledge regex proofs.
The Crossroads of Innovation and Privacy: Private Synthetic Data for Generative AI
| Gbola Afonja, Robert Sim, Zinan Lin, Huseyin Atahan Inan, and Sergey Yekhanin
Synthetic data could potentially help address some privacy concerns with AI model development and training, but it comes with limitations. Researchers at Microsoft are exploring techniques for producing more realistic data with strong privacy protections.
In the news | TheSequence
Edge 371: Two-Step LLM Reasoning with Skeleton of Thoughts
Created by Microsoft Research, the Skeleton-of-Thoughts (SoT) technique models some aspects of human cognitive reasoning in LLMs. A recent innovation in the field of large language models (LLMs), it represents a significant shift in how these models…
Research Focus: Week of January 8, 2024
| Zinan Lin, Jinyu Li, Bhaskar Mitra, Siân Lindley, Liang Wang, Nan Yang, and Furu Wei
Mixture-of-linear-experts for long-term time series forecasting; Weakly-supervised streaming multilingual speech model with truly zero-shot capability; KBFormer: Diffusion model for structured entity completion; Identifying risks of AI-mediated data access…
Skeleton-of-Thought: Parallel decoding speeds up and improves LLM output
| Xuefei Ning and Zinan Lin
This research was accepted by the 2024 International Conference on Learning Representations. Large language models (LLMs) such as LLaMA and OpenAI’s GPT-4 are revolutionizing technology. However, one of the common complaints about LLMs is their speed, or lack thereof. In…
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
| Boxin Wang, Bo Li, and Zinan Lin
This paper received the Outstanding Benchmarks Track Paper Award at NeurIPS 2023. How trustworthy are generative pre-trained transformer (GPT) models? To answer this question, University of Illinois Urbana-Champaign, together with Stanford University, University of California, Berkeley,…