Advances to low-bit quantization enable LLMs on edge devices
Advances in low-bit quantization techniques enable efficient operation of LLMs on resource-constrained edge devices. Discover how innovations like T-MAC, Ladder, and LUT Tensor Core improve computational efficiency and enhance hardware compatibility.
Research Focus: Week of January 27, 2025
In this issue: A new approach to multimodal pretraining for remote sensing; Managed-retention memory for the AI era; Improving detection of macular telangiectasia type 2; Generalizing symbolic automata.
Ideas: Bug hunting with Shan Lu
Struggles with programming languages helped research manager Shan Lu find her calling as a bug hunter. She discusses one bug that really haunted her, the thousands she’s identified since, and how she’s turning to LLMs…
Cloud Efficiency Optimization (CLEO)
Improving resource utilization and sustainability
The goal of this project is to infuse optimization solutions across Microsoft’s Cloud, targeting improvements in resource utilization (space, power, compute) and sustainability. We are interested in both fundamental and…