Wei Cui

Principal Researcher

About

Research Areas (AI infra):

1. Low-cost inference and cost-efficient post-training on heterogeneous accelerators

2. Accelerating relational databases and query systems

3. Distributed and NVLink/XGMI optimization for LLMs and query processing

4. Cross-platform compilation for Cloud & Edge (CUDA, ROCm, DirectX, Vulkan, SYCL, WebGPU)

5. Coding agents, sandbox and multimodal inference systems

6. Advanced quantization and compression for cost-performance hardware (e.g., NVFP4, MXFP4, FP8 for Intel/AMD/A100/..)

Other Works:

microsoft/tutel: Tutel MoE: An Optimized Mixture-of-Experts Implementation

microsoft/antares: Antares: an automatic engine for multi-platform kernel generation and optimization