Edward Hu



I work on the large-scale deployment of GPT-3, principled approaches to large model training, and theories of infinitely wide neural networks.

Recently, my collaborators and I released Low-Rank Adaptation (LoRA) for large language models, which makes it possible to adapt GPT-3 with roughly 10,000x less storage per task while practically eliminating task-switching latency.
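The storage savings come from learning a low-rank update to each frozen weight matrix rather than fine-tuning the matrix itself. The sketch below illustrates the core idea with NumPy; the shapes, names, and rank are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# Illustrative LoRA sketch: instead of fine-tuning a full d x d weight
# matrix W, learn a low-rank update B @ A with rank r << d.
d, r = 1024, 4                           # hidden size, LoRA rank (assumed values)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable, shape (r, d)
B = np.zeros((d, r))                     # trainable, zero-initialized

def forward(x):
    # Adapted layer: W x + B (A x). With B = 0 at init,
    # the adapted model matches the pretrained one exactly.
    return W @ x + B @ (A @ x)

full_params = W.size            # 1,048,576 values to store per task
lora_params = A.size + B.size   # 8,192 values: 128x smaller at this size
```

Because only A and B are stored per task, switching tasks means swapping a small adapter rather than an entire model copy, which is where the latency benefit comes from.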

In 2020, Greg Yang and I released a paper on a new infinite-width limit that exhibits feature learning (ICML 2021), refuting the myth that wide neural networks are effectively linear, as suggested by Neural Tangent Kernel theory.

I was a member of the Microsoft Research AI Residency program. I graduated from Johns Hopkins University in 2019 with a Bachelor of Science in Computer Science and Cognitive Science.