Dion: Distributed Orthonormal Updates
Dion is a scalable optimizer that accelerates neural network training by applying orthonormal weight updates using amortized power iteration, which works efficiently on sharded matrices. It reduces communication overhead through low-rank compression and error feedback,…
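To make the idea of an amortized power iteration with error feedback more concrete, here is a minimal single-device sketch in NumPy. It is an illustration under stated assumptions, not the project's implementation: the function name `dion_like_step`, the hyperparameter names `lr` and `mu`, and the exact form of the update are hypothetical, and the sharding and communication aspects are left out entirely. The right factor `Q` is carried over between steps, so each step performs only one power-iteration pass instead of a full decomposition.

```python
import numpy as np

def dion_like_step(X, G, M, Q, lr=0.01, mu=0.95, eps=1e-8):
    """One illustrative low-rank orthonormalized update (hypothetical sketch).

    X: weights (m x n), G: gradient (m x n),
    M: momentum / error-feedback buffer (m x n),
    Q: right factor reused from the previous step (n x r).
    """
    # Fold the new gradient into the buffer.
    B = M + G

    # One power-iteration pass, warm-started from the previous Q.
    # Reduced QR gives P orthonormal columns (m x r).
    P, _ = np.linalg.qr(B @ Q)
    R = B.T @ P  # (n x r)

    # Error feedback: remove only part of the captured low-rank component,
    # so whatever the rank-r approximation missed stays in the buffer.
    M_new = B - (1.0 - mu) * (P @ R.T)

    # Normalize columns of R to get the next right factor.
    Q_new = R / (np.linalg.norm(R, axis=0, keepdims=True) + eps)

    # Apply the low-rank, approximately orthonormal update.
    X_new = X - lr * (P @ Q_new.T)
    return X_new, M_new, Q_new


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, r = 8, 6, 2
    X = rng.standard_normal((m, n))
    M = np.zeros((m, n))
    Q = rng.standard_normal((n, r))
    for _ in range(3):
        G = rng.standard_normal((m, n))  # stand-in for a real gradient
        X, M, Q = dion_like_step(X, G, M, Q)
    print(X.shape, M.shape, Q.shape)
```

In this sketch only the thin factors `P` (m x r) and `R` (n x r) would need to be exchanged between workers, which is where a low-rank scheme of this kind would save communication relative to sending the full m x n matrix.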