Stochastic Mixture-of-Experts
This PyTorch package implements the method described in "Taming Sparsely Activated Transformer with Stochastic Experts".