Deep Compression, DSD Training and EIE: Deep Neural Network Model Compression, Regularization and Hardware Acceleration
- Song Han | Stanford University
- ECC 2010
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on mobile phones and embedded systems with limited hardware resources. To address this limitation, this talk first introduces “Deep Compression” that can compress the deep neural networks by 10x-49x without loss of prediction accuracy[1][2][5]. Then this talk will describe DSD, the “Dense-Sparse-Dense” training method that regularizes CNN/RNN/LSTMs to improve the prediction accuracy of a wide range of neural networks given the same model size[3]. Finally this talk will discuss EIE, the “Efficient Inference Engine” that works directly on the deep-compressed DNN model and accelerates the inference, taking advantage of weight sparsity, activation sparsity and weight sharing, which is 13x faster and 3000x more energy efficient than a TitanX GPU[4]. References: [1] Han et al. Learning both Weights and Connections for Efficient Neural Networks (NIPS’15) [2] Han et al. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (ICLR’16, best paper award) [3] Han et al. DSD: Regularizing Deep Neural Networks with Dense-Sparse-Dense Training (submitted to NIPS’16) [4] Han et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network (ISCA’16) [5] Iandola, Han et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and
-
-
Kenneth Tran
Principal Research Software Engineer
-
-
Series: Microsoft Research Talks
-
Decoding the Human Brain – A Neurosurgeon’s Experience
- Dr. Pascal O. Zinn
-
-
-
-
-
-
Challenges in Evolving a Successful Database Product (SQL Server) to a Cloud Service (SQL Azure)
- Hanuma Kodavalla,
- Phil Bernstein
-
Improving text prediction accuracy using neurophysiology
- Sophia Mehdizadeh
-
Tongue-Gesture Recognition in Head-Mounted Displays
- Tan Gemicioglu
-
DIABLo: a Deep Individual-Agnostic Binaural Localizer
- Shoken Kaneko
-
-
-
-
Audio-based Toxic Language Detection
- Midia Yousefi
-
-
From SqueezeNet to SqueezeBERT: Developing Efficient Deep Neural Networks
- Forrest Iandola,
- Sujeeth Bharadwaj
-
Hope Speech and Help Speech: Surfacing Positivity Amidst Hate
- Ashique Khudabukhsh
-
-
-
Towards Mainstream Brain-Computer Interfaces (BCIs)
- Brendan Allison
-
-
-
-
Learning Structured Models for Safe Robot Control
- Subramanian Ramamoorthy
-