How does Batch Normalization Help Optimization?
- Andrew Ilyas | MIT
Batch normalization (BatchNorm) is a widely adopted technique that enables faster and more stable training of deep neural networks. However, despite its pervasiveness, the exact reasons for BatchNorm’s effectiveness are still poorly understood.
In this talk, we take a closer look at the underpinnings of BatchNorm’s success. In particular, we examine the popular belief that BatchNorm’s effectiveness stems from its reduction of an effect called internal covariate shift (ICS). We then explore the connection between BatchNorm, ICS, and the optimization landscape of deep neural networks.
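For reference, BatchNorm normalizes each activation over the mini-batch and then applies a learned scale and shift. The sketch below is a minimal NumPy illustration of the training-time forward pass; the function and variable names are illustrative and not taken from the talk.

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Training-time BatchNorm over a mini-batch x of shape (N, D)."""
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize each feature
    return gamma * x_hat + beta            # learned scale (gamma) and shift (beta)

# Example: a random batch of 32 examples with 4 features
x = np.random.randn(32, 4) * 3.0 + 5.0
y = batchnorm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0), y.std(axis=0))       # approximately zero mean, unit std per feature
```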
Series: Microsoft Research Talks
Decoding the Human Brain – A Neurosurgeon’s Experience
- Dr. Pascal O. Zinn
Challenges in Evolving a Successful Database Product (SQL Server) to a Cloud Service (SQL Azure)
- Hanuma Kodavalla, Phil Bernstein
Improving text prediction accuracy using neurophysiology
- Sophia Mehdizadeh
Tongue-Gesture Recognition in Head-Mounted Displays
- Tan Gemicioglu
DIABLo: a Deep Individual-Agnostic Binaural Localizer
- Shoken Kaneko
Audio-based Toxic Language Detection
- Midia Yousefi
From SqueezeNet to SqueezeBERT: Developing Efficient Deep Neural Networks
- Forrest Iandola, Sujeeth Bharadwaj
Hope Speech and Help Speech: Surfacing Positivity Amidst Hate
- Ashique Khudabukhsh
Towards Mainstream Brain-Computer Interfaces (BCIs)
- Brendan Allison
Learning Structured Models for Safe Robot Control
- Subramanian Ramamoorthy