Internal Covariate Shift & BatchNorm

Smoothing the landscape: watch how internal covariate shift makes training a rocky climb, and how normalization turns it into a slide.
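To make the normalization step concrete, here is a minimal sketch in plain Python. It is not the code behind this demo: the function name `batch_norm`, the sample batches, and the scalar `gamma`, `beta`, and `eps` defaults are assumptions chosen for illustration. It shows that two batches whose scale has drifted by a factor of ten come out of normalization looking almost the same, which is the kind of shift the next layer would otherwise have to chase.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize scalar activations to zero mean and unit variance,
    then apply a scale (gamma) and shift (beta)."""
    n = len(batch)
    mean = sum(batch) / n
    # Biased (divide-by-n) variance, the usual choice during training.
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical activations from the same layer at two points in training:
# the second batch has drifted to ten times the scale of the first.
early = [0.9, 1.1, 1.0, 1.2, 0.8]
late = [9.0, 11.0, 10.0, 12.0, 8.0]

early_normed = batch_norm(early)
late_normed = batch_norm(late)

print(early_normed)
print(late_normed)
```

The two printed lists differ only slightly (the small `eps` term matters more for the low-variance batch), so the downstream layer sees a stable input distribution even though the raw activations moved.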

Learning rate: 0.08 (slider from Cautious to Aggressive)

Current Insight

A smooth landscape tolerates aggressive learning rates without the model "exploding" (diverging).
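The insight can be sketched with plain gradient descent on a one-dimensional quadratic, f(x) = 0.5 * c * x^2, where the curvature c stands in for how "rocky" the landscape is. This is an illustrative assumption, not the simulation this page runs: the function `descend`, the curvature values, and the use of 0.08 as the learning rate are all hypothetical choices. On this toy problem each step multiplies x by (1 - lr * c), so the iterates shrink only when lr < 2 / c.

```python
def descend(curvature, lr, start=1.0, steps=50):
    """Gradient descent on f(x) = 0.5 * curvature * x**2; returns the path."""
    x = start
    path = [x]
    for _ in range(steps):
        grad = curvature * x   # derivative of 0.5 * curvature * x**2
        x = x - lr * grad      # each step scales x by (1 - lr * curvature)
        path.append(x)
    return path

LR = 0.08  # assumed to match the value shown in the demo

# Gentle curvature: lr * c = 0.08, so x shrinks by a factor of 0.92 per step.
smooth = descend(curvature=1.0, lr=LR)

# Sharp curvature: lr * c = 2.4, so x is scaled by -1.4 per step and blows up.
sharp = descend(curvature=30.0, lr=LR)

print(abs(smooth[-1]))  # small: the particle settles near the minimum
print(abs(sharp[-1]))   # huge: the same learning rate "explodes"
```

The same learning rate converges on the gentle landscape and diverges on the sharp one, which is why a smoother landscape leaves room for more aggressive steps.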

Click anywhere to drop a particle

Watch it try to reach the center (the global minimum).

Target
Gradient Path