Regularization – Dropout, L2
Description
Regularization techniques help prevent overfitting by adding constraints or modifications to the learning process, improving a model's ability to generalize to unseen data.
- Dropout: Randomly "drops" (sets to zero) a fraction of neurons during training, forcing the network to learn redundant representations and reducing its dependence on any single neuron.
- L2 Regularization (Weight Decay): Adds a penalty proportional to the squared magnitude of model weights to the loss function, encouraging smaller weights and simpler models.
Combining dropout and L2 regularization often yields better generalization performance than using either alone.
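To make these two mechanisms concrete before the Keras example below, here is a minimal NumPy sketch of what a dropout mask and an L2 penalty actually do; the activation values, keep probability, and lambda are made-up illustrative numbers, not taken from any particular model.

import numpy as np

rng = np.random.default_rng(0)

# Dropout at training time: zero out units at random, then rescale the survivors
# ("inverted dropout") so the expected activation stays unchanged.
activations = np.array([0.9, 0.2, 0.7, 0.4])
keep_prob = 0.5                                    # corresponds to Dropout(0.5)
mask = rng.random(activations.shape) < keep_prob   # keep each unit with probability 0.5
dropped = activations * mask / keep_prob

# L2 regularization: add lambda * sum(w^2) to the task loss, penalizing large weights.
weights = np.array([0.8, -1.2, 0.05])
lam = 0.001                                        # regularization strength
data_loss = 0.35                                   # placeholder value for the task loss
total_loss = data_loss + lam * np.sum(weights ** 2)

print(dropped, total_loss)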

Examples
Example of implementing Dropout and L2 regularization in a Keras model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.regularizers import l2

input_dim = 784     # e.g. flattened 28x28 images; replace with your own feature count
num_classes = 10    # replace with your own number of output classes

model = Sequential([
    # L2 penalty (lambda = 0.001) on this layer's weights via kernel_regularizer
    Dense(128, activation='relu', kernel_regularizer=l2(0.001), input_shape=(input_dim,)),
    Dropout(0.5),   # randomly drop 50% of activations during training
    Dense(64, activation='relu', kernel_regularizer=l2(0.001)),
    Dropout(0.5),
    Dense(num_classes, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
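Note that Keras applies the Dropout layers only in training mode (model.fit does this automatically); at inference they are a no-op. A quick way to see this, using a randomly generated placeholder input, is:

import numpy as np
x = np.random.rand(1, input_dim).astype('float32')  # dummy input for illustration only
train_out = model(x, training=True)    # dropout active: repeated calls give different outputs
infer_out = model(x, training=False)   # dropout disabled, as during model.predict / evaluation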
Real-World Applications
Speech Recognition
Regularization helps models generalize well to varied accents and noisy audio inputs.
Image Classification
Dropout and L2 reduce overfitting on large image datasets with many parameters.
Search Engines
Regularization improves the relevance of ranking models by preventing overfitting to historical user queries.
Autonomous Vehicles
Regularization helps keep perception models robust when they are trained on limited driving data.
Resources
Recommended Reading
- Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron
- CS231n: Regularization (Stanford course notes)
Interview Questions
What is dropout and why is it used?
Dropout randomly disables a fraction of neurons during training to prevent co-adaptation, which helps reduce overfitting and improves generalization.
How does L2 regularization help prevent overfitting?
L2 regularization adds a penalty proportional to the sum of squared weights to the loss function, encouraging the model to keep weights small, which simplifies the model and reduces overfitting.
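A quick way to see why L2 regularization is also called weight decay: the gradient of the penalty lambda * sum(w^2) is 2*lambda*w, so a plain gradient-descent step shrinks every weight toward zero before applying the usual update. The check below uses made-up numbers purely for illustration.

import numpy as np

w = np.array([0.8, -1.2])          # current weights (illustrative values)
grad = np.array([0.1, -0.3])       # gradient of the data loss alone
lr, lam = 0.1, 0.01                # learning rate and regularization strength

w_l2 = w - lr * (grad + 2 * lam * w)           # step on the L2-regularized loss
w_decay = (1 - 2 * lr * lam) * w - lr * grad   # "decay the weights, then take the plain step"
assert np.allclose(w_l2, w_decay)              # both forms give the same update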
Can dropout and L2 regularization be used together?
Yes. Dropout and L2 regularization target overfitting through different mechanisms (dropout injects noise into the activations, L2 constrains weight magnitudes), so combining them often reduces overfitting more effectively than either technique alone.