# Overfitting
Overfitting occurs when your machine learning model performs very well on training data but poorly on unseen (validation/test) data. Essentially, the model learns the training data “too well,” capturing noise rather than general patterns.
There are several signs that indicate overfitting:

- High accuracy on the training data but noticeably lower accuracy on the validation data
- Validation loss that starts increasing after some epochs even though the training loss keeps decreasing
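As a rough illustration of the second sign, the divergence between the two loss curves can be checked directly from recorded loss histories. The function name, window size, and loss values below are illustrative assumptions, not part of the Keras API:

```python
def shows_overfitting(train_loss, val_loss, window=3):
    """Heuristic check: training loss keeps falling while
    validation loss has risen over the last `window` epochs."""
    if len(train_loss) < window + 1:
        return False
    train_falling = train_loss[-1] < train_loss[-1 - window]
    val_rising = val_loss[-1] > val_loss[-1 - window]
    return train_falling and val_rising

# Example curves: training keeps improving, validation turns around
train = [1.0, 0.7, 0.5, 0.35, 0.25, 0.18]
val   = [1.1, 0.9, 0.8, 0.85, 0.95, 1.05]
print(shows_overfitting(train, val))  # True for these curves
```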
Keras provides several built-in methods to combat overfitting:
**Early Stopping**: Stops training once the monitored validation metric stops improving.
```python
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=5)
model.fit(X_train, y_train,
          epochs=50,
          validation_data=(X_val, y_val),
          callbacks=[early_stop])
```
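To make the `patience` parameter concrete, here is a minimal plain-Python sketch of the stopping logic (the function name and loss values are illustrative, not the Keras implementation):

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the epoch index at which training would stop:
    after `patience` consecutive epochs without a new best loss."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch  # training halts here
    return len(val_losses) - 1  # patience never exhausted

losses = [0.9, 0.7, 0.6, 0.65, 0.66, 0.7, 0.71, 0.72]
print(early_stopping_epoch(losses, patience=3))  # stops at epoch 5
```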
**Dropout**: Randomly sets a fraction of input units to 0 at each update during training time, which helps prevent overfitting.
```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])
```
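Conceptually, dropout with rate 0.5 zeroes each unit with probability 0.5 and, in the inverted-dropout formulation Keras uses, rescales the survivors by 1/(1 - rate) so the expected activation is unchanged; at inference time it is the identity. A rough pure-Python sketch of that behaviour (illustrative names, not the Keras internals):

```python
import random

def dropout(units, rate=0.5, training=True, rng=random.random):
    """Inverted dropout: zero each unit with probability `rate`,
    rescale survivors by 1/(1-rate); identity at inference time."""
    if not training:
        return list(units)
    keep = 1.0 - rate
    return [u / keep if rng() >= rate else 0.0 for u in units]

random.seed(0)
acts = [0.2, 0.8, 0.5, 0.1]
print(dropout(acts, rate=0.5))                   # some entries zeroed, rest doubled
print(dropout(acts, rate=0.5, training=False))   # unchanged at inference
```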
**L1/L2 Regularization**: Adds a penalty on the size of the weights to the loss function.
```python
import tensorflow as tf
from tensorflow.keras import regularizers

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu',
                          kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.Dense(10, activation='softmax')
])
```
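The `l2(0.01)` regularizer adds `0.01 * sum(w**2)` over the layer's kernel weights to the training loss, which pushes weights toward small values. A one-function sketch of the penalty term (pure Python with made-up numbers, not the Keras internals):

```python
def l2_penalty(weights, lam=0.01):
    """L2 regularization term added to the loss:
    lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

weights = [0.5, -1.0, 2.0]
base_loss = 0.3
total_loss = base_loss + l2_penalty(weights)
print(total_loss)  # 0.3 + 0.01 * (0.25 + 1.0 + 4.0) = 0.3525
```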
**Data Augmentation**: Increases the diversity of your training set by applying random transformations to the training data.
```python
import tensorflow as tf

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])

model = tf.keras.Sequential([
    data_augmentation,
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax')
])
```
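To see what a transformation like `RandomFlip("horizontal")` does to the data, here is a pure-Python sketch that mirrors a nested-list "image" left-to-right with 50% probability (function name and the injectable `rng` parameter are illustrative assumptions, not the Keras implementation):

```python
import random

def random_flip_horizontal(image, rng=random.random):
    """Mirror each row left-to-right with probability 0.5."""
    if rng() < 0.5:
        return [row[::-1] for row in image]
    return image

img = [[1, 2, 3],
       [4, 5, 6]]
print(random_flip_horizontal(img, rng=lambda: 0.0))  # flipped: [[3, 2, 1], [6, 5, 4]]
print(random_flip_horizontal(img, rng=lambda: 0.9))  # unchanged
```

Because the flip is random, each pass over the training set can present a different view of the same example, which is what increases the effective diversity of the data.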