Overfitting

Overfitting occurs when your machine learning model performs very well on training data but poorly on unseen (validation/test) data. Essentially, the model learns the training data “too well,” capturing noise rather than general patterns.

Two common signs indicate overfitting:

High accuracy on training data but noticeably lower accuracy on validation data
Validation loss that starts increasing after some epochs while training loss keeps decreasing
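
The second sign can be checked directly from recorded loss curves. The following is a minimal, framework-independent sketch; the function name `first_overfit_epoch` and the sample loss values are made up for illustration. With Keras, the two lists would typically come from `history.history['loss']` and `history.history['val_loss']` on the object returned by `model.fit`.

```python
def first_overfit_epoch(train_loss, val_loss):
    """Return the index of the first epoch where validation loss rises
    while training loss still falls, or None if that never happens."""
    for epoch in range(1, min(len(train_loss), len(val_loss))):
        train_improved = train_loss[epoch] < train_loss[epoch - 1]
        val_worsened = val_loss[epoch] > val_loss[epoch - 1]
        if train_improved and val_worsened:
            return epoch
    return None


# Illustrative values only (not measured from a real model).
train_loss = [0.90, 0.60, 0.40, 0.30, 0.20]
val_loss = [0.95, 0.70, 0.55, 0.60, 0.70]
print(first_overfit_epoch(train_loss, val_loss))  # 3
```

A single noisy epoch can trigger this check, so in practice it is more of a hint to inspect the curves than a definitive test.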

Keras provides several built-in methods to combat overfitting:

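Early Stopping: Stops training once the monitored validation metric has not improved for a set number of consecutive epochs (the patience).

from tensorflow.keras.callbacks import EarlyStopping

# Stop when val_loss has not improved for 5 consecutive epochs.
early_stop = EarlyStopping(monitor='val_loss', patience=5)

# Assumes model, X_train, y_train, X_val and y_val are already defined.
model.fit(X_train, y_train, epochs=50,
          validation_data=(X_val, y_val),
          callbacks=[early_stop])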
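
To make the role of `patience` concrete, here is a simplified sketch of the bookkeeping behind early stopping. It is not the Keras implementation: the function `early_stopping_epoch` is hypothetical, the loss values are invented, and options such as a minimum improvement threshold are ignored.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the index of the epoch after which training would stop,
    or None if the patience budget is never exhausted."""
    best = float("inf")
    wait = 0  # number of epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Illustrative values only: the best loss occurs at epoch 2.
val_losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.7, 0.8, 0.9]
print(early_stopping_epoch(val_losses, patience=5))  # 7
```

Note that improvement is measured against the best value seen so far, so the small recoveries at epochs 4 and 5 do not reset the counter.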

Dropout: Randomly sets a fraction of its input units to 0 at each training step, so the network cannot rely on any single unit. The layer is only active during training.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),  # drop 50% of the previous layer's outputs
    tf.keras.layers.Dense(10, activation='softmax')
])
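
What the dropout layer does to its inputs can be sketched in plain Python. This is an illustrative version of inverted dropout, where the kept values are scaled by 1 / (1 - rate) so that their expected sum is unchanged (the Keras layer applies the same scaling). The function name `dropout` and its parameters are assumptions made for this example.

```python
import random


def dropout(values, rate, training=True, rng=None):
    """Apply inverted dropout to a list of floats."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time the inputs pass through unchanged.
        return list(values)
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - rate)
    # Each unit is zeroed with probability `rate`, otherwise scaled up.
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


activations = [1.0, 2.0, 3.0, 4.0]
print(dropout(activations, rate=0.5, rng=random.Random(0)))
print(dropout(activations, rate=0.5, training=False))  # [1.0, 2.0, 3.0, 4.0]
```

With rate=0.5, every surviving activation is doubled, and a different random subset is zeroed on each call.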

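L1/L2 Regularization: Adds a penalty on the magnitude of a layer's weights to the loss function. L1 penalizes the sum of absolute values, L2 the sum of squares.

import tensorflow as tf
from tensorflow.keras import regularizers

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu',
                          # L2 penalty with factor 0.01 on this layer's kernel
                          kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.Dense(10, activation='softmax')
])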
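
The penalty terms themselves are simple to compute by hand. The sketch below shows how an L1 or L2 term would be added to a data loss; the helper names, the weights, and the data loss value are all hypothetical. The L2 form, factor times the sum of squared weights, matches what `regularizers.l2(0.01)` computes.

```python
def l1_penalty(weights, factor):
    """factor * sum of absolute weight values."""
    return factor * sum(abs(w) for w in weights)


def l2_penalty(weights, factor):
    """factor * sum of squared weight values."""
    return factor * sum(w * w for w in weights)


weights = [0.5, -1.0, 2.0]   # illustrative weights
data_loss = 0.30             # hypothetical loss on the training batch

# The optimizer minimizes the data loss plus the penalty.
total_loss = data_loss + l2_penalty(weights, 0.01)
print(total_loss)  # about 0.3525
```

Because the L2 term grows with the square of each weight, large weights are penalized disproportionately, which pushes the model toward smaller, smoother solutions.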

Data Augmentation: Increases the diversity of your training set by applying random transformations to the training data.

import tensorflow as tf

# Random transformations; these layers are only active during training.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])

model = tf.keras.Sequential([
    data_augmentation,
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax')
])
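
As a rough picture of what a random horizontal flip does, the following plain-Python sketch operates on a tiny image stored as a list of rows. The functions `flip_horizontal` and `random_flip` are written for this example and are not part of Keras.

```python
import random


def flip_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def random_flip(image, rng, probability=0.5):
    """Flip the image with the given probability, else return a copy."""
    if rng.random() < probability:
        return flip_horizontal(image)
    return [list(row) for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_horizontal(image))  # [[3, 2, 1], [6, 5, 4]]

rng = random.Random(42)
# Each pass over the data sees a freshly sampled variant of the same image.
variants = [random_flip(image, rng) for _ in range(4)]
```

Since the transformation is resampled every time, the model rarely sees exactly the same input twice, which makes memorizing individual training examples harder.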