Overfitting

Overfitting occurs when your machine learning model performs very well on training data but poorly on unseen (validation/test) data. Essentially, the model learns the training data “too well,” capturing noise rather than general patterns.

Two common signs of overfitting are:

  • High accuracy on training data but lower accuracy on validation data

  • Increasing validation loss after some epochs even though training loss decreases
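
The second sign can be checked programmatically. The following is a minimal sketch, not part of Keras: the helper name `first_overfit_epoch`, its `window` parameter, and the loss values are all assumptions made for illustration. In practice the two lists would typically come from `history.history['loss']` and `history.history['val_loss']`, where `history` is the object returned by `model.fit`.

```python
def first_overfit_epoch(train_loss, val_loss, window=3):
    """Return the 0-based epoch after which validation loss rises for
    `window` consecutive epochs while training loss keeps falling.
    Returns None if no such stretch exists. (Hypothetical helper.)"""
    for start in range(len(val_loss) - window):
        val_rising = all(val_loss[i + 1] > val_loss[i]
                         for i in range(start, start + window))
        train_falling = all(train_loss[i + 1] < train_loss[i]
                            for i in range(start, start + window))
        if val_rising and train_falling:
            return start
    return None


# Made-up loss curves, for illustration only.
train = [0.90, 0.70, 0.55, 0.45, 0.38, 0.32, 0.27]
val = [0.95, 0.80, 0.70, 0.68, 0.71, 0.75, 0.80]

print(first_overfit_epoch(train, val))  # epoch with the lowest validation loss
```

A real training run is noisier than this example, so a larger `window` or a smoothed curve may be needed before drawing conclusions.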

Keras provides several built-in methods to combat overfitting:

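  1. Early Stopping: Stops training once the monitored validation metric has not improved for a set number of epochs (the patience).

    from tensorflow.keras.callbacks import EarlyStopping

    # Stop once val_loss has not improved for 5 consecutive epochs.
    early_stop = EarlyStopping(monitor='val_loss', patience=5)

    # Assumes `model` is already compiled and the training/validation arrays exist.
    model.fit(X_train, y_train, epochs=50,
              validation_data=(X_val, y_val),
              callbacks=[early_stop])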
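
To make the role of `patience` concrete, here is a simplified sketch of the stopping rule in plain Python. It is an approximation of the idea, not the Keras source: the function name `early_stopping_epoch` and the loss values are assumptions for illustration.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None
    if the patience is never exhausted. (Hypothetical helper.)"""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: the best value is reached at epoch 2.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73]
print(early_stopping_epoch(losses, patience=3))
```

With `patience=3`, three epochs without improvement follow the best epoch, so the sketch stops at the last epoch in the list; with the default `patience=5` it would run to the end.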
    
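  2. Dropout: Randomly sets a fraction of the layer's input units to 0 at each update during training, which discourages the network from relying on any single unit. The layer is inactive at inference time.

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu'),
        # Zero out 50% of the previous layer's outputs during training.
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(10, activation='softmax')
    ])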
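
The effect of a dropout layer on a single vector of activations can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not the Keras implementation: the function name `dropout` and its `rng` parameter are hypothetical. It does mirror one documented behavior of the Keras layer, namely that the units that are kept are scaled up by `1 / (1 - rate)`.

```python
import random


def dropout(values, rate, training=True, rng=random):
    """Zero each value with probability `rate` and scale the survivors
    by 1 / (1 - rate). Requires 0 <= rate < 1. (Hypothetical helper.)"""
    if not training or rate == 0.0:
        return list(values)  # dropout does nothing at inference time
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # seeded only so the sketch is repeatable
activations = [1.0] * 1000
dropped = dropout(activations, rate=0.5, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)
print(round(zero_fraction, 2))  # close to the requested rate
```

The scaling keeps the expected sum of the activations unchanged, which is why no correction is needed when the layer is switched off at inference.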
    
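  3. L1/L2 Regularization: Adds a penalty on the magnitude of a layer's weights to the loss function. L1 penalizes the sum of absolute values; L2 penalizes the sum of squares.

    import tensorflow as tf
    from tensorflow.keras import regularizers

    model = tf.keras.Sequential([
        # Add 0.01 * sum(w ** 2) over this layer's kernel weights to the loss.
        tf.keras.layers.Dense(128, activation='relu',
                              kernel_regularizer=regularizers.l2(0.01)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])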
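
The penalty terms themselves are simple to compute by hand. The sketch below uses made-up weights and a made-up data loss; the helper names `l1_penalty` and `l2_penalty` are assumptions for illustration and are not Keras functions.

```python
def l1_penalty(weights, factor=0.01):
    """L1 term: factor times the sum of absolute weight values."""
    return factor * sum(abs(w) for w in weights)


def l2_penalty(weights, factor=0.01):
    """L2 term: factor times the sum of squared weight values."""
    return factor * sum(w * w for w in weights)


# Illustrative numbers only.
weights = [0.5, -1.0, 2.0]
data_loss = 0.30

total_loss = data_loss + l2_penalty(weights)
print(l1_penalty(weights), l2_penalty(weights), total_loss)
```

Because the L2 term grows with the square of each weight, the largest weight (2.0 here) contributes most of the penalty, which is what pushes the optimizer toward smaller weights.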
    
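  4. Data Augmentation: Increases the diversity of your training set by applying random transformations to the training data. The Keras preprocessing layers used here are active only during training and pass inputs through unchanged at inference.

    import tensorflow as tf

    data_augmentation = tf.keras.Sequential([
        tf.keras.layers.RandomFlip("horizontal"),  # mirror images left-to-right at random
        tf.keras.layers.RandomRotation(0.1),       # rotate by up to 10% of a full turn either way
    ])

    model = tf.keras.Sequential([
        data_augmentation,
        tf.keras.layers.Conv2D(32, 3, activation='relu'),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation='softmax')
    ])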
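
To show what a random transformation does to a single example, here is a plain-Python sketch of a random horizontal flip on a tiny image stored as a list of rows. The function name `random_horizontal_flip` and its `p` parameter are assumptions for illustration; this is not how the Keras layer is implemented.

```python
import random


def random_horizontal_flip(image, p=0.5, rng=random):
    """With probability `p`, return a left-to-right mirrored copy of
    `image` (a list of rows); otherwise return an unchanged copy.
    (Hypothetical helper.)"""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [list(row) for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]

always_flipped = random_horizontal_flip(image, p=1.0)
never_flipped = random_horizontal_flip(image, p=0.0)
print(always_flipped)
print(never_flipped)
```

Applying a fresh random draw each time an image is seen means the model rarely trains on exactly the same pixels twice, which is the source of the added diversity.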