Overfitting

What is overfitting? Overfitting is a common problem in machine learning: a model performs well on the training data but does not generalize well to unseen (test) data. An overfitted model has learned the training data too closely, including the noise in it, so it captures that noise rather than the underlying pattern and performs poorly on new data.
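
A quick way to see this in practice is to fit an overly flexible model to a small, noisy dataset and compare its error on the training points with its error on held-out points. The following is a minimal sketch, assuming scikit-learn and NumPy are available; the data and the polynomial degree are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Noisy samples from a simple underlying curve (sin), split into train and test.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=30)
X_train, y_train, X_test, y_test = X[:20], y[:20], X[20:], y[20:]

# A degree-15 polynomial has far more flexibility than the data needs,
# so it can fit the noise in the 20 training points.
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

print("train MSE:", mean_squared_error(y_train, model.predict(X_train)))
print("test  MSE:", mean_squared_error(y_test, model.predict(X_test)))
# The training error is typically close to zero while the test error is much
# larger; that gap is the overfitting described above.
```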

How to Prevent Overfitting

Here are some methods to prevent overfitting:

  1. Cross-validation: Cross-validation involves splitting the data into multiple subsets (folds), training the model on some folds, and testing it on the remaining fold. Rotating through the folds gives a more reliable estimate of how well the model generalizes to unseen data (see the cross-validation sketch after this list).

  2. Regularization: Regularization involves adding a penalty term to the loss function. This penalty term discourages the model from learning complex patterns in the data that may not generalize well to new data. L1 and L2 regularization are two common types, described by the following equations (see the regularization sketch after this list):

    • L1 regularization:

      $$L_1 = \lambda \sum_{i=1}^{n} |w_i|$$

    • L2 regularization:

      $$L_2 = \lambda \sum_{i=1}^{n} w_i^2$$

    where $w_i$ is the weight of the $i$-th feature and $\lambda$ is the regularization parameter.

  3. Early stopping: Early stopping involves monitoring the performance of the model on a validation set and stopping the training process when that performance stops improving (see the early-stopping sketch after this list).

  4. Dropout: Dropout randomly sets a fraction of the units to 0 at each update during training, which keeps the network from relying too heavily on any single unit and helps prevent overfitting (see the dropout sketch after this list).

  5. Reduce model complexity: Decrease the number of parameters in the model, for example the number of layers or the number of units in each layer.

  6. Data augmentation: Data augmentation involves creating new training examples by applying transformations to the existing data, such as rotating, flipping, or scaling images (see the augmentation sketch after this list).

  7. Ensemble learning: Ensemble learning involves training multiple models on different subsets of the data and combining their predictions to make a final prediction (see the ensemble sketch after this list).

  8. Feature selection: Feature selection involves selecting the most informative features in the data and discarding the rest. This reduces the complexity of the model and helps prevent overfitting (see the feature-selection sketch after this list).
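
The k-fold cross-validation in item 1 can be sketched as follows, assuming scikit-learn; the iris dataset and the logistic-regression model are placeholders for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation: train on 4 folds, evaluate on the held-out fold,
# and rotate so that every fold is used for evaluation once.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())
```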
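
For the regularization in item 2, scikit-learn's Lasso (L1) and Ridge (L2) correspond to the penalties above, with the alpha argument playing the role of $\lambda$; the synthetic dataset below is only for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=100, n_features=20, noise=10.0, random_state=0)

l1_model = Lasso(alpha=1.0).fit(X, y)   # penalty ~ lambda * sum(|w_i|)
l2_model = Ridge(alpha=1.0).fit(X, y)   # penalty ~ lambda * sum(w_i ** 2)

# L1 tends to drive some weights to exactly zero (a sparse model),
# while L2 shrinks all weights smoothly toward zero.
print("L1: zero weights:", (l1_model.coef_ == 0).sum())
print("L2: zero weights:", (l2_model.coef_ == 0).sum())
```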
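
The early stopping in item 3 can be written as a simple, framework-agnostic loop with a patience counter. In the sketch below the list of validation losses is simulated purely to make the example runnable; a real loop would train for one epoch and then evaluate on the validation set at each step.

```python
# Simulated validation losses, standing in for a real validation pass each epoch.
simulated_val_losses = [0.90, 0.70, 0.55, 0.50, 0.49, 0.50, 0.51, 0.52, 0.53, 0.54]

patience = 3                      # epochs to tolerate without improvement
best_loss = float("inf")
epochs_without_improvement = 0

for epoch, val_loss in enumerate(simulated_val_losses):
    # (a real loop would train for one epoch here, then compute val_loss)
    if val_loss < best_loss:
        best_loss = val_loss          # validation loss improved: reset the counter
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"early stopping at epoch {epoch}, best validation loss {best_loss}")
            break
```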
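
The dropout in item 4 is available as a layer in most deep-learning frameworks; here is a minimal sketch assuming PyTorch, with an arbitrary layer layout chosen for illustration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(100, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zeroes 50% of the activations at each training update
    nn.Linear(64, 10),
)

model.train()   # training mode: dropout is active
out_train = model(torch.randn(8, 100))

model.eval()    # evaluation mode: dropout is disabled and all units are used
out_eval = model(torch.randn(8, 100))
```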
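
For the image augmentations in item 6, a common approach is a transform pipeline; the sketch below assumes torchvision, and `image` is a hypothetical PIL image standing in for your own data.

```python
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),                          # random flip
    transforms.RandomRotation(degrees=15),                      # random rotation
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),   # random scale / crop
    transforms.ToTensor(),
])

# augmented = augment(image)  # `image` is a hypothetical PIL image; each call
#                             # yields a slightly different training example
```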
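
The ensembling in item 7 can be sketched with bagging, assuming scikit-learn: each decision tree is trained on a different bootstrap sample of the data, and the trees vote on the final prediction. The iris dataset is again a placeholder.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 50 trees, each trained on a different bootstrap subset of the training data;
# their predictions are combined by majority vote.
ensemble = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
print("mean CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```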
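
Finally, the feature selection in item 8 can be sketched with a univariate filter, assuming scikit-learn: SelectKBest keeps the k features that score highest against the target and discards the rest.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Keep only the 2 features most associated with the class label.
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)

print(X.shape, "->", X_reduced.shape)   # (150, 4) -> (150, 2)
```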