What is Overfitting and Regularization in Machine Learning?
Overfitting
The phenomenon where the training loss keeps decreasing but the test loss (or validation loss) stops decreasing, or even increases, is called overfitting.
Explanation
There is also a term called underfitting, which roughly means the opposite; frankly, though, it is a fairly meaningless term and not often used in practice.
A crucial point in machine learning is that a function trained on the available data must also work well on new data. Hence the term generalization performance, which refers to a model's performance on unseen data. To borrow an analogy from entrance exams: a student who scores perfectly on mock exams but performs poorly on the actual college entrance exam has, in effect, overfit to the mock exam questions. On the other hand, a student who scores well on mock exams and similarly well on the actual exam has good generalization performance.
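To see the definition in action, here is a minimal sketch (assuming NumPy; the target function, sample sizes, and polynomial degree are all illustrative choices). A degree-$9$ polynomial has enough capacity to pass through all $10$ training points, so the training error collapses toward zero while the test error stays large.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return np.sin(2 * np.pi * x)

# 10 noisy training points, 100 noisy test points from the same function
x_train = rng.uniform(0, 1, 10)
y_train = f(x_train) + rng.normal(0, 0.2, 10)
x_test = rng.uniform(0, 1, 100)
y_test = f(x_test) + rng.normal(0, 0.2, 100)

# A degree-9 polynomial can pass through all 10 training points
coeffs = np.polyfit(x_train, y_train, deg=9)

train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

print(f"train MSE: {train_mse:.4f}")  # near zero: the noise was memorized
print(f"test  MSE: {test_mse:.4f}")   # much larger: poor generalization
```

Lowering the degree, gathering more data, or applying one of the regularizers below would narrow the gap between the two errors.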
Regularization
Any method that modifies a learning algorithm with the aim of reducing the test loss (not the training loss) is called regularization.[^1]
Goodfellow defines regularization as “any modification we make to a learning algorithm that is intended to reduce its generalization error but not its training error.”
In other words, all methods for preventing overfitting are collectively called regularization. The first one usually encountered in machine learning or deep learning studies is dropout.
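Since dropout is the usual first encounter, here is a minimal sketch of the idea in its inverted-dropout form (assuming NumPy; the function name and defaults are illustrative). During training, each unit is zeroed with probability $p$ and the survivors are rescaled so the expected activation is unchanged, which prevents the network from relying too heavily on any single unit.

```python
import numpy as np

def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout: during training, zero each unit with probability p
    and scale the survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return activations  # at test time dropout is a no-op
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

h = np.ones((2, 4))       # a toy hidden-layer activation
print(dropout(h, p=0.5))  # about half the units zeroed, the rest scaled by 2
```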
Types
- $\ell_{1}$ regularization
- $\ell_{2}$ regularization (see the sketch after this list)
- Weight decay
- Early stopping
- Dropout
- Batch normalization
- Label smoothing
- Data augmentation
- Flooding
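To make one entry from the list concrete, here is a minimal ridge-regression sketch of $\ell_{2}$ regularization (assuming NumPy; `lam`, the data, and all names are illustrative). Adding the penalty $\lambda \| w \|^{2}$ to the squared-error loss shrinks the weights toward zero, limiting the model's freedom to fit noise.

```python
import numpy as np

def ridge_fit(X, y, lam=0.1):
    """Closed-form ridge regression: argmin_w ||Xw - y||^2 + lam * ||w||^2,
    i.e. w = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
w_true = np.array([1.0, -2.0, 0.0, 0.0, 3.0])
y = X @ w_true + rng.normal(0, 0.1, 50)

w_ols = ridge_fit(X, y, lam=0.0)   # ordinary least squares (no penalty)
w_l2 = ridge_fit(X, y, lam=10.0)   # penalized: weights shrink toward zero
print(np.linalg.norm(w_ols), np.linalg.norm(w_l2))  # second norm is smaller
```

With plain gradient descent, the same penalty is equivalent to weight decay: each update multiplies the weights by a factor slightly below $1$ before applying the data gradient.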
See Also
- Standardization: Usually refers to the process in statistics of adjusting the mean of the data to $0$ and the variance to $1$ (contrasted with normalization in the sketch after this list).
- Normalization: Typically refers to the process of placing data within a specific range.
- Regularization: Usually refers to the process to prevent overfitting in machine learning.
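To make the contrast between the first two terms concrete, a quick sketch (assuming NumPy; the data is illustrative):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])

# Standardization: shift and scale to mean 0, variance 1
z = (x - x.mean()) / x.std()

# (Min-max) normalization: map the values into the range [0, 1]
m = (x - x.min()) / (x.max() - x.min())

print(z.mean(), z.std())  # ~0.0, 1.0
print(m.min(), m.max())   # 0.0, 1.0
```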
[^1]: Ian Goodfellow, Yoshua Bengio, and Aaron Courville. (2016). Deep Learning. MIT Press.