Continual Learning in Deep Learning
Explanation
Continual learning in deep learning refers to an artificial neural network learning multiple tasks sequentially; the terms lifelong learning and incremental learning are used synonymously. Humans do not forget existing knowledge simply because they learn something new (they may forget over time, but not because new knowledge was acquired). Artificial neural networks, by contrast, show a marked drop in performance on previously learned tasks once they have sufficiently learned one task and then move on to a new one.
For instance, consider a model that has been successfully trained to classify the handwritten digits 0 through 9 from the MNIST training set. Suppose we want to extend the numeral system to hexadecimal and therefore train the model on the additional characters A, B, C, D, E, and F. During or after training on A through F, the accuracy on the previously learned digits 0 through 9 drops. This phenomenon is known as catastrophic forgetting. It occurs because weights learned for the old task are adjusted too aggressively while the new task is learned, a process termed semantic drift. As an illustration, if a particular weight $w_{j}$ plays a crucial role in distinguishing cats and it changes while a new task is being learned, the neural network becomes less capable of differentiating cats.
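The following is a minimal sketch of this effect using synthetic data and a small PyTorch MLP. The two tasks, the network size, and the hyperparameters are illustrative assumptions rather than a reproduction of the MNIST setup above; the point is only that training on the second task alone noticeably reduces accuracy on the first.

```python
# Minimal sketch of catastrophic forgetting on two synthetic tasks (PyTorch).
# The tasks, network size, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(feature, n=2000, dim=20):
    # Two-class task whose label depends on a single input feature;
    # choosing a different feature gives each task a different rule.
    x = torch.randn(n, dim)
    y = (x[:, feature] > 0).long()
    return x, y

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

def train(model, x, y, epochs=200, lr=0.01):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

x_a, y_a = make_task(feature=0)   # "task A" (think digits 0 to 9)
x_b, y_b = make_task(feature=1)   # "task B" (think characters A to F)

train(model, x_a, y_a)
print("task A accuracy after learning A:", accuracy(model, x_a, y_a))

train(model, x_b, y_b)            # continue training on the new task only
print("task A accuracy after learning B:", accuracy(model, x_a, y_a))
print("task B accuracy after learning B:", accuracy(model, x_b, y_b))
```

Running this, accuracy on task A is high after the first training phase and falls sharply after the second, even though task B itself is learned well: the weights that encoded task A have drifted to fit task B.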
Therefore, a primary concern in continual learning is to minimize catastrophic forgetting, so that learning new tasks does not degrade performance on previously learned ones. Research also addresses related goals, such as speeding up the learning of new tasks as knowledge accumulates, and keeping the number of parameters from growing significantly as new tasks are added.
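One widely cited family of mitigation methods adds a regularization term that discourages changes to weights deemed important for earlier tasks; Elastic Weight Consolidation (EWC, Kirkpatrick et al., 2017) is a representative example. The sketch below, which reuses the names from the previous snippet, is a simplified illustration of that idea: the importance estimate here is just the squared gradient of the task-A loss (a crude stand-in for the diagonal Fisher information that EWC uses), and the penalty weight `lam` is an arbitrary assumption.

```python
# Simplified sketch of a regularization-based remedy in the spirit of EWC:
# after task A, each parameter gets an importance score, and training on
# task B penalizes moving important parameters away from their task-A values.
import torch
import torch.nn as nn

def estimate_importance(model, x, y):
    # Importance of each parameter for the old task: squared gradient of
    # the old-task loss at the current (post-task-A) parameters.
    model.zero_grad()
    nn.CrossEntropyLoss()(model(x), y).backward()
    return {name: p.grad.detach() ** 2 for name, p in model.named_parameters()}

def ewc_penalty(model, anchor, importance):
    # Quadratic penalty anchoring important parameters near task-A values.
    total = 0.0
    for name, p in model.named_parameters():
        total = total + (importance[name] * (p - anchor[name]) ** 2).sum()
    return total

def train_with_penalty(model, x, y, anchor, importance,
                       lam=100.0, epochs=200, lr=0.01):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y) + lam * ewc_penalty(model, anchor, importance)
        loss.backward()
        opt.step()

# Usage, continuing the previous sketch right after training on task A:
#   anchor = {name: p.detach().clone() for name, p in model.named_parameters()}
#   importance = estimate_importance(model, x_a, y_a)
#   train_with_penalty(model, x_b, y_b, anchor, importance)
```

How much forgetting is avoided depends heavily on the penalty strength: a large `lam` preserves task A at the cost of slower adaptation to task B, which is the trade-off most regularization-based continual learning methods must balance.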