What is Mutation in Genetic Algorithms? 📂Optimization

What is Mutation in Genetic Algorithms?

Terms ¹

In the genetic algorithm, a mutation, or simply a variation, is an operator that mainly introduces (typically random) changes to an individual’s chromosome.

Description

If individuals are born only through crossover mixing, traits that did not exist in the initial random population can never appear. Mutation and natural selection are the two most important pillars in real evolution, and likewise in genetic algorithms mutation serves to maintain the population’s diversity and to introduce new traits.

Bit flip

$$ x_{k} ! = x_{k} $$ If the population is represented in bits as in $\left\{ 0 , 1 \right\}^{n}$, the simplest way to implement mutation is to randomly choose some positions and flip the bits.

Uniform replacement

$$ x_{k} = U [ a , b ] $$ If a chromosome represented by continuous coordinate values has a certain dimension bounded above and below so that it lies in the interval $[a , b]$, mutation can be implemented by first choosing which position to change (as in bit flip) and then drawing a random number from a uniform distribution $U [ a, b ]$ to replace the value.

Gaussian mutation

$$ \mathbf{x} += \mathcal{N} ( \mathbf{0} , \Sigma ) $$ When coordinates have no explicit bounds, a common implementation is to repeatedly add increments that follow a multivariate normal distribution, analogous to Brownian motion. However, because this method can produce different mutation strengths per coordinate, it is necessary to identify the hyperparameters $\Sigma$ that control these scales.

Lévy jumps

One can also induce mutations by moving the population via Lévy flights. Such mutations are useful when you want to explore the search space broadly or induce explosive, large jumps.

Typically one would use a multivariate Cauchy distribution, but a caveat is that whereas a vector of random variables that each follow a normal distribution corresponds to a multivariate normal distribution, a vector of random variables each following a t-distribution does not in general constitute a multivariate t-distribution. Therefore, to obtain a multivariate Cauchy one must explicitly set the degrees of freedom to $1$ in the multivariate t-distribution.

I myself once got stuck because I did not know this fact.

Kochenderfer. (2025). Algorithms for Optimization(2nd Edition): p164. ↩︎