What is natural selection in genetic algorithms?
Definition
In a genetic algorithm, natural selection, or simply selection, is the operation that chooses individuals with better values of the objective function.
Explanation 1
Because genetic algorithms are inspired by “survival of the fittest,” natural selection is the driving force of evolution itself and directly affects optimization performance. Here, selection means that the objective function $f$ we wish to maximize plays the role of the “environment,” and individuals well suited to $f$ survive, that is, are “selected.”
A typical implementation of natural selection is the roulette wheel method: compute the fitness $f \left( \mathbf{x}_{k} \right)$ of each individual $\mathbf{x}_{k}$ and choose individuals with probability increasing in their fitness. Concretely, assign each individual a weight called its expected value $E_{k}$ based on its fitness, sample a subset without replacement according to these weights to pass to the next generation, and eliminate the remainder.
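As a rough illustration of the procedure just described, here is a minimal sketch in Python with NumPy. The function name `roulette_select` and the assumption that all fitness values are nonnegative are mine, not part of the original description.

```python
import numpy as np

def roulette_select(fitness_values, n_survivors, rng=None):
    """Roulette wheel selection: sample survivors without replacement,
    with probabilities proportional to their (nonnegative) fitness."""
    rng = np.random.default_rng() if rng is None else rng
    f = np.asarray(fitness_values, dtype=float)
    # Expected values E_k: each individual's share of the total fitness.
    expected = f / f.sum()
    # Indices of the individuals passed on to the next generation.
    return rng.choice(len(f), size=n_survivors, replace=False, p=expected)

# Example: keep 2 of 4 individuals; the fittest are the most likely survivors.
# print(roulette_select([3.0, 1.0, 0.5, 2.5], n_survivors=2))
```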
Boltzmann selection
Let the cardinality of the solution set be $N$.
Definition of the softmax function: let $\mathbf{x} := (x_{1} , \cdots , x_{n}) \in \mathbb{R}^{n}$. With $\displaystyle \sigma_{j} ( \mathbf{x} ) = {{ e^{x_{j}} } \over {\sum_{i=1}^{n} e^{x_{i}} }}$, the map $\sigma : \mathbb{R}^{n} \to (0,1)^{n}$ defined by $\sigma ( \mathbf{x} ) := \left( \sigma_{1} (\mathbf{x}) , \cdots , \sigma_{n} (\mathbf{x} ) \right)$ is called the softmax function.
$$ E_{k} = {\frac{ e^{ f ( \mathbf{x}_{k} ) / T } }{ \sum_{i=1}^{N} e^{ f ( \mathbf{x}_{i} ) / T } }} , \qquad k = 1, \cdots , N $$ Since $f : X \to \mathbb{R}$ can in general take negative values, the simplest way to turn the fitness values into positive weights is to take a softmax. What differs from the plain softmax is the additional hyperparameter $T > 0$, called the temperature, which controls the selection pressure. The same idea appears in large language models (LLMs) such as ChatGPT, which drew attention in the 2020s. As the temperature $T$ increases, $\exp \left( f \left( \mathbf{x}_{k} \right) / T \right) \to 1$ for every $k$, so the selection probabilities approach the uniform distribution and the hyperparameter influences selection more than the fitness itself; conversely, when the temperature $T$ is low, the influence of the fitness becomes stronger.
The selection method obtained by introducing the softmax and the temperature in this way is called Boltzmann selection; the name, as one might guess, comes from the Boltzmann distribution. If roulette-wheel selection is to be used, this is the most natural and straightforward way to assign the weights.
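Below is a minimal sketch of Boltzmann selection under the same conventions as the sketch above. The names `boltzmann_weights` and `boltzmann_select` are illustrative, and subtracting the maximum before exponentiating is only a standard numerical-stability trick, not part of the definition.

```python
import numpy as np

def boltzmann_weights(fitness_values, T=1.0):
    """Expected values E_k = exp(f(x_k)/T) / sum_i exp(f(x_i)/T)."""
    f = np.asarray(fitness_values, dtype=float) / T
    f -= f.max()          # numerical stability only; does not change the weights
    w = np.exp(f)
    return w / w.sum()

def boltzmann_select(fitness_values, n_survivors, T=1.0, rng=None):
    """Roulette wheel selection driven by the Boltzmann weights above."""
    rng = np.random.default_rng() if rng is None else rng
    p = boltzmann_weights(fitness_values, T)
    return rng.choice(len(p), size=n_survivors, replace=False, p=p)

# Effect of the temperature on three fitness values, one of them negative:
# boltzmann_weights([-2.0, 0.0, 3.0], T=0.5)   -> almost all weight on the last individual
# boltzmann_weights([-2.0, 0.0, 3.0], T=100.0) -> nearly uniform (about 1/3 each)
```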
Rank selection
Rank selection assigns the expected values according to the rank, that is, the ordering, of the fitness values rather than using the raw fitness directly. Because it does not use the fitness values themselves and needs no separate hyperparameter, it can be applied consistently even when $f$ is poorly understood; for example, it is useful when many different instances of $f$ must be optimized, and it also somewhat mitigates extreme disparities of fitness within the population.
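For comparison, here is a sketch of one simple rank-selection variant, with the weight of an individual proportional to its rank and no extra hyperparameter, matching the description above; other ranking schemes that do introduce a selection-pressure parameter also exist.

```python
import numpy as np

def rank_select(fitness_values, n_survivors, rng=None):
    """Rank selection: the weights depend only on the ordering of the fitness."""
    rng = np.random.default_rng() if rng is None else rng
    f = np.asarray(fitness_values, dtype=float)
    # ranks[k] = 1 for the worst individual, ..., N for the best.
    ranks = np.empty(len(f))
    ranks[np.argsort(f)] = np.arange(1, len(f) + 1)
    # Expected values proportional to rank; the raw fitness never enters.
    expected = ranks / ranks.sum()
    return rng.choice(len(f), size=n_survivors, replace=False, p=expected)

# The weights are unchanged under any monotone transform of f:
# fitness values [0.1, 0.2, 100.0] get the same weights as [1, 2, 3].
```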
Mitchell, M. (1998). An Introduction to Genetic Algorithms. MIT Press. https://www.boente.eti.br/fuzzy/ebook-fuzzy-mitchell.pdf, pp. 124–127. ↩︎
