Softmax Function in Deep Learning
Definition
Let $\mathbf{x} := (x_{1} , \cdots , x_{n}) \in \mathbb{R}^{n}$.
With $\displaystyle \sigma_{j} ( \mathbf{x} ) = {{ e^{x_{j}} } \over {\sum_{i=1}^{n} e^{x_{i}} }}$, the function $\sigma : \mathbb{R}^{n} \to (0,1)^{n}$ defined by $\sigma ( \mathbf{x} ) := \left( \sigma_{1} (\mathbf{x}) , \cdots , \sigma_{n} (\mathbf{x} ) \right)$ is called the softmax function.
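As a concrete illustration, here is a minimal NumPy sketch of the definition above. The function name `softmax` and the subtraction of the maximum (a standard numerical-stability trick that does not change the result, since the factor $e^{-\max_i x_i}$ cancels in the ratio) are additions for illustration, not part of the definition.

```python
import numpy as np

def softmax(x):
    """Compute sigma(x)_j = exp(x_j) / sum_i exp(x_i) for a 1-D array x."""
    # Subtracting the maximum avoids overflow in exp without changing the output.
    z = np.exp(x - np.max(x))
    return z / np.sum(z)

x = np.array([2.0, 1.0, 0.1])
print(softmax(x))        # roughly [0.659, 0.242, 0.099]
print(softmax(x).sum())  # 1.0
```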
Explanation
The softmax function is an activation function characterized by having $\mathbb{R}^{n}$ as its domain; it normalizes the components of its input vector. For any $\mathbf{x} \in \mathbb{R}^{n}$, every component of $\sigma ( \mathbf{x} )$ lies between $0$ and $1$, and the components sum to exactly $1$.
Because of this property, the output of the softmax behaves like a probability distribution, and in practice it is convenient for solving classification problems when implementing artificial neural networks.
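For instance, in a classification network the raw outputs (logits) of the last layer can be passed through the softmax and read as class probabilities. The sketch below is a hypothetical illustration of this use; the logit values are made up, and `softmax` is the same stable implementation as in the earlier sketch.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))  # stable softmax, as in the sketch above
    return z / np.sum(z)

# Hypothetical logits produced by the final layer of a classifier
logits = np.array([1.3, -0.8, 2.1, 0.2])
probs = softmax(logits)

print(probs)                  # each component lies in (0, 1)
print(probs.sum())            # sums to 1, so it reads like a probability vector
print(int(np.argmax(probs)))  # index of the most probable class -> 2
```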