Softmax Function in Deep Learning

Definition

Let $\mathbf{x} := (x_{1} , \cdots , x_{n}) \in \mathbb{R}^{n}$.

For $\displaystyle \sigma_{j} ( \mathbf{x} ) = \frac{ e^{x_{j}} }{ \sum_{i=1}^{n} e^{x_{i}} }$, the function $\sigma : \mathbb{R}^{n} \to (0,1)^{n}$ defined by $\sigma ( \mathbf{x} ) := \left( \sigma_{1} (\mathbf{x}) , \cdots , \sigma_{n} (\mathbf{x}) \right)$ is called the softmax.
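
As a minimal NumPy sketch of this definition (subtracting the maximum is a standard numerical-stability trick, not part of the definition itself; it does not change the result, since the shift cancels between numerator and denominator):

```python
import numpy as np

def softmax(x):
    """Componentwise softmax of a vector x, as defined above."""
    # Subtracting max(x) avoids overflow in exp(); the common factor
    # e^{-max(x)} cancels between numerator and denominator.
    e = np.exp(x - np.max(x))
    return e / e.sum()
```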

Explanation

The softmax function is a type of activation function whose domain is $\mathbb{R}^{n}$; it is used to normalize an input vector. For any $\mathbf{x} \in \mathbb{R}^{n}$, every component of $\sigma ( \mathbf{x} )$ lies strictly between $0$ and $1$, and the components sum to exactly $1$: since $e^{x_{i}} > 0$ for every $i$, each $\sigma_{j} ( \mathbf{x} )$ is positive and smaller than $1$, and $\displaystyle \sum_{j=1}^{n} \sigma_{j} ( \mathbf{x} ) = \frac{ \sum_{j=1}^{n} e^{x_{j}} }{ \sum_{i=1}^{n} e^{x_{i}} } = 1$.
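
For example, for $\mathbf{x} = (0, 1)$ we get $\sigma ( \mathbf{x} ) = \left( \frac{1}{1+e} , \frac{e}{1+e} \right) \approx (0.269 , 0.731)$, and indeed $0.269 + 0.731 = 1$.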

Because of this property, the output of the softmax can be read as a probability distribution over $n$ classes. In practice, it is therefore commonly applied to the last layer of an artificial neural network when solving classification problems, as in the sketch below.
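
For instance, the raw scores (logits) produced by a network's last layer can be converted into class probabilities, and the most probable class taken as the prediction (a minimal sketch reusing the `softmax` function above; the logit values are made up):

```python
logits = np.array([1.3, -0.8, 0.2])      # made-up raw scores for 3 classes
probs = softmax(logits)                  # ~ [0.687, 0.084, 0.229], sums to 1
predicted_class = int(np.argmax(probs))  # predict the most probable class: 0
```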