What is One-Hot Encoding in Machine Learning?
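Given a set $X \subset \mathbb{R}^{n}$, suppose its subsets $X_{i}$ satisfy the following.
$$
X = X_{1} \cup \cdots \cup X_{N} \qquad X_{i} \cap X_{j} = \varnothing \quad (i \ne j)
$$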
Let $\beta = \left\{ e_{1}, \dots, e_{N} \right\}$ be the standard basis of $\mathbb{R}^{N}$. Then the following function $f$, or the act of mapping $x \in X$ under it, is called one-hot encoding.
$$
\begin{align*}
f : X &\to \beta \\
x &\mapsto e_{i} \quad \text{if } x \in X_{i}
\end{align*}
$$
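The definition can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names `standard_basis_vector` and `one_hot` and the toy subsets are hypothetical, and the subsets are assumed to be disjoint so that each $x$ belongs to exactly one of them.

```python
def standard_basis_vector(i, n):
    """Return e_i in R^n as a list, with i counted from 1 as in the text."""
    if not 1 <= i <= n:
        raise ValueError("index i must satisfy 1 <= i <= n")
    return [1 if k == i else 0 for k in range(1, n + 1)]


def one_hot(x, subsets):
    """Map x to e_i, where X_i is the subset that contains x."""
    n = len(subsets)
    for i, subset in enumerate(subsets, start=1):
        if x in subset:
            return standard_basis_vector(i, n)
    raise ValueError("x does not belong to any of the given subsets")


# Hypothetical toy example: X = {0, ..., 5} split into N = 3 disjoint subsets.
subsets = [{0, 1}, {2, 3}, {4, 5}]
encoded = one_hot(3, subsets)  # 3 lies in X_2, so the result is e_2
```

Whatever the input, the result has exactly one entry equal to 1, which is where the name comes from.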
It is a commonly used method for labeling data in machine learning. The name one-hot comes from the fact that only one element of each label vector is non-zero. The purpose of the mapping is to treat data labels as qualitative variables rather than quantitative ones. Imagine assigning the label $[1]$ to a picture of clothes and $[2]$ to a picture of shoes. The two pictures do not differ by a factor of 2 in any meaningful sense, yet the labels carry exactly that implication. Moreover, if the predicted value is $[5]$, it is ambiguous whether this should be read as closer to $[1]$ or to $[2]$, or as a failed prediction.

Using labels such as $\begin{bmatrix} 1 & 0 \end{bmatrix}^{T}$ and $\begin{bmatrix} 0 & 1 \end{bmatrix}^{T}$ instead keeps unintended meanings from being attached to the labels and restricts the values to the intended range. Accordingly, $N = \left| \beta \right|$ is the number of classes into which the data is classified.
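One way to see the difference is to compare distances between labels. The sketch below is an illustration rather than part of the original argument: it adds a hypothetical third class, "hats", so that the ordering imposed by integer labels becomes visible, and the helper names are made up for this example.

```python
import math


def distance(u, v):
    """Euclidean distance between two label vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(labels):
    """Distance between every pair of distinct class labels."""
    names = sorted(labels)
    return {
        (a, b): distance(labels[a], labels[b])
        for idx, a in enumerate(names)
        for b in names[idx + 1:]
    }


# Integer labels; "hats" is a hypothetical third class added for illustration.
integer_labels = {"clothes": [1], "shoes": [2], "hats": [3]}
# One-hot labels for the same three classes.
one_hot_labels = {"clothes": [1, 0, 0], "shoes": [0, 1, 0], "hats": [0, 0, 1]}

integer_dist = pairwise_distances(integer_labels)
one_hot_dist = pairwise_distances(one_hot_labels)
```

With integer labels, clothes ends up twice as far from hats as from shoes, although nothing about the pictures justifies that. With one-hot labels, every pair of distinct classes is the same distance apart, $\sqrt{2}$.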
For instance, one-hot encoding the MNIST data is as follows.
$$
\begin{align*}
\cdots &\mapsto e_{2} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \end{bmatrix}^{T} \\
\cdots &\mapsto e_{3} = \begin{bmatrix} 0 & 0 & 1 & \cdots & 0 \end{bmatrix}^{T} \\
\cdots &\mapsto e_{10} = \begin{bmatrix} 0 & 0 & 0 & \cdots & 1 \end{bmatrix}^{T}
\end{align*}
$$
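For labels given as digits, the encoding might look like the following sketch. It assumes the convention that digit $d$ is sent to $e_{d+1}$ (so digit 0 goes to $e_{1}$ and digit 9 to $e_{10}$); the function names and sample labels are hypothetical. The `decode` helper shows the usual way back from a vector of scores to a class, by taking the position of the largest entry.

```python
NUM_CLASSES = 10  # MNIST has ten digit classes, 0 through 9


def encode_digit(d):
    """Assumed convention: digit d maps to e_{d+1}, so index d holds the 1."""
    if not 0 <= d < NUM_CLASSES:
        raise ValueError("digit must be between 0 and 9")
    return [1 if k == d else 0 for k in range(NUM_CLASSES)]


def decode(scores):
    """Return the digit whose coordinate in `scores` is largest (argmax)."""
    return max(range(len(scores)), key=lambda k: scores[k])


# Hypothetical digit labels for three images.
labels = [1, 2, 9]
encoded = [encode_digit(d) for d in labels]  # e_2, e_3, e_10 under the assumed convention

# Hypothetical model output: the largest score sits at index 7.
scores = [0.01, 0.02, 0.05, 0.02, 0.1, 0.03, 0.02, 0.7, 0.03, 0.02]
predicted = decode(scores)
```

Decoding by the largest entry always yields one of the ten classes, so a prediction cannot fall outside the intended range.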