### One-hot encodings for machine learning

In this post, we’re going to discuss *one-hot encoding* and how we make use of it in machine learning.

In previous posts, we've talked about how labels for images in Keras were actually
*one-hot encoded vectors*. Let’s discuss exactly what this means.

### Labels

We know that when we’re training a neural network via supervised learning, we pass labeled input to our model, and the model gives us a predicted output.

If our model is an image classifier, for example, we may be passing labeled images of animals as input. When we do this, the model usually isn’t interpreting these labels as words, like *dog* or *cat*. Likewise, the predictions that our model outputs typically aren’t words like *dog* or *cat* either. Instead, most of the time our labels are encoded, so they take the form of an integer or a vector of integers.
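To make the integer form concrete, here is a minimal sketch in plain Python. The label list and the names `labels`, `label_to_index`, and `encoded` are made up for illustration, and numbering labels in order of first appearance is just one possible convention, not what any particular library does.

```python
# Hypothetical labels for four images; not taken from a real dataset.
labels = ["dog", "cat", "cat", "dog"]

# Give each distinct label an integer, in order of first appearance.
label_to_index = {}
for label in labels:
    if label not in label_to_index:
        label_to_index[label] = len(label_to_index)

# Replace every word label with its integer.
encoded = [label_to_index[label] for label in labels]

print(label_to_index)  # {'dog': 0, 'cat': 1}
print(encoded)         # [0, 1, 1, 0]
```

The model would then work with `0` and `1` rather than the words themselves.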

### Hot and cold values

One type of *encoding* that is widely used to represent categorical data with numerical values is called *one-hot encoding*.

One-hot encodings transform our categorical labels into vectors of `0`s and `1`s. The length of these vectors is the number of classes or categories that our model is expected to classify.

| Value | Interpretation |
|---|---|
| 0 | Cold |
| 1 | Hot |

#### Vectors of 0s and 1s

If we were classifying whether images were either of a dog or of a cat, then the one-hot encoded vectors corresponding to these classes would each be of length `2`, reflecting the two categories.

If we added another category, like lizard, so that we could then classify whether images were of dogs, cats, or lizards, then our corresponding one-hot encoded vectors would each be of length `3`, since we now have three categories.

Alright, so we know the labels are transformed, or *encoded*, into vectors. We know that each of these vectors has a length equal to the number of output categories, and we briefly mentioned that the vectors contain `0`s and `1`s. Let’s go into further detail on this last piece.

### One-hot encodings for multiple categories

Let’s stick with the example of classifying images as being either of a *cat*, *dog*, or *lizard*. With each of the corresponding vectors for these categories being of length `3`, we can think of each index, or element, within the vector as corresponding to one of the three categories.

Let’s say for this example that the cat label corresponds to the first element, dog corresponds to the second element, and lizard corresponds to the third element.

With each of these categories having its own *place* in the corresponding vectors, we can now discuss the intuition behind the name *one-hot*.

With each one-hot encoded vector, every element will be a zero except for the element that corresponds to the actual category of the given input. This element will be a *hot one*.

*One* of the indices of the vector is *hot*!
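A minimal sketch of this idea in plain Python might look like the following. The helper name `one_hot` is hypothetical, not a library function; it starts with all cold zeros and then switches on a single element.

```python
def one_hot(index, num_classes):
    """Return a list of num_classes zeros with a single 1 at position index."""
    if not 0 <= index < num_classes:
        raise ValueError("index must be between 0 and num_classes - 1")
    vector = [0] * num_classes  # start with every element cold
    vector[index] = 1           # make exactly one element hot
    return vector

vector = one_hot(1, 3)
print(vector)       # [0, 1, 0]
print(sum(vector))  # 1, since exactly one element is hot
```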

Sticking with our same example, recall we said that a cat corresponded to the first element, dog to the second, and lizard to the third, so the corresponding one-hot encoded vectors for each of these categories would look like this.

| Label | Index-0 | Index-1 | Index-2 |
|---|---|---|---|
| Cat | 1 | 0 | 0 |
| Dog | 0 | 1 | 0 |
| Lizard | 0 | 0 | 1 |

For cat, we see that the first element is a one and the next two elements are zeros. This is because each element within the vector is a zero except for the element that corresponds to the actual category, and we said that the cat category corresponded to the first element.

#### One vector for each category

Similarly, for dog, we see that the second element is a one, while the first and third elements are zeros. Lastly, for lizard, the third element is a one, while the first and second elements are zeros.

We can see that each time the model receives input that is a cat, it’s not interpreting the label as the word *cat*, but instead is interpreting the label as the vector `[1,0,0]`.

For images labeled as dog, the model is interpreting the dog label as the vector `[0,1,0]`, and for images labeled as lizard, the model is interpreting the label as the vector `[0,0,1]`.

| Label | Vector |
|---|---|
| Cat | [1,0,0] |
| Dog | [0,1,0] |
| Lizard | [0,0,1] |
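We can reproduce this mapping with a short sketch in plain Python. The class order in `classes` follows the assumption we made for this example (cat, dog, lizard), while the `encode` helper and the batch of labels are hypothetical names and data used only for illustration.

```python
# Class order assumed for this example: cat, dog, lizard.
classes = ["cat", "dog", "lizard"]

def encode(label, classes):
    """One-hot encode label using its position in classes."""
    if label not in classes:
        raise ValueError(f"unknown label: {label}")
    return [1 if name == label else 0 for name in classes]

# Hypothetical labels for a small batch of images.
batch_labels = ["dog", "cat", "lizard", "dog"]
batch_vectors = [encode(label, classes) for label in batch_labels]

for label, vector in zip(batch_labels, batch_vectors):
    print(label, vector)
# dog [0, 1, 0]
# cat [1, 0, 0]
# lizard [0, 0, 1]
# dog [0, 1, 0]
```

Every image with the same label gets the same vector, which is what the model sees in place of the word.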

Just for clarity, say we add another category, llama, to the mix. Now we have four categories in total, so each one-hot encoded vector corresponding to these categories will be of length `4`.

The vectors will now look like this.

| Label | Vector |
|---|---|
| Cat | [1,0,0,0] |
| Dog | [0,1,0,0] |
| Lizard | [0,0,1,0] |
| Llama | [0,0,0,1] |

We can see that for each of our pre-existing categories of cat, dog, and lizard, we still have the corresponding
*one* for each of these vectors in the same places where they were before. The one is the first element for cat, second for dog, and third for lizard. The new, fourth element for each of our existing
categories is just a zero since this fourth element corresponds to the llama category.

Finally, the new one-hot encoded vector for the llama category is all zeros except for the fourth element, which is a one, since the fourth element corresponds to the llama category.
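As a quick check of this, the sketch below encodes the original three categories before and after llama is added. It assumes llama is appended at the end of the class list, as in our example, and the `encode` helper is a hypothetical name rather than a library function.

```python
def encode(label, classes):
    """One-hot encode label using its position in classes."""
    if label not in classes:
        raise ValueError(f"unknown label: {label}")
    return [1 if name == label else 0 for name in classes]

old_classes = ["cat", "dog", "lizard"]
new_classes = old_classes + ["llama"]  # assumption: llama goes in the last position

for label in old_classes:
    old_vector = encode(label, old_classes)
    new_vector = encode(label, new_classes)
    # Each existing vector keeps its one in the same place and gains a trailing zero.
    assert new_vector == old_vector + [0]

llama_vector = encode("llama", new_classes)
print(llama_vector)  # [0, 0, 0, 1]
```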

Note that we just arbitrarily said that cat corresponded to the first element, dog to the second, lizard to the third, and llama to the fourth, but this could very well be in a different order. This just depends on how the underlying code or library is doing the one-hot encoding.
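To see why the order matters, here is a small sketch with two made-up class orderings. Neither is claimed to be what any specific library produces, and the helper names `encode` and `decode` are hypothetical. The point is only that a vector can be read correctly solely with the same ordering that was used to create it.

```python
def encode(label, classes):
    """One-hot encode label using its position in classes."""
    if label not in classes:
        raise ValueError(f"unknown label: {label}")
    return [1 if name == label else 0 for name in classes]

def decode(vector, classes):
    """Return the label whose position holds the hot one."""
    return classes[vector.index(1)]

# Two hypothetical orderings of the same categories.
order_a = ["cat", "dog", "lizard"]
order_b = ["lizard", "cat", "dog"]

cat_a = encode("cat", order_a)
cat_b = encode("cat", order_b)
print(cat_a)  # [1, 0, 0]
print(cat_b)  # [0, 1, 0]

# Reading a vector with the wrong ordering gives the wrong label.
print(decode(cat_a, order_a))  # cat
print(decode(cat_a, order_b))  # lizard
```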

### Wrapping up

If you’re interested in how to view which element or index corresponds to which label in Keras for image data, check out the post in the Keras series showing how that can be done.

We should now understand what one-hot encoding is and how labels are transformed into one-hot encoded vectors for classification purposes when working with artificial neural networks. I’ll see ya in the next one!