Dataset for Deep Learning - Fashion MNIST
Introducing Fashion-MNIST for machine learning
Welcome back to this series on neural network programming. In this post, we will introduce the Fashion-MNIST dataset.
We'll look at the dataset spec, how the dataset was built, and how the dataset differs from the original MNIST dataset of handwritten digits. Without further ado, let's get started.
Why study a dataset?
Let's kick things off by pondering the question of why we should take the time to study a dataset. Data is the primary ingredient of deep learning, and although it's our task as neural network programmers to let our neural networks learn from our data, we still have the responsibility of knowing the nature and history of the data we are using to actually do the training.
Computer programs in general consist of two primary components: code and data. With traditional programming, the programmer's job is to write the software or code directly. With deep learning and neural networks, the software, so to speak, is the network itself, and in particular the network's weights, which emerge automatically during the training process.
It's the programmer's job to oversee and guide the learning process through training. We can think of this as an indirect way of writing software or code. By using data and deep learning, neural network programmers can produce software capable of performing computations without writing code to explicitly carry out these computations.
For this reason, the role of data in developing software is shifting, and we'll likely see the role of software developers shift as well.
Data focused considerations:
- Who created the dataset?
- How was the dataset created?
- What transformations were used?
- What intent does the dataset have?
- Possible unintentional consequences?
- Is the dataset biased?
- Are there ethical issues with the dataset?
In practice, acquiring and accessing data is often one of the hardest parts of deep learning, so keep this in mind as we go through this particular dataset. Take note of the general concepts and ideas that we see here.
What is the MNIST dataset?
The MNIST dataset (Modified National Institute of Standards and Technology database) is a famous dataset of handwritten digits that is commonly used for training image processing systems for machine learning. NIST stands for National Institute of Standards and Technology.
The M in MNIST stands for modified, and this is because there was an original NIST dataset of digits that was modified to give us MNIST.
MNIST is famous because of how often the dataset is used. It's common for two reasons:
- Beginners use it because it's easy
- Researchers use it to benchmark (compare) different models.
The dataset consists of 70,000 images of handwritten digits with the following split:
- 60,000 training images
- 10,000 testing images
The images were originally created by American Census Bureau employees and American high school students.
MNIST has been so widely used, and image recognition technology has improved so much, that the dataset is now considered too easy. This is why the Fashion-MNIST dataset was created.
What is Fashion-MNIST?
Fashion-MNIST, as the name suggests, is a dataset of fashion items. Specifically, the dataset has the following ten classes of fashion items:
Index | Label |
---|---|
0 | T-shirt/top |
1 | Trouser |
2 | Pullover |
3 | Dress |
4 | Coat |
5 | Sandal |
6 | Shirt |
7 | Sneaker |
8 | Bag |
9 | Ankle boot |
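When we later work with predictions, we'll get class indices back rather than names. Here is a minimal sketch of a lookup built from the table above. The names FASHION_MNIST_LABELS and label_name are our own, chosen for illustration.

```python
# Index-to-label mapping for Fashion-MNIST, copied from the table above.
FASHION_MNIST_LABELS = {
    0: "T-shirt/top",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle boot",
}


def label_name(index: int) -> str:
    """Return the class name for a class index (hypothetical helper)."""
    return FASHION_MNIST_LABELS[index]
```

For example, label_name(9) returns "Ankle boot".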
As we have seen in a previous post, a sample of the items looks like this:
What's the origin of Fashion-MNIST?
Where did the Fashion-MNIST images come from? Fashion-MNIST is based on the assortment on Zalando's website. Zalando is a German-based multinational fashion commerce company that was founded in 2008.
This is why we see zalandoresearch in the GitHub URL where the Fashion-MNIST dataset is available for download.
Zalando Research is the group from within the company that created the dataset.
We'll see more about how the images were collected when we review the paper that introduced the dataset, but first, let's answer another lurking question.
What puts the MNIST in Fashion-MNIST?
The Fashion-MNIST dataset has MNIST in its name because its creators intend it to replace MNIST.
For this reason, Fashion-MNIST was designed to mirror the original MNIST dataset as closely as possible, while making training harder simply because the images are more complex than handwritten digits.
We'll see the specific ways that Fashion-MNIST mirrors the original dataset in the paper, but one thing we have already seen is the number of classes.
- MNIST: has 10 classes (one for each digit 0-9)
- Fashion-MNIST: has 10 classes (this is intentional)
Let's check out the paper.
Reading the Fashion-MNIST paper on arXiv
The paper is available on arXiv. To read it, just click on the PDF link on the paper's arXiv page.
The first thing to notice about the paper is that the authors are from Zalando Research (the origin of Fashion-MNIST).
After reading the paper's abstract, we see why the dataset has been named Fashion-MNIST.
The Fashion-MNIST paper's abstract
"We present Fashion-MNIST, a new dataset comprising of 28 by 28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per category. The training set has 60,000 images and the test set has 10,000 images. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms, as it shares the same image size, data format and the structure of training and testing splits. The dataset is freely available at https://github.com/zalandoresearch/fashion-mnist."
The dataset was designed to be a drop-in replacement for the original MNIST. Because the Fashion-MNIST specs match the original MNIST specs, switching from the old dataset to the new one can be smooth. The paper claims that the only change needed is to the URL the dataset is fetched from, so that it points to Fashion-MNIST instead of MNIST.
The paper also gives us some more insight as to why MNIST is so popular:
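To make the drop-in idea concrete, here is a rough sketch using only the Python standard library. Both datasets are distributed in the same IDX binary format, so one parser can read either of them, and only the base URL differs. The two base URLs and the helper names dataset_urls, parse_idx, and load_idx_gz are assumptions made for illustration, so check each project's page before relying on them.

```python
import gzip
import struct

# Assumed download locations (not taken from this post); verify before use.
MNIST_BASE_URL = "http://yann.lecun.com/exdb/mnist/"
FASHION_MNIST_BASE_URL = "http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/"

# Both datasets are assumed to use the same four file names.
FILES = {
    "train_images": "train-images-idx3-ubyte.gz",
    "train_labels": "train-labels-idx1-ubyte.gz",
    "test_images": "t10k-images-idx3-ubyte.gz",
    "test_labels": "t10k-labels-idx1-ubyte.gz",
}


def dataset_urls(base_url):
    """Build the four file URLs; swapping datasets only changes base_url."""
    return {key: base_url + name for key, name in FILES.items()}


def parse_idx(raw):
    """Parse an unsigned-byte IDX buffer into (shape, flat pixel/label bytes)."""
    # Header: two zero bytes, a data-type code, and the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # Each dimension size is a big-endian 32-bit unsigned integer.
    header_end = 4 + 4 * ndim
    shape = struct.unpack(">" + "I" * ndim, raw[4:header_end])
    data = raw[header_end:]
    expected = 1
    for dim in shape:
        expected *= dim
    if len(data) != expected:
        raise ValueError("data length does not match the header shape")
    return shape, data


def load_idx_gz(path):
    """Read a gzip-compressed IDX file that has already been downloaded."""
    with gzip.open(path, "rb") as f:
        return parse_idx(f.read())
```

Calling dataset_urls(MNIST_BASE_URL) or dataset_urls(FASHION_MNIST_BASE_URL) gives the same keys and file names, which is the whole point of the design: code written against one dataset should not need to know which one it is reading.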
"The reason MNIST is so popular has to do with its size, allowing deep learning researchers to quickly check and prototype their algorithms. This is also complemented by the fact that all machine learning libraries (e.g. scikit-learn) and deep learning frameworks (e.g. Tensorflow, PyTorch) provide helper functions and convenient examples that use MNIST out of the box."
PyTorch does provide us with a package called torchvision that makes it easy for us to get started with MNIST as well as Fashion-MNIST.
We'll be using torchvision in our next post to load our training set into our project.
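As a preview, here is a minimal sketch of what loading the training set with torchvision can look like. It assumes torch and torchvision are installed and that the machine can download the data. The function name load_fashion_mnist_train and the ./data directory are our own choices for illustration.

```python
def load_fashion_mnist_train(root="./data"):
    """Download (if needed) and return the Fashion-MNIST training set."""
    # Imported lazily so this sketch can be defined without PyTorch installed.
    import torchvision
    import torchvision.transforms as transforms

    return torchvision.datasets.FashionMNIST(
        root=root,          # assumed local directory for the downloaded files
        train=True,         # request the training split rather than the test split
        download=True,      # fetch the files if they are not already in root
        transform=transforms.Compose([transforms.ToTensor()]),
    )
```

Calling train_set = load_fashion_mnist_train() should give a dataset whose length matches the 60,000 training images described in the paper. Replacing FashionMNIST with MNIST in the call is all it takes to load the original digits instead, since torchvision exposes both datasets with the same constructor arguments.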
How Fashion-MNIST was built
Unlike the MNIST digits, the Fashion-MNIST images weren't drawn by hand. They are actual product images from Zalando's website.
However, they have been transformed to more closely correspond to the MNIST specifications. This is the general conversion process that each image from the site went through:
- Converted to PNG
- Trimmed
- Resized
- Sharpened
- Extended
- Negated
- Gray-scaled
To see a more detailed description of this process, be sure to check out section two of the paper.
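To get a feel for a few of these steps, here is a small pure-Python sketch that extends, negates, and gray-scales a tiny image stored as nested lists of (r, g, b) tuples. This is not Zalando's code. The trimming, resizing, and sharpening steps are skipped, and the luma weights used for gray-scaling are a common convention that we are assuming here, not something stated in this post.

```python
def extend(rgb_pixels, size=28, fill=(255, 255, 255)):
    """Center the image on a size x size canvas filled with the fill color."""
    height, width = len(rgb_pixels), len(rgb_pixels[0])
    if height > size or width > size:
        raise ValueError("image is larger than the target canvas")
    top, left = (size - height) // 2, (size - width) // 2
    canvas = [[fill] * size for _ in range(size)]
    for y, row in enumerate(rgb_pixels):
        canvas[top + y][left:left + width] = row
    return canvas


def negate(rgb_pixels):
    """Invert every 8-bit channel, so a white background becomes black."""
    return [[tuple(255 - c for c in px) for px in row] for row in rgb_pixels]


def to_grayscale(rgb_pixels):
    """Collapse each (r, g, b) pixel to one intensity (assumed luma weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_pixels
    ]


def to_fashion_mnist_style(rgb_pixels, size=28):
    """Apply extend, negate, and gray-scale in the order listed above."""
    return to_grayscale(negate(extend(rgb_pixels, size=size)))
```

The result is a size by size grid of single intensity values with a dark background, which is the same general layout that MNIST uses for its digits.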
Accessing Fashion-MNIST with torchvision
In summary, we have seen the origin and history of the Fashion-MNIST dataset, and although the dataset is designed to be more challenging as a computer vision problem, the set is still a great place to start.
We will be accessing Fashion-MNIST through a PyTorch vision library called torchvision and building our first neural network that can accurately predict an output class given an input fashion image.
I'll see you in the next one.