Neural Network Programming - Deep Learning with PyTorch

with deeplizard.

CNN Image Preparation Code Project - Learn to Extract, Transform, Load (ETL)

October 25, 2018


Extract, Transform, and Load (ETL) with PyTorch

Welcome back to this series on neural network programming with PyTorch. In this post, we will write our first code of part two of the series.

We’ll demonstrate a very simple extract, transform and load pipeline using torchvision, PyTorch’s computer vision package for machine learning. Without further ado, let’s get started.


The project (Bird's-eye view)

There are four general steps that we’ll be following as we move through this project:

  1. Prepare the data
  2. Build the model
  3. Train the model
  4. Analyze the model’s results

The ETL process

In this post, we’ll kick things off by preparing the data. To prepare our data, we'll be following what is loosely known as an ETL process.

  • Extract data from a data source.
  • Transform data into a desirable format.
  • Load data into a suitable structure.

The ETL process can be thought of as a fractal process because it can be applied on various scales. The process can be applied on a small scale, like a single program, or on a large scale, all the way up to the enterprise level where there are huge systems handling each of the individual parts.
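
As a rough illustration of the three steps at the smallest scale, a single program, here is a minimal sketch in plain Python. The function names (extract, transform, load) and the in-memory CSV text are assumptions made up for this example; they are not part of PyTorch.

```python
import csv
import io

# Made-up stand-in for a real data source (a file, a URL, a database)
RAW_CSV = "label,pixel\n0,255\n1,128\n"

def extract(source):
    """Extract: read raw rows from the data source."""
    return list(csv.DictReader(io.StringIO(source)))

def transform(rows):
    """Transform: convert strings to numbers and scale pixels to [0, 1]."""
    return [(int(r["label"]), int(r["pixel"]) / 255) for r in rows]

def load(records):
    """Load: put the records into a structure that is easy to index."""
    return {i: rec for i, rec in enumerate(records)}

dataset = load(transform(extract(RAW_CSV)))
print(dataset[0])  # (0, 1.0)
```

The same three roles show up later in this post, only with torchvision doing the work.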


If you want to know more about the general data science pipeline, check out the data science post, where we cover this in greater detail.

Once we have completed the ETL process, we are ready to begin building and training our deep learning model. PyTorch has some built-in packages and classes that make the ETL process pretty easy.

PyTorch imports

We begin by importing all of the necessary PyTorch libraries.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

import torchvision
import torchvision.transforms as transforms

This table describes each of these packages:

Package                   Description
torch                     The top-level PyTorch package and tensor library.
torch.nn                  A subpackage that contains modules and extensible classes for building neural networks.
torch.optim               A subpackage that contains standard optimization operations like SGD and Adam.
torch.nn.functional       A functional interface that contains typical operations used for building neural networks like loss functions and convolutions.
torchvision               A package that provides access to popular datasets, model architectures, and image transformations for computer vision.
torchvision.transforms    An interface that contains common transforms for image processing.
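
To make the roles of these packages a bit more concrete, here is a minimal sketch that touches torch, torch.nn, torch.optim, and torch.nn.functional once each. The layer sizes, learning rate, and random data are arbitrary values chosen for illustration, not the model we will build later in the series.

```python
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

torch.manual_seed(0)

x = torch.rand(4, 8)             # torch: a batch of 4 random input vectors
y = torch.tensor([0, 1, 2, 1])   # torch: integer class labels

model = nn.Linear(8, 3)                            # torch.nn: a layer with learnable weights
optimizer = optim.SGD(model.parameters(), lr=0.1)  # torch.optim: an SGD optimizer

loss = F.cross_entropy(model(x), y)  # torch.nn.functional: a loss function
optimizer.zero_grad()
loss.backward()                      # compute gradients of the loss
optimizer.step()                     # apply one SGD update to the weights

print(loss.item())
```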

Other imports

The next imports are standard packages used for data science in Python:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.metrics import confusion_matrix
#from plotcm import plot_confusion_matrix

import pdb


Note that pdb is the Python debugger, and the commented import is a local file that we’ll introduce in future posts for plotting the confusion matrix. The last line sets the print options for PyTorch print statements.
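
The print-options line mentioned above does not appear in the listing. Here is a minimal sketch of what such a line could look like, using torch.set_printoptions. The linewidth value of 120 is an assumption chosen for illustration.

```python
import torch

# Assumed value: allow up to 120 characters per line before wrapping
torch.set_printoptions(linewidth=120)

t = torch.arange(20)
print(t)  # with the wider limit, these 20 values print on one line
```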

We are ready now to prepare our data.

Preparing our data using PyTorch

Our ultimate goal when preparing our data is to do the following (ETL):

  1. Extract – Get the Fashion-MNIST image data from the source.
  2. Transform – Put our data into tensor form.
  3. Load – Put our data into an object to make it easily accessible.

For these purposes, PyTorch provides us with two classes:

Class                          Description
torch.utils.data.Dataset       An abstract class for representing a dataset.
torch.utils.data.DataLoader    Wraps a dataset and provides access to the underlying data.

An abstract class is a Python class that has methods we must implement, so we can create a custom dataset by creating a subclass that extends the functionality of the Dataset class.

To create a custom dataset using PyTorch, we extend the Dataset class by creating a subclass that implements the required methods described below. Upon doing this, our new subclass can then be passed to a PyTorch DataLoader object.

We will be using the Fashion-MNIST dataset that comes built-in with the torchvision package, so we won’t have to do this for our project. Just know that the Fashion-MNIST built-in dataset class is doing this behind the scenes.

All subclasses of the Dataset class must override two methods: __len__, which returns the size of the dataset, and __getitem__, which gets the element at a specific index location and supports integer indexing in the range from 0 up to, but not including, len(self).
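
To show the two required methods in action, here is a minimal sketch of a custom dataset. The class name SquaresDataset and its contents are hypothetical, invented only for this example.

```python
import torch
from torch.utils.data import Dataset

class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i squared)."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        # Size of the dataset
        return self.size

    def __getitem__(self, index):
        # Element at an integer index, valid for 0 <= index < len(self)
        if not 0 <= index < self.size:
            raise IndexError(index)
        x = torch.tensor(float(index))
        return x, x ** 2

dataset = SquaresDataset(5)
print(len(dataset))  # 5
print(dataset[3])    # (tensor(3.), tensor(9.))
```

Because it implements both methods, an instance like this can be handed to a DataLoader in the same way as a built-in dataset.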

PyTorch torchvision package

The torchvision package gives us access to the following resources:

  • Datasets (like MNIST and Fashion-MNIST)
  • Models (like VGG16)
  • Transforms
  • Utils

Computer vision

All of these resources are related to deep learning computer vision tasks.


When we learned about the Fashion-MNIST dataset in our previous post, we saw that the arXiv paper that introduced it indicated that the authors wanted it to be a drop-in replacement for the original MNIST dataset.

The idea was to make it so that frameworks like PyTorch could add Fashion-MNIST by just changing the URL for retrieving the data.

This is the case for PyTorch. The PyTorch FashionMNIST dataset simply extends the MNIST dataset and overrides the urls.

Here is the class definition from PyTorch's torchvision source code:

class FashionMNIST(MNIST):
    """`Fashion-MNIST <>`_ Dataset.

    Args:
        root (string): Root directory of dataset where ``processed/``
            and  ``processed/`` exist.
        train (bool, optional): If True, creates dataset from ````,
            otherwise from ````.
        download (bool, optional): If true, downloads the dataset from the internet and
            puts it in root directory. If dataset is already downloaded, it is not
            downloaded again.
        transform (callable, optional): A function/transform that  takes in an PIL image
            and returns a transformed version. E.g, ``transforms.RandomCrop``
        target_transform (callable, optional): A function/transform that takes in the
            target and transforms it.
    """
    urls = [...]
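
The drop-in idea described above, a subclass that changes only where the data comes from, can be sketched in plain Python. The class names and the example.com URLs below are hypothetical placeholders; this is not torchvision's actual code.

```python
class BaseImageSet:
    """Hypothetical stand-in for a dataset class such as MNIST."""
    urls = [
        "https://example.com/base/train.gz",  # placeholder URL
        "https://example.com/base/test.gz",   # placeholder URL
    ]

    def download_plan(self):
        # Shared logic reads whatever urls the concrete class defines
        return [f"GET {url}" for url in self.urls]

class FashionImageSet(BaseImageSet):
    """Drop-in variant: same behavior, different data location."""
    urls = [
        "https://example.com/fashion/train.gz",  # placeholder URL
        "https://example.com/fashion/test.gz",   # placeholder URL
    ]

print(FashionImageSet().download_plan())
```

All of the logic lives in the parent class, so the subclass needs nothing beyond the overridden class attribute.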

Let’s see now how we can take advantage of torchvision.

PyTorch Dataset class

To get an instance of the FashionMNIST dataset using torchvision, we just create one like so:

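train_set = torchvision.datasets.FashionMNIST(  # the root path below is an assumed placeholder
    root='./data', train=True, download=True, transform=transforms.ToTensor())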

We specify the following arguments:

Parameter    Description
root         The location on disk where the data is located.
train        If the dataset is the training set.
download     If the data should be downloaded.
transform    A composition of transformations that should be performed on the dataset elements.

Since we want our images to be transformed into tensors, we use the built-in transforms.ToTensor() transformation, and since this dataset is going to be used for training, we’ll name the instance train_set.

When we run this code for the first time, the Fashion-MNIST dataset will be downloaded locally. Subsequent calls check for the data before downloading it. Thus, we don't have to worry about double downloads or repeated network calls.

PyTorch DataLoader class

To create a DataLoader wrapper for our training set, we do it like this:

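train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=1000, shuffle=True)

We pass train_set as the first argument. Now, we can leverage the loader for tasks that would otherwise be pretty complicated to implement by hand, such as batching and shuffling, which are controlled by these parameters: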

  • batch_size (1000 in our case)
  • shuffle (True in our case)
  • num_workers (Default is 0 which means the main process will be used)
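
To see how batch_size and shuffle behave without downloading anything, here is a minimal sketch that wraps made-up data in a DataLoader. The TensorDataset of 2,500 random images is an assumption standing in for train_set.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Made-up stand-in for train_set: 2,500 fake 1x28x28 images with labels 0-9
images = torch.rand(2500, 1, 28, 28)
labels = torch.randint(0, 10, (2500,))
fake_train_set = TensorDataset(images, labels)

loader = DataLoader(fake_train_set, batch_size=1000, shuffle=True)

for batch_images, batch_labels in loader:
    print(batch_images.shape, batch_labels.shape)
# Batch sizes are 1000, 1000, then the remaining 500

print(len(loader))  # 3 batches
```

With shuffle=True, the samples are drawn in a new random order each time the loader is iterated.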

ETL summary

From an ETL perspective, we achieved the extract and transform steps using torchvision when we created the dataset, and the load step when we wrapped the dataset in a data loader:

  1. Extract – The raw data was extracted from the web.
  2. Transform – The raw image data was transformed into a tensor.
  3. Load – The train_set was wrapped by (loaded into) the data loader, giving us access to the underlying data.

Now, we should have a good understanding of the torchvision module that is provided by PyTorch, and how we can use Datasets and DataLoaders in the PyTorch package to streamline ETL tasks.

In the next post, we’ll see how we can work with datasets and data loaders to access and view individual samples as well as batches of samples.

I’ll see you in the next one!

