Data Augmentation explained

Data augmentation for machine learning

In this post, we'll be discussing data augmentation and under what circumstances we may want to use it.

Data augmentation occurs when we create new data based on modifications of our existing data. Essentially, we're creating new, augmented data by making reasonable modifications to data in our training set.

For example, we could augment image data by flipping the images, either horizontally or vertically. We could rotate the images, zoom in or out, crop, or even vary the color of the images. All of these are common data augmentation techniques.

Horizontal flip
Vertical flip
Rotation
Zoom in
Zoom out
Cropping
Color variations
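
To make a few of these transformations concrete, here is a minimal sketch that applies them to a tiny 3x3 grayscale "image" stored as nested Python lists. The function names are our own, chosen for illustration, and plain lists are used only so the example runs without extra libraries. In practice, an image library or framework would handle this for us.

```python
# A tiny 3x3 grayscale "image": each inner list is one row of pixel values.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

def horizontal_flip(img):
    """Mirror the image left-to-right by reversing each row."""
    return [row[::-1] for row in img]

def vertical_flip(img):
    """Mirror the image top-to-bottom by reversing the order of the rows."""
    return [row[:] for row in img[::-1]]

def rotate_90_clockwise(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*img[::-1])]

def crop(img, top, left, height, width):
    """Keep only the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

flipped = horizontal_flip(image)        # [[3, 2, 1], [6, 5, 4], [9, 8, 7]]
upside_down = vertical_flip(image)      # [[7, 8, 9], [4, 5, 6], [1, 2, 3]]
rotated = rotate_90_clockwise(image)    # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
top_left_patch = crop(image, 0, 0, 2, 2)  # [[1, 2], [4, 5]]
```

Each function returns a new image and leaves the original untouched, so the original sample stays in the training set alongside its augmented copies.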

Why would we want to do this, though?

Why use data augmentation?

Well, we may simply want or need more data in our training set. For example, say we have a relatively small number of samples to include in our training set, and it's difficult to get more. Then we could use data augmentation to create new samples from our existing data set.

Reducing overfitting

Additionally, we may want to use data augmentation to reduce overfitting. Recall, we mentioned this point in our post that covered overfitting.

If our model is overfitting, one technique to reduce it is to add more data to the training set. As we just noted, if we don't have access to additional data, we can create more using data augmentation.

Also, with regard to overfitting, suppose we had a data set full of images of dogs, but most of the dogs were facing to the right.

If a model was trained on these images, it's reasonable to think that the model would believe that only these right-facing dogs were actually dogs. It may very well not classify left-facing dogs as actually being dogs when we deploy this model in the field or use it to predict on test images.

With this, producing new left-facing images of dogs by augmenting the original images of right-facing dogs would be a reasonable modification. We would do this by horizontally flipping the original images to produce new ones.
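
As a rough sketch of this idea, the snippet below mirrors every image in a small training set and keeps the label unchanged, so both facing directions end up represented. The function name `augment_with_horizontal_flips` and the toy 2x2 "images" are hypothetical, made up for this example.

```python
def augment_with_horizontal_flips(images, labels):
    """Return the original samples plus one mirrored copy of each sample."""
    augmented_images = list(images)
    augmented_labels = list(labels)
    for img, label in zip(images, labels):
        # Reverse each row to mirror the image left-to-right.
        augmented_images.append([row[::-1] for row in img])
        # A mirrored dog is still a dog, so the label does not change.
        augmented_labels.append(label)
    return augmented_images, augmented_labels

# Two toy 2x2 images standing in for photos of dogs.
train_images = [
    [[0, 1], [0, 1]],
    [[0, 2], [0, 3]],
]
train_labels = ["dog", "dog"]

new_images, new_labels = augment_with_horizontal_flips(train_images, train_labels)
```

After this call, the training set holds twice as many samples: the two originals followed by their mirrored versions.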

Now, some data augmentation techniques may not be appropriate for a given data set. Sticking with the dog example, we stated that horizontally flipping our dog images makes sense. However, it wouldn't necessarily be reasonable to modify our dog images by vertically flipping them.

In real-world images, we're unlikely to see many dogs upside down on their heads or backs, so vertically flipped images wouldn't resemble the data the model will actually encounter.
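
One way to express this choice in code is to make each augmentation an explicit option and enable only the ones that suit the data. The sketch below is an illustration under our own assumptions: the function name `random_augment`, its parameters, and the 50% flip probability are all hypothetical.

```python
import random

def random_augment(img, rng, allow_horizontal=True, allow_vertical=False):
    """Randomly apply only the flips that make sense for this data set."""
    if allow_horizontal and rng.random() < 0.5:
        img = [row[::-1] for row in img]   # mirror left-to-right
    if allow_vertical and rng.random() < 0.5:
        img = img[::-1]                    # mirror top-to-bottom
    return [row[:] for row in img]         # always return a fresh copy

dog_image = [
    [1, 2],
    [3, 4],
]

# For dog photos we keep the defaults: horizontal flips on, vertical flips off.
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [random_augment(dog_image, rng) for _ in range(100)]
```

With the defaults, every generated sample is either the original or its mirror image, and none is upside down. For a data set where orientation carries no meaning, such as some overhead imagery, we might decide to set `allow_vertical=True` instead.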

Wrapping up

Hopefully you now have an understanding of what data augmentation is and why it would make sense to use it. If you're interested in seeing how to implement data augmentation in code using Keras, be sure to check out the Keras series where we implement data augmentation.

In that series, we show how to make 10 augmented images from a single original image of a dog by rotating the image, shifting the width and height, zooming, varying the color, and horizontally flipping the image.
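
As a hedged sketch of what such a setup could look like, the snippet below configures Keras' `ImageDataGenerator` with the transformations listed above. The specific parameter values and the helper name `make_augmented_images` are our own assumptions, not taken from the series, and recent TensorFlow releases mark this class as deprecated, so check the documentation for your version.

```python
# Assumed settings for illustration only; tune them for your own data set.
AUGMENTATION_ARGS = {
    "rotation_range": 10,         # rotate by up to 10 degrees
    "width_shift_range": 0.1,     # shift horizontally by up to 10% of the width
    "height_shift_range": 0.1,    # shift vertically by up to 10% of the height
    "zoom_range": 0.1,            # zoom in or out by up to 10%
    "channel_shift_range": 10.0,  # vary the color channels
    "horizontal_flip": True,      # randomly mirror left-to-right
}

def make_augmented_images(image, count=10):
    """Return `count` augmented copies of one image shaped (height, width, channels)."""
    # Imported here so the settings above can be inspected without TensorFlow installed.
    import numpy as np
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    generator = ImageDataGenerator(**AUGMENTATION_ARGS)
    batch = np.expand_dims(image, axis=0)  # flow() expects a batch: (1, H, W, C)
    flow = generator.flow(batch, batch_size=1)
    return [next(flow)[0] for _ in range(count)]
```

Note that `vertical_flip` is left out of the settings on purpose, in line with the dog example: upside-down dogs are not something we expect the model to see.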

This is actually pretty simple to implement when we use Keras' ImageDataGenerator class. You can find all the necessary technical details for how this is done in that series. I'll see ya in the next one!

In this video, we explain the concept of data augmentation, as it pertains to machine learning and deep learning. We also point to another resource to show how to implement data augmentation on images in code with Keras.

Hey, we're Chris and Mandy, the creators of deeplizard!
