Underfitting in a neural network
In this post, we’ll discuss what it means when a model is said to be underfitting. We’ll also cover some techniques we can use to try to reduce or avoid underfitting when it happens.
In a previous post, we defined overfitting to be an issue that occurs when our model is able to predict on data it was trained on really well, but is unable to generalize and accurately predict on data it hasn’t seen before.
Underfitting is on the opposite end of the spectrum. A model is said to be underfitting when it’s not even able to classify the data it was trained on, let alone data it hasn’t seen before.
We can tell that a model is underfitting when the metrics given for the training data are poor, meaning that the training accuracy of the model is low and/or the training loss is high.
If the model is unable to classify data it was trained on, it’s likely not going to do well at predicting on data that it hasn’t seen before.
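As a rough illustration of that check, here is a minimal sketch that flags a model as possibly underfitting from its training metrics alone. The function name `looks_underfit` and both thresholds are hypothetical values chosen for this example; what counts as "poor" depends on the task and the loss function.

```python
def looks_underfit(train_accuracy, train_loss, min_accuracy=0.7, max_loss=0.5):
    """Return True when training metrics are poor: low accuracy and/or high loss.

    min_accuracy and max_loss are made-up thresholds for illustration only.
    """
    return train_accuracy < min_accuracy or train_loss > max_loss


# Metrics as they might be reported for the training set after the last epoch.
print(looks_underfit(train_accuracy=0.55, train_loss=0.69))  # True
print(looks_underfit(train_accuracy=0.96, train_loss=0.11))  # False
```

Note that the check only looks at training metrics. Comparing training and validation metrics is how we would spot overfitting instead.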
So now that we know what underfitting is, how can we reduce it?
Increase the complexity of the model
One thing we can do is increase the complexity of our model. This is the exact opposite of a technique we gave to reduce overfitting. If our data is complex and our model is relatively simple, then the model may not be sophisticated enough to accurately classify or predict on that data.
We can increase the complexity of our model by doing things such as:
- Increasing the number of layers in the model.
- Increasing the number of neurons in each layer.
- Changing what type of layers we’re using and where.
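One rough way to see how the first two options add complexity is to count the learnable parameters in a fully connected network. The sketch below assumes plain dense layers with one bias per neuron. The helper `count_dense_params` and the layer sizes are hypothetical and not tied to any particular library.

```python
def count_dense_params(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, input layer first,
    e.g. [3, 4, 1] means 3 inputs, one hidden layer of 4 neurons, 1 output.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the layers
        total += n_out         # one bias per neuron in the receiving layer
    return total


simple_model = [3, 4, 1]       # small hidden layer
wider_model = [3, 16, 1]       # more neurons in the hidden layer
deeper_model = [3, 16, 16, 1]  # an additional hidden layer

print(count_dense_params(simple_model))  # 21
print(count_dense_params(wider_model))   # 81
print(count_dense_params(deeper_model))  # 353
```

Parameter count is only a proxy for model complexity, but it shows why adding layers or neurons gives the model more room to fit complex data.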
Add more features to the input samples
Another technique we can use to reduce underfitting is to add more features to the input samples in our training set if we can. These additional features may help our model classify the data better.
For example, say we have a model that is attempting to predict the price of a stock based on the last three closing prices of this stock. So our input would consist of three features:
- day 1 close
- day 2 close
- day 3 close
If we added additional features to this data, such as the opening prices or the trading volume of the stock for these days, then this may help our model learn more about the data and improve its accuracy.
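To make the stock example concrete, here is a small sketch that builds one input sample two ways: from the three closing prices alone, and from the closes plus the opens and volumes. The prices, volumes, and function names are invented for illustration.

```python
# Three consecutive trading days for a hypothetical stock (made-up numbers).
days = [
    {"open": 101.2, "close": 102.5, "volume": 1_200_000},
    {"open": 102.6, "close": 101.9, "volume": 950_000},
    {"open": 101.8, "close": 103.4, "volume": 1_430_000},
]


def closes_only(days):
    """Original input sample: one feature per day (the closing price)."""
    return [day["close"] for day in days]


def closes_opens_volumes(days):
    """Richer input sample: close, open, and volume for each day."""
    features = []
    for day in days:
        features.extend([day["close"], day["open"], day["volume"]])
    return features


print(len(closes_only(days)))           # 3
print(len(closes_opens_volumes(days)))  # 9
```

The model's input layer would need to grow from three inputs to nine to accept the richer sample. Features on very different scales, like price and volume here, are usually normalized before training.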
Reduce dropout
The last tip we'll discuss for reducing underfitting is to reduce dropout. Again, this is the exact opposite of a technique we gave in a previous post for reducing overfitting.
As mentioned in that post, dropout, which we’ll cover in more detail at a later time, is a regularization technique that randomly ignores a subset of nodes in a given layer. It essentially prevents these dropped out nodes from participating in producing a prediction on the data.
When using dropout, we can specify a percentage of the nodes we want to drop. So if we’re using a 50% dropout rate, and we see that our model is underfitting, then we can decrease our amount of dropout by reducing the dropout percentage to something lower than 50 and see what types of metrics we get when we attempt to train again.
These nodes are only dropped out for purposes of training and not during validation. So, if we see that our model is fitting better to our validation data than it is to our training data, then this is a good indicator to reduce the amount of dropout that we’re using.
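The sketch below illustrates both points: the dropout rate controls how many nodes are ignored, and nothing is dropped outside of training. It is a simplified stand-in written in plain Python, not how any particular library implements dropout. The `dropout` function is hypothetical, and the rescaling of the surviving activations (often called inverted dropout) is an assumption this post doesn't cover.

```python
import random


def dropout(activations, rate, training, rng):
    """Zero out each activation with probability `rate`, during training only."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # Validation / inference: every node participates.
        return list(activations)
    keep_prob = 1.0 - rate
    # Assumption: survivors are scaled by 1 / keep_prob ("inverted dropout")
    # so the expected total activation stays roughly the same.
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]


layer_output = [1.0] * 1000

for rate in (0.5, 0.2):
    dropped = dropout(layer_output, rate, training=True, rng=random.Random(0))
    print(rate, "dropout -> nodes dropped:", dropped.count(0.0))

# With training=False the activations pass through untouched.
print(dropout(layer_output, 0.5, training=False, rng=random.Random(0)) == layer_output)  # True
```

Lowering the rate from 0.5 to 0.2 leaves more nodes active on each training step, which gives an underfitting model more capacity to fit the training data.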
Hopefully now we understand the concept of underfitting, why it happens, and how we can reduce it if we see it happening in our models. In the next post, we'll look at regularization in a neural network. I'll see you there!