Activation Functions - Deep Learning Dictionary
In a neural network, an activation function applies a non-linear transformation to the output of a layer.
Activation functions are biologically inspired by activity in our brains where different neurons fire (or are activated) by different stimuli.
For any given node in a fully connected layer, recall that the node's value is calculated as the weighted sum of the inputs it receives from the previous layer. After taking this weighted sum, we pass the result to an activation function, and the activation function's output is then passed as input to the following layer.
The activation function performs an operation on this sum to transform it into a value that is often bounded between some lower and/or upper limit. A key property of this transformation is that it's non-linear.
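To make this concrete, here's a minimal sketch of a single node's computation using NumPy. The input and weight values are hypothetical, and the sigmoid function (one of the activations listed below) stands in as the activation:

```python
import numpy as np

# Hypothetical inputs from the previous layer and weights for one node
inputs = np.array([0.8, -1.5, 2.0])
weights = np.array([0.4, 0.3, -0.6])

# Weighted sum of the node's inputs
z = np.dot(weights, inputs)

# Pass the weighted sum through a sigmoid activation,
# which bounds the result to the range (0, 1);
# this output is what gets passed to the following layer
activation = 1.0 / (1.0 + np.exp(-z))

print(z)           # unbounded weighted sum: -1.33
print(activation)  # bounded value in (0, 1): ~0.209
```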
An important feature of linear functions is that the composition of two linear functions is also a linear function. This means that, even in very deep neural networks, if we only had linear transformations of our data values during a forward pass, the learned mapping in our network from input to output would also be linear.
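This collapse is easy to demonstrate numerically. Below is a minimal sketch, assuming NumPy and hypothetical random weight matrices (biases omitted for brevity), showing that two stacked linear layers are equivalent to a single linear layer whose weight matrix is the product of the two:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weight matrices for two stacked linear layers (no activations)
W1 = rng.standard_normal((4, 3))  # layer 1: 3 inputs -> 4 units
W2 = rng.standard_normal((2, 4))  # layer 2: 4 inputs -> 2 units

x = rng.standard_normal(3)  # a sample input

# Forward pass through both layers with no activation functions
two_layer_output = W2 @ (W1 @ x)

# A single linear layer with the combined weight matrix W2 @ W1
single_layer_output = (W2 @ W1) @ x

# The two are identical: stacking linear layers adds no expressiveness
print(np.allclose(two_layer_output, single_layer_output))  # True
```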
Typically, the types of mappings that we are aiming to learn with our deep neural networks are more complex than simple linear mappings. Having non-linear activation functions allows our neural networks to compute arbitrarily complex functions.
There are many variations of activation functions, each of which transforms data differently and is used in different scenarios. Some of the more popular ones include:
- ReLU
- Sigmoid
- Softmax
We'll elaborate on how each of these functions transforms data in later lessons.
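As a quick preview, though, here's a minimal sketch of how each of these three functions could be implemented with NumPy. These are illustrative implementations, not the exact ones used by any particular deep learning library:

```python
import numpy as np

def relu(z):
    # Zeroes out negative values; passes positive values through unchanged
    return np.maximum(0, z)

def sigmoid(z):
    # Squashes each value into the bounded range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Converts a vector into a probability distribution (values sum to 1);
    # subtracting the max first improves numerical stability
    exp_z = np.exp(z - np.max(z))
    return exp_z / np.sum(exp_z)

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))     # [0. 0. 3.]
print(sigmoid(z))  # [0.119 0.5   0.953]
print(softmax(z))  # [0.006 0.047 0.946]
```

Note how each transforms the same input differently: sigmoid bounds every value to (0, 1), softmax turns the whole vector into a probability distribution, and ReLU is bounded below by zero but unbounded above.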