ReLU Activation Function - Deep Learning Dictionary
In a neural network, an activation function applies a nonlinear transformation to the output of a layer.
One of the most widely used activation functions today is \(\text{ReLU}\), short for Rectified Linear Unit, which transforms its input to the maximum of either \(0\) or the input itself.
For a given value \(x\) passed to \(\text{ReLU}\), we define
$$\text{relu}(x)=\text{max}(0,x)$$
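As a quick illustration, here is a minimal NumPy sketch of this definition (the function name `relu` and the sample inputs are just for illustration):

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negative values become 0, positive values pass through unchanged
    return np.maximum(0, x)

# Example inputs (illustrative only)
print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # -> [0.  0.  0.  1.5 3. ]
```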
The table below summarizes how \(\text{ReLU}\) transforms its input.
| Input | ReLU Output |
|---|---|
| Values less than or equal to \(0\) | \(0\) |
| Values greater than \(0\) | The input value |
When \(\text{ReLU}\) is used as an activation function following a layer in a neural network, it accepts the weighted sum of outputs from the previous layer and transforms this sum to the sum itself if it is positive, or to \(0\) otherwise.
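To make this concrete, here is a small sketch of a single node computing its weighted sum and then passing it through \(\text{ReLU}\). The inputs, weights, and bias below are made up purely for illustration:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Hypothetical node with 3 incoming connections
inputs  = np.array([0.5, -1.2, 2.0])   # outputs from the previous layer
weights = np.array([0.4,  0.3, -0.6])  # this node's weights
bias    = 0.1

weighted_sum = np.dot(inputs, weights) + bias  # 0.2 - 0.36 - 1.2 + 0.1 = -1.26
activated_output = relu(weighted_sum)          # sum is negative, so the node outputs 0
```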
Intuitively, we can think of a given node's activated output (with \(\text{ReLU}\)) as being "more activated" the more positive it is.
\(\text{ReLU}\) is by far the most popular choice of activation function in neural networks today. There have been several variations of \(\text{ReLU}\) that can lead to marginal improvements when training networks, like leaky \(\text{ReLU}\) and parametric \(\text{ReLU}\) (\(\text{PReLU}\)).
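As a rough sketch of one such variant, leaky \(\text{ReLU}\) replaces the hard zero for negative inputs with a small slope \(\alpha\). The value \(\alpha = 0.01\) below is just an illustrative default; in \(\text{PReLU}\), this slope is learned during training rather than fixed:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but negative inputs are scaled by a small slope alpha instead of being zeroed
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-2.0, 0.0, 3.0])))  # -> [-0.02  0.    3.  ]
```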