Deep Learning Dictionary - Lightweight Crash Course

Deep Learning Course 1 of 7 - Level: Beginner

ReLU Activation Function - Deep Learning Dictionary

In a neural network, an activation function applies a nonlinear transformation to the output of a layer.

One of the most widely used activation functions today is $$\text{ReLU}$$, short for Rectified Linear Unit, which transforms its input to the maximum of $$0$$ and the input itself.

For a given value $$x$$ passed to $$\text{ReLU}$$, we define

$$\text{ReLU}(x)=\max(0,x)$$

The table below summarizes how $$\text{ReLU}$$ transforms its input.

| Input | ReLU Output |
|---|---|
| Values less than or equal to $$0$$ | $$0$$ |
| Values greater than $$0$$ | The input value |
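
The definition and table above can be checked with a few lines of code. This is a minimal sketch in plain Python; the function name `relu` and the sample inputs are illustrative choices, not part of any particular library.

```python
def relu(x: float) -> float:
    """Return the maximum of 0 and x, following the definition above."""
    return max(0.0, x)


# Illustrative inputs covering negative values, zero, and positive values.
inputs = [-2.0, -0.5, 0.0, 0.5, 3.0]
outputs = [relu(x) for x in inputs]

print(outputs)  # [0.0, 0.0, 0.0, 0.5, 3.0]
```

Inputs less than or equal to $$0$$ map to $$0$$, while positive inputs pass through unchanged, matching the two rows of the table.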

When $$\text{ReLU}$$ is used as an activation function following a layer in a neural network, it accepts the weighted sum of outputs from the previous layer and transforms this sum to a value equal to either the sum itself or $$0$$.
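
As a rough illustration of this step, the sketch below computes the output of a single node: it forms the weighted sum of the previous layer's outputs and then applies $$\text{ReLU}$$. The helper name `node_output` and all weights and values are hypothetical, chosen only to show one positive and one negative weighted sum.

```python
def relu(x: float) -> float:
    """Return the maximum of 0 and x."""
    return max(0.0, x)


def node_output(previous_outputs: list[float], weights: list[float]) -> float:
    """Hypothetical helper: weighted sum of the previous layer's outputs, then ReLU."""
    weighted_sum = sum(w * a for w, a in zip(weights, previous_outputs))
    return relu(weighted_sum)


# Made-up outputs from a previous layer with three nodes.
previous_layer_outputs = [1.0, 2.0, 3.0]

# Weighted sum is 0.5 + 0.5 + 3.0 = 4.0, so ReLU returns the sum itself.
positive_case = node_output(previous_layer_outputs, [0.5, 0.25, 1.0])

# Weighted sum is -1.0 - 2.0 + 1.5 = -1.5, so ReLU returns 0.
negative_case = node_output(previous_layer_outputs, [-1.0, -1.0, 0.5])

print(positive_case, negative_case)  # 4.0 0.0
```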

Intuitively, we can think of a given node's activated output (with $$\text{ReLU}$$) as being "more activated" the more positive it is.

$$\text{ReLU}$$ is one of the most popular choices for an activation function in neural networks today. Several variations of $$\text{ReLU}$$, such as leaky $$\text{ReLU}$$ and parametric $$\text{ReLU}$$ ($$\text{PReLU}$$), can lead to marginal improvements when training networks.
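
For comparison, here is a minimal sketch of the two variations as they are commonly defined: both keep positive inputs unchanged but scale negative inputs by a small slope instead of setting them to $$0$$. In leaky $$\text{ReLU}$$ the slope is a fixed constant, while in $$\text{PReLU}$$ it is a parameter learned during training. The function names, the default slope of `0.01`, and the sample values are assumptions made for illustration.

```python
def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Leaky ReLU: positive inputs pass through; negative inputs are scaled
    by a small fixed slope (0.01 here is an assumed, commonly used default)."""
    return x if x > 0 else negative_slope * x


def prelu(x: float, alpha: float) -> float:
    """Parametric ReLU: same form as leaky ReLU, but alpha is meant to be
    learned during training. Here it is simply passed in as an argument."""
    return x if x > 0 else alpha * x


print(leaky_relu(3.0))          # 3.0
print(prelu(-4.0, alpha=0.25))  # -1.0
```

With a slope of $$0$$, both functions reduce to standard $$\text{ReLU}$$.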
