Machine Learning & Deep Learning Fundamentals

with deeplizard.

Weight Initialization explained | A way to reduce the vanishing gradient problem

March 30, 2018
Let's talk about how the weights in an artificial neural network are initialized, how this initialization affects the training process, and what YOU can do about it!

To kick off our discussion on weight initialization, we're first going to discuss how these weights are initialized and how the initialized values might negatively affect the training process. We'll see that these randomly initialized weights actually contribute to the vanishing and exploding gradient problem we covered in the last video. With this in mind, we'll then explore what we can do to influence how this initialization occurs. We'll see how Xavier initialization (also called Glorot initialization) can help combat this problem. Then, we'll see how we can specify the way the weights for a given model are initialized in code, using the kernel_initializer parameter for a given layer in Keras.

Reference to original paper by Xavier Glorot and Yoshua Bengio:
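As a minimal sketch of the idea (not code from the video): Glorot and Bengio's scheme draws each weight from a distribution whose variance is 2 / (fan_in + fan_out). For the uniform variant, that means sampling from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). A small NumPy illustration, with the layer sizes chosen arbitrarily for the example:

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, seed=0):
    """Xavier/Glorot uniform initialization.

    Samples weights from U(-limit, limit) with
    limit = sqrt(6 / (fan_in + fan_out)), which gives the
    weights a variance of 2 / (fan_in + fan_out).
    """
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Initialize the weight matrix for a 256 -> 128 dense layer.
W = glorot_uniform(256, 128)

# The sample variance should land close to 2 / (256 + 128).
print(W.var(), 2.0 / (256 + 128))
```

In Keras, the same effect comes from passing this initializer to a layer, e.g. `Dense(64, activation='relu', kernel_initializer='glorot_uniform')` — and `glorot_uniform` is in fact the default `kernel_initializer` for `Dense` layers.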