Backpropagation simplified!

If you are reading this post, then you should already know that the backpropagation algorithm is used to train a neural network by applying the chain rule of calculus. In simple terms, after each forward pass through the network, backpropagation performs a backward pass and adjusts the model’s parameters (weights and biases).

Backpropagation is by far the most fundamental building block of a neural network: it is the moment when the magic happens and your model actually learns from the data. You can find tons of resources online that attempt to explain how backpropagation works, but only a few of them actually simplify the learning process.

In this post, I will try to explain how the backpropagation algorithm works in a simplified and generalized way. So, let’s get started!

Before we jump into the math, picture a simple fully connected neural network consisting of one input layer, three hidden layers, and one output layer. The formulas we derive throughout this post will be valid for a network with any number of hidden layers.
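To have something concrete to point at later, here is a minimal NumPy sketch of such a network’s parameters. The layer sizes and variable names are my own choices for illustration, not anything prescribed by the post:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes: one input layer, three hidden layers, one output layer.
# The sizes themselves are made up for this example.
sizes = [4, 8, 8, 8, 1]

# One weight matrix w and bias vector b per layer transition.
weights = [0.1 * rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros((n_out, 1)) for n_out in sizes[1:]]
```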

We will assume that all the activation functions in our network are sigmoid and that the cost function is the (half) squared error.

So the cost function is:

Eq: 1   J = \frac{1}{2} \left( \hat{y} - y \right)^2

Note that in the cost function J above, ŷ is the predicted output and y is the real output; the 1/2 is there only to cancel the exponent when we differentiate.
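To make these definitions runnable, here is a hedged NumPy sketch of the sigmoid activation and the cost function (the function names are mine):

```python
import numpy as np

def sigmoid(z):
    # The sigmoid activation used by every layer (see Eq: 3 below).
    return 1.0 / (1.0 + np.exp(-z))

def cost(y_hat, y):
    # Eq: 1 -- the 1/2 cancels the exponent when differentiating.
    return 0.5 * np.sum((y_hat - y) ** 2)
```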

Two other important formulas related to the output layer that we will use later are:

Eq: 2   z^{(L)} = w^{(L)} a^{(L-1)} + b^{(L)}

Eq: 3   \hat{y} = a^{(L)} = \sigma(z^{(L)})

In equation 2, a^{(L-1)} is the activation of layer L−1, b^{(L)} is the bias term, and z^{(L)} is the weighted input to layer L.
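In code, equations 2 and 3 for a single layer might look like this, building on the sigmoid helper sketched above:

```python
def forward_layer(W, b, a_prev):
    # Eq: 2 -- the weighted input z of this layer.
    z = W @ a_prev + b
    # Eq: 3 -- the layer's activation.
    a = sigmoid(z)
    return z, a
```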

In order to update the weights between the last hidden layer and the output layer, the chain rule gives us:

Eq: 4   \frac{\partial J}{\partial w^{(L)}} = \frac{\partial J}{\partial z^{(L)}} \cdot \frac{\partial z^{(L)}}{\partial w^{(L)}}

Note that we can easily derive from equation 2 that:

Eq: 5   \frac{\partial z^{(L)}}{\partial w^{(L)}} = a^{(L-1)}

So, equation 4 can be written as:

Eq: 6   \frac{\partial J}{\partial w^{(L)}} = \frac{\partial J}{\partial z^{(L)}} \cdot a^{(L-1)}

Let’s use δ to denote the derivative of the cost function with respect to z for each layer, so we can write:

Eq: 7   \delta^{(l)} = \frac{\partial J}{\partial z^{(l)}}

By the way, we can calculate δ^{(L)} using the formulas below:

Eq: 8   \delta^{(L)} = \frac{\partial J}{\partial a^{(L)}} \cdot \sigma'(z^{(L)})

Eq: 9   \sigma'(z) = \sigma(z) \left( 1 - \sigma(z) \right)

Of course, in equation 8 we can use the formula below, which can be derived from equation 1:

Eq: 10   \frac{\partial J}{\partial a^{(L)}} = \hat{y} - y
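Putting equations 8, 9, and 10 together, the output-layer δ can be sketched as follows (again, the function names are my own):

```python
def sigmoid_prime(z):
    # Eq: 9 -- derivative of the sigmoid.
    s = sigmoid(z)
    return s * (1.0 - s)

def output_delta(y_hat, y, z_L):
    # Eq: 8 with Eq: 10 substituted in:
    # delta(L) = (y_hat - y) * sigma'(z(L)).
    return (y_hat - y) * sigmoid_prime(z_L)
```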

Going back a little: using equation 7, we can rewrite equation 6 as:

Eq: 11   \frac{\partial J}{\partial w^{(L)}} = \delta^{(L)} \cdot a^{(L-1)}

Equation 11 is very important for us, because we will use it as a generalized formula to update the weights of any layer. For example, for layer L−1, just shift the indices in equation 11, and we can easily find the derivative of the cost function with respect to the weights of layer L−1.

I mean this:

Eq: 12   \frac{\partial J}{\partial w^{(L-1)}} = \delta^{(L-1)} \cdot a^{(L-2)}
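As a sketch, equation 11 (and equation 12, which is the same formula with shifted indices) is just an outer product when the activations are column vectors:

```python
def weight_gradient(delta, a_prev):
    # Eq: 11 / Eq: 12 -- dJ/dw for a layer is that layer's delta times
    # the previous layer's activation; with column vectors this is an
    # outer product whose shape matches the weight matrix.
    # (The bias gradient, not derived in the post, would simply be delta.)
    return delta @ a_prev.T
```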

Of course, for this task we first need to find δ for that layer. And now we come to that part. :)

So, where were we? Yes, we need to calculate:

Eq: 13   \delta^{(L-1)} = \frac{\partial J}{\partial z^{(L-1)}}

We know from the chain rule that:

Eq: 14   \frac{\partial J}{\partial z^{(L-1)}} = \frac{\partial J}{\partial z^{(L)}} \cdot \frac{\partial z^{(L)}}{\partial z^{(L-1)}}

And again using the chain rule, equation 14 can be rewritten further as:

Eq: 15   \frac{\partial J}{\partial z^{(L-1)}} = \frac{\partial J}{\partial z^{(L)}} \cdot \frac{\partial z^{(L)}}{\partial a^{(L-1)}} \cdot \frac{\partial a^{(L-1)}}{\partial z^{(L-1)}}

Then, with the help of equations 2 and 3, we can rewrite equation 15 as:

Eq: 16   \frac{\partial J}{\partial z^{(L-1)}} = \frac{\partial J}{\partial z^{(L)}} \cdot w^{(L)} \cdot \sigma'(z^{(L-1)})

So, by combining equations 7, 13, and 16, we can write:

Eq: 17   \delta^{(L-1)} = \delta^{(L)} \cdot w^{(L)} \cdot \sigma'(z^{(L-1)})
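A hedged, vectorized sketch of equation 17, where W_next.T @ delta_next plays the role of w^{(L)} · δ^{(L)} once the layers hold more than one neuron (sigmoid_prime as sketched earlier):

```python
def hidden_delta(W_next, delta_next, z):
    # Eq: 17 -- propagate delta one layer backward.
    return (W_next.T @ delta_next) * sigmoid_prime(z)
```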

So, as you can guess, we can plug equation 17 into equation 12 and then use the result to update the weights of layer L−1. And please note that you can shift the indices by 1 again to calculate δ for layer L−2, and so on.

If you have made it this far, you have noticed that we derived quite a few formulas. But in the end, we found a generalized recipe to use in backpropagation.

The key generalized formulas are equations 7, 11, and 17. And remember: we start from the last layer and go backward, iteratively finding δ for each layer and then using it to calculate the weight gradients.
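To tie equations 7, 11, and 17 together, here is a self-contained sketch of one full forward and backward pass. Everything outside those three equations (the names, the shapes, skipping the bias gradients) is my own choice, not the post’s prescription:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop(weights, biases, x, y):
    # Forward pass: store every weighted input z (Eq: 2)
    # and every activation a (Eq: 3).
    a, activations, zs = x, [x], []
    for W, b in zip(weights, biases):
        z = W @ a + b
        a = sigmoid(z)
        zs.append(z)
        activations.append(a)

    # Delta of the output layer: Eqs: 8-10.
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])

    # Walk backward: Eq: 11 for each weight gradient,
    # Eq: 17 to pass delta one layer back.
    grads = [None] * len(weights)
    grads[-1] = delta @ activations[-2].T
    for l in range(len(weights) - 2, -1, -1):
        delta = (weights[l + 1].T @ delta) * sigmoid_prime(zs[l])
        grads[l] = delta @ activations[l].T
    return grads
```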

And of course, one last formula that you all know: the weight update rule. :)

Eq: 18   w^{(l)} \leftarrow w^{(l)} - \eta \cdot \frac{\partial J}{\partial w^{(l)}}

where η is the learning rate.
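In code, with the gradients returned by the sketch above, equation 18 is just the following (η = 0.1 is an arbitrary choice):

```python
eta = 0.1  # learning rate

# Eq: 18, applied in place to every layer's weights.
for W, dW in zip(weights, grads):
    W -= eta * dW
```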

I hope this post was useful for understanding backpropagation. See you in the next post!
