Backpropagation is the standard algorithm for training neural networks. From what I understand, it works like this:

1. Set all the weights of your neural network to random values.

2. Take a training example, put it through your neural network, and record the output it produces.

3. Compare your neural network’s output with the output that you want (the correct output).

4. Adjust your weights.

5. Keep doing this over and over again until your network gets the output that you want. Once this happens, your neural network has learned.
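The steps above can be sketched in a few lines of code. This is just a toy illustration, not the exact setup from any of the references: a single sigmoid neuron learning the OR function, with an arbitrary learning rate and number of iterations.

```python
# A minimal sketch of the five steps above: a single sigmoid neuron
# learning the OR function. The learning rate and iteration count are
# arbitrary choices for this example.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Training examples: inputs and the outputs we want (the OR function).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)

# Step 1: set the weights (and a bias) to random values.
w = rng.normal(size=2)
b = 0.0
lr = 1.0  # learning rate

for epoch in range(1000):
    # Step 2: put the training examples through the network.
    pred = sigmoid(X @ w + b)
    # Step 3: compare the network's output to the output we want.
    error = pred - y
    # Step 4: adjust the weights (gradient of the squared error).
    grad = error * pred * (1 - pred)
    w -= lr * (X.T @ grad)
    b -= lr * grad.sum()
    # Step 5: repeat until the network gets the output we want.

print(np.round(sigmoid(X @ w + b)))
```

After training, the rounded predictions match the OR truth table.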

This is called Supervised Learning because you can check whether your network is correct by comparing its results to your training examples.

The weights are adjusted using the Gradient Descent Algorithm. This is how the weights and biases are tweaked. Gradient Descent is summarized well here. It finds a local minimum of a function, which is a point where the slope of the function is zero. (In other words, where the derivative of the function is equal to 0.) The problem is that you can only efficiently solve for where the derivative is zero when the function has a few variables. Neural networks can get very big, sometimes having millions of variables, so finding the minimum directly this way is computationally intractable. Gradient Descent gets around this by repeatedly taking small steps downhill along the slope instead of solving the equation directly. That is why the Gradient Descent Algorithm is used for huge networks. The next step for me is to write a program for the Gradient Descent Algorithm and maybe also a program that can compute general derivatives.
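Here is a tiny sketch of that idea on a single variable: minimizing f(x) = (x − 3)², whose minimum is at x = 3. The starting point and step size are arbitrary choices for this example; instead of solving f′(x) = 0 directly, we just keep stepping against the slope.

```python
# Gradient descent on one variable: minimize f(x) = (x - 3)^2.
# Rather than solving f'(x) = 0 directly, step repeatedly in the
# direction opposite the slope.

def f(x):
    return (x - 3) ** 2

def f_prime(x):           # derivative of f, worked out by hand
    return 2 * (x - 3)

x = 10.0                  # arbitrary starting point
lr = 0.1                  # step size (learning rate)

for _ in range(100):
    x -= lr * f_prime(x)  # move downhill along the slope

print(x)                  # converges toward the minimum at x = 3
```

Each step shrinks the distance to the minimum by a constant factor, so after 100 steps x is essentially 3.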

Summary: Backpropagation finds the error in your weights by comparing your neural network’s output to the output that you want your network to have, and the Gradient Descent Algorithm is used to adjust those weights.


## References:

Michael A. Nielsen, *Neural Networks and Deep Learning*, Determination Press, 2015.

Andrew Trask, “A Neural Network in 13 Lines of Python (Part 2 – Gradient Descent),” i am trask, 27 July 2015.