gradient_descent

gradient descent error

notes1

notes2

notes3

notes4

notes5

algorithm

If your data is large, you can use stochastic gradient descent. So, instead of updating weights on one batch/expoc of data, you update if more frequently