







If your data is large, you can use stochastic gradient descent.
So, instead of updating weights on one batch/expoc of data, you update if more frequently








If your data is large, you can use stochastic gradient descent.
So, instead of updating weights on one batch/expoc of data, you update if more frequently