1. Which of these is SGD, which is batch gradient descent, and which is mini-batch gradient descent? What do they have in common?
2. Are they all equal? What are the advantages and disadvantages of each?
3. Given these advantages and disadvantages, which might we prefer?