Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio

Summary


Authors: Stanislaw Jastrzębski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey

Journal title: Artificial Neural Networks and Machine Learning – ICANN 2018 - 27th International Conference on Artificial Neural Networks, Rhodes, Greece, October 4-7, 2018, Proceedings, Part III

Journal number: 11141

Journal publisher: Springer International Publishing

Published year: 2018

Published pages: 392-402

DOI identifier: 10.1007/978-3-030-01424-7_39

ISBN: 978-3-030-01423-0