S. McCandlish, et al. (2018) " An Empirical Model of "Large-Batch…: am

am

(no subject)

Feb 09, 2019 23:28

S. McCandlish, et al. (2018)
"An Empirical Model of
"Large-Batch Training."

"The gradient noise scale
predicts the paralleliz-
ability of neural network
training on a wide range
of tasks."

Leave a comment

Previous post

Next post