DIMACS/TRIPODS Workshop on Optimization and Machine Learning

Statistical Properties of Stochastic Gradient Descent

August 14, 2018, 11:30 AM - 12:00 PM

Location:

Iacocca Hall

Lehigh University

Bethlehem, PA


Panos Toulis, University of Chicago

Stochastic gradient descent (SGD) is remarkably multi-faceted: to machine learners it is a powerful optimization method, while to statisticians it is mainly a method for iterative estimation. While several important results are known for the optimization properties of SGD, surprisingly little is known about its statistical properties. In this talk, I review recent results on doing statistics with SGD, including analytic formulas for the asymptotic covariance matrix of SGD-based estimators and a numerically stable variant of SGD with implicit updates. Together, these results open up the possibility of principled statistical analysis with SGD, including classical inference and hypothesis testing. On inference specifically, I present current work showing that, with an appropriate selection of the learning rate, the asymptotic covariance matrix of SGD is isotropic and parameter-free. As such, some SGD-based estimators can easily be transformed into pivotal quantities, which substantially simplifies inference. This is a unique and remarkable property of SGD, even compared to popular estimation methods favored by statisticians, such as maximum likelihood, and it highlights the untapped potential of SGD for fast and principled estimation with large data sets.
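
The implicit-update variant mentioned in the abstract evaluates the gradient at the next iterate rather than the current one, which damps the effect of large learning rates. The sketch below illustrates this for the special case of linear regression with squared-error loss, where the implicit equation admits a closed-form solution; the function name, learning-rate schedule, and toy data here are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def implicit_sgd_linear(X, y, lr0=1.0, alpha=1.0):
    """Sketch of implicit SGD for linear regression with squared-error loss.

    Standard SGD:  theta_n = theta_{n-1} + g_n * (y_n - x_n' theta_{n-1}) * x_n
    Implicit SGD:  theta_n = theta_{n-1} + g_n * (y_n - x_n' theta_n)     * x_n
    For squared-error loss, the implicit equation can be solved exactly:
    the step is rescaled by 1 / (1 + g_n * ||x_n||^2), which keeps the
    iterates stable even when the learning rate is large.
    """
    n_obs, p = X.shape
    theta = np.zeros(p)
    for n in range(1, n_obs + 1):
        x_n, y_n = X[n - 1], y[n - 1]
        g_n = lr0 / (alpha + n)       # illustrative Robbins-Monro schedule
        resid = y_n - x_n @ theta     # residual at the current iterate
        # closed-form solution of the implicit update for this loss
        theta = theta + (g_n / (1.0 + g_n * (x_n @ x_n))) * resid * x_n
    return theta

# Toy usage: recover known coefficients from simulated data.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + rng.normal(scale=0.1, size=5000)
print(implicit_sgd_linear(X, y, lr0=5.0))  # remains stable despite large lr0
```

With a standard (explicit) update, a learning rate this large can cause the early iterates to diverge; the implicit rescaling shrinks the step automatically, which is the numerical-stability property the abstract refers to.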