DIMACS/TRIPODS Workshop on Optimization and Machine Learning
August 13, 2018 - August 15, 2018
Organizer(s):
Frank Curtis, Lehigh University
Stefanie Jegelka, Massachusetts Institute of Technology
Satyen Kale, Google
Edo Liberty, Amazon
Han Liu, Northwestern University
Francesco Orabona, Stony Brook University
Katya Scheinberg, Lehigh University
Martin Takáč, Lehigh University
The DIMACS/TRIPODS Workshop on Optimization and Machine Learning is being held at Lehigh University in association with two related events. It is immediately preceded by a three-day summer school for doctoral students interested in improving their theoretical and practical skills related to optimization approaches in machine learning. The workshop is followed by the annual Modeling and Optimization: Theory and Applications (MOPTA) conference, which covers related topics but is much broader in scope.
Although machine learning (ML) is still a relatively new field, it is growing at an astounding rate, and ML models are a driving force in optimization today. While some ML models—notably support vector machines and logistic regression—yield to more traditional convex optimization approaches, others—using deep neural networks—lead to nonlinear and nonconvex problems that stymie traditional solvers. This workshop will address several of the biggest optimization challenges that arise in practical ML applications and be organized around three key themes: 1) stochastic and nonconvex methods; 2) adaptive and online methods for learning under uncertainty; and 3) finding and exploiting special structure.
(1) Nonconvex and stochastic optimization methods. The optimization effort in ML applications has grown rapidly in recent years because of the difficulty and importance of training complex nonconvex models, such as deep neural networks. Stochastic methods have a significant advantage on these problems because computing the accurate function and gradient information required by deterministic methods may be prohibitive for very large data sets. At the same time, it is difficult to construct useful second-order information from stochastic estimates, and hence stochastic methods with fast convergence are still out of reach for training deep neural networks. On the other hand, exploiting second-order information appears necessary for deep neural networks in order to escape saddle-point regions, where training can otherwise stall at inferior solutions. Significant progress has been made recently on stochastic second-order methods and variance-reduction methods for convex models, as well as on fast deterministic second-order methods for deep neural networks. Second-order methods also require advanced matrix-approximation and sketching techniques to run efficiently. These methods are currently being tested and extended to stochastic nonconvex settings. Their promise makes them a timely and important workshop topic.
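To make the variance-reduction idea concrete, the following is a minimal SVRG-style sketch on a toy least-squares problem. The data, step size, and epoch length are illustrative assumptions, not a reference implementation of any method presented at the workshop.

```python
# Minimal SVRG-style sketch of variance reduction on least squares.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

def grad_i(x, i):
    """Stochastic gradient of the i-th squared-loss term."""
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    """Full (deterministic) gradient, computed once per outer epoch."""
    return A.T @ (A @ x - b) / n

x = np.zeros(d)
step = 0.01
for epoch in range(20):
    x_snap = x.copy()          # snapshot point
    mu = full_grad(x_snap)     # full gradient at the snapshot
    for _ in range(n):         # inner loop of cheap stochastic steps
        i = rng.integers(n)
        # variance-reduced gradient estimate: unbiased, with vanishing variance
        g = grad_i(x, i) - grad_i(x_snap, i) + mu
        x -= step * g
    print(epoch, 0.5 * np.mean((A @ x - b) ** 2))
```

The point of the correction term is that the stochastic estimate stays unbiased while its variance shrinks as the iterates approach the snapshot, which is what allows a constant step size and fast convergence in the convex setting.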
(2) Methods for Learning Under Uncertainty. A changing data distribution is a commonly encountered scenario in machine learning, especially when data are generated over time (as in spam filtering). Adaptive and online learning techniques able to handle changing data distributions are thus paramount in such situations. A recent trend is online learning methods that adapt to “easiness” in the data. This line of work provides algorithms that are able to exploit structure in the data (if it exists) in order to learn faster, while retaining optimal worst-case convergence rates. One of the aims of this workshop is to investigate which notions of easiness are amenable to these sorts of algorithms. Another active area of research in adaptive online learning concerns methods that simultaneously provide guarantees competitive with predictors of arbitrary complexity, where the guarantees degrade gracefully as the comparator's complexity grows. These methods are intimately connected with optimization methods and convex analysis. Beyond their performance guarantees, such algorithms also tend to be parameter-free, making them immensely practical because they require no tuning. Another aim of the workshop will be to gain insight into the optimal competitive guarantees that can be obtained in various scenarios of learning with unbounded-complexity predictors.
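The coin-betting viewpoint, which also appears in the program, gives one way to construct such parameter-free methods: the learner bets a data-dependent fraction of its accumulated "wealth" instead of using a tuned step size. Below is a minimal one-dimensional sketch in that spirit; the loss stream, comparator, and initial wealth are illustrative assumptions, not the specific algorithms discussed in the talks.

```python
# Minimal sketch of a parameter-free, coin-betting style online learner in 1-D.
import numpy as np

rng = np.random.default_rng(1)
T = 2000
u = 0.7            # unknown comparator point the learner should approach
wealth = 1.0       # initial endowment; note there is no step size to tune
grad_sum = 0.0     # running sum of negative (sub)gradients ("coin outcomes")
x = 0.0            # first prediction

for t in range(1, T + 1):
    # absolute-loss stream around a noisy target; subgradients lie in [-1, 1]
    target = u + 0.05 * rng.standard_normal()
    g = float(np.sign(x - target))
    wealth -= g * x                      # wealth changes by the coin outcome times the bet
    grad_sum -= g
    x = (grad_sum / (t + 1)) * wealth    # KT-style bet: a fraction of current wealth

print(f"final prediction {x:.3f} vs. comparator {u}")
```

Because the betting fraction is always strictly less than one in magnitude, the wealth stays positive, and the regret against any fixed comparator scales with that comparator's magnitude without the algorithm ever knowing it in advance.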
(3) Special structure for optimization in machine learning. The identification of convexity in machine learning problems led to a surge in machine learning methods and related optimization algorithms, with broad impact in theory and applications. Many of these methods rely on the optimality guarantees and scalability that convex optimization provides. With the rise of deep learning and other complex models, and the concomitant questions about computational versus statistical complexity, there is increasing interest in understanding the properties of nonconvex learning formulations. For example, in some cases nonconvex formulations may be statistically competitive with, if not superior to, convex ones. Still, the current theoretical understanding of empirical results for nonconvex learning is very limited and in need of deeper insight.
Scalability and guarantees for (nonconvex) optimization rely on additional mathematical structure. The workshop will explore types of structure that aid optimization and their potential impact on machine learning problems. Potential topics include geometric optimization, polynomial optimization, relaxations, restricted and generalized versions of convexity, and the blending of discrete and continuous optimization. Examples of the latter are optimization problems arising from learning representations of discrete objects, nonconvex relaxations of discrete optimization problems, and the exploitation of structure from discrete optimization in nonconvex optimization. Recent examples in machine learning include results for matrix factorization and other recovery problems, as well as for submodular optimization.
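As a small example of the kind of structured nonconvex problem referenced above, the sketch below recovers a low-rank matrix by running plain gradient descent on a factored (Burer-Monteiro style) objective. The problem sizes, rank, initialization scale, and step size are illustrative assumptions.

```python
# Minimal sketch: low-rank matrix recovery via gradient descent on the
# factored objective 0.5 * ||U V^T - M||_F^2.
import numpy as np

rng = np.random.default_rng(2)
n, m, r = 50, 40, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))  # ground-truth rank-r matrix

U = 0.1 * rng.standard_normal((n, r))   # small random initialization
V = 0.1 * rng.standard_normal((m, r))
step = 0.01
for it in range(2000):
    R = U @ V.T - M                     # residual
    gU, gV = R @ V, R.T @ U             # gradients of the factored objective
    U -= step * gU
    V -= step * gV

print("relative error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))
```

The factored problem is nonconvex, yet the low-rank structure and a small random initialization are typically enough for simple first-order methods to find a global minimizer in settings like this one, which is exactly the kind of structure-enabled guarantee the theme is about.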
Monday, August 13, 2018
Opening remarks
Plenary talk: Tractable Nonconvex Optimization via Geometry
Suvrit Sra, Massachusetts Institute of Technology
Uniform Convergence of Gradients for Non-convex Learning and Optimization
Karthik Sridharan, Cornell University
Logistic Regression: The Importance of Being Improper
Satyen Kale, Google
Break
Building Algorithms by Playing Games
Jake Abernethy, University of Michigan
Risk Bounds for Classification and Regression Models that Interpolate
Daniel Hsu, Columbia University
Lunch
Plenary talk: Representation, Optimization and Generalization Properties of Deep Neural Networks
Peter Bartlett, University of California, Berkeley
Parameter-free Nonsmooth Convex Stochastic Optimization through Coin Betting
Francesco Orabona, Stony Brook University
Data-dependent Hashing via Nonlinear Spectral Gaps
Alexandr Andoni, Columbia University
Break
Stochastic Optimization for AUC Maximization
Yiming Ying, University at Albany
The Power of Interpolation: Machine Learning without Loss Functions and Regularization
Mikhail Belkin, Ohio State University
Contextual Reinforcement Learning
John Langford, Microsoft Research
Social time at the Comfort Suites Bar
Tuesday, August 14, 2018
Maximizing Submodular Functions Exponentially Faster
Yaron Singer, Harvard University
Stefanie Jegelka, Massachusetts Institute of Technology
Nonconvex Sparse Deconvolution: Geometry and Efficient Methods
John Wright, Columbia University
Break
Hossein Mobahi, Google
Statistical Properties of Stochastic Gradient Descent
Panos Toulis, University of Chicago
Direct Runge-Kutta Discretization Achieves Acceleration
Aryan Mokhtari, Massachusetts Institute of Technology
Lunch
Plenary talk: Better Models in Optimization
John Duchi, Stanford University
Frank-Wolfe Splitting via Augmented Lagrangian Method
Simon Lacoste-Julien, University of Montreal
Second Order Optimization and Non-convex Machine Learning
Michael Mahoney, University of California, Berkeley
Break
Optimization over Nonnegative Polynomials
Amir Ali Ahmadi, Princeton University
Stochastic Quasi-gradient Methods: Variance Reduction via Jacobian Sketching
Peter Richtarik, University of Edinburgh
Wednesday, August 15, 2018
Plenary talk: Deep Learning with Dense Connectivity
Kilian Weinberger, Cornell University
Break
A Positive Outlook on Negative Curvature
Daniel Robinson, Johns Hopkins University
How to Characterize Worst-case Performance of Algorithms for Nonconvex Optimization
Frank Curtis, Lehigh University
Convex Lens for Non-convex Problems
Benjamin Haeffele, Johns Hopkins University
Break
Plenary talk: Practical Conditional Gradient Algorithms
Stephen Wright, University of Wisconsin, Madison
Lunch
Dimensionality Reduction Techniques for Global Optimization
Coralia Cartis, University of Oxford
“Active-set Complexity” of Proximal Gradient: How Long Does It Take to Find the Sparsity Pattern?
Mark Schmidt, University of British Columbia
Stochastic Methods for Non-smooth Non-convex Optimization
Damek Davis, Cornell University
Break
New Framework For Convergence Analysis Of Stochastic Optimization Methods (Part 1)
Katya Scheinberg, Lehigh University
New Framework For Convergence Analysis Of Stochastic Optimization Methods (Part 2)
Courtney Paquette, University of Waterloo
Do We Need 2nd Order Methods in Machine Learning?
Martin Takáč, Lehigh University
For the most up-to-date information on registration and participation, please see the workshop's primary webpage.
The workshop will feature plenary presentations each morning and two parallel sessions of invited presentations each afternoon, as well as a contributed poster session and reception on August 14.
Workshop program and book of abstracts
Videos of DIMACS/TRIPODS and TRIPODS/MOPTA plenary presentations
Sponsored by NSF: TRIPODS Institute for Optimization and Learning and DIMACS, in association with the Special Focus on Bridging Continuous and Discrete Optimization.