[Feature Pitch] Full-batch optimization toolkit #55279
Comments
Update: I've added a preliminary constrained optimizer to the proposed library. I included a tutorial showing how to find an optimal adversarial perturbation subject to a perturbation norm constraint.
@mruberry @mrshenli - I had an alternative idea that I wanted to share. I realize that adding a new "minimize" module could be difficult to coordinate, since the schematic differs considerably from existing torch optimizers. As an alternative, I wrote a new "Minimize" class that inherits from the torch optimizer base class. Currently, Minimize interacts with scipy behind the scenes, so the front-end schematic is exactly like that of the other optimizers (parameters are torch.Tensor or nn.Parameter, objectives use torch, etc.). CUDA is supported, but it will only be used for evaluating the objective and gradient; the other numerical computations performed by the underlying solvers run on CPU (with numpy arrays). Eventually it would be ideal to use custom torch implementations of the underlying solver algorithms and remove the scipy dependency.
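For concreteness, here is a rough sketch of how a scipy-backed torch front end of this kind could look. This is my own illustrative reduction, not the actual Minimize implementation; the function name, signature, and internals are assumptions. Torch evaluates the objective and gradient (possibly on GPU), while scipy's solver drives the iteration on CPU float64 arrays:

```python
import torch
from scipy.optimize import minimize as scipy_minimize

def minimize(objective, x0, method="BFGS"):
    """Hypothetical torch front end that delegates to scipy.optimize."""
    x0_t = x0.detach()

    def fun(x_np):
        # Rebuild a torch tensor (on x0's device) and evaluate f and grad f.
        x = torch.tensor(x_np, dtype=x0_t.dtype, device=x0_t.device,
                         requires_grad=True)
        f = objective(x)
        (g,) = torch.autograd.grad(f, x)
        # scipy's solvers operate on float64 numpy arrays on the CPU.
        return f.item(), g.detach().cpu().double().numpy()

    res = scipy_minimize(fun, x0_t.cpu().double().numpy(),
                         method=method, jac=True)
    return torch.tensor(res.x, dtype=x0_t.dtype, device=x0_t.device)

# Usage: minimize the Rosenbrock function from a torch starting point.
rosen = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
x_opt = minimize(rosen, torch.tensor([-1.0, 1.0], dtype=torch.float64))
```

Note that `jac=True` tells scipy the callable returns both the value and the gradient, so the analytic autograd gradient is used with no finite differencing.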
Hey @rfeinman! This is a cool idea and I'd like to discuss it with more of the PyTorch team. One question we always have, however, with possible extensions is: could the desired functionality be provided as a library on top of PyTorch, or does it need to be included in PyTorch itself? Are there advantages to including this in PyTorch that may not be obvious?
@mruberry valid question. Deterministic optimization is a well-established area of machine learning, and I feel that PyTorch users could get a lot out of having these tools in the core library. I've seen a number of forum posts and tickets asking about a Conjugate Gradient optimizer. Although PyTorch is primarily a neural network library, people now use it for a range of scientific computing applications, especially those which can be accelerated by GPU. In addition to supporting the neural network applications that I mentioned in my original ticket, I think deterministic optimizers can help expand the PyTorch horizon by offering high-performance implementations of widely-used tools from libraries like SciPy and MATLAB. It would seem to fit well with other ongoing PyTorch projects.
Makes sense, @rfeinman. @vincentqb, what are your thoughts?
@mruberry @vincentqb I put together a write-up of the proposed API here. It includes a summary of both the functional and object-oriented APIs that I've mentioned. I'd be curious to hear your thoughts.
It would be great to support such functionality in PyTorch, as this is core functionality available in SciPy and there is no actively maintained third-party PyTorch library for it. This also echoes related feature requests for Levenberg-Marquardt, non-linear Conjugate Gradient, and the like, as well as several related third-party libraries. To some extent, one can already rely on LBFGS, but this may not always be the best option, and bound constraints have not been implemented yet; that is, there is no L-BFGS-B in PyTorch.
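To illustrate what "relying on LBFGS" already looks like today: torch.optim.LBFGS takes a closure that re-evaluates the full-batch loss, because the optimizer may call it several times per step during the line search. A minimal sketch on a toy objective (the test function is my choice, not from this thread):

```python
import torch

# Full-batch minimization of the Rosenbrock function with the built-in
# (unconstrained) L-BFGS; note there is no way to specify bounds.
x = torch.tensor([-1.0, 1.0], dtype=torch.float64, requires_grad=True)
opt = torch.optim.LBFGS([x], max_iter=100, line_search_fn="strong_wolfe")

def closure():
    # The optimizer calls this repeatedly, so it must zero the gradients,
    # recompute the loss, backprop, and return the loss each time.
    opt.zero_grad()
    loss = (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
    loss.backward()
    return loss

opt.step(closure)  # a single "step" runs up to max_iter inner iterations
```

This works well for smooth unconstrained problems, but as noted above there is no bound-constrained variant (L-BFGS-B) to fall back on.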
I have just stumbled onto Theseus, which addresses some of the needs discussed here, namely unconstrained non-linear least squares.
🚀 Feature
It would be great to have a library of optimization routines for the deterministic setting (a la scipy.optimize) using PyTorch autograd mechanics. I have written a prototype library of PyTorch-based function minimizers and I'd like to gauge the interest of the PyTorch community. This could perhaps be a submodule of torch.optim, or it could be included elsewhere.

Motivation
For a variety of optimization problems, deterministic (or "full-batch") routines such as Newton and quasi-Newton methods are preferable over vanilla gradient descent. Although PyTorch offers many routines for stochastic optimization, utilities for deterministic optimization are scarce; only L-BFGS is included in the torch.optim package, and it's modified for mini-batch training.

MATLAB and SciPy are the industry standards for deterministic optimization. These libraries have a comprehensive set of routines; however, automatic differentiation is not supported. Therefore, the user must specify 1st- and 2nd-order gradients explicitly (they must be known) or use finite-difference approximations. This limits the applications considerably.
It would be wonderful to have a library of deterministic optimization routines in PyTorch. As a standalone entity, the library would be preferable over SciPy/MATLAB: the user only has to provide a function (not jacobian/hessian functions), and analytical derivatives are always used. But perhaps more importantly, it would be great to have these resources available inside of PyTorch to use alongside all of its other great tools.
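The limitation described above is easy to see with SciPy alone: without autograd, the user must hand-derive the gradient (or fall back to finite differences). A small example, using the standard Rosenbrock test function as my own illustration:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def grad_f(x):
    # Hand-derived gradient of the Rosenbrock function -- this is the
    # step a torch-based minimizer would make unnecessary via autograd.
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

res = minimize(f, np.array([-1.0, 1.0]), jac=grad_f, method="BFGS")
# Omitting jac= makes scipy approximate the gradient by finite
# differences: more function evaluations and less accuracy.
```

For a two-variable polynomial the derivation is trivial, but for objectives built from deep networks it is not, which is where autograd-backed minimizers pay off.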
Pitch
See complete pitch at pytorch-minimize.
My current library offers 8 different methods for unconstrained minimization. A complete description of these algorithms is provided in the readme. Constrained minimization is not currently implemented, but could perhaps be included in the future.
methods: BFGS, L-BFGS, Conjugate Gradient (CG), Newton Conjugate Gradient (NCG), Newton Exact, Dogleg, Trust-Region Exact, Trust-Region NCG
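As a flavor of what autograd buys for the second-order methods in the list above, here is a minimal sketch in the spirit of "Newton Exact", with both the gradient and the Hessian obtained automatically. The function name, signature, and test problem are illustrative assumptions, not the library's actual API:

```python
import torch

# Hypothetical full Newton iteration: autograd supplies the gradient and
# the exact Hessian, so the user only ever writes the objective f.
def newton_exact(f, x0, max_iter=50):
    x = x0.detach().clone()
    for _ in range(max_iter):
        x.requires_grad_(True)
        (g,) = torch.autograd.grad(f(x), x)
        x = x.detach()
        # Solve H @ step = g for the full Newton step, then move against it.
        H = torch.autograd.functional.hessian(f, x)
        x = x - torch.linalg.solve(H, g)
    return x

# Usage: a smooth toy objective with its minimum at (3, -1).
f = lambda x: (x[0] - 3)**2 + (x[1] + 1)**4
x_star = newton_exact(f, torch.tensor([0.0, 0.0]))
```

A production implementation would of course add a convergence test, line search or trust-region safeguards, and a factorization-based solve, which is precisely what the listed Dogleg and Trust-Region variants address.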
Additional context
torch.optim is tailored to training neural networks with stochastic (mini-batch) gradient descent. I envision a minimize package that is tailored to a different set of problems, namely those of the deterministic or "full-batch" setting. For example, we may want to do various manipulations with a trained neural net, such as: search for adversarial examples, traverse a latent space (e.g. StyleGAN), or find a MAP estimate of latent variables. In addition, we may want to solve an optimization sub-problem inside the loop of our neural network optimization, as for example in EM algorithms like sparse dictionary learning.

cc @mruberry @kurtamohler @vincentqb