[Feature Pitch] Full-batch optimization toolkit #55279
Comments
Update: I've added a preliminary constrained optimizer to the proposed library. I included a tutorial showing how to find an optimal adversarial perturbation subject to a perturbation norm constraint.
@mruberry @mrshenli - I had an alternative idea that I wanted to share. I realize that adding a new "minimize" module could be difficult to coordinate, since the schematic differs considerably from existing torch optimizers. As an alternative, I wrote a new "Minimize" class that inherits from the torch optimizer base class. Currently, Minimize interacts with scipy behind the scenes, so the front-end schematic is exactly like that of the other optimizers (parameters are torch.Tensor or nn.Parameter, objectives use torch, etc.). CUDA is supported, but it will only be used for evaluating the objective and gradient; the other numerical computations performed by the underlying solvers run on CPU (with numpy arrays). Eventually it would be ideal to use custom torch implementations of the underlying solver algorithms and remove the scipy dependency.
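For concreteness, here is a rough sketch of how a scipy-backed torch front end of this kind could look. This is my own illustrative reduction, not the actual Minimize implementation; the function name, signature, and internals are assumptions. Torch evaluates the objective and gradient (possibly on GPU), while scipy's solver drives the iteration on CPU float64 arrays:

```python
import torch
from scipy.optimize import minimize as scipy_minimize

def minimize(objective, x0, method="BFGS"):
    """Hypothetical torch front end that delegates to scipy.optimize."""
    x0_t = x0.detach()

    def fun(x_np):
        # Rebuild a torch tensor (on x0's device) and evaluate f and grad f.
        x = torch.tensor(x_np, dtype=x0_t.dtype, device=x0_t.device,
                         requires_grad=True)
        f = objective(x)
        (g,) = torch.autograd.grad(f, x)
        # scipy's solvers operate on float64 numpy arrays on the CPU.
        return f.item(), g.detach().cpu().double().numpy()

    res = scipy_minimize(fun, x0_t.cpu().double().numpy(),
                         method=method, jac=True)
    return torch.tensor(res.x, dtype=x0_t.dtype, device=x0_t.device)

# Usage: minimize the Rosenbrock function from a torch starting point.
rosen = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
x_opt = minimize(rosen, torch.tensor([-1.0, 1.0], dtype=torch.float64))
```

Note that `jac=True` tells scipy the callable returns both the value and the gradient, so the analytic autograd gradient is used with no finite differencing.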
Hey @rfeinman! This is a cool idea and I'd like to discuss it with more of the PyTorch team. One question we always have, however, with possible extensions is: could the desired functionality be provided as a library on top of PyTorch, or does it need to be included in PyTorch itself? Are there advantages to including this in PyTorch that may not be obvious?
@mruberry valid question. Deterministic optimization is a well-established area of machine learning, and I feel that PyTorch users could get a lot out of having these tools in the core library. I've seen a number of forum posts and tickets asking about a Conjugate Gradient optimizer. Although PyTorch is primarily a neural network library, people now use it for a range of scientific computing applications, especially those which can be accelerated by GPU. In addition to supporting the neural network applications that I mentioned in my original ticket, I think deterministic optimizers can help expand the PyTorch horizon by offering high-performance implementations of widely-used tools from libraries like SciPy and MATLAB. It would seem to fit well with other ongoing PyTorch projects.
Makes sense, @rfeinman. @vincentqb, what are your thoughts?
@mruberry @vincentqb I put together a write-up of the proposed API here. It includes a summary of both the functional and object-oriented APIs that I've mentioned. I'd be curious to hear your thoughts.
It would be great to support such functionality in PyTorch, as this is core functionality available in SciPy and there is no actively maintained third-party PyTorch library for it. This also echoes related feature requests for Levenberg-Marquardt, non-linear Conjugate Gradient, and the like, as well as several related third-party libraries. To some extent, one can already rely on LBFGS, but this may not always be the best option, and bound constraints have not been implemented yet; that is, there is no L-BFGS-B in PyTorch.
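To illustrate what "relying on LBFGS" already looks like today: torch.optim.LBFGS takes a closure that re-evaluates the full-batch loss, because the optimizer may call it several times per step during the line search. A minimal sketch on a toy objective (the test function is my choice, not from this thread):

```python
import torch

# Full-batch minimization of the Rosenbrock function with the built-in
# (unconstrained) L-BFGS; note there is no way to specify bounds.
x = torch.tensor([-1.0, 1.0], dtype=torch.float64, requires_grad=True)
opt = torch.optim.LBFGS([x], max_iter=100, line_search_fn="strong_wolfe")

def closure():
    # The optimizer calls this repeatedly, so it must zero the gradients,
    # recompute the loss, backprop, and return the loss each time.
    opt.zero_grad()
    loss = (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
    loss.backward()
    return loss

opt.step(closure)  # a single "step" runs up to max_iter inner iterations
```

This works well for smooth unconstrained problems, but as noted above there is no bound-constrained variant (L-BFGS-B) to fall back on.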
I have just stumbled onto Theseus, which addresses some of the needs discussed here, namely unconstrained non-linear least squares.
🚀 Feature
It would be great to have a library of optimization routines for the deterministic setting (a la scipy.optimize) using PyTorch autograd mechanics. I have written a prototype library of PyTorch-based function minimizers and I'd like to gauge the interest of the PyTorch community. This could perhaps be a submodule of torch.optim, or it could be included elsewhere.

Motivation
For a variety of optimization problems, deterministic (or "full-batch") routines such as Newton and quasi-Newton methods are preferable over vanilla gradient descent. Although PyTorch offers many routines for stochastic optimization, utilities for deterministic optimization are scarce; only L-BFGS is included in the torch.optim package, and it's modified for mini-batch training.

MATLAB and SciPy are the industry standards for deterministic optimization. These libraries have a comprehensive set of routines; however, automatic differentiation is not supported. Therefore, the user must specify 1st- and 2nd-order gradients explicitly (they must be known) or use finite-difference approximations. This limits the applications considerably.
It would be wonderful to have a library of deterministic optimization routines in PyTorch. As a standalone entity, the library would be preferable over SciPy/MATLAB: the user only has to provide a function (not jacobian/hessian functions), and analytical derivatives are always used. But perhaps more importantly, it would be great to have these resources available inside of PyTorch to use alongside all of its other great tools.
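The limitation described above is easy to see with SciPy alone: without autograd, the user must hand-derive the gradient (or fall back to finite differences). A small example, using the standard Rosenbrock test function as my own illustration:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def grad_f(x):
    # Hand-derived gradient of the Rosenbrock function -- this is the
    # step a torch-based minimizer would make unnecessary via autograd.
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

res = minimize(f, np.array([-1.0, 1.0]), jac=grad_f, method="BFGS")
# Omitting jac= makes scipy approximate the gradient by finite
# differences: more function evaluations and less accuracy.
```

For a two-variable polynomial the derivation is trivial, but for objectives built from deep networks it is not, which is where autograd-backed minimizers pay off.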
Pitch
See complete pitch at pytorch-minimize.
My current library offers 8 different methods for unconstrained minimization. A complete description of these algorithms is provided in the readme. Constrained minimization is not currently implemented, but could perhaps be included in the future.
methods: BFGS, L-BFGS, Conjugate Gradient (CG), Newton Conjugate Gradient (NCG), Newton Exact, Dogleg, Trust-Region Exact, Trust-Region NCG
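As a flavor of what autograd buys for the second-order methods in the list above, here is a minimal sketch in the spirit of "Newton Exact", with both the gradient and the Hessian obtained automatically. The function name, signature, and test problem are illustrative assumptions, not the library's actual API:

```python
import torch

# Hypothetical full Newton iteration: autograd supplies the gradient and
# the exact Hessian, so the user only ever writes the objective f.
def newton_exact(f, x0, max_iter=50):
    x = x0.detach().clone()
    for _ in range(max_iter):
        x.requires_grad_(True)
        (g,) = torch.autograd.grad(f(x), x)
        x = x.detach()
        # Solve H @ step = g for the full Newton step, then move against it.
        H = torch.autograd.functional.hessian(f, x)
        x = x - torch.linalg.solve(H, g)
    return x

# Usage: a smooth toy objective with its minimum at (3, -1).
f = lambda x: (x[0] - 3)**2 + (x[1] + 1)**4
x_star = newton_exact(f, torch.tensor([0.0, 0.0]))
```

A production implementation would of course add a convergence test, line search or trust-region safeguards, and a factorization-based solve, which is precisely what the listed Dogleg and Trust-Region variants address.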
Additional context
torch.optim is tailored to training neural networks with stochastic (mini-batch) gradient descent. I envision a minimize package that is tailored to a different set of problems, namely those of the deterministic or "full-batch" setting. For example, we may want to do various manipulations with a trained neural net, such as: search for adversarial examples, traverse a latent space (e.g. StyleGAN), or find a MAP estimate of latent variables. In addition, we may want to solve an optimization sub-problem inside the loop of our neural network optimization, as for example in EM algorithms like sparse dictionary learning.

cc @mruberry @kurtamohler @vincentqb