Conversation
Memory benchmark result

| Test Name | %Δ | Master (MB) | PR (MB) | Δ (MB) | Time PR (s) | Time Master (s) |
| ------------------------------------ | ------- | ----------- | --------- | ------- | ----------- | --------------- |
| test_objective_jac_w7x | 3.82 % | 3.850e+03 | 3.997e+03 | 147.16 | 39.06 | 36.34 |
| test_proximal_jac_w7x_with_eq_update | 1.56 % | 6.493e+03 | 6.594e+03 | 101.02 | 162.92 | 161.90 |
| test_proximal_freeb_jac | -0.16 % | 1.321e+04 | 1.319e+04 | -20.82 | 85.01 | 83.21 |
| test_proximal_freeb_jac_blocked | -0.71 % | 7.546e+03 | 7.492e+03 | -53.29 | 74.37 | 74.28 |
| test_proximal_freeb_jac_batched | -0.00 % | 7.487e+03 | 7.486e+03 | -0.21 | 73.78 | 73.48 |
| test_proximal_jac_ripple | -1.39 % | 3.550e+03 | 3.501e+03 | -49.41 | 66.38 | 67.49 |
| test_proximal_jac_ripple_bounce1d | 5.89 % | 3.455e+03 | 3.659e+03 | 203.66 | 77.63 | 79.54 |
| test_eq_solve | -0.33 % | 2.030e+03 | 2.023e+03 | -6.73 | 95.13 | 94.21 |

For the memory plots, go to the summary of
Codecov Report ❌ Patch coverage is

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##           master    #2041   +/-   ##
========================================
  Coverage   94.53%   94.53%
========================================
  Files         102      102
  Lines       28712    28823    +111
========================================
+ Hits        27143    27249    +106
- Misses       1569     1574      +5
```
dpanici left a comment
just the small docstring fix, should be explicit that `x_scale="auto"` does no scaling here
f0uriest left a comment
I'd double check that the x_scale logic is correct
Also, did you look at whether we could just wrap stuff from optax?
From the examples, e.g. https://optax.readthedocs.io/en/latest/api/optimizers.html#optax.adam, it looks like the user could just pass in an optax solver and then we can just do

```python
opt_state = solver.init(x0)
...
g = grad(x) * x_scale
updates, opt_state = solver.update(g, opt_state, x)
x = optax.apply_updates(x, x_scale * updates)
```

or something similar. That would give users access to a much wider array of first order optimizers, and save us having to do it all ourselves.
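A slightly more complete version of that loop, as a hedged sketch (the names `loss_fn`, `x0`, `x_scale`, and `n_steps` are illustrative; it assumes flat `jnp` arrays rather than the actual DESC objective interface):

```python
import jax
import jax.numpy as jnp
import optax


def run_optax(loss_fn, x0, x_scale, solver, n_steps=100):
    """Minimal first-order loop around a user-supplied optax solver."""
    opt_state = solver.init(x0)
    grad_fn = jax.grad(loss_fn)
    x = x0
    for _ in range(n_steps):
        g = grad_fn(x) * x_scale  # gradient in the scaled variables
        updates, opt_state = solver.update(g, opt_state, x)
        x = optax.apply_updates(x, x_scale * updates)  # step back in the unscaled variables
    return x


# e.g. with Adam:
# x_opt = run_optax(loss_fn, x0, jnp.ones_like(x0), optax.adam(1e-3), n_steps=500)
```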
desc/optimize/stochastic.py (Outdated)

```diff
- def sgd(
+ def generic_sgd(
```
sgd is technically public (https://desc-docs.readthedocs.io/en/stable/_api/optimize/desc.optimize.sgd.html#desc.optimize.sgd) so if we want to change the name we should keep an alias to the old one with a deprecation warning. That said, I'm not sure we really need to change the name. "SGD" is already used pretty generically in the ML community for a bunch of first order stochastic methods like ADAM, ADAGRAD, RMSPROP, etc
Yeah, SGD is in fact the general name. I can revert to the old name and just emphasize that the "sgd" option uses Nesterov momentum. I was trying to make a distinction, I guess.
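If the old public name does get superseded (whether by a rename or by the `optax-sgd` route discussed below), a minimal sketch of keeping `sgd` as a deprecated alias; the `generic_sgd` name is taken from the diff above and its signature here is only a placeholder:

```python
import warnings


def generic_sgd(fun, x0, grad, **options):
    """Renamed implementation (placeholder signature for illustration)."""
    ...


def sgd(*args, **kwargs):
    """Deprecated alias, kept because ``sgd`` is part of the documented public API."""
    warnings.warn(
        "`sgd` is deprecated; use `generic_sgd` instead. The old name will be "
        "removed in a future release.",
        DeprecationWarning,
        stacklevel=2,
    )
    return generic_sgd(*args, **kwargs)
```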
desc/optimize/stochastic.py (Outdated)

```
    for the update rule chosen.

    - ``"alpha"`` : (float > 0) Learning rate. Defaults to
      1e-1 * ||x_scaled|| / ||g_scaled||.
```
this seems pretty large (steps would be 10% of x), have you checked how robust this is?
I was trying to solve equilibria with these, and even though none of them converged, 10% was better for a variety of equilibria. I haven't checked other optimization problems. Reverted the change and added a safeguard against 0 and NaNs.
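For concreteness, a hedged sketch of a scaled default step size with a guard against zero or NaN gradients; the 10% factor comes from this thread, while the fallback value is an assumption, not the final implementation:

```python
import jax.numpy as jnp


def default_alpha(x_scaled, g_scaled, frac=1e-1, fallback=1e-2):
    """Default learning rate ~ frac * ||x_scaled|| / ||g_scaled||, guarded against 0/NaN."""
    alpha = frac * jnp.linalg.norm(x_scaled) / jnp.linalg.norm(g_scaled)
    # fall back to a fixed step if the ratio is zero, infinite, or NaN
    return jnp.where(jnp.isfinite(alpha) & (alpha > 0), alpha, fallback)
```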
desc/optimize/stochastic.py (Outdated)

```
    Where alpha is the step size and beta is the momentum parameter.
    Update rule for ``'sgd'``:

    .. math::
```
personally I prefer unicode for stuff like this. TeX looks nice in the rendered html docs, but is much harder to read as code.
I mostly agree and don't have a strong stance either way. My general preference is to use LaTeX for complex equations or public-facing objectives that users will first encounter in the documentation (like guiding center equations, optimization algorithms). For internal development notes or specific compute functions that aren't usually viewed on the web, I’m fine with Unicode since it keeps the source code more readable.
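For reference, the textbook Nesterov-momentum update being discussed, with step size α and momentum β (the standard form, not necessarily the exact convention used in `_sgd`):

```latex
v_{k+1} = \beta\, v_k - \alpha\, \nabla f\!\left(x_k + \beta\, v_k\right),
\qquad
x_{k+1} = x_k + v_{k+1}
```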
desc/optimize/stochastic.py (Outdated)

```python
    return result


def _sgd(g, v, alpha, beta):
```
is this the same as some version of optax-sgd? if so I'd vote to remove this and just do something like if method == "sgd": method = "optax-sgd". Then we can simplify a lot of the code here and just always assume we're using optax stuff
I am not 100% sure, but it can be equivalent to optax-sgd with momentum=beta, learning_rate=alpha, and nesterov=True. The amount of code dedicated to that is not much, and it also handles future implementations (I don't know if anyone wants to add their own sgd optimizers, but anyway). If people want, I can add a deprecation, but it doesn't seem urgent to me.
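If someone wants to verify the equivalence, a hedged sketch of a one-step comparison on the optax side (the in-house `_sgd(g, v, alpha, beta)` step would be computed separately and compared against `x_optax`):

```python
import jax.numpy as jnp
import optax

alpha, beta = 1e-2, 0.9
solver = optax.sgd(learning_rate=alpha, momentum=beta, nesterov=True)

x = jnp.array([1.0, 2.0, 3.0])
g = jnp.array([0.1, -0.2, 0.3])

state = solver.init(x)
updates, state = solver.update(g, state, x)
x_optax = optax.apply_updates(x, updates)
# compare x_optax to the step produced by _sgd(g, v, alpha, beta) for the same
# (alpha, beta) and zero initial momentum to confirm (or refute) the equivalence
```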
A simple helper test to check:

```python
@pytest.mark.unit
def test_available_optax_optimizers(self):
    """Test that all optax optimizers are included in _all_optax_optimizers."""
    optimizers = []
    # Optax doesn't have a specific module for optimizers, and there is no specific
    # base class for optimizers, so we have to manually exclude some outliers. The
    # class optax.GradientTransformationExtraArgs is the closest thing, but there
    # are some other classes that inherit from it that are not optimizers. Since
    # the optimizers are actually a function that returns an instance of
    # optax.GradientTransformationExtraArgs, we call every public callable in optax
    # and keep the ones whose return value is of that type.
    names_to_exclude = [
        "GradientTransformationExtraArgs",
        "freeze",
        "scale_by_backtracking_linesearch",
        "scale_by_polyak",
        "scale_by_zoom_linesearch",
        "optimistic_adam",  # deprecated
    ]
    for name, obj in inspect.getmembers(optax):
        if name.startswith("_"):
            continue
        if callable(obj):
            try:
                sig = inspect.signature(obj)
                # fill every required argument with a dummy value
                ins = {
                    p.name: 0.1
                    for p in sig.parameters.values()
                    if p.default is inspect.Parameter.empty
                }
                if name == "noisy_sgd":
                    ins["key"] = 0
                out = obj(**ins)
                if isinstance(out, optax.GradientTransformationExtraArgs):
                    if name not in names_to_exclude:
                        optimizers.append(name)
            except Exception:
                print(f"Could not instantiate: {name}")
    msg = (
        "Wrapped optax optimizers can be out of date. If the newly added callable "
        "is not an optimizer, add it to the names_to_exclude list in this test."
    )
    print(optimizers)
    assert len(set(optimizers)) == len(_all_optax_optimizers), msg
    assert sorted(set(optimizers)) == sorted(_all_optax_optimizers), msg
    assert len(set(_all_optax_optimizers)) == len(_all_optax_optimizers), msg
```
We are in favor of removing the sgd code and having `sgd` be an alias to `optax-sgd`.
```python
)
deprecated_sgd = False
if method == "sgd":
    # warn the user but do not fail the pytest
```
why is this here? shouldn't this be in the test?
There are multiple tests and some are parameterized; this seemed easier.
But doesn't this mean the warning isn't actually emitted?
It is not treated as a warning, but the message is still printed as a warning. So this is equivalent to something like `print("DeprecationWarning: SGD is deprecated.")` with proper syntax for the warnings. It would be more annoying to add conditionals to parameterized tests; I would rather not do that.
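For reference, a hedged sketch of emitting a real `DeprecationWarning` while keeping parameterized tests green via a `filterwarnings` mark instead of per-case conditionals; `run_optimizer` and the test below are illustrative stand-ins, not the DESC test suite:

```python
import warnings

import pytest


def run_optimizer(method):
    """Stand-in for the real optimizer call; only the warning behaviour matters here."""
    if method == "sgd":
        warnings.warn("'sgd' is deprecated, use 'optax-sgd'.", DeprecationWarning)
    return method


# the mark silences only the expected DeprecationWarning, so users still see it,
# while the parameterized test passes even if warnings are otherwise turned into errors
@pytest.mark.filterwarnings("ignore:'sgd' is deprecated:DeprecationWarning")
@pytest.mark.parametrize("method", ["sgd", "optax-sgd"])
def test_optimizer_warning_handling(method):
    assert run_optimizer(method) == method
```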

- `x_scale` is now used with SGD methods too
- Wrapped `optax` optimizers, and they can be called by `optax-name`
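A hedged usage sketch of the new naming convention, assuming the wrapped optimizers are selected through `desc.optimize.Optimizer` by an `optax-` prefixed string (exact names and signatures may differ):

```python
from desc.optimize import Optimizer

# first-order optax optimizers are addressed by the "optax-" prefix
opt_adam = Optimizer("optax-adam")
opt_sgd = Optimizer("optax-sgd")

# the original "sgd" string still refers to the Nesterov-momentum SGD discussed above
opt_legacy = Optimizer("sgd")
```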