Replies: 3 comments 4 replies
-
@amaarora I don't think it can be cleanly contained within an optimizer. It requires two forward passes with manipulation of the gradient in between to calc the perturbation. Since closures don't work with the grad scaler, that breaks the optimizer abstraction and requires a custom train loop. Additionally, there are some other questions I have regarding the grads when using DDP and grad clipping. In the new …
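For concreteness, here is a minimal sketch of the kind of custom train-loop step this implies with native AMP: two forward/backward passes, a normalised-gradient perturbation applied in between, and `GradScaler` driven without a closure. All of `model`, `loss_fn`, `base_optimizer` and `rho` are placeholders rather than timm API, and the first-pass overflow handling plus the DDP / grad-clipping questions above are deliberately left out.

```python
import torch

scaler = torch.cuda.amp.GradScaler()

def sam_train_step(model, images, targets, base_optimizer, loss_fn, rho=0.05):
    params = [p for p in model.parameters() if p.requires_grad]

    # 1st forward/backward: gradients at the current weights (scaled by GradScaler).
    with torch.cuda.amp.autocast():
        loss = loss_fn(model(images), targets)
    scaler.scale(loss).backward()

    # Perturbation e = rho * g / ||g||. The loss-scale factor cancels in the
    # normalisation, so the scaled grads can be used directly here
    # (inf/nan handling on this pass is omitted for brevity).
    grads = [p.grad for p in params if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.detach().norm(2) for g in grads]), 2)
    perturbs = []
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            e = p.grad * (rho / (grad_norm + 1e-12))
            p.add_(e)
            perturbs.append((p, e))
    base_optimizer.zero_grad(set_to_none=True)

    # 2nd forward/backward: gradients at the perturbed weights.
    with torch.cuda.amp.autocast():
        loss = loss_fn(model(images), targets)
    scaler.scale(loss).backward()

    # Restore the original weights, then update using the 2nd-pass gradients.
    with torch.no_grad():
        for p, e in perturbs:
            p.sub_(e)
    scaler.step(base_optimizer)  # unscales 2nd-pass grads, skips the step on inf/nan
    scaler.update()
    base_optimizer.zero_grad(set_to_none=True)
    return loss.detach()
```

This is exactly the shape that doesn't fit `optimizer.step(closure)` once `GradScaler` is involved, which is why it leaks into the train loop.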
-
@AlejandroRigau I've been watching related papers and implementations. Overall I'm not too happy with the state of most PyTorch impls in that they tend to ignore proper AMP usage completely (and I'm not going to add anything which will prevent use of AMP and also adds extra overhead itself). GSAM looks like a decent impl to tweak and add AMP support to (I think there is a PR for most of the support, but it needed improvement last I looked; that could be different now) ... https://github.com/juntang-zhuang/GSAM Also, @tmabraham has been looking at this a lot (and trying to convince me to add it); he was going to try some impl soon I think... he was discussing an alternative solution (MESA) with me recently as well.
-
@rwightman Any updates on this topic? I would be happy to work on a PR.
-
@rwightman As I am sure you are already aware - SAM has been at the center of recent papers in CV. ViT, MLPs and NFNets all seem to benefit (as do their BN counterparts).
There's an open-source implementation here - https://github.com/google-research/sam
If you do agree, I am happy to start working on a PR to get this optimizer into TIMM.
References:
Paper: https://arxiv.org/abs/2010.01412v3
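For quick reference, the update proposed in the paper above (p = 2 case, weight-decay term omitted): the inner maximisation of the min-max objective is approximated with one normalised gradient-ascent step, and the base optimizer then uses the gradient taken at that perturbed point - which is exactly the two-pass structure discussed earlier in the thread.

```latex
% Sharpness-aware objective
\min_{w}\; \max_{\|\epsilon\|_2 \le \rho} L_{\text{train}}(w + \epsilon)

% First-order approximation of the inner maximum
\hat{\epsilon}(w) = \rho \, \frac{\nabla_w L_{\text{train}}(w)}{\lVert \nabla_w L_{\text{train}}(w) \rVert_2}

% Gradient actually handed to the base optimizer
g_{\text{SAM}} \approx \nabla_w L_{\text{train}}(w)\big\rvert_{w + \hat{\epsilon}(w)}
```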