[Feature Request] Support Stochastic Weight Averaging (SWA) for improved stability #321

Open
@pchalasani

Description

🚀 Feature

Stochastic Weight Averaging (SWA) is a recently proposed technique that can potentially help improve training stability in deep reinforcement learning (DRL). There is now an implementation in torchcontrib. Quoting/paraphrasing from their page:

a simple procedure that improves generalization in deep learning over Stochastic Gradient Descent (SGD) at no additional cost, and can be used as a drop-in replacement for any other optimizer in PyTorch. SWA has a wide range of applications and features, [...] including [...] improving the stability of training as well as the final average rewards of policy-gradient methods in deep reinforcement learning.

See the PyTorch SWA page for more.
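For concreteness, here is a minimal sketch of how torchcontrib's SWA wrapper is used around an existing optimizer. The model, data, and the `swa_start`/`swa_freq`/`swa_lr` values are illustrative placeholders, not a proposal for this library's defaults:

```python
import torch
from torchcontrib.optim import SWA

# Hypothetical stand-ins for whatever network and data the
# training loop already uses.
model = torch.nn.Linear(8, 2)
loader = [(torch.randn(16, 8), torch.randint(0, 2, (16,)))
          for _ in range(20)]

base_opt = torch.optim.SGD(model.parameters(), lr=0.1)
# Wrap the base optimizer: averaging begins after 10 optimizer
# steps, a snapshot of the weights is averaged in every 5 steps,
# and the SWA learning rate is 0.05 (all values illustrative).
opt = SWA(base_opt, swa_start=10, swa_freq=5, swa_lr=0.05)

loss_fn = torch.nn.CrossEntropyLoss()
for epoch in range(5):
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

# Replace the model's weights with the SWA running average.
opt.swap_swa_sgd()
```

If the model contains batch-norm layers, torchcontrib also provides `opt.bn_update(loader, model)` to recompute the activation statistics under the averaged weights before evaluation.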

Motivation

SWA might improve training stability as well as final reward in some DRL scenarios. It may also reduce sensitivity to the choice of random seed.

Pitch

See above :)

Alternatives

No response

Additional context

See the PyTorch SWA page for more.

Checklist

  • I have checked that there is no similar issue in the repo
