Description
Proposal
Create a more basic tutorial on using (Async)VectorEnvs and why you should learn them.
Focus on:
- How AsyncVectorEnvs can improve learning speed and performance by using more CPU cores, and by allowing training on multiple, differently parametrized environments
- How, in principle, writing a proper Env class is enough to make it suitable for parallelization
- Pointing to the API reference of the AsyncVector classes (and perhaps vice versa, linking to the tutorial from the class API)
- How functions such as step return vectors instead of scalars
- How the main (training) loop changes compared with the single-Env tutorials, because Envs now autoreset (and you probably count steps rather than episodes)
- How the statistics wrapper class can still be used for plotting
I would say that taking the already excellent blackjack_agent tutorial and rewriting it to use AsyncVectorEnvs would be a great way of showing what needs to be done.
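To make the loop-structure points above concrete, here is a minimal pure-Python sketch of what changes when moving from one Env to a vector of Envs. It deliberately does not use Gymnasium: `ToyCounterEnv`, `ToyVectorEnv` and `run` are hypothetical stand-ins invented for illustration, with a simplified `step` return, and they only mimic three ideas from the list (batched return values, autoreset, and a loop that counts steps instead of episodes). The real AsyncVectorEnv API and its autoreset details may differ.

```python
import random


class ToyCounterEnv:
    """Hypothetical single env: an episode terminates after `episode_length` steps."""

    def __init__(self, episode_length):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the observation is simply the step counter

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        terminated = self.t >= self.episode_length
        return self.t, reward, terminated


class ToyVectorEnv:
    """Hypothetical stand-in for a vector env (not Gymnasium's implementation)."""

    def __init__(self, envs):
        self.envs = envs
        self.num_envs = len(envs)

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        observations, rewards, terminations = [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, terminated = env.step(action)
            if terminated:
                # Autoreset: the finished sub-env restarts on its own, so the
                # training loop never calls reset() between episodes.
                obs = env.reset()
            observations.append(obs)
            rewards.append(reward)
            terminations.append(terminated)
        # One entry per sub-env instead of a single scalar.
        return observations, rewards, terminations


def run(total_steps, seed=0):
    rng = random.Random(seed)
    # Differently parametrized sub-envs: episode lengths 2, 3 and 5.
    envs = ToyVectorEnv([ToyCounterEnv(n) for n in (2, 3, 5)])
    observations = envs.reset()
    finished_episodes = 0
    total_reward = 0.0
    # The loop is driven by a step budget, not by an episode count.
    for _ in range(total_steps):
        actions = [rng.randint(0, 1) for _ in range(envs.num_envs)]
        observations, rewards, terminations = envs.step(actions)
        total_reward += sum(rewards)
        finished_episodes += sum(terminations)
    return finished_episodes, total_reward, observations
```

Calling `run(30)` steps all three sub-envs 30 times each. Episodes finish at different moments in different sub-envs, which is why the loop tracks finished episodes from the termination flags rather than iterating over episodes.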
Motivation
I am new to this project and I found the BlackJack tutorial very handy for getting up to speed with the basic concepts. I also found the 'Making your own environment' tutorial very readable.
However, I find the learning curve for speeding up training through parallel execution quite steep. I did eventually find the 'Training A2C with vector envs and domain randomization' tutorial, but I had glossed over it many times because I assumed it was about A2C (which I didn't need at the time).
Moreover, the A2C tutorial covers two (complex) concepts at the same time: the A2C agent and the use of VectorEnvs. I think this makes it too steep for beginners who want to learn the basics of VectorEnvs.
Hence a separate tutorial in the 'Gymnasium Basics' section, titled 'Parallel training of Environments', would address two problems:
- People might miss the important concepts of VectorEnvs because they think the tutorial is about A2C
- People find the A2C tutorial too confusing for learning VectorEnvs because it is intertwined with another complex concept
Pitch
An additional tutorial that focuses solely on the concepts of vectorization (especially its effect on performance) and explains the main differences between the single-Env and vector cases.
Alternatives
No response
Additional context
I would like to help write this tutorial if that is appreciated, although I will need to learn a bit more about the project before I can start.
Does taking the single-Env blackjack Q-learning tutorial and modifying it to use VectorEnvs sound like a good idea, or would another setting be better?
Checklist
- I have checked that there is no similar issue in the repo