Add tutorial: qlearning with/without action masking for Taxi v3 env #1345

dantp-ai · 2025-03-29T17:30:00Z

Description

Adds a new tutorial demonstrating the difference between learning with and without action masking in the Taxi v3 environment, using Q-learning.

Fixes #28 (Task: "How to use the action sample masking, with example from Taxi")

Type of change

Add new tutorial

Checklist:

I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings


pseudo-rnd-thoughts · 2025-03-31T16:11:41Z

Thanks for the tutorial, @dantp-ai. Could you merge the latest main into your branch? The Test Gymnasium Tutorial CI was broken and is now fixed.

dantp-ai · 2025-03-31T20:16:29Z

The Test Gymnasium Tutorial job failed.

I don't understand the output from this job:

Run if [ "" -eq 0 ]; then
/home/runner/work/_temp/618a3580-acb8-4abc-aec3-99c21ed1f93a.sh: line 1: [: : integer expression expected
Notice: All  tutorials passed.
/home/runner/work/_temp/618a3580-acb8-4abc-aec3-99c21ed1f93a.sh: line 3: [: : integer expression expected
/home/runner/work/_temp/618a3580-acb8-4abc-aec3-99c21ed1f93a.sh: line 11: [: : integer expression expected
Error: Process completed with exit code 2.

This seems to be related to line 11 of run-tutorial.yml, but I don't understand how that can fail.

dantp-ai · 2025-03-31T20:17:09Z

P.S.: I merged latest main into my branch.

pseudo-rnd-thoughts

Thanks for updating, @dantp-ai. There was a bug in the testing; it should be fixed now if you merge main again.

On the tutorial, the training looks good, but I would love to see a larger discussion of what an action mask is (how to get the mask and how to pass it to space.sample) and why it can improve performance.

Could you center the tutorial on what action masking is, how to use it, and why it can improve performance, with helpful code examples for each, before moving into the training section? In reality, there are only two lines of difference from "normal" training, if that makes sense.
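To illustrate how small that difference is, here is a hedged sketch of epsilon-greedy selection and the Q-learning target with an optional mask. The helper names (`valid_actions`, `select_action`, `td_target`) are hypothetical and not from the tutorial. Masking the bootstrap target as well as the action choice is an assumption of this sketch; the tutorial may only mask action selection.

```python
import numpy as np


def valid_actions(n_actions, mask=None):
    """Indices of allowed actions; every action is allowed when no mask is given."""
    return np.arange(n_actions) if mask is None else np.flatnonzero(mask)


def select_action(q_row, epsilon, rng, mask=None):
    """Epsilon-greedy choice restricted to the allowed actions."""
    valid = valid_actions(len(q_row), mask)
    if rng.random() < epsilon:
        return int(rng.choice(valid))  # explore among allowed actions only
    return int(valid[np.argmax(q_row[valid])])  # exploit among allowed actions only


def td_target(reward, next_q_row, gamma, next_mask=None, terminated=False):
    """Q-learning target; the max runs over the actions allowed in the next state."""
    if terminated:
        return float(reward)
    valid = valid_actions(len(next_q_row), next_mask)
    return float(reward + gamma * np.max(next_q_row[valid]))


rng = np.random.default_rng(0)
q_row = np.array([5.0, 1.0, 2.0, 0.0, 0.0, 0.0])
mask = np.array([0, 1, 1, 0, 0, 0], dtype=np.int8)

greedy_unmasked = select_action(q_row, 0.0, rng)        # picks action 0
greedy_masked = select_action(q_row, 0.0, rng, mask)    # action 0 is masked out, picks 2
explored = [select_action(q_row, 1.0, rng, mask) for _ in range(20)]
```

With `mask=None` both helpers reduce to standard Q-learning, so a single training function with an on/off flag can cover both runs.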

Finally, I would remove the run_experiment function and just merge the functionality into the other function.

If you make those changes, then I think we should be good to merge.

Add tutorial for demonstrating qlearning with/without action masking for Taxi v3 env

e25de19

dantp-ai mentioned this pull request Mar 29, 2025

[Proposal] Tutorials #28

Open

12 tasks

Merge branch 'main' into feat/add-tutorial-action-masking

8cb1ec0

pseudo-rnd-thoughts mentioned this pull request Apr 2, 2025

Fix run-tutorial.yml in edge case #1351

Merged

pseudo-rnd-thoughts requested changes Apr 2, 2025
