Add tutorial: qlearning with/without action masking for Taxi v3 env #1345

Open: wants to merge 2 commits into base: main

dantp-ai
Contributor

Description

Adds a new tutorial demonstrating the difference in learning with and without action masking in the Taxi v3 environment using Q-learning.

Fixes #28 (Task: "How to use the action sample masking, with example from Taxi")

Type of change

  • Add new tutorial

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings

@dantp-ai mentioned this pull request Mar 29, 2025
12 tasks
@pseudo-rnd-thoughts
Member

Thanks for the tutorial, @dantp-ai. Could you update your branch with the project's main? The Test Gymnasium Tutorial CI was broken and is now fixed.

@dantp-ai
Contributor Author

The Test Gymnasium Tutorial job failed.

I don't understand the output from this job:

Run if [ "" -eq 0 ]; then
/home/runner/work/_temp/618a3580-acb8-4abc-aec3-99c21ed1f93a.sh: line 1: [: : integer expression expected
Notice: All  tutorials passed.
/home/runner/work/_temp/618a3580-acb8-4abc-aec3-99c21ed1f93a.sh: line 3: [: : integer expression expected
/home/runner/work/_temp/618a3580-acb8-4abc-aec3-99c21ed1f93a.sh: line 11: [: : integer expression expected
Error: Process completed with exit code 2.

This seems to be related to line 11 of run-tutorial.yml, but I don't understand how that can fail.

@dantp-ai
Contributor Author

P.S.: I merged the latest main into my branch.

Member

@pseudo-rnd-thoughts left a comment

Thanks @dantp-ai for updating. There was a bug in the testing that should be fixed now if you merge main again.

On the tutorial, the training looks good, but I would love to see a larger discussion about what an action mask is (how to get the mask data and how to use it with space.sample) and why it can improve performance.
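As a rough illustration of what such a discussion could cover: in Gymnasium's Taxi environment the mask is reported under `info["action_mask"]`, with 1 for actions that would change the state and 0 for those that would not, and it can be passed to `env.action_space.sample(...)`. The sketch below only mimics that sampling behaviour with the standard library, so it runs without Gymnasium installed. `sample_masked` and the example mask are hypothetical names and values chosen for illustration, not part of the library or of this PR.

```python
import random


def sample_masked(mask, rng=random):
    """Pick uniformly among the actions whose mask entry is 1."""
    valid = [action for action, allowed in enumerate(mask) if allowed]
    if not valid:
        # Design choice of this sketch: refuse to sample from an empty mask.
        raise ValueError("mask allows no actions")
    return rng.choice(valid)


# Hypothetical mask over the six Taxi actions
# (south, north, east, west, pickup, dropoff): only south and east are allowed.
example_mask = [1, 0, 1, 0, 0, 0]

rng = random.Random(0)
samples = {sample_masked(example_mask, rng) for _ in range(100)}
print(sorted(samples))  # only indices of allowed actions can appear here
```

Sampling without a mask would spend a share of the exploration steps on actions that leave the state unchanged, which is one plausible reason masking can speed up learning.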

Could you center the tutorial on what action masking is, how to use it, and why it can improve performance, with helpful code examples for each, before moving into the training section? In reality, there are only two lines of difference from "normal" training, if that makes sense.
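The small difference from "normal" training could look roughly like the following sketch, where the mask restricts which actions are considered during epsilon-greedy selection and when taking the maximum over next-state values in the Q-learning target. This is an assumption about which lines are meant, not the tutorial's actual code. `select_action`, `td_target`, and all parameter names are hypothetical, and Q-table rows are plain lists to keep the example free of dependencies.

```python
import random


def select_action(q_row, mask, epsilon, rng):
    """Epsilon-greedy choice restricted to the actions allowed by the mask."""
    valid = [action for action, allowed in enumerate(mask) if allowed]
    if rng.random() < epsilon:
        return rng.choice(valid)  # explore among valid actions only
    return max(valid, key=lambda action: q_row[action])  # exploit among valid actions only


def td_target(reward, next_q_row, next_mask, gamma, terminated):
    """Q-learning target with the max taken over valid next actions."""
    if terminated:
        return reward
    valid = [action for action, allowed in enumerate(next_mask) if allowed]
    return reward + gamma * max(next_q_row[action] for action in valid)


q_row = [0.5, 9.0, 1.0, 0.2]  # action 1 has the highest value ...
mask = [1, 0, 1, 1]           # ... but the mask rules it out

rng = random.Random(0)
greedy = select_action(q_row, mask, epsilon=0.0, rng=rng)
target = td_target(-1.0, q_row, mask, gamma=0.9, terminated=False)
print(greedy, target)
```

Passing a mask of all ones recovers the unmasked behaviour, which is one way to compare the two variants through the same code path.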

Finally, I would remove the run_experiment function and just merge the functionality into the other function.

If you make those changes, then I think we should be good to merge.

Development

Successfully merging this pull request may close these issues.

[Proposal] Tutorials
2 participants