There is nothing better than beating the casino, especially in Blackjack, where you literally play against the dealer. So we decided to train a Q-learning agent that can make the dealer feel uneasy.
Before installing the project, make sure that Poetry is already installed for dependency management.
Also make sure that Poetry creates the local virtual environment .venv in your project's folder.
poetry config virtualenvs.in-project true
After configuring Poetry, clone the repository and run the installation.
git clone [email protected]:Makkarik/dice-blackjack-mdp.git
cd dice-blackjack-mdp
poetry install
Dice Blackjack is a game that uses a pair of dice instead of cards as the source of randomness. To get familiar with the game rules, you may see the original source at this link.
The Dice Blackjack has a state vector
where:
The action space consists of 6 actions:
0 - hit the first die (H1); 1 - hit the second die (H2); 2 - hit the sum (HΣ);
3 - stack the first die (S1); 4 - stack the second die (S2); 5 - stack the sum (SΣ).
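As a small illustration, the six action indices above can be written as an enumeration. The class name, member names, and the `is_hit` helper are hypothetical and chosen only for this sketch; only the indices and the H1/H2/HΣ/S1/S2/SΣ labels come from the list above.

```python
from enum import IntEnum


class Action(IntEnum):
    """Action indices as listed above; the member names are illustrative."""

    HIT_FIRST = 0     # H1
    HIT_SECOND = 1    # H2
    HIT_SUM = 2       # HΣ
    STACK_FIRST = 3   # S1
    STACK_SECOND = 4  # S2
    STACK_SUM = 5     # SΣ


def is_hit(action: Action) -> bool:
    """Return True for the three hit actions (indices 0 to 2)."""
    return action < Action.STACK_FIRST
```

Using an `IntEnum` keeps the values usable as plain integer indices into a Q-table row while giving them readable names.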
The game ends with one of four possible rewards:
-1 - the player got busted (scored more than 21 points) or scored fewer points than the dealer; 0 - the game ended in a tie;
1 - the player beat the dealer or the dealer got busted; 2 - the player rolled a Blackjack combination (two doubles in the first two rolls).
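The reward scheme above can be sketched as a single function. This is not taken from the repository: the function name, its arguments, and in particular the order in which the conditions are checked (player bust first, then Blackjack, then dealer bust) are assumptions made for illustration.

```python
def final_reward(player: int, dealer: int, blackjack: bool = False) -> int:
    """Map a finished game to one of the four rewards listed above.

    Assumption: the checks run in this order; the real environment
    may resolve overlapping cases differently.
    """
    if player > 21:
        return -1  # player busted
    if blackjack:
        return 2   # Blackjack combination
    if dealer > 21 or player > dealer:
        return 1   # dealer busted or player scored more
    if player == dealer:
        return 0   # tie
    return -1      # player scored fewer points than the dealer
```

For example, `final_reward(20, 18)` gives 1 and `final_reward(18, 18)` gives 0.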
You may play the game by launching the environment file /src/env.py directly.
You may reproduce the results by launching the Training.ipynb notebook.
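For orientation, the standard tabular Q-learning update with epsilon-greedy exploration can be sketched as follows. This is a generic textbook version, not the code from the notebook: the function names, the default `alpha` and `gamma`, and the use of a hashable state (for example, a tuple) as the table key are all assumptions.

```python
import random
from collections import defaultdict

N_ACTIONS = 6  # size of the action space described above


def make_q_table():
    """Q-table that lazily creates a zero row for every new state."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def epsilon_greedy(q, state, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = q[state]
    return max(range(N_ACTIONS), key=values.__getitem__)


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=1.0):
    """Apply one Q-learning step: Q(s,a) += alpha * (target - Q(s,a))."""
    # Terminal transitions have no bootstrap term.
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

A training loop would call `epsilon_greedy` to choose an action, step the environment, and pass the resulting transition to `q_update`.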
We trained the agent for 500,000 episodes with non-linear
It is noticeable that the learned policy is strong enough to earn a positive average reward from the game (greater than 0). To make the policy human-readable, we converted the Q-table into a Dice Blackjack cheatsheet covering all possible cases.
The cells colored in gray designate states that either have no available actions or have not been encountered yet (heatmap for score of 5).
You may use the cheatsheet and play the game manually to check whether the policy is good enough.
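A cheatsheet of this kind boils down to taking the best-valued action in every Q-table row. The sketch below shows that step under two assumptions that are not from the repository: the Q-table is a mapping from state to a list of six values, and a row that is still all zeros is treated as "not encountered" (the gray cells). The function name is hypothetical.

```python
ACTION_NAMES = ["H1", "H2", "HΣ", "S1", "S2", "SΣ"]


def greedy_policy(q_table, action_names=ACTION_NAMES):
    """Return {state: best action label}, or None for untouched rows."""
    policy = {}
    for state, values in q_table.items():
        if all(v == 0.0 for v in values):
            # Assumption: an all-zero row was never updated during training.
            policy[state] = None
        else:
            best = max(range(len(values)), key=values.__getitem__)
            policy[state] = action_names[best]
    return policy
```

States mapped to `None` correspond to cells that a cheatsheet would leave gray.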
The repository is equipped with pre-commit hooks for automatic code linting. All code style requirements are listed in the [tool.ruff.lint] section of the pyproject.toml file.
For a better experience, use the VS Code IDE with the Ruff extension installed.


