
Commit d48a2d4

Author: izhigal
Commit message: formatting applied
1 parent: 7fc56ed

143 files changed: +1554 / -1656 lines


README.md

Lines changed: 1 addition & 1 deletion
@@ -245,7 +245,7 @@ You can use the following interface to make an environment. You may optionally
 * `allow_step_back`: Default `False`. `True` if allowing the `step_back` function to traverse backward in the tree.
 * Game-specific configurations: These fields start with `game_`. Currently, we only support `game_num_players` in Blackjack.
 
-Once the environemnt is made, we can access some information of the game.
+Once the environment is made, we can access some information of the game.
 * **env.num_actions**: The number of actions.
 * **env.num_players**: The number of players.
 * **env.state_shape**: The shape of the state space of the observations.
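Taken together, a minimal sketch of this interface (the `blackjack` id and the config values here are illustrative choices, not requirements):

    import rlcard

    # Make an environment; game-specific fields start with `game_`.
    env = rlcard.make(
        'blackjack',
        config={
            'seed': 42,
            'allow_step_back': False,
            'game_num_players': 2,
        },
    )

    # Once the environment is made, we can access some information of the game.
    print(env.num_actions)  # the number of actions
    print(env.num_players)  # the number of players
    print(env.state_shape)  # the shape of the state space of the observations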

docs/games.md

Lines changed: 2 additions & 2 deletions
@@ -90,7 +90,7 @@ At each decision point of the game, the corresponding player will be able to obs
 | ------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
 | seen\_cards   | Three face-down cards distributed to the landlord after bidding. Then these cards will be made public to all players.                                          | TQA |
 | landlord      | An integer of the landlord's id                                                                                                                                | 0 |
-| self          | An integer of the current player's id                                                                                                                          | 2 |
+| cls           | An integer of the current player's id                                                                                                                          | 2 |
 | trace         | A list of tuples which records every action in one game. The first entry of the tuple is the player's id; the second is the corresponding player's action.    | \[(0, '8222'), (1, 'pass'), (2, 'pass'), (0, '6KKK'), (1, 'pass'), (2, 'pass'), (0, '8'), (1, 'Q')\] |
 | played\_cards | As the game progresses, the cards which have been played by the three players, sorted from low to high.                                                       | \['6', '8', '8', 'Q', 'K', 'K', 'K', '2', '2', '2'\] |
 | others\_hand  | The union of the other two players' current hands                                                                                                              | 333444555678899TTTJJJQQAA2R |
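For orientation, a short sketch of how these fields might be inspected at play time. This is a sketch under assumptions: `doudizhu` is the real environment id, but the `raw_obs` key layout is inferred from RLCard's usual state dict, not stated in this table:

    import rlcard

    env = rlcard.make('doudizhu')
    state, player_id = env.reset()

    # The raw observation is assumed to be a dict keyed by the names in the table above.
    raw = state['raw_obs']
    for key in ('seen_cards', 'landlord', 'trace', 'played_cards', 'others_hand'):
        print(key, '->', raw.get(key))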
@@ -134,7 +134,7 @@ If the landlord first gets rid of all the cards in his hand, he will win and rece
 ## Mahjong
 Mahjong is a tile-based game developed in China, and it has spread throughout the world since the 20th century. It is commonly played
 by 4 players. The game is played with a set of 136 tiles. In turn, players draw and discard tiles until
-The goal of the game is to complete the leagal hand using the 14th drawn tile to form 4 sets and a pair.
+The goal of the game is to complete the legal hand using the 14th drawn tile to form 4 sets and a pair.
 We revised the game into a simple version in which all of the winning sets are equal, and a player wins as long as she completes
 4 sets and a pair. Please refer to the details on [Wikipedia](https://en.wikipedia.org/wiki/Mahjong) or [Baike](https://baike.baidu.com/item/麻将/215).

docs/high-level-design.md

Lines changed: 1 addition & 1 deletion
@@ -25,4 +25,4 @@ Card games usually have similar structures. We abstract some concepts in card ga
 To summarize, in one `Game`, a `Dealer` deals the cards for each `Player`. In each `Round` of the game, a `Judger` makes major decisions about the next round and the payoffs at the end of the game.
 
 ## Agents
-We provide examples of several representative algorithms and wrap them as `Agent` to show how a learning algorithm can be connected to the toolkit. The first example is DQN, a representative of the Reinforcement Learning (RL) category. The second example is NFSP, a representative of RL with self-play. We also provide CFR (chance sampling) and DeepCFR, which belong to the Conterfactual Regret Minimization (CFR) category. Other algorithms from these three categories can be connected in similar ways.
+We provide examples of several representative algorithms and wrap them as `Agent` to show how a learning algorithm can be connected to the toolkit. The first example is DQN, a representative of the Reinforcement Learning (RL) category. The second example is NFSP, a representative of RL with self-play. We also provide CFR (chance sampling) and DeepCFR, which belong to the Counterfactual Regret Minimization (CFR) category. Other algorithms from these three categories can be connected in similar ways.
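As a concrete illustration of this wrapping, a minimal sketch of seating an `Agent` in an environment (`RandomAgent` stands in here for brevity; a DQN, NFSP, or CFR agent plugs into `set_agents` the same way, and the `leduc-holdem` id is an arbitrary choice):

    import rlcard
    from rlcard.agents import RandomAgent

    env = rlcard.make('leduc-holdem')

    # Anything exposing the Agent interface (step / eval_step) can be seated.
    agents = [RandomAgent(num_actions=env.num_actions) for _ in range(env.num_players)]
    env.set_agents(agents)

    trajectories, payoffs = env.run(is_training=False)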

docs/toy-examples.md

Lines changed: 1 addition & 1 deletion
@@ -339,7 +339,7 @@ def train(args):
     # Seed numpy, torch, random
     set_seed(args.seed)
 
-    # Initilize CFR Agent
+    # Initialize CFR Agent
     agent = CFRAgent(
         env,
         os.path.join(
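For context, a hedged sketch of the setup that surrounds this snippet (the environment id and result directory are assumptions, not the exact example script; CFR does need `allow_step_back=True` to traverse the game tree):

    import os

    import rlcard
    from rlcard.agents import CFRAgent
    from rlcard.utils import set_seed

    # CFR traverses the tree, so step_back must be enabled.
    env = rlcard.make('leduc-holdem', config={'seed': 0, 'allow_step_back': True})

    # Seed numpy, torch, random
    set_seed(0)

    # Initialize CFR Agent
    agent = CFRAgent(
        env,
        os.path.join('experiments/leduc_holdem_cfr_result', 'cfr_model'),
    )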

examples/evaluate.py

Lines changed: 7 additions & 10 deletions
@@ -1,19 +1,16 @@
-''' An example of evluating the trained models in RLCard
-'''
+"""An example of evaluating the trained models in RLCard"""
 import os
 import argparse
 
 import rlcard
-from rlcard.agents import (
-    DQNAgent,
-    RandomAgent,
-)
+
 from rlcard.utils import (
     get_device,
     set_seed,
     tournament,
 )
 
+
 def load_model(model_path, env=None, position=None, device=None):
     if os.path.isfile(model_path):  # Torch model
         import torch
@@ -29,14 +26,14 @@ def load_model(model_path, env=None, position=None, device=None):
     else:  # A model in the model zoo
         from rlcard import models
         agent = models.load(model_path).agents[position]
-
+
     return agent
 
-def evaluate(args):
 
+def evaluate(args):
     # Check whether gpu is available
     device = get_device()
-
+
     # Seed numpy, torch, random
     set_seed(args.seed)
 
@@ -54,6 +51,7 @@ def evaluate(args):
     for position, reward in enumerate(rewards):
         print(position, args.models[position], reward)
 
+
 if __name__ == '__main__':
     parser = argparse.ArgumentParser("Evaluation example in RLCard")
     parser.add_argument(
@@ -99,4 +97,3 @@ def evaluate(args):
 
     os.environ["CUDA_VISIBLE_DEVICES"] = args.cuda
     evaluate(args)
-
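Putting the pieces together, a rough sketch of what `evaluate(args)` reduces to, reusing the `load_model` helper shown above (the model-zoo ids and the game count are illustrative assumptions):

    import rlcard
    from rlcard.utils import get_device, set_seed, tournament

    device = get_device()
    set_seed(42)

    env = rlcard.make('leduc-holdem', config={'seed': 42})
    model_ids = ['leduc-holdem-cfr', 'leduc-holdem-rule-v1']
    env.set_agents([load_model(m, env, i, device) for i, m in enumerate(model_ids)])

    # tournament() plays the requested number of games and averages payoffs per seat.
    rewards = tournament(env, 10000)
    for position, reward in enumerate(rewards):
        print(position, model_ids[position], reward)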

examples/human/blackjack_human.py

Lines changed: 2 additions & 3 deletions
@@ -1,5 +1,4 @@
-''' A toy example of self playing for Blackjack
-'''
+"""A toy example of self playing for Blackjack"""
 
 import rlcard
 from rlcard.agents import RandomAgent as RandomAgent
@@ -23,7 +22,7 @@
 
 print(">> Blackjack human agent")
 
-while (True):
+while True:
     print(">> Start a new game")
 
     trajectories, payoffs = env.run(is_training=False)
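The human-play examples below all share this skeleton. A hedged reconstruction of the setup the hunks elide (the single-seat `RandomAgent` and the quit prompt are assumptions, not the exact file contents):

    import rlcard
    from rlcard.agents import RandomAgent

    env = rlcard.make('blackjack')
    env.set_agents([RandomAgent(num_actions=env.num_actions)])  # assumed single seat

    print(">> Blackjack human agent")

    while True:
        print(">> Start a new game")
        trajectories, payoffs = env.run(is_training=False)
        print(">> Payoff:", payoffs[0])
        if input(">> Press q to quit, any other key to continue: ") == 'q':
            break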

examples/human/gin_rummy_human.py

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
-'''
+"""
 Project: Gui Gin Rummy
 File name: gin_rummy_human.py
 Author: William Hale
 Date created: 3/14/2020
-'''
+"""
 
 # You need to install tkinter if it is not already installed.
 # Tkinter is Python's de facto standard GUI (Graphical User Interface) package.

examples/human/leduc_holdem_human.py

Lines changed: 2 additions & 3 deletions
@@ -1,5 +1,4 @@
-''' A toy example of playing against pretrianed AI on Leduc Hold'em
-'''
+"""A toy example of playing against pretrained AI on Leduc Hold'em"""
 
 import rlcard
 from rlcard import models
@@ -17,7 +16,7 @@
 
 print(">> Leduc Hold'em pre-trained model")
 
-while (True):
+while True:
     print(">> Start a new game")
 
     trajectories, payoffs = env.run(is_training=False)

examples/human/limit_holdem_human.py

Lines changed: 2 additions & 3 deletions
@@ -1,5 +1,4 @@
-''' A toy example of playing against a random agent on Limit Hold'em
-'''
+"""A toy example of playing against a random agent on Limit Hold'em"""
 
 import rlcard
 from rlcard.agents import LimitholdemHumanAgent as HumanAgent
@@ -17,7 +16,7 @@
 
 print(">> Limit Hold'em random agent")
 
-while (True):
+while True:
     print(">> Start a new game")
 
     trajectories, payoffs = env.run(is_training=False)

examples/human/nolimit_holdem_human.py

Lines changed: 2 additions & 3 deletions
@@ -1,5 +1,4 @@
-''' A toy example of playing against pretrianed AI on Leduc Hold'em
-'''
+"""A toy example of playing against pretrained AI on No-Limit Hold'em"""
 from rlcard.agents import RandomAgent
 
 import rlcard
@@ -17,7 +16,7 @@
 env.set_agents([human_agent, human_agent2])
 
 
-while (True):
+while True:
     print(">> Start a new game")
 
     trajectories, payoffs = env.run(is_training=False)
