Action mismatch when using the pretrained model #41

Thanks for your work!

While testing with the pretrained weights from the v1.0.2 release, I ran into a tensor dimension error:

```
Traceback (most recent call last):
  File ".../pymahjong-test.py", line 20, in <module>
    obs, reward, done, _ = env.step(a)  # reward is zero unless the game is over (done = True).
                           ^^^^^^^^^^^
  File ".../pymahjong/env_pymahjong.py", line 431, in _proceed_until_agent_turn
    action = self.opponent_agent.select(obs, action_mask=action_mask, greedy=True)
  File ".../pymahjong/models.py", line 171, in select
    action_mask.astype(np.float32).reshape([1, self.action_size]))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: cannot reshape array of size 54 into shape (1,47)
```
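
The failing reshape can be reproduced with plain NumPy, without pymahjong. This is only a minimal sketch: the two sizes are taken from the traceback, and the constant names `ENV_ACTION_DIM` and `MODEL_ACTION_SIZE` are made up for illustration.

```python
import numpy as np

# Sizes taken from the traceback; the constant names are hypothetical.
ENV_ACTION_DIM = 54      # length of the action mask produced by the environment
MODEL_ACTION_SIZE = 47   # action_size the pretrained model reshapes to

# Stand-in for the mask the environment passes to select().
action_mask = np.ones(ENV_ACTION_DIM, dtype=bool)

try:
    # Same expression as in models.py, line 171 of the traceback.
    action_mask.astype(np.float32).reshape([1, MODEL_ACTION_SIZE])
    reshape_error = None
except ValueError as exc:
    reshape_error = str(exc)

print(reshape_error)
```

So the mask coming from the environment has 54 entries while the loaded model expects 47, independent of the observation or the weights themselves.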

Looking at the code, `ACTION_DIM` in pymahjong is defined as 54, which does not seem to match the 47 actions described in the documentation.

```python
class MahjongEnv(gym.Env):

    PLAYER_OBS_DIM = 93
    ORACLE_OBS_DIM = 18
    ACTION_DIM = 54
    MAHJONG_TILE_TYPES = 34
    INIT_POINTS = 25000
```
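
Until the cause is clear, a check along these lines would at least fail earlier and with a clearer message. `check_action_dims` is a hypothetical helper, not part of pymahjong, and I am assuming the two sizes can be read from the environment class and from the loaded model (`action_size` appears in the traceback).

```python
def check_action_dims(env_action_dim: int, model_action_size: int) -> None:
    """Raise a descriptive error if the environment and model disagree.

    Hypothetical helper for illustration; not part of pymahjong.
    """
    if env_action_dim != model_action_size:
        raise ValueError(
            f"Action size mismatch: environment mask has {env_action_dim} "
            f"entries, but the model expects {model_action_size}."
        )


# Values from this issue: ACTION_DIM = 54 in the env, 47 in the model.
try:
    check_action_dims(54, 47)
    mismatch_message = None
except ValueError as exc:
    mismatch_message = str(exc)

print(mismatch_message)
```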
