Action mismatch when using the pretrained model #41

@qxforever

Thank you for your work!

When I tested with the pretrained weights from the v1.0.2 release, I ran into a tensor dimension error:

  File ".../pymahjong-test.py", line 20, in <module>
    obs, reward, done, _ = env.step(a)  # reward is zero unless the game is over (done = True).
                           ^^^^^^^^^^^
  File ".../pymahjong/env_pymahjong.py", line 431, in _proceed_until_agent_turn
    action = self.opponent_agent.select(obs, action_mask=action_mask, greedy=True)
  File ".../pymahjong/models.py", line 171, in select
    action_mask.astype(np.float32).reshape([1, self.action_size]))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: cannot reshape array of size 54 into shape (1,47)
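
The final `ValueError` can be reproduced outside pymahjong with NumPy alone, which suggests the failure is purely a size disagreement between the mask and the model rather than anything specific to the game state. The sketch below is only an illustration: `reshape_mask`, `ENV_ACTION_DIM`, and `MODEL_ACTION_SIZE` are hypothetical names, and the two sizes are taken from the traceback above.

```python
import numpy as np

# Sizes taken from the traceback; the constant names are hypothetical.
ENV_ACTION_DIM = 54     # length of the action mask handed to the agent
MODEL_ACTION_SIZE = 47  # self.action_size of the loaded pretrained model


def reshape_mask(action_mask, action_size):
    # Same call pattern as the failing line in models.py.
    return action_mask.astype(np.float32).reshape([1, action_size])


# A dummy mask with every action marked legal; only its length matters here.
mask = np.ones(ENV_ACTION_DIM, dtype=bool)

try:
    reshape_mask(mask, MODEL_ACTION_SIZE)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Reshaping the same mask with `ENV_ACTION_DIM` as the target size succeeds, so the error goes away as soon as both sides agree on the number of actions.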

Looking at the code, I found that `ACTION_DIM` in pymahjong is defined as 54, which does not seem to match the 47 actions described in the documentation.

class MahjongEnv(gym.Env):

    PLAYER_OBS_DIM = 93
    ORACLE_OBS_DIM = 18
    ACTION_DIM = 54
    MAHJONG_TILE_TYPES = 34
    INIT_POINTS = 25000
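
Until the intended action count is confirmed, a small check run before stepping the environment could report the disagreement up front instead of failing inside `reshape`. This is a minimal sketch under assumptions: `describe_action_mismatch` is a hypothetical helper, not part of pymahjong, and it takes no position on which of the two sizes is the correct one.

```python
def describe_action_mismatch(env_action_dim, model_action_size):
    """Return a diagnostic string, or None when the two sizes agree.

    Hypothetical helper for illustration; not part of pymahjong.
    """
    if env_action_dim == model_action_size:
        return None
    return (
        f"environment ACTION_DIM={env_action_dim} but "
        f"model action_size={model_action_size} "
        f"(difference of {abs(env_action_dim - model_action_size)})"
    )


# Values reported in this issue: 54 from MahjongEnv, 47 from the pretrained model.
message = describe_action_mismatch(54, 47)
if message is not None:
    print("Action size mismatch:", message)
```

In practice the two arguments would come from `MahjongEnv.ACTION_DIM` and the agent's `action_size` attribute, both of which appear in the snippets above.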
