Bugs encountered while training the agent #1

@Beliefuture

@Bensk1 Hi, I have run into the following two problems while training the agent.

  1. The storage consumption of the selected index configuration marginally exceeded the budget, and the training process was terminated.
    I am confused about why this happens: according to your paper, choosing an index that would violate the constraint is treated as an invalid action.
  File "../swirl/stable_baselines/ppo2/ppo2.py", line 520, in _run
    if self.callback.on_step() is False:
  File "../swirl/stable_baselines/common/callbacks.py", line 94, in on_step
    return self._on_step()
  File "../swirl/stable_baselines/common/callbacks.py", line 170, in _on_step
    continue_training = callback.on_step() and continue_training
  File "***/swirl/stable_baselines/common/callbacks.py", line 94, in on_step
    return self._on_step()
  File "***/swirl/stable_baselines/common/callbacks.py", line 539, in _on_step
    return_episode_rewards=True)
  File "../swirl/stable_baselines/common/evaluation.py", line 41, in evaluate_policy
    obs, reward, done, _info = env.step(action)
  File "../swirl/stable_baselines/common/vec_env/base_vec_env.py", line 150, in step
    return self.step_wait()
  File "../swirl/stable_baselines/common/vec_env/vec_normalize.py", line 91, in step_wait
    obs, rews, news, infos = self.venv.step_wait()
  File "../swirl/stable_baselines/common/vec_env/dummy_vec_env.py", line 44, in step_wait
    self.envs[env_idx].step(self.actions[env_idx])
  File "***/lib/python3.7/site-packages/gym/wrappers/order_enforcing.py", line 13, in step
    observation, reward, done, info = self.env.step(action)
  File "***/swirl/gym_db/envs/db_env_v1.py", line 99, in step
    init=False, new_index=new_index, old_index_size=old_index_size
  File "***/gym_db/envs/db_env_v1.py", line 204, in _update_return_env_state
    "Storage consumption exceeds budget: "
AssertionError: Storage consumption exceeds budget: 500.08883199999997  > 500
  2. An invalid action was still chosen, and the training process was terminated.
    Specifically, action 0 was chosen although it was invalid. I checked the mask vector, and the corresponding entry was 0.
--------------------------------------
| approxkl           | nan           |
| clipfrac           | 0.1796875     |
| explained_variance | -2.03         |
| fps                | 0             |
| n_updates          | 250           |
| policy_entropy     | 0.14278165    |
| policy_loss        | nan           |
| serial_timesteps   | 16000         |
| time_elapsed       | 1.51e+03      |
| total_timesteps    | 16000         |
| value_loss         | 0.00048095174 |
--------------------------------------

Traceback (most recent call last):
  File "main.py", line 141, in <module>
    tb_log_name=experiment.id)  # the name of the run for tensorboard log
  File "../swirl/stable_baselines/ppo2/ppo2.py", line 342, in learn
    rollout = self.runner.run(callback)
  File "../swirl/stable_baselines/common/runners.py", line 59, in run
    return self._run()
  File "../swirl/stable_baselines/ppo2/ppo2.py", line 497, in _run
    self.obs[:], rewards, self.dones, infos = self.env.step(clipped_actions)
  File "../swirl/stable_baselines/common/vec_env/base_vec_env.py", line 150, in step
    return self.step_wait()
  File "../swirl/stable_baselines/common/vec_env/vec_normalize.py", line 91, in step_wait
    obs, rews, news, infos = self.venv.step_wait()
  File "../swirl/stable_baselines/common/vec_env/dummy_vec_env.py", line 44, in step_wait
    self.envs[env_idx].step(self.actions[env_idx])
  File "***/lib/python3.7/site-packages/gym/wrappers/order_enforcing.py", line 13, in step
    observation, reward, done, info = self.env.step(action)
  File "***/swirl/gym_db/envs/db_env_v1.py", line 79, in step
    self._step_asserts(action)
  File "***/swirl/gym_db/envs/db_env_v1.py", line 67, in _step_asserts
    ), f"Agent has chosen invalid action: {action}"
AssertionError: Agent has chosen invalid action: 0
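
To make the expectation behind the first problem concrete, here is a minimal sketch of how I understand the budget constraint should interact with action masking. It is not the code in `db_env_v1.py`; the names `valid_action_mask`, `exceeds_budget`, and `tolerance_mb` are hypothetical and only serve as an illustration. One possible (unconfirmed) cause of the overshoot would be that the mask is built from estimated index sizes while the assertion compares the resulting consumption against the budget.

```python
def valid_action_mask(current_size_mb, candidate_sizes_mb, budget_mb):
    """Return 1 for every candidate index that still fits in the budget, else 0.

    Hypothetical helper: sizes are assumed to be estimates in MB.
    """
    return [
        1 if current_size_mb + size <= budget_mb else 0
        for size in candidate_sizes_mb
    ]


def exceeds_budget(consumption_mb, budget_mb, tolerance_mb=0.0):
    """Check the consumption against the budget with an optional tolerance.

    tolerance_mb is an assumption for illustration, not an existing option.
    """
    return consumption_mb > budget_mb + tolerance_mb


# With 499 MB already used and a 500 MB budget, only candidates of at most
# 1 MB remain valid according to this sketch.
mask = valid_action_mask(499.0, [0.5, 1.0, 1.09], 500.0)

# The value from the traceback exceeds the budget when checked strictly.
strict = exceeds_budget(500.08883199999997, 500)
```

If the mask worked like this and the sizes matched, I would expect the assertion in `_update_return_env_state` never to fire, which is why the error surprises me.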
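
Regarding the second problem, the training log above reports `approxkl` and `policy_loss` as `nan` right before the assertion. One possible explanation, which I have not confirmed, is that the policy outputs became non-finite, so that masking no longer has a well-defined effect. The sketch below illustrates the kind of guard I have in mind; `masked_probabilities` is a hypothetical helper written in plain Python and is not part of stable_baselines or this repository.

```python
import math


def masked_probabilities(logits, mask):
    """Softmax over the valid actions only; invalid actions get probability 0.

    Raises ValueError if any logit is NaN or infinite, or if no action is valid.
    """
    if any(not math.isfinite(value) for value in logits):
        raise ValueError("non-finite logits: the policy may have diverged")

    valid = [i for i, flag in enumerate(mask) if flag == 1]
    if not valid:
        raise ValueError("no valid action available")

    # Subtract the largest valid logit for numerical stability.
    peak = max(logits[i] for i in valid)
    weights = [
        math.exp(logit - peak) if flag == 1 else 0.0
        for logit, flag in zip(logits, mask)
    ]
    total = sum(weights)
    return [weight / total for weight in weights]


# Action 0 has the largest logit but is masked out, so it gets probability 0.
probs = masked_probabilities([5.0, 1.0, 1.0], [0, 1, 1])
```

With finite logits, a masked action can never be sampled under this scheme. With `nan` logits the check raises before sampling, which would point at the policy update rather than the environment as the source of the invalid action.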
