Description
I'm not sure whether this is related to #58, but after a seemingly arbitrary number of steps (~60k) in a slightly modified version of the CatheterBeam example, I see this error:
```
[WARNING] [LCPConstraintSolver(LCPConstraintSolver)] No convergence in unbuilt nlcp gaussseidel function : error =0.000597259 after 1000 iterations
Process ForkServerProcess-3:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/rbasdeo/.local/lib/python3.8/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 34, in _worker
    observation = env.reset()
  File "/home/rbasdeo/.local/lib/python3.8/site-packages/gym/wrappers/time_limit.py", line 27, in reset
    return self.env.reset(**kwargs)
  File "/home/rbasdeo/.local/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 16, in reset
    return self.env.reset(**kwargs)
  File "/home/rbasdeo/sofaGym/sofaGym_catheter/sofagym/envs/CatheterBeamSingle/CatheterBeamSingleEnv.py", line 75, in reset
    super().reset()
  File "/home/rbasdeo/sofaGym/sofaGym_catheter/sofagym/AbstractEnv.py", line 380, in reset
    self.clean()
  File "/home/rbasdeo/sofaGym/sofaGym_catheter/sofagym/AbstractEnv.py", line 276, in clean
    clean_registry(self.past_actions)
  File "/home/rbasdeo/sofaGym/sofaGym_catheter/sofagym/rpc_server.py", line 539, in clean_registry
    actions_to_stateId.pop(str(instances[id]['history']))
KeyError: 'history'
```
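For what it's worth, the crash happens because `clean_registry` indexes `instances[id]['history']` unconditionally, so any registry entry without a `'history'` key raises `KeyError`. Below is a minimal defensive sketch of that idea — not sofagym's actual implementation (the real function takes `self.past_actions` and uses module-level state; the two-argument signature here is an assumption for illustration):

```python
def clean_registry(instances, actions_to_stateId):
    """Hypothetical defensive variant of rpc_server.clean_registry.

    Skips registry entries that were never populated with a 'history'
    key, and uses pop(..., None) so a missing mapping does not raise.
    """
    for entry in instances.values():
        history = entry.get("history")  # None if the entry has no history yet
        if history is None:
            continue
        # Remove the mapping if present; ignore it otherwise.
        actions_to_stateId.pop(str(history), None)
```

A guard like this would at least let `reset()` proceed, though it may only mask whatever leaves the entry without a `'history'` key in the first place.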
Stable-Baselines3 params:
```yaml
DQN:
  init_kwargs:
    policy: 'MlpPolicy'
    learning_rate: 0.0005
    buffer_size: 4000
    learning_starts: 1000
    batch_size: 32
    gamma: 1.0
    train_freq: (1, 'step')
    gradient_steps: 1
    replay_buffer_class: null
    replay_buffer_kwargs: null
    optimize_memory_usage: False
    target_update_interval: 500
    exploration_fraction: 0.1
    exploration_initial_eps: 1.0
    exploration_final_eps: 0.02
    max_grad_norm: 10
    policy_kwargs: "dict(net_arch=[128, 128], optimizer_class=th.optim.Adam)"
    device: "auto"
  fit_kwargs:
    total_timesteps: 3200000
  max_episode_steps: 300
  eval_freq: 10000
  n_eval_episodes: 10
  save_freq: 10000
  video_length: 500
```
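For reference, a hypothetical sketch of how the `init_kwargs` block above maps onto a Stable-Baselines3 `DQN` constructor call. The parameter names are SB3's; the loader code itself is an assumption for illustration, not sofagym's actual training script:

```python
# Hypothetical mapping of the YAML init_kwargs above into the keyword
# arguments Stable-Baselines3's DQN constructor accepts. Values are
# copied verbatim from the config; policy_kwargs is the evaluated form
# of the quoted string (optimizer_class omitted since Adam is SB3's
# default anyway).
init_kwargs = {
    "policy": "MlpPolicy",
    "learning_rate": 0.0005,
    "buffer_size": 4000,
    "learning_starts": 1000,
    "batch_size": 32,
    "gamma": 1.0,
    "train_freq": (1, "step"),
    "gradient_steps": 1,
    "replay_buffer_class": None,
    "replay_buffer_kwargs": None,
    "optimize_memory_usage": False,
    "target_update_interval": 500,
    "exploration_fraction": 0.1,
    "exploration_initial_eps": 1.0,
    "exploration_final_eps": 0.02,
    "max_grad_norm": 10,
    "policy_kwargs": dict(net_arch=[128, 128]),
    "device": "auto",
}

# With stable_baselines3 installed and an env constructed, this would be:
# from stable_baselines3 import DQN
# model = DQN(env=env, **init_kwargs)
# model.learn(total_timesteps=3_200_000)
```

Note that `gamma: 1.0` disables discounting entirely, which can make Q-value targets grow with episode length; with `max_episode_steps: 300` that is bounded, but it is worth ruling out as a contributor to instability.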