Abnormal data output and training period #30

@AlexGranger-scn

Sorry to disturb you. I have run into a similar problem: I followed the instructions to run the project and added --final-eval-only 0, but training is taking abnormally long. I am using a V100, and it has still not finished after 8 hours.
On my wandb dashboard I can see only 2 data points in each chart so far.
Where have I made a mistake?

My output :

root@dl-0947440725-pod-jupyter-fccff46d6-xvhs7:# cd spr-release
root@dl-0947440725-pod-jupyter-fccff46d6-xvhs7:/spr-release# rm -rf wandb
root@dl-0947440725-pod-jupyter-fccff46d6-xvhs7:/spr-release# conda activate spr
(spr) root@dl-0947440725-pod-jupyter-fccff46d6-xvhs7:/spr-release# python -m scripts.run --public --game kung_fu_master --momentum-tau 1. --final-eval-only 0
wandb: Currently logged in as: alexgranger (use wandb login --relogin to force relogin)
wandb: Tracking run with wandb version 0.12.11
wandb: Run data is saved locally in wandb/run-20220327_233137-3l5mkumo
wandb: Run wandb offline to turn off syncing.
wandb: Syncing run dry-eon-3
wandb: ⭐️ View project at https://wandb.ai/alexgranger/uncategorized
wandb: 🚀 View run at https://wandb.ai/alexgranger/uncategorized/runs/3l5mkumo
logger_context received log_dir outside of /root/spr-release/data: prepending by /root/spr-release/data/local///
2022-03-27 23:31:43.907745 | dqn_kung_fu_master_0 Runner master CPU affinity: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71].
2022-03-27 23:31:43.908008 | dqn_kung_fu_master_0 Runner master Torch threads: 36.
using seed 0
Spatial latent size is torch.Size([64, 7, 7])
Initialized model with 4045991 parameters
Spatial latent size is torch.Size([64, 7, 7])
Initialized model with 4045991 parameters
2022-03-27 23:32:05.931997 | dqn_kung_fu_master_0 Sampler decorrelating envs, max steps: 0
2022-03-27 23:32:05.933166 | dqn_kung_fu_master_0 Agent at itr 0, sample eps 1.0 (min itr: 50, max_itr: 1000)
2022-03-27 23:32:05.933360 | dqn_kung_fu_master_0 Serial Sampler initialized.
2022-03-27 23:32:05.933493 | dqn_kung_fu_master_0 Running 100000 iterations of minibatch RL.
2022-03-27 23:32:15.510158 | dqn_kung_fu_master_0 Initialized agent model on device: cuda:0.
2022-03-27 23:32:15.576100 | dqn_kung_fu_master_0 From sampler batch size 1, training batch size 32, and replay ratio 64, computed 2 updates per iteration.
2022-03-27 23:32:15.576465 | dqn_kung_fu_master_0 Agent setting min/max epsilon itrs: 2000, 2001
2022-03-27 23:32:15.679657 | dqn_kung_fu_master_0 Frame-based buffer using 4-frame sequences.
2022-03-27 23:32:23.633687 | dqn_kung_fu_master_0 itr #0 Evaluating agent...
2022-03-27 23:32:23.634627 | dqn_kung_fu_master_0 itr #0 Agent at itr 0, eval eps 1.0
2022-03-27 23:32:24.872641 | dqn_kung_fu_master_0 itr #0 Agent at itr 0, eval eps 1.0
2022-03-27 23:33:47.222620 | dqn_kung_fu_master_0 itr #0 Evaluation reached max num trajectories (100).
2022-03-27 23:33:47.224658 | dqn_kung_fu_master_0 itr #0 Evaluation runs complete.
2022-03-27 23:33:47.224950 | dqn_kung_fu_master_0 itr #0 saving snapshot...
2022-03-27 23:33:47.653684 | dqn_kung_fu_master_0 itr #0 saved
/root/.local/conda/envs/spr/lib/python3.6/site-packages/numpy/lib/function_base.py:380: RuntimeWarning: Mean of empty slice.
avg = a.mean(axis)
/root/.local/conda/envs/spr/lib/python3.6/site-packages/numpy/core/methods.py:170: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
2022-03-27 23:33:47.714302 | ----------------------------- -------------
2022-03-27 23:33:47.714680 | Diagnostics/StepsInEval 89813
2022-03-27 23:33:47.715209 | Diagnostics/TrajsInEval 100
2022-03-27 23:33:47.715299 | Diagnostics/CumEvalTime 83.5899
2022-03-27 23:33:47.715376 | Diagnostics/CumTrainTime 0.430582
2022-03-27 23:33:47.715452 | Diagnostics/Iteration 0
2022-03-27 23:33:47.715525 | Diagnostics/CumTime (s) 84.0205
2022-03-27 23:33:47.715588 | Diagnostics/CumSteps 1
2022-03-27 23:33:47.715665 | Diagnostics/CumCompletedTrajs 0
2022-03-27 23:33:47.715727 | Diagnostics/CumUpdates 0
2022-03-27 23:33:47.715780 | Diagnostics/StepsPerSecond nan
2022-03-27 23:33:47.715826 | Diagnostics/UpdatesPerSecond nan
2022-03-27 23:33:47.715885 | Diagnostics/ReplayRatio 0
2022-03-27 23:33:47.715938 | Diagnostics/CumReplayRatio 0
2022-03-27 23:33:47.715983 | LengthAverage 898.13
2022-03-27 23:33:47.716036 | LengthStd 136.602
2022-03-27 23:33:47.716108 | LengthMedian 890.5
2022-03-27 23:33:47.716180 | LengthMin 580
2022-03-27 23:33:47.716243 | LengthMax 1206
2022-03-27 23:33:47.716332 | ReturnAverage 2.76
2022-03-27 23:33:47.716428 | ReturnStd 2.0006
2022-03-27 23:33:47.716486 | ReturnMedian 2
2022-03-27 23:33:47.716547 | ReturnMin 0
2022-03-27 23:33:47.716599 | ReturnMax 9
2022-03-27 23:33:47.716667 | NonzeroRewardsAverage 2.76
2022-03-27 23:33:47.716727 | NonzeroRewardsStd 2.0006
2022-03-27 23:33:47.716780 | NonzeroRewardsMedian 2
2022-03-27 23:33:47.716849 | NonzeroRewardsMin 0
2022-03-27 23:33:47.716912 | NonzeroRewardsMax 9
2022-03-27 23:33:47.716963 | DiscountedReturnAverage 0.0968938
2022-03-27 23:33:47.717033 | DiscountedReturnStd 0.121937
2022-03-27 23:33:47.717095 | DiscountedReturnMedian 0.0532579
2022-03-27 23:33:47.717172 | DiscountedReturnMin 0
2022-03-27 23:33:47.717243 | DiscountedReturnMax 0.62236
2022-03-27 23:33:47.717322 | GameScoreAverage 441
2022-03-27 23:33:47.717409 | GameScoreStd 320.654
2022-03-27 23:33:47.717481 | GameScoreMedian 400
2022-03-27 23:33:47.717541 | GameScoreMin 0
2022-03-27 23:33:47.717614 | GameScoreMax 1500
2022-03-27 23:33:47.717686 | lossAverage nan
2022-03-27 23:33:47.717748 | lossStd nan
2022-03-27 23:33:47.717822 | lossMedian nan
2022-03-27 23:33:47.717870 | lossMin nan
2022-03-27 23:33:47.717941 | lossMax nan
2022-03-27 23:33:47.718022 | gradNormAverage nan
2022-03-27 23:33:47.718083 | gradNormStd nan
2022-03-27 23:33:47.718157 | gradNormMedian nan
2022-03-27 23:33:47.718227 | gradNormMin nan
2022-03-27 23:33:47.718302 | gradNormMax nan
2022-03-27 23:33:47.718378 | tdAbsErrAverage nan
2022-03-27 23:33:47.718439 | tdAbsErrStd nan
2022-03-27 23:33:47.718515 | tdAbsErrMedian nan
2022-03-27 23:33:47.718578 | tdAbsErrMin nan
2022-03-27 23:33:47.718652 | tdAbsErrMax nan
2022-03-27 23:33:47.718714 | modelRLLossAverage nan
2022-03-27 23:33:47.718766 | modelRLLossStd nan
2022-03-27 23:33:47.718833 | modelRLLossMedian nan
2022-03-27 23:33:47.718894 | modelRLLossMin nan
2022-03-27 23:33:47.718945 | modelRLLossMax nan
2022-03-27 23:33:47.719014 | RewardLossAverage nan
2022-03-27 23:33:47.719075 | RewardLossStd nan
2022-03-27 23:33:47.719148 | RewardLossMedian nan
2022-03-27 23:33:47.719210 | RewardLossMin nan
2022-03-27 23:33:47.719298 | RewardLossMax nan
2022-03-27 23:33:47.719371 | modelGradNormAverage nan
2022-03-27 23:33:47.719433 | modelGradNormStd nan
2022-03-27 23:33:47.719485 | modelGradNormMedian nan
2022-03-27 23:33:47.719553 | modelGradNormMin nan
2022-03-27 23:33:47.719615 | modelGradNormMax nan
2022-03-27 23:33:47.719690 | SPRLossAverage nan
2022-03-27 23:33:47.719753 | SPRLossStd nan
2022-03-27 23:33:47.719805 | SPRLossMedian nan
2022-03-27 23:33:47.719873 | SPRLossMin nan
2022-03-27 23:33:47.719976 | SPRLossMax nan
2022-03-27 23:33:47.720034 | ModelSPRLossAverage nan
2022-03-27 23:33:47.720104 | ModelSPRLossStd nan
2022-03-27 23:33:47.720166 | ModelSPRLossMedian nan
2022-03-27 23:33:47.720218 | ModelSPRLossMin nan
2022-03-27 23:33:47.720273 | ModelSPRLossMax nan
2022-03-27 23:33:47.720344 | ----------------------------- -------------
2022-03-27 23:33:47.720917 | dqn_kung_fu_master_0 itr #0 Optimizing over 10000 iterations.
2022-03-27 23:33:47.724755 | dqn_kung_fu_master_0 itr #0 Agent at itr 0, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:47.725975 | dqn_kung_fu_master_0 itr #0 Agent at itr 0, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:48.536667 | dqn_kung_fu_master_0 itr #200 Agent at itr 200, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:48.537071 | dqn_kung_fu_master_0 itr #200 Agent at itr 200, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [# ] 100% | ETA: 00:00:37
2022-03-27 23:33:49.265005 | dqn_kung_fu_master_0 itr #400 Agent at itr 400, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:49.265436 | dqn_kung_fu_master_0 itr #400 Agent at itr 400, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:49.970561 | dqn_kung_fu_master_0 itr #600 Agent at itr 600, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:49.970965 | dqn_kung_fu_master_0 itr #600 Agent at itr 600, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [## ] 100% | ETA: 00:00:34
2022-03-27 23:33:50.703183 | dqn_kung_fu_master_0 itr #800 Agent at itr 800, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:50.703541 | dqn_kung_fu_master_0 itr #800 Agent at itr 800, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [### ] 100% | ETA: 00:00:33
2022-03-27 23:33:51.414277 | dqn_kung_fu_master_0 itr #1000 Agent at itr 1000, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:51.415657 | dqn_kung_fu_master_0 itr #1000 Agent at itr 1000, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:52.193086 | dqn_kung_fu_master_0 itr #1200 Agent at itr 1200, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:52.193518 | dqn_kung_fu_master_0 itr #1200 Agent at itr 1200, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [#### ] 100% | ETA: 00:00:32
2022-03-27 23:33:52.937921 | dqn_kung_fu_master_0 itr #1400 Agent at itr 1400, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:52.938333 | dqn_kung_fu_master_0 itr #1400 Agent at itr 1400, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:53.651224 | dqn_kung_fu_master_0 itr #1600 Agent at itr 1600, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:53.659388 | dqn_kung_fu_master_0 itr #1600 Agent at itr 1600, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [##### ] 100% | ETA: 00:00:30
2022-03-27 23:33:54.373088 | dqn_kung_fu_master_0 itr #1800 Agent at itr 1800, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:54.373503 | dqn_kung_fu_master_0 itr #1800 Agent at itr 1800, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [###### ] 100% | ETA: 00:00:29
2022-03-27 23:33:55.090011 | dqn_kung_fu_master_0 itr #2000 Agent at itr 2000, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:55.090488 | dqn_kung_fu_master_0 itr #2000 Agent at itr 2000, sample eps 1.0 (min itr: 2000, max_itr: 2001)
/root/spr-release/src/algos.py:150: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
opt_info.gradNorm.append(torch.tensor(grad_norm).item()) # grad_norm is a float sometimes, so wrap in tensor
/root/spr-release/src/algos.py:153: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
opt_info.modelGradNorm.append(torch.tensor(model_grad_norm).item())
2022-03-27 23:33:56.678533 | dqn_kung_fu_master_0 itr #2001 Agent at itr 2001, sample eps 0.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:56.679209 | dqn_kung_fu_master_0 itr #2001 Agent at itr 2001, sample eps 0.0 (min itr: 2000, max_itr: 2001)
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 03:16:42
2022-03-28 02:50:30.281019 | dqn_kung_fu_master_0 itr #9999 Evaluating agent...
2022-03-28 02:50:30.284811 | dqn_kung_fu_master_0 itr #9999 Agent at itr 9999, eval eps 0.001
2022-03-28 02:50:31.487369 | dqn_kung_fu_master_0 itr #9999 Agent at itr 9999, eval eps 0.001
2022-03-28 02:51:48.012106 | dqn_kung_fu_master_0 itr #9999 Evaluation reached max num trajectories (100).
2022-03-28 02:51:48.014187 | dqn_kung_fu_master_0 itr #9999 Evaluation runs complete.
2022-03-28 02:51:48.014417 | dqn_kung_fu_master_0 itr #9999 saving snapshot...
2022-03-28 02:51:48.556121 | dqn_kung_fu_master_0 itr #9999 saved
2022-03-28 02:51:48.688185 | ----------------------------- ---------------
2022-03-28 02:51:48.688282 | Diagnostics/StepsInEval 86153
2022-03-28 02:51:48.688353 | Diagnostics/TrajsInEval 100
2022-03-28 02:51:48.688369 | Diagnostics/CumEvalTime 161.319
2022-03-28 02:51:48.688457 | Diagnostics/CumTrainTime 11803.6
2022-03-28 02:51:48.688488 | Diagnostics/Iteration 9999
2022-03-28 02:51:48.688503 | Diagnostics/CumTime (s) 11964.9
2022-03-28 02:51:48.688517 | Diagnostics/CumSteps 10000
2022-03-28 02:51:48.688570 | Diagnostics/CumCompletedTrajs 10
2022-03-28 02:51:48.688599 | Diagnostics/CumUpdates 16000
2022-03-28 02:51:48.688613 | Diagnostics/StepsPerSecond 0.84723
2022-03-28 02:51:48.688627 | Diagnostics/UpdatesPerSecond 1.35557
2022-03-28 02:51:48.688641 | Diagnostics/ReplayRatio 51.2
2022-03-28 02:51:48.688696 | Diagnostics/CumReplayRatio 51.2
2022-03-28 02:51:48.688727 | LengthAverage 861.53
2022-03-28 02:51:48.688742 | LengthStd 50.7432
2022-03-28 02:51:48.688755 | LengthMedian 880
2022-03-28 02:51:48.688769 | LengthMin 793
2022-03-28 02:51:48.688783 | LengthMax 968
2022-03-28 02:51:48.688820 | ReturnAverage 3.85
2022-03-28 02:51:48.688851 | ReturnStd 0.952628
2022-03-28 02:51:48.688866 | ReturnMedian 4
2022-03-28 02:51:48.688879 | ReturnMin 3
2022-03-28 02:51:48.688893 | ReturnMax 7
2022-03-28 02:51:48.688906 | NonzeroRewardsAverage 3.85
2022-03-28 02:51:48.688938 | NonzeroRewardsStd 0.952628
2022-03-28 02:51:48.688963 | NonzeroRewardsMedian 4
2022-03-28 02:51:48.688998 | NonzeroRewardsMin 3
2022-03-28 02:51:48.689014 | NonzeroRewardsMax 7
2022-03-28 02:51:48.689027 | DiscountedReturnAverage 0.0332842
2022-03-28 02:51:48.689042 | DiscountedReturnStd 0.0181958
2022-03-28 02:51:48.689055 | DiscountedReturnMedian 0.0294325
2022-03-28 02:51:48.689069 | DiscountedReturnMin 0.0146507
2022-03-28 02:51:48.689117 | DiscountedReturnMax 0.074113
2022-03-28 02:51:48.689147 | GameScoreAverage 444
2022-03-28 02:51:48.689162 | GameScoreStd 116.893
2022-03-28 02:51:48.689175 | GameScoreMedian 400
2022-03-28 02:51:48.689189 | GameScoreMin 300
2022-03-28 02:51:48.689203 | GameScoreMax 800
2022-03-28 02:51:48.689216 | lossAverage 0.989625
2022-03-28 02:51:48.689230 | lossStd 0.261377
2022-03-28 02:51:48.689269 | lossMedian 0.940089
2022-03-28 02:51:48.689286 | lossMin 0.400819
2022-03-28 02:51:48.689300 | lossMax 3.95609
2022-03-28 02:51:48.689314 | gradNormAverage 0.901282
2022-03-28 02:51:48.689327 | gradNormStd 0.692513
2022-03-28 02:51:48.689341 | gradNormMedian 0.725976
2022-03-28 02:51:48.689355 | gradNormMin 0.24104
2022-03-28 02:51:48.689368 | gradNormMax 25.3801
2022-03-28 02:51:48.689382 | tdAbsErrAverage 0.196824
2022-03-28 02:51:48.689395 | tdAbsErrStd 0.382023
2022-03-28 02:51:48.689413 | tdAbsErrMedian 0.0395497
2022-03-28 02:51:48.689427 | tdAbsErrMin 0.000149499
2022-03-28 02:51:48.689472 | tdAbsErrMax 4.2798
2022-03-28 02:51:48.689502 | modelRLLossAverage 0
2022-03-28 02:51:48.689517 | modelRLLossStd 0
2022-03-28 02:51:48.689531 | modelRLLossMedian 0
2022-03-28 02:51:48.689544 | modelRLLossMin 0
2022-03-28 02:51:48.689558 | modelRLLossMax 0
2022-03-28 02:51:48.689572 | RewardLossAverage 0.731391
2022-03-28 02:51:48.689585 | RewardLossStd 0.103771
2022-03-28 02:51:48.689599 | RewardLossMedian 0.734766
2022-03-28 02:51:48.689613 | RewardLossMin 0.374675
2022-03-28 02:51:48.689626 | RewardLossMax 1.1154
2022-03-28 02:51:48.689640 | modelGradNormAverage 0.0540919
2022-03-28 02:51:48.689678 | modelGradNormStd 0.111477
2022-03-28 02:51:48.689707 | modelGradNormMedian 0.0481481
2022-03-28 02:51:48.689723 | modelGradNormMin 0.00393918
2022-03-28 02:51:48.689736 | modelGradNormMax 8.68913
2022-03-28 02:51:48.689750 | SPRLossAverage 0.0268906
2022-03-28 02:51:48.689764 | SPRLossStd 0.130056
2022-03-28 02:51:48.689777 | SPRLossMedian 0.0198887
2022-03-28 02:51:48.689791 | SPRLossMin 0.0023902
2022-03-28 02:51:48.689805 | SPRLossMax 9.96364
2022-03-28 02:51:48.689818 | ModelSPRLossAverage 0.00537812
2022-03-28 02:51:48.689832 | ModelSPRLossStd 0.0260111
2022-03-28 02:51:48.689866 | ModelSPRLossMedian 0.00397775
2022-03-28 02:51:48.689896 | ModelSPRLossMin 0.000478041
2022-03-28 02:51:48.689911 | ModelSPRLossMax 1.99273
2022-03-28 02:51:48.689925 | ----------------------------- ---------------
2022-03-28 02:51:48.690080 | dqn_kung_fu_master_0 itr #9999 Optimizing over 10000 iterations.
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 04:03:45
2022-03-28 06:55:33.890054 | dqn_kung_fu_master_0 itr #19999 Evaluating agent...
2022-03-28 06:55:33.892141 | dqn_kung_fu_master_0 itr #19999 Agent at itr 19999, eval eps 0.001
2022-03-28 06:55:35.105315 | dqn_kung_fu_master_0 itr #19999 Agent at itr 19999, eval eps 0.001
2022-03-28 06:57:00.138217 | dqn_kung_fu_master_0 itr #19999 Evaluation reached max num trajectories (100).
2022-03-28 06:57:00.139980 | dqn_kung_fu_master_0 itr #19999 Evaluation runs complete.
2022-03-28 06:57:00.140213 | dqn_kung_fu_master_0 itr #19999 saving snapshot...
2022-03-28 06:57:00.719706 | dqn_kung_fu_master_0 itr #19999 saved
2022-03-28 06:57:00.860795 | ----------------------------- ---------------
2022-03-28 06:57:00.860942 | Diagnostics/StepsInEval 94353
2022-03-28 06:57:00.860964 | Diagnostics/TrajsInEval 100
2022-03-28 06:57:00.861059 | Diagnostics/CumEvalTime 247.567
2022-03-28 06:57:00.861093 | Diagnostics/CumTrainTime 26429.5
2022-03-28 06:57:00.861108 | Diagnostics/Iteration 19999
2022-03-28 06:57:00.861122 | Diagnostics/CumTime (s) 26677.1
2022-03-28 06:57:00.861136 | Diagnostics/CumSteps 20000
2022-03-28 06:57:00.861150 | Diagnostics/CumCompletedTrajs 20
2022-03-28 06:57:00.861214 | Diagnostics/CumUpdates 36000
2022-03-28 06:57:00.861284 | Diagnostics/StepsPerSecond 0.683718
2022-03-28 06:57:00.861315 | Diagnostics/UpdatesPerSecond 1.36744
2022-03-28 06:57:00.861330 | Diagnostics/ReplayRatio 64
2022-03-28 06:57:00.861344 | Diagnostics/CumReplayRatio 57.6
2022-03-28 06:57:00.861358 | LengthAverage 943.53
2022-03-28 06:57:00.861372 | LengthStd 73.1543
2022-03-28 06:57:00.861407 | LengthMedian 931
2022-03-28 06:57:00.861437 | LengthMin 852
2022-03-28 06:57:00.861452 | LengthMax 1275
2022-03-28 06:57:00.861466 | ReturnAverage 8.02
2022-03-28 06:57:00.861479 | ReturnStd 1.22458
2022-03-28 06:57:00.861493 | ReturnMedian 8
2022-03-28 06:57:00.861507 | ReturnMin 4
2022-03-28 06:57:00.861546 | ReturnMax 11
2022-03-28 06:57:00.861570 | NonzeroRewardsAverage 8.02
2022-03-28 06:57:00.861602 | NonzeroRewardsStd 1.22458
2022-03-28 06:57:00.861617 | NonzeroRewardsMedian 8
2022-03-28 06:57:00.861630 | NonzeroRewardsMin 4
2022-03-28 06:57:00.861644 | NonzeroRewardsMax 11
2022-03-28 06:57:00.861658 | DiscountedReturnAverage 0.492846
2022-03-28 06:57:00.861672 | DiscountedReturnStd 0.1024
2022-03-28 06:57:00.861706 | DiscountedReturnMedian 0.460267
2022-03-28 06:57:00.861735 | DiscountedReturnMin 0.371805
2022-03-28 06:57:00.861750 | DiscountedReturnMax 0.677866
2022-03-28 06:57:00.861764 | GameScoreAverage 1256
2022-03-28 06:57:00.861778 | GameScoreStd 201.157
2022-03-28 06:57:00.861792 | GameScoreMedian 1300
2022-03-28 06:57:00.861806 | GameScoreMin 600
2022-03-28 06:57:00.861819 | GameScoreMax 1800
2022-03-28 06:57:00.861857 | lossAverage 0.769059
2022-03-28 06:57:00.861887 | lossStd 0.138844
2022-03-28 06:57:00.861902 | lossMedian 0.767441
2022-03-28 06:57:00.861916 | lossMin 0.317752
2022-03-28 06:57:00.861930 | lossMax 1.36424
2022-03-28 06:57:00.861950 | gradNormAverage 0.713153
2022-03-28 06:57:00.861964 | gradNormStd 0.215024
2022-03-28 06:57:00.861978 | gradNormMedian 0.679873
2022-03-28 06:57:00.861992 | gradNormMin 0.255769
2022-03-28 06:57:00.862012 | gradNormMax 3.37628
2022-03-28 06:57:00.862059 | tdAbsErrAverage 0.176727
2022-03-28 06:57:00.862090 | tdAbsErrStd 0.270876
2022-03-28 06:57:00.862105 | tdAbsErrMedian 0.0628458
2022-03-28 06:57:00.862119 | tdAbsErrMin 0.000104254
2022-03-28 06:57:00.862133 | tdAbsErrMax 3.40443
2022-03-28 06:57:00.862147 | modelRLLossAverage 0
2022-03-28 06:57:00.862161 | modelRLLossStd 0
2022-03-28 06:57:00.862175 | modelRLLossMedian 0
2022-03-28 06:57:00.862189 | modelRLLossMin 0
2022-03-28 06:57:00.862203 | modelRLLossMax 0
2022-03-28 06:57:00.862236 | RewardLossAverage 0.541511
2022-03-28 06:57:00.862278 | RewardLossStd 0.0829435
2022-03-28 06:57:00.862293 | RewardLossMedian 0.543767
2022-03-28 06:57:00.862308 | RewardLossMin 0.246088
2022-03-28 06:57:00.862322 | RewardLossMax 0.854726
2022-03-28 06:57:00.862336 | modelGradNormAverage 0.130482
2022-03-28 06:57:00.862349 | modelGradNormStd 0.0415271
2022-03-28 06:57:00.862363 | modelGradNormMedian 0.125761
2022-03-28 06:57:00.862378 | modelGradNormMin 0.0329402
2022-03-28 06:57:00.862392 | modelGradNormMax 0.456768
2022-03-28 06:57:00.862425 | SPRLossAverage 0.0664854
2022-03-28 06:57:00.862461 | SPRLossStd 0.0181049
2022-03-28 06:57:00.862475 | SPRLossMedian 0.0649353
2022-03-28 06:57:00.862489 | SPRLossMin 0.0227788
2022-03-28 06:57:00.862502 | SPRLossMax 0.165078
2022-03-28 06:57:00.862516 | ModelSPRLossAverage 0.0132971
2022-03-28 06:57:00.862530 | ModelSPRLossStd 0.00362097
2022-03-28 06:57:00.862544 | ModelSPRLossMedian 0.0129871
2022-03-28 06:57:00.862558 | ModelSPRLossMin 0.00455576
2022-03-28 06:57:00.862572 | ModelSPRLossMax 0.0330155
2022-03-28 06:57:00.862613 | ----------------------------- ---------------
2022-03-28 06:57:00.862793 | dqn_kung_fu_master_0 itr #19999 Optimizing over 10000 iterations.
0% [###### ] 100% | ETA: 03:14:22
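
For reference, a rough back-of-the-envelope check on the numbers in the log above. This is only a sketch: it copies the Diagnostics/CumTrainTime values and the iteration counts printed in the log, and it assumes (this is not confirmed anywhere in the log) that throughput stays at the rate of the most recent 10000-iteration interval and that an evaluation is logged at itr #0 and then once per interval. The function name projected_train_hours is made up for illustration.

```python
# Values copied from the log above.
total_itrs = 100_000      # "Running 100000 iterations of minibatch RL."
log_interval = 10_000     # "Optimizing over 10000 iterations."
cum_train_time = {9_999: 11803.6, 19_999: 26429.5}  # Diagnostics/CumTrainTime (s)


def projected_train_hours(cum_train_time, total_itrs):
    """Linearly extrapolate total training time from the last logging interval.

    Assumption: seconds per iteration stays constant for the rest of the run.
    """
    (itr_a, t_a), (itr_b, t_b) = sorted(cum_train_time.items())[-2:]
    sec_per_itr = (t_b - t_a) / (itr_b - itr_a)
    remaining_itrs = total_itrs - (itr_b + 1)
    return (t_b + remaining_itrs * sec_per_itr) / 3600.0


hours = projected_train_hours(cum_train_time, total_itrs)

# Assumption: one evaluation at itr #0 plus one per completed interval.
n_eval_points = total_itrs // log_interval + 1

print(f"projected training time: {hours:.1f} h")
print(f"expected evaluation points over the full run: {n_eval_points}")
```

Under those assumptions the run would need on the order of 40 hours in total, and only a handful of evaluation points would have been logged after 8 hours.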
