Description
Sorry to disturb you. I have run into a similar problem: I followed the instructions to run the project and added --final-eval-only 0, but training is taking abnormally long. On a V100, it still has not finished after 8 hours.
So far I can see only 2 data points in each chart on my wandb page.
Where might I have made a mistake?
My output:
root@dl-0947440725-pod-jupyter-fccff46d6-xvhs7:# cd spr-release
root@dl-0947440725-pod-jupyter-fccff46d6-xvhs7:/spr-release# rm -rf wandb
root@dl-0947440725-pod-jupyter-fccff46d6-xvhs7:/spr-release# conda activate spr
(spr) root@dl-0947440725-pod-jupyter-fccff46d6-xvhs7:/spr-release# python -m scripts.run --public --game kung_fu_master --momentum-tau 1. --final-eval-only 0
wandb: Currently logged in as: alexgranger (use wandb login --relogin to force relogin)
wandb: Tracking run with wandb version 0.12.11
wandb: Run data is saved locally in wandb/run-20220327_233137-3l5mkumo
wandb: Run wandb offline to turn off syncing.
wandb: Syncing run dry-eon-3
wandb: ⭐️ View project at https://wandb.ai/alexgranger/uncategorized
wandb: 🚀 View run at https://wandb.ai/alexgranger/uncategorized/runs/3l5mkumo
logger_context received log_dir outside of /root/spr-release/data: prepending by /root/spr-release/data/local///
2022-03-27 23:31:43.907745 | dqn_kung_fu_master_0 Runner master CPU affinity: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71].
2022-03-27 23:31:43.908008 | dqn_kung_fu_master_0 Runner master Torch threads: 36.
using seed 0
Spatial latent size is torch.Size([64, 7, 7])
Initialized model with 4045991 parameters
Spatial latent size is torch.Size([64, 7, 7])
Initialized model with 4045991 parameters
2022-03-27 23:32:05.931997 | dqn_kung_fu_master_0 Sampler decorrelating envs, max steps: 0
2022-03-27 23:32:05.933166 | dqn_kung_fu_master_0 Agent at itr 0, sample eps 1.0 (min itr: 50, max_itr: 1000)
2022-03-27 23:32:05.933360 | dqn_kung_fu_master_0 Serial Sampler initialized.
2022-03-27 23:32:05.933493 | dqn_kung_fu_master_0 Running 100000 iterations of minibatch RL.
2022-03-27 23:32:15.510158 | dqn_kung_fu_master_0 Initialized agent model on device: cuda:0.
2022-03-27 23:32:15.576100 | dqn_kung_fu_master_0 From sampler batch size 1, training batch size 32, and replay ratio 64, computed 2 updates per iteration.
2022-03-27 23:32:15.576465 | dqn_kung_fu_master_0 Agent setting min/max epsilon itrs: 2000, 2001
2022-03-27 23:32:15.679657 | dqn_kung_fu_master_0 Frame-based buffer using 4-frame sequences.
2022-03-27 23:32:23.633687 | dqn_kung_fu_master_0 itr #0 Evaluating agent...
2022-03-27 23:32:23.634627 | dqn_kung_fu_master_0 itr #0 Agent at itr 0, eval eps 1.0
2022-03-27 23:32:24.872641 | dqn_kung_fu_master_0 itr #0 Agent at itr 0, eval eps 1.0
2022-03-27 23:33:47.222620 | dqn_kung_fu_master_0 itr #0 Evaluation reached max num trajectories (100).
2022-03-27 23:33:47.224658 | dqn_kung_fu_master_0 itr #0 Evaluation runs complete.
2022-03-27 23:33:47.224950 | dqn_kung_fu_master_0 itr #0 saving snapshot...
2022-03-27 23:33:47.653684 | dqn_kung_fu_master_0 itr #0 saved
/root/.local/conda/envs/spr/lib/python3.6/site-packages/numpy/lib/function_base.py:380: RuntimeWarning: Mean of empty slice.
avg = a.mean(axis)
/root/.local/conda/envs/spr/lib/python3.6/site-packages/numpy/core/methods.py:170: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
2022-03-27 23:33:47.714302 | ----------------------------- -------------
2022-03-27 23:33:47.714680 | Diagnostics/StepsInEval 89813
2022-03-27 23:33:47.715209 | Diagnostics/TrajsInEval 100
2022-03-27 23:33:47.715299 | Diagnostics/CumEvalTime 83.5899
2022-03-27 23:33:47.715376 | Diagnostics/CumTrainTime 0.430582
2022-03-27 23:33:47.715452 | Diagnostics/Iteration 0
2022-03-27 23:33:47.715525 | Diagnostics/CumTime (s) 84.0205
2022-03-27 23:33:47.715588 | Diagnostics/CumSteps 1
2022-03-27 23:33:47.715665 | Diagnostics/CumCompletedTrajs 0
2022-03-27 23:33:47.715727 | Diagnostics/CumUpdates 0
2022-03-27 23:33:47.715780 | Diagnostics/StepsPerSecond nan
2022-03-27 23:33:47.715826 | Diagnostics/UpdatesPerSecond nan
2022-03-27 23:33:47.715885 | Diagnostics/ReplayRatio 0
2022-03-27 23:33:47.715938 | Diagnostics/CumReplayRatio 0
2022-03-27 23:33:47.715983 | LengthAverage 898.13
2022-03-27 23:33:47.716036 | LengthStd 136.602
2022-03-27 23:33:47.716108 | LengthMedian 890.5
2022-03-27 23:33:47.716180 | LengthMin 580
2022-03-27 23:33:47.716243 | LengthMax 1206
2022-03-27 23:33:47.716332 | ReturnAverage 2.76
2022-03-27 23:33:47.716428 | ReturnStd 2.0006
2022-03-27 23:33:47.716486 | ReturnMedian 2
2022-03-27 23:33:47.716547 | ReturnMin 0
2022-03-27 23:33:47.716599 | ReturnMax 9
2022-03-27 23:33:47.716667 | NonzeroRewardsAverage 2.76
2022-03-27 23:33:47.716727 | NonzeroRewardsStd 2.0006
2022-03-27 23:33:47.716780 | NonzeroRewardsMedian 2
2022-03-27 23:33:47.716849 | NonzeroRewardsMin 0
2022-03-27 23:33:47.716912 | NonzeroRewardsMax 9
2022-03-27 23:33:47.716963 | DiscountedReturnAverage 0.0968938
2022-03-27 23:33:47.717033 | DiscountedReturnStd 0.121937
2022-03-27 23:33:47.717095 | DiscountedReturnMedian 0.0532579
2022-03-27 23:33:47.717172 | DiscountedReturnMin 0
2022-03-27 23:33:47.717243 | DiscountedReturnMax 0.62236
2022-03-27 23:33:47.717322 | GameScoreAverage 441
2022-03-27 23:33:47.717409 | GameScoreStd 320.654
2022-03-27 23:33:47.717481 | GameScoreMedian 400
2022-03-27 23:33:47.717541 | GameScoreMin 0
2022-03-27 23:33:47.717614 | GameScoreMax 1500
2022-03-27 23:33:47.717686 | lossAverage nan
2022-03-27 23:33:47.717748 | lossStd nan
2022-03-27 23:33:47.717822 | lossMedian nan
2022-03-27 23:33:47.717870 | lossMin nan
2022-03-27 23:33:47.717941 | lossMax nan
2022-03-27 23:33:47.718022 | gradNormAverage nan
2022-03-27 23:33:47.718083 | gradNormStd nan
2022-03-27 23:33:47.718157 | gradNormMedian nan
2022-03-27 23:33:47.718227 | gradNormMin nan
2022-03-27 23:33:47.718302 | gradNormMax nan
2022-03-27 23:33:47.718378 | tdAbsErrAverage nan
2022-03-27 23:33:47.718439 | tdAbsErrStd nan
2022-03-27 23:33:47.718515 | tdAbsErrMedian nan
2022-03-27 23:33:47.718578 | tdAbsErrMin nan
2022-03-27 23:33:47.718652 | tdAbsErrMax nan
2022-03-27 23:33:47.718714 | modelRLLossAverage nan
2022-03-27 23:33:47.718766 | modelRLLossStd nan
2022-03-27 23:33:47.718833 | modelRLLossMedian nan
2022-03-27 23:33:47.718894 | modelRLLossMin nan
2022-03-27 23:33:47.718945 | modelRLLossMax nan
2022-03-27 23:33:47.719014 | RewardLossAverage nan
2022-03-27 23:33:47.719075 | RewardLossStd nan
2022-03-27 23:33:47.719148 | RewardLossMedian nan
2022-03-27 23:33:47.719210 | RewardLossMin nan
2022-03-27 23:33:47.719298 | RewardLossMax nan
2022-03-27 23:33:47.719371 | modelGradNormAverage nan
2022-03-27 23:33:47.719433 | modelGradNormStd nan
2022-03-27 23:33:47.719485 | modelGradNormMedian nan
2022-03-27 23:33:47.719553 | modelGradNormMin nan
2022-03-27 23:33:47.719615 | modelGradNormMax nan
2022-03-27 23:33:47.719690 | SPRLossAverage nan
2022-03-27 23:33:47.719753 | SPRLossStd nan
2022-03-27 23:33:47.719805 | SPRLossMedian nan
2022-03-27 23:33:47.719873 | SPRLossMin nan
2022-03-27 23:33:47.719976 | SPRLossMax nan
2022-03-27 23:33:47.720034 | ModelSPRLossAverage nan
2022-03-27 23:33:47.720104 | ModelSPRLossStd nan
2022-03-27 23:33:47.720166 | ModelSPRLossMedian nan
2022-03-27 23:33:47.720218 | ModelSPRLossMin nan
2022-03-27 23:33:47.720273 | ModelSPRLossMax nan
2022-03-27 23:33:47.720344 | ----------------------------- -------------
2022-03-27 23:33:47.720917 | dqn_kung_fu_master_0 itr #0 Optimizing over 10000 iterations.
2022-03-27 23:33:47.724755 | dqn_kung_fu_master_0 itr #0 Agent at itr 0, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:47.725975 | dqn_kung_fu_master_0 itr #0 Agent at itr 0, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:48.536667 | dqn_kung_fu_master_0 itr #200 Agent at itr 200, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:48.537071 | dqn_kung_fu_master_0 itr #200 Agent at itr 200, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [# ] 100% | ETA: 00:00:37
2022-03-27 23:33:49.265005 | dqn_kung_fu_master_0 itr #400 Agent at itr 400, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:49.265436 | dqn_kung_fu_master_0 itr #400 Agent at itr 400, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:49.970561 | dqn_kung_fu_master_0 itr #600 Agent at itr 600, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:49.970965 | dqn_kung_fu_master_0 itr #600 Agent at itr 600, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [## ] 100% | ETA: 00:00:34
2022-03-27 23:33:50.703183 | dqn_kung_fu_master_0 itr #800 Agent at itr 800, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:50.703541 | dqn_kung_fu_master_0 itr #800 Agent at itr 800, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [### ] 100% | ETA: 00:00:33
2022-03-27 23:33:51.414277 | dqn_kung_fu_master_0 itr #1000 Agent at itr 1000, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:51.415657 | dqn_kung_fu_master_0 itr #1000 Agent at itr 1000, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:52.193086 | dqn_kung_fu_master_0 itr #1200 Agent at itr 1200, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:52.193518 | dqn_kung_fu_master_0 itr #1200 Agent at itr 1200, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [#### ] 100% | ETA: 00:00:32
2022-03-27 23:33:52.937921 | dqn_kung_fu_master_0 itr #1400 Agent at itr 1400, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:52.938333 | dqn_kung_fu_master_0 itr #1400 Agent at itr 1400, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:53.651224 | dqn_kung_fu_master_0 itr #1600 Agent at itr 1600, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:53.659388 | dqn_kung_fu_master_0 itr #1600 Agent at itr 1600, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [##### ] 100% | ETA: 00:00:30
2022-03-27 23:33:54.373088 | dqn_kung_fu_master_0 itr #1800 Agent at itr 1800, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:54.373503 | dqn_kung_fu_master_0 itr #1800 Agent at itr 1800, sample eps 1.0 (min itr: 2000, max_itr: 2001)
0% [###### ] 100% | ETA: 00:00:29
2022-03-27 23:33:55.090011 | dqn_kung_fu_master_0 itr #2000 Agent at itr 2000, sample eps 1.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:55.090488 | dqn_kung_fu_master_0 itr #2000 Agent at itr 2000, sample eps 1.0 (min itr: 2000, max_itr: 2001)
/root/spr-release/src/algos.py:150: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
opt_info.gradNorm.append(torch.tensor(grad_norm).item()) # grad_norm is a float sometimes, so wrap in tensor
/root/spr-release/src/algos.py:153: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
opt_info.modelGradNorm.append(torch.tensor(model_grad_norm).item())
2022-03-27 23:33:56.678533 | dqn_kung_fu_master_0 itr #2001 Agent at itr 2001, sample eps 0.0 (min itr: 2000, max_itr: 2001)
2022-03-27 23:33:56.679209 | dqn_kung_fu_master_0 itr #2001 Agent at itr 2001, sample eps 0.0 (min itr: 2000, max_itr: 2001)
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 03:16:42
2022-03-28 02:50:30.281019 | dqn_kung_fu_master_0 itr #9999 Evaluating agent...
2022-03-28 02:50:30.284811 | dqn_kung_fu_master_0 itr #9999 Agent at itr 9999, eval eps 0.001
2022-03-28 02:50:31.487369 | dqn_kung_fu_master_0 itr #9999 Agent at itr 9999, eval eps 0.001
2022-03-28 02:51:48.012106 | dqn_kung_fu_master_0 itr #9999 Evaluation reached max num trajectories (100).
2022-03-28 02:51:48.014187 | dqn_kung_fu_master_0 itr #9999 Evaluation runs complete.
2022-03-28 02:51:48.014417 | dqn_kung_fu_master_0 itr #9999 saving snapshot...
2022-03-28 02:51:48.556121 | dqn_kung_fu_master_0 itr #9999 saved
2022-03-28 02:51:48.688185 | ----------------------------- ---------------
2022-03-28 02:51:48.688282 | Diagnostics/StepsInEval 86153
2022-03-28 02:51:48.688353 | Diagnostics/TrajsInEval 100
2022-03-28 02:51:48.688369 | Diagnostics/CumEvalTime 161.319
2022-03-28 02:51:48.688457 | Diagnostics/CumTrainTime 11803.6
2022-03-28 02:51:48.688488 | Diagnostics/Iteration 9999
2022-03-28 02:51:48.688503 | Diagnostics/CumTime (s) 11964.9
2022-03-28 02:51:48.688517 | Diagnostics/CumSteps 10000
2022-03-28 02:51:48.688570 | Diagnostics/CumCompletedTrajs 10
2022-03-28 02:51:48.688599 | Diagnostics/CumUpdates 16000
2022-03-28 02:51:48.688613 | Diagnostics/StepsPerSecond 0.84723
2022-03-28 02:51:48.688627 | Diagnostics/UpdatesPerSecond 1.35557
2022-03-28 02:51:48.688641 | Diagnostics/ReplayRatio 51.2
2022-03-28 02:51:48.688696 | Diagnostics/CumReplayRatio 51.2
2022-03-28 02:51:48.688727 | LengthAverage 861.53
2022-03-28 02:51:48.688742 | LengthStd 50.7432
2022-03-28 02:51:48.688755 | LengthMedian 880
2022-03-28 02:51:48.688769 | LengthMin 793
2022-03-28 02:51:48.688783 | LengthMax 968
2022-03-28 02:51:48.688820 | ReturnAverage 3.85
2022-03-28 02:51:48.688851 | ReturnStd 0.952628
2022-03-28 02:51:48.688866 | ReturnMedian 4
2022-03-28 02:51:48.688879 | ReturnMin 3
2022-03-28 02:51:48.688893 | ReturnMax 7
2022-03-28 02:51:48.688906 | NonzeroRewardsAverage 3.85
2022-03-28 02:51:48.688938 | NonzeroRewardsStd 0.952628
2022-03-28 02:51:48.688963 | NonzeroRewardsMedian 4
2022-03-28 02:51:48.688998 | NonzeroRewardsMin 3
2022-03-28 02:51:48.689014 | NonzeroRewardsMax 7
2022-03-28 02:51:48.689027 | DiscountedReturnAverage 0.0332842
2022-03-28 02:51:48.689042 | DiscountedReturnStd 0.0181958
2022-03-28 02:51:48.689055 | DiscountedReturnMedian 0.0294325
2022-03-28 02:51:48.689069 | DiscountedReturnMin 0.0146507
2022-03-28 02:51:48.689117 | DiscountedReturnMax 0.074113
2022-03-28 02:51:48.689147 | GameScoreAverage 444
2022-03-28 02:51:48.689162 | GameScoreStd 116.893
2022-03-28 02:51:48.689175 | GameScoreMedian 400
2022-03-28 02:51:48.689189 | GameScoreMin 300
2022-03-28 02:51:48.689203 | GameScoreMax 800
2022-03-28 02:51:48.689216 | lossAverage 0.989625
2022-03-28 02:51:48.689230 | lossStd 0.261377
2022-03-28 02:51:48.689269 | lossMedian 0.940089
2022-03-28 02:51:48.689286 | lossMin 0.400819
2022-03-28 02:51:48.689300 | lossMax 3.95609
2022-03-28 02:51:48.689314 | gradNormAverage 0.901282
2022-03-28 02:51:48.689327 | gradNormStd 0.692513
2022-03-28 02:51:48.689341 | gradNormMedian 0.725976
2022-03-28 02:51:48.689355 | gradNormMin 0.24104
2022-03-28 02:51:48.689368 | gradNormMax 25.3801
2022-03-28 02:51:48.689382 | tdAbsErrAverage 0.196824
2022-03-28 02:51:48.689395 | tdAbsErrStd 0.382023
2022-03-28 02:51:48.689413 | tdAbsErrMedian 0.0395497
2022-03-28 02:51:48.689427 | tdAbsErrMin 0.000149499
2022-03-28 02:51:48.689472 | tdAbsErrMax 4.2798
2022-03-28 02:51:48.689502 | modelRLLossAverage 0
2022-03-28 02:51:48.689517 | modelRLLossStd 0
2022-03-28 02:51:48.689531 | modelRLLossMedian 0
2022-03-28 02:51:48.689544 | modelRLLossMin 0
2022-03-28 02:51:48.689558 | modelRLLossMax 0
2022-03-28 02:51:48.689572 | RewardLossAverage 0.731391
2022-03-28 02:51:48.689585 | RewardLossStd 0.103771
2022-03-28 02:51:48.689599 | RewardLossMedian 0.734766
2022-03-28 02:51:48.689613 | RewardLossMin 0.374675
2022-03-28 02:51:48.689626 | RewardLossMax 1.1154
2022-03-28 02:51:48.689640 | modelGradNormAverage 0.0540919
2022-03-28 02:51:48.689678 | modelGradNormStd 0.111477
2022-03-28 02:51:48.689707 | modelGradNormMedian 0.0481481
2022-03-28 02:51:48.689723 | modelGradNormMin 0.00393918
2022-03-28 02:51:48.689736 | modelGradNormMax 8.68913
2022-03-28 02:51:48.689750 | SPRLossAverage 0.0268906
2022-03-28 02:51:48.689764 | SPRLossStd 0.130056
2022-03-28 02:51:48.689777 | SPRLossMedian 0.0198887
2022-03-28 02:51:48.689791 | SPRLossMin 0.0023902
2022-03-28 02:51:48.689805 | SPRLossMax 9.96364
2022-03-28 02:51:48.689818 | ModelSPRLossAverage 0.00537812
2022-03-28 02:51:48.689832 | ModelSPRLossStd 0.0260111
2022-03-28 02:51:48.689866 | ModelSPRLossMedian 0.00397775
2022-03-28 02:51:48.689896 | ModelSPRLossMin 0.000478041
2022-03-28 02:51:48.689911 | ModelSPRLossMax 1.99273
2022-03-28 02:51:48.689925 | ----------------------------- ---------------
2022-03-28 02:51:48.690080 | dqn_kung_fu_master_0 itr #9999 Optimizing over 10000 iterations.
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 04:03:45
2022-03-28 06:55:33.890054 | dqn_kung_fu_master_0 itr #19999 Evaluating agent...
2022-03-28 06:55:33.892141 | dqn_kung_fu_master_0 itr #19999 Agent at itr 19999, eval eps 0.001
2022-03-28 06:55:35.105315 | dqn_kung_fu_master_0 itr #19999 Agent at itr 19999, eval eps 0.001
2022-03-28 06:57:00.138217 | dqn_kung_fu_master_0 itr #19999 Evaluation reached max num trajectories (100).
2022-03-28 06:57:00.139980 | dqn_kung_fu_master_0 itr #19999 Evaluation runs complete.
2022-03-28 06:57:00.140213 | dqn_kung_fu_master_0 itr #19999 saving snapshot...
2022-03-28 06:57:00.719706 | dqn_kung_fu_master_0 itr #19999 saved
2022-03-28 06:57:00.860795 | ----------------------------- ---------------
2022-03-28 06:57:00.860942 | Diagnostics/StepsInEval 94353
2022-03-28 06:57:00.860964 | Diagnostics/TrajsInEval 100
2022-03-28 06:57:00.861059 | Diagnostics/CumEvalTime 247.567
2022-03-28 06:57:00.861093 | Diagnostics/CumTrainTime 26429.5
2022-03-28 06:57:00.861108 | Diagnostics/Iteration 19999
2022-03-28 06:57:00.861122 | Diagnostics/CumTime (s) 26677.1
2022-03-28 06:57:00.861136 | Diagnostics/CumSteps 20000
2022-03-28 06:57:00.861150 | Diagnostics/CumCompletedTrajs 20
2022-03-28 06:57:00.861214 | Diagnostics/CumUpdates 36000
2022-03-28 06:57:00.861284 | Diagnostics/StepsPerSecond 0.683718
2022-03-28 06:57:00.861315 | Diagnostics/UpdatesPerSecond 1.36744
2022-03-28 06:57:00.861330 | Diagnostics/ReplayRatio 64
2022-03-28 06:57:00.861344 | Diagnostics/CumReplayRatio 57.6
2022-03-28 06:57:00.861358 | LengthAverage 943.53
2022-03-28 06:57:00.861372 | LengthStd 73.1543
2022-03-28 06:57:00.861407 | LengthMedian 931
2022-03-28 06:57:00.861437 | LengthMin 852
2022-03-28 06:57:00.861452 | LengthMax 1275
2022-03-28 06:57:00.861466 | ReturnAverage 8.02
2022-03-28 06:57:00.861479 | ReturnStd 1.22458
2022-03-28 06:57:00.861493 | ReturnMedian 8
2022-03-28 06:57:00.861507 | ReturnMin 4
2022-03-28 06:57:00.861546 | ReturnMax 11
2022-03-28 06:57:00.861570 | NonzeroRewardsAverage 8.02
2022-03-28 06:57:00.861602 | NonzeroRewardsStd 1.22458
2022-03-28 06:57:00.861617 | NonzeroRewardsMedian 8
2022-03-28 06:57:00.861630 | NonzeroRewardsMin 4
2022-03-28 06:57:00.861644 | NonzeroRewardsMax 11
2022-03-28 06:57:00.861658 | DiscountedReturnAverage 0.492846
2022-03-28 06:57:00.861672 | DiscountedReturnStd 0.1024
2022-03-28 06:57:00.861706 | DiscountedReturnMedian 0.460267
2022-03-28 06:57:00.861735 | DiscountedReturnMin 0.371805
2022-03-28 06:57:00.861750 | DiscountedReturnMax 0.677866
2022-03-28 06:57:00.861764 | GameScoreAverage 1256
2022-03-28 06:57:00.861778 | GameScoreStd 201.157
2022-03-28 06:57:00.861792 | GameScoreMedian 1300
2022-03-28 06:57:00.861806 | GameScoreMin 600
2022-03-28 06:57:00.861819 | GameScoreMax 1800
2022-03-28 06:57:00.861857 | lossAverage 0.769059
2022-03-28 06:57:00.861887 | lossStd 0.138844
2022-03-28 06:57:00.861902 | lossMedian 0.767441
2022-03-28 06:57:00.861916 | lossMin 0.317752
2022-03-28 06:57:00.861930 | lossMax 1.36424
2022-03-28 06:57:00.861950 | gradNormAverage 0.713153
2022-03-28 06:57:00.861964 | gradNormStd 0.215024
2022-03-28 06:57:00.861978 | gradNormMedian 0.679873
2022-03-28 06:57:00.861992 | gradNormMin 0.255769
2022-03-28 06:57:00.862012 | gradNormMax 3.37628
2022-03-28 06:57:00.862059 | tdAbsErrAverage 0.176727
2022-03-28 06:57:00.862090 | tdAbsErrStd 0.270876
2022-03-28 06:57:00.862105 | tdAbsErrMedian 0.0628458
2022-03-28 06:57:00.862119 | tdAbsErrMin 0.000104254
2022-03-28 06:57:00.862133 | tdAbsErrMax 3.40443
2022-03-28 06:57:00.862147 | modelRLLossAverage 0
2022-03-28 06:57:00.862161 | modelRLLossStd 0
2022-03-28 06:57:00.862175 | modelRLLossMedian 0
2022-03-28 06:57:00.862189 | modelRLLossMin 0
2022-03-28 06:57:00.862203 | modelRLLossMax 0
2022-03-28 06:57:00.862236 | RewardLossAverage 0.541511
2022-03-28 06:57:00.862278 | RewardLossStd 0.0829435
2022-03-28 06:57:00.862293 | RewardLossMedian 0.543767
2022-03-28 06:57:00.862308 | RewardLossMin 0.246088
2022-03-28 06:57:00.862322 | RewardLossMax 0.854726
2022-03-28 06:57:00.862336 | modelGradNormAverage 0.130482
2022-03-28 06:57:00.862349 | modelGradNormStd 0.0415271
2022-03-28 06:57:00.862363 | modelGradNormMedian 0.125761
2022-03-28 06:57:00.862378 | modelGradNormMin 0.0329402
2022-03-28 06:57:00.862392 | modelGradNormMax 0.456768
2022-03-28 06:57:00.862425 | SPRLossAverage 0.0664854
2022-03-28 06:57:00.862461 | SPRLossStd 0.0181049
2022-03-28 06:57:00.862475 | SPRLossMedian 0.0649353
2022-03-28 06:57:00.862489 | SPRLossMin 0.0227788
2022-03-28 06:57:00.862502 | SPRLossMax 0.165078
2022-03-28 06:57:00.862516 | ModelSPRLossAverage 0.0132971
2022-03-28 06:57:00.862530 | ModelSPRLossStd 0.00362097
2022-03-28 06:57:00.862544 | ModelSPRLossMedian 0.0129871
2022-03-28 06:57:00.862558 | ModelSPRLossMin 0.00455576
2022-03-28 06:57:00.862572 | ModelSPRLossMax 0.0330155
2022-03-28 06:57:00.862613 | ----------------------------- ---------------
2022-03-28 06:57:00.862793 | dqn_kung_fu_master_0 itr #19999 Optimizing over 10000 iterations.
0% [###### ] 100% | ETA: 03:14:22
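
For reference, here is a rough back-of-the-envelope estimate of how long the full run would take at the pace shown above. This is only a sketch: the numbers are copied by hand from the diagnostics at itr #19999, the helper name `projected_train_hours` is made up for illustration, and it assumes the pace stays constant and ignores evaluation time.

```python
# Figures copied by hand from the log above; nothing here calls the project code.
TOTAL_ITERATIONS = 100_000   # "Running 100000 iterations of minibatch RL."
iterations_done = 20_000     # Diagnostics/CumSteps at itr #19999
cum_train_time_s = 26_429.5  # Diagnostics/CumTrainTime at itr #19999
cum_updates = 36_000         # Diagnostics/CumUpdates at itr #19999


def projected_train_hours(total_itrs, done_itrs, elapsed_s):
    """Linearly extrapolate total training time from the pace so far (hypothetical helper)."""
    seconds_per_itr = elapsed_s / done_itrs
    return total_itrs * seconds_per_itr / 3600.0


# Average wall-clock cost of one gradient update so far.
seconds_per_update = cum_train_time_s / cum_updates

total_hours = projected_train_hours(TOTAL_ITERATIONS, iterations_done, cum_train_time_s)
remaining_hours = total_hours - cum_train_time_s / 3600.0

print(f"seconds per update: {seconds_per_update:.2f}")
print(f"projected training time: {total_hours:.1f} h")
print(f"remaining: {remaining_hours:.1f} h")
```

At this pace the 100000 iterations would need roughly 37 hours of training time in total, so the run is about one fifth done after a little over 7 hours.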