[WIP] Compute lp during loss execution #2688

Open
wants to merge 8 commits into
base: gh/vmoens/66/base

[ghstack-poisoned]

pytorch-bot bot commented Jan 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2688

Note: Links to docs will display an error until the docs builds have been completed.

❌ 13 New Failures, 7 Unrelated Failures

As of commit aa372c3 with merge base dc25a55:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Jan 10, 2025 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed)
vmoens added a commit that referenced this pull request Jan 10, 2025
ghstack-source-id: b5b186bb813fff68f4a3d90e7ea3e0c4b013a42c
Pull Request resolved: #2688
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 10, 2025
ghstack-source-id: 87180950cdd33a4d246d1a665d8c9c52aa5b0389
Pull Request resolved: #2688
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 13, 2025
ghstack-source-id: a13b83e361b4cf3bd0c67b320b5b3f29d0fad233
Pull Request resolved: #2688

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}32$. Worsened: $\large\color{#d91a1a}8$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.5683s 0.4667s 2.1425 Ops/s 2.1590 Ops/s $\color{#d91a1a}-0.76\%$
test_transformed 0.7638s 0.6552s 1.5262 Ops/s 1.5464 Ops/s $\color{#d91a1a}-1.31\%$
test_serial 1.5175s 1.4110s 0.7087 Ops/s 0.7032 Ops/s $\color{#35bf28}+0.78\%$
test_parallel 1.3958s 1.2670s 0.7893 Ops/s 0.7940 Ops/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[True-True-True-True-True] 0.3042ms 29.9381μs 33.4022 KOps/s 32.8918 KOps/s $\color{#35bf28}+1.55\%$
test_step_mdp_speed[True-True-True-True-False] 64.9610μs 17.6570μs 56.6349 KOps/s 55.7987 KOps/s $\color{#35bf28}+1.50\%$
test_step_mdp_speed[True-True-True-False-True] 66.3540μs 16.9278μs 59.0745 KOps/s 58.7384 KOps/s $\color{#35bf28}+0.57\%$
test_step_mdp_speed[True-True-True-False-False] 63.1280μs 9.9879μs 100.1213 KOps/s 98.3942 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[True-True-False-True-True] 95.3190μs 32.3290μs 30.9320 KOps/s 30.7117 KOps/s $\color{#35bf28}+0.72\%$
test_step_mdp_speed[True-True-False-True-False] 89.1090μs 19.5383μs 51.1815 KOps/s 49.8218 KOps/s $\color{#35bf28}+2.73\%$
test_step_mdp_speed[True-True-False-False-True] 70.7320μs 18.6283μs 53.6819 KOps/s 51.6054 KOps/s $\color{#35bf28}+4.02\%$
test_step_mdp_speed[True-True-False-False-False] 62.3960μs 11.7756μs 84.9211 KOps/s 83.5934 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[True-False-True-True-True] 96.0700μs 33.8875μs 29.5094 KOps/s 28.9433 KOps/s $\color{#35bf28}+1.96\%$
test_step_mdp_speed[True-False-True-True-False] 69.3000μs 21.6802μs 46.1250 KOps/s 45.4386 KOps/s $\color{#35bf28}+1.51\%$
test_step_mdp_speed[True-False-True-False-True] 50.0730μs 18.6723μs 53.5553 KOps/s 52.1567 KOps/s $\color{#35bf28}+2.68\%$
test_step_mdp_speed[True-False-True-False-False] 56.8270μs 11.8206μs 84.5979 KOps/s 83.4859 KOps/s $\color{#35bf28}+1.33\%$
test_step_mdp_speed[True-False-False-True-True] 71.6540μs 35.8616μs 27.8850 KOps/s 27.5854 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[True-False-False-True-False] 67.7070μs 23.2501μs 43.0106 KOps/s 40.9452 KOps/s $\textbf{\color{#35bf28}+5.04\%}$
test_step_mdp_speed[True-False-False-False-True] 81.5220μs 20.5120μs 48.7521 KOps/s 48.2046 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[True-False-False-False-False] 52.1170μs 13.5163μs 73.9849 KOps/s 70.1094 KOps/s $\textbf{\color{#35bf28}+5.53\%}$
test_step_mdp_speed[False-True-True-True-True] 0.1073ms 33.5728μs 29.7861 KOps/s 28.9920 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[False-True-True-True-False] 67.7870μs 21.5715μs 46.3576 KOps/s 45.5465 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[False-True-True-False-True] 70.1610μs 21.4297μs 46.6643 KOps/s 44.3179 KOps/s $\textbf{\color{#35bf28}+5.29\%}$
test_step_mdp_speed[False-True-True-False-False] 42.1490μs 13.2607μs 75.4107 KOps/s 74.7197 KOps/s $\color{#35bf28}+0.92\%$
test_step_mdp_speed[False-True-False-True-True] 0.1231ms 35.8683μs 27.8798 KOps/s 27.8516 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[False-True-False-True-False] 65.9630μs 23.2393μs 43.0305 KOps/s 41.9965 KOps/s $\color{#35bf28}+2.46\%$
test_step_mdp_speed[False-True-False-False-True] 2.7238ms 23.3140μs 42.8926 KOps/s 41.7864 KOps/s $\color{#35bf28}+2.65\%$
test_step_mdp_speed[False-True-False-False-False] 49.1120μs 15.0571μs 66.4139 KOps/s 65.7372 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[False-False-True-True-True] 92.3330μs 37.6430μs 26.5654 KOps/s 26.1131 KOps/s $\color{#35bf28}+1.73\%$
test_step_mdp_speed[False-False-True-True-False] 57.2870μs 25.4974μs 39.2197 KOps/s 38.6622 KOps/s $\color{#35bf28}+1.44\%$
test_step_mdp_speed[False-False-True-False-True] 0.1212ms 23.4280μs 42.6839 KOps/s 42.7818 KOps/s $\color{#d91a1a}-0.23\%$
test_step_mdp_speed[False-False-True-False-False] 43.2210μs 14.9975μs 66.6776 KOps/s 65.6081 KOps/s $\color{#35bf28}+1.63\%$
test_step_mdp_speed[False-False-False-True-True] 95.7210μs 38.2666μs 26.1324 KOps/s 24.3679 KOps/s $\textbf{\color{#35bf28}+7.24\%}$
test_step_mdp_speed[False-False-False-True-False] 85.7500μs 26.8119μs 37.2969 KOps/s 36.3169 KOps/s $\color{#35bf28}+2.70\%$
test_step_mdp_speed[False-False-False-False-True] 72.9070μs 24.6849μs 40.5106 KOps/s 40.0794 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-False-False-False-False] 80.5010μs 16.8181μs 59.4597 KOps/s 58.9617 KOps/s $\color{#35bf28}+0.84\%$
test_values[generalized_advantage_estimate-True-True] 10.3151ms 10.0358ms 99.6431 Ops/s 99.1284 Ops/s $\color{#35bf28}+0.52\%$
test_values[vec_generalized_advantage_estimate-True-True] 39.0755ms 36.1411ms 27.6694 Ops/s 29.1812 Ops/s $\textbf{\color{#d91a1a}-5.18\%}$
test_values[td0_return_estimate-False-False] 0.2384ms 0.1962ms 5.0976 KOps/s 4.6290 KOps/s $\textbf{\color{#35bf28}+10.12\%}$
test_values[td1_return_estimate-False-False] 26.6469ms 25.0399ms 39.9363 Ops/s 39.7347 Ops/s $\color{#35bf28}+0.51\%$
test_values[vec_td1_return_estimate-False-False] 39.4084ms 36.1960ms 27.6274 Ops/s 29.1816 Ops/s $\textbf{\color{#d91a1a}-5.33\%}$
test_values[td_lambda_return_estimate-True-False] 39.8335ms 35.8325ms 27.9076 Ops/s 27.7952 Ops/s $\color{#35bf28}+0.40\%$
test_values[vec_td_lambda_return_estimate-True-False] 39.7091ms 36.4672ms 27.4219 Ops/s 29.2773 Ops/s $\textbf{\color{#d91a1a}-6.34\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.8512ms 8.5487ms 116.9762 Ops/s 114.8192 Ops/s $\color{#35bf28}+1.88\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3337ms 1.9017ms 525.8425 Ops/s 504.2106 Ops/s $\color{#35bf28}+4.29\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4982ms 0.3682ms 2.7159 KOps/s 2.7251 KOps/s $\color{#d91a1a}-0.34\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 48.2555ms 45.6177ms 21.9213 Ops/s 26.5448 Ops/s $\textbf{\color{#d91a1a}-17.42\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.0356ms 3.1507ms 317.3856 Ops/s 317.3355 Ops/s $\color{#35bf28}+0.02\%$
test_dqn_speed[False-None] 2.2271ms 1.4415ms 693.7113 Ops/s 694.8260 Ops/s $\color{#d91a1a}-0.16\%$
test_dqn_speed[False-backward] 2.6979ms 2.0383ms 490.6140 Ops/s 513.3189 Ops/s $\color{#d91a1a}-4.42\%$
test_dqn_speed[True-None] 1.0176ms 0.4903ms 2.0397 KOps/s 1.9569 KOps/s $\color{#35bf28}+4.23\%$
test_dqn_speed[True-backward] 1.0270ms 0.9642ms 1.0372 KOps/s 1.0211 KOps/s $\color{#35bf28}+1.57\%$
test_dqn_speed[reduce-overhead-None] 0.6604ms 0.4867ms 2.0549 KOps/s 2.0302 KOps/s $\color{#35bf28}+1.22\%$
test_dqn_speed[reduce-overhead-backward] 1.0052ms 0.9425ms 1.0610 KOps/s 1.0458 KOps/s $\color{#35bf28}+1.46\%$
test_ddpg_speed[False-None] 3.5332ms 2.9878ms 334.6985 Ops/s 326.4027 Ops/s $\color{#35bf28}+2.54\%$
test_ddpg_speed[False-backward] 4.6656ms 4.3248ms 231.2242 Ops/s 221.8204 Ops/s $\color{#35bf28}+4.24\%$
test_ddpg_speed[True-None] 1.3661ms 1.0343ms 966.8613 Ops/s 972.9051 Ops/s $\color{#d91a1a}-0.62\%$
test_ddpg_speed[True-backward] 2.0310ms 1.9451ms 514.1238 Ops/s 491.1071 Ops/s $\color{#35bf28}+4.69\%$
test_ddpg_speed[reduce-overhead-None] 1.7012ms 1.0248ms 975.8132 Ops/s 944.5171 Ops/s $\color{#35bf28}+3.31\%$
test_ddpg_speed[reduce-overhead-backward] 2.2729ms 2.0104ms 497.4190 Ops/s 449.8643 Ops/s $\textbf{\color{#35bf28}+10.57\%}$
test_sac_speed[False-None] 10.5226ms 8.5616ms 116.8008 Ops/s 104.5619 Ops/s $\textbf{\color{#35bf28}+11.70\%}$
test_sac_speed[False-backward] 12.6369ms 11.6795ms 85.6199 Ops/s 78.6643 Ops/s $\textbf{\color{#35bf28}+8.84\%}$
test_sac_speed[True-None] 2.4641ms 1.8961ms 527.4081 Ops/s 496.7155 Ops/s $\textbf{\color{#35bf28}+6.18\%}$
test_sac_speed[True-backward] 3.8607ms 3.7446ms 267.0510 Ops/s 235.2714 Ops/s $\textbf{\color{#35bf28}+13.51\%}$
test_sac_speed[reduce-overhead-None] 2.1969ms 1.9211ms 520.5234 Ops/s 491.3643 Ops/s $\textbf{\color{#35bf28}+5.93\%}$
test_sac_speed[reduce-overhead-backward] 4.0335ms 3.7652ms 265.5917 Ops/s 256.6560 Ops/s $\color{#35bf28}+3.48\%$
test_redq_speed[False-None] 15.6331ms 13.7961ms 72.4842 Ops/s 70.5586 Ops/s $\color{#35bf28}+2.73\%$
test_redq_speed[False-backward] 0.2760s 28.6649ms 34.8859 Ops/s 42.2512 Ops/s $\textbf{\color{#d91a1a}-17.43\%}$
test_redq_speed[True-None] 6.9585ms 5.8415ms 171.1878 Ops/s 170.3578 Ops/s $\color{#35bf28}+0.49\%$
test_redq_speed[True-backward] 13.8544ms 13.1163ms 76.2411 Ops/s 72.6468 Ops/s $\color{#35bf28}+4.95\%$
test_redq_speed[reduce-overhead-None] 6.7188ms 5.7605ms 173.5951 Ops/s 160.1640 Ops/s $\textbf{\color{#35bf28}+8.39\%}$
test_redq_speed[reduce-overhead-backward] 13.9448ms 13.4910ms 74.1237 Ops/s 72.6169 Ops/s $\color{#35bf28}+2.08\%$
test_redq_deprec_speed[False-None] 16.8818ms 14.6949ms 68.0510 Ops/s 64.5444 Ops/s $\textbf{\color{#35bf28}+5.43\%}$
test_redq_deprec_speed[False-backward] 23.3880ms 20.6876ms 48.3381 Ops/s 46.1395 Ops/s $\color{#35bf28}+4.77\%$
test_redq_deprec_speed[True-None] 5.3335ms 4.1303ms 242.1126 Ops/s 216.6981 Ops/s $\textbf{\color{#35bf28}+11.73\%}$
test_redq_deprec_speed[True-backward] 11.4238ms 9.8331ms 101.6973 Ops/s 100.9693 Ops/s $\color{#35bf28}+0.72\%$
test_redq_deprec_speed[reduce-overhead-None] 5.2895ms 4.4258ms 225.9463 Ops/s 207.0111 Ops/s $\textbf{\color{#35bf28}+9.15\%}$
test_redq_deprec_speed[reduce-overhead-backward] 10.6343ms 9.5719ms 104.4730 Ops/s 98.3853 Ops/s $\textbf{\color{#35bf28}+6.19\%}$
test_td3_speed[False-None] 11.6211ms 9.1900ms 108.8144 Ops/s 100.8205 Ops/s $\textbf{\color{#35bf28}+7.93\%}$
test_td3_speed[False-backward] 12.5923ms 11.3112ms 88.4081 Ops/s 77.2295 Ops/s $\textbf{\color{#35bf28}+14.47\%}$
test_td3_speed[True-None] 2.1411ms 1.8030ms 554.6418 Ops/s 500.2925 Ops/s $\textbf{\color{#35bf28}+10.86\%}$
test_td3_speed[True-backward] 4.1696ms 3.8885ms 257.1692 Ops/s 241.2584 Ops/s $\textbf{\color{#35bf28}+6.59\%}$
test_td3_speed[reduce-overhead-None] 2.3690ms 1.8753ms 533.2476 Ops/s 512.5477 Ops/s $\color{#35bf28}+4.04\%$
test_td3_speed[reduce-overhead-backward] 5.1919ms 3.8129ms 262.2709 Ops/s 247.1239 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_cql_speed[False-None] 40.9141ms 38.4154ms 26.0312 Ops/s 25.0921 Ops/s $\color{#35bf28}+3.74\%$
test_cql_speed[False-backward] 52.5261ms 48.5637ms 20.5915 Ops/s 19.6808 Ops/s $\color{#35bf28}+4.63\%$
test_cql_speed[True-None] 17.6945ms 16.5703ms 60.3489 Ops/s 60.9667 Ops/s $\color{#d91a1a}-1.01\%$
test_cql_speed[True-backward] 30.5667ms 24.2159ms 41.2952 Ops/s 41.9828 Ops/s $\color{#d91a1a}-1.64\%$
test_cql_speed[reduce-overhead-None] 18.0729ms 16.6581ms 60.0309 Ops/s 59.1402 Ops/s $\color{#35bf28}+1.51\%$
test_cql_speed[reduce-overhead-backward] 25.8274ms 24.4515ms 40.8974 Ops/s 40.5325 Ops/s $\color{#35bf28}+0.90\%$
test_a2c_speed[False-None] 8.8314ms 8.1588ms 122.5671 Ops/s 119.2688 Ops/s $\color{#35bf28}+2.77\%$
test_a2c_speed[False-backward] 17.5778ms 15.9769ms 62.5904 Ops/s 60.4876 Ops/s $\color{#35bf28}+3.48\%$
test_a2c_speed[True-None] 5.4286ms 4.6722ms 214.0304 Ops/s 206.0629 Ops/s $\color{#35bf28}+3.87\%$
test_a2c_speed[True-backward] 12.7638ms 12.2299ms 81.7667 Ops/s 84.9165 Ops/s $\color{#d91a1a}-3.71\%$
test_a2c_speed[reduce-overhead-None] 5.3966ms 4.7875ms 208.8761 Ops/s 208.9088 Ops/s $\color{#d91a1a}-0.02\%$
test_a2c_speed[reduce-overhead-backward] 15.5940ms 11.8959ms 84.0629 Ops/s 81.9772 Ops/s $\color{#35bf28}+2.54\%$
test_ppo_speed[False-None] 9.3117ms 8.4093ms 118.9164 Ops/s 119.6681 Ops/s $\color{#d91a1a}-0.63\%$
test_ppo_speed[False-backward] 17.6684ms 16.7026ms 59.8710 Ops/s 59.8465 Ops/s $\color{#35bf28}+0.04\%$
test_ppo_speed[True-None] 4.4510ms 4.0305ms 248.1086 Ops/s 232.5087 Ops/s $\textbf{\color{#35bf28}+6.71\%}$
test_ppo_speed[True-backward] 11.1975ms 10.5603ms 94.6940 Ops/s 90.8406 Ops/s $\color{#35bf28}+4.24\%$
test_ppo_speed[reduce-overhead-None] 5.2671ms 4.0035ms 249.7835 Ops/s 250.0515 Ops/s $\color{#d91a1a}-0.11\%$
test_ppo_speed[reduce-overhead-backward] 11.0163ms 10.2121ms 97.9232 Ops/s 93.1629 Ops/s $\textbf{\color{#35bf28}+5.11\%}$
test_reinforce_speed[False-None] 8.6849ms 7.0731ms 141.3808 Ops/s 132.5009 Ops/s $\textbf{\color{#35bf28}+6.70\%}$
test_reinforce_speed[False-backward] 11.0579ms 10.5612ms 94.6859 Ops/s 88.4847 Ops/s $\textbf{\color{#35bf28}+7.01\%}$
test_reinforce_speed[True-None] 3.5352ms 2.9102ms 343.6245 Ops/s 297.2317 Ops/s $\textbf{\color{#35bf28}+15.61\%}$
test_reinforce_speed[True-backward] 9.8991ms 9.3699ms 106.7251 Ops/s 106.2533 Ops/s $\color{#35bf28}+0.44\%$
test_reinforce_speed[reduce-overhead-None] 3.8500ms 3.1144ms 321.0885 Ops/s 330.9755 Ops/s $\color{#d91a1a}-2.99\%$
test_reinforce_speed[reduce-overhead-backward] 10.0814ms 9.5430ms 104.7886 Ops/s 105.7233 Ops/s $\color{#d91a1a}-0.88\%$
test_iql_speed[False-None] 38.4182ms 34.3435ms 29.1176 Ops/s 27.9079 Ops/s $\color{#35bf28}+4.33\%$
test_iql_speed[False-backward] 56.1895ms 47.9700ms 20.8464 Ops/s 20.1465 Ops/s $\color{#35bf28}+3.47\%$
test_iql_speed[True-None] 13.6931ms 11.3299ms 88.2624 Ops/s 84.7020 Ops/s $\color{#35bf28}+4.20\%$
test_iql_speed[True-backward] 23.9518ms 22.7533ms 43.9497 Ops/s 42.2308 Ops/s $\color{#35bf28}+4.07\%$
test_iql_speed[reduce-overhead-None] 12.0812ms 11.2228ms 89.1042 Ops/s 84.4396 Ops/s $\textbf{\color{#35bf28}+5.52\%}$
test_iql_speed[reduce-overhead-backward] 24.3652ms 22.9338ms 43.6038 Ops/s 41.6385 Ops/s $\color{#35bf28}+4.72\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0393ms 5.2483ms 190.5394 Ops/s 170.1059 Ops/s $\textbf{\color{#35bf28}+12.01\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8849ms 0.5424ms 1.8436 KOps/s 1.7940 KOps/s $\color{#35bf28}+2.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8655ms 0.5237ms 1.9095 KOps/s 1.8762 KOps/s $\color{#35bf28}+1.78\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.1550ms 5.1722ms 193.3399 Ops/s 185.5267 Ops/s $\color{#35bf28}+4.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3827ms 0.5360ms 1.8658 KOps/s 1.7885 KOps/s $\color{#35bf28}+4.32\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2.1079ms 0.5115ms 1.9549 KOps/s 1.9024 KOps/s $\color{#35bf28}+2.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4768ms 1.6988ms 588.6653 Ops/s 575.0904 Ops/s $\color{#35bf28}+2.36\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.5195ms 1.6073ms 622.1621 Ops/s 611.9915 Ops/s $\color{#35bf28}+1.66\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.5344ms 5.5675ms 179.6141 Ops/s 172.1915 Ops/s $\color{#35bf28}+4.31\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.7269ms 0.6919ms 1.4452 KOps/s 1.4478 KOps/s $\color{#d91a1a}-0.18\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8891ms 0.6522ms 1.5334 KOps/s 1.5071 KOps/s $\color{#35bf28}+1.74\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8957ms 5.3327ms 187.5231 Ops/s 187.2345 Ops/s $\color{#35bf28}+0.15\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 4.1492ms 0.5498ms 1.8188 KOps/s 1.7537 KOps/s $\color{#35bf28}+3.71\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8414ms 0.5348ms 1.8697 KOps/s 1.8616 KOps/s $\color{#35bf28}+0.44\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.7204ms 4.9842ms 200.6332 Ops/s 181.9578 Ops/s $\textbf{\color{#35bf28}+10.26\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.3301ms 0.5484ms 1.8234 KOps/s 1.7860 KOps/s $\color{#35bf28}+2.09\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7529ms 0.5116ms 1.9548 KOps/s 1.9484 KOps/s $\color{#35bf28}+0.33\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5832ms 5.1910ms 192.6426 Ops/s 172.5916 Ops/s $\textbf{\color{#35bf28}+11.62\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0914ms 0.6882ms 1.4531 KOps/s 1.4547 KOps/s $\color{#d91a1a}-0.11\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.8520ms 0.6618ms 1.5111 KOps/s 1.4519 KOps/s $\color{#35bf28}+4.08\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4643s 13.6017ms 73.5205 Ops/s 202.2506 Ops/s $\textbf{\color{#d91a1a}-63.65\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.7075ms 2.3225ms 430.5613 Ops/s 422.0243 Ops/s $\color{#35bf28}+2.02\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.5727ms 1.4543ms 687.6158 Ops/s 665.3696 Ops/s $\color{#35bf28}+3.34\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.2803ms 4.4910ms 222.6664 Ops/s 209.3775 Ops/s $\textbf{\color{#35bf28}+6.35\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.1347ms 2.4931ms 401.1083 Ops/s 414.3954 Ops/s $\color{#d91a1a}-3.21\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.7778ms 1.3642ms 733.0132 Ops/s 803.2655 Ops/s $\textbf{\color{#d91a1a}-8.75\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4549s 13.5282ms 73.9195 Ops/s 215.0498 Ops/s $\textbf{\color{#d91a1a}-65.63\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 16.7720ms 2.7820ms 359.4508 Ops/s 378.0501 Ops/s $\color{#d91a1a}-4.92\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.9829ms 1.4999ms 666.7259 Ops/s 643.5892 Ops/s $\color{#35bf28}+3.59\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 15.1097ms 13.2373ms 75.5439 Ops/s 71.5683 Ops/s $\textbf{\color{#35bf28}+5.56\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.9184ms 15.4696ms 64.6429 Ops/s 62.5719 Ops/s $\color{#35bf28}+3.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.7272ms 22.2775ms 44.8884 Ops/s 43.6612 Ops/s $\color{#35bf28}+2.81\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.2832ms 15.6668ms 63.8291 Ops/s 62.7660 Ops/s $\color{#35bf28}+1.69\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 24.1072ms 22.1385ms 45.1702 Ops/s 44.1851 Ops/s $\color{#35bf28}+2.23\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.8193ms 17.0660ms 58.5960 Ops/s 57.9382 Ops/s $\color{#35bf28}+1.14\%$
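
The Change column above appears to be the relative difference between the Ops value measured on this PR and the Ops value on repo HEAD. The bold entries in the CPU rows coincide with changes of at least 5% in magnitude (32 positive and 8 negative, matching the Improved and Worsened totals). A minimal Python sketch of that reading follows. The function names and the 5% threshold are assumptions inferred from the table, not taken from the benchmark tooling.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of PR throughput relative to repo HEAD.

    Both values must use the same unit (Ops/s or KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change. The 5% cutoff is an assumption inferred from the bold rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_simple": (2.1425, 2.1590),
    "test_step_mdp_speed[True-False-False-True-False]": (43.0106, 40.9452),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (73.5205, 202.2506),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Some rows mix units between the two Ops columns (KOps/s against Ops/s), so values would need to be normalized to a common unit before being passed to `relative_change`.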

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}11$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.8329s 0.7357s 1.3592 Ops/s 1.3514 Ops/s $\color{#35bf28}+0.58\%$
test_transformed 0.9581s 0.9569s 1.0450 Ops/s 1.0090 Ops/s $\color{#35bf28}+3.57\%$
test_serial 2.1806s 2.1245s 0.4707 Ops/s 0.4633 Ops/s $\color{#35bf28}+1.59\%$
test_parallel 1.8470s 1.8038s 0.5544 Ops/s 0.5474 Ops/s $\color{#35bf28}+1.28\%$
test_step_mdp_speed[True-True-True-True-True] 0.1728ms 40.2871μs 24.8219 KOps/s 24.8095 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[True-True-True-True-False] 49.0610μs 23.7620μs 42.0840 KOps/s 42.5606 KOps/s $\color{#d91a1a}-1.12\%$
test_step_mdp_speed[True-True-True-False-True] 48.9010μs 22.4738μs 44.4962 KOps/s 44.0020 KOps/s $\color{#35bf28}+1.12\%$
test_step_mdp_speed[True-True-True-False-False] 45.0100μs 13.1084μs 76.2868 KOps/s 75.7870 KOps/s $\color{#35bf28}+0.66\%$
test_step_mdp_speed[True-True-False-True-True] 0.1015ms 42.6110μs 23.4681 KOps/s 23.0783 KOps/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[True-True-False-True-False] 87.1810μs 26.0182μs 38.4347 KOps/s 38.3896 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-True-False-False-True] 0.1003ms 25.2506μs 39.6030 KOps/s 39.8175 KOps/s $\color{#d91a1a}-0.54\%$
test_step_mdp_speed[True-True-False-False-False] 52.6110μs 15.3266μs 65.2460 KOps/s 64.3073 KOps/s $\color{#35bf28}+1.46\%$
test_step_mdp_speed[True-False-True-True-True] 81.7210μs 45.1003μs 22.1728 KOps/s 21.9100 KOps/s $\color{#35bf28}+1.20\%$
test_step_mdp_speed[True-False-True-True-False] 62.3410μs 28.4236μs 35.1820 KOps/s 35.6454 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[True-False-True-False-True] 96.3810μs 24.6314μs 40.5986 KOps/s 40.3297 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-False-True-False-False] 42.8900μs 15.3744μs 65.0433 KOps/s 64.4063 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[True-False-False-True-True] 0.1404ms 47.7151μs 20.9577 KOps/s 21.3208 KOps/s $\color{#d91a1a}-1.70\%$
test_step_mdp_speed[True-False-False-True-False] 66.3010μs 30.2047μs 33.1074 KOps/s 33.0735 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[True-False-False-False-True] 55.2900μs 26.7495μs 37.3838 KOps/s 37.9192 KOps/s $\color{#d91a1a}-1.41\%$
test_step_mdp_speed[True-False-False-False-False] 58.8310μs 17.4593μs 57.2761 KOps/s 56.7731 KOps/s $\color{#35bf28}+0.89\%$
test_step_mdp_speed[False-True-True-True-True] 0.1120ms 45.4274μs 22.0131 KOps/s 22.0201 KOps/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[False-True-True-True-False] 54.0700μs 28.2614μs 35.3839 KOps/s 35.2619 KOps/s $\color{#35bf28}+0.35\%$
test_step_mdp_speed[False-True-True-False-True] 58.2610μs 28.9286μs 34.5678 KOps/s 34.8431 KOps/s $\color{#d91a1a}-0.79\%$
test_step_mdp_speed[False-True-True-False-False] 43.9400μs 17.0925μs 58.5051 KOps/s 58.3852 KOps/s $\color{#35bf28}+0.21\%$
test_step_mdp_speed[False-True-False-True-True] 74.0210μs 47.7891μs 20.9253 KOps/s 21.0874 KOps/s $\color{#d91a1a}-0.77\%$
test_step_mdp_speed[False-True-False-True-False] 59.1310μs 30.7783μs 32.4905 KOps/s 33.1376 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[False-True-False-False-True] 3.3391ms 31.5371μs 31.7087 KOps/s 32.1903 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[False-True-False-False-False] 56.2500μs 19.6299μs 50.9426 KOps/s 51.6407 KOps/s $\color{#d91a1a}-1.35\%$
test_step_mdp_speed[False-False-True-True-True] 80.4810μs 50.7052μs 19.7219 KOps/s 19.9521 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[False-False-True-True-False] 66.7600μs 33.3919μs 29.9473 KOps/s 30.2834 KOps/s $\color{#d91a1a}-1.11\%$
test_step_mdp_speed[False-False-True-False-True] 98.2310μs 31.3277μs 31.9207 KOps/s 31.9582 KOps/s $\color{#d91a1a}-0.12\%$
test_step_mdp_speed[False-False-True-False-False] 48.8400μs 19.5439μs 51.1668 KOps/s 51.3686 KOps/s $\color{#d91a1a}-0.39\%$
test_step_mdp_speed[False-False-False-True-True] 82.4900μs 51.3106μs 19.4892 KOps/s 19.4881 KOps/s $+0.01\%$
test_step_mdp_speed[False-False-False-True-False] 63.7310μs 34.9892μs 28.5802 KOps/s 28.5693 KOps/s $\color{#35bf28}+0.04\%$
test_step_mdp_speed[False-False-False-False-True] 65.3110μs 33.0010μs 30.3021 KOps/s 30.3103 KOps/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[False-False-False-False-False] 85.7310μs 20.8345μs 47.9972 KOps/s 46.3937 KOps/s $\color{#35bf28}+3.46\%$
test_values[generalized_advantage_estimate-True-True] 25.1545ms 24.5814ms 40.6811 Ops/s 40.8974 Ops/s $\color{#d91a1a}-0.53\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1048s 2.9912ms 334.3176 Ops/s 330.8086 Ops/s $\color{#35bf28}+1.06\%$
test_values[td0_return_estimate-False-False] 0.1048ms 80.1365μs 12.4787 KOps/s 12.4665 KOps/s $\color{#35bf28}+0.10\%$
test_values[td1_return_estimate-False-False] 58.0766ms 55.3281ms 18.0740 Ops/s 18.1755 Ops/s $\color{#d91a1a}-0.56\%$
test_values[vec_td1_return_estimate-False-False] 1.3588ms 1.0802ms 925.7738 Ops/s 925.2186 Ops/s $\color{#35bf28}+0.06\%$
test_values[td_lambda_return_estimate-True-False] 93.7159ms 90.6580ms 11.0305 Ops/s 11.2480 Ops/s $\color{#d91a1a}-1.93\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3761ms 1.0749ms 930.2812 Ops/s 925.1335 Ops/s $\color{#35bf28}+0.56\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.8699ms 24.4972ms 40.8210 Ops/s 39.7846 Ops/s $\color{#35bf28}+2.60\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0778ms 0.7557ms 1.3233 KOps/s 1.3257 KOps/s $\color{#d91a1a}-0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7679ms 0.6707ms 1.4910 KOps/s 1.4939 KOps/s $\color{#d91a1a}-0.19\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5318ms 1.4762ms 677.4088 Ops/s 677.2161 Ops/s $\color{#35bf28}+0.03\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7302ms 0.6824ms 1.4655 KOps/s 1.4520 KOps/s $\color{#35bf28}+0.93\%$
test_dqn_speed[False-None] 6.9919ms 1.5242ms 656.0935 Ops/s 663.0045 Ops/s $\color{#d91a1a}-1.04\%$
test_dqn_speed[False-backward] 2.2368ms 2.1295ms 469.5890 Ops/s 469.3275 Ops/s $\color{#35bf28}+0.06\%$
test_dqn_speed[True-None] 0.5980ms 0.5437ms 1.8394 KOps/s 1.7969 KOps/s $\color{#35bf28}+2.36\%$
test_dqn_speed[True-backward] 1.2552ms 1.2005ms 832.9696 Ops/s 891.9257 Ops/s $\textbf{\color{#d91a1a}-6.61\%}$
test_dqn_speed[reduce-overhead-None] 0.6629ms 0.5873ms 1.7028 KOps/s 1.6636 KOps/s $\color{#35bf28}+2.35\%$
test_dqn_speed[reduce-overhead-backward] 1.1029ms 1.0641ms 939.7471 Ops/s 1.0268 KOps/s $\textbf{\color{#d91a1a}-8.48\%}$
test_ddpg_speed[False-None] 3.2913ms 2.9066ms 344.0448 Ops/s 336.9320 Ops/s $\color{#35bf28}+2.11\%$
test_ddpg_speed[False-backward] 4.8084ms 4.3402ms 230.4064 Ops/s 231.9354 Ops/s $\color{#d91a1a}-0.66\%$
test_ddpg_speed[True-None] 1.1288ms 1.0747ms 930.5212 Ops/s 892.4455 Ops/s $\color{#35bf28}+4.27\%$
test_ddpg_speed[True-backward] 2.3353ms 2.2988ms 435.0010 Ops/s 430.6939 Ops/s $\color{#35bf28}+1.00\%$
test_ddpg_speed[reduce-overhead-None] 1.1968ms 1.0939ms 914.2018 Ops/s 907.1371 Ops/s $\color{#35bf28}+0.78\%$
test_ddpg_speed[reduce-overhead-backward] 1.8135ms 1.7695ms 565.1307 Ops/s 556.5688 Ops/s $\color{#35bf28}+1.54\%$
test_sac_speed[False-None] 8.4202ms 8.0070ms 124.8910 Ops/s 123.0243 Ops/s $\color{#35bf28}+1.52\%$
test_sac_speed[False-backward] 11.9578ms 11.4055ms 87.6773 Ops/s 87.8664 Ops/s $\color{#d91a1a}-0.22\%$
test_sac_speed[True-None] 2.1550ms 1.5374ms 650.4356 Ops/s 648.6400 Ops/s $\color{#35bf28}+0.28\%$
test_sac_speed[True-backward] 3.4726ms 3.4023ms 293.9150 Ops/s 286.2596 Ops/s $\color{#35bf28}+2.67\%$
test_sac_speed[reduce-overhead-None] 23.1967ms 12.7044ms 78.7132 Ops/s 78.1599 Ops/s $\color{#35bf28}+0.71\%$
test_sac_speed[reduce-overhead-backward] 1.6081ms 1.5179ms 658.7847 Ops/s 723.7301 Ops/s $\textbf{\color{#d91a1a}-8.97\%}$
test_redq_speed[False-None] 8.2071ms 7.4949ms 133.4239 Ops/s 131.8764 Ops/s $\color{#35bf28}+1.17\%$
test_redq_speed[False-backward] 12.5592ms 11.6369ms 85.9334 Ops/s 87.7674 Ops/s $\color{#d91a1a}-2.09\%$
test_redq_speed[True-None] 2.1794ms 2.0164ms 495.9388 Ops/s 504.1150 Ops/s $\color{#d91a1a}-1.62\%$
test_redq_speed[True-backward] 3.8856ms 3.8113ms 262.3793 Ops/s 262.8152 Ops/s $\color{#d91a1a}-0.17\%$
test_redq_speed[reduce-overhead-None] 2.1322ms 1.9864ms 503.4252 Ops/s 484.3986 Ops/s $\color{#35bf28}+3.93\%$
test_redq_speed[reduce-overhead-backward] 3.6847ms 3.6162ms 276.5317 Ops/s 263.1471 Ops/s $\textbf{\color{#35bf28}+5.09\%}$
test_redq_deprec_speed[False-None] 9.4112ms 9.0159ms 110.9153 Ops/s 109.3108 Ops/s $\color{#35bf28}+1.47\%$
test_redq_deprec_speed[False-backward] 12.4453ms 11.9986ms 83.3431 Ops/s 81.7775 Ops/s $\color{#35bf28}+1.91\%$
test_redq_deprec_speed[True-None] 2.3807ms 2.3225ms 430.5634 Ops/s 424.5936 Ops/s $\color{#35bf28}+1.41\%$
test_redq_deprec_speed[True-backward] 4.2570ms 4.1498ms 240.9777 Ops/s 249.2543 Ops/s $\color{#d91a1a}-3.32\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4453ms 2.3446ms 426.5157 Ops/s 425.0508 Ops/s $\color{#35bf28}+0.34\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.5701ms 4.1421ms 241.4254 Ops/s 250.7867 Ops/s $\color{#d91a1a}-3.73\%$
test_td3_speed[False-None] 35.1088ms 8.1945ms 122.0326 Ops/s 123.1774 Ops/s $\color{#d91a1a}-0.93\%$
test_td3_speed[False-backward] 10.9824ms 10.4834ms 95.3888 Ops/s 94.2559 Ops/s $\color{#35bf28}+1.20\%$
test_td3_speed[True-None] 1.6430ms 1.5939ms 627.3831 Ops/s 614.5235 Ops/s $\color{#35bf28}+2.09\%$
test_td3_speed[True-backward] 3.4544ms 3.2845ms 304.4573 Ops/s 301.1798 Ops/s $\color{#35bf28}+1.09\%$
test_td3_speed[reduce-overhead-None] 60.2577ms 26.5757ms 37.6284 Ops/s 37.7156 Ops/s $\color{#d91a1a}-0.23\%$
test_td3_speed[reduce-overhead-backward] 1.5043ms 1.4559ms 686.8551 Ops/s 681.5168 Ops/s $\color{#35bf28}+0.78\%$
test_cql_speed[False-None] 17.3778ms 16.7744ms 59.6147 Ops/s 59.0456 Ops/s $\color{#35bf28}+0.96\%$
test_cql_speed[False-backward] 23.0070ms 22.2035ms 45.0380 Ops/s 44.5436 Ops/s $\color{#35bf28}+1.11\%$
test_cql_speed[True-None] 3.2133ms 3.0709ms 325.6344 Ops/s 341.3585 Ops/s $\color{#d91a1a}-4.61\%$
test_cql_speed[True-backward] 5.5331ms 5.0775ms 196.9489 Ops/s 196.4098 Ops/s $\color{#35bf28}+0.27\%$
test_cql_speed[reduce-overhead-None] 0.3684s 15.1434ms 66.0353 Ops/s 74.9363 Ops/s $\textbf{\color{#d91a1a}-11.88\%}$
test_cql_speed[reduce-overhead-backward] 1.5850ms 1.5286ms 654.1916 Ops/s 584.0719 Ops/s $\textbf{\color{#35bf28}+12.01\%}$
test_a2c_speed[False-None] 3.2995ms 3.2124ms 311.2971 Ops/s 308.4408 Ops/s $\color{#35bf28}+0.93\%$
test_a2c_speed[False-backward] 6.7006ms 6.0957ms 164.0512 Ops/s 156.5191 Ops/s $\color{#35bf28}+4.81\%$
test_a2c_speed[True-None] 1.1117ms 1.0058ms 994.1853 Ops/s 958.4140 Ops/s $\color{#35bf28}+3.73\%$
test_a2c_speed[True-backward] 2.8728ms 2.7806ms 359.6333 Ops/s 356.3391 Ops/s $\color{#35bf28}+0.92\%$
test_a2c_speed[reduce-overhead-None] 21.6854ms 11.7186ms 85.3342 Ops/s 85.3548 Ops/s $\color{#d91a1a}-0.02\%$
test_a2c_speed[reduce-overhead-backward] 1.2044ms 1.1211ms 892.0024 Ops/s 862.3936 Ops/s $\color{#35bf28}+3.43\%$
test_ppo_speed[False-None] 3.9513ms 3.7696ms 265.2819 Ops/s 263.5718 Ops/s $\color{#35bf28}+0.65\%$
test_ppo_speed[False-backward] 7.6030ms 7.1419ms 140.0183 Ops/s 139.6874 Ops/s $\color{#35bf28}+0.24\%$
test_ppo_speed[True-None] 1.0709ms 0.9797ms 1.0207 KOps/s 1.0007 KOps/s $\color{#35bf28}+2.00\%$
test_ppo_speed[True-backward] 2.7893ms 2.6478ms 377.6654 Ops/s 392.3937 Ops/s $\color{#d91a1a}-3.75\%$
test_ppo_speed[reduce-overhead-None] 0.6028ms 0.5547ms 1.8027 KOps/s 1.8397 KOps/s $\color{#d91a1a}-2.01\%$
test_ppo_speed[reduce-overhead-backward] 1.0201ms 0.9759ms 1.0247 KOps/s 841.7409 Ops/s $\textbf{\color{#35bf28}+21.74\%}$
test_reinforce_speed[False-None] 2.3752ms 2.2757ms 439.4334 Ops/s 434.8111 Ops/s $\color{#35bf28}+1.06\%$
test_reinforce_speed[False-backward] 3.7812ms 3.2743ms 305.4093 Ops/s 292.8532 Ops/s $\color{#35bf28}+4.29\%$
test_reinforce_speed[True-None] 0.8973ms 0.8301ms 1.2046 KOps/s 1.1334 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_reinforce_speed[True-backward] 2.4526ms 2.3759ms 420.9011 Ops/s 387.8985 Ops/s $\textbf{\color{#35bf28}+8.51\%}$
test_reinforce_speed[reduce-overhead-None] 0.3016s 12.3457ms 81.0002 Ops/s 91.4550 Ops/s $\textbf{\color{#d91a1a}-11.43\%}$
test_reinforce_speed[reduce-overhead-backward] 1.0567ms 1.0300ms 970.8296 Ops/s 846.3628 Ops/s $\textbf{\color{#35bf28}+14.71\%}$
test_iql_speed[False-None] 9.8822ms 9.3593ms 106.8451 Ops/s 107.4539 Ops/s $\color{#d91a1a}-0.57\%$
test_iql_speed[False-backward] 13.8336ms 13.0357ms 76.7121 Ops/s 75.1019 Ops/s $\color{#35bf28}+2.14\%$
test_iql_speed[True-None] 1.8963ms 1.7847ms 560.3169 Ops/s 546.0630 Ops/s $\color{#35bf28}+2.61\%$
test_iql_speed[True-backward] 4.4371ms 4.2822ms 233.5249 Ops/s 235.9240 Ops/s $\color{#d91a1a}-1.02\%$
test_iql_speed[reduce-overhead-None] 20.7705ms 11.8592ms 84.3229 Ops/s 86.7590 Ops/s $\color{#d91a1a}-2.81\%$
test_iql_speed[reduce-overhead-backward] 1.5302ms 1.4410ms 693.9855 Ops/s 689.4828 Ops/s $\color{#35bf28}+0.65\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.1529ms 6.5068ms 153.6855 Ops/s 151.7871 Ops/s $\color{#35bf28}+1.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5119ms 0.2827ms 3.5373 KOps/s 2.6400 KOps/s $\textbf{\color{#35bf28}+33.99\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4484ms 0.2528ms 3.9553 KOps/s 3.3340 KOps/s $\textbf{\color{#35bf28}+18.64\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4477ms 6.2198ms 160.7767 Ops/s 160.5251 Ops/s $\color{#35bf28}+0.16\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.4142ms 0.3221ms 3.1051 KOps/s 3.7450 KOps/s $\textbf{\color{#d91a1a}-17.09\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5667ms 0.3038ms 3.2911 KOps/s 4.0906 KOps/s $\textbf{\color{#d91a1a}-19.54\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6268ms 1.3147ms 760.6253 Ops/s 771.4100 Ops/s $\color{#d91a1a}-1.40\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5012ms 1.2414ms 805.5703 Ops/s 817.1148 Ops/s $\color{#d91a1a}-1.41\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6416ms 6.4469ms 155.1131 Ops/s 154.1467 Ops/s $\color{#35bf28}+0.63\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8737ms 0.5080ms 1.9684 KOps/s 2.3232 KOps/s $\textbf{\color{#d91a1a}-15.27\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7864ms 0.3992ms 2.5052 KOps/s 2.2737 KOps/s $\textbf{\color{#35bf28}+10.18\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3829ms 6.2428ms 160.1840 Ops/s 159.0915 Ops/s $\color{#35bf28}+0.69\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6889ms 0.3524ms 2.8376 KOps/s 3.1009 KOps/s $\textbf{\color{#d91a1a}-8.49\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6188ms 0.3550ms 2.8170 KOps/s 3.2634 KOps/s $\textbf{\color{#d91a1a}-13.68\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4951ms 6.1663ms 162.1723 Ops/s 159.9269 Ops/s $\color{#35bf28}+1.40\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3574ms 0.2725ms 3.6695 KOps/s 3.2473 KOps/s $\textbf{\color{#35bf28}+13.00\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5196ms 0.2858ms 3.4985 KOps/s 3.1155 KOps/s $\textbf{\color{#35bf28}+12.29\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4808ms 6.3710ms 156.9610 Ops/s 155.5120 Ops/s $\color{#35bf28}+0.93\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7259ms 0.4549ms 2.1983 KOps/s 2.1799 KOps/s $\color{#35bf28}+0.84\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6689ms 0.4376ms 2.2852 KOps/s 2.0794 KOps/s $\textbf{\color{#35bf28}+9.90\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1675ms 5.5302ms 180.8259 Ops/s 182.1530 Ops/s $\color{#d91a1a}-0.73\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.0093ms 2.0399ms 490.2257 Ops/s 427.9972 Ops/s $\textbf{\color{#35bf28}+14.54\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.6889ms 1.2356ms 809.2940 Ops/s 850.8164 Ops/s $\color{#d91a1a}-4.88\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.5356ms 5.4877ms 182.2268 Ops/s 182.3106 Ops/s $\color{#d91a1a}-0.05\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.8322ms 2.0317ms 492.2079 Ops/s 450.7189 Ops/s $\textbf{\color{#35bf28}+9.21\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.9797ms 0.9366ms 1.0676 KOps/s 767.6317 Ops/s $\textbf{\color{#35bf28}+39.08\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5052s 15.7819ms 63.3637 Ops/s 32.4962 Ops/s $\textbf{\color{#35bf28}+94.99\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.4821ms 2.2529ms 443.8705 Ops/s 457.5056 Ops/s $\color{#d91a1a}-2.98\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.4032ms 1.2534ms 797.8522 Ops/s 848.7671 Ops/s $\textbf{\color{#d91a1a}-6.00\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 16.3874ms 15.6557ms 63.8746 Ops/s 64.0853 Ops/s $\color{#d91a1a}-0.33\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.2108ms 17.8639ms 55.9790 Ops/s 54.8055 Ops/s $\color{#35bf28}+2.14\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.7019ms 20.1018ms 49.7468 Ops/s 49.2699 Ops/s $\color{#35bf28}+0.97\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.2507ms 17.9762ms 55.6291 Ops/s 54.2401 Ops/s $\color{#35bf28}+2.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.7075ms 20.0295ms 49.9263 Ops/s 49.6146 Ops/s $\color{#35bf28}+0.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.2695ms 19.3533ms 51.6709 Ops/s 50.3273 Ops/s $\color{#35bf28}+2.67\%$

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 15, 2025
ghstack-source-id: f16d93a5fab2016d436c808896c9cf24f783a754
Pull Request resolved: #2688
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 15, 2025
ghstack-source-id: ba002535bca1e834a088a40136b767db33a20ee6
Pull Request resolved: #2688
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 15, 2025
ghstack-source-id: b57e5ac479db313680f9e506cada87100453fd4c
Pull Request resolved: #2688
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 15, 2025
ghstack-source-id: afc425b8d9abb888b9de13d63acd3ad873d9b8bb
Pull Request resolved: #2688
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 16, 2025
ghstack-source-id: 05ff3db60c33b275db46849b2bfe578fe04d9699
Pull Request resolved: #2688