Commit c6cecc2
Update to PyTorch 2.8 + cu129, update docker containers to 24.04 (#362)
* PyTorch 2.8 + cu129
* Recombine jax and torch env
* py3.12_torch2.8+cu129 (#364)
Co-authored-by: pierre.delaunay <[email protected]>
* Do not init process group on prepare step
* Add Distributed env variable on prepare when required
* Update purejaxrl to use the latest distrax + tfp-nightly
* ignore tensorflow-probability
* Pin Dependencies (#366)
* ignore tensorflow-probability
* Pin Dependencies
---------
Co-authored-by: pierre.delaunay <[email protected]>
* replace tree_map by tree.map
* update container to match current cuda version
* Update dockerfile
* Add a timer for rsync and add concept of job runner pipeline
* Handle rerun of jobs with dependencies
* make the server try its best even if the cluster is down
---------
Co-authored-by: pierre.delaunay <[email protected]>
1 parent e5538b2 commit c6cecc2
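The "replace tree_map by tree.map" item presumably refers to JAX, where the top-level `jax.tree_map` alias was deprecated in favour of `jax.tree.map`. The changed lines are not shown on this page, so the following is only a minimal sketch of that kind of migration; the `params` pytree and the scaling lambda are hypothetical and are not taken from the benchmarks.

```python
import jax
import jax.numpy as jnp

# Hypothetical parameter pytree, used only for illustration.
params = {"w": jnp.ones((2, 2)), "b": jnp.zeros(2)}

# Older spelling, deprecated in newer JAX releases:
#   scaled = jax.tree_map(lambda x: x * 2.0, params)

# Newer spelling: same semantics, applied leaf by leaf over the pytree.
scaled = jax.tree.map(lambda x: x * 2.0, params)

# The result has the same structure as the input, with every leaf transformed.
print(jax.tree.structure(scaled) == jax.tree.structure(params))
```

`jax.tree_util.tree_map` is still available as well, so code that has to run on older JAX versions could use that spelling instead.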
File tree
37 files changed, +1199 -766 lines changed
- .pin
- benchmarks
- brax
- cleanrl_jax
- diffusion
- dinov2
- flops
- geo_gnn
- huggingface
- lightning
- llama
- llava
- llm
- bench
- purejaxrl
- recursiongfn
- rlhf
- timm
- torchvision_ddp
- torchvision
- vjepa
- config
- constraints
- extra
- docker
- milabench
- web
- scripts/slurm