Hello!
I'm writing about two things:
- An updated .yml file
- Whether my results are correct
Updated yml file
I found that the provided pytorch39.yml file had many unneeded dependencies and incorrect versions of required ones, such as SB3. This made installation difficult, so after some debugging I put together an updated yml file that can be dropped in instead:
name: SaMI
channels:
  - pytorch
  - nvidia
  - conda-forge
  - defaults
dependencies:
  # Core
  - python=3.10
  - pip
  # Everything else via pip
  - pip:
      # PyTorch with CUDA 13.0 (must be first, with extra-index-url)
      - "--extra-index-url=https://download.pytorch.org/whl/cu130"
      - "torch>=2.3"
      - "torchvision"
      # RL frameworks
      - "stable-baselines3==2.2.1"
      - "gymnasium==0.29.1"
      - "gym==0.23.0"  # older gym version required for compatibility
      - "panda_gym"
      - "pygame"
      # Physics simulators
      - "Cython<3,>=0.29"  # required before mujoco-py
      - "mujoco-py"
      - "pybullet"
      # Data science & visualization
      - "pandas"
      - "matplotlib"
      - "seaborn"
      - "scikit-learn"
      - "opencv-python"
      # Experiment tracking & monitoring
      - "wandb"
      - "tensorboard"
      # Utilities
      - "tqdm==4.65.0"
      - "rich==13.5.3"
      - "psutil"
      - "line_profiler"
      - "memory_profiler"
      - "openpyxl"
      - "imageio[ffmpeg]"
      - "beautifulsoup4"

I have tested this on my machine a couple of times as fresh installations. The only thing to note is that others will need to adjust the CUDA version (the cu130 index URL) to match their own setup.
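To make the CUDA adjustment concrete, here is a minimal sketch of how the index URL in the yml file could be derived from a CUDA version string. The helper name `pytorch_index_url` is hypothetical (it is not part of this repo or of PyTorch); it only follows the `cu<major><minor>` naming used by the URL above. PyTorch publishes wheels for a limited set of CUDA versions, so the resulting URL should be checked before use.

```python
import re


def pytorch_index_url(cuda_version: str) -> str:
    """Map a CUDA version such as '12.1' to a PyTorch wheel index URL.

    Hypothetical helper: it only builds the URL from the naming pattern
    and does not check that wheels exist for that CUDA version.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.\d+)?", cuda_version.strip())
    if match is None:
        raise ValueError(f"unrecognized CUDA version: {cuda_version!r}")
    major, minor = match.groups()
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


if __name__ == "__main__":
    # Reproduces the URL used in the yml file above.
    print(pytorch_index_url("13.0"))
```

The CUDA version supported by the installed driver is shown in the header of `nvidia-smi`; the URL the helper returns would replace the `--extra-index-url` value in the pip section.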
Results Questions
I am interested in extending this research, so I wanted to make sure I can run the experiments correctly. Below are two of the environments I tried.
Panda-Gym
1.0.10.0.-0.mp4
As seen above, I cannot replicate the Panda-Gym results.
Walker
1.5.0.6.-0.mp4
As seen above, I was able to get the walker to show the behavior seen in the original repo: lying on the floor and kicking slightly. However, many of the other runs do not exhibit this behavior; instead, the walker kicks wildly and then just flops on the floor.
Question
Because the agents' behavior varies so much between runs, I am unsure whether I replicated the walker results correctly. Can someone confirm whether these are valid results? Thank you!