Replies: 2 comments 1 reply
Hey @alektebel, great timing -- we just received PR #3537, which adds a vanilla PILCO implementation. A few thoughts:

MBRL is definitely in scope. Low-dimensional control with model-based methods is very much welcome in

Re: duplication with #3537. The current PILCO PR uses analytical moment matching (Deisenroth & Rasmussen's original formulation). Interestingly, the author mentioned they tried MC moment matching but couldn't get it to stabilize. MC-PILCO is a distinct enough algorithm that it warrants its own implementation -- especially given the ICRA 2025 results you mention.

What would be most useful:
Would recommend taking a look at #3537, maybe commenting there on the MC approach, and then opening a draft PR. Happy to review early.
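For anyone following along who hasn't seen the two flavours side by side, here is a rough sketch of the difference between analytical and MC moment matching. It is not taken from #3537: to keep it dependency-free it swaps the GP dynamics model for a toy linear-Gaussian step (an assumption, chosen because the analytical moments are exact there), and the function names are made up for illustration.

```python
import random
import statistics


def analytic_moments(mu, var, a, b, noise_var):
    """Exact mean/variance of a*x + b + eps with x ~ N(mu, var), eps ~ N(0, noise_var)."""
    return a * mu + b, a * a * var + noise_var


def mc_moments(mu, var, a, b, noise_var, n_particles=20000, seed=0):
    """Estimate the same moments by pushing sampled particles through the model."""
    rng = random.Random(seed)
    particles = [
        a * rng.gauss(mu, var ** 0.5) + b + rng.gauss(0.0, noise_var ** 0.5)
        for _ in range(n_particles)
    ]
    return statistics.fmean(particles), statistics.pvariance(particles)


# Toy one-step model (hypothetical numbers): x_next = 0.9 * x + 0.1 + noise
exact = analytic_moments(1.0, 0.25, 0.9, 0.1, 0.01)
approx = mc_moments(1.0, 0.25, 0.9, 0.1, 0.01)
print("analytic:", exact)
print("monte carlo:", approx)
```

With a nonlinear model the closed form is usually no longer available (or only holds for specific kernels), which is where the particle estimate becomes the practical option, at the price of sampling noise in the moments.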
Hi @vmoens, update: I left a comment on PSXBRosa's PILCO PR #3537 to coordinate on shared primitives, since opening a draft PR today would duplicate the core primitives.
I've been doing model-based RL for robotics control (my thesis was PPO/DDPG on a prosthetic hand in MuJoCo), so sample efficiency is something I care about practically. MC-PILCO recently won the AI Olympics at ICRA 2025 on underactuated pendulum tasks, which shows it's still very much alive.
TorchRL already has ModelBasedEnvBase, WorldModelWrapper and the rest of the MBRL infrastructure, but no PILCO variant in sota-implementations/. I'd like to add one.

Roughly:
ProbabilisticDynamicsModel + MCPILCOLoss + a training script benchmarking data efficiency against PPO/SAC on Pendulum-v1.

A few things I'd like to check before starting:
sota-implementations/, given the current LLM focus?

Happy to share a draft design or prototype if useful.
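To make the idea a bit more concrete, here is a minimal sketch of the quantity a loss like MCPILCOLoss would estimate: the expected cumulative cost of a policy, averaged over particles rolled through a probabilistic model. Everything in it is an assumption for illustration: the 1-D toy dynamics stand in for a learned model, the linear policy and the helper names are hypothetical, and none of it is TorchRL API. It only evaluates the objective; an actual implementation would backpropagate through the rollout, which this sketch does not attempt.

```python
import math
import random


def saturating_cost(x, length_scale=0.5):
    """PILCO-style saturating cost in [0, 1), zero at the target x = 0."""
    return 1.0 - math.exp(-(x * x) / (2.0 * length_scale ** 2))


def toy_dynamics(x, u, rng, dt=0.1, noise_std=0.01):
    """Hypothetical stand-in for a learned probabilistic model: samples x_next."""
    return x + dt * u + rng.gauss(0.0, noise_std)


def mc_expected_cost(gain, n_particles=200, horizon=30, seed=0):
    """Monte Carlo estimate of the expected cumulative cost of u = -gain * x."""
    rng = random.Random(seed)
    # Particles drawn from an assumed initial state distribution N(1.0, 0.1^2).
    particles = [rng.gauss(1.0, 0.1) for _ in range(n_particles)]
    total = 0.0
    for _ in range(horizon):
        # Each particle is propagated independently with its own noise sample.
        particles = [toy_dynamics(x, -gain * x, rng) for x in particles]
        total += sum(saturating_cost(x) for x in particles) / n_particles
    return total


cost_no_control = mc_expected_cost(gain=0.0)
cost_feedback = mc_expected_cost(gain=1.0)
print(f"no control: {cost_no_control:.2f}, feedback: {cost_feedback:.2f}")
```

In this toy setup the feedback gain should come out with a clearly lower cost than no control, since it drives the particles toward the target at zero.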