Benchmark and replicate algorithm performance #388

Open
@AdamGleave

Description

Tune hyperparameters, match implementation details, and fix bugs until we replicate the performance of the reference implementations of each algorithm. I'm not concerned about an exact match: if we do about as well on average, doing better in some environments and worse in others, that seems OK.
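
One way to make "about as well on average" checkable is sketched below. This is only an illustration, not an agreed criterion: the normalization against a random-policy baseline, the `tolerance` value, and the function names are all assumptions, and the numbers in the usage lines are made-up placeholders rather than measured results.

```python
from statistics import mean


def normalized_score(ours, reference, random_baseline):
    """Rescale a mean return so that random policy = 0 and reference = 1.

    Normalizing this way keeps environments with negative returns
    comparable to environments with positive returns.
    """
    span = reference - random_baseline
    if span == 0:
        raise ValueError("reference and random baseline returns are equal")
    return (ours - random_baseline) / span


def replicates_on_average(results, tolerance=0.1):
    """Check whether the average normalized score is within `tolerance` of 1.

    `results` maps an environment name to a tuple
    (our mean return, reference mean return, random-policy mean return).
    Individual environments may score above or below 1; only the average
    is compared against the threshold.
    """
    scores = {
        env: normalized_score(ours, ref, rand)
        for env, (ours, ref, rand) in results.items()
    }
    average = mean(scores.values())
    return average >= 1.0 - tolerance, average, scores


# Placeholder numbers for illustration only, NOT benchmark results.
placeholder_results = {
    "env_a": (90.0, 100.0, 0.0),       # slightly worse than reference
    "env_b": (-105.0, -110.0, -200.0),  # slightly better than reference
}
passed, average, per_env = replicates_on_average(placeholder_results)
```

Averaging normalized scores lets one environment's shortfall be offset by another's gain, which matches the intent above. A stricter check would also bound the worst per-environment score.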

Concretely, we should test BC, AIRL, GAIL, DRLHP, and DAgger on at least the seals versions of CartPole, MountainCar, HalfCheetah, and Hopper.
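
The test matrix above could be enumerated roughly as follows. This is a minimal sketch: the algorithm and environment names come from this issue, but the per-cell seed count and the `benchmark_grid` helper are hypothetical, and the environment names are plain labels rather than registered seals environment IDs.

```python
from itertools import product

# Names taken from the issue text; not registered environment IDs.
ALGORITHMS = ("BC", "AIRL", "GAIL", "DRLHP", "DAgger")
ENVIRONMENTS = ("CartPole", "MountainCar", "HalfCheetah", "Hopper")


def benchmark_grid(n_seeds=3):
    """Return one run configuration per (algorithm, environment, seed).

    `n_seeds` is an assumed value; the issue does not say how many
    seeds each cell of the matrix should use.
    """
    return [
        {"algo": algo, "env": env, "seed": seed}
        for algo, env, seed in product(ALGORITHMS, ENVIRONMENTS, range(n_seeds))
    ]


runs = benchmark_grid()
print(f"{len(runs)} runs across {len(ALGORITHMS)} algorithms "
      f"and {len(ENVIRONMENTS)} environments")
```

Each dictionary would then be handed to whatever training entry point is used for that algorithm.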

Baselines: paper results are the first port of call, but some are confounded by differences in environment version, especially fixed vs. variable horizon. SB2 GAIL is a good sanity check. If need be, reference implementations of most other algorithms exist, but they can be hard to run.
