Reinforcement learning is wrong on so many levels.
This repository contains code for the paper "Methods for Optimization Problems with Markovian Stochasticity and Non-Euclidean Geometry" and some other snippets that the authors found useful.
The environment is built in Docker. To build the container, run:
make build
And run it in detached mode:
make run
Also, you can just install the requirements for the project with pip, uv, or conda.
The experiments for the paper are located in the paper_experiments directory. Each script is configured with a Hydra config that has the same name as the experiment file.
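As a rough illustration of the naming convention, the sketch below derives the config name from a script path. The helper `config_name_for` is hypothetical and not part of this repository; it only assumes that the config name is the script's file name without the `.py` extension.

```python
from pathlib import Path


def config_name_for(script_path: str) -> str:
    """Return the config name that matches an experiment script.

    Hypothetical helper: assumes the config is named after the script
    file, without the ".py" extension.
    """
    return Path(script_path).stem


if __name__ == "__main__":
    # The result is the kind of value one would pass as `config_name`
    # to Hydra's `@hydra.main(...)` decorator.
    for script in ("paper_experiments/experiment_mdpo.py",
                   "paper_experiments/experiment_ampo.py"):
        print(script, "->", config_name_for(script))
```

Where the config files actually live is not stated here, so the sketch stops at the name and does not guess a directory.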
Experiments with projections:
cd paper_experiments;
python experiment_mdpo.py
Experiments without projections:
cd paper_experiments;
python experiment_ampo.py
Note
To enable wandb logging, don't forget to specify the WANDB_API_KEY environment variable.
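If you want a run to fall back gracefully when the key is missing, a minimal sketch could look like the one below. The helper `wandb_mode` is hypothetical and not part of this repository; the sketch assumes you want logging disabled rather than an interactive login prompt. It relies on wandb's `WANDB_MODE` environment variable, which accepts `online`, `offline`, and `disabled`.

```python
import os
from typing import Mapping


def wandb_mode(env: Mapping[str, str] = os.environ) -> str:
    """Pick a wandb mode from the environment.

    Hypothetical helper: returns "online" when WANDB_API_KEY is set to a
    non-empty value, and "disabled" otherwise.
    """
    key = env.get("WANDB_API_KEY", "").strip()
    return "online" if key else "disabled"


if __name__ == "__main__":
    # Respect an explicit WANDB_MODE if the user already set one.
    os.environ.setdefault("WANDB_MODE", wandb_mode())
    print("WANDB_MODE =", os.environ["WANDB_MODE"])
```

Whether the experiment scripts themselves perform a check like this is an assumption; exporting `WANDB_API_KEY` in your shell before launching them is the documented requirement.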