🐛 Bug
Many algorithms built on OpenAI gym environments expect the environment to define the `action_space` and `observation_space` attributes; this is notably the case for ray's rllib's single-agent algorithms such as AlphaZero.
See for instance ray's rllib's AlphaZero implementation, which makes use of these attributes.
The current implementation of scikit-decide's rllib wrapper provides only a multi-agent environment wrapper, `AsRLlibMultiAgentEnv`, which does not define the `action_space` and `observation_space` attributes (which is fine for rllib's multi-agent environments). Scikit-decide's rllib wrapper should therefore additionally provide a single-agent environment wrapper defining those attributes, for use with algorithms like rllib's AlphaZero.
To Reproduce
Define a scikit-decide RL domain and pass it to ray's rllib's AlphaZero algorithm.
The following exception is thrown when solving the domain:
```
AttributeError: 'AsRLlibMultiAgentEnv' object has no attribute 'action_space'
```
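The failure mode can be sketched in a self-contained way with stub classes (the class below stands in for scikit-decide's `AsRLlibMultiAgentEnv`; its internals are illustrative, not the real implementation): rllib's single-agent algorithms read `env.action_space` directly at setup time, so a wrapper that only exposes per-agent spaces raises `AttributeError`.

```python
class AsRLlibMultiAgentEnvStub:
    """Illustrative stand-in for scikit-decide's AsRLlibMultiAgentEnv:
    multi-agent wrappers keep spaces keyed by agent id and never set
    top-level action_space / observation_space attributes."""

    def __init__(self):
        # Per-agent spaces only (placeholder values, not real gym spaces):
        self._action_spaces = {"agent_0": "Discrete(4)"}
        self._observation_spaces = {"agent_0": "Box(0, 1, (8,))"}


def single_agent_setup(env):
    """Mimics what a single-agent rllib algorithm (e.g. AlphaZero) does:
    it looks up env.action_space directly during setup."""
    return env.action_space  # AttributeError on a multi-agent-only wrapper


try:
    single_agent_setup(AsRLlibMultiAgentEnvStub())
except AttributeError as e:
    print(e)
```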
Expected behavior
No exception is thrown, because an environment wrapper such as `AsRLlibSingleAgentEnv` (to be defined) would define the `action_space` and `observation_space` attributes.
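A minimal sketch of what such a wrapper could look like, assuming a hypothetical `AsRLlibSingleAgentEnv` name and a stub domain in place of scikit-decide's actual domain API: the wrapper promotes the domain's spaces to the top-level attributes that rllib's single-agent algorithms expect.

```python
class DomainStub:
    """Illustrative stand-in for a single-agent scikit-decide RL domain."""

    def get_action_space(self):
        return "Discrete(4)"       # placeholder for a real gym space

    def get_observation_space(self):
        return "Box(0, 1, (8,))"   # placeholder for a real gym space


class AsRLlibSingleAgentEnv:
    """Hypothetical single-agent wrapper: unlike the multi-agent wrapper,
    it defines action_space / observation_space as top-level attributes."""

    def __init__(self, domain):
        self._domain = domain
        # The attributes rllib's single-agent algorithms look up:
        self.action_space = domain.get_action_space()
        self.observation_space = domain.get_observation_space()


env = AsRLlibSingleAgentEnv(DomainStub())
print(env.action_space)
print(env.observation_space)
```

With such a wrapper, the `env.action_space` lookup performed by single-agent algorithms like AlphaZero succeeds instead of raising `AttributeError`.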