Implement CIC agent #2

Strategy:
- Create an environment wrapper that changes the observation type to a dict with an additional "skill" key. The skill is generated at the environment level, and the wrapper also removes the reward.
- Change the policies to intercept the flow between the feature extractor and the actor: split the observation into the skill and the rest, pass each separately through its own encoder, and combine the resulting state and skill embeddings into the returned features.
- The critic should ignore the skill variable.
