API and algorithm structure unification #2

@jfpettit

Algorithms in qpolgrad are organized so that each one defines dedicated functions for loss calculation, and the algorithm's update function then calls those functions. A2C and PPO still need to be brought up to that same structure.

Specifically:

  • Define compute_policy_loss and compute_value_loss functions in A2C and PPO.
  • Modify the update rules for both algorithms to call the loss computation functions.
  • Update docstrings to reflect your changes! If there aren't docstrings (sorry), add them!
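
To make the target structure concrete, here is a rough sketch of what it could look like. It is not qpolgrad's actual code: the class names `A2CSketch` and `PPOSketch`, the `batch` dictionary keys, and the `clip_ratio` parameter are all hypothetical, and plain Python floats stand in for whatever tensor library the real implementations use. The only parts taken from this issue are the function names `compute_policy_loss` and `compute_value_loss` and the idea that `update` calls them.

```python
import math
from typing import Dict, List

# Hypothetical batch layout: a dict mapping names to equal-length float lists.
Batch = Dict[str, List[float]]


class A2CSketch:
    """Illustrative skeleton only; not the real qpolgrad A2C class."""

    def compute_policy_loss(self, batch: Batch) -> float:
        """Return the negative mean of log-probability times advantage."""
        terms = [lp * adv for lp, adv in zip(batch["logps"], batch["advantages"])]
        return -sum(terms) / len(terms)

    def compute_value_loss(self, batch: Batch) -> float:
        """Return the mean squared error between value estimates and returns."""
        errors = [(v - r) ** 2 for v, r in zip(batch["values"], batch["returns"])]
        return sum(errors) / len(errors)

    def update(self, batch: Batch) -> Dict[str, float]:
        """Compute both losses through the dedicated loss functions.

        A real update would also backpropagate and step the optimizers;
        that part is omitted because it depends on the framework in use.
        """
        policy_loss = self.compute_policy_loss(batch)
        value_loss = self.compute_value_loss(batch)
        return {"policy_loss": policy_loss, "value_loss": value_loss}


class PPOSketch(A2CSketch):
    """Same structure as above; only the policy loss differs."""

    def __init__(self, clip_ratio: float = 0.2):
        # clip_ratio is an assumed name for the PPO clipping parameter.
        self.clip_ratio = clip_ratio

    def compute_policy_loss(self, batch: Batch) -> float:
        """Return the negative mean of the clipped surrogate objective."""
        low, high = 1.0 - self.clip_ratio, 1.0 + self.clip_ratio
        terms = []
        for lp, lp_old, adv in zip(
            batch["logps"], batch["logps_old"], batch["advantages"]
        ):
            ratio = math.exp(lp - lp_old)
            clipped = max(low, min(ratio, high))
            # Take the more pessimistic of the unclipped and clipped terms.
            terms.append(min(ratio * adv, clipped * adv))
        return -sum(terms) / len(terms)
```

Because `update` only talks to the two loss functions, PPO can reuse the A2C update rule unchanged and override just `compute_policy_loss`. Whether the real classes should share code through inheritance like this is a design choice left open here.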
