Policy Optimization API Usage #56

Hi, I'm trying to do policy optimization using YLearn. I have read the docs on this but did not fully understand them. Formally, a policy optimization problem can be written as $x^{*}=\text{argmax}_x\mathbb{E}[\mathcal{Y}|\text{do}(\mathcal{X}=x), \mathcal{S}]$. How are $\mathcal{Y}$, $\mathcal{X}$ and $\mathcal{S}$ each represented in the arguments of the `est.fit()` API described at https://ylearn.readthedocs.io/en/latest/sub/policy.html? I need a more concrete explanation to use the API properly. Thanks!
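To make the notation concrete, here is a minimal sketch of what the formula asks for, written in plain Python and deliberately independent of YLearn. It does not show how `est.fit()` takes its arguments. The function name `best_treatment_per_state`, the `(s, x, y)` record layout, and the toy data are all hypothetical. The sketch also assumes the treatment was assigned at random within each state, so that the sample mean of $\mathcal{Y}$ for a given $(\mathcal{S}, \mathcal{X})$ pair can stand in for $\mathbb{E}[\mathcal{Y}|\text{do}(\mathcal{X}=x), \mathcal{S}]$.

```python
from collections import defaultdict
from statistics import mean


def best_treatment_per_state(records):
    """Return {s: x*} where x* maximizes the mean outcome observed in state s.

    records: iterable of (s, x, y) tuples, with s the state/covariate value (S),
    x the treatment (X) and y the outcome (Y). Assumes randomized treatment, so
    the sample mean of y per (s, x) estimates E[Y | do(X=x), S=s].
    """
    outcomes = defaultdict(list)
    for s, x, y in records:
        outcomes[(s, x)].append(y)

    # Estimated expected outcome for every observed (state, treatment) pair.
    avg = {key: mean(values) for key, values in outcomes.items()}

    # argmax over x, separately for each state s.
    policy = {}
    for (s, x), value in avg.items():
        if s not in policy or value > avg[(s, policy[s])]:
            policy[s] = x
    return policy


# Made-up toy data: (state, treatment, outcome).
records = [
    ("young", 0, 1.0),
    ("young", 1, 3.0),
    ("young", 1, 2.0),
    ("old", 0, 4.0),
    ("old", 1, 2.0),
    ("old", 0, 5.0),
]

policy = best_treatment_per_state(records)
print(policy)
```

With this toy data the estimated means are 1.0 and 2.5 for "young" under treatments 0 and 1, and 4.5 and 2.0 for "old", so the sketch picks treatment 1 for "young" and treatment 0 for "old".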
