Terminal state concept in POMCP algorithm #75

@GijsMargadant

I'm trying to implement a maintenance planning algorithm using POMCP. The decision maker mainly wants to know when to perform a certain action, given current and historical sensor observations. This setting also has terminal states: once one is reached, e.g., when the component fails or maintenance is initiated, any further actions are irrelevant. Terminal states are also mentioned in the original POMCP paper by Silver, D., and Veness, J. (2010). In particular, their Simulate and Rollout functions take them into account.

Because of a previous issue I opened (#73), I took a closer look at the `_rollout` function. It seems that the current stopping condition only takes the maximum tree depth into account. Is this observation correct, or am I missing something?
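
To make the question concrete, here is a minimal sketch of a rollout that stops on a terminal state as well as on the depth limit. It is not this library's `_rollout`; the names `rollout`, `toy_step`, `FAILED`, and the reward values are all hypothetical, and the toy maintenance model is deterministic purely to keep the example easy to check.

```python
import random

FAILED = 3  # hypothetical degradation level at which the component fails


def toy_step(state, action):
    """Hypothetical generative model: returns (next_state, reward, is_terminal)."""
    if action == "maintain":
        # Initiating maintenance ends the episode at a fixed cost.
        return state, -10.0, True
    # "wait": the component degrades by one level.
    next_state = state + 1
    if next_state >= FAILED:
        # Failure is terminal and heavily penalised.
        return next_state, -100.0, True
    return next_state, 1.0, False


def rollout(state, depth, max_depth, gamma, step, actions, rng):
    """Discounted return of a uniform random rollout policy.

    Stops when the depth limit is hit OR a terminal state is reached.
    """
    total = 0.0
    discount = 1.0
    terminal = False
    while depth < max_depth and not terminal:
        action = rng.choice(actions)
        state, reward, terminal = step(state, action)
        total += discount * reward
        discount *= gamma
        depth += 1
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    print(rollout(0, 0, 10, 0.9, toy_step, ["wait", "maintain"], rng))
```

Without the `not terminal` check, the loop would keep sampling actions and accumulating rewards after failure or maintenance, which is the behaviour I suspect the current stopping condition allows.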
