Terminal state concept in POMCP algorithm #75

I'm trying to implement a maintenance planning algorithm using POMCP. The decision maker mainly wants to know when to perform a certain action, given current and historical sensor observations. This setting also has terminal states: once one is reached, e.g., when the component fails or maintenance is initiated, any further actions are irrelevant. Terminal states are also mentioned in the original POMCP paper by Silver, D., and Veness, J. (2010). In particular, their `Simulate` and `Rollout` functions take them into account.

Because of a previous issue I opened (#73), I took a closer look at the [`_rollout`](https://github.com/h2r/pomdp-py/blob/7eeb6538f25eb8597e3c8a643906e525141a62ed/pomdp_py/algorithms/po_uct.pyx#L399) function. It seems the [current stopping condition](https://github.com/h2r/pomdp-py/blob/7eeb6538f25eb8597e3c8a643906e525141a62ed/pomdp_py/algorithms/po_uct.pyx#L407) only takes the maximum tree depth into account. Is this observation correct, or am I missing something?
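
To make the question concrete, here is a minimal sketch of the behaviour I have in mind: a rollout that stops either at the depth limit or as soon as a terminal state is reached. It is not pomdp-py code and does not use its API. The toy maintenance model, the state and action names, the rewards, and the helpers `is_terminal`, `step`, and `rollout` are all hypothetical and chosen only for illustration.

```python
import random

GAMMA = 0.95      # discount factor (assumed value)
MAX_DEPTH = 50    # depth limit, analogous to a max tree depth (assumed value)

# Hypothetical toy model: integer states 0..3 are degradation levels;
# "failed" and "maintained" are terminal states.
TERMINAL = {"failed", "maintained"}


def is_terminal(state):
    """Return True if no further action matters in this state."""
    return state in TERMINAL


def step(state, action, rng):
    """Hypothetical generative model: sample (next_state, reward)."""
    if action == "maintain":
        return "maintained", -10.0
    # action == "wait": the component degrades with probability 0.3
    level = state + 1 if rng.random() < 0.3 else state
    if level >= 4:
        return "failed", -100.0
    return level, 1.0


def rollout(state, depth, rng, policy):
    """Discounted return of a rollout that stops at the depth limit
    OR at a terminal state, whichever comes first."""
    total, discount = 0.0, 1.0
    while depth < MAX_DEPTH and not is_terminal(state):
        action = policy(state, rng)
        state, reward = step(state, action, rng)
        total += discount * reward
        discount *= GAMMA
        depth += 1
    return total
```

With only the depth check, a rollout that starts in (or reaches) `"failed"` would keep sampling actions and accumulating rewards that have no meaning in the maintenance problem. With the extra `is_terminal` check, such a rollout contributes nothing beyond the transition into the terminal state.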
