Description
I'm trying to implement a maintenance planning algorithm using POMCP. The decision maker mainly wants to know when to perform a certain action, given current and historical sensor observations. This setting also has terminal states: once one is reached, e.g., when the component fails or maintenance is initiated, any further actions are irrelevant. Terminal states are also mentioned in the original POMCP paper by Silver, D., and Veness, J. (2010). In particular, their Simulate and Rollout functions take them into account.
Because of a previous issue I opened (#73), I took a closer look at the _rollout function. It seems the current stopping condition only takes the maximum tree depth into account, not whether a terminal state has been reached. Is this observation correct, or am I missing something?
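
To make the question concrete, here is a minimal, self-contained sketch of the stopping behaviour I have in mind. It is not the library's `_rollout`. All names (`rollout`, `toy_step`, `toy_is_terminal`, `FAILED`) and the toy degradation model are hypothetical and only illustrate a rollout that stops on either the depth limit or a terminal state.

```python
import random

FAILED = 3  # hypothetical degradation level at which the component has failed


def toy_step(state, action):
    """Hypothetical generative model: returns (next_state, reward)."""
    if action == "maintain":
        return -1, -2.0  # -1 marks the terminal "maintenance initiated" state
    next_state = state + 1  # "wait": the component degrades by one level
    reward = -10.0 if next_state >= FAILED else 0.0
    return next_state, reward


def toy_is_terminal(state):
    """Terminal once the component has failed or maintenance was initiated."""
    return state >= FAILED or state == -1


def rollout(state, depth, max_depth, step, is_terminal, actions,
            gamma=0.95, rng=random):
    """Random rollout returning the discounted return from `state`.

    Stops when the depth limit is hit OR a terminal state is reached.
    """
    total, discount = 0.0, 1.0
    while depth < max_depth and not is_terminal(state):
        action = rng.choice(actions)
        state, reward = step(state, action)
        total += discount * reward
        discount *= gamma
        depth += 1
    return total


# Only "wait" is available here, so the rollout is deterministic.
value = rollout(0, 0, 100, toy_step, toy_is_terminal, ["wait"], gamma=0.5)
```

With a depth-only condition, the loop above would keep simulating after the component has failed and accumulate rewards that should not count. With the extra `is_terminal` check it stops as soon as failure or maintenance occurs.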