doc/markov/index.mld (5 additions, 1 deletion)
@@ -58,4 +58,8 @@ kicks off an infinite loop modelling a Markov Decision Process (MDP). Each iter
+inferring a method to next measure the state of the MDP (in other words producing an {e observer})
+executing the action
 
-{b The implementation of this loop is included in the [markov] library.} It is found in {{!/markov/src/agent.ml.html#module-Make}the implementation} of the [Agent.Make] functor.
+{b The implementation of this loop is included in the [markov] library.} It is found in the implementation of the [Agent.Make] functor.
+
+{1 Reference Implementation}
+
+The {{!/blue/page-index}blue} library implements the [Markov] interfaces (albeit with a simple implementation).
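The infinite agent loop this documentation describes (choose an action with the policy, execute it, wait for the resulting observer to resolve to a state, repeat) could be sketched roughly as below. This is a minimal illustration only: the `decide` and `execute` signatures are assumptions, not the actual interface of the `markov` library's `Agent.Make` functor.

```ocaml
(* Hypothetical sketch of the MDP agent loop; signatures are assumed. *)
module type RLPolicyType = sig
  type state
  type action
  type t
  val decide : t -> state -> action * t  (* choose an action, update the policy *)
end

module Make (Policy : RLPolicyType) = struct
  (* [act] loops forever: the policy picks an action, executing the
     action yields an observer (a thunk that eventually returns the
     next state), and when the observer resolves the loop repeats. *)
  let rec act policy state ~execute =
    let action, policy' = Policy.decide policy state in
    let observer = execute action in
    let next_state = observer () in
    act policy' next_state ~execute
end
```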
markov/agent.mli (3 additions, 1 deletion)
@@ -1,4 +1,6 @@
-(** Exposes the functor [Agent.Make] **)
+(** Exposes the functor [Agent.Make] which returns an [Agent.S] module that is parameterised by the provided implementations of [Agent.MarkovCompressorType], [Agent.RewardType] and [Agent.RLPolicyType].
+
+[Agent.S.act initial_policy] commences an infinite loop using the policy to take actions and produce {e observers} (functions returning a {e state}). When the observer resolves to a state, the loop repeats. *)
 
 (** Handle the continuous-time stream of information from a system and compress the information into a Markovian state representation such that the sequence of states returned by sequential calls to [observe] have the Markov property. *)
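A module satisfying the compressor contract described in that doc comment might have the following shape. This is a hedged sketch: only the names [MarkovCompressorType] and [observe] come from the diff; the other types and values are assumptions for illustration.

```ocaml
(* Hypothetical shape of a Markov compressor; only the module-type
   name and [observe] are taken from the documented interface. *)
module type MarkovCompressorType = sig
  type raw    (* one sample from the continuous-time stream (assumed) *)
  type state  (* the compressed, Markovian state representation *)

  val init : state
  (** Starting state before any information has arrived (assumed). *)

  val observe : state -> raw -> state
  (** Fold a raw sample into the state, so that the sequence of states
      returned by sequential calls has the Markov property. *)
end
```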