Improved naming of function to get and set underlying Markov states in RLToyEnv; improved API of image_continuous.get_image_representation() to accept epistemic_uncertainty and aleatoric_uncertainty std dev vectors to add to bar plots.
<code class="sig-name descname"><span class="pre">get_augmented_state</span></code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="../_modules/mdp_playground/envs/rl_toy_env.html#RLToyEnv.get_augmented_state"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#mdp_playground.envs.rl_toy_env.RLToyEnv.get_augmented_state" title="Permalink to this definition">¶</a></dt>
<code class="sig-name descname"><span class="pre">get_markov_state</span></code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="../_modules/mdp_playground/envs/rl_toy_env.html#RLToyEnv.get_markov_state"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#mdp_playground.envs.rl_toy_env.RLToyEnv.get_markov_state" title="Permalink to this definition">¶</a></dt>
717
717
<dd><p>gets underlying Markovian state of the MDP</p>
<code class="sig-name descname"><span class="pre">get_augmented_state</span></code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="../_modules/mdp_playground/envs/rl_toy_env.html#RLToyEnv.get_augmented_state"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#id1" title="Permalink to this definition">¶</a></dt>
840
+
<code class="sig-name descname"><span class="pre">get_markov_state</span></code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="../_modules/mdp_playground/envs/rl_toy_env.html#RLToyEnv.get_markov_state"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#id1" title="Permalink to this definition">¶</a></dt>
841
841
<dd><p>Intended to return the full augmented state which would be Markovian. (However, it’s not Markovian wrt the noise in P and R because we’re not returning the underlying RNG.) Currently, returns the augmented state which is the sequence of length “delay + sequence_length + 1” of past states for both discrete and continuous environments. Additionally, the current state derivatives are also returned for continuous environments.</p>
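As a rough illustration of the "delay + sequence_length + 1" window described in the docstring above, here is a minimal sketch of a fixed-length buffer of past states. The class `AugmentedStateBuffer` and its methods are hypothetical names invented for this example and are not part of mdp_playground; padding the buffer with the initial state is also an assumption, not necessarily what RLToyEnv does.

```python
from collections import deque


class AugmentedStateBuffer:
    """Hypothetical helper (not from mdp_playground): keeps the most recent
    delay + sequence_length + 1 states as a sliding window."""

    def __init__(self, delay, sequence_length, initial_state):
        # Window length taken from the docstring in the diff.
        self.length = delay + sequence_length + 1
        # Assumption: pad the window with the initial state at reset time.
        self.buffer = deque([initial_state] * self.length, maxlen=self.length)

    def push(self, state):
        # Appending to a full deque with maxlen drops the oldest state.
        self.buffer.append(state)

    def get(self):
        # Oldest state first, current state last.
        return list(self.buffer)


buf = AugmentedStateBuffer(delay=1, sequence_length=2, initial_state=0)
for s in [3, 5, 7]:
    buf.push(s)
augmented = buf.get()
print(augmented)
```

With `delay=1` and `sequence_length=2`, the window holds four states, so a delayed or sequence-based reward can in principle be computed from the window alone. As the docstring notes, the noise in P and R is still not captured because the RNG state is not included.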
docs/_build/html/_modules/mdp_playground/envs/rl_toy_env.html
4 additions & 4 deletions
@@ -629,7 +629,7 @@ <h1>Source code for mdp_playground.envs.rl_toy_env</h1><div class="highlight"><p
629
629
<span class="sd"> the reward function of the MDP, R</span>
630
630
<span class="sd"> R(state, action)</span>
631
631
<span class="sd"> defined as a lambda function in the call to init_reward_function() and is equivalent to calling reward_function()</span>
632
-
<span class="sd">get_augmented_state()</span>
632
+
<span class="sd">get_markov_state()</span>
633
633
<span class="sd"> gets underlying Markovian state of the MDP</span>
634
634
<span class="sd"> reset()</span>
635
635
<span class="sd"> Resets environment state</span>
@@ -1834,9 +1834,9 @@ <h1>Source code for mdp_playground.envs.rl_toy_env</h1><div class="highlight"><p
1834
1834
<span class="bp">self</span><span class="o">.</span><span class="n">reward</span> <span class="o">+=</span> <span class="bp">self</span><span class="o">.</span><span class="n">term_state_reward</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">reward_scale</span>  <span class="c1"># Scale before or after?</span>
<span class="sd">'''Intended to return the full augmented state which would be Markovian. (However, it's not Markovian wrt the noise in P and R because we're not returning the underlying RNG.) Currently, returns the augmented state which is the sequence of length "delay + sequence_length + 1" of past states for both discrete and continuous environments. Additionally, the current state derivatives are also returned for continuous environments.</span>
1841
1841
1842
1842
<span class="sd"> Returns</span>
@@ -2042,7 +2042,7 @@ <h1>Source code for mdp_playground.envs.rl_toy_env</h1><div class="highlight"><p
2042
2042
2043
2043
<span class="n">config</span><span class="p">[</span><span class="s2">"generate_random_mdp"</span><span class="p">]</span> <span class="o">=</span> <span class="kc">True</span>  <span class="c1"># This supersedes previous settings and generates a random transition function, a random reward function (for random specific sequences)</span>
<span class="n">action</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="n">action_space</span><span class="o">.</span><span class="n">sample</span><span class="p">()</span>  <span class="c1"># take a random action</span>