spiceaidocs/content/en/concepts/interpretations/_index.md
+1 -1 (1 addition, 1 deletion)
@@ -30,6 +30,6 @@ The interpretation is defined as a time range from `start` to `end`, with a `nam
Interpretations can be used to provide hints to the reward function on how to reward a time step. In the above example, when the training reaches Tuesday, the reward function author might choose to reward buys even higher based on that expert input.
- When the action specific reward function is called, if there is an interpretation in that time range, it will be provided to the reward function in `[state].interpretations`. E.g. if an interpretation overlapped with new state then `new_state.interpretations` would contain a list of the overlapping interpretations.
+ When the action specific reward function is called, if there is an interpretation in that time range, it will be provided to the reward function in `[state]_interpretations`. E.g. if an interpretation overlapped with new state then `next_state_interpretations` would contain a list of the overlapping interpretations.
Comparing Spice.ai recommendations to interpretations is also one way of testing Spice.ai recommendations against expected actions for input data.
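As a rough illustration of the changed line above, a reward function could boost its reward whenever `next_state_interpretations` is non-empty. This is only a sketch: the function name `buy_reward` and the parameters `base_reward` and `expert_bonus` are hypothetical, and the internal structure of an interpretation is not shown in this diff, so the code checks only whether any interpretation overlaps.

```python
def buy_reward(next_state_interpretations, base_reward=1.0, expert_bonus=2.0):
    """Reward a buy more highly when at least one interpretation overlaps the next state.

    Assumption: next_state_interpretations is a list, empty when nothing overlaps.
    """
    if next_state_interpretations:
        # An expert interpretation covers this time step, so scale the reward up
        return base_reward * expert_bonus
    return base_reward


# No overlapping interpretation: plain reward
plain = buy_reward([])

# One overlapping interpretation (a stand-in object, since its fields are not shown here)
boosted = buy_reward([object()])
```

The multiplier is an arbitrary design choice; an additive bonus would work equally well for this purpose.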
- |prev_state|[SimpleNamespace](https://docs.python.org/3/library/types.html#types.SimpleNamespace)| The observation state when the action was taken |
- |new_state |[SimpleNamespace](https://docs.python.org/3/library/types.html#types.SimpleNamespace)| The observation state from directly after the action was taken |
+ |current_state|[dict](https://docs.python.org/3.8/library/stdtypes.html#typesmapping)| The observation state when the action was taken|
+ |next_state |[dict](https://docs.python.org/3.8/library/stdtypes.html#typesmapping)| The observation state one granularity step after the action was taken |
### Example
@@ -37,36 +37,36 @@ training:
      - reward: close_valve
        # Reward keeping moisture content above 25%
        with: |
-         if new_state.sensors_garden_moisture > 0.25:
+         if next_state["sensors_garden_moisture"] > 0.25:
            reward = 200

          # Penalize low moisture content depending on how far the garden has dried out
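The hunk above switches state access from attribute style (`SimpleNamespace`) to key style (`dict`). The sketch below shows the same moisture check against a plain dict, and how an old-style namespace can be turned into a dict with `vars()`. The wrapper function `close_valve_reward` is hypothetical, and the `return 0` branch is a placeholder, since the penalty logic is not visible in this diff.

```python
from types import SimpleNamespace


def close_valve_reward(next_state: dict) -> int:
    """Hypothetical wrapper around the inline reward block shown in the diff."""
    # This branch mirrors the diff: reward keeping moisture content above 25%
    if next_state["sensors_garden_moisture"] > 0.25:
        return 200
    # Placeholder only: the real penalty branch is cut off in the diff
    return 0


# Old style: attribute access on a SimpleNamespace
old_style = SimpleNamespace(sensors_garden_moisture=0.3)

# New style: the same data as a dict, accessed by key
new_style = vars(old_style)

reward = close_valve_reward(new_style)
```

Key access raises `KeyError` for a missing field, so `next_state.get(...)` with a default may be preferable when a field is optional.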
spiceaidocs/content/en/reference/pod/_index.md
+10 -18 (10 additions, 18 deletions)
@@ -122,7 +122,7 @@ Pod time, time-series and time-data related configuration is defined in the `tim
A list of time categories, such as `month` or `weekday`, enabling the automatic creation of fields from the observation `time`. For example, by specifying `month` the Spice.ai engine automatically creates a field in the data called `time_month_<month>` with a value calculated from the month to which that timestamp relates. This enables learning from cyclical patterns, such as monthly or daily cycles.
- ***Example***
+ **_Example_**
```yaml
time:
@@ -758,17 +758,15 @@ training:
A python code block that will be run before an action specific reward code block runs. Use this to define common variables that will be useful to reference in the specific reward code blocks.
- Access observation state variables by specifying their fully qualified names and prefixing with `prev_state.` for the value at the previous state before the action was taken, and `new_state.` for the value of the state right after the action was taken.
-
**Example**
```yaml
training:
  reward_init: |
    # Compute price change between previous state and this one
The path to a Python file that defines the reward functions to use, instead of python code blocks.
+
### `training.rewards`
**Required**. Defines how to reward the Spice.ai runtime during training so that it learns to take more intelligent actions.
@@ -822,18 +824,8 @@ training:
### `training.rewards[*].with`
- A python code block that needs to assign a variable to `reward` to specify which reward to give the Spice.ai agent for taking this action.
+ If `training.reward_funcs` is defined, then this should be the name of the function defined in the python file to use for specifying which reward to give the Spice.ai agent for taking this action.

- Access observation state variables by specifying their fully qualified names and prefixing with `prev_state.` for the value at the previous state before the action was taken, and `new_state.` for the value of the state right after the action was taken.
+ If `training.reward_funcs` is not defined, then this is a python code block that needs to assign a variable to `reward` to specify which reward to give the Spice.ai agent for taking this action.

- ```yaml
- training:
-   rewards:
-     - reward: jump
-       with: |
-         # If we weren't able to jump, penalize trying to jump
-         if new_state.game.character.height > prev_state.game.character.height:
-           reward = 1
-         else:
-           reward = -1
- ```
+ See [Rewards]({{<ref "concepts/rewards">}}) for more information on how to define reward functions.
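For the `training.reward_funcs` case described above, the named function lives in a Python file. The sketch below ports the removed `jump` example to that form. Several things here are assumptions not confirmed by this diff: the four-parameter signature (taken from the state names used elsewhere in the change), the function name `jump_reward`, and the flattened key `game_character_height`, which is guessed by analogy with `sensors_garden_moisture`.

```python
def jump_reward(
    current_state: dict,
    current_state_interpretations,
    next_state: dict,
    next_state_interpretations,
) -> float:
    """Hypothetical external reward function for the `jump` action.

    Assumption: states are dicts keyed by flattened field names.
    """
    # Reward the jump only if the character actually gained height
    if next_state["game_character_height"] > current_state["game_character_height"]:
        return 1.0
    # Otherwise penalize trying to jump
    return -1.0


# Minimal usage with made-up state values
gained = jump_reward({"game_character_height": 0}, [], {"game_character_height": 2}, [])
stuck = jump_reward({"game_character_height": 0}, [], {"game_character_height": 0}, [])
```

Under these assumptions, `training.rewards[*].with` would hold the string `jump_reward` in place of an inline code block.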