
Commit 46b04a5

Add Soft Actor-Critic documentation (#87)
* Add Soft Actor-Critic documentation
* Update spiceaidocs/content/en/deep-learning-ai/sac.md
* Add more doc updates for SAC

Co-authored-by: Phillip LeBlanc <phillip@spiceai.io>
1 parent 5cb2ee7 commit 46b04a5

2 files changed

Lines changed: 19 additions & 4 deletions

spiceaidocs/content/en/deep-learning-ai/_index.md

Lines changed: 5 additions & 4 deletions
@@ -15,10 +15,11 @@ Spice.ai provides a standard interface that a deep learning algorithm can be imp
 
 By default, Spice.ai will use [Deep Q-Learning]({{<ref "deep-learning-ai/dql">}}). To use a different algorithm, call `spice train` with the parameter `--learning-algorithm` set to one of the following values:
 
-| --learning-algorithm | Algorithm |
-| -------------------- | ----------------------------------------------------------- |
-| dql | [Deep Q-Learning]({{<ref "deep-learning-ai/dql">}}) |
-| vpg | [Vanilla Policy Gradient]({{<ref "deep-learning-ai/vpg">}}) |
+| --learning-algorithm | Algorithm |
+| -------------------- | ---------------------------------------------------------------- |
+| dql | [Deep Q-Learning]({{<ref "deep-learning-ai/dql">}}) |
+| vpg | [Vanilla Policy Gradient]({{<ref "deep-learning-ai/vpg">}}) |
+| sacd | [Soft Actor-Critic (Discrete)]({{<ref "deep-learning-ai/sac">}}) |
 
 **Example**
 

spiceaidocs/content/en/deep-learning-ai/sac.md

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+---
+type: docs
+title: "Soft Actor-Critic"
+linkTitle: "Soft Actor-Critic"
+weight: 50
+description: Spice.ai implementation of the Soft Actor-Critic algorithm (SAC)
+---
+
+The Soft Actor-Critic (SAC) algorithm was introduced in 2018. It is an off-policy, model-free reinforcement learning algorithm that aims to maximize not only the reward but also the entropy of its policy (acting as randomly as possible). Maximizing entropy encourages exploration and lets the agent keep trying actions that seem equally rewarding.
+
+The Spice.ai implementation of Soft Actor-Critic has been modified to work for discrete action sets.
+
+Berkeley AI Research blog: https://bair.berkeley.edu/blog/2018/12/14/sac/
+
+arXiv paper: https://arxiv.org/abs/1801.01290
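
Reviewer note (not part of the diff): the entropy term described in the new page can be illustrated with a small sketch for a discrete action set. This is not taken from the Spice.ai implementation; the function names (`softmax`, `entropy`, `soft_state_value`) and the temperature value `alpha` are hypothetical and chosen only for illustration. It assumes the standard entropy-regularized state value, V(s) = sum over a of pi(a|s) * (Q(s,a) - alpha * log pi(a|s)), which equals the expected Q-value plus alpha times the policy entropy.

```python
import math


def softmax(logits):
    """Turn raw action preferences into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def soft_state_value(probs, q_values, alpha):
    """Entropy-regularized state value for a discrete policy.

    Computes sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s)).
    alpha is a hypothetical temperature that weights entropy against reward.
    """
    return sum(
        p * (q - alpha * math.log(p))
        for p, q in zip(probs, q_values)
        if p > 0.0
    )


if __name__ == "__main__":
    q = [1.0, 1.0, 1.0]          # three actions that look equally rewarding
    uniform = softmax([0.0, 0.0, 0.0])
    greedy = [1.0, 0.0, 0.0]     # deterministic policy, zero entropy
    print(soft_state_value(uniform, q, alpha=0.2))
    print(soft_state_value(greedy, q, alpha=0.2))
```

When the Q-values are equal, the uniform policy scores higher than the deterministic one because of the entropy bonus, which is the behavior the page describes as continuing to try actions that seem equally rewarding.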
