
Commit 1b4d449

Fix hooks for multiplayer case (#1071)
* Fix devcontainer
* Fix hooks for multiplayer case
* split off NEWS.md into per-package files
* Bump version, fix hooks
* Fix syntax error
* Fix syntax
* Fix syntax
* Fix linted issues
* Fix type signatures
* Fix test
* Fix hooks
* Add RL.jl to devcontainer
* Fix spellcheck error
* Fix syntax
* Add RLFarm news
1 parent 82707e8 commit 1b4d449

File tree: 11 files changed (+216 -308 lines)


.devcontainer/devcontainer.json (+3 -2)
@@ -2,13 +2,14 @@
   "customizations": {
     "vscode": {
       "extensions": [
-        "julialang.language-julia"
+        "julialang.language-julia",
+        "ms-azuretools.vscode-docker"
       ]
     }
   },
   "runArgs": [
     "--privileged"
   ],
   "dockerFile": "Dockerfile",
-  "updateContentCommand": "julia -e 'using Pkg; Pkg.develop(path=\"src/ReinforcementLearningBase\"); Pkg.develop(path=\"src/ReinforcementLearningEnvironments\"); Pkg.develop(path=\"src/ReinforcementLearningCore\"); Pkg.develop(path=\"src/ReinforcementLearningZoo\");'"
+  "updateContentCommand": "julia -e 'using Pkg; Pkg.develop(path=\"src/ReinforcementLearningBase\"); Pkg.develop(path=\"src/ReinforcementLearningEnvironments\"); Pkg.develop(path=\"src/ReinforcementLearningCore\"); Pkg.develop(path=\"src/ReinforcementLearningFarm\"); Pkg.develop(path=\"src/ReinforcementLearning\");'"
 }
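
For readability, the new `updateContentCommand` one-liner is equivalent to the Julia sketch below. This is illustrative only: the devcontainer runs the quoted string above, and the list of paths is simply copied from it.

```julia
# Readable equivalent of the updateContentCommand one-liner above.
# Run from the repository root; this script is not part of the commit itself.
using Pkg

for path in (
    "src/ReinforcementLearningBase",
    "src/ReinforcementLearningEnvironments",
    "src/ReinforcementLearningCore",
    "src/ReinforcementLearningFarm",
    "src/ReinforcementLearning",
)
    Pkg.develop(path = path)
end
```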

NEWS.md (+6 -245)
@@ -1,252 +1,13 @@
 # ReinforcementLearning.jl Release Notes

-## ReinforcementLearning.jl@v0.10.2
+#### v0.11.0

-- Pin sub-packages to pre-refactor versions
-- Agent calls now accept keyword arguments that will be passed to the policy. E.g. if the policy accepts a testmode.
-
-### ReinforcementLearningExperiments.jl
-
-#### v0.3
-
-- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments
-
-#### v0.2
-
-- Drop `ReinforcementLearning.jl` from dependencies, use `ReinforcementLearningCore.jl` instead
-
-#### v0.1.4
-
-- Support `device_rng` in SAC [#606](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/606)
-
-#### v0.1.3
-
-- Test experiments on GPU by default [#549](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/549)
-
-#### v0.1.2
-
-- Added an experiment for DQN training on discrete `PendulumEnv` (#537)
-
-### ReinforcementLearningEnvironments.jl
-
-#### v0.8
-
-- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments
-
-#### v0.7.2
-
-- Reduce allocations, improve performance of `RandomWalk1D`
-- Add tests to `RandomWalk1D`
-- Chase down JET.jl errors, fix
-- Update `TicTacToeEnv` and `RockPaperScissorsEnv` to support new `MultiAgentPolicy` setup
-
-#### v0.6.12
-
-- Bugfix bug with `is_discrete_space` [#566](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/issues/566)
-
-#### v0.6.11
-
-- Bugfix of CartPoleEnv with keyword arguments
-
-#### v0.6.10
-
-- Bugfix of CartPoleEnv with Float32
-
-#### v0.6.9
-
-- Added a continuous option for CartPoleEnv [#543](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/543).
-
-#### v0.6.8
-
-- Support `action_space(::TicTacToeEnv, player)`.
-
-#### v0.6.7
-
-- Fixed bugs in plotting `MountainCarEnv` (#537)
-- Implemented plotting for `PendulumEnv` (#537)
-
-#### v0.6.6
-
-- Bugfix with `ZeroTo` [#534](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/534)
-
-#### v0.6.4
-
-- Add `GraphShortestPathEnv`. [#445](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/445)
-
-#### v0.6.3
-
-- Add `StockTradingEnv` from the paper [Deep Reinforcement Learning for
-  Automated Stock Trading: An Ensemble
-  Strategy](https://github.com/AI4Finance-Foundation/FinRL-Trading).
-  This environment is a good testbed for multi-continuous action space
-  algorithms. [#428](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/428)
-
-#### v0.6.2
-
-- Add `SequentialEnv` environment wrapper to turn a simultaneous environment
-  into a sequential one.
-
-#### v0.6.1
-
-- Drop GR in RLEnvs and lazily load ploting functions.[#309](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/309), [#310](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/310)
-
-#### v0.6.0
-
-- Set `AcrobotEnv` into lazy loading to reduce the dependency of `OrdinaryDiffEq`.
-
-### ReinforcementLearningZoo.jl
-
-#### v0.7.0
-
-- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments
-- Reduce excess `TDLearner` allocations by using Tuple instead of Array
-
-#### v0.4.1
-
-- Make keyword argument `n_actions` in `TabularPolicy` optional. [#300](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/300)
-
-#### v0.6.0
-
-- Extensive refactor based on RLBase.jl `v0.11`, most components not **yet** ported
-
-#### v0.5.11
-
-- Fix multi-dimension action space in TD3. [#624](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/issues/624)
-
-#### v0.5.10
-
-- Support `device_rng` in SAC [#606](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/606)
-
-#### v0.5.7
+- Complete major refactor, API consistency improvements and incorporate ReinforcementLearningTrajectories.jl

-- Fix warning about `vararg.data` in [email protected] [#560](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/560)
+#### v0.10.2

-#### v0.5.6
-
-- Make BC GPU compatible [#553](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/553)
-
-#### v0.5.5
-
-- Make most algorithms GPU compatible [#549](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/549)
-
-#### v0.5.4
-
-- Support `length` method for `VectorWSARTTrajectory`.
-
-#### v0.5.3
-
-- Revert part of the unexpected change of PPO in the last PR.
-
-#### v0.5.2
-
-- Fixed the bug with MaskedPPOTrajectory reported [here](https://discourse.julialang.org/t/using-ppopolicy-with-custom-environment-with-action-masking-in-reinforcementlearning-jl/69625/6)
-
-#### v0.5.0
-
-- Update the complete SAC implementation and modify some details based on the
-  original paper. [#365](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/365)
-- Add some extra keyword parameters for `BehaviorCloningPolicy` to use it
-  online. [#390](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/390)
-
-#### v0.4.0
-
-- Moved all the experiments into a new package `ReinforcementLearningExperiments.jl`. The related dependencies are also removed (`BSON.jl`, `StableRNGs.jl`, `TensorBoardLogger.jl`).
-
-### ReinforcementLearningDatasets.jl
-
-#### v0.1.0
-
-- Add functionality for fetching d4rl datasets as an iterable DataSet. Credits: https://arxiv.org/abs/2004.07219
-- This supports d4rl and d4rl-pybullet and Google Research DQN atari datasets.
-- Uses DataDeps for data dependency management.
-- This package also supports RL Unplugged Datasets.
-- Support for [google-research/deep_ope](https://github.com/google-research/deep_ope) added.
-
-
-
-### ReinforcementLearningBase.jl
-
-#### v0.12.0
-
-- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments
-
-#### v0.9.7
-
-- Update POMDPModelTools -> POMDPTools
-- Add `next_player!` method to support `Sequential` `MultiAgent` environments
-
-#### v0.9.6
-
-- Implement `Base.:(==)` for `Space`. [#428](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/428)
-
-#### v0.9.5
-
-- Add default `Base.:(==)` and `Base.hash` method for `AbstractEnv`. [#348](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/348)
-
-### ReinforcementLearningCore.jl
-
-#### v0.10.1
-
-- Fix hook issue with 'extra' call; always run `push!` at end of episode, regardless of whether stopped or terminated
-
-#### v0.10.0
-
-- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments
-
-#### v0.9.3
-
-- Add back multi-agent support with `MultiAgentPolicy` and `MultiAgentHook`
-
-#### v0.9.2
-
-- Use correct Flux.stack function signature
-- Reduce allocations, improve performance of `RandomPolicy`
-- Chase down JET.jl errors, fix
-- Add tests for `StopAfterStep`, `StopAfterEpisode`
-- Add tests, improve performance of `RewardsPerEpisode`
-- Refactor `Agent` for speedup
-
-#### v0.8.11
-
-- When sending a `CircularArrayBuffer` to GPU devices, convert `CircularArrayBuffer` into `CuArray` instead of the adapted `CircularArrayBuffer` of `CuArray`. [#606](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/606)
-
-#### v0.8.10
-
-- Update dependency of `CircularArrayBuffers` to `v0.1.9`. [#602](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/602)
-- Add `CovGaussianNetwork`. [#597](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/597)
-#### v0.8.8
-
-- Fix warning about `vararg.data` in [email protected] [#560](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/560)
-
-#### v0.8.7
-
-- Make `GaussianNetwork` differentiable. [#549](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/549)
-
-#### v0.8.6
-
-- Fixed a bug [1] with the `DoOnExit` hook (#537)
-- Added some convenience hooks for rendering rollout episodes (#537)
-
-#### v0.8.5
-
-- Fixed the method overwritten warning of `device` from `CUDA.jl`.
-
-#### v0.8.3
-
-- Add extra two optional keyword arguments (`min_σ` and `max_σ`) in
-  `GaussianNetwork` to clip the output of `logσ`. [#428](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/428)
-
-#### v0.8.2
-
-- Add GaussianNetwork and DuelingNetwork into ReinforcementLearningCore.jl as general components. [#370](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/370)
-- Export `WeightedSoftmaxExplorer`.
-  [#382](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/382)
-
-#### v0.8.1
-
-- Minor bug & typo fixes
+- Pin sub-packages to pre-refactor versions
+- Agent calls now accept keyword arguments that will be passed to the policy. E.g. if the policy accepts a testmode.

-#### v0.8.0

-- Removed `ResizeImage` preprocessor to reduce the dependency of `ImageTransformations`.
-- Show unicode plot at the end of an experiment in the `TotalRewardPerEpisode` hook.
+#### v0.9.0
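
The v0.10.2 note about keyword arguments being forwarded from agent calls to the policy can be pictured with a small, self-contained sketch. The names `MyPolicy`, `MyAgent` and `select_action` below are hypothetical and only stand in for the real RL.jl call chain.

```julia
# Illustrative only: hypothetical types showing how keyword arguments
# passed to an agent call can be forwarded untouched to its policy.
struct MyPolicy end

# The policy changes behaviour when `testmode` is set.
select_action(::MyPolicy; testmode = false) = testmode ? :greedy : :random

struct MyAgent
    policy::MyPolicy
end

# The agent forwards any keyword arguments straight to its policy.
select_action(agent::MyAgent; kwargs...) = select_action(agent.policy; kwargs...)

agent = MyAgent(MyPolicy())
select_action(agent)                   # :random
select_action(agent; testmode = true)  # :greedy
```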

src/ReinforcementLearningBase/NEWS.md (+18, new file)
@@ -0,0 +1,18 @@
+### ReinforcementLearningBase.jl Release Notes
+
+#### v0.12.0
+
+- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments
+
+#### v0.9.7
+
+- Update POMDPModelTools -> POMDPTools
+- Add `next_player!` method to support `Sequential` `MultiAgent` environments
+
+#### v0.9.6
+
+- Implement `Base.:(==)` for `Space`. [#428](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/428)
+
+#### v0.9.5
+
+- Add default `Base.:(==)` and `Base.hash` method for `AbstractEnv`. [#348](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/348)

src/ReinforcementLearningCore/NEWS.md (+74, new file)
@@ -0,0 +1,74 @@
+#### v0.15.1
+
+- Fix MultiPlayer hook bugs
+- Clarify that the correct `push!` syntax is `push!(hook, stage, policy, env)` or `push!(hook, stage, policy, env, player)`; `push!(hook)` or other permutations now error as not implemented.
+
+#### v0.15.0
+
+- First version released with ReinforcementLearning v0.11
+
+#### v0.10.1
+
+- Fix hook issue with 'extra' call; always run `push!` at end of episode, regardless of whether stopped or terminated
+
+#### v0.10.0
+
+- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments
+
+#### v0.9.3
+
+- Add back multi-agent support with `MultiAgentPolicy` and `MultiAgentHook`
+
+#### v0.9.2
+
+- Use correct Flux.stack function signature
+- Reduce allocations, improve performance of `RandomPolicy`
+- Chase down JET.jl errors, fix
+- Add tests for `StopAfterStep`, `StopAfterEpisode`
+- Add tests, improve performance of `RewardsPerEpisode`
+- Refactor `Agent` for speedup
+
+#### v0.8.11
+
+- When sending a `CircularArrayBuffer` to GPU devices, convert `CircularArrayBuffer` into `CuArray` instead of the adapted `CircularArrayBuffer` of `CuArray`. [#606](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/606)
+
+#### v0.8.10
+
+- Update dependency of `CircularArrayBuffers` to `v0.1.9`. [#602](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/602)
+- Add `CovGaussianNetwork`. [#597](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/597)
+#### v0.8.8
+
+- Fix warning about `vararg.data` in [email protected] [#560](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/560)
+
+#### v0.8.7
+
+- Make `GaussianNetwork` differentiable. [#549](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/549)
+
+#### v0.8.6
+
+- Fixed a bug [1] with the `DoOnExit` hook (#537)
+- Added some convenience hooks for rendering rollout episodes (#537)
+
+#### v0.8.5
+
+- Fixed the method overwritten warning of `device` from `CUDA.jl`.
+
+#### v0.8.3
+
+- Add extra two optional keyword arguments (`min_σ` and `max_σ`) in
+  `GaussianNetwork` to clip the output of `logσ`. [#428](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/428)
+
+#### v0.8.2
+
+- Add GaussianNetwork and DuelingNetwork into ReinforcementLearningCore.jl as general components. [#370](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/370)
+- Export `WeightedSoftmaxExplorer`.
+  [#382](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/382)
+
+#### v0.8.1
+
+- Minor bug & typo fixes
+
+#### v0.8.0
+
+- Removed `ResizeImage` preprocessor to reduce the dependency of `ImageTransformations`.
+- Show unicode plot at the end of an experiment in the `TotalRewardPerEpisode` hook.
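
The v0.15.1 entry above pins down the hook `push!` signatures for the multiplayer case. Below is a minimal sketch of a custom hook implementing both accepted forms; the `EpisodeCounter` type and `FakeStage` marker are made up for illustration (RLCore defines its own stage types), so this is not the library's code.

```julia
# Illustrative only: a custom hook implementing the two push! signatures
# named in the v0.15.1 note. FakeStage stands in for an RLCore stage marker.
struct FakeStage end

struct EpisodeCounter
    counts::Dict{Any,Int}
end
EpisodeCounter() = EpisodeCounter(Dict{Any,Int}())

# Single-agent form: push!(hook, stage, policy, env)
function Base.push!(h::EpisodeCounter, ::FakeStage, policy, env)
    h.counts[:single] = get(h.counts, :single, 0) + 1
    return h
end

# Multiplayer form: push!(hook, stage, policy, env, player)
function Base.push!(h::EpisodeCounter, ::FakeStage, policy, env, player)
    h.counts[player] = get(h.counts, player, 0) + 1
    return h
end

hook = EpisodeCounter()
push!(hook, FakeStage(), nothing, nothing)             # counts[:single] == 1
push!(hook, FakeStage(), nothing, nothing, :player_1)  # counts[:player_1] == 1
```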

src/ReinforcementLearningCore/Project.toml (+1 -1)
@@ -1,6 +1,6 @@
 name = "ReinforcementLearningCore"
 uuid = "de1b191a-4ae0-4afa-a27b-92d07f46b2d6"
-version = "0.15.0"
+version = "0.15.1"

 [deps]
 AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c"
