# ReinforcementLearning.jl Release Notes

#### v0.11.0

- Complete a major refactor, improve API consistency, and incorporate ReinforcementLearningTrajectories.jl

#### v0.10.2

- Pin sub-packages to pre-refactor versions
- Agent calls now accept keyword arguments that are passed on to the policy, e.g. when the policy accepts a `testmode` keyword (see the sketch below).
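
The snippet below is a minimal sketch of this forwarding behaviour; `ToyAgent`, `ToyPolicy`, and the `testmode` keyword are illustrative stand-ins, not the library's actual types.

```julia
# Illustrative only: an "agent" that forwards keyword arguments to its policy.
struct ToyPolicy end

# The wrapped policy understands a `testmode` keyword.
(p::ToyPolicy)(observation; testmode = false) =
    testmode ? :greedy_action : :exploratory_action

struct ToyAgent{P}
    policy::P
end

# Keyword arguments passed to the agent call go straight through to the policy.
(a::ToyAgent)(observation; kwargs...) = a.policy(observation; kwargs...)

agent = ToyAgent(ToyPolicy())
agent(nothing)                   # => :exploratory_action
agent(nothing; testmode = true)  # => :greedy_action
```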

#### v0.9.0

### ReinforcementLearningExperiments.jl

#### v0.3

- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments (see the sketch below)
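
As a rough illustration of the new syntax, the sketch below assumes the post-refactor packages are installed and uses `RandomWalk1D` and `RandomPolicy` purely as examples; the exact exports may differ.

```julia
using ReinforcementLearning

env    = RandomWalk1D()   # example environment
policy = RandomPolicy()   # example policy

# Pre-refactor, policies, environments and hooks were callable objects,
# e.g. `action = policy(env)` and `env(action)`. Post-refactor the calls
# are explicit verbs:
action = RLBase.plan!(policy, env)   # ask the policy for an action
RLBase.act!(env, action)             # apply the action to the environment

# Hooks are now driven with push! at each stage, and approximators are
# queried with RLCore.forward(approximator, state) instead of being called.
hook = TotalRewardPerEpisode()
push!(hook, PostActStage(), policy, env)
```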

#### v0.2

- Drop `ReinforcementLearning.jl` from the dependencies and use `ReinforcementLearningCore.jl` instead

#### v0.1.4

- Support `device_rng` in SAC [#606](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/606)

#### v0.1.3

- Test experiments on GPU by default [#549](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/549)

#### v0.1.2

- Added an experiment for DQN training on a discrete `PendulumEnv` (#537)

### ReinforcementLearningEnvironments.jl

#### v0.8

- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments

#### v0.7.2

- Reduce allocations and improve the performance of `RandomWalk1D`
- Add tests for `RandomWalk1D`
- Chase down and fix JET.jl errors
- Update `TicTacToeEnv` and `RockPaperScissorsEnv` to support the new `MultiAgentPolicy` setup

#### v0.6.12

- Fix a bug with `is_discrete_space` [#566](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/issues/566)

#### v0.6.11

- Fix a bug in `CartPoleEnv` when using keyword arguments

#### v0.6.10

- Fix a bug in `CartPoleEnv` with `Float32`

#### v0.6.9

- Added a continuous-action option for `CartPoleEnv` (see the sketch below). [#543](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/543)
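
A sketch of how the option might be used; the `continuous` keyword name is an assumption based on the entry above, so check the `CartPoleEnv` docstring for the exact interface.

```julia
using ReinforcementLearningEnvironments
using ReinforcementLearningBase

# Assumed keyword (illustrative): request the continuous-action variant.
env = CartPoleEnv(; continuous = true)

# The action space should now be a continuous interval
# rather than a small discrete set.
action_space(env)
```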

#### v0.6.8

- Support `action_space(::TicTacToeEnv, player)` (see the sketch below).
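
A small usage sketch of the per-player method; the player identifier is taken from `players(env)` rather than hard-coded, since its exact form has changed across versions.

```julia
using ReinforcementLearningEnvironments
using ReinforcementLearningBase

env = TicTacToeEnv()
p   = first(players(env))   # whichever player identifier the env uses

# Actions available from that player's perspective.
action_space(env, p)
```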

#### v0.6.7

- Fixed bugs in plotting `MountainCarEnv` (#537)
- Implemented plotting for `PendulumEnv` (#537)

#### v0.6.6

- Fix a bug with `ZeroTo` [#534](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/534)

#### v0.6.4

- Add `GraphShortestPathEnv`. [#445](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/445)

#### v0.6.3

- Add `StockTradingEnv` from the paper [Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy](https://github.com/AI4Finance-Foundation/FinRL-Trading). This environment is a good testbed for algorithms with multi-dimensional continuous action spaces. [#428](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/428)

#### v0.6.2

- Add the `SequentialEnv` environment wrapper to turn a simultaneous environment into a sequential one (see the sketch below).
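
For example, wrapping the simultaneous `RockPaperScissorsEnv`; this is a sketch, and the one-argument constructor shown is the obvious form implied by the entry above.

```julia
using ReinforcementLearningEnvironments

# RockPaperScissorsEnv is a simultaneous environment: both players act at once.
sim_env = RockPaperScissorsEnv()

# Wrapping it presents the players' turns one at a time to the rest of the stack.
seq_env = SequentialEnv(sim_env)
```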

#### v0.6.1

- Drop GR in RLEnvs and lazily load plotting functions. [#309](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/309), [#310](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/310)

#### v0.6.0

- Make `AcrobotEnv` lazily loaded to avoid a hard dependency on `OrdinaryDiffEq`.

### ReinforcementLearningZoo.jl

#### v0.7.0

- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments
- Reduce excess `TDLearner` allocations by using Tuple instead of Array

#### v0.6.0

- Extensive refactor based on RLBase.jl `v0.11`; most components are not **yet** ported

#### v0.5.11

- Fix multi-dimensional action spaces in TD3. [#624](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/issues/624)

#### v0.5.10

- Support `device_rng` in SAC [#606](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/606)

#### v0.5.7

- Fix warning about `vararg.data` in [email protected] [#560](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/560)

#### v0.5.6

- Make BC (behavior cloning) GPU compatible [#553](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/553)

#### v0.5.5

- Make most algorithms GPU compatible [#549](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/549)

#### v0.5.4

- Support the `length` method for `VectorWSARTTrajectory`.

#### v0.5.3

- Revert part of an unintended change to PPO from the previous PR.

#### v0.5.2

- Fixed the bug with `MaskedPPOTrajectory` reported [here](https://discourse.julialang.org/t/using-ppopolicy-with-custom-environment-with-action-masking-in-reinforcementlearning-jl/69625/6)

#### v0.5.0

- Update the complete SAC implementation and modify some details based on the original paper. [#365](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/365)
- Add some extra keyword parameters for `BehaviorCloningPolicy` so it can be used online. [#390](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/390)

#### v0.4.1

- Make the keyword argument `n_actions` in `TabularPolicy` optional. [#300](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/300)

#### v0.4.0

- Moved all the experiments into a new package, `ReinforcementLearningExperiments.jl`. The related dependencies were also removed (`BSON.jl`, `StableRNGs.jl`, `TensorBoardLogger.jl`).

### ReinforcementLearningDatasets.jl

#### v0.1.0

- Add functionality for fetching d4rl datasets as an iterable dataset (see the sketch after this list). Credits: https://arxiv.org/abs/2004.07219
- Supports the d4rl, d4rl-pybullet, and Google Research DQN Atari datasets.
- Uses DataDeps for data-dependency management.
- Also supports RL Unplugged datasets.
- Support for [google-research/deep_ope](https://github.com/google-research/deep_ope) added.
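
A usage sketch under the assumption that the package exposes a `dataset` entry point and that the d4rl task name below is available; both are illustrative, and the first call downloads the data through DataDeps.

```julia
using ReinforcementLearningDatasets

# Illustrative dataset name and keyword; the first call triggers a DataDeps download.
ds = dataset("hopper-medium-v0"; repo = "d4rl")

# The returned object is iterable; each element is a batch of logged transitions.
batch = first(ds)
```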

### ReinforcementLearningBase.jl

#### v0.12.0

- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments

#### v0.9.7

- Update POMDPModelTools -> POMDPTools
- Add the `next_player!` method to support `Sequential` `MultiAgent` environments

#### v0.9.6

- Implement `Base.:(==)` for `Space`. [#428](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/428)

#### v0.9.5

- Add default `Base.:(==)` and `Base.hash` methods for `AbstractEnv`. [#348](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/348)

### ReinforcementLearningCore.jl

#### v0.10.1

- Fix a hook issue with an extra call: always run `push!` at the end of an episode, regardless of whether the run was stopped or the episode terminated

#### v0.10.0

- Transition to `RLCore.forward`, `RLBase.act!`, `RLBase.plan!` and `Base.push!` syntax instead of functional objects for hooks, policies and environments

#### v0.9.3

- Add back multi-agent support with `MultiAgentPolicy` and `MultiAgentHook`

#### v0.9.2

- Use the correct `Flux.stack` function signature
- Reduce allocations and improve the performance of `RandomPolicy`
- Chase down and fix JET.jl errors
- Add tests for `StopAfterStep` and `StopAfterEpisode` (see the sketch after this list)
- Add tests for `RewardsPerEpisode` and improve its performance
- Refactor `Agent` for a speedup
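
These stop conditions are typically used through the `run` entry point; a minimal sketch, with `RandomPolicy` and `RandomWalk1D` as stand-in policy and environment.

```julia
using ReinforcementLearning

# Run a random policy for ten episodes and record the return of each episode.
hook = TotalRewardPerEpisode()
run(RandomPolicy(), RandomWalk1D(), StopAfterEpisode(10), hook)

hook.rewards   # vector of per-episode returns
```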

#### v0.8.11

- When sending a `CircularArrayBuffer` to GPU devices, convert the `CircularArrayBuffer` into a `CuArray` instead of an adapted `CircularArrayBuffer` of `CuArray`. [#606](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/606)

#### v0.8.10

- Update the dependency on `CircularArrayBuffers` to `v0.1.9`. [#602](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/602)
- Add `CovGaussianNetwork`. [#597](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/597)

#### v0.8.8

- Fix warning about `vararg.data` in [email protected] [#560](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/560)

#### v0.8.7

- Make `GaussianNetwork` differentiable. [#549](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/549)

#### v0.8.6

- Fixed a bug with the `DoOnExit` hook (#537)
- Added some convenience hooks for rendering rollout episodes (#537)

#### v0.8.5

- Fixed the method-overwrite warning for `device` from `CUDA.jl`.

#### v0.8.3

- Add two extra optional keyword arguments (`min_σ` and `max_σ`) to `GaussianNetwork` to clip the output of `logσ` (see the sketch below). [#428](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/428)
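
A construction sketch; the layer sizes are arbitrary, and the keyword names follow the entry above (check the `GaussianNetwork` docstring for the exact forward-pass signature).

```julia
using Flux
using ReinforcementLearningCore

# A 4-dimensional state and 2-dimensional action; sizes are illustrative.
gn = GaussianNetwork(
    pre   = Dense(4, 64, relu),
    μ     = Dense(64, 2),
    logσ  = Dense(64, 2),
    min_σ = 1f-5,   # lower clip for σ
    max_σ = 2f0,    # upper clip for σ
)

state = rand(Float32, 4, 1)   # one state as a column
μ, logσ = gn(state)           # logσ is clipped to [log(min_σ), log(max_σ)]
```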

#### v0.8.2

- Add `GaussianNetwork` and `DuelingNetwork` to ReinforcementLearningCore.jl as general components. [#370](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/370)
- Export `WeightedSoftmaxExplorer` (see the sketch below). [#382](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/pull/382)
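
In this (pre-refactor) release explorers are callable objects that map a vector of action values to a sampled action index; a sketch, assuming the default constructor.

```julia
using ReinforcementLearningCore

explorer = WeightedSoftmaxExplorer()

# Sample an action index with probability proportional to softmax of the values.
values = [1.0, 2.0, 5.0]
action = explorer(values)   # most often 3, but stochastic
```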

#### v0.8.1

- Minor bug & typo fixes

#### v0.8.0

- Removed the `ResizeImage` preprocessor to drop the dependency on `ImageTransformations`.
- Show a Unicode plot at the end of an experiment in the `TotalRewardPerEpisode` hook.