Commit 164f735

GSoC: Integration of Agents.jl with RL methods (#1170)

Authored by bergio13, ameligrana, and Datseris
Squashed commit history:

- Create .gitkeep
- Add files via upload (×2)
- Delete examples/rl/.gitkeep
- Implement `ReinforcementLearningABM` and integrate with the POMDPs/Crux interface for RL:
  - Reorganized old interface examples.
  - Added `ReinforcementLearningABM` and `RLEnvironmentWrapper` to enable compatibility with POMDPs-based RL algorithms provided by Crux.
  - Implemented the necessary POMDPs functions: `actions`, `observations`, `observation`, `initialstate`, `initialobs`, `gen`, `isterminal`, `discount`, and `state_space`.
  - Added `step_rl!` and `rl_agent_step!` functions for RL agent behavior.
  - Added examples to show how the new model type works.
- ignore log
- fix wolfsheep
- fix indexing, stepping with policies and training config
- fix wolfsheep
- add scheduler (×2)
- add discount rates to rl config
- add plot + start refactoring as extension
- refactor in extension and fix plotting
- Use less restrictive versions of libs
- remove piracy
- add documentation for rl functions + refactor rl code
- fix (×3)
- fix docstring
- fix exports
- fix interface observation function
- delete old files and rename examples
- delete old files
- add tests for rlabm
- create tutorial + fix training wolfsheep + add deps for tests
- fix tests
- fix makie version
- Update Project.toml (×2)
- remove debug prints
- edit api docs
- edit docs
- Update Agents.jl
- Update rl_boltzmann.jl
- fix
- improve boltzmann tutorial + fixes
- refactor observation_radius + fixes
- fix example
- fix tutorial
- add tests for extension + improve docs for RLABM
- fix tests
- Update examples/rl_boltzmann.jl (×6)
- update example
- Update Project.toml
- fix stepping
- Update rl_boltzmann.jl (×3)
- Update ext/AgentsVisualizations/src/interaction.jl (Co-authored-by: George Datseris <datseris.george@gmail.com>)
- Refactor RL training functions and update documentation:
  - Simplified the `train_model!` function signature to read agent types from the model's RL configuration.
  - Set default `max_steps` to 100 in `setup_rl_training` and related functions.
  - Updated observation and reward functions to accept agent objects instead of IDs.
  - Removed redundant arguments and examples from documentation.
  - Improved test cases to reflect changes in function signatures and configurations.
  - Cleaned up unused code and comments in RL utility functions.
- Update Project.toml (×2)
- Refactor RL configuration to use an `RLConfig` struct and update related tests
- Refactor the `RLConfig` constructor to improve parameter organization and readability
- add changelog and cross-reference the RLABM struct
- add note about `using Crux`
- more cross linkage and stating the obvious
- reverse order of arguments in the observation function: (agent, model)
- show videos in the RL example online
- remove colorschemes dependency
- Add some info on what the observation function is and why it is important
- clarify reward docs
- add new version for model init docs
- clarify RL library extension
- remove uploaded figures

Co-authored-by: Adriano Meligrana <68152031+Tortar@users.noreply.github.com>
Co-authored-by: George Datseris <datseris.george@gmail.com>
Parent commit: 2dcc3da

23 files changed: +3795 additions, −58 deletions

.gitignore

Lines changed: 3 additions & 1 deletion

@@ -17,4 +17,6 @@ test/adata.arrow
 test/mdata.arrow
 *.csv
 *.arrow
-tutorial.md
+tutorial.md
+log
+examples/rl/log

CHANGELOG.md

Lines changed: 8 additions & 1 deletion

@@ -1,6 +1,13 @@
 # v6.3
 
-**Experimental**: The `StructVector` container type, that supports a Struct-of-Arrays (SoA) layout, has been added. It currently only works with single agent types.
+This new version brings two powerful, but both **experimental** features:
+
+1. Allowing the agent container to be based on `StructVectors`, using a Struct-of-Arrays internal layout. Currently it only supports single agent types. To use this, pass `container = StructVector` to `StandardABM` or `EventQueueABM` constructors, as well as use the helper construct `SoAType{A}` for dispatch purposes instead of your agent type `A` (read the docstring of `SoAType`).
+2. Native integration of ABMs with Reinforcement Learning is now provided by the new model type `ReinforcementLearningABM`. To learn how to use this functionality checkout the new tutorial on the Boltzmann Wealth Model with Reinforcement Learning.
+
+These two features are labelled experimental because they did not yet undergo extensive testing by a broader pool of users. As versions progress and bug reports come in, and get solved, the features will mature to fully stable.
+
+The experimental status allows us as developers to make breaking changes, if need be, to either feature, in order to fully stabilize them.
 
 # v6.2
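The first changelog feature can be sketched in code. This is a minimal, hypothetical example: the agent type, its field, and the stepping function are invented for illustration, and the sketch assumes the `container = StructVector` keyword and `SoAType{A}` dispatch helper work exactly as the changelog entry describes.

```julia
using Agents, StructArrays

# Hypothetical minimal agent type (name and field are illustrative only).
@agent struct Wealthy(NoSpaceAgent)
    wealth::Int
end

# The changelog says to dispatch on `SoAType{A}` instead of `A` under SoA layout:
agent_step!(agent::SoAType{Wealthy}, model) = (agent.wealth += 1)

# Pass `container = StructVector` to opt into the Struct-of-Arrays layout:
model = StandardABM(Wealthy; agent_step!, container = StructVector)
```

Internally a `StructVector` stores one array per field (here, a single `Vector{Int}` of wealths) rather than an array of structs, which is what makes the layout cache-friendly for large populations.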

Project.toml

Lines changed: 8 additions & 1 deletion

@@ -30,25 +30,31 @@ StructArrays = "09ab397b-f2b6-538f-b94a-2f83cf4a842a"
 
 [weakdeps]
 Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45"
+Crux = "e51cc422-768a-4345-bb8e-2246287ae729"
+Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
 GraphMakie = "1ecd5474-83a3-4783-bb4f-06765db800d2"
 Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
 OSMMakie = "76b6901f-8821-46bb-9129-841bc9cfe677"
+POMDPs = "a93abf59-7444-517b-a68a-c42f96afdd7d"
 
 [extensions]
 AgentsArrow = "Arrow"
 AgentsGraphVisualizations = ["Makie", "GraphMakie"]
 AgentsOSMVisualizations = ["Makie", "OSMMakie"]
 AgentsVisualizations = "Makie"
+AgentsRL = ["Crux", "POMDPs", "Flux"]
 
 [compat]
 Arrow = "2"
 CSV = "0.9.7, 0.10"
 CommonSolve = "0.2.4"
+Crux = "0.1"
 DataFrames = "0.21, 0.22, 1"
 DataStructures = "0.18, 0.19"
 Distributed = "1"
 Distributions = "0.25"
 Downloads = "1"
+Flux = "0.14, 0.15, 0.16"
 GraphMakie = "0.5, 0.6"
 Graphs = "1.4"
 JLD2 = "0.4, 0.5, 0.6"
@@ -57,8 +63,9 @@ LightOSM = "0.2, 0.3"
 LightSumTypes = "5"
 LinearAlgebra = "1"
 MacroTools = "0.5"
-Makie = "0.20, 0.21, 0.22, 0.24"
+Makie = "0.20, 0.21, 0.22, 0.23, 0.24"
 OSMMakie = "0.0, 0.1"
+POMDPs = "0.9, 1"
 PrecompileTools = "1"
 ProgressMeter = "1.5"
 Random = "1"
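The `[weakdeps]`/`[extensions]` entries above use Julia's package-extension mechanism (Julia ≥ 1.9): the `AgentsRL` extension module is compiled and loaded only when Crux, POMDPs, and Flux are all loaded alongside Agents, so ordinary users pay no RL dependency cost. A small sketch of how this behaves at the REPL:

```julia
using Agents
# At this point the RL code is absent:
Base.get_extension(Agents, :AgentsRL)  # returns `nothing`

# Loading all three weak dependencies triggers the extension automatically:
using Crux, POMDPs, Flux
Base.get_extension(Agents, :AgentsRL)  # now returns the extension module
```

`Base.get_extension` is standard Julia; the extension name `:AgentsRL` comes from the `[extensions]` table in this diff.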

README.md

Lines changed: 3 additions & 2 deletions

@@ -10,12 +10,13 @@
 Agents.jl is a pure [Julia](https://julialang.org/) framework for agent-based modeling (ABM): a computational simulation methodology where autonomous agents react to their environment (including other agents) given a predefined set of rules.
 Some major highlights of Agents.jl are:
 
-1. It is fast (faster than MASON, NetLogo, or Mesa)
+1. It is fast (typically [faster than established competitors](https://github.com/JuliaDynamics/ABMFrameworksComparison))
 2. It is simple: has a very short learning curve and requires writing minimal code
 3. Has an extensive interface of thousands of out-of-the box possible agent actions
 4. Straightforwardly allows simulations on Open Street Maps
 5. Allows both traditional discrete-time ABM simulations as well as continuous time
-   "event queue based" ABM simulations.
+   "event queue" based ABM simulations.
+6. Provides native and unrestricted integration of ABM simulations with Reinforcement Learning (RL).
 
 More information and an extensive list of features can be found in the documentation, which you can either find [online](https://juliadynamics.github.io/Agents.jl/stable/) or build locally by running the `docs/make.jl` file.

docs/Project.toml

Lines changed: 3 additions & 0 deletions

@@ -6,6 +6,7 @@ BlackBoxOptim = "a134a8b2-14d6-55f6-9291-3336d3ab0209"
 CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
 CellListMap = "69e1c6dd-3888-40e6-b3c8-31ac5f578864"
 ColorTypes = "3da002f7-5984-5a60-b8a6-cbb66c0b333f"
+Crux = "e51cc422-768a-4345-bb8e-2246287ae729"
 DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 DelaunayTriangulation = "927a84f5-c5f4-47a5-9785-b46e178433df"
 DiffEqCallbacks = "459566f4-90b8-5000-8ac3-15dfb0a30def"
@@ -15,6 +16,7 @@ DocumenterTools = "35a29f4d-8980-5a13-9543-d66fff28ecb8"
 DrWatson = "634d3b9d-ee7a-5ddf-bec9-22491ea816e1"
 LightSumTypes = "f56206fc-af4c-5561-a72a-43fe2ca5a923"
 FileIO = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
+Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
 GLMakie = "e9467ef8-e4e7-5192-8a1a-b1aee30e663a"
 GraphMakie = "1ecd5474-83a3-4783-bb4f-06765db800d2"
 GraphRecipes = "bd48cda9-67a9-57be-86fa-5b3c104eda73"
@@ -28,6 +30,7 @@ MonteCarloMeasurements = "0987c9cc-fe09-11e8-30f0-b96dd679fdca"
 OSMMakie = "76b6901f-8821-46bb-9129-841bc9cfe677"
 OrdinaryDiffEq = "1dea7af3-3e70-54e6-95c3-0bf5283fa5ed"
 Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
+POMDPs = "a93abf59-7444-517b-a68a-c42f96afdd7d"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 RecipesBase = "3cdcf5f2-1ef4-517c-9805-6587b60abb01"
 SimpleWeightedGraphs = "47aef6b3-ad0c-573a-a1e2-d07658019622"

docs/build_docs_with_style.jl

Lines changed: 1 addition & 1 deletion

@@ -52,7 +52,7 @@ function build_docs_with_style(pages, modules...;
     )
     settings = (
         modules = [modules...],
-        format = Documenter.HTML(
+        format = Documenter.HTML(;
             prettyurls = CI,
             assets = [
                 asset("https://fonts.googleapis.com/css?family=Montserrat|Source+Code+Pro&display=swap", class=:css),

docs/make.jl

Lines changed: 2 additions & 0 deletions

@@ -3,6 +3,7 @@ println("Loading packages...")
 using Agents
 using LightOSM
 using CairoMakie
+using Crux # precompile it to avoid doc messages
 import Literate
 
 pages = [
@@ -15,6 +16,7 @@ pages = [
         "examples/predator_prey.md",
         "examples/rabbit_fox_hawk.md",
         "examples/event_rock_paper_scissors.md",
+        "examples/rl_boltzmann.md",
         "examples.md"
     ],
     "api.md",

docs/src/api.md

Lines changed: 45 additions & 4 deletions

@@ -8,26 +8,39 @@ In this page we list the remaining API functions, which constitute the bulk of A
 - [`AgentBasedModel`](@ref)
 - [`StandardABM`](@ref)
 - [`EventQueueABM`](@ref)
+- [`ReinforcementLearningABM`](@ref)
 
 ```@docs
 AgentBasedModel
 step!(::AgentBasedModel, args...)
 ```
 
-### Discrete time models
+## Discrete time ABM
 
 ```@docs
 StandardABM
 ```
 
-### Continuous time models
+## Continuous time ABM
 
 ```@docs
 EventQueueABM
 AgentEvent
 add_event!
 ```
 
+## Reinforcement learning ABM
+
+```@docs
+ReinforcementLearningABM
+set_rl_config!
+create_policy_network
+create_value_network
+train_model!
+get_trained_policies
+copy_trained_policies!
+```
+
 ## Agent types
 
 ```@docs
@@ -94,6 +107,7 @@ OpenStreetMapSpace
 ```
 
 ## Adding agents
+
 ```@docs
 add_agent!
 add_agent_own_pos!
@@ -102,6 +116,7 @@ random_position
 ```
 
 ## Moving agents
+
 ```@docs
 move_agent!
 walk!
@@ -110,6 +125,7 @@ get_direction
 ```
 
 ### Movement with paths
+
 For [`OpenStreetMapSpace`](@ref), and [`GridSpace`](@ref)/[`ContinuousSpace`](@ref) using [`Pathfinding`](@ref), a special
 movement method is available.
 
@@ -121,19 +137,22 @@ is_stationary
 ```
 
 ## Removing agents
+
 ```@docs
 remove_agent!
 remove_all!
 sample!
 ```
 
 ## Space utility functions
+
 ```@docs
 normalize_position
 spacesize
 ```
 
 ## [`DiscreteSpace` exclusives](@id DiscreteSpace_exclusives)
+
 ```@docs
 positions
 npositions
@@ -154,6 +173,7 @@ isempty(::Int, ::ABM)
 ```
 
 ## `GraphSpace` exclusives
+
 ```@docs
 add_edge!
 rem_edge!
@@ -162,6 +182,7 @@ rem_vertex!
 ```
 
 ## [`ContinuousSpace` exclusives](@id ContinuosSpace_exclusives)
+
 ```@docs
 nearest_neighbor
 get_spatial_property
@@ -174,6 +195,7 @@ manhattan_distance
 ```
 
 ## `OpenStreetMapSpace` exclusives
+
 ```@docs
 OSM
 OSM.lonlat
@@ -190,6 +212,7 @@ OSM.download_osm_network
 ```
 
 ## Nearby Agents
+
 ```@docs
 nearby_ids
 nearby_agents
@@ -205,6 +228,7 @@ Most iteration in Agents.jl is **dynamic** and **lazy**, when possible, for perf
 
 **Dynamic** means that when iterating over the result of e.g. the [`ids_in_position`](@ref) function, the iterator will be affected by actions that would alter its contents.
 Specifically, imagine the scenario
+
 ```@example docs
 using Agents
 # We don't need to make a new agent type here,
@@ -218,16 +242,20 @@ for id in ids_in_position((1, 1, 1, 1), model)
 end
 collect(allids(model))
 ```
+
 You will notice that only 1 agent was removed. This is simply because the final state of the iteration of `ids_in_position` was reached unnaturally, because the length of its output was reduced by 1 _during_ iteration.
 To avoid problems like these, you need to `collect` the iterator to have a non dynamic version.
 
 **Lazy** means that when possible the outputs of the iteration are not collected and instead are generated on the fly.
 A good example to illustrate this is [`nearby_ids`](@ref), where doing something like
+
 ```julia
 a = random_agent(model)
 sort!(nearby_ids(random_agent(model), model))
 ```
+
 leads to error, since you cannot `sort!` the returned iterator. This can be easily solved by adding a `collect` in between:
+
 ```@example docs
 a = random_agent(model)
 sort!(collect(nearby_agents(a, model)))
@@ -248,13 +276,13 @@ index_mapped_groups
 ```
 
 ## Data collection and analysis
+
 ```@docs
 run!
 ensemblerun!
 paramscan
 ```
 
-
 ### Manual data collection
 
 The central simulation function is [`run!`](@ref).
@@ -269,6 +297,7 @@ dataname
 ```
 
 For example, the core loop of `run!` is just
+
 ```julia
 df_agent = init_agent_dataframe(model, adata)
 df_model = init_model_dataframe(model, mdata)
@@ -287,6 +316,7 @@ while until(t, t0, n, model)
 end
 return df_agent, df_model
 ```
+
 (here `until` and `should_we_collect` are internal functions)
 
 ## [Schedulers](@id Schedulers)
@@ -311,15 +341,19 @@ Schedulers.ByKind
 ```
 
 ### [Advanced scheduling](@id advanced_scheduling)
+
 You can use [Function-like objects](https://docs.julialang.org/en/v1/manual/methods/#Function-like-objects) to make your scheduling possible of arbitrary events.
 For example, imagine that after the `n`-th step of your simulation you want to fundamentally change the order of agents. To achieve this you can define
+
 ```julia
 mutable struct MyScheduler
     n::Int # step number
    w::Float64
 end
 ```
+
 and then define a calling method for it like so
+
 ```julia
 function (ms::MyScheduler)(model::ABM)
     ms.n += 1 # increment internal counter by 1 each time its called
@@ -334,17 +368,20 @@ function (ms::MyScheduler)(model::ABM)
     end
 end
 ```
+
 and pass it to e.g. `step!` by initializing it
+
 ```julia
 ms = MyScheduler(100, 0.5)
 step!(model, agentstep, modelstep, 100; scheduler = ms)
 ```
 
-
 ### How to use `Distributed`
+
 To use the `parallel=true` option of [`ensemblerun!`](@ref) you need to load `Agents` and define your fundamental types at all processors. See the [Performance Tips](@ref) page for parallelization.
 
 ## Path-finding
+
 ```@docs
 Pathfinding
 Pathfinding.AStar
@@ -354,6 +391,7 @@ Pathfinding.random_walkable
 ```

 ### Pathfinding Metrics
+
 ```@docs
 Pathfinding.DirectDistance
 Pathfinding.MaxDistance
@@ -364,8 +402,10 @@ Building a custom metric is straightforward, if the provided ones do not suit yo
 See the [Developer Docs](@ref) for details.
 
 ## Save, Load, Checkpoints
+
 There may be scenarios where interacting with data in the form of files is necessary. The following
 functions provide an interface to save/load data to/from files.
+
 ```@docs
 AgentsIO.save_checkpoint
 AgentsIO.load_checkpoint
@@ -374,6 +414,7 @@ AgentsIO.dump_to_csv
 ```
 
 It is also possible to write data to file at predefined intervals while running your model, instead of storing it in memory:
+
 ```@docs
 offline_run!
 ```
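The RL functions now documented in `docs/src/api.md` suggest a train-then-deploy workflow. The outline below is speculative pseudocode in Julia syntax: only the function names (`set_rl_config!`, `train_model!`, `get_trained_policies`, `copy_trained_policies!`) come from the API listing in this diff, while every argument and the construction of `model`, `config`, and `target_model` is an invented placeholder, not the actual Agents.jl API.

```julia
using Agents
using Crux, POMDPs, Flux  # loading these activates the AgentsRL extension

# Hypothetical outline; all arguments are placeholders for illustration.
model = ReinforcementLearningABM(args)          # construct the RL-enabled model
set_rl_config!(model, config)                   # attach observation/reward/training config
train_model!(model)                             # train policies through Crux/POMDPs
policies = get_trained_policies(model)          # retrieve the learned policies
copy_trained_policies!(target_model, policies)  # reuse them in another model
```

For the real signatures, consult the docstrings listed in the `Reinforcement learning ABM` section and the new Boltzmann Wealth Model RL tutorial referenced in the changelog.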

docs/style.jl

Lines changed: 3 additions & 3 deletions

@@ -18,11 +18,11 @@ Base.iterate(c::CyclicContainer, i = 1) = iterate(c.c, i)
 
 COLORSCHEME = [
     "#7143E0",
-    "#0A9A84",
     "#191E44",
+    "#0A9A84",
     "#AF9327",
-    "#701B80",
-    "#2E6137",
+    "#791457",
+    "#6C768C",
 ]
 
 COLORS = CyclicContainer(COLORSCHEME)
