ReinforcementLearningTrajectories

Design

The relationship of several concepts provided in this package:

┌───────────────────────────────────┐
│ Trajectory                        │
│ ┌───────────────────────────────┐ │
│ │ EpisodesBuffer wrapping an    │ │
│ │ AbstractTraces                │ │
│ │             ┌───────────────┐ │ │
│ │ :trace_A => │ AbstractTrace │ │ │
│ │             └───────────────┘ │ │
│ │                               │ │
│ │             ┌───────────────┐ │ │
│ │ :trace_B => │ AbstractTrace │ │ │
│ │             └───────────────┘ │ │
│ │  ...             ...          │ │
│ └───────────────────────────────┘ │
│          ┌───────────┐            │
│          │  Sampler  │            │
│          └───────────┘            │
│         ┌────────────┐            │
│         │ Controller │            │
│         └────────────┘            │
└───────────────────────────────────┘

`Trajectory`

A Trajectory contains three parts:

A container to store data (usually an AbstractTraces).
A sampler to determine how to sample a batch from the container.
A controller to decide when to sample a new batch from the container.

Typical usage:

julia> t = Trajectory(
               container = Traces(a=Int[], b=Bool[]), 
               sampler = BatchSampler(3), 
               controller = InsertSampleRatioController(1.0, 3, 0, 0)
           );

julia> push!(t, (a=1,));

julia> for i in 1:5
           push!(t, (a=i, b=iseven(i)))
       end

julia> for batch in t
           println(batch)
       end
(a = [1, 3, 1], b = Bool[1, 1, 1])
(a = [4, 1, 4], b = Bool[0, 0, 0])
(a = [1, 4, 1], b = Bool[1, 0, 0])
(a = [1, 1, 4], b = Bool[1, 0, 0])
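The role of the controller in the example above can be illustrated with a small standalone model. This is a minimal sketch, assuming the controller allows no sampling until `threshold` inserts have happened and then permits roughly `ratio` samples per additional insert. `ToyRatioController`, `record_insert!`, `try_sample!`, and `drain!` are hypothetical names and are not the package's implementation.

```julia
# Hypothetical toy model of an insert/sample ratio controller.
# It does not use or reproduce the package's code.
mutable struct ToyRatioController
    ratio::Float64    # samples allowed per insert beyond the threshold (assumed meaning)
    threshold::Int    # inserts required before any sampling (assumed meaning)
    n_inserted::Int   # inserts seen so far
    n_sampled::Int    # samples granted so far
end

# Convenience constructor starting both counters at zero.
ToyRatioController(ratio, threshold) = ToyRatioController(ratio, threshold, 0, 0)

# Record that `n` items were inserted.
record_insert!(c::ToyRatioController, n::Int=1) = (c.n_inserted += n; c)

# Return true (and count the sample) if another batch may be drawn now.
function try_sample!(c::ToyRatioController)
    c.n_inserted >= c.threshold || return false
    if c.n_sampled <= (c.n_inserted - c.threshold) * c.ratio
        c.n_sampled += 1
        return true
    end
    return false
end

# Count how many batches the controller grants before it refuses.
function drain!(c::ToyRatioController)
    n = 0
    while try_sample!(c)
        n += 1
    end
    return n
end

c = ToyRatioController(1.0, 3)
for _ in 1:6
    record_insert!(c)   # six inserts, one per push! in the example above
end
n_batches = drain!(c)
println(n_batches)      # prints 4
```

In this toy model, six inserts with a ratio of 1.0 and a threshold of 3 grant four samples, which matches the four batches printed above. The package's actual accounting may differ, so the tests remain the authoritative reference.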

Traces

Traces
MultiplexTraces
CircularSARTTraces
NormalizedTraces

Samplers

BatchSampler
MetaSampler
MultiBatchSampler
EpisodesSampler
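
As a rough illustration of what a batch sampler does, the sketch below draws a fixed number of indices uniformly with replacement and slices every trace with the same indices. `toy_batch` is a hypothetical helper, not part of the package, and the uniform-with-replacement scheme is an assumption about BatchSampler rather than documented behavior.

```julia
using Random

# Hypothetical helper: sample `batch_size` row indices uniformly with
# replacement and apply them to every trace in a NamedTuple of vectors.
function toy_batch(rng::AbstractRNG, traces::NamedTuple, batch_size::Int)
    n = minimum(length, values(traces))   # only rows present in every trace
    inds = rand(rng, 1:n, batch_size)     # indices may repeat
    return map(col -> col[inds], traces)  # map over a NamedTuple keeps the names
end

traces = (a = [1, 2, 3, 4, 5], b = [false, true, false, true, false])
batch = toy_batch(MersenneTwister(0), traces, 3)
println(batch)
```

Because every trace is sliced with the same indices, the entries of `batch.a` and `batch.b` stay aligned row by row.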

Controllers

InsertSampleRatioController
AsyncInsertSampleRatioController

Please refer to the tests for common usage. (TODO: generate docs and add links to the data structures above.)

Acknowledgement

This async version is mainly inspired by deepmind/reverb.

Tortar/ReinforcementLearningTrajectories.jl

Folders and files

Latest commit

History

Repository files navigation

ReinforcementLearningTrajectories

Design

Trajectory

Acknowledgement

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

`Trajectory`

Packages