ReinforcementLearning v0.11.0
Merged pull requests:
- Reactivate some tests for RLExperiments (#790) (@jeremiahpslewis)
- Drop RL.jl as dependency from Experiments (#795) (@jeremiahpslewis)
- Fix compat for RLBase (#796) (@jeremiahpslewis)
- Fix RLCore version, prep for bump (#797) (@jeremiahpslewis)
- Add reexport compat (#798) (@jeremiahpslewis)
- Bump compat helper (#799) (@jeremiahpslewis)
- Fix IntervalSets compat for RLEnvironments (#800) (@jeremiahpslewis)
- Bump RLZoo.jl version for release (#815) (@jeremiahpslewis)
- Fix RLExperiments compat (#816) (@jeremiahpslewis)
- Expand RLZoo compat (#817) (@jeremiahpslewis)
- Bump RLExperiments, require 0.11 (#818) (@jeremiahpslewis)
- Pin ReinforcementLearningZoo.jl to 0.6 in RLExperiments (#819) (@jeremiahpslewis)
- Drop RL.jl from CompatHelper (until refactor complete) (#824) (@jeremiahpslewis)
- Bump Github Actions cache version (#825) (@jeremiahpslewis)
- Basic allocation fixes for RandomWalk / RandomPolicy (#827) (@jeremiahpslewis)
- Bump CI.yml GitHub action versions (#828) (@jeremiahpslewis)
- Add tests, improve performance of RewardsPerEpisode (#829) (@jeremiahpslewis)
- Refactor and add tests to TotalBatchRewardPerEpisode (#830) (@jeremiahpslewis)
- Tests, refactor for TimePerStep (#831) (@jeremiahpslewis)
- DoEveryNStep tests, performance tweaks (#832) (@jeremiahpslewis)
- Add DoOnExit test (#833) (@jeremiahpslewis)
- Expand PR Template (#835) (@jeremiahpslewis)
- Fix branch name (master -> main) (#837) (@jeremiahpslewis)
- Add test_noop! to remaining hooks (#840) (@jeremiahpslewis)
- Make TimePerStep test robust (#841) (@jeremiahpslewis)
- Reactivate docs (#842) (@jeremiahpslewis)
- Add activate_devmode!() explanation to tips.md (#845) (@jeremiahpslewis)
- Bump compat of RL.jl to 0.11.0-dev (#846) (@jeremiahpslewis)
- add kwargs to agent (#847) (@HenriDeh)
- Gaussian network refactor and tests (#849) (@HenriDeh)
- Agent Refactor (#850) (@jeremiahpslewis)
- Bump RLCore (#851) (@jeremiahpslewis)
- Include codecov in CI (#854) (@HenriDeh)
- Fix a typo in MPO (#855) (@HenriDeh)
- DoEvery should not trigger on t = 1 (#856) (@HenriDeh)
- update CI Julia version (#857) (@jeremiahpslewis)
- Tweak CI to check on dep changes (#858) (@HenriDeh)
- CompatHelper: bump compat for FillArrays to 1 for package ReinforcementLearningCore, (keep existing compat) (#859) (@github-actions[bot])
- MultiAgent Proposal (#861) (@jeremiahpslewis)
- CompatHelper: add new compat entry for ReinforcementLearningCore at version 0.9 for package ReinforcementLearningEnvironments, (keep existing compat) (#865) (@github-actions[bot])
- Multiplayer Fixes (Clean up errors) (#867) (@jeremiahpslewis)
- Added a section to the home page about getting help for Reinforcement… (#868) (@LooseTerrifyingSpaceMonkey)
- Bump StatsBase compat (#873) (@jeremiahpslewis)
- ComposedHooks, MultiHook fixes (#874) (@jeremiahpslewis)
- Fix RLEnvs compat (#875) (@jeremiahpslewis)
- Add back ComposedStop (#876) (@jeremiahpslewis)
- Bump RLBase to v0.11.1 (#877) (@jeremiahpslewis)
- Further refinements (#879) (@jeremiahpslewis)
- Use multiple dispatch / methods plan! and act! (#880) (@jeremiahpslewis)
- RLCore.update! -> Base.push! API change (#884) (@jeremiahpslewis) (see the API sketch after this list)
- Add compat for CommonRLInterface (#886) (@jeremiahpslewis)
- Fix hook issues (#887) (@jeremiahpslewis)
- CompatHelper: bump compat for ReinforcementLearningZoo to 0.6 for package ReinforcementLearningExperiments, (keep existing compat) (#888) (@github-actions[bot])
- Stacknamespace (#889) (@HenriDeh)
- allow more recent versions (#890) (@HenriDeh)
- Fix stack (#891) (@HenriDeh)
- CompatHelper: add new compat entry for DelimitedFiles at version 1 for package ReinforcementLearningEnvironments, (keep existing compat) (#894) (@github-actions[bot])
- Update implement new alg docs (#896) (@jeremiahpslewis)
- NFQ (#897) (@CasBex)
- fixed problem with sequential multi-agent envs (#898) (@Mytolo)
- Sketch out optimise! refactor (#899) (@jeremiahpslewis)
- Bug fix optimise! (#902) (@jeremiahpslewis)
- Breaking changes to optimise! interface: Bump RLCore to v0.11 and RLZoo to v0.8 (#903) (@jeremiahpslewis)
- Swap out rng code (#905) (@jeremiahpslewis)
- CompatHelper: bump compat for NNlib to 0.9 for package ReinforcementLearningZoo, (keep existing compat) (#906) (@github-actions[bot])
- Fix dispatch and update documentation (#907) (@HenriDeh)
- QBasedPolicy optimise! forwards to learner. (#909) (@HenriDeh)
- Bump RLZoo version for NNlib (#911) (@jeremiahpslewis)
- Add performance testing run loop (#914) (@jeremiahpslewis)
- Fix Timer bug (#915) (@jeremiahpslewis)
- couple of improvements to MPO (#919) (@HenriDeh)
- Rework the run loop (#921) (@HenriDeh)
- adjusted pettingzoo to PettingZooEnv simultaneous environment more co… (#925) (@Mytolo)
- fixed devmode / project files (#932) (@Mytolo)
- fixed DQNLearner GPU issue (#933) (@Mytolo)
- fixing problem with symbol/string correspondence (#934) (@Mytolo)
- Bump flux compat (#935) (@jeremiahpslewis)
- Reduce find_all_max allocations and increase speed based on chatgpt s… (#938) (@jeremiahpslewis)
- Add Buildkite / GPU tests (#942) (@jeremiahpslewis)
- Add RLZoo and RLExperiments to Buildkite (#943) (@jeremiahpslewis)
- Drop deprecated @provide interface (#944) (@jeremiahpslewis)
- CI Improvements (#946) (@jeremiahpslewis)
- Github Actions Fixes (#947) (@jeremiahpslewis)
- GPU updates for RLExperiments, RLZoo (#949) (@jeremiahpslewis)
- Bump RLCore version (#950) (@jeremiahpslewis)
- Refactor TRPO and VPG with EpisodesSampler (#952) (@HenriDeh)
- Fix TotalRewardPerEpisode bug (#953) (@jeremiahpslewis)
- update docs to loop refactor (#955) (@HenriDeh)
- CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningEnvironments, (keep existing compat) (#956) (@github-actions[bot])
- CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningZoo, (keep existing compat) (#957) (@github-actions[bot])
- CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningExperiments, (keep existing compat) (#958) (@github-actions[bot])
- CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningCore, (keep existing compat) (#962) (@github-actions[bot])
- CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningZoo, (keep existing compat) (#963) (@github-actions[bot])
- CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningExperiments, (keep existing compat) (#964) (@github-actions[bot])
- TargetNetwork (#966) (@HenriDeh)
- CompatHelper: bump compat for GPUArrays to 9 for package ReinforcementLearningCore, (keep existing compat) (#969) (@github-actions[bot])
- Prioritised DQN GPU (#974) (@CasBex)
- CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningZoo, (keep existing compat) (#975) (@github-actions[bot])
- CompatHelper: bump compat for ReinforcementLearningZoo to 0.8 for package ReinforcementLearningExperiments, (keep existing compat) (#976) (@github-actions[bot])
- CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningExperiments, (keep existing compat) (#977) (@github-actions[bot])
- NFQ refactor (#980) (@CasBex)
- Fix and refactor SAC (#985) (@HenriDeh)
- CompatHelper: bump compat for DomainSets to 0.7 for package ReinforcementLearningBase, (keep existing compat) (#986) (@github-actions[bot])
- remove rlenv dep for tests (#989) (@HenriDeh)
- CompatHelper: add new compat entry for CUDA at version 5 for package ReinforcementLearningExperiments, (keep existing compat) (#991) (@github-actions[bot])
- CompatHelper: add new compat entry for IntervalSets at version 0.7 for package ReinforcementLearningExperiments, (keep existing compat) (#994) (@github-actions[bot])
- Conservative Q-Learning (#995) (@HenriDeh)
- CompatHelper: add new compat entry for Parsers at version 2 for package ReinforcementLearningCore, (keep existing compat) (#997) (@github-actions[bot])
- CompatHelper: add new compat entry for MLUtils at version 0.4 for package ReinforcementLearningZoo, (keep existing compat) (#998) (@github-actions[bot])
- CompatHelper: add new compat entry for Statistics at version 1 for package ReinforcementLearningCore, (keep existing compat) (#999) (@github-actions[bot])
- Update CQL_SAC.jl (#1003) (@HenriDeh)
- Bump tj-actions/changed-files from 35 to 41 in /.github/workflows (#1006) (@dependabot[bot])
- Make it compatible with Adapt 4 and Metal 1 (#1008) (@joelreymont)
- Bump RLCore, RLEnv (#1012) (@jeremiahpslewis)
- Fix PPO per #1007 (#1013) (@jeremiahpslewis)
- Fix RLCore version (#1018) (@jeremiahpslewis)
- Add Devcontainer, handle DomainSets 0.7 (#1019) (@jeremiahpslewis)
- Initial GPUArray transition (#1020) (@jeremiahpslewis)
- Update TagBot.yml for subprojects (#1021) (@jeremiahpslewis)
- Fix offline agent test (#1025) (@joelreymont)
- Fix spell check CI errors (#1027) (@joelreymont)
- GPU Code Migration Part 2.1 (#1029) (@jeremiahpslewis)
- Bump RLZoo to v0.8 (#1031) (@jeremiahpslewis)
- Fix RLZoo version (#1032) (@jeremiahpslewis)
- Drop devmode, prepare RL.jl v0.11 for release (#1035) (@jeremiahpslewis)
- Update docs script for new 'limited' RL.jl release (#1038) (@jeremiahpslewis)
- Tabular Approximator fixes (pre v0.11 changes) (#1040) (@jeremiahpslewis)
- Swap RLZoo for RLFarm in CI, drop RLExperiments (#1041) (@jeremiahpslewis)
- Buildkite tweaks for monorepo (#1042) (@jeremiahpslewis)
- Drop archived projects (#1043) (@jeremiahpslewis)
- Simplify Experiment code after dropping RLExperiment (#1044) (@jeremiahpslewis)
- Fix code coverage scope so it ignores test dir (#1045) (@jeremiahpslewis)
- Fix reset and stop conditions (#1046) (@jeremiahpslewis)
- Drop Functors and use Flux.@layer (#1048) (@jeremiahpslewis)
- Fix naming consistency and add missing hook tests (#1049) (@jeremiahpslewis)
- Add SARS tdlearning back to lib (#1050) (@jeremiahpslewis)
- Update FluxModelApproximator references to FluxApproximator (#1051) (@jeremiahpslewis)
- Epsilon Speedy Explorer (#1052) (@jeremiahpslewis)
- Add TotalRewardPerEpisodeLastN hook (#1053) (@jeremiahpslewis)
- Fix abstract_learner for multiplayer games (#1054) (@jeremiahpslewis)
- Update versions (#1055) (@jeremiahpslewis)
- Update Docs for v0.11 release (#1056) (@jeremiahpslewis)
- Update Katex version, fix vulnerability (#1058) (@jeremiahpslewis)
Closed issues:
- A3C (#133)
- Implement TRPO/ACER (#134)
- ViZDoom is broken (#130)
- bullet3 environment (#128)
- Box2D environment (#127)
- Add MAgent (#125)
- Regret Policy Gradients (#131)
- Add MCTS related algorithms (#132)
- bsuite (#124)
- Implement Fully Parameterized Quantile Function for Distributional Reinforcement Learning. (#135)
- Experimental support of Torch.jl (#136)
- Add Game 2048 (#122)
- add CUDA accelerated Env (#121)
- Unify common network architectures and patterns (#139)
- Asynchronous Methods for Deep Reinforcement Learning (#142)
- R2D2 (#143)
- Recurrent Models (#144)
- Cross language support (#103)
- Add an example running in K8S (#100)
- Flux as service (#154)
- Change clip_by_global_norm! into an Optimizer (#193)
- Derivative-Free Reinforcement Learning (#206)
- Support Tables.jl and PrettyTables.jl for Trajectories (#232)
- Reinforcement Learning and Combinatorial Optimization (#250)
- Model based reinforcement learning (#262)
- Support CircularVectorSARTTrajectory RLZoo (#316)
- Rename some functions to help beginners navigate source code (#326)
- Support multiple discrete action spaces (#347)
- Combine transformers and RL (#392)
- How to display/render AtariEnv? (#546)
- Refactor of DQN Algorithms (#557)
- JuliaRL_BasicDQN_CartPole example fails (#568)
- Gain in VPGPolicy does not account for terminal states? (#578)
- Question: Can ReinforcementLearning.jl handle Partially Observed Markov Processes (POMDPs)? (#608)
- Explain current implementation of PPO in detail (#620)
- Add documentation on trace normalization (#633)
- TDLearner time step parameter (#648)
- estimate vs. basis in policies (#677)
- Q-learning update timing (#702) (see the TD sketch after this list)
- various eligibility trace-equipped TD methods (#709)
- Improve the logging mechanism during training (#725)
- questions while looking at implementation of VPG (#729)
- SAC example experiment does not work (#736)
- Custom environment action and state space explanation (#738)
- how to load a saved model and test it? (#755)
- Bounds Error at UPDATE_FREQ Step (#758)
- StopAfterEpisode returns 1 more episode using StepsPerEpisode() hook (#759)
- Move basic definition of environment wrapper into RLBase (#760)
- Precompilation error - DomainSets not in dependencies (#761)
- How to set RLBase.state_space() if the size of the state space is uncertain (#762)
- how to use MultiAgentManager on different algorithms? (#764)
- Example run of Offline RL that totally depends on dataset without online environment (#765)
- Deep RL example for LSTM (#772)
- MonteCarloLearner incorrect PreActStage behavior (#779)
- Prioritised Experience Replay (#780)
- Outdated dependencies (#781)
- Running experiments throws a "UndefVarError: params not defined" message (#784)
- Failing MPOCovariance experiment (#791)
- Logo image was not found (#836)
- Reactivate docs on CI/CD (#838)
- update docs: Tips for developers (#844)
- Package dependencies not compatible (#860)
- need help from an expert (#862)
- Installing ReinforcementLearning.jl downgrades Flux.jl (#864)
- Fix Experiment type setup (#881)
- LoadError: UndefVarError: params not defined (#882)
- Rename update! to push! (#883)
- Contribute Neural Fitted Q-iteration algorithm (#895)
- PPO policy experiments failing (#910)
- Executing RLBase.plan! after end of experiment (#913)
- EpisodeSampler in Trajectories (#927)
- Hook RewardsPerEpisode broken (#945)
- Can this ARZ algorithm be implemented? (#965)
- AssertionError: action in env.action_space (#967)
- Fixing SAC Policy (#970)
- Prioritized DQN experiment nonfunctional (#971)
- Prioritised DQN failing on GPU (#973)
- An error (#983)
- params() is no longer supported in Flux (#996)
- GPU Compile error on PPO with MaskedPPOTrajectory (#1007)
- RL Core tests fail sporadically (#1010)
- RL Env tests fail with latest OpenSpiel patches (#1011)
- Tutorial OpenSpiel KuhnOpenNSFP fails (#1024)
- CI: Should spell check be dropped or fixed? (#1026)
- Simple ReinforcementLearning example crashes (#1034)
- Website: "How to implement a new algorithm" is outdated (#1037)
- Review TabularApproximator (#1039)