
v0.11.0

Released by @github-actions on 26 Mar, 18:29 (commit de5893f)

ReinforcementLearning v0.11.0

Diff since v0.10.2

Merged pull requests:

Closed issues:

  • A3C (#133)
  • Implement TRPO/ACER (#134)
  • ViZDoom is broken (#130)
  • bullet3 environment (#128)
  • Box2D environment (#127)
  • Add MAgent (#125)
  • Regret Policy Gradients (#131)
  • Add MCTS related algorithms (#132)
  • bsuite (#124)
  • Implement Fully Parameterized Quantile Function for Distributional Reinforcement Learning. (#135)
  • Experimental support of Torch.jl (#136)
  • Add Game 2048 (#122)
  • Add CUDA-accelerated Env (#121)
  • Unify common network architectures and patterns (#139)
  • Asynchronous Methods for Deep Reinforcement Learning (#142)
  • R2D2 (#143)
  • Recurrent Models (#144)
  • Cross language support (#103)
  • Add an example running in K8S (#100)
  • Flux as service (#154)
  • Change clip_by_global_norm! into an Optimizer (#193)
  • Derivative-Free Reinforcement Learning (#206)
  • Support Tables.jl and PrettyTables.jl for Trajectories (#232)
  • Reinforcement Learning and Combinatorial Optimization (#250)
  • Model based reinforcement learning (#262)
  • Support CircularVectorSARTTrajectory in RLZoo (#316)
  • Rename some functions to help beginners navigate source code (#326)
  • Support multiple discrete action space (#347)
  • Combine transformers and RL (#392)
  • How to display/render AtariEnv? (#546)
  • Refactor of DQN Algorithms (#557)
  • JuliaRL_BasicDQN_CartPole example fails (#568)
  • Gain in VPGPolicy does not account for terminal states? (#578)
  • Question: Can ReinforcementLearning.jl handle Partially Observed Markov Processes (POMDPs)? (#608)
  • Explain current implementation of PPO in detail (#620)
  • Make documentation on trace normalization (#633)
  • TDLearner time step parameter (#648)
  • estimate vs. basis in policies (#677)
  • Q-learning update timing (#702)
  • various eligibility trace-equipped TD methods (#709)
  • Improve the logging mechanism during training (#725)
  • questions while looking at implementation of VPG (#729)
  • SAC example experiment does not work (#736)
  • Custom environment action and state space explanation (#738)
  • how to load a saved model and test it? (#755)
  • Bounds Error at UPDATE_FREQ Step (#758)
  • StopAfterEpisode returns 1 more episode using StepsPerEpisode() hook (#759)
  • Move basic definition of environment wrapper into RLBase (#760)
  • Precompilation error - DomainSets not in dependencies (#761)
  • How to set RLBase.state_space() if the size of the state space is uncertain (#762)
  • how to use MultiAgentManager on different algorithms? (#764)
  • Example run of Offline RL that totally depends on dataset without online environment (#765)
  • Deep RL example for LSTM (#772)
  • MonteCarloLearner incorrect PreActStage behavior (#779)
  • Prioritised Experience Replay (#780)
  • Outdated dependencies (#781)
  • Running experiments throws a "UndefVarError: params not defined" message (#784)
  • Failing MPOCovariance experiment (#791)
  • Logo image was not found (#836)
  • Reactivate docs on CI/CD (#838)
  • update docs: Tips for developers (#844)
  • Package dependencies not compatible (#860)
  • need help from an expert (#862)
  • Installing ReinforcementLearning.jl downgrades Flux.jl (#864)
  • Fix Experiment type setup (#881)
  • LoadError: UndefVarError: params not defined (#882)
  • Rename update! to push! (#883)
  • Contribute Neural Fitted Q-iteration algorithm (#895)
  • PPO policy experiments failing (#910)
  • Executing RLBase.plan! after end of experiment (#913)
  • EpisodeSampler in Trajectories (#927)
  • Hook RewardsPerEpisode broken (#945)
  • Can this ARZ algorithm be implemented? (#965)
  • AssertionError: action in env.action_space (#967)
  • Fixing SAC Policy (#970)
  • Prioritized DQN experiment nonfunctional (#971)
  • Prioritised DQN failing on GPU (#973)
  • An error (#983)
  • params() is no longer supported in Flux (#996; see the migration sketch after this list)
  • GPU Compile error on PPO with MaskedPPOTrajectory (#1007)
  • RL Core tests fail sporadically (#1010)
  • RL Env tests fail with latest OpenSpiel patches (#1011)
  • Tutorial OpenSpiel KuhnOpenNSFP fails (#1024)
  • CI: Should spell check be dropped or fixed? (#1026)
  • Simple ReinforcementLearning example crashes (#1034)
  • Website: "How to implement a new algorithm" is outdated (#1037)
  • Review TabularApproximator (#1039)
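
Several of the issues above (#784, #882, #996) trace back to the removal of the implicit-parameter API (Flux.params) in recent Flux releases. As a hedged illustration only, not code taken from this release, here is a minimal sketch of the explicit-gradient style that replaces it; the model, data, and loss are hypothetical placeholders:

```julia
using Flux

# Hypothetical toy model, standing in for a policy or value network.
model = Chain(Dense(4 => 32, relu), Dense(32 => 2))

# Explicit optimiser state replaces the old implicit Flux.params workflow.
opt_state = Flux.setup(Adam(1e-3), model)

x = rand(Float32, 4, 8)   # dummy batch of observations
y = rand(Float32, 2, 8)   # dummy targets

# Old, removed style:
#   gs = gradient(() -> Flux.Losses.mse(model(x), y), Flux.params(model))
# New, explicit style: differentiate with respect to the model itself.
grads = Flux.gradient(m -> Flux.Losses.mse(m(x), y), model)
Flux.update!(opt_state, model, grads[1])
```

Code written against the implicit style will raise the "UndefVarError: params not defined" error reported in #784 and #882 on current Flux versions.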