Add SARS tdlearning back to lib #1050

jeremiahpslewis · 2024-03-15T16:59:17Z

One of a couple of pull requests on the way to getting plain vanilla Q-learning working.
Drops the optimiser from TabularApproximator; specifying it led to too many unforced errors.
Renames Approximator to FluxModelApproximator for clarity.
TabularApproximator and FluxModelApproximator are independent subtypes of AbstractLearner.
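
For readers unfamiliar with the target of this work, here is a minimal sketch of a one-step tabular Q-learning update on a (state, action, reward, next state) transition. It is written in Python for illustration and is not taken from this PR or from the package's Julia code; `TabularQ`, `td_update`, `alpha`, and `gamma` are hypothetical names. It only shows that a table can be updated in place with a fixed step size, without any optimiser object.

```python
class TabularQ:
    """Hypothetical stand-in for a tabular approximator: a plain table, no optimiser."""

    def __init__(self, n_states, n_actions):
        # One row per state, one column per action, all values start at zero.
        self.table = [[0.0] * n_actions for _ in range(n_states)]


def td_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9, terminal=False):
    """Apply one Q-learning step in place and return the TD error."""
    # Bootstrapped target: the reward plus the discounted best next-state value,
    # or just the reward when the transition ends the episode.
    target = r if terminal else r + gamma * max(q.table[s_next])
    td_error = target - q.table[s][a]
    # The step size alpha plays the role an optimiser would otherwise play.
    q.table[s][a] += alpha * td_error
    return td_error


q = TabularQ(n_states=2, n_actions=2)
td_update(q, s=0, a=1, r=1.0, s_next=1)
td_update(q, s=1, a=0, r=0.0, s_next=0)
```

After the two calls, the entry for state 0, action 1 has moved one tenth of the way towards the reward, and the entry for state 1, action 0 has picked up a discounted share of that value through bootstrapping.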

PR Checklist

Update NEWS.md?
Unit tests for all structs / functions?
Integration and correctness tests using a simple env?
PR Review?
Add or update documentation?
Write docstrings for new methods?


codecov · 2024-03-15T17:56:02Z

Codecov Report

Attention: Patch coverage is 80.35714%, with 11 lines in your changes missing coverage. Please review.

Project coverage is 57.50%. Comparing base (55f60b0) to head (a0330b8).
Report is 1 commit behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1050      +/-   ##
==========================================
+ Coverage   55.68%   57.50%   +1.82%     
==========================================
  Files          69       71       +2     
  Lines        2726     2751      +25     
==========================================
+ Hits         1518     1582      +64     
+ Misses       1208     1169      -39
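
As a sanity check on the diff above, the percentages can be recomputed from the hit and line counts. The sketch below is illustrative Python, not part of the Codecov tooling; `coverage_pct` is a hypothetical helper. It assumes the percentages are truncated rather than rounded to two decimals, which is what the reported figures are consistent with.

```python
import math


def coverage_pct(hits, lines):
    """Coverage in percent, truncated (not rounded) to two decimals."""
    return math.floor(hits / lines * 10000) / 100


# Counts taken from the coverage diff: hits and total lines for base and head.
base = coverage_pct(1518, 2726)
head = coverage_pct(1582, 2751)
delta = round(head - base, 2)

# The line counts are also internally consistent: hits + misses == lines.
base_consistent = 1518 + 1208 == 2726
head_consistent = 1582 + 1169 == 2751
```

With these inputs, `base`, `head`, and `delta` reproduce the 55.68%, 57.50%, and +1.82% shown in the report.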

Files	Coverage Δ
...ementLearningCore/src/policies/agent/agent_base.jl	`71.42% <ø> (ø)`
...e/src/policies/learners/flux_model_approximator.jl	`100.00% <100.00%> (ø)`
...Core/src/policies/learners/tabular_approximator.jl	`100.00% <100.00%> (ø)`
...ementLearningFarm/src/ReinforcementLearningFarm.jl	`100.00% <ø> (ø)`
...arningCore/src/policies/learners/target_network.jl	`95.83% <80.00%> (ø)`
...ntLearningCore/src/policies/learners/td_learner.jl	`91.30% <91.30%> (ø)`
...rcementLearningCore/src/policies/q_based_policy.jl	`71.42% <66.66%> (+71.42%)`	⬆️
.../src/policies/explorers/epsilon_greedy_explorer.jl	`23.45% <14.28%> (+23.45%)`	⬆️

... and 3 files with indirect coverage changes

src/ReinforcementLearningCore/src/policies/explorers/epsilon_greedy_explorer.jl

src/ReinforcementLearningCore/test/policies/q_based_policy.jl

src/ReinforcementLearningFarm/src/algorithms/tabular/td_learner.jl

HenriDeh

LGTM. Good idea naming the FluxApproximator. One day this could move to a Flux extension.

Jeremiah Lewis added 11 commits March 15, 2024 14:48

q learning changes

1b50dd9

init td work

7b16d91

qlearning passes first check

d002258

Add test for TDLearner and optimise! function

f10a4ff

Refactor tabular approximators to force optimization functions specification

862d08d

Update TabularApproximator constructors and tests

d080cf7

Update CartPoleEnv initialization to use Float32

11449b5

add missing test dependency to ci

845aad4

fix tests

527f3e7

fix missing arg

aecc739

Add ReinforcementLearningFarm package to development dependencies

98d8831

jeremiahpslewis requested a review from HenriDeh March 18, 2024 10:26

HenriDeh reviewed Mar 18, 2024


Jeremiah Lewis added 9 commits March 19, 2024 10:59

per pr review

8f9b606

fix type signature

6c96eaa

move tdlearner to rlcore

7cee80e

rewire qlearning

e9af3b7

fix tests

87760ff

naming

b1c5c12

fixes

5aed0b2

Update learners.jl and related files

1366548

add note

60326de

jeremiahpslewis requested a review from HenriDeh March 19, 2024 16:49

add missing q-learning tests

a0330b8

jeremiahpslewis enabled auto-merge (squash) March 19, 2024 20:03

HenriDeh approved these changes Mar 20, 2024


jeremiahpslewis merged commit bf37d4d into main Mar 20, 2024
13 checks passed
