Description
The type constraints in built-in policies and algorithms such as `QBasedPolicy` and `TDLearner` are overly specific. They prevent users from reusing the existing code when extending to new algorithms and instead force them to rewrite large chunks of it.
For example, `QBasedPolicy` is defined as `struct QBasedPolicy{L<:TDLearner,E<:AbstractExplorer} <: AbstractPolicy`, and all of its methods are constrained in the same way. As a result, I cannot write a new learner and use it in a `QBasedPolicy`, even though the methods themselves appear to be fully general.
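To make the request concrete, here is a minimal, self-contained sketch of what a relaxed policy could look like. It does not use the package's code: the abstract types are stand-ins, and `RelaxedQBasedPolicy`, `action_values`, `select_action`, `act`, `ConstantLearner`, and `GreedyExplorer` are hypothetical names chosen only for illustration.

```julia
# Stand-in abstract types so the sketch runs on its own;
# in practice these would come from the package.
abstract type AbstractPolicy end
abstract type AbstractLearner end
abstract type AbstractExplorer end

# Hypothetical relaxed policy: the learner only has to be an AbstractLearner.
struct RelaxedQBasedPolicy{L<:AbstractLearner,E<:AbstractExplorer} <: AbstractPolicy
    learner::L
    explorer::E
end

# Hypothetical interface functions (assumptions, not the package's API):
# a learner returns action values for a state, an explorer picks an action index.
function action_values end
function select_action end

# The policy method relies only on the two generic functions above.
act(p::RelaxedQBasedPolicy, state) =
    select_action(p.explorer, action_values(p.learner, state))

# A user-defined learner that is not a TDLearner.
struct ConstantLearner <: AbstractLearner
    values::Vector{Float64}
end
action_values(l::ConstantLearner, state) = l.values

# A trivial explorer that always picks the highest-valued action.
struct GreedyExplorer <: AbstractExplorer end
select_action(::GreedyExplorer, values) = argmax(values)

policy = RelaxedQBasedPolicy(ConstantLearner([0.1, 0.9, 0.3]), GreedyExplorer())
chosen = act(policy, 1)
```

With the bound on `L` loosened to `AbstractLearner`, the custom learner plugs in without touching the policy code; the only requirement is that it implements the functions the policy calls.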
Another example is `TDLearner`, which is defined as `Base.@kwdef mutable struct TDLearner{M,A} <: AbstractLearner where {A<:TabularApproximator,M<:Symbol}`. However, its constructor only allows `M=:SARS`. This forces me to rewrite the whole struct if I want to implement a new TD learning algorithm or use a different kind of approximation (e.g., linear).
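As a rough sketch of the alternative, the learner below accepts any method symbol and any approximator, and selects the update rule by dispatching on the symbol. Everything here is an assumption made for illustration: `GenericTDLearner`, `LinearApproximator`, `estimate`, `nudge!`, `target`, and `update!` are hypothetical names, the abstract types are stand-ins, and `:SARS` is assumed to mean a Q-learning style target.

```julia
# Stand-in abstract types so the sketch runs on its own.
abstract type AbstractLearner end
abstract type AbstractApproximator end

# Hypothetical learner: M is any Symbol naming the update rule,
# A is any approximator (not only a tabular one).
mutable struct GenericTDLearner{M,A<:AbstractApproximator} <: AbstractLearner
    approximator::A
    γ::Float64   # discount factor
    α::Float64   # step size
end

# Outer constructor that takes the method as a plain Symbol.
GenericTDLearner(app::A, method::Symbol; γ=1.0, α=0.1) where {A<:AbstractApproximator} =
    GenericTDLearner{method,A}(app, γ, α)

# Hypothetical linear approximator: Q(s, a) is the dot product of
# column a of the weight matrix with the feature vector of s.
struct LinearApproximator <: AbstractApproximator
    weights::Matrix{Float64}   # n_features × n_actions
end
n_actions(app::LinearApproximator) = size(app.weights, 2)
estimate(app::LinearApproximator, x, a) = sum(app.weights[:, a] .* x)
function nudge!(app::LinearApproximator, x, a, step)
    app.weights[:, a] .+= step .* x   # in-place update of column a
    return app
end

# One target per method symbol; a new algorithm only adds a method here.
# :SARS is assumed to use the maximum over next actions.
target(l::GenericTDLearner{:SARS}, r, x_next) =
    r + l.γ * maximum(estimate(l.approximator, x_next, a) for a in 1:n_actions(l.approximator))
# :SARSA additionally needs the next action that was actually taken.
target(l::GenericTDLearner{:SARSA}, r, x_next, a_next) =
    r + l.γ * estimate(l.approximator, x_next, a_next)

# The update itself is shared by all methods and approximators.
function update!(l::GenericTDLearner, x, a, r, next...)
    δ = target(l, r, next...) - estimate(l.approximator, x, a)
    nudge!(l.approximator, x, a, l.α * δ)
    return δ
end

learner = GenericTDLearner(LinearApproximator(zeros(2, 2)), :SARS; γ=0.9, α=0.5)
x, x_next = [1.0, 0.0], [0.0, 1.0]
td_error = update!(learner, x, 1, 1.0, x_next)
```

The shared `update!` never mentions a concrete method or approximator type, so supporting another TD variant or another approximator means adding methods rather than copying the struct.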
In my opinion, these restrictions should be removed and replaced with general types such as `AbstractLearner`.