Description
Here's a list of issues for discussion regarding the `DiscreteDP` class.
- This is the major issue: there is an intrinsic limitation in the current approach of `DiscreteDP`, in that one has to prepare the arrays `R` and `Q` (and `s_indices` and `a_indices` with the "state-action pairs formulation") in advance. If the problem is large, memory can run out while creating those arrays (before passing them to `DiscreteDP`). For example, I tried to set up the problem given in Comparison-Programming-Languages-Economics, where there are 17,820 x 5 = 89,100 states and 17,820 actions, so there are 89,100 x 17,820 ~ 1.6 x 10^9 state-action pairs. But I haven't been able to create `s_indices` (size 1.6 x 10^9), `a_indices` (size 1.6 x 10^9), `R` (size 1.6 x 10^9), and `Q` (3 nonzero entries for each of the 1.6 x 10^9 pairs) without running out of memory. One (and only?) resolution would be to have `DiscreteDP` accept functions (callables) for `R` and `Q`.
- With the "state-action pairs formulation", `DiscreteDP` should work without the `s_indices` and `a_indices` arrays when all the actions are available at all states (otherwise they should be necessary).
- Would we want a finite horizon algorithm?
  - The current implementation does not accept `beta=1` because of the convergence/zero-division issues.
    EDIT: Done by DiscreteDP: Allow beta=1 #244.
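On the finite-horizon question, here is a minimal sketch of backward induction with dense arrays. It assumes `R` has shape (n, m) and `Q` has shape (n, m, n); the function name `backward_induction` and its signature are hypothetical and not part of the existing code. Since the horizon is finite, `beta=1` poses no convergence problem in this setting.

```python
import numpy as np


def backward_induction(R, Q, beta, T, v_term=None):
    """Hypothetical helper: solve a T-period problem by backward induction.

    R : (n, m) array of rewards.
    Q : (n, m, n) array of transition probabilities.
    v_term : optional (n,) terminal value function (defaults to zeros).

    Returns vs with shape (T+1, n) and sigmas with shape (T, n).
    """
    R = np.asarray(R, dtype=float)
    Q = np.asarray(Q, dtype=float)
    n, m = R.shape

    vs = np.empty((T + 1, n))
    sigmas = np.empty((T, n), dtype=int)
    vs[T] = 0.0 if v_term is None else v_term

    # Step backwards from the last decision period to the first.
    for t in range(T - 1, -1, -1):
        vals = R + beta * (Q @ vs[t + 1])  # shape (n, m)
        sigmas[t] = vals.argmax(axis=1)
        vs[t] = vals.max(axis=1)
    return vs, sigmas
```

The result is a time-dependent policy, one row of `sigmas` per period, which is one way the output would differ from what the infinite-horizon methods return.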
- Are there other important pieces of information to store in `DPSolveResult`?
- The method `operator_iteration` may be replaced with `compute_fixed_point`, although the latter does not exactly match here. See also ENH: make calls to `compute_fixed_point` allocate less memory #40 and ENH: moved the lucas_tree tuple to a new LucasTree class #41.
- Input checking is not exhaustive.
  - It is not checked that the entries of `Q` are nonnegative and sum to one for each state-action pair.
  - For the "state-action pairs formulation", it is not checked that there are no duplicate action indices for the same state. (For this, it may be better to write custom code to sort `s_indices` and `a_indices` (when the inputs are not sorted), which currently is delegated to `scipy.sparse.csr_matrix`.)

  If we check everything, it may be better to supply an option to skip the checks for the case where the user is sure that the inputs are correct and creates an instance many times in a loop. Or maybe we should collect the checking procedures in a method and assume that the user calls it manually?
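The two unchecked conditions in the input-checking item could be tested roughly as follows. This is only a sketch: `check_inputs` is a hypothetical helper, not an existing method, and it assumes the state-action pairs formulation with `Q` given as a sparse matrix or 2-D array of shape (L, n), one row per pair.

```python
import numpy as np
import scipy.sparse as sp


def check_inputs(Q, s_indices, a_indices, tol=1e-10):
    """Hypothetical validation helper (not part of DiscreteDP).

    Q : (L, n) sparse matrix or array, one row per state-action pair.
    s_indices, a_indices : length-L integer arrays.
    Raises ValueError if a check fails.
    """
    Q = sp.csr_matrix(Q)
    s_indices = np.asarray(s_indices)
    a_indices = np.asarray(a_indices)

    # Entries of Q must be nonnegative.
    if Q.data.size and Q.data.min() < 0:
        raise ValueError("Q has negative entries")

    # Each row of Q must be a probability distribution.
    row_sums = np.asarray(Q.sum(axis=1)).ravel()
    if not np.allclose(row_sums, 1.0, rtol=0, atol=tol):
        raise ValueError("rows of Q must sum to one")

    # Sort pairs by (state, action); duplicates then sit next to each other.
    order = np.lexsort((a_indices, s_indices))
    s, a = s_indices[order], a_indices[order]
    if ((s[1:] == s[:-1]) & (a[1:] == a[:-1])).any():
        raise ValueError("duplicate state-action pair")
```

Keeping the checks in a separate function like this would fit either option raised above: the constructor could call it unless a skip flag is passed, or the user could call it manually.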
- `__repr__` and `__str__` are missing. What do we want to print?
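Returning to the memory issue in the first item, the following is a rough sketch of how callables for `R` and `Q` might be used. The interface is an assumption made purely for illustration: `R_func(s)` returns the rewards for all actions at state `s`, and `Q_func(s)` returns the (m, n) block of transition probabilities for that state, as an array or a SciPy sparse matrix. The function names are hypothetical. Only one state's block is held in memory at a time, at the cost of a Python-level loop over states.

```python
import numpy as np


def bellman_operator(v, R_func, Q_func, beta):
    """Apply the Bellman operator without storing the full R and Q.

    R_func(s) -> (m,) array of rewards at state s (assumed interface).
    Q_func(s) -> (m, n) array or sparse matrix of transition probabilities.
    """
    n = len(v)
    Tv = np.empty(n)
    sigma = np.empty(n, dtype=int)
    for s in range(n):
        # Values of all actions at state s, computed on the fly.
        vals = R_func(s) + beta * np.asarray(Q_func(s) @ v).ravel()
        sigma[s] = np.argmax(vals)
        Tv[s] = vals[sigma[s]]
    return Tv, sigma


def value_iteration(R_func, Q_func, n, beta, tol=1e-8, max_iter=1000):
    """Plain value iteration built on the callable-based operator."""
    v = np.zeros(n)
    sigma = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        Tv, sigma = bellman_operator(v, R_func, Q_func, beta)
        if np.max(np.abs(Tv - v)) < tol:
            return Tv, sigma
        v = Tv
    return v, sigma
```

For a problem of the size described in the first item, the per-state loop would probably need to be compiled or vectorized over chunks of states to be practical; the sketch only shows that the full arrays need not exist.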