
DiscreteDP: Issues #185

Open
@oyamad


Here's a list of issues for discussion regarding the DiscreteDP class.

  1. This is the major issue: the current approach of DiscreteDP has an intrinsic limitation, namely that one has to prepare the arrays R and Q (and s_indices and a_indices with the "state-action pairs formulation") in advance. If the problem is large, memory can run out while those arrays are being created (before they are even passed to DiscreteDP). For example, I tried to set up the problem given in Comparison-Programming-Languages-Economics, where there are 17,820 x 5 = 89,100 states and 17,820 actions, so there are 89,100 x 17,820 ~ 1.6 x 10^9 state-action pairs. But I haven't been able to create s_indices (size 1.6 x 10^9), a_indices (size 1.6 x 10^9), R (size 1.6 x 10^9), and Q (3 nonzero entries for each of the 1.6 x 10^9 pairs) without running out of memory.

    One (and perhaps the only?) resolution would be to have DiscreteDP accept functions (callables) for R and Q.

  2. With the "state-action pairs formulation", DiscreteDP should work without the s_indices and a_indices arrays when all the actions are available at all states (otherwise the arrays are necessary).

  3. Would we want a finite horizon algorithm?

    • The current implementation does not accept beta=1 because of convergence and zero-division issues.

    EDIT: Done by DiscreteDP: Allow beta=1 #244.

  4. Is there other important information to store in DPSolveResult?

  5. The method operator_iteration could be replaced with compute_fixed_point, although the latter does not exactly fit the use here. See also ENH: make calls to compute_fixed_point allocate less memory #40 and ENH: moved the lucas_tree tuple to a new LucasTree class #41.

  6. Input checking is not exhaustive.

    • It is not checked that the entries of Q are nonnegative and sum to one for each state-action pair.
    • For the "state-action pairs formulation", it is not checked that there are no duplicate action indices for the same state. (For this, it may be better to write custom code to sort s_indices and a_indices when the inputs are not sorted; the sorting is currently delegated to scipy.sparse.csr_matrix.)

    If we check everything, it may be better to supply an option to skip the checks, for the case where the user is sure that the inputs are correct and creates an instance many times in a loop. Or should we collect the checking procedures in a method and assume that the user calls it manually?

  7. __repr__ and __str__ are missing. What do we want to print?
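
To make item 1 concrete: with 64-bit entries, a single array of length 89,100 x 17,820 would take roughly 12.7 GB, which is why building s_indices, a_indices, and R up front is not feasible. Below is a minimal sketch of what "callables for R and Q" could look like. It is not the DiscreteDP API: the names `bellman_operator_callable`, `value_iteration_callable`, `reward`, and `transition` are hypothetical, and the plain Python loops are only meant to show the idea (memory is O(n) instead of O(n x m)), not to be fast. The toy data is a two-state, two-action problem in which action 1 is infeasible at state 1.

```python
import numpy as np


def bellman_operator_callable(v, n, m, reward, transition, beta):
    """Apply the Bellman operator once without building R or Q.

    reward(s, a) returns a float (-np.inf marks an infeasible action).
    transition(s, a) returns (next_states, probs), i.e. only the
    nonzero entries of the distribution over next states.
    Both callables are hypothetical interfaces, not DiscreteDP's.
    """
    Tv = np.empty(n)
    sigma = np.empty(n, dtype=int)
    for s in range(n):
        best_val, best_a = -np.inf, -1
        for a in range(m):
            r = reward(s, a)
            if r == -np.inf:  # skip infeasible actions
                continue
            next_states, probs = transition(s, a)
            val = r + beta * np.dot(probs, v[next_states])
            if val > best_val:
                best_val, best_a = val, a
        Tv[s], sigma[s] = best_val, best_a
    return Tv, sigma


def value_iteration_callable(n, m, reward, transition, beta,
                             tol=1e-8, max_iter=1000):
    """Plain value iteration on top of the callable-based operator."""
    v = np.zeros(n)
    sigma = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        Tv, sigma = bellman_operator_callable(v, n, m, reward, transition, beta)
        converged = np.max(np.abs(Tv - v)) < tol
        v = Tv
        if converged:
            break
    return v, sigma


# Toy problem: rewards and transitions are computed on demand.
def reward(s, a):
    return [[5.0, 10.0], [-1.0, -np.inf]][s][a]


def transition(s, a):
    if (s, a) == (0, 0):
        return [0, 1], [0.5, 0.5]
    return [1], [1.0]  # (0, 1) and (1, 0) both lead to state 1


v, sigma = value_iteration_callable(2, 2, reward, transition, beta=0.95)
```

In a real implementation the inner loops would presumably need to be compiled (for example with Numba) or vectorized over actions, and the callables would replace the stored arrays only inside the Bellman operator and the policy evaluation step.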
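
For item 6, the two missing checks could be collected in a separate routine that the user calls manually, roughly as sketched below. All function names here are hypothetical and nothing in this sketch is part of DiscreteDP. It assumes a dense Q of shape (L, n) for the "state-action pairs formulation"; a sparse Q would need the analogous checks on its stored data. The duplicate check uses `np.unique`, which sorts internally, so it costs on the order of L log L.

```python
import numpy as np


def check_transition_probs(Q, atol=1e-10):
    """Raise ValueError unless every row of Q is a probability distribution."""
    Q = np.asarray(Q, dtype=float)
    if (Q < 0).any():
        raise ValueError("Q has negative entries")
    row_sums = Q.sum(axis=-1)
    if not np.allclose(row_sums, 1.0, rtol=0.0, atol=atol):
        raise ValueError("each row of Q must sum to one")


def check_no_duplicate_pairs(s_indices, a_indices):
    """Raise ValueError if the same (state, action) pair appears twice."""
    pairs = np.column_stack((s_indices, a_indices))
    if len(np.unique(pairs, axis=0)) != len(pairs):
        raise ValueError("duplicate state-action pairs found")


def validate_sa_inputs(Q, s_indices, a_indices):
    """Hypothetical opt-in validation, called manually by the user."""
    check_transition_probs(Q)
    check_no_duplicate_pairs(s_indices, a_indices)


# Three feasible pairs: (0, 0), (0, 1), (1, 0); two states.
s_indices = [0, 0, 1]
a_indices = [0, 1, 0]
Q = [[0.5, 0.5], [0.0, 1.0], [0.0, 1.0]]
validate_sa_inputs(Q, s_indices, a_indices)  # passes silently
```

Keeping the checks in a standalone function means that constructing an instance many times in a loop pays nothing for validation unless the user asks for it.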
