Sequence Modeling RL
State-of-the-art sequence models for RL use transformers, whose attention mechanism captures long-range dependencies and permits parallel inference; however, self-attention has quadratic complexity in the sequence length, with additional cost scaling in the hidden dimension (lecture 13 below).
- CS885 Lecture 13
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
- Deep Transformer Q-Networks for Partially Observable Reinforcement Learning
- Decision Transformer: Reinforcement Learning via Sequence Modeling
- Efficiently Modeling Long Sequences with Structured State Spaces
- HiPPO: Recurrent Memory with Optimal Polynomial Projections
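The complexity contrast behind the reading list above (quadratic attention vs. linear-time structured state spaces) can be sketched numerically. This is a minimal illustration, not any paper's implementation: full self-attention materializes an L x L score matrix, while a discretized state-space recurrence (the form S4/Mamba build on) costs one fixed-size update per step. All shapes and parameter values here are made up for the example.

```python
import numpy as np

def attention(Q, K, V):
    # Full self-attention: the score matrix is L x L, so both time and
    # memory scale quadratically with sequence length L.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])                    # (L, L)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                          # (L, d)

def ssm_scan(A, B, C, u):
    # Linear-time recurrence of a discretized state-space model:
    #   x_t = A x_{t-1} + B u_t,   y_t = C x_t
    # One O(N^2) state update per step, so a length-L sequence costs
    # O(L) in L rather than O(L^2).
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B * u_t
        ys.append(C @ x)
    return np.array(ys)

rng = np.random.default_rng(0)
L, d, N = 16, 8, 4
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))
out_attn = attention(Q, K, V)                                   # (16, 8)

A = 0.9 * np.eye(N)                                             # stable toy dynamics
B, C = rng.standard_normal(N), rng.standard_normal(N)
out_ssm = ssm_scan(A, B, C, rng.standard_normal(L))             # (16,)
```

The selective mechanism in Mamba goes further by making A, B, C input-dependent per step, which this fixed-parameter sketch omits.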
Partially Observable RL, DRQN
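A DRQN replaces the feed-forward Q-network of DQN with a recurrent one, so the hidden state summarizes the observation history under partial observability. The sketch below is illustrative only: it uses a hand-rolled GRU cell (the original DRQN uses an LSTM inside a full DQN training loop, which is omitted here), and all dimensions are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RecurrentQNet:
    """DRQN-style sketch: a GRU cell over per-step observations followed
    by a linear Q-value head. The recurrent hidden state stands in for
    the full Markov state a feed-forward DQN would need."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_dim)
        self.Wz = rng.uniform(-s, s, (hidden_dim, obs_dim + hidden_dim))
        self.Wr = rng.uniform(-s, s, (hidden_dim, obs_dim + hidden_dim))
        self.Wh = rng.uniform(-s, s, (hidden_dim, obs_dim + hidden_dim))
        self.Wq = rng.uniform(-s, s, (n_actions, hidden_dim))
        self.hidden_dim = hidden_dim

    def step(self, obs, h):
        xh = np.concatenate([obs, h])
        z = sigmoid(self.Wz @ xh)                        # update gate
        r = sigmoid(self.Wr @ xh)                        # reset gate
        h_tilde = np.tanh(self.Wh @ np.concatenate([obs, r * h]))
        h_new = (1 - z) * h + z * h_tilde
        return self.Wq @ h_new, h_new                    # Q-values, next hidden

    def q_values(self, obs_seq):
        # Unroll over the observation history; return Q-values at the last step.
        h = np.zeros(self.hidden_dim)
        q = np.zeros(self.Wq.shape[0])
        for obs in obs_seq:
            q, h = self.step(obs, h)
        return q

net = RecurrentQNet(obs_dim=4, hidden_dim=8, n_actions=3)
q = net.q_values(np.zeros((5, 4)))                       # Q-values after 5 observations
```

In training, the hidden state is either zeroed at episode start or carried across sampled subsequences; the DTQN paper above swaps the recurrence for a transformer over the history window.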
Constrained RL
Constrained reinforcement learning incorporates constrained objectives (e.g. cost limits) alongside the reward, which makes it a natural framework for ensuring safety in RL (lecture 10 below).
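One common way to handle such constrained objectives is a Lagrangian relaxation: maximize reward return minus a multiplier times the cost return, while dual ascent adjusts the multiplier until the expected cost meets its limit. A minimal sketch, with made-up cost estimates and step size:

```python
import numpy as np

def lagrangian_update(lmbda, cost_estimate, cost_limit, lr=0.05):
    # Dual ascent on the Lagrange multiplier: lambda grows while the
    # constraint (expected cost <= cost_limit) is violated, and shrinks
    # (never below 0) once it is satisfied.
    return max(0.0, lmbda + lr * (cost_estimate - cost_limit))

def penalized_return(reward_return, cost_return, lmbda):
    # The policy is trained on the Lagrangian objective R - lambda * C,
    # so a larger lambda pushes it toward constraint satisfaction.
    return reward_return - lmbda * cost_return

# Illustrative multiplier trajectory (cost estimates are made up):
lmbda = 0.0
for cost in [2.0, 1.8, 1.2, 0.9, 0.8]:
    lmbda = lagrangian_update(lmbda, cost, cost_limit=1.0)
```

The multiplier rises while costs exceed the limit of 1.0 and decays once they fall below it; a satisfied constraint with zero multiplier recovers the unconstrained problem.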
Distributional RL
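Distributional RL learns the full return distribution rather than its mean. A standard concrete instance is the categorical (C51-style) projection: shift the fixed support by the Bellman backup and redistribute probability mass onto the nearest atoms. A minimal sketch, with arbitrary support and a uniform starting distribution:

```python
import numpy as np

def categorical_projection(probs, reward, gamma, atoms):
    # C51-style projection: shift the support by the Bellman backup
    # z' = r + gamma * z, clip to [v_min, v_max], then split each
    # atom's probability between its two nearest fixed atoms.
    v_min, v_max = atoms[0], atoms[-1]
    dz = atoms[1] - atoms[0]
    tz = np.clip(reward + gamma * atoms, v_min, v_max)
    b = (tz - v_min) / dz                       # fractional atom index
    lo, hi = np.floor(b).astype(int), np.ceil(b).astype(int)
    proj = np.zeros_like(probs)
    for j in range(len(atoms)):
        if lo[j] == hi[j]:                      # lands exactly on an atom
            proj[lo[j]] += probs[j]
        else:                                   # split by distance to neighbors
            proj[lo[j]] += probs[j] * (hi[j] - b[j])
            proj[hi[j]] += probs[j] * (b[j] - lo[j])
    return proj

atoms = np.linspace(-10.0, 10.0, 51)            # fixed support of 51 atoms
probs = np.full(51, 1.0 / 51)                   # uniform toy distribution
target = categorical_projection(probs, reward=1.0, gamma=0.99, atoms=atoms)
```

The projected target is then used as the label for a cross-entropy loss against the predicted distribution; note the projection conserves total probability mass.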
Diabetes RL