GrantStenger/rl-finance


Reinforcement Learning in Finance

Description

CAIS++ Spring 2019 Project: Building an Agent to Trade with Reinforcement Learning

Timeline

  • February 3rd:
    • In meeting:
      • First Meeting, Environment Set-Up, DQN Explanation, Project Planning
    • Homework:
      • Read the first three chapters of Spinning Up
      • Watch the first half of Stanford RL Lecture
      • Code up your own LSTM on the AAPL data. Check out each other's work for inspiration, find online resources, and ask questions in the Slack. Should be a useful exercise for everyone!
      • (Optional: Watch LSTM Intro Video)
  • February 10th: Working LSTM Model
    • State:
      • Current Stock Price
      • Percent change from day (n-1) open to day n open
    • Action Space: Buy, Sell, Hold (3-dimensional continuous) as percentage of bankroll
    • Reward:
      • Life span (define the maximum number of steps the agent can interact with the environment)
        • Receives reward based on profit/loss at the end
        • Sparse reward, harder to train
    • Punishment
      • Test extra-severe punishment for losses
      • Set a threshold time before the agent can trade again, based on the punishment
    • Architecture of model
      • Day one: encoding LSTM
        • Observation only; no actions taken
      • Day two: policy net
        • Acts based on the representation learned during encoding
        • Actions are taken
      • Train on two-day batches
    • Model Dimensions
      • Encoding LSTM
        • Number of LSTM layers, number of fully connected layers
        • Input size, hidden-unit size, encoding-vector size
      • Policy LSTM
        • input size (state space size)
        • output size (action space size: 3d continuous)
    • Homework:
      • Jincheng and Yang: Begin building Encoding / Policy Net Models
      • Chris: Look through Andrew's current LSTM model
      • Grant: Do the data preprocessing
      • Tomas:
        • Continue working on RL architecture
        • Make graph of prices + volume over batch
        • Visualize price gaps
    • Pre-Process Data
    • Visualization
    • Gym Trading Environment
    • Integrate LSTM into DDDQN
  • February 18th: Working DQN
    • Done for homework
      • Built first policy gradient model (Jincheng)
      • Worked on data pre-processing (Tomas)
    • Today's plan
      • Data pre-processing
      • Use data as input into the gym
      • Finalize the model
  • February 24th: Work day
    • Finish pre-processing
    • Finish trading gym
      • simulate.py
        • Change the 'quit' action to quit when the time series ends
        • Change the time series to remove seconds
      • series_env.py
        • In class seriesenv:
          • daily_start_time and daily_end_time are not needed
          • Remove randomization of the start index (in def 'seed')
    • Finish pipelining
  • February 28th: Hackathon
    • TODO
      • Review the current reward function in series_env
      • Finish building the dataset
      • Merge dataset with environment and test
      • Begin building the model
      • Create a sine-wave CSV for testing
  • March 3rd:
    • Work on implementing LSTM (Chris & Caroline)
    • Create test datasets (Grant)
    • Integrate dataset with gym (Yang & Jincheng)
  • March 10th: Add details like trading costs, slippage, and bid-ask spread
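The February 10th state/action/reward spec above can be sketched as a minimal environment. This is an illustrative sketch, not the repo's actual series_env: the class name TradingEnv, the long-only accounting, the fill-at-open rule, and the loss_penalty multiplier are all placeholder assumptions.

```python
class TradingEnv:
    """Sketch of the spec above: state = (open price, percent change),
    action = (buy_frac, sell_frac, hold) over the bankroll, and a sparse
    terminal reward equal to P&L, with losses punished extra severely."""

    def __init__(self, opens, max_steps=100, loss_penalty=2.0):
        assert len(opens) >= 3
        self.opens = opens                          # daily open prices
        self.max_steps = min(max_steps, len(opens) - 1)
        self.loss_penalty = loss_penalty            # assumed multiplier on losses
        self.reset()

    def reset(self):
        self.t = 1
        self.cash = 1.0                             # bankroll, normalized to 1
        self.shares = 0.0
        return self._state()

    def _state(self):
        price = self.opens[self.t]
        pct_change = (price - self.opens[self.t - 1]) / self.opens[self.t - 1]
        return (price, pct_change)                  # current price + % change

    def step(self, action):
        buy_frac, sell_frac, _hold = action         # hold is the residual weight
        price = self.opens[self.t]
        spend = self.cash * buy_frac                # buy as fraction of cash
        self.shares += spend / price
        self.cash -= spend
        proceeds = self.shares * sell_frac * price  # sell as fraction of position
        self.shares -= self.shares * sell_frac
        self.cash += proceeds
        self.t += 1
        done = self.t >= self.max_steps             # fixed life span ends episode
        reward = 0.0
        if done:                                    # sparse reward: terminal P&L
            pnl = self.cash + self.shares * self.opens[self.t] - 1.0
            reward = pnl if pnl >= 0 else self.loss_penalty * pnl
        return self._state(), reward, done
```

Because the reward is only paid at the end of the life span, credit assignment is harder, which is the "sparse reward, harder to train" trade-off noted above.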

Outstanding To Do's

  • Working Actor-Critic Model
  • Add trading costs, slippage, bid-ask spread
  • Compute performance statistics and visualization
  • Build backtesting environment
  • Integrate NLP sentiment analysis as feature
  • Add more indicators to model
  • Clean up README
  • Do we hold positions overnight? I think initially no. There are also weird jumps over holidays and weekends.
  • Take into account high, low, close, volume data
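The trading-costs to-do above could start from a fill-price helper along these lines. The function names, and the default spread, slippage, and fee values, are placeholder assumptions for illustration, not figures from the project.

```python
def fill_price(mid, side, spread=0.02, slippage_bps=5.0):
    """Effective execution price for a market order (sketch).

    mid: quoted mid price; side: +1 for buy, -1 for sell.
    Crossing the bid-ask spread pays half of it, and slippage (in basis
    points of mid) always moves the fill against the trader.
    """
    half_spread = spread / 2.0
    slip = mid * slippage_bps / 10_000.0
    return mid + side * (half_spread + slip)


def trade_cash_flow(mid, qty, side, fee_rate=0.001, **kw):
    """Cash change to the trader (negative = cash out), including a
    proportional trading fee on the filled notional."""
    px = fill_price(mid, side, **kw)
    notional = px * qty
    return -side * notional - fee_rate * notional
```

For example, buying 10 shares at mid 100 with the defaults fills at 100.06 (half the 0.02 spread plus 5 bps slippage), so round-tripping a position immediately loses money, which is the realism these items are meant to add.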

Interesting Resources

Reinforcement Learning Education

Papers

Primary
Secondary

Medium Articles
