RL Sekiro

An advanced reinforcement learning project that trains AI agents to play Sekiro: Shadows Die Twice using Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) algorithms.

Acknowledgement

This project is inspired by earlier open-source Sekiro reinforcement learning projects.

🎮 Project Overview

This project implements reinforcement learning agents that can learn to play Sekiro: Shadows Die Twice by:

  • Capturing and processing game screens in real-time
  • Detecting game states (health, boss health, rigidity meters)
  • Executing actions using simulated keyboard inputs
  • Learning optimal strategies through DQN and PPO algorithms (a conceptual loop is sketched below)
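
A conceptual sketch of how these pieces fit together. The class and method names here (SekiroEnv, DQNAgent, act, step) are illustrative assumptions about src/env/env_core.py and src/algorithm/dqn/dqn.py, not the repository's verified API:

# Conceptual agent-environment loop; names are assumptions, see note above.
from src.env.env_core import SekiroEnv        # hypothetical import path
from src.algorithm.dqn.dqn import DQNAgent    # hypothetical import path

env = SekiroEnv()
agent = DQNAgent(action_size=6)

state = env.reset()                  # capture and preprocess the first frame
done = False
while not done:
    action = agent.act(state)                    # epsilon-greedy action choice
    next_state, reward, done = env.step(action)  # press key, read the next frame
    agent.remember(state, action, reward, next_state, done)
    agent.learn()                                # one gradient step from replay
    state = next_state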

🚀 Features

  • Multiple RL Algorithms: Support for both DQN and PPO training
  • Real-time Game Integration: Direct screen capture and keyboard control
  • Advanced State Detection: Health bars, rigidity meters, and game state recognition
  • Configurable Training: Extensive hyperparameter configuration
  • Model Management: Save, load, and manage trained models
  • TensorBoard Logging: Comprehensive training visualization
  • Training/Testing Modes: Separate optimized configurations

🛠️ Installation

  1. Clone the repository:
git clone https://github.com/tkgaolol/RL_sekiro.git
cd RL_sekiro
  2. Install dependencies:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install opencv-python numpy pywin32 tensorboard
  3. Game Setup:
    • Install Sekiro: Shadows Die Twice
      • Set the game to windowed mode
      • Set the game resolution to 1024×576
      • Rebind the in-game keys as follows:
        • Y to reset the camera or lock on to the target
        • J to attack
        • M to guard (defense)
    • Install Fling Trainer: https://flingtrainer.com/trainer/sekiro-shadows-die-twice-trainer/
      • Enable infinite resurrections and no resurrection cooldown
    • Run env_example.py to verify that the environment can see and control the game

🎯 Usage

DQN Training

Start DQN training from scratch:

cd src/algorithm/dqn
python training.py

Train with specific number of episodes:

python training.py --episodes 1000

Continue training from latest model:

python training.py --latest

Load specific model:

python training.py --model model_episode_1000.pth

List available models:

python training.py --list-models

DQN Testing

Test with latest trained model:

cd src/algorithm/dqn
python testing.py --latest

Test specific model:

python testing.py --model model_episode_1000.pth

Test with pure exploitation (no exploration):

python testing.py --latest --no-exploration

PPO Training

Start PPO training:

cd src/algorithm/ppo
python training.py

PPO training with options:

python training.py --episodes 2000 --latest

PPO Testing

Test PPO model:

cd src/algorithm/ppo
python testing.py --model path/to/model.pth --episodes 5

⚙️ Configuration

Key Configuration Files

  • src/config.py: Main configuration file containing:
    • Image processing parameters
    • Training hyperparameters
    • Reward system settings
    • Game state detection thresholds

Important Parameters

# Training Resolution (faster training)
TRAIN_WIDTH = 120
TRAIN_HEIGHT = 125

# Testing Resolution (better quality)
TEST_WIDTH = 480
TEST_HEIGHT = 500

# Action Space
ACTION_SIZE = 6  # [nothing, attack, jump, defense, dodge, forward]

# Training Episodes
EPISODES = 3000

# Reward System
DEATH_PENALTY = -30
BOSS_KILL_REWARD = 50
BOSS_DAMAGE_REWARD = 15
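
To illustrate how the two resolutions above are used, here is a hedged preprocessing sketch with OpenCV; the repository's observation.py may crop, grayscale, or normalize differently:

import cv2

def preprocess(frame_bgr, width=120, height=125):
    """Downscale a captured frame to TRAIN_WIDTH x TRAIN_HEIGHT and normalize."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, (width, height))    # cv2.resize takes (width, height)
    return small.astype("float32") / 255.0       # pixel values scaled to [0, 1]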

🎮 Game Actions

The AI can perform the following actions:

  • Movement: Forward movement
  • Combat: Attack, Defense, Dodge
  • Navigation: Jump
  • Idle: Nothing (no action)
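
A plausible index-to-action mapping consistent with ACTION_SIZE = 6 from the configuration above; the actual ordering in actions.py may differ:

ACTIONS = {
    0: "nothing",   # idle, no key pressed
    1: "attack",    # J key (per the game setup above)
    2: "jump",
    3: "defense",   # M key
    4: "dodge",
    5: "forward",
}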

🧠 Environment Components

ObservationManager

  • Real-time screen capture
  • Image preprocessing and normalization
  • Game state detection (health, rigidity)
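
As an example, real-time capture with pywin32 commonly follows the pattern below; observation.py may differ in how it locates the game window and converts colors:

# Hedged screen-capture sketch using the standard pywin32 GDI pattern.
import numpy as np
import win32con
import win32gui
import win32ui
import cv2

def grab_screen(region):
    """Capture a screen region (left, top, right, bottom) as a BGR ndarray."""
    left, top, right, bottom = region
    width, height = right - left, bottom - top

    hwin = win32gui.GetDesktopWindow()
    hwindc = win32gui.GetWindowDC(hwin)
    srcdc = win32ui.CreateDCFromHandle(hwindc)
    memdc = srcdc.CreateCompatibleDC()
    bmp = win32ui.CreateBitmap()
    bmp.CreateCompatibleBitmap(srcdc, width, height)
    memdc.SelectObject(bmp)
    memdc.BitBlt((0, 0), (width, height), srcdc, (left, top), win32con.SRCCOPY)

    img = np.frombuffer(bmp.GetBitmapBits(True), dtype=np.uint8)
    img = img.reshape((height, width, 4))        # BGRA pixel layout

    # Release GDI handles to avoid leaks over long training runs.
    srcdc.DeleteDC()
    memdc.DeleteDC()
    win32gui.ReleaseDC(hwin, hwindc)
    win32gui.DeleteObject(bmp.GetHandle())

    return cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)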

ActionManager

  • Keyboard input simulation
  • Action space definition
  • Action execution timing
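
Games typically ignore ordinary virtual-key events, so directkeys.py presumably injects hardware scan codes through the Win32 SendInput API; a minimal sketch of that standard pattern, with scan codes matching the key bindings from the setup section:

import ctypes
import time

SendInput = ctypes.windll.user32.SendInput

# DirectInput (scan code set 1) codes for the rebound keys
J_KEY = 0x24   # attack
M_KEY = 0x32   # defense
Y_KEY = 0x15   # camera reset / lock-on

PUL = ctypes.POINTER(ctypes.c_ulong)

class KeyBdInput(ctypes.Structure):
    _fields_ = [("wVk", ctypes.c_ushort), ("wScan", ctypes.c_ushort),
                ("dwFlags", ctypes.c_ulong), ("time", ctypes.c_ulong),
                ("dwExtraInfo", PUL)]

class HardwareInput(ctypes.Structure):
    _fields_ = [("uMsg", ctypes.c_ulong), ("wParamL", ctypes.c_short),
                ("wParamH", ctypes.c_ushort)]

class MouseInput(ctypes.Structure):
    _fields_ = [("dx", ctypes.c_long), ("dy", ctypes.c_long),
                ("mouseData", ctypes.c_ulong), ("dwFlags", ctypes.c_ulong),
                ("time", ctypes.c_ulong), ("dwExtraInfo", PUL)]

class Input_I(ctypes.Union):
    _fields_ = [("ki", KeyBdInput), ("mi", MouseInput), ("hi", HardwareInput)]

class Input(ctypes.Structure):
    _fields_ = [("type", ctypes.c_ulong), ("ii", Input_I)]

def press_key(scan_code, hold=0.05):
    """Tap a key by hardware scan code: key-down, brief hold, key-up."""
    extra = ctypes.c_ulong(0)
    for flags in (0x0008, 0x0008 | 0x0002):  # KEYEVENTF_SCANCODE, then | KEYEVENTF_KEYUP
        ii = Input_I()
        ii.ki = KeyBdInput(0, scan_code, flags, 0, ctypes.pointer(extra))
        cmd = Input(ctypes.c_ulong(1), ii)   # type 1 = INPUT_KEYBOARD
        SendInput(1, ctypes.pointer(cmd), ctypes.sizeof(cmd))
        if flags == 0x0008:
            time.sleep(hold)                 # hold duration before key-up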

RewardManager

  • Reward calculation based on game state changes
  • Health/damage tracking
  • Win/loss detection
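
A hedged sketch of the delta-based shaping this implies, reusing the reward constants from src/config.py shown earlier; rewards.py may add rigidity terms or weight events differently:

# Constants as listed in the Configuration section
DEATH_PENALTY = -30
BOSS_KILL_REWARD = 50
BOSS_DAMAGE_REWARD = 15

def compute_reward(prev, curr):
    """prev/curr: dicts of game-state values read from consecutive frames."""
    reward = 0.0
    if curr["boss_hp"] < prev["boss_hp"]:
        reward += BOSS_DAMAGE_REWARD   # the agent damaged the boss
    if curr["boss_hp"] <= 0:
        reward += BOSS_KILL_REWARD     # win: boss defeated
    if curr["self_hp"] <= 0:
        reward += DEATH_PENALTY        # loss: the agent died
    return reward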

GameController

  • Game restart functionality
  • Emergency stop system
  • Training pause/resume controls

📊 Monitoring & Logging

TensorBoard

View training progress:

tensorboard --logdir dqn_logs
# or
tensorboard --logdir ppo_logs
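
On the writing side, the training scripts presumably log through torch.utils.tensorboard; a minimal sketch of that pattern:

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="dqn_logs")   # same directory as --logdir above

def log_episode(episode, total_reward, epsilon):
    """Called once per finished episode from the training loop."""
    writer.add_scalar("episode/reward", total_reward, episode)
    writer.add_scalar("episode/epsilon", epsilon, episode)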

Model Storage

  • Models saved in dqn_model/ and ppo_model/
  • Automatic timestamped sessions
  • Episode-based checkpoints
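
A hedged checkpoint pattern consistent with the model_episode_1000.pth naming used in the Usage section; model_manager.py may persist more state, such as the optimizer or replay buffer:

import os
import torch

def save_checkpoint(model, episode, model_dir="dqn_model"):
    """Save an episode-based checkpoint under the naming scheme seen in Usage."""
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, f"model_episode_{episode}.pth")
    torch.save(model.state_dict(), path)
    return path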

Controls During Training

  • Press 'T': Start/resume training
  • Press 'P': Pause training
  • Press 'ESC': Emergency stop
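
These hotkeys are most simply implemented by polling key state with win32api; a plausible sketch (game_control.py may use a different mechanism):

import win32api

VK_T, VK_P, VK_ESC = 0x54, 0x50, 0x1B   # virtual-key codes for T, P, Esc

def poll_hotkeys(paused):
    """Return the updated paused flag; raise to stop training entirely."""
    if win32api.GetAsyncKeyState(VK_ESC) & 0x8000:
        raise KeyboardInterrupt("emergency stop requested")
    if win32api.GetAsyncKeyState(VK_P) & 0x8000:
        return True     # 'P' pauses training
    if win32api.GetAsyncKeyState(VK_T) & 0x8000:
        return False    # 'T' starts/resumes training
    return paused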

🏗️ Project Structure

RL_sekiro/
├── src/
│   ├── config.py              # Main configuration
│   ├── model_manager.py       # Model management utilities
│   ├── env/                   # Environment components
│   │   ├── env_core.py        # Main environment class
│   │   ├── observation.py     # Screen capture & processing
│   │   ├── actions.py         # Action management
│   │   ├── rewards.py         # Reward calculation
│   │   ├── game_control.py    # Game control utilities
│   │   └── directkeys.py      # Keyboard input simulation
│   └── algorithm/
│       ├── dqn/               # DQN implementation
│       │   ├── dqn.py         # DQN network & agent
│       │   ├── training.py    # Training script
│       │   └── testing.py     # Testing script
│       └── ppo/               # PPO implementation
│           ├── ppo.py         # PPO networks & agent
│           ├── training.py    # Training script
│           └── testing.py     # Testing script
├── dqn_model/                 # DQN model storage
├── dqn_logs/                  # DQN training logs
├── ppo_model/                 # PPO model storage
└── ppo_logs/                  # PPO training logs

⚠️ Disclaimer

This project is for educational and research purposes only. Use responsibly and in accordance with the game's terms of service.
