Deep Reinforcement Learning for Qubit State Preparation

This project explores using deep reinforcement learning, specifically Proximal Policy Optimization (PPO), to control quantum systems. The quantum dynamics are simulated with QuTiP and wrapped in a Gymnasium environment.

The core research question is: Can an AI agent learn to prepare complex entangled states where analytical solutions are difficult?

📂 Project Structure

project/
  ├── main.py              # Unified Entry Point
  ├── src/                 # Source code
  │   ├── simulation.py    # Physics (N-Qubit Lindblad Master Equation)
  │   ├── environment.py   # Gym Wrapper (Curriculum, Dynamic Actions)
  │   ├── train.py         # Training Logic (Curriculum Learning)
  │   └── benchmark.py     # Visualization & Robustness Check
  └── data/                # Artifacts
      ├── models/          # Trained PPO agents
      └── plots/           # Performance plots

⚛️ Physics & Control

The agent controls a system of $N$ qubits.

Hamiltonian

For $N=2$, the system includes local drives and an Ising interaction:

$$ H(t) = \sum_{i=1}^2 H_i(t) + \frac{J}{2} \sigma_{z,1} \sigma_{z,2} $$

The agent outputs a 4D action vector at every time step:

Amplitude 1 ($\Omega_1$)
Phase 1 ($\phi_1$)
Amplitude 2 ($\Omega_2$)
Phase 2 ($\phi_2$)
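
To make the mapping from the 4D action to $H(t)$ concrete, here is a minimal NumPy sketch. The README does not spell out the form of the local drive $H_i(t)$, so the common rotating-frame form $H_i = \frac{\Omega_i}{2}(\cos\phi_i\,\sigma_x + \sin\phi_i\,\sigma_y)$ is assumed here; the function names `drive` and `two_qubit_hamiltonian` are illustrative and not taken from `src/simulation.py`.

```python
import numpy as np

# Pauli matrices and the single-qubit identity
I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def drive(omega, phi):
    """ASSUMED local drive term: (omega / 2) * (cos(phi) X + sin(phi) Y)."""
    return 0.5 * omega * (np.cos(phi) * SX + np.sin(phi) * SY)


def two_qubit_hamiltonian(action, J):
    """Build H = H_1 + H_2 + (J / 2) Z1 Z2 from a 4D action vector.

    action = (omega1, phi1, omega2, phi2), matching the order listed above.
    """
    omega1, phi1, omega2, phi2 = action
    h1 = np.kron(drive(omega1, phi1), I2)  # drive acting on qubit 1
    h2 = np.kron(I2, drive(omega2, phi2))  # drive acting on qubit 2
    zz = np.kron(SZ, SZ)                   # Ising interaction
    return h1 + h2 + 0.5 * J * zz


# Example values only; the coupling strength J is a placeholder.
H = two_qubit_hamiltonian([1.0, 0.0, 0.5, np.pi / 2], J=0.2)
```

With both amplitudes set to zero, only the diagonal Ising term remains, which is a quick sanity check on the operator ordering.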

We also model Crosstalk (Leakage) and T1 Relaxation Noise.
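
As an illustration of how T1 relaxation enters a Lindblad master equation, the sketch below integrates a single qubit with plain NumPy rather than QuTiP. It assumes $|1\rangle$ is the excited state, a collapse operator $\sqrt{1/T_1}\,\sigma_-$, and a fixed-step RK4 integrator; the helper names and the value of `T1` are hypothetical, and crosstalk is not modelled here.

```python
import numpy as np


def lindblad_rhs(rho, H, c_ops):
    """Right-hand side of the Lindblad equation (hbar = 1)."""
    out = -1j * (H @ rho - rho @ H)
    for c in c_ops:
        cd = c.conj().T
        out += c @ rho @ cd - 0.5 * (cd @ c @ rho + rho @ cd @ c)
    return out


def evolve(rho, H, c_ops, t_final, n_steps):
    """Fixed-step RK4 integration of the density matrix."""
    dt = t_final / n_steps
    for _ in range(n_steps):
        k1 = lindblad_rhs(rho, H, c_ops)
        k2 = lindblad_rhs(rho + 0.5 * dt * k1, H, c_ops)
        k3 = lindblad_rhs(rho + 0.5 * dt * k2, H, c_ops)
        k4 = lindblad_rhs(rho + dt * k3, H, c_ops)
        rho = rho + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return rho


T1 = 10.0  # placeholder relaxation time, arbitrary units
sigma_minus = np.array([[0, 1], [0, 0]], dtype=complex)  # |0><1|
c_ops = [np.sqrt(1.0 / T1) * sigma_minus]

rho0 = np.array([[0, 0], [0, 1]], dtype=complex)  # start in excited |1><1|
H = np.zeros((2, 2), dtype=complex)               # no drive, pure decay

rho_t = evolve(rho0, H, c_ops, t_final=T1, n_steps=1000)
excited_population = rho_t[1, 1].real
```

With no drive, the excited-state population should follow $e^{-t/T_1}$, so after one relaxation time it has dropped to about 37%.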

🚀 Installation

pip install -r requirements.txt

(Requires qutip, gymnasium, stable-baselines3, numpy, matplotlib, tensorboard)

🏃 Usage

The project uses a Unified Pipeline. A single command runs the entire sequence: Train $\rightarrow$ Visualize $\rightarrow$ Benchmark.

1. Single Qubit ($N=1$)

Task: Prepare Superposition $|+\rangle$ from $|0\rangle$.

python main.py --n_qubits 1
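
For reference, the noiseless $N=1$ task has a simple closed-form solution: a $\pi/2$ rotation about the $y$ axis maps $|0\rangle$ to $|+\rangle$. The sketch below checks this with NumPy. Whether this is the exact "Analytical Pulse" used in `benchmark.py` is an assumption; under a drive of the form $\frac{\Omega}{2}(\cos\phi\,\sigma_x + \sin\phi\,\sigma_y)$ it would correspond to $\phi = \pi/2$ and $\Omega T = \pi/2$.

```python
import numpy as np

SY = np.array([[0, -1j], [1j, 0]], dtype=complex)


def rotation_y(theta):
    """exp(-i * theta/2 * Y) = cos(theta/2) I - i sin(theta/2) Y."""
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * SY


ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

# Apply the pi/2 pulse and compare with the target state.
psi = rotation_y(np.pi / 2) @ ket0
fidelity = abs(np.vdot(ket_plus, psi)) ** 2
```

In the absence of noise this fidelity is 1 up to floating-point error, which gives the baseline the learned pulse is compared against.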

2. Two Qubits ($N=2$)

Task: Prepare Entangled Bell State $|\Phi^+\rangle = \frac{|00\rangle + |11\rangle}{\sqrt{2}}$.

python main.py --n_qubits 2 --steps 200000

Note: N=2 uses Curriculum Learning. The agent first learns to flip both qubits (target $|11\rangle$), then learns to entangle them (target $|\Phi^+\rangle$). We recommend 200k+ steps for convergence.
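
The two-stage curriculum described in the note can be sketched as a small scheduler that switches the target state once the first stage is reliably solved. This is only an illustration: the class name `Curriculum`, the promotion rule (mean final fidelity over a sliding window), and the `threshold` and `window` values are all hypothetical and may differ from what `src/train.py` and `src/environment.py` actually do.

```python
import numpy as np

# Target states in the basis |00>, |01>, |10>, |11>
KET_11 = np.array([0, 0, 0, 1], dtype=complex)
BELL_PHI_PLUS = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)


def fidelity(target, rho):
    """Fidelity <target| rho |target> of a density matrix with a pure state."""
    return float(np.real(np.vdot(target, rho @ target)))


class Curriculum:
    """Hypothetical scheduler: stage 0 targets |11>, stage 1 targets |Phi+>."""

    def __init__(self, threshold=0.95, window=20):
        self.threshold = threshold  # assumed promotion threshold
        self.window = window        # assumed number of episodes averaged
        self.stage = 0
        self.recent = []

    def target(self):
        return KET_11 if self.stage == 0 else BELL_PHI_PLUS

    def report(self, final_fidelity):
        """Record an episode's final fidelity and promote when stage 0 is solved."""
        self.recent.append(final_fidelity)
        self.recent = self.recent[-self.window:]
        if (self.stage == 0 and len(self.recent) == self.window
                and np.mean(self.recent) >= self.threshold):
            self.stage = 1
            self.recent = []


# Example: three good episodes in a row promote the agent to the Bell-state stage.
curriculum = Curriculum(threshold=0.95, window=3)
for f in (0.97, 0.98, 0.99):
    curriculum.report(f)
```

Note that $|11\rangle$ already has fidelity 0.5 with $|\Phi^+\rangle$, so the second stage does not start from scratch.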

📊 Benchmarks

The pipeline automatically generates plots in data/plots/:

n1_robustness.png: AI vs Analytical Pulse under T1 noise.
n2_robustness.png: AI Entanglement Fidelity vs Noise.
n2_pulse.png: The pulse shape discovered by the AI.

Key Results

Robustness: The RL agent learns pulses that tolerate T1 noise, maintaining moderate fidelity even in high-decoherence regimes (see n1_robustness.png and n2_robustness.png).
Curriculum: Layering the learning process (Pulse Control $\rightarrow$ Interaction Timing) is crucial for converging on the Bell State.
