Description
TEP - cuQuantum benchmark and integration
Author
@refraction-ray
Status
Draft
Created
2025-02-04
Abstract
This TEP proposes benchmarking the NVIDIA cuQuantum libraries against TensorCircuit and, if the results justify it, integrating them into TensorCircuit. The work involves benchmarking TensorCircuit and cuQuantum performance and developing interfaces in TensorCircuit that leverage cuQuantum’s optimized functionality.
Motivation and Scope
TensorCircuit (TC) currently provides a versatile platform for quantum circuit simulation, leveraging tensor network techniques for efficient computation. One key question is whether TC delivers performance comparable to the optimized cuQuantum package. Answering it requires a set of systematic and carefully designed benchmarks on GPU. Note also that cuQuantum may support TC's automatic differentiation (AD), vectorized map (VMAP), and just-in-time compilation (JIT) features only fragilely, if at all, which is a major weakness of the cuQuantum package.
This TEP aims to:
- Benchmark current TensorCircuit performance: Establish a baseline for performance comparison on relevant quantum circuit simulation tasks using existing TensorCircuit functionalities.
- Benchmark and integrate cuStateVec for state-vector simulations: Develop an interface within TensorCircuit to utilize cuStateVec for state-vector based circuit simulations.
- Benchmark and integrate cuTensorNet for tensor network contraction: Explore and implement integration strategies for cuTensorNet to accelerate tensor network contraction within TensorCircuit.
- Provide a user-friendly interface: Ensure that utilizing cuQuantum is straightforward for TensorCircuit users.
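The baseline measurements in the goals above could be collected with a small timing harness along the following lines. This is only a sketch using the Python standard library: the helper name `benchmark`, its parameters, and the returned keys are hypothetical and not part of TensorCircuit or cuQuantum. It times the first call separately, since for a JIT-compiled function that call usually includes one-off compilation cost, and then reports statistics over repeated calls.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 1, repeats: int = 5) -> Dict[str, float]:
    """Time `fn`, reporting warm-up calls separately from steady-state calls.

    `fn` is any zero-argument callable that runs one simulation task.
    For GPU work, `fn` should block until the result is actually available
    (for example by copying the result to host memory), because GPU
    frameworks often dispatch work asynchronously.
    """
    # Warm-up calls: these typically include one-off costs such as JIT
    # compilation, so they are timed on their own.
    start = time.perf_counter()
    for _ in range(warmup):
        fn()
    staging = time.perf_counter() - start

    # Steady-state calls: each one is timed individually.
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    return {
        "staging_s": staging,
        "median_s": statistics.median(samples),
        "min_s": min(samples),
    }


# Stand-in workload; a real benchmark would wrap a circuit simulation here.
result = benchmark(lambda: sum(i * i for i in range(10_000)), warmup=1, repeats=3)
print(sorted(result))
```

Reporting the warm-up time and the steady-state time as separate numbers keeps the comparison fair when one side pays a compilation cost up front and the other does not.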
Usage and Impact
Users will be able to leverage cuQuantum acceleration in TensorCircuit by selecting a cuQuantum backend option when creating or running circuits. The exact interface is subject to implementation details, but the goal is to make it as seamless as possible.
Example Usage (Illustrative - API may change):
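import tensorcircuit as tc

# Create a circuit as usual
n_qubits = 20
c = tc.Circuit(n_qubits)

# Proposed entry point (not final): expectation value of Z on qubit 1 via cuQuantum.
# `modes` and `kws` stand for options that are still to be defined.
print(tc.cuquantum.expectation_ps(c, z=[1], modes=..., **kws))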
Backward compatibility
Related Work
Some indication of cuQuantum performance from the NVIDIA side: https://thequantuminsider.com/2023/12/22/nvidia-cuquantum-23-10-accelerating-quantum-computing-with-enhanced-sdk/
Implementation
Before integration begins, a comprehensive benchmarking phase is essential to establish a performance baseline and accurately measure the impact of cuQuantum. The integration phase starts only if the measured performance gain is promising. A key focus of that phase will be addressing potential compatibility issues with TensorCircuit's automatic differentiation (AD), just-in-time compilation (JIT), and vectorized map (VMAP) features. We acknowledge that cuQuantum's current capabilities may pose challenges for seamless AD/JIT/VMAP integration.
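One way to make the "only if the performance gain is promising" criterion concrete is to compare per-task timings and gate on an aggregate speedup. The sketch below is an assumption about how such a gate might look, not a decision rule taken from this TEP: the function names, the task names, and the threshold `min_speedup=2.0` are all hypothetical, and the numbers in the example are placeholders rather than measurements. It uses the geometric mean, a common way to aggregate ratios.

```python
import statistics
from typing import Dict


def speedups(baseline_s: Dict[str, float], candidate_s: Dict[str, float]) -> Dict[str, float]:
    """Per-task speedup of the candidate over the baseline (values above 1 mean faster)."""
    return {
        task: baseline_s[task] / candidate_s[task]
        for task in baseline_s
        if task in candidate_s
    }


def worth_integrating(
    baseline_s: Dict[str, float],
    candidate_s: Dict[str, float],
    min_speedup: float = 2.0,  # hypothetical threshold, to be agreed on
) -> bool:
    """Return True if the geometric-mean speedup reaches `min_speedup`."""
    ratios = list(speedups(baseline_s, candidate_s).values())
    if not ratios:
        return False  # no common tasks, so there is nothing to base a decision on
    return statistics.geometric_mean(ratios) >= min_speedup


# Placeholder timings in seconds (NOT measurements), keyed by hypothetical task names.
tc_times = {"expectation_20q": 4.0, "sampling_20q": 9.0}
cuquantum_times = {"expectation_20q": 1.0, "sampling_20q": 1.0}
print(worth_integrating(tc_times, cuquantum_times))
```

A gate of this kind only covers raw speed. Loss of AD/JIT/VMAP support would still need to be weighed separately.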