cuQuantum benchmark and integration #11

Open
@refraction-ray

TEP - cuQuantum benchmark and integration

Author
@refraction-ray

Status
Draft

Created
2025-02-04

Abstract

This TEP proposes benchmarking NVIDIA cuQuantum against TensorCircuit and, depending on the results, integrating the cuQuantum libraries into TensorCircuit. The integration would add interfaces in TensorCircuit that leverage cuQuantum's optimized functionality.

Motivation and Scope

TensorCircuit (TC) currently provides a versatile platform for quantum circuit simulation, leveraging tensor network techniques for efficient computation. One key question is whether TC matches the performance of the optimized cuQuantum package. Answering it requires a set of systematic, carefully designed GPU benchmarks. Note also that cuQuantum may not robustly support the AD/VMAP/JIT features of TC, which would be a major weakness of the cuQuantum package.

This TEP aims to:

  • Benchmark current TensorCircuit performance: Establish a baseline for performance comparison on relevant quantum circuit simulation tasks using existing TensorCircuit functionalities.
  • Benchmark and integrate cuStateVec for state-vector simulations: Develop an interface within TensorCircuit to utilize cuStateVec for state-vector based circuit simulations.
  • Benchmark and integrate cuTensorNet for tensor network contraction: Explore and implement integration strategies for cuTensorNet to accelerate tensor network contraction within TensorCircuit.
  • Provide a user-friendly interface: Ensure that utilizing cuQuantum is straightforward for TensorCircuit users.
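The baseline measurements in the first goal could be collected with a small timing harness along the following lines. This is a minimal sketch, not part of TensorCircuit: the names `benchmark` and `speedup` are hypothetical, and it assumes the timed function returns only once its result is actually computed (on GPU backends with asynchronous dispatch, the function should block on the result before returning). Warm-up calls are excluded from the timing so that one-off costs such as JIT compilation do not distort the comparison.

```python
import statistics
import time


def benchmark(fn, *args, warmup=2, repeats=5, **kwargs):
    """Time fn(*args, **kwargs); warm-up calls are run but not timed."""
    for _ in range(warmup):
        fn(*args, **kwargs)  # absorbs one-off costs such as JIT compilation
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "repeats": repeats,
    }


def speedup(baseline, candidate):
    """Ratio of median runtimes; values above 1 mean the candidate is faster."""
    return baseline["median_s"] / candidate["median_s"]
```

The same harness would be run once with a TensorCircuit workload and once with the equivalent cuQuantum workload, and `speedup` applied to the two results. Reporting the median rather than the mean keeps a single slow outlier run from dominating the figure.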

Usage and Impact

Users will be able to leverage cuQuantum acceleration in TensorCircuit by selecting a cuQuantum backend option when creating or running circuits. The exact interface is subject to implementation details, but the goal is to make it as seamless as possible.

Example Usage (Illustrative - API may change):

import tensorcircuit as tc

# Create a circuit as usual
n_qubits = 20
c = tc.Circuit(n_qubits)

# Illustrative cuQuantum entry point; `modes=...` and `kws` stand for options that are still to be defined
print(tc.cuquantum.expectation_ps(c, z=[1], modes=..., **kws))

Backward compatibility

Related Work

Some indications of cuQuantum performance from the NVIDIA side: https://thequantuminsider.com/2023/12/22/nvidia-cuquantum-23-10-accelerating-quantum-computing-with-enhanced-sdk/

Implementation

Before any integration work begins, a comprehensive benchmarking phase is needed to establish a performance baseline and accurately measure the impact of cuQuantum. The integration phase starts only if the measured performance gain is promising. In that phase, a key focus will be compatibility with TensorCircuit's automatic differentiation (AD), just-in-time compilation (JIT), and vectorized map (VMAP) features. We acknowledge that cuQuantum's current capabilities may make seamless AD/JIT/VMAP integration difficult.
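The gating logic described above can be sketched as a small decision helper. Everything here is an assumption for illustration: `choose_engine`, the engine labels, and the `min_speedup` threshold of 1.5 are hypothetical and are not an existing TensorCircuit or cuQuantum API. The only real dependency is the standard-library `importlib.util.find_spec`, used to check whether a `cuquantum` Python package is installed.

```python
import importlib.util


def cuquantum_installed():
    """Return True if a `cuquantum` Python package can be located."""
    return importlib.util.find_spec("cuquantum") is not None


def choose_engine(needs_transforms, measured_speedup, min_speedup=1.5, installed=None):
    """Pick "cuquantum" or "default" for a workload (hypothetical helper).

    needs_transforms: True if the workload relies on AD, JIT, or VMAP.
    measured_speedup: benchmark ratio (default runtime / cuQuantum runtime).
    min_speedup: assumed threshold below which integration is not worthwhile.
    installed: override for the package check, mainly useful in tests.
    """
    if installed is None:
        installed = cuquantum_installed()
    if not installed:
        return "default"  # nothing to dispatch to
    if needs_transforms:
        return "default"  # AD/JIT/VMAP compatibility is not yet established
    if measured_speedup < min_speedup:
        return "default"  # the gain does not justify the extra code path
    return "cuquantum"
```

Falling back to the default engine in every uncertain case keeps existing TensorCircuit behaviour unchanged for users who do not have cuQuantum installed or who depend on AD/JIT/VMAP.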

Alternatives

References
