doc/docs/parallel_execution.md: 13 additions & 8 deletions
@@ -77,16 +77,21 @@ When mapping multiple circuits onto disjoint qubit regions of a single QPU, the
This approach runs in under 1 second (vs. 183s for full partitioning) because it avoids the expensive operations: no Floyd-Warshall (O(N³)), no SABRE mapping iterations, and no custom hardware initialization. The error rate data is read directly from the backend object, which is already loaded. The Qiskit transpiler handles routing within each partition at `optimization_level=1`.
-**Results on ibm_fez (156 qubits, QFT benchmark at 4 qubits, 3 circuits):**
The lightweight approach achieves within 2% of free-transpiler fidelity at negligible computational cost. The key insight is that reading 2-qubit gate error rates from the backend target is essentially free, and even a simple scoring pass using this data dramatically improves partition quality — the difference between selecting low-error qubit neighborhoods (avg gate error ~0.002) versus blindly using edge-of-chip qubits with potentially much higher error rates.
+For comparison, sequential allocation (qubits starting from 0) produced 0.32–0.56 fidelity, and the full error-aware partitioning approach took 183 seconds per run.
+
+The lightweight approach achieves 83–98% of free-transpiler fidelity at negligible computational cost (~1–2 seconds). The key insight is that reading 2-qubit gate error rates from the backend target is essentially free, and even a simple scoring pass using this data dramatically improves partition quality. Scoring also considers subgraph diameter (lower means shorter SWAP paths) and internal edge count (more edges means more routing options), which helps avoid chain-shaped partitions on heavy-hex topologies.
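A scoring pass of the kind described above could look roughly like the following. This is a minimal sketch, not the project's implementation: the function name `partition_score`, the weights `w_diameter` and `w_edges`, and the plain-dict representation of edge errors are all assumptions made for illustration. It combines the three signals the text mentions (average 2-qubit gate error, subgraph diameter, internal edge count) into one number where lower is better.

```python
from collections import deque
from math import inf


def _eccentricity(start, adj):
    """Longest shortest-path distance from `start`; inf if a node is unreachable."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return max(dist.values()) if len(dist) == len(adj) else inf


def partition_score(qubits, edge_errors, w_diameter=0.01, w_edges=0.005):
    """Score a candidate partition (lower is better).

    `edge_errors` maps an undirected edge (a, b) to its 2-qubit gate error.
    The weights are hypothetical and would need tuning on real data.
    """
    qubits = set(qubits)
    # Keep only edges with both endpoints inside the candidate partition.
    internal = {e: err for e, err in edge_errors.items()
                if e[0] in qubits and e[1] in qubits}
    if not internal:
        return inf
    adj = {q: set() for q in qubits}
    for a, b in internal:
        adj[a].add(b)
        adj[b].add(a)
    diameter = max(_eccentricity(q, adj) for q in qubits)
    if diameter == inf:  # disconnected subgraph: not usable as a partition
        return inf
    avg_error = sum(internal.values()) / len(internal)
    # Penalize long SWAP paths, reward extra routing options.
    return avg_error + w_diameter * diameter - w_edges * len(internal)


# Toy graph (not a real device): a 4-cycle 0-1-2-3 with a chain 3-4-5-6 attached.
edge_errors = {(0, 1): 0.002, (1, 2): 0.002, (2, 3): 0.002, (3, 0): 0.002,
               (3, 4): 0.002, (4, 5): 0.002, (5, 6): 0.002}
ring_score = partition_score([0, 1, 2, 3], edge_errors)
chain_score = partition_score([3, 4, 5, 6], edge_errors)
```

With equal gate errors everywhere, the compact region scores better than the chain-shaped one purely because of its smaller diameter and extra edge, which is the effect the diameter and edge-count terms are meant to capture.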
+
+For wider or deeper circuits where the transpiler may route through qubits outside the assigned partition, the system automatically falls back to pre-transpilation onto restricted coupling maps, trading some fidelity for guaranteed execution. See `doc/_design/parallel_partition_mapping_tech_note.md` for full technical details.
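To make the fallback concrete, here is a small sketch of how a restricted coupling map might be derived from a partition. The helper names `restricted_coupling_map` and `uses_outside_qubits` are hypothetical and are not taken from the code base. The idea is to keep only the device edges that lie entirely inside the partition and renumber the qubits from 0, so a circuit transpiled against the restricted map cannot be routed through outside qubits.

```python
def restricted_coupling_map(coupling_edges, partition):
    """Return (local_edges, local_to_physical) for one partition.

    `coupling_edges` is an iterable of (physical_a, physical_b) pairs.
    Local index i corresponds to physical qubit local_to_physical[i].
    """
    local_to_physical = sorted(set(partition))
    physical_to_local = {p: i for i, p in enumerate(local_to_physical)}
    local_edges = [
        (physical_to_local[a], physical_to_local[b])
        for a, b in coupling_edges
        if a in physical_to_local and b in physical_to_local
    ]
    return local_edges, local_to_physical


def uses_outside_qubits(used_physical_qubits, partition):
    """True if a transpiled circuit touches any qubit outside its partition."""
    return not set(used_physical_qubits) <= set(partition)


# Toy linear device 0-1-2-3-4 with a partition on qubits {1, 2, 3}.
device_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
local_edges, local_to_physical = restricted_coupling_map(device_edges, [3, 1, 2])
needs_fallback = uses_outside_qubits([1, 2, 4], [1, 2, 3])
```

In this sketch, the check in `uses_outside_qubits` would decide whether the fallback is needed. The local edge list could then be given to the transpiler as its coupling map, and `local_to_physical` used to place the result back on the device.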
## Distributed Statevector Execution — Run Larger Circuits