Commit 909a751

Merge pull request #149 from wxj6000/patch-2
Improve the description of gpu4pyscf
2 parents b063802 + 302668b commit 909a751

1 file changed, with 38 additions and 84 deletions.

source/user/gpu.rst

@@ -12,30 +12,23 @@ GPU Acceleration (GPU4PySCF)
1212
Introduction
1313
============
1414

15-
Modern GPUs accelerate quantum chemistry calculation significantly, but also have an advantage in cost saving `[1]`_.
15+
Modern GPUs accelerate quantum chemistry calculation significantly, but also have an advantage in cost saving `[1]`_ `[2]`_.
1616
Some of basic PySCF modules, such as SCF and DFT, are accelerated with GPU via a plugin package
1717
GPU4PySCF (See the end of this page for the supported functionalities). For the density fitting scheme,
1818
GPU4PySCF on A100-80G can be 1000x faster than PySCF on single-core CPU. The speedup of direct SCF scheme is relatively low.
1919

20-
.. _[1]: https://arxiv.org/abs/2404.09452
20+
.. _[1]: https://arxiv.org/abs/2407.09700
21+
.. _[2]: https://arxiv.org/abs/2404.09452
2122

2223
Installation
2324
============
2425
The binary package of GPU4PySCF is released based on the CUDA version.
2526

26-
.. list-table::
27-
:widths: 25 25 25 25
28-
:header-rows: 1
29-
30-
* - CUDA version
31-
- GPU4PySCF
32-
- cuTensor
33-
* - CUDA 11.x
34-
- ``pip3 install gpu4pyscf-cuda11x``
35-
- ``pip3 install cutensor-cu11``
36-
* - CUDA 12.x
37-
- ``pip3 install gpu4pyscf-cuda12x``
38-
- ``pip3 install cutensor-cu12``
27+
============ =================================== ==============================
28+
CUDA version GPU4PySCF cuTensor
29+
CUDA 11.x ``pip3 install gpu4pyscf-cuda11x`` ``pip3 install cutensor-cu11``
30+
CUDA 12.x ``pip3 install gpu4pyscf-cuda12x`` ``pip3 install cutensor-cu12``
31+
============ =================================== ==============================
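
As a rough illustration of the table above, the mapping from CUDA version to package names could be written as a small helper. This is only a sketch: ``gpu4pyscf_packages`` and ``pip_commands`` are hypothetical names and not part of GPU4PySCF; the package names themselves are the ones listed in the table.

```python
# Hypothetical helper (not part of GPU4PySCF): maps a CUDA major version
# to the pip package names listed in the installation table.
PACKAGES = {
    11: ("gpu4pyscf-cuda11x", "cutensor-cu11"),
    12: ("gpu4pyscf-cuda12x", "cutensor-cu12"),
}


def gpu4pyscf_packages(cuda_version):
    """Return (gpu4pyscf_package, cutensor_package) for a version such as '12.4'."""
    major = int(str(cuda_version).split(".")[0])
    if major not in PACKAGES:
        raise ValueError(f"No binary package listed for CUDA {cuda_version}")
    return PACKAGES[major]


def pip_commands(cuda_version):
    """Build the pip3 commands from the table for the given CUDA version."""
    return [f"pip3 install {name}" for name in gpu4pyscf_packages(cuda_version)]


print(pip_commands("12.4"))
```

The lookup uses only the major version because the table distinguishes CUDA 11.x from CUDA 12.x and nothing finer.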
3932

4033
Usage of GPU4PySCF
4134
==================
@@ -44,7 +37,7 @@ classes and methods in GPU4PySCF are named identically to those in PySCF,
4437
ensuring a familiar interface for users. However, GPU4PySCF classes do not
4538
directly inherit from PySCF classes.
4639

47-
PySCF objects and GPU4PySCF objects can be converted to each other using the `to_gpu` and `to_cpu` methods.
40+
PySCF objects and GPU4PySCF objects can be converted to each other using the :func:`to_gpu` and :func:`to_cpu` methods.
4841
The conversion process can automatically, recursively translate all attributes between GPU and CPU instances.
4942
For example, numpy arrays on the CPU are converted into CuPy arrays on the GPU, and vice versa.
5043
If certain attributes are exclusive to either the CPU or GPU, these attributes will be appropriately handled.
@@ -94,7 +87,7 @@ There are two approaches to execute the computation on GPU.
9487
When the GPU task is done, the GPU4PySCF object can be converted into the corresponding PySCF object via :func:`mf.to_cpu()`.
9588

9689
In GPU4PySCF, wavefunctions, density matrices, and other array data are stored in CuPy arrays.
97-
To transfer these data to NumPy arrays on the CPU, the `.get()` method of the CuPy array can be invoked.
90+
To transfer these data to NumPy arrays on the CPU, the :func:`.get()` method of the CuPy array can be invoked.
9891
For more detailed information on handling CuPy array conversions, please refer to the `CuPy APIs` documentation.
9992

10093
.. Cupy APIs: https://docs.cupy.dev/en/stable/user_guide/index.html
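
The transfer described above can be sketched as a small utility that returns host data whether its input lives on the GPU or not. This is an illustrative assumption rather than GPU4PySCF code: ``to_host`` is a hypothetical name, and ``FakeDeviceArray`` is a stand-in that mimics the two CuPy array features relied on here (the ``__cuda_array_interface__`` attribute and the ``.get()`` method) so that the sketch runs without a GPU.

```python
def to_host(array):
    """Return a host (CPU) copy of `array` if it looks like a GPU array.

    CuPy arrays expose `__cuda_array_interface__` and a `.get()` method that
    returns a NumPy array; any other object is returned unchanged.
    """
    is_device_array = hasattr(array, "__cuda_array_interface__")
    if is_device_array and callable(getattr(array, "get", None)):
        return array.get()
    return array


class FakeDeviceArray:
    """Hypothetical stand-in for a CuPy array, used only for this demo."""

    __cuda_array_interface__ = {"version": 3}

    def __init__(self, data):
        self._data = list(data)

    def get(self):
        # A real CuPy array would return a NumPy array here.
        return list(self._data)


host = to_host(FakeDeviceArray([1.0, 2.0]))  # converted through .get()
unchanged = to_host([3.0, 4.0])              # already on the host
print(host, unchanged)
```

Checking for ``__cuda_array_interface__`` in addition to ``.get()`` avoids misidentifying objects such as dictionaries, which also have a ``get`` method.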
@@ -124,75 +117,36 @@ orbitals followed by orbital localization using the Boys method on the CPU::
124117
mf = mf.to_cpu()
125118
loc_orb = lo.Boys(mol, mf.mo_coeff[:,[2,3,4]]).kernel()
126119

127-
**GPU Implementation Availability**: The `to_gpu` method is implemented for
120+
**GPU Implementation Availability**: The :func:`to_gpu` method is implemented for
128121
almost all methods in PySCF. However, the actual availability of GPU4PySCF
129122
implementations for specific modules may vary. If a GPU4PySCF module is
130-
available, `to_gpu` will return a GPU4PySCF instance. Otherwise, it will raise a
131-
`NotImplementedError`.
123+
available, :func:`to_gpu` will return a GPU4PySCF instance. Otherwise, it will raise a
124+
:func:`NotImplementedError`.
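
Because ``to_gpu`` raises ``NotImplementedError`` when no GPU4PySCF module is available, calling code may want to fall back to the CPU object. A minimal sketch of that pattern follows; ``to_gpu_or_cpu`` is a hypothetical helper, and the two small classes are stand-ins for PySCF method objects so that the example runs without PySCF or a GPU.

```python
def to_gpu_or_cpu(method):
    """Try method.to_gpu(); keep the CPU object if no GPU version exists.

    Returns a pair (object, on_gpu) so the caller knows which one it got.
    """
    try:
        return method.to_gpu(), True
    except NotImplementedError:
        return method, False


class SupportedMethod:
    """Stand-in for a PySCF method that has a GPU4PySCF counterpart."""

    def to_gpu(self):
        return "gpu-instance"


class UnsupportedMethod:
    """Stand-in for a PySCF method without a GPU4PySCF counterpart."""

    def to_gpu(self):
        raise NotImplementedError


obj, on_gpu = to_gpu_or_cpu(SupportedMethod())
print(obj, on_gpu)

cpu_method = UnsupportedMethod()
obj, on_gpu = to_gpu_or_cpu(cpu_method)
print(obj is cpu_method, on_gpu)
```

Only ``NotImplementedError`` is caught, so unrelated failures (for example a missing CUDA runtime) still surface to the caller.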
132125

133126
Functionalities supported by GPU4PySCF
134127
======================================
135-
.. list-table::
136-
:widths: 25 25 25 25
137-
:header-rows: 1
138-
139-
* - Method
140-
- SCF
141-
- Gradient
142-
- Hessian
143-
* - direct SCF
144-
- O
145-
- GPU
146-
- CPU
147-
* - density fitting
148-
- O
149-
- O
150-
- O
151-
* - LDA
152-
- O
153-
- O
154-
- O
155-
* - GGA
156-
- O
157-
- O
158-
- O
159-
* - mGGA
160-
- O
161-
- O
162-
- O
163-
* - hybrid
164-
- O
165-
- O
166-
- O
167-
* - unrestricted
168-
- O
169-
- O
170-
- O
171-
* - PCM solvent
172-
- GPU
173-
- GPU
174-
- FD
175-
* - SMD solvent
176-
- GPU
177-
- GPU
178-
- FD
179-
* - dispersion correction
180-
- CPU*
181-
- CPU*
182-
- FD
183-
* - nonlocal correlation
184-
- O
185-
- O
186-
- NA
187-
* - ECP
188-
- CPU
189-
- CPU
190-
- CPU
191-
* - MP2
192-
- GPU
193-
- CPU
194-
- CPU
195-
* - CCSD
196-
- GPU
197-
- CPU
198-
- NA
128+
129+
====================== ===== ========= =========
130+
Method SCF Gradient Hessian
131+
direct SCF O GPU CPU
132+
density fitting O O O
133+
LDA O O O
134+
GGA O O O
135+
mGGA O O O
136+
hybrid O O O
137+
unrestricted O O O
138+
PCM solvent GPU GPU FD
139+
SMD solvent GPU GPU FD
140+
dispersion correction CPU* CPU* FD
141+
nonlocal correlation O O NA
142+
ECP CPU CPU CPU
143+
MP2 GPU CPU CPU
144+
CCSD GPU CPU NA
145+
====================== ===== ========= =========
146+
147+
- ‘O’: carefully optimized for GPU.
148+
- ‘CPU’: only cpu implementation.
149+
- ‘GPU’: drop-in replacement or naive implementation.
150+
- ‘FD’: use finite-difference gradient to approximate the exact Hessian matrix.
151+
- ‘NA’: not available.
152+
- ‘CPU*’: DFTD3 [100]/DFTD4 [101] on CPU.
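
For readers who want to query the support table programmatically, its contents can be transcribed into a dictionary. The sketch below copies the table and legend as given above; the names ``SUPPORT``, ``LEGEND``, and ``describe`` are hypothetical and do not exist in GPU4PySCF.

```python
# Transcription of the support table; columns are (SCF, Gradient, Hessian).
COLUMNS = ("SCF", "Gradient", "Hessian")
SUPPORT = {
    "direct SCF": ("O", "GPU", "CPU"),
    "density fitting": ("O", "O", "O"),
    "LDA": ("O", "O", "O"),
    "GGA": ("O", "O", "O"),
    "mGGA": ("O", "O", "O"),
    "hybrid": ("O", "O", "O"),
    "unrestricted": ("O", "O", "O"),
    "PCM solvent": ("GPU", "GPU", "FD"),
    "SMD solvent": ("GPU", "GPU", "FD"),
    "dispersion correction": ("CPU*", "CPU*", "FD"),
    "nonlocal correlation": ("O", "O", "NA"),
    "ECP": ("CPU", "CPU", "CPU"),
    "MP2": ("GPU", "CPU", "CPU"),
    "CCSD": ("GPU", "CPU", "NA"),
}
LEGEND = {
    "O": "carefully optimized for GPU",
    "CPU": "only CPU implementation",
    "GPU": "drop-in replacement or naive implementation",
    "FD": "finite-difference gradient approximates the exact Hessian",
    "NA": "not available",
    "CPU*": "DFTD3/DFTD4 on CPU",
}


def describe(method, task):
    """Return the legend text for one cell of the support table."""
    code = SUPPORT[method][COLUMNS.index(task)]
    return f"{method} / {task}: {code} ({LEGEND[code]})"


# List every method whose Hessian relies on finite differences.
fd_hessians = [name for name, row in SUPPORT.items() if row[2] == "FD"]
print(describe("PCM solvent", "Hessian"))
print(fd_hessians)
```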
