@@ -12,30 +12,23 @@ GPU Acceleration (GPU4PySCF)
 Introduction
 ============

-Modern GPUs accelerate quantum chemistry calculation significantly, but also have an advantage in cost saving `[1]`_.
+Modern GPUs accelerate quantum chemistry calculation significantly, but also have an advantage in cost saving `[1]`_ `[2]`_.
 Some of basic PySCF modules, such as SCF and DFT, are accelerated with GPU via a plugin package
 GPU4PySCF (See the end of this page for the supported functionalities). For the density fitting scheme,
 GPU4PySCF on A100-80G can be 1000x faster than PySCF on single-core CPU. The speedup of direct SCF scheme is relatively low.

-.. _[1]: https://arxiv.org/abs/2404.09452
+.. _[1]: https://arxiv.org/abs/2407.09700
+.. _[2]: https://arxiv.org/abs/2404.09452

 Installation
 ============
 The binary package of GPU4PySCF is released based on the CUDA version.

-.. list-table ::
-   :widths: 25 25 25 25
-   :header-rows: 1
-
-   * - CUDA version
-     - GPU4PySCF
-     - cuTensor
-   * - CUDA 11.x
-     - ``pip3 install gpu4pyscf-cuda11x``
-     - ``pip3 install cutensor-cu11``
-   * - CUDA 12.x
-     - ``pip3 install gpu4pyscf-cuda12x``
-     - ``pip3 install cutensor-cu12``
+============ =================================== ==============================
+CUDA version GPU4PySCF                           cuTensor
+CUDA 11.x    ``pip3 install gpu4pyscf-cuda11x``  ``pip3 install cutensor-cu11``
+CUDA 12.x    ``pip3 install gpu4pyscf-cuda12x``  ``pip3 install cutensor-cu12``
+============ =================================== ==============================

 Usage of GPU4PySCF
 ==================
@@ -44,7 +37,7 @@ classes and methods in GPU4PySCF are named identically to those in PySCF,
 ensuring a familiar interface for users. However, GPU4PySCF classes do not
 directly inherit from PySCF classes.

-PySCF objects and GPU4PySCF objects can be converted to each other using the `to_gpu` and `to_cpu` methods.
+PySCF objects and GPU4PySCF objects can be converted to each other using the :func:`to_gpu` and :func:`to_cpu` methods.
 The conversion process can automatically, recursively translate all attributes between GPU and CPU instances.
 For example, numpy arrays on the CPU are converted into CuPy arrays on the GPU, and vice versa.
 If certain attributes are exclusive to either the CPU or GPU, these attributes will be appropriately handled.
@@ -94,7 +87,7 @@ There are two approaches to execute the computation on GPU.
 When the GPU task is done, the GPU4PySCF object can be converted into the corresponding PySCF object via :func:`mf.to_cpu()`.

 In GPU4PySCF, wavefunctions, density matrices, and other array data are stored in CuPy arrays.
-To transfer these data to NumPy arrays on the CPU, the `.get()` method of the CuPy array can be invoked.
+To transfer these data to NumPy arrays on the CPU, the :func:`.get()` method of the CuPy array can be invoked.
 For more detailed information on handling CuPy array conversions, please refer to the `CuPy APIs` documentation.

 .. Cupy APIs: https://docs.cupy.dev/en/stable/user_guide/index.html
@@ -124,75 +117,36 @@ orbitals followed by orbital localization using the Boys method on the CPU::
     mf = mf.to_cpu()
     loc_orb = lo.Boys(mol, mf.mo_coeff[:,[2,3,4]]).kernel()

-**GPU Implementation Availability**: The `to_gpu` method is implemented for
+**GPU Implementation Availability**: The :func:`to_gpu` method is implemented for
 almost all methods in PySCF. However, the actual availability of GPU4PySCF
 implementations for specific modules may vary. If a GPU4PySCF module is
-available, `to_gpu` will return a GPU4PySCF instance. Otherwise, it will raise a
-`NotImplementedError`.
+available, :func:`to_gpu` will return a GPU4PySCF instance. Otherwise, it will raise a
+:func:`NotImplementedError`.

 Functionalities supported by GPU4PySCF
 ======================================
-.. list-table ::
-   :widths: 25 25 25 25
-   :header-rows: 1
-
-   * - Method
-     - SCF
-     - Gradient
-     - Hessian
-   * - direct SCF
-     - O
-     - GPU
-     - CPU
-   * - density fitting
-     - O
-     - O
-     - O
-   * - LDA
-     - O
-     - O
-     - O
-   * - GGA
-     - O
-     - O
-     - O
-   * - mGGA
-     - O
-     - O
-     - O
-   * - hybrid
-     - O
-     - O
-     - O
-   * - unrestricted
-     - O
-     - O
-     - O
-   * - PCM solvent
-     - GPU
-     - GPU
-     - FD
-   * - SMD solvent
-     - GPU
-     - GPU
-     - FD
-   * - dispersion correction
-     - CPU*
-     - CPU*
-     - FD
-   * - nonlocal correlation
-     - O
-     - O
-     - NA
-   * - ECP
-     - CPU
-     - CPU
-     - CPU
-   * - MP2
-     - GPU
-     - CPU
-     - CPU
-   * - CCSD
-     - GPU
-     - CPU
-     - NA
+
+====================== ===== ========= =========
+Method                 SCF   Gradient  Hessian
+direct SCF             O     GPU       CPU
+density fitting        O     O         O
+LDA                    O     O         O
+GGA                    O     O         O
+mGGA                   O     O         O
+hybrid                 O     O         O
+unrestricted           O     O         O
+PCM solvent            GPU   GPU       FD
+SMD solvent            GPU   GPU       FD
+dispersion correction  CPU*  CPU*      FD
+nonlocal correlation   O     O         NA
+ECP                    CPU   CPU       CPU
+MP2                    GPU   CPU       CPU
+CCSD                   GPU   CPU       NA
+====================== ===== ========= =========
+
+- ‘O’: carefully optimized for GPU.
+- ‘CPU’: only cpu implementation.
+- ‘GPU’: drop-in replacement or naive implementation.
+- ‘FD’: use finite-difference gradient to approximate the exact Hessian matrix.
+- ‘NA’: not available.
+- ‘CPU*’: DFTD3 [100]/DFTD4 [101] on CPU.