Skip to content

Commit fb7db07

Browse files
QG-phyYJQ-7
andauthored
Feat/modular julia pardiso (#310)
* feat(postprocess): Add functionality to export system data for Pardiso/Julia band structure calculations and include related tests. * feat: simplify band calculation by using solve_eigen_at_k directly - Remove redundant solve_eigen_dense function and fixed band window logic - Replace custom eigenvalue solving with existing solve_eigen_at_k function - Update example notebook with new imports and execution outputs * docs: improve dptb to Pardiso example notebook - Fix root_dir path from Pardiso_teach to To_pardiso - Add sys.path manipulation for proper module imports - Expand Julia installation instructions with detailed steps - Add MKL and Pardiso environment variable configuration - Include troubleshooting section for common issues - Clear unnecessary cell outputs and update execution counts * Refactor Julia backend: Modular architecture with Dense/Pardiso solver support * feat(pardiso): enforce MKL Pardiso and add Apple Silicon compatibility check Replace generic PardisoSolver with MKLPardisoSolver for consistent performance and add explicit architecture check for Apple Silicon systems to prevent runtime failures with helpful guidance to use numpy solver instead. * feat(postprocess): implement modular Julia backend with HDF5 output - Implement `dos_calculation.jl` for Density of States calculations - Update `main.jl` to switch between `band` and `dos` tasks - Refactor `band_calculation.jl` to export `bandstructure.h5` using native HDF5 - Add incremental `bands.dat` text output for real-time tracking - Add verification notebook `examples/To_pardiso/dptb_to_Pardiso_new.ipynb` * Refactor: Modular Julia backend and Optimized JSON Schema - Refactored to use direct ASE integration in . - Implemented modular Julia backend structure in . - Optimized : compressed species, removed redundant orbital arrays, aligned formatting. - Updated example notebook with new API usage. - Improved logging: Output logs to both console and . - Renamed output directory to . * feat: Add `pdso` command-line entry point for running the Julia/Pardiso solver backend. * Refactor: Rename Julia backend to 'pardiso', add legacy support, and fix spin handling - Renamed `dptb/postprocess/julia` to `dptb/postprocess/pardiso`. - Updated `io.jl` to support legacy `.dat` files via `load_structure_dat`. - Fixed `load_structure_json` to correctly account for spin degeneracy. - Updated `pdso.py` and tests to reflect directory rename. * refactor: Make JSON-based Pardiso export the default `to_pardiso` method and rename the legacy text export to `to_pardiso_debug`. * update unit test * feat: Add a comprehensive Pardiso tutorial notebook and enhance the `pdso` entrypoint with ill-conditioned state projection parameters. * feat: Add Julia installation scripts and documentation for the Pardiso backend, including platform support details. * refactor: Rename solver functions for clarity, encapsulate Pardiso solver in a module, and add ill-conditioned projection and improved eigenvector handling to the dense solver. (#2) * fix: address pardiso review comments after merge * fix: address follow-up pardiso review comments * fix: stabilize pardiso solver cleanup scope * chore: remove CLAUDE.md from tracking and add to .gitignore * docs: extend pardiso examples * docs: align pardiso documentation --------- Co-authored-by: YJQ-7 <yjq7472@163.com>
1 parent 155e85e commit fb7db07

29 files changed

Lines changed: 3981 additions & 22 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -186,3 +186,4 @@ poetry.toml
186186

187187
# JetBrains
188188
.idea/
189+
CLAUDE.md

README.md

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -180,6 +180,45 @@ Installing **DeePTB** is straightforward with UV, a fast Python package manager.
180180
> [!TIP]
181181
> For easier installation with automatic GPU/CPU detection, use the **From Source** method above instead.
182182

183+
- **Julia Backend** (Optional - for High-Performance Pardiso Solver)
184+
185+
> [!NOTE]
186+
> **Platform Support**: Pardiso backend currently supports **Linux only**.
187+
> - **macOS**: Not supported (Intel MKL limitations)
188+
> - **Windows**: Use WSL2 (Windows Subsystem for Linux)
189+
190+
If you want to use the Pardiso backend for accelerated band structure calculations:
191+
192+
**Automated Installation** (Recommended):
193+
```bash
194+
./install_julia.sh
195+
```
196+
197+
**Manual Installation**:
198+
1. Install Julia:
199+
```bash
200+
# Linux (macOS can install Julia, but the Pardiso backend is not supported)
201+
curl -fsSL https://install.julialang.org | sh
202+
```
203+
2. Install required packages:
204+
```bash
205+
julia install_julia_packages.jl
206+
```
207+
208+
**Verify Installation**:
209+
```bash
210+
julia -e 'using Pardiso; println("Pardiso available: ", Pardiso.MKL_PARDISO_LOADED[])'
211+
```
212+
213+
**Usage**:
214+
```bash
215+
dptb pdso band.json -i model.pth -stu structure.vasp -o ./output
216+
```
217+
218+
For more details, see:
219+
- [Pardiso Backend README](dptb/postprocess/pardiso/README.md)
220+
- [Example Tutorial](examples/To_pardiso/README.md)
221+
183222
## Test code
184223

185224
To ensure the code is correctly installed, please run the unit tests first:

docs/pardiso_architecture.md

Lines changed: 212 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,212 @@
1+
"""
2+
Proposed architecture for Pardiso integration.
3+
4+
This document outlines a better design for integrating high-performance
5+
eigensolvers (like Pardiso) into DeePTB's postprocessing workflow.
6+
7+
## Current Issues
8+
9+
1. **Tight coupling**: Julia script is standalone, requires manual invocation
10+
2. **Data redundancy**: Full H(R), S(R) exported to files
11+
3. **Fragmented workflow**: Python → files → Julia → files → Python
12+
4. **Limited reusability**: Only works for band calculations
13+
14+
## Proposed Architecture
15+
16+
### Phase 1: Modular Julia Backend (Current PR)
17+
18+
```
19+
dptb/postprocess/pardiso/
20+
├── io/
21+
│ └── io.jl # Read structure.json/legacy .dat files and H5 matrices
22+
├── solvers/
23+
│ ├── dense_solver.jl # Dense LAPACK solver
24+
│ └── pardiso_solver.jl # Pardiso eigenvalue solver
25+
├── tasks/
26+
│ ├── band_calculation.jl # Band structure task
27+
│ └── dos_calculation.jl # DOS task
28+
├── utils/
29+
│ ├── hamiltonian.jl # H(R) -> H(k)
30+
│ └── kpoints.jl # K-point generation
31+
└── main.jl # Entry point (calls tasks)
32+
```
33+
34+
**Benefits**:
35+
- Modular: Easy to add new tasks (DOS, optical, etc.)
36+
- Testable: Each module can be unit tested
37+
- Maintainable: Clear separation of concerns
38+
39+
### Phase 2: Python Solver Abstraction
40+
41+
```python
42+
# dptb/postprocess/unified/solvers/base.py
43+
from abc import ABC, abstractmethod
44+
45+
class EigenSolver(ABC):
46+
\"\"\"Abstract base class for eigenvalue solvers.\"\"\"
47+
48+
@abstractmethod
49+
def solve_at_k(self, H_k: np.ndarray, S_k: np.ndarray,
50+
num_bands: int, **kwargs) -> Tuple[np.ndarray, np.ndarray]:
51+
\"\"\"
52+
Solve generalized eigenvalue problem H|ψ⟩ = E S|ψ⟩ at single k-point.
53+
54+
Returns
55+
-------
56+
eigenvalues : np.ndarray (num_bands,)
57+
eigenvectors : np.ndarray (norb, num_bands)
58+
\"\"\"
59+
pass
60+
61+
@abstractmethod
62+
def solve_batch(self, kpoints: np.ndarray, H_R: dict, S_R: dict,
63+
num_bands: int, **kwargs) -> np.ndarray:
64+
\"\"\"
65+
Solve for multiple k-points efficiently.
66+
67+
Returns
68+
-------
69+
eigenvalues : np.ndarray (nk, num_bands)
70+
\"\"\"
71+
pass
72+
73+
74+
# dptb/postprocess/unified/solvers/numpy_solver.py
75+
class NumpySolver(EigenSolver):
76+
\"\"\"Default numpy/scipy solver (current implementation).\"\"\"
77+
78+
def solve_at_k(self, H_k, S_k, num_bands, **kwargs):
79+
if S_k is None:
80+
evals, evecs = np.linalg.eigh(H_k)
81+
else:
82+
evals, evecs = scipy.linalg.eigh(H_k, S_k)
83+
return evals[:num_bands], evecs[:, :num_bands]
84+
85+
def solve_batch(self, kpoints, H_R, S_R, num_bands, **kwargs):
86+
eigenvalues = []
87+
for kpt in kpoints:
88+
H_k = self._construct_hk(kpt, H_R)
89+
S_k = self._construct_hk(kpt, S_R) if S_R else None
90+
evals, _ = self.solve_at_k(H_k, S_k, num_bands)
91+
eigenvalues.append(evals)
92+
return np.array(eigenvalues)
93+
94+
95+
# dptb/postprocess/unified/solvers/pardiso_solver.py
96+
class PardisoSolver(EigenSolver):
97+
\"\"\"Julia/Pardiso backend solver for large systems.\"\"\"
98+
99+
def __init__(self, julia_script_path=None, use_pyjulia=False):
100+
self.use_pyjulia = use_pyjulia
101+
if use_pyjulia:
102+
self._init_pyjulia()
103+
else:
104+
self.julia_script = julia_script_path or self._default_script_path()
105+
106+
def solve_batch(self, kpoints, H_R, S_R, num_bands, **kwargs):
107+
if self.use_pyjulia:
108+
# Direct Julia call (no file I/O)
109+
return self._solve_via_pyjulia(kpoints, H_R, S_R, num_bands)
110+
else:
111+
# File-based workflow (current approach)
112+
return self._solve_via_files(kpoints, H_R, S_R, num_bands)
113+
114+
def _solve_via_files(self, kpoints, H_R, S_R, num_bands):
115+
# Export data
116+
temp_dir = self._export_data(H_R, S_R, kpoints)
117+
# Call Julia script
118+
result = subprocess.run(['julia', self.julia_script, temp_dir])
119+
# Load results
120+
eigenvalues = np.load(os.path.join(temp_dir, 'eigenvalues.npy'))
121+
return eigenvalues
122+
123+
124+
# Usage in BandAccessor
125+
class BandAccessor:
126+
def __init__(self, tbsystem, solver='numpy'):
127+
self.tbsystem = tbsystem
128+
self.solver = get_solver(solver) # Factory function
129+
130+
def compute(self, **kwargs):
131+
hr, sr = self.tbsystem.calculator.get_hr(self.tbsystem.data)
132+
eigenvalues = self.solver.solve_batch(
133+
self.kpoints, hr, sr, self.num_bands
134+
)
135+
self.eigenvalues = eigenvalues
136+
return self
137+
```
138+
139+
**Benefits**:
140+
- Unified interface: All properties (band, DOS, optical) use same solver
141+
- Pluggable: Easy to add new solvers (cuSOLVER, ELPA, etc.)
142+
- Backward compatible: Default numpy solver maintains current behavior
143+
144+
### Phase 3: Backend System (Future)
145+
146+
```python
147+
# User-facing API
148+
import dptb
149+
dptb.set_backend('pardiso') # Global setting
150+
151+
tbsys = TBSystem(data, calculator)
152+
bands = tbsys.band.compute(kpath) # Automatically uses Pardiso
153+
154+
# Or per-calculation
155+
bands = tbsys.band.compute(kpath, solver='pardiso')
156+
```
157+
158+
## Implementation Roadmap
159+
160+
### Immediate (Current PR)
161+
1. ✅ Optimize data format (JSON instead of text files)
162+
2. ✅ Use OrbitalMapper.atom_norb (avoid recomputation)
163+
3. ✅ Modularize Julia script
164+
4. ✅ Add Python export tests and lightweight Julia-load coverage
165+
166+
### Short-term (Next PR)
167+
1. Create `dptb/postprocess/unified/solvers/` module
168+
2. Implement `EigenSolver` ABC
169+
3. Refactor current code into `NumpySolver`
170+
4. Implement `PardisoSolver` (file-based)
171+
5. Integrate into `BandAccessor`
172+
173+
### Medium-term
174+
1. Add PyJulia support (optional dependency)
175+
2. Implement `PardisoSolver._solve_via_pyjulia()`
176+
3. Add solver benchmarks
177+
4. Documentation and examples
178+
179+
### Long-term
180+
1. Add more solvers (cuSOLVER for GPU, ELPA for HPC)
181+
2. Auto-selection based on system size
182+
3. Distributed computing support
183+
184+
## Design Principles
185+
186+
1. **Separation of concerns**: I/O, solving, post-processing are separate
187+
2. **Pluggable architecture**: Easy to add new solvers
188+
3. **Backward compatibility**: Default behavior unchanged
189+
4. **Performance**: Minimize data copying and file I/O
190+
5. **Testability**: Each component can be unit tested
191+
6. **Documentation**: Clear examples and API docs
192+
193+
## Example Usage
194+
195+
```python
196+
# Current workflow (after Phase 1)
197+
tbsys = TBSystem(data, calculator)
198+
tbsys.to_pardiso_json("pardiso_input")
199+
# Run `dptb pdso` or call dptb/postprocess/pardiso/main.jl manually.
200+
# Band results are written to bandstructure.h5/bands.dat.
201+
202+
# After Phase 2
203+
tbsys = TBSystem(data, calculator)
204+
bands = tbsys.band.compute(kpath, solver='pardiso')
205+
bands.plot()
206+
207+
# After Phase 3
208+
dptb.set_backend('pardiso')
209+
tbsys = TBSystem(data, calculator)
210+
bands = tbsys.band.compute(kpath) # Automatically fast!
211+
```
212+
"""

0 commit comments

Comments
 (0)