Skip to content

Puzzling performance with Spectra when solving a general eigen for large sparse matrix (real symmetric both L and M) #165

@jianbinghuang

Description

@jianbinghuang

bTripletSet.txt
clTripletSet.txt

For two sparse matrices uploaded above (in the form of csc), the following code takes about 9 seconds to run in python
evals_np, evecs_np = sla.eigsh(L_eigsh, k=k_eig, M=Mmat, sigma=eigs_sigma)
where L_eigsh is read from clTripletSet.txt and Mmat is read from bTripletSet.txt using the folloiwng code. K_eig is set to be 128, and eigs_sigma is set to be 1e-8.

def read_sparse_matrix_from_file(file_path):
rows = []
cols = []
values = []
with open(file_path, 'r') as file:
for line in file:
row, col, value = line.strip().split()
rows.append(int(row))
cols.append(int(col))
values.append(float(value))
num_rows = max(rows) + 1
num_cols = max(cols) + 1
matrix = scipy.sparse.coo_matrix((values, (rows, cols)), shape=(num_rows, num_cols))
return matrix

Using the same two matrices and trying to solve the same general eigenvalue problem takes much longer - I stopped the process after waiting for 10 minutes. The code below is used for this purpose, after sparseA and sparseB are formulated from the same two files above where sparseA was formulated from clTripletSet.txt and sparseB was formulated from bTripletSet.txt.

template std::tuple<Vec, Vec<Vec<std::tuple<int32_t, T>>>>
ComputeSymmetricGeneralEigenSparse(
const Eigen::SparseMatrix& sparseA,
const Eigen::SparseMatrix& sparseB,
int32_t nEigenValuesRequested)
{
int32_t matrixDim = static_cast<int32_t>(sparseA.rows());

// Construct eigen solver object, requesting half eigenvalues
using OpType = Spectra::SymShiftInvert<T, Eigen::Sparse, Eigen::Sparse>;
using BOpType = Spectra::SparseSymMatProd<T>;

OpType op(sparseA, sparseB);
BOpType Bop(sparseB);
auto nev = nEigenValuesRequested;
auto ncv = (nev + sparseA.rows()) / 2;
bool done = false;
Vec<T> sparseEigenVals;
Vec<Vec<std::tuple<int32_t, T>>> sparseEigenVecs;
T shift = T(-0.01);
int32_t nMaxIters = 10;
int32_t nIters = 0;
while (!done && nIters <= nMaxIters) {
    ++nIters;
    done = true;
    try {
        Spectra::SymGEigsShiftSolver<OpType, BOpType, Spectra::GEigsMode::ShiftInvert> eigs(op, Bop,
            nev, ncv, shift);

        // Initialize and compute
        eigs.init();
        int32_t maxit = 5000;
        int32_t nEigenValues = static_cast<int32_t>(eigs.compute(Spectra::SortRule::LargestMagn, maxit));
        while (nEigenValues < nEigenValuesRequested) {
            maxit *= 2;
            nEigenValues = static_cast<int32_t>(eigs.compute(Spectra::SortRule::LargestMagn, maxit));
        }
        auto sparseEigenValuesRaw = eigs.eigenvalues();
        auto sparseEigenVecsRaw = eigs.eigenvectors();
        sparseEigenVals.resize(nEigenValues);
        sparseEigenVecs.resize(nEigenValues);
        for (auto i = 0; i < nEigenValues; ++i) {
            sparseEigenVals[i] = sparseEigenValuesRaw[i];
            for (auto j = 0; j < matrixDim; ++j) {
                if (sparseEigenVecsRaw.col(i)[j] != T(0)) {
                    sparseEigenVecs[i].push_back({ j,
                        sparseEigenVecsRaw.col(i)[j] });
                }
            }
        }
    }
    catch (...) {
        done = false;
        shift *= T(2.0);
    }
}
return make_tuple(sparseEigenVals, sparseEigenVecs);

}

Any advise on how to make Spectra performance into par with scipy.sparse.linalg.eigsh (with arpack as its underlying implementation), or is such performance gap expected?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions