We've had a customer report that their model slowed significantly during LU decomposition within the past couple of months. The most likely culprit is the switch from the reference fblas to OpenBLAS. Even after setting OMP_NUM_THREADS (or the OpenBLAS-specific equivalent environment variable) to ensure we're not oversubscribing threads, the user reports that the performance regression persists. In this case the user is using SuperLU_DIST for the LU decomposition. I don't believe the user's problem size is large enough for a 64-bit to 32-bit index truncation issue, but I'm going to confirm with him.
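For reference, here's a sketch of the environment setup I'd ask the user to verify before launch. This assumes OpenBLAS was built with threading support; OPENBLAS_NUM_THREADS is OpenBLAS's own override, and pinning both it and OMP_NUM_THREADS to 1 per MPI rank rules out thread oversubscription as the cause:

```shell
# Pin BLAS threading to one thread per MPI rank so OpenBLAS's
# internal thread pool doesn't compete with MPI processes.
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1   # OpenBLAS-specific override; takes precedence for OpenBLAS

# Confirm the values the solver process will actually see.
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS OPENBLAS_NUM_THREADS=$OPENBLAS_NUM_THREADS"
```

If the regression survives this, threading contention is unlikely to be the cause and the BLAS kernels themselves (or the index-width question below) become the next suspects.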