…variables, level 2 (critical path for now) only
…per-level basis; chebyshev filter parameters are still only setable via the command line
…, generate leftover near nulls when restricting some fine near null vectors
… restricting finer near-nulls
Completed an initial visual review of this PR. Looks like a good contribution. Left a few comments, and I'll test it on some clover multigrid shortly.

No problem! There are certainly plenty of higher-priority things bouncing around, on all of our plates. Whenever you can get to it is fine and appreciated.

FYI, one of the recent merges of
maddyscientist left a comment
Finished visual review of this PR. Looks mostly good to my eye with some relatively small tweaks needed.
}

/**
@brief Return if we're on the coarsest grid right now
/**
@brief Load the null space vectors in from file
@param B Loaded null-space vectors (pre-allocated)
@param B Load null-space vectors to here
Can you add [in]/[out] tags to all the doxygen you've touched in this file?
/** Number of iterations between null vectors generated from each starting vector */
int filter_iterations_between_vectors[QUDA_MAX_MG_LEVEL];

/** Conservative estimate of largest eigenvalue of operator used for Chebyshev filter setup */
Is the doxygen correct here: min is largest e-value and max is lower bound?
sigma_old = sigma;
}
blas::copy(out, *tmp2);
blas::copy(out, tmp_2);
Just to note that this copy can be replaced with a swap. I've already applied this optimization to the feature/multi-rhs, so it's perhaps moot.
extern quda::mgarray<double> filter_lambda_min;
extern quda::mgarray<double> filter_lambda_max;

// Prepare to do the Cholesky decomposition for a thin-QR
std::vector<Complex> Vdagv_(num_vec * num_vec);

// outstanding bugfix

// Initializing to random vectors
if (!refresh) {
int num_initialize = param.mg_global.filter_startup_vectors[param.level];

if (sqrt(nrm2) > 1e-16) ax(1.0/sqrt(nrm2), *B[i]);// i/<i,i>
else errorQuda("\nCannot normalize %u vector (nrm=%e)\n", i, sqrt(nrm2));
}
if (getVerbosity() >= QUDA_VERBOSE) {
replace these four lines with two lines of logQuda
(*solve)(*out, *in);
diracSmoother->reconstruct(x, b, QUDA_MAT_SOLUTION);

if (getVerbosity() >= QUDA_VERBOSE) printfQuda("Solution = %g\n", norm2(x));

// before entering the eigen solver, let's free the B vectors to save some memory
ColorSpinorParam bParam(*param.B[0]);
for (int i = 0; i < (int)param.B.size(); i++) delete param.B[i];
Did this optimization to reduce memory get deleted intentionally?
@weinbe2 Seems to work fine for me with the changes to our interface that I implemented in etmc/tmLQCD#548 to preserve the status quo. I will keep track of this PR and make any adjustments that may become necessary due to ongoing changes.
kostrzewa left a comment
No issues from my side, haven't tested anything beyond our status quo, however.
For future reference, do not merge, it appears an issue crept in not in and the error that popped up is sm_80, fast build, QUDA_PRECISION=12
This PR refactors and expands the number of methods by which near-null vectors can be generated in QUDA. Due to the nature of the refactor and cleanup, this PR is interface breaking, but in principle in a future-proof way: near-null vector generation methods are now specified via an enum, QudaNullVectorSetupType, so it is straightforward to add more options in a non-breaking fashion. This PR also codifies the existing behavior that, if an input near-null vector is specified (via --mg-load-vec from the command line, for example), it is loaded and all other options are ignored.

The full list of methods now includes:
arXiv:2103.05034, P. Boyle and A. Yamaguchi

This PR also supports "polishing" near-null vectors generated by other methods with additional inverse iterations.

The incomplete test vector support in QUDA has been mostly removed as it requires a fuller refactor that is outside of the scope of this PR, though it is on the to-do list for the future as it is a demonstrably successful approach.
Command line arguments
The core command line argument, which has been repurposed, is --mg-setup-type [level] [method], where [method] can be inverse-iterations (default), chebyshev-filter, eigenvectors, test-vectors, restrict-fine, and free-field (where test-vectors gracefully errorQudas out).

Inverse iterations
No options for inverse iterations have been changed; --mg-setup-tol, --mg-setup-maxiter, etc., all behave as expected.

Eigenvectors
No options for eigenvectors have been changed.

Chebyshev Filter
The Chebyshev filter has a flexible set of parameters that control how a set of near-null vectors is generated: the initial low-pass filter, the number of starting vectors, and the subsequent generation of vectors from a low-passed starting vector.
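As a rough illustration of the bookkeeping behind the startup-vectors, startup-iterations, and iterations-between-vectors parameters, the sketch below computes how many near-null vectors each starting vector contributes and after how many matrix applications each one is emitted. Both helpers (`vectors_per_start`, `emission_iterations`) are hypothetical names and are not taken from the QUDA source; the rule that earlier starting vectors absorb the remainder is an assumption chosen to match the 24-vector, 5-start example in this description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not QUDA code): split n_vec near-null vectors across
// n_start random starting vectors. Assumption: when n_start does not divide
// n_vec, the earlier starting vectors each produce one extra vector.
std::vector<int> vectors_per_start(int n_vec, int n_start)
{
  assert(n_vec > 0 && n_start > 0 && n_start <= n_vec);
  const int base = n_vec / n_start;  // vectors every starting vector produces
  const int extra = n_vec % n_start; // leftover vectors to hand out
  std::vector<int> counts(n_start, base);
  for (int i = 0; i < extra; i++) counts[i]++;
  return counts;
}

// Hypothetical helper (not QUDA code): cumulative number of matrix
// applications at which each near-null vector from a single starting vector
// is emitted, i.e. startup_iters, startup_iters + iters_between, ...
std::vector<int> emission_iterations(int n_from_start, int startup_iters, int iters_between)
{
  assert(n_from_start >= 0 && startup_iters >= 0 && iters_between >= 0);
  std::vector<int> iterations;
  iterations.reserve(n_from_start);
  for (int k = 0; k < n_from_start; k++) iterations.push_back(startup_iters + k * iters_between);
  return iterations;
}
```

For instance, `vectors_per_start(24, 5)` gives the split for 24 near-null vectors and 5 starting vectors, and `emission_iterations(4, 1000, 150)` gives the iteration counts for the first four vectors with the default startup and spacing values.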
--mg-setup-filter-startup-vectors - the number of random starting vectors, default 1. As some examples, if the number of near-null vectors for level 1 is 24, --mg-setup-filter-startup-vectors 1 1 corresponds to one starting vector with 24 near-nulls generated; [...] 1 3 corresponds to three starting vectors with 8 near-nulls generated from each, etc. In cases like [...] 1 5, where 5 doesn't divide into 24, 5 near-null vectors are generated from each of the first four starting vectors, and 4 from the last (4 * 5 + 4 = 24).
--mg-setup-filter-startup-iterations - number of iterations for the initial low-pass filter, default 1000.
--mg-setup-filter-startup-rescale-frequency - an empirical feature; since the norm of a vector (or individual values thereof) could overflow, the vector can be renormalized with some frequency in a way that preserves the Chebyshev recursion; empirical default 50.
--mg-setup-filter-lambda-max - upper bound for the Chebyshev filters; by default, power iterations are used to estimate an upper bound.
--mg-setup-filter-lambda-min - lower bound to use for the initial low-pass filter; modes smaller than that value are enhanced. Default 1.
--mg-setup-filter-iterations-between-vectors - number of iterations between subsequent near-null vectors after the initial low-pass filter. As an example, if startup iterations is 1000 and the number of iterations between vectors is 150, a near-null vector is generated after 1000 (initial) matrix applications, then at 1150, 1300, 1450, and so on. Default 150.

Restriction
There are no special flags for restriction in and of itself, but --mg-setup-restrict-remaining-type [level] [method] can be used to specify which method to use to generate the "remaining" near-null vectors if fine nvec != coarse nvec. Parameters for the remainder method are taken from the flags for each method.

"Polishing" near-null vectors
"Polishing" near-null vectors with inverse iterations is enabled by specifying a non-zero number of polish iterations via --mg-setup-maxiter-inverse-iterations-polish [level] [numbers], where the default [numbers] is 0, corresponding to no polishing. Parameters for polishing are taken from the flags for inverse iteration setup.

Reference commands
A base command where we use inverse iterations for level 1 and a custom method for level 2 (specified by SETUP_FLAGS_LEVEL2) is:

Where we will fill in SETUP_FLAGS_LEVEL2 for different options.

Inverse Iterations
A standard setup is

Where the --mg-setup-type flag is optional as inverse iterations are the default.

Eigenvectors
A reference setup without polynomial acceleration is
Chebyshev filter
A reference setup where 4 base vectors are used (so 8 near-null vectors are generated from each base vector), the minimum of the low-pass filter is 1.0, a 500 iteration low-pass filter with rescaling every 50 iterations is used, and there are 100 iterations between subsequent near-nulls is:

Restriction
A reference setup with restriction, then using inverse iterations for the remaining 8 vectors (32 on level 2 minus 24 on level 1) is:

--mg-setup-restrict-remaining-type can be changed appropriately, grabbing other reference flags as appropriate.

Polishing with inverse iterations
As an example, the parameters for a Chebyshev filter can be included, and then they can be polished for 50 iterations via adding:
Outstanding work
clang-format