Skip to content

Fix MSVC test fails#2009

Open
spiralbit wants to merge 9 commits into
ginkgo-project:developfrom
spiralbit:fix/msvc-tests
Open

Fix MSVC test fails#2009
spiralbit wants to merge 9 commits into
ginkgo-project:developfrom
spiralbit:fix/msvc-tests

Conversation

@spiralbit
Copy link
Copy Markdown

@spiralbit spiralbit commented Apr 30, 2026

These changes fix two kinds of issues which cause various test failures seen running the tests on the following platform:

  • NVIDIA RTX5070 Ti
  • Windows 11
  • MSVC 14.50 (Compiler Version 19.50.35729) <-- this is the most relevant thing here in this PR
  • CUDA 13.2

The two underlying issues are:

  1. MSVC chokes on the combination of templating and braced initializer lists within member-initializer lists. Moving the initializations into the constructor bodies allows the MSVC templating engine to function correctly. Without this certain matrices were being default initialized to zero causing catastrophic failure of any tests which used this pattern.

MSVC has long‑standing conformance issues involving dependent types and braced initializer lists inside constructor member‑initializer lists. These are documented in multiple Microsoft Developer Community reports and reflected in MSVC’s own C++ conformance notes. Several major C++ libraries (Eigen, Boost, TensorFlow) avoid this pattern specifically due to MSVC miscompilation. This PR moves initialization into the constructor body, which is the standard workaround recommended by both library maintainers and Microsoft engineers.

Here are a couple of links concerning documented historical issues in MSVC with template resolution and initialiser lists:
https://learn.microsoft.com/en-us/cpp/overview/cpp-conformance-improvements-2019?view=msvc-170
https://devblogs.microsoft.com/cppblog/two-phase-name-lookup-support-comes-to-msvc/

  1. The RTTI type introspection outputs on MSVC is different to GCC and it adds "class " and "struct " to descriptions which was then causing failures in the logger tests. For example MSVC has:
class DummyLinOp *
class gko::matrix::Dense<float>
class gko::matrix::Dense<class std::complex<double>>

Whereas GCC has:

DummyLinOp *
gko::matrix::Dense<float>
gko::matrix::Dense<std::complex<double>>

This PR adds a couple of normalisation loops over the output and expected variables to strip out any mention of "class " or "struct ".

Here are the test summaries. Before the changes:

The following tests FAILED:
         15 - cuda/test/solver/upper_trs_kernels (Failed)
         18 - reference/test/base/batch_multi_vector_kernels (Failed)
         19 - reference/test/base/combination (Failed)
         20 - reference/test/base/composition (Failed)
         22 - reference/test/base/perturbation (Failed)
         48 - reference/test/matrix/batch_csr_kernels (Failed)
         49 - reference/test/matrix/batch_dense_kernels (Failed)
         50 - reference/test/matrix/batch_ell_kernels (Failed)
         53 - reference/test/matrix/dense_kernels (Failed)
         65 - reference/test/multigrid/pgm_kernels (Failed)
         66 - reference/test/multigrid/fixed_coarsening_kernels (Failed)
        174 - core/test/log/profiler_hook (Failed)
        285 - test/matrix/csr_kernels2_cuda (Failed)
        294 - test/matrix/matrix_cuda (Failed)

After:

The following tests FAILED:
         15 - cuda/test/solver/upper_trs_kernels (Failed)
         66 - reference/test/multigrid/fixed_coarsening_kernels (Failed)
        285 - test/matrix/csr_kernels2_cuda (Failed)
        294 - test/matrix/matrix_cuda (Failed)

These final four failures are CUDA hardware related and I am investigating them. One of these is reported here:

I've also added my cmake configuration script for reference.

ginkgo-run-cmake.txt

@spiralbit spiralbit marked this pull request as ready for review April 30, 2026 11:36
@yhmtsai
Copy link
Copy Markdown
Member

yhmtsai commented Apr 30, 2026

Hi @spiralbit Thanks for your contribution.
Could you also add the official document about the initialize list issue in the pr description?

@yhmtsai yhmtsai self-requested a review April 30, 2026 12:53
Copy link
Copy Markdown
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is also format issue.
Could you install pre-commit hook from ginkgo repo and run it via like pre-commit run --from-ref "origin/develop" --to-ref HEAD

Comment on lines +62 to +87

fine_b = gko::initialize<Vec>(
{I<VT>({2.0, -1.0}), I<VT>({-1.0, 2.0}), I<VT>({0.0, -1.0}),
I<VT>({3.0, -2.0}), I<VT>({-2.0, 1.0})},
exec);

coarse_b = gko::initialize<Vec>(
{I<VT>({2.0, -1.0}), I<VT>({0.0, -1.0})}, exec);

restrict_ans = (gko::initialize<Vec>(
{I<VT>({0.0, -1.0}), I<VT>({2.0, 0.0})}, exec));

prolong_ans = gko::initialize<Vec>(
{I<VT>({0.0, -2.0}), I<VT>({1.0, -2.0}), I<VT>({1.0, -2.0}),
I<VT>({0.0, -1.0}), I<VT>({2.0, 1.0})},
exec);

prolong_applyans = gko::initialize<Vec>(
{I<VT>({2.0, -1.0}), I<VT>({0.0, -1.0}), I<VT>({2.0, -1.0}),
I<VT>({0.0, -1.0}), I<VT>({2.0, -1.0})},
exec);

fine_x = gko::initialize<Vec>(
{I<VT>({-2.0, -1.0}), I<VT>({1.0, -1.0}), I<VT>({-1.0, -1.0}),
I<VT>({0.0, 0.0}), I<VT>({0.0, 2.0})},
exec);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can delete the new line between each variables

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's done now.

Comment thread core/test/log/profiler_hook.cpp Outdated
Comment thread core/test/log/profiler_hook.cpp Outdated
Comment thread core/test/log/profiler_hook.cpp
Comment thread core/test/log/profiler_hook.cpp Outdated
spiralbit and others added 5 commits April 30, 2026 15:02
Co-authored-by: Yu-Hsiang M. Tsai <19565938+yhmtsai@users.noreply.github.com>
Co-authored-by: Yu-Hsiang M. Tsai <19565938+yhmtsai@users.noreply.github.com>
Co-authored-by: Yu-Hsiang M. Tsai <19565938+yhmtsai@users.noreply.github.com>
Co-authored-by: Yu-Hsiang M. Tsai <19565938+yhmtsai@users.noreply.github.com>
@spiralbit
Copy link
Copy Markdown
Author

there is also format issue. Could you install pre-commit hook from ginkgo repo and run it via like pre-commit run --from-ref "origin/develop" --to-ref HEAD

Installed and run that. Hope the formatting looks OK to you now.

@spiralbit
Copy link
Copy Markdown
Author

Hi @spiralbit Thanks for your contribution. Could you also add the official document about the initialize list issue in the pr description?

MS have never formally acknowledged this exact problem, but they have acknowledged similar issues with templates, initializer lists and member-init lists. I've added a couple of references I could find though where issues with these kinds of things has been acknowledged. It's a known "fact" that MSVC trails Clang and GCC in standards compliance.

@yhmtsai
Copy link
Copy Markdown
Member

yhmtsai commented May 7, 2026

Hm, because you have mentioned the following in the pr description

These are documented in multiple Microsoft Developer Community reports and reflected in MSVC’s own C++ conformance notes. Several major C++ libraries (Eigen, Boost, TensorFlow) avoid this pattern specifically due to MSVC miscompilation.

I thought it is documented or raised the same question/discussion before. Is it not true?

Comment thread core/test/log/profiler_hook.cpp Outdated
@spiralbit
Copy link
Copy Markdown
Author

Hm, because you have mentioned the following in the pr description

These are documented in multiple Microsoft Developer Community reports and reflected in MSVC’s own C++ conformance notes. Several major C++ libraries (Eigen, Boost, TensorFlow) avoid this pattern specifically due to MSVC miscompilation.

I thought it is documented or raised the same question/discussion before. Is it not true?

Here's one reference to the kind of problem I could find: https://gitlab.com/libeigen/eigen/-/issues/2423. It's a bit of an obscure area, and it shouldn't really matter if the code is in a member-list init or in the body, but it certainly does seem to matter here - the failure mode is a zero-matrix result. I have tried to make a simple case which illustrates the problem in MSVC and have not managed it. I suspect it might be a template size / nesting problem which is only triggered when the parsing stack overflows or something like this. Because this code is test code I think it's academically interesting why it works in the body - it doesn't really matter, although I agree it would be nice to know exactly why it does.

@yhmtsai
Copy link
Copy Markdown
Member

yhmtsai commented May 8, 2026

yes, the initialization in the list or body is fine and they are in the test.

At least from my perspective, I do not only consider the code itself. A bug fix should address the root cause unless doing so would break the public interface or there is no practical way to fix it properly. For example, if the issue comes from an incorrect class constructor design, then we should fix the constructor itself rather than changing how member initialization is written.

If the PR is merged without clarification, people may simply trust the description when they encounter the same issue later and never look into the actual details.

In particular, when you mentioned that other libraries such as Eigen, Boost, and TensorFlow avoid this pattern for the same reason, I think we need to be more careful because that statement refers to decisions made by other projects. If it is inaccurate or unsupported by references, it could create unnecessary confusion or misrepresent what those projects actually do.

I do not mean that providing a reference automatically makes something true, but we should avoid introducing unsupported claims whenever possible, at least based on what we currently know. Otherwise, it also raises questions about the responsibility of the PR author.

Thanks again for working with me to help figure out the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants