@mmuetzel
Contributor

@mmuetzel mmuetzel commented Dec 27, 2025

Disable MPI because Ubuntu does not provide Fortran bindings for LLVM Flang.
Disable quadruple-precision math because Ubuntu distributes a Flang compiler runtime without support for it. (And reportedly [1], the upstream support for quadruple-precision floating-point math in Flang isn't ready yet either if I understood correctly.)

Also fix some build errors after configuring with -DHAVE_QP=ON and enable it by default (like until recently). (Addressed in #745.)

@mmuetzel
Contributor Author

mmuetzel commented Dec 27, 2025

Some tests seem to be failing with Flang as the Fortran compiler.
E.g., with runtime errors like this:

fatal Fortran runtime error(/home/runner/work/elmerfem/elmerfem/fem/src/modules/MagnetoDynamics/CalcFields.F90:1145): Assign: mismatching element counts in array assignment (to 15, from 6)

I don't know if that is an issue with the compiler, with its runtime or if it actually points at something that might be suspicious in ElmerFEM.

@mmuetzel
Contributor Author

mmuetzel commented Jan 1, 2026

Would it make sense to split the PR into two parts, so that you could merge the part that fixes building with support for quadruple-precision floating-point numbers and leave the part with the CI using LLVM Flang for later?

@mmuetzel
Contributor Author

mmuetzel commented Jan 9, 2026

Rebased on a current head after #745 was merged.

@mmuetzel mmuetzel marked this pull request as ready for review January 9, 2026 11:21
@mmuetzel mmuetzel changed the title from "Fix building with quadruple-precision fp support and add CI runner building with LLVM Flang" to "Add CI runner building with LLVM Flang" Jan 9, 2026
@juharu
Contributor

juharu commented Jan 9, 2026

> Some tests seem to be failing with Flang as the Fortran compiler. E.g., with runtime errors like this:
>
> fatal Fortran runtime error(/home/runner/work/elmerfem/elmerfem/fem/src/modules/MagnetoDynamics/CalcFields.F90:1145): Assign: mismatching element counts in array assignment (to 15, from 6)
>
> I don't know if that is an issue with the compiler, with its runtime or if it actually points at something that might be suspicious in ElmerFEM.

Yes, this definitely seems to be a bug. I'll give it a try. Thanks!

@raback
Contributor

raback commented Jan 9, 2026

I worked on this a little last week, and at least some of the problems seemed to come from non-explicit ranges when using GetReal.
https://github.com/ElmerCSC/elmerfem/tree/ExplicitGetRealRange
Merging this may resolve some of the tests.
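
A minimal sketch of the kind of mismatch behind the "mismatching element counts" error and of the explicit-range fix, assuming a fixed-size work array that is larger than the data assigned to it. The names `work` and `vals` are hypothetical, and the example does not call Elmer's GetReal:

```fortran
program explicit_range_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: work(15)               ! fixed-size work array (hypothetical)
  real(dp), allocatable :: vals(:)   ! stands in for values returned per element
  integer :: n

  n = 6
  allocate(vals(n))
  vals = 1.0_dp
  work = 0.0_dp

  ! Non-conforming, left commented out: 15 elements on the left, 6 on the right.
  ! work = vals

  ! Conforming: both sides have exactly n elements.
  work(1:n) = vals(1:n)

  print *, 'sum of work: ', sum(work)
end program explicit_range_sketch
```

Whether the non-conforming form is diagnosed at compile time, at run time, or not at all depends on the compiler and its checking options.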

@mmuetzel
Contributor Author

I rebased on the current head of the devel branch, and the number of failing tests dropped from 25 to 11.
Nice. 👍

The remaining test errors seem to be in different categories than the "mismatching element counts" errors.

@juharu
Contributor

juharu commented Jan 13, 2026

670 - circuits_harmonic_foil (Failed)                   3D circuits harmonic mgdyn serial whitney
671 - circuits_harmonic_foil_anl_rotm (Failed)          3D circuits harmonic mgdyn rotm serial whitney
672 - circuits_harmonic_foil_wvector (Failed)           3D circuits harmonic mgdyn serial whitney wvector
673 - circuits_harmonic_homogenization_coil_solver (Failed) 3D circuits harmonic homogenization mgdyn serial stranded whitney
676 - circuits_harmonic_stranded (Failed)               3D circuits harmonic mgdyn serial whitney
677 - circuits_harmonic_stranded_homogenization (Failed) 3D circuits harmonic homogenization mgdyn serial stranded whitney

These failures seem to occur because the default stack (8M) is not large enough on my laptop.
(Using "ulimit -s unlimited" on my computer fixes the failures.) Maybe we should look into whether we
can do better, but this might also be a compiler thing ...

@juharu
Contributor

juharu commented Jan 13, 2026

The patch below fixes the stack size problem for me. My first impression is that the compiler
is maybe doing something strange with the original code....

    diff --git a/fem/src/ElmerSolver.F90 b/fem/src/ElmerSolver.F90
    index afb7b025b..7c5fd73d1 100644
    --- a/fem/src/ElmerSolver.F90
    +++ b/fem/src/ElmerSolver.F90
    @@ -2298,7 +2298,15 @@
     
                IF(ASSOCIATED(Mesh % Edges)) THEN
                  IF ( i<=Mesh % NumberOfBulkElements) THEN
    -               Gotit = ListCheckPresent( IC, TRIM(Var % Name)//' {e}' )
    +#if 1
    +               BLOCK
    +                 CHARACTER(LEN(Var % Name)+4) :: s
    +                 s = Var % Name // ' {e}'
    +                 Gotit = ListCheckPresent( IC, Var % Name//' {e}' )
    +               END BLOCK
    +#else
    +                 Gotit = ListCheckPresent( IC, Var % Name//' {e}' )
    +#endif
                    IF ( Gotit ) THEN
                      DO k=1,Element % TYPE % NumberOfedges
                        Edge => Mesh % Edges(Element % EdgeIndexes(k))

@juharu
Contributor

juharu commented Jan 13, 2026

... and I wasn't even using the newly introduced 's' variable for anything there (as I had intended). So just
adding a no-op piece of code fixed the problem.

@juharu
Contributor

juharu commented Jan 13, 2026

Just adding the
BLOCK
END BLOCK
around the call to "ListCheckPresent()" seems enough ....
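
A minimal sketch of that workaround's shape, assuming a call whose argument is a character concatenation. The function `check_present` is a hypothetical stand-in for ListCheckPresent, and whether the BLOCK construct changes stack usage at all is compiler-dependent:

```fortran
program block_sketch
  implicit none
  character(len=:), allocatable :: var_name
  logical :: gotit

  var_name = 'potential'

  ! The concatenation creates a character temporary for the call.
  ! Wrapping the call in BLOCK ... END BLOCK is the workaround described
  ! above; its effect on stack usage is not guaranteed by the standard.
  block
    gotit = check_present(var_name // ' {e}')
  end block

  print *, 'present: ', gotit

contains

  ! Hypothetical stand-in for a list lookup such as ListCheckPresent.
  logical function check_present(name)
    character(len=*), intent(in) :: name
    check_present = (index(name, '{e}') > 0)
  end function check_present

end program block_sketch
```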

@juharu
Contributor

juharu commented Jan 13, 2026

154 - EM_port_eigen_2ndorder (Failed)                   complex_eigen eigen emwave serial

2941

This also runs smoothly (on my laptop) with added stack space ("ulimit -s unlimited").
The stack is consumed somewhere other than in the "circuits_harmonic*" tests, though.

@mmuetzel
Contributor Author

The patch is hard to read (using triple backticks for blocks of unformatted text might help).
One other change (apart from the BLOCK) could be that you removed the TRIM. Did that alone make a difference?

@juharu
Contributor

juharu commented Jan 13, 2026 via email

@juharu
Contributor

juharu commented Jan 13, 2026

I committed the BLOCK-END BLOCK change to the devel branch; it shouldn't do any harm ...

@juharu
Contributor

juharu commented Jan 14, 2026

After a recompilation, the tests "Contact3DLevelProj" and "Contact3DNormalProj" also exceed the default 8M main stack
on my laptop. The patch below fixes this (after applying it, you can reduce the stack size to < 512K). What I don't really understand is that this should arguably already be the default:

    -fno-stack-arrays Allocate array temporaries on the heap (default)

Or maybe the operative word is "temporaries"?

st.patch

@mmuetzel
Contributor Author

Thanks for looking into this. But the diff is unreadable with the default formatting. You need to use triple backticks for blocks of plain text in comments. Single backticks only work inside paragraphs of formatted text.
Could you please edit your comment and change the single backticks to triple backticks around the diff?

@juharu
Contributor

juharu commented Jan 14, 2026

OK, thanks. I'll do that next time; for now I attached the patch...

@mmuetzel
Contributor Author

mmuetzel commented Jan 14, 2026

Thanks for attaching the patch file.

Yeah. Explicitly allocating these potentially large arrays on the heap instead of on the stack looks reasonable to me.

@juharu
Contributor

juharu commented Jan 14, 2026

Yes, I think so too, I'll commit these changes (... and similar changes in the complex version of IDRS implementation) to "devel".

@mmuetzel
Contributor Author

I rebased again (on top of b44300d).
Only 4 of the 852 run tests are still failing:

	154 - EM_port_eigen_2ndorder (Failed)                   complex_eigen eigen emwave serial
	216 - FilmFlowPlane4 (Failed)                           quick serial
	217 - FilmFlowPlane5 (Failed)                           quick serial
	593 - SunAngle (Failed)                                 quick serial

Good progress. 👍

@juharu
Contributor

juharu commented Jan 14, 2026 via email

@juharu
Contributor

juharu commented Jan 14, 2026

For reference, gfortran has this option (which is why we haven't seen this type of problem in a while):

    -fmax-stack-var-size=n

        This option specifies the size in bytes of the largest array that is put on the stack; if the size is exceeded static memory is used (except in procedures marked as "RECURSIVE"). Use the option -frecursive to allow for recursive procedures that do not have a "RECURSIVE" attribute or for parallel programs. Use -fno-automatic to never use the stack.

        This option currently only affects local arrays declared with constant bounds, and may not apply to all character variables. Future versions of GNU Fortran may improve this behavior.

        The default value for n is 65536.

@mmuetzel
Contributor Author

Apparently, there is no support for -fmax-stack-var-size in LLVM Flang, and I couldn't find an issue where it is discussed.
According to their documentation:

> Flang already allocates all local arrays on the stack

Matching the documentation for gfortran, they follow up with:

> But there are some cases where temporary arrays are created on the heap by Flang.

Apparently, they implemented -fstack-arrays to force allocation of these temporaries on the stack, too. But I couldn't find anything to go the other way (i.e., more allocations on the heap).

If you have a reproducer, it might make sense to open an issue on their tracker: https://github.com/llvm/llvm-project/issues
Obviously, the current behavior (allocating automatic variables on the stack regardless of their size) can cause stack overflows in real-life applications.

@juharu
Contributor

juharu commented Jan 14, 2026

Below is a very simple test case. It reads a number "n" and passes it to the subroutine "msum", which uses it to size an automatic variable "w". The flang-compiled image crashes when n=~1100000 (running on my laptop, with the 8 MB default stack size). The gfortran-compiled image accepts anything that fits in main memory, e.g. 1000000000 (1e9), again running on my laptop.

I think Elmer is now mostly good with flang, given the changes I made to the source code. I can't explain the "circuits_harmonic*" test cases though, where the BLOCK-END BLOCK around the one subroutine call (somewhat dramatically) reduces stack usage.


```fortran
program test

   integer(8) :: n
   read(5,*) n
   call msum(n)

contains

  subroutine msum(n)
    integer(8) :: n
    real(8) :: w(n)

    call random_number(w)
    print*,'sum: ', n,sum(w)
  end subroutine msum

end program test
```
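
As a hedged sketch of the kind of source change discussed in this thread (moving a potentially large automatic array to the heap), the test case could be rewritten with an ALLOCATABLE array. The name `msum_heap` and the hard-coded value of `n` are illustrative assumptions; this is not the patch that was committed to Elmer:

```fortran
program test_heap
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer(int64) :: n

  ! About 16 MB of real64 data, more than a typical 8 MB default stack.
  n = 2000000_int64
  call msum_heap(n)

contains

  subroutine msum_heap(n)
    integer(int64), intent(in) :: n
    real(real64), allocatable :: w(:)   ! allocatable instead of automatic w(n)
    integer :: stat

    allocate(w(n), stat=stat)
    if (stat /= 0) then
      print *, 'allocation failed for n = ', n
      return
    end if

    call random_number(w)
    print *, 'sum: ', n, sum(w)
    ! w is deallocated automatically when the subroutine returns.
  end subroutine msum_heap

end program test_heap
```

With this form the array size is limited by available memory and not by the stack limit, at the cost of an explicit allocation on every call.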

Disable MPI because Ubuntu does not provide Fortran bindings for LLVM Flang.

Disable quadruple precision math because Ubuntu distributes a Flang runtime without support for it.
@mmuetzel
Contributor Author

mmuetzel commented Jan 14, 2026

I rebased on a current head (e11e6fb). With that, the following three tests are failing for the runner that builds with LLVM Flang:

	216 - FilmFlowPlane4 (Failed)                           quick serial
	217 - FilmFlowPlane5 (Failed)                           quick serial
	593 - SunAngle (Failed)                                 quick serial

@raback
Contributor

raback commented Jan 15, 2026

Nice work! All tests should pass now...

@raback raback merged commit 71f94d5 into ElmerCSC:devel Jan 15, 2026
11 of 12 checks passed