Fix top LLVM: renamed NVPTX barrier intrinsics. #8631

Merged: 8 commits, May 24, 2025

Conversation

mcourteaux
Contributor

@mcourteaux mcourteaux commented May 22, 2025

Problem: LLVM changed the names of its NVPTX barrier intrinsics: llvm/llvm-project#141143.

Solution: Rely on LLVM's accompanying auto-upgrade mechanism for these intrinsics. I added an alwaysinline wrapper that calls the original intrinsic name, which LLVM upgrades to the new name when needed. This keeps the code compatible with all versions of LLVM without any runtime or compile-time checks in the Halide code.

@mcourteaux mcourteaux added the build Issues related to building Halide and with CI label May 22, 2025
@mcourteaux mcourteaux changed the title from "Fix top llvm renaming NVPTX barrier intrinsics." to "Fix top LLVM: renamed NVPTX barrier intrinsics." May 22, 2025
@abadams
Member

abadams commented May 23, 2025

I'm surprised this worked, because I thought module->getFunction will get declarations, regardless of whether or not the llvm backend will eventually provide definitions for them. Is it possible the new name has worked for a while?

@alexreinking
Member

Shouldn't this be guarded by preprocessor macros, anyway?
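
For reference, a macro-guarded variant of the kind asked about here might look like the sketch below. It is not the code in this PR. `HALIDE_SKETCH_LLVM_MAJOR` and `barrier_intrinsic_name` are hypothetical names, and both the old name `llvm.nvvm.barrier0` and the cutoff at LLVM 21 are assumptions; only the new name appears in this thread (in the PTX dump further down).

```cpp
#include <string_view>

// Hypothetical stand-in for LLVM's version macro. A real build would take
// the major version from LLVM's own config header instead of defining it here.
#ifndef HALIDE_SKETCH_LLVM_MAJOR
#define HALIDE_SKETCH_LLVM_MAJOR 21
#endif

// Select the barrier intrinsic name at compile time.
// Assumptions: the rename took effect in LLVM 21, and the pre-rename
// name was llvm.nvvm.barrier0.
constexpr std::string_view barrier_intrinsic_name() {
#if HALIDE_SKETCH_LLVM_MAJOR >= 21
    return "llvm.nvvm.barrier.cta.sync.aligned.all";
#else
    return "llvm.nvvm.barrier0";
#endif
}
```

The PR description aims to avoid exactly this kind of check by leaning on LLVM's auto-upgrade, so the sketch only shows what the guarded alternative would involve.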

@mcourteaux
Contributor Author

> I'm surprised this worked, because I thought module->getFunction will get declarations, regardless of whether or not the llvm backend will eventually provide definitions for them. Is it possible the new name has worked for a while?

Well, the original code on main fails with that function returning null. So it seems that it is not considering the declarations we have in ptx_dev.ll. On the other hand, it still didn't return the new intrinsic when I didn't have it declared in ptx_dev.ll. Based on what I just experienced, it seems that llvm::Module::getFunction() only returns a function if it is both declared and has an implementation.

@mcourteaux
Contributor Author

Under LLVM 19, the generated PTX is indeed quite invalid:

	{ // callseq 2, 0
	.param .b32 param0;
	st.param.b32 	[param0], 0;
	call.uni 
	llvm.nvvm.barrier.cta.sync.aligned.all, 
	(
	param0
	);
	} // callseq 2

Seems like it does return the declaration...
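
To spell out why that matters, here is a toy model of a name-only lookup. It is deliberately not LLVM's API: `ToyModule`, `ToyFunction`, `get_function` and `null_check_passes` are all hypothetical names, used only to illustrate that a non-null result says nothing about whether the backend can actually lower the call.

```cpp
#include <map>
#include <string>

// Toy stand-in for one entry in a module's symbol table. NOT LLVM's API.
struct ToyFunction {
    bool has_body = false;  // false models a bare declaration
};

// Toy stand-in for a module: just a name-to-function map.
struct ToyModule {
    std::map<std::string, ToyFunction> symbols;

    // Name-only lookup: a declaration alone yields a non-null result,
    // matching the behaviour observed in the PTX dump above.
    const ToyFunction *get_function(const std::string &name) const {
        auto it = symbols.find(name);
        return it == symbols.end() ? nullptr : &it->second;
    }
};

// Hypothetical helper: the naive "does the intrinsic exist?" test.
// It passes for any declared name, known to the backend or not.
inline bool null_check_passes(const ToyModule &m, const std::string &name) {
    return m.get_function(name) != nullptr;
}
```

In this model, declaring the new name is enough to make the null check pass, even though nothing provides a lowering for it. That is the situation in which the intrinsic name ends up as a plain `call.uni` target in the PTX.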

@mcourteaux
Contributor Author

@steven-johnson @abadams @alexreinking Please see the revised solution. Much simpler and very reliable.

@abadams
Member

abadams commented May 23, 2025

That's an elegant solution. Could you put the new name in the comment in ptx_dev.ll so that whoever does the later upgrade doesn't have to figure it out?

@mcourteaux
Contributor Author

TODO: Diff the PTX outputs before and after this PR in LLVM 19. CUDA tests seem to hang.

@mcourteaux
Contributor Author

mcourteaux commented May 23, 2025

Result: there are no barriers in correctness_interleave_rgb (one of the tests that seemingly takes 200 to 1000 seconds, depending on the bot). This test takes less than 5 seconds on my machine. Diffing the entire output of this test with HL_DEBUG_CODEGEN=1 between main and this PR gives identical output (tested with LLVM 19).

Correctness tests are run in parallel, I believe, so perhaps the problem is in tests that involve barriers and run concurrently with the others? Update: Yes, correctness_gpu_thread_barrier does produce wildly different code and runs extremely slowly. I assume this slows down all the other tests too.

@mcourteaux
Contributor Author

Okay, I figured out how to correctly check if the intrinsic exists. All tests passing. Will merge.

@mcourteaux mcourteaux merged commit 60621b8 into halide:main May 24, 2025
15 of 17 checks passed
Labels
build Issues related to building Halide and with CI
4 participants