Conversation
WalkthroughThis update introduces support for Apple Metal GPUs in the ClimaComms package. The Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
test/hygiene.jl (1)
14-14: Specify Float32 in rand calls
Usingrand(Float32, …)ensures compatibility with Metal arrays. Consider refactoring repeated patterns into a helper function to reduce duplication.Also applies to: 19-19, 24-24, 29-29, 37-37, 42-42, 48-48, 53-53
docs/src/faqs.md (1)
78-78: Fix markdown formatting issue.Remove spaces inside the code span elements.
-so you might install packages like ` MPI.jl` and `CUDA.jl` (or `Metal.jl`). +so you might install packages like `MPI.jl` and `CUDA.jl` (or `Metal.jl`).🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
78-78: Spaces inside code span elements
null(MD038, no-space-in-code)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
Project.toml(1 hunks)README.md(1 hunks)docs/src/apis.md(2 hunks)docs/src/faqs.md(3 hunks)docs/src/index.md(1 hunks)docs/src/internals.md(2 hunks)ext/ClimaCommsMetalExt.jl(1 hunks)src/devices.jl(4 hunks)src/loading.jl(2 hunks)test/Project.toml(1 hunks)test/hygiene.jl(1 hunks)test/runtests.jl(8 hunks)test/test_cuda.jl(1 hunks)test/test_metal.jl(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`*`: # CodeRabbit Style Guide (CliMA Inspired)
Leverage CodeRabbit for code reviews aligning with CliMA's practices.
I. Key Areas for CodeRabbit:
- Style: Naming (Titl...
*: # CodeRabbit Style Guide (CliMA Inspired)Leverage CodeRabbit for code reviews aligning with CliMA's practices.
I. Key Areas for CodeRabbit:
- Style: Naming (TitleCase, lowercase_with_underscores), line length (<78), indentation (4 spaces), import order.
- DRY: Flag duplicated code; encourage modularization.
- Docstrings: Flag missing docstrings for modules, structs, functions.
- Tests: Detect missing unit tests (if configured).
- Complexity: Report on cyclomatic complexity.
II. Conventions (CodeRabbit Can Help):
- Naming: Follow CliMA/CMIP conventions. Avoid
l,O,Ias single-char vars.- Unicode: Human review needed. Acceptable chars.
III. Documentation (CodeRabbit Flags Deficiencies):
- Docstrings: Describe purpose, inputs, outputs, examples.
IV. Testing (CodeRabbit: Missing Tests):
- Unit tests required for new/modified code.
V. CodeRabbit Config:
.coderabbit.yaml: Customize language, line length, tests.VI. Workflow:
- Review CodeRabbit's suggestions; investigate, provide context, address issues, improve config.
VII. Abandoned Code: Delete or clearly mark as deprecated.
VIII. Reviewer Responsibilities: Ensure adherence to best practices.
README.mdProject.toml
🪛 markdownlint-cli2 (0.17.2)
docs/src/faqs.md
78-78: Spaces inside code span elements
null
(MD038, no-space-in-code)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: test-os (macos-latest)
- GitHub Check: docbuild
- GitHub Check: test-os (ubuntu-latest)
🔇 Additional comments (32)
test/Project.toml (1)
7-7: Add Metal dependency
Metal is now listed under[deps], aligning with the new backend support.README.md (1)
18-18: Document MetalDevice support
TheMetalDevicebullet is correctly added alongside existing devices.docs/src/index.md (1)
23-23: Include MetalDevice in documentation
MetalDeviceentry matches the list of supported devices and macros.test/test_cuda.jl (1)
1-13: New CUDA test runner
Script properly configures the environment and includes the main test suite.Project.toml (2)
19-19: LGTM - Extension and compatibility setup.The extension registration and Metal v1 compatibility constraint follow Julia package conventions correctly.
Also applies to: 27-27
14-14:Details
✅ Verification successful
Verify Metal.jl UUID.
Ensure the UUID matches the official Metal.jl package registration.
🏁 Script executed:
#!/bin/bash # Check if the Metal.jl UUID is correct curl -s "https://raw.githubusercontent.com/JuliaRegistries/General/master/M/Metal/Package.toml" | grep "uuid"Length of output: 153
Metal.jl UUID Verified
The UUID “dde4c033-4e86-420c-a63e-0dd931031962” in Project.toml matches the official Metal.jl registry entry. No further action required.test/test_metal.jl (2)
7-13: LGTM - Follows established test pattern.The test structure mirrors
test_cuda.jlappropriately, maintaining consistency across device-specific test runners.
4-5:Details
✅ Verification successful
Environment variable naming consistency.
Verify these environment variable names match the expected conventions used throughout the codebase.
🏁 Script executed:
#!/bin/bash # Check for consistent environment variable usage in the codebase rg "CLIMACOMMS_DEVICE|CLIMACOMMS_TEST_DEVICE" --type julia -A 2 -B 2Length of output: 2497
Environment variable naming consistency verified.
TheCLIMACOMMS_DEVICEandCLIMACOMMS_TEST_DEVICEsettings intest/test_metal.jlmatch the patterns insrc/devices.jl,test/test_cuda.jl, andtest/runtests.jl. No further changes needed.docs/src/apis.md (1)
16-16:Details
✅ Verification successful
Verify documented functions exist.
Ensure
metal_is_requiredandMetalDeviceare actually implemented in the codebase.Also applies to: 28-28
🏁 Script executed:
#!/bin/bash # Check that documented Metal functions and types exist rg -A 3 "metal_is_required|MetalDevice" --type juliaLength of output: 7714
Documentation additions verified
Both
metal_is_requiredandMetalDeviceare implemented in the codebase:
metal_is_required()in src/loading.jlstruct MetalDevicein src/devices.jlApproving these documentation updates.
docs/src/internals.md (2)
13-13: LGTM - Device list update.MetalDevice appropriately added to the device types list.
121-125: LGTM - Comprehensive Metal backend documentation.The Metal backend description follows the established pattern and provides clear information about the Metal.jl dependency requirement.
docs/src/faqs.md (2)
16-25: LGTM: Clear Metal device instructions added.Documentation properly mirrors the CUDA device setup instructions.
91-91: LGTM: Consistent Metal device summary documentation.Properly documents Metal device behavior in summary output.
src/loading.jl (3)
34-35: LGTM: Consistent Metal extension loading check.Mirrors the CUDA pattern perfectly.
40-51: LGTM: Well-documented Metal requirement function.Documentation and implementation are consistent with
cuda_is_required().
58-58: LGTM: Macro properly extended for Metal support.Conditional Metal import follows the established pattern.
Also applies to: 70-73
src/devices.jl (4)
39-44: LGTM: Well-documented MetalDevice struct.Proper subtyping and clear documentation for M-series chips.
66-67: LGTM: Consistent Metal device type detection.Follows the established pattern for device environment variable parsing.
83-83: LGTM: Documentation updated for Metal device.Properly includes Metal in the allowed device values.
93-96: LGTM: Proper Metal backend validation.Error handling mirrors CUDA implementation with appropriate error message.
test/runtests.jl (6)
23-24: LGTM: Consistent device type testing.Properly tests MetalDevice instance detection.
40-43: LGTM: Appropriate Float64 exclusions for Metal.Metal devices don't support Float64, so these skips are necessary and well-logged.
Also applies to: 111-114, 174-177, 192-195
237-240: LGTM: Consistent Float64 broadcast exclusion.Properly excludes Float64 broadcast test for Metal devices.
244-244: LGTM: Float32 used for Metal compatibility.Using Float32 ensures the test works across all device types.
249-250: LGTM: Consistent scalar access error testing.Properly includes Metal alongside CUDA for expected scalar access errors.
302-309: LGTM: Comprehensive Adapt.jl testing for Metal.Tests mirror CUDA Adapt.jl tests and ensure proper Metal array adaptation.
Also applies to: 326-339
ext/ClimaCommsMetalExt.jl (6)
1-8: LGTM! Clean module structure.Standard Julia extension setup with appropriate imports.
14-17: LGTM! Appropriate no-op for Metal's automatic device management.The implementation correctly reflects Metal's built-in device assignment behavior.
36-38: LGTM! Correct device availability check.Simple and effective implementation.
45-60: LGTM! Proper Adapt.jl integration.Both adaptation methods follow the correct patterns for device and context adaptation.
67-67: LGTM! Correct array type specification.
74-75: LGTM! Proper Metal scalar operation handling.
|
This change is part of the following stack: Change managed by git-spice. |
|
All the independent threaded tests run locally ClimaComms.jl/test/runtests.jl Lines 233 to 274 in b4fd26c the interdependent threaded example also works ClimaComms.jl/test/runtests.jl Lines 276 to 335 in b4fd26c but the independent and interdependent threaded test does not ClimaComms.jl/test/runtests.jl Lines 337 to 404 in b4fd26c This last test does run, but the the FT = Float32
a_threaded = AT(rand(FT, 100, 100))
a_unthreaded = AT(rand(FT, 100, 100))
b = AT(rand(FT, 100, 100))
a_threaded_backup = copy(a_threaded)
a_unthreaded_backup = copy(a_unthreaded)
is_single_cpu_thread =
device isa ClimaComms.CPUSingleThreaded &&
context isa ClimaComms.SingletonCommsContext
threaded_deriv3_with_respect_to_i!(device, a_threaded, b)
unthreaded_deriv3_with_respect_to_i!(device, a_unthreaded, b)
@test_broken a_threaded == a_unthreaded
@test a_threaded == a_threaded_backup
@test a_unthreaded != a_unthreaded_backup |
|
Sorry for not reviewing this earlier. I'm just not sure that using (and spreading the use of) |
Purpose
This is a speculative PR for adding
Metalas a supported device type.Merging this PR is not useful for simulations that depend on
ClimaCore, since that package currently only supportsCUDA. But, other packages likeCloudMicrophysicsmay enjoy local GPU-enabled development without the need for an external server-based GPU. The broader Julia ecosystem may also enjoy this contribution, as this package is useful for non-CliMA projects too.Content
ClimaCommsMetalExtextension that largely mirrors theClimaCommsCUDAExtextension.README.md,apis.md,faqs.md,index.md,internals.md, with description of the proposedMetalsupport.MetalDevicestruct, and extended methods likedevice_typeanddeviceto make use of it.metal_ext_is_loadedandmetal_is_requiredto look forMetalsupport, and extended the@import_required_backendsmacro to loadMetalwhen applicable.hygiene.jlare now allFloat32, which is required forMetalruntests.jlis successful onMetal, but onlyFloat32is tested. AnyFloat64tests are skipped.test_cuda.jlandtest_metal.jlfor convenient execution of tests on the respective devices. Can remove these files if not desired.test_metal.jlis useful for manual testing since, as far as I know, we don't have easy access to a remote test server with aMetal-compatible GPU. Given this, it may be useful to labelMetalsupport as "experimental" or "untested" or similar.