[ci] Enable metal tests on macOS #3026

Open: syl20bnr wants to merge 1 commit into main from ci/enable-metal-on-macos

Conversation

syl20bnr
Member

No description provided.

syl20bnr marked this pull request as ready for review on April 15, 2025 18:11
syl20bnr changed the title from "[ci] Enabled metal tests on macOS" to "[ci] Enable metal tests on macOS" on April 15, 2025
syl20bnr force-pushed the ci/enable-metal-on-macos branch from cb79160 to c04d091 on April 15, 2025 18:17

codecov bot commented Apr 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.14%. Comparing base (d6533da) to head (dd4d5d0).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3026      +/-   ##
==========================================
+ Coverage   81.12%   81.14%   +0.02%     
==========================================
  Files         816      816              
  Lines      117358   117358              
==========================================
+ Hits        95210    95234      +24     
+ Misses      22148    22124      -24     

syl20bnr force-pushed the ci/enable-metal-on-macos branch from c04d091 to dd4d5d0 on April 25, 2025 19:26
Collaborator

antimora left a comment

LGTM

Member

laggui left a comment

Had to dig through the CI logs, but the macOS failures are due to f16 precision errors:

2025-04-25T19:40:09.3655320Z failures:
2025-04-25T19:40:09.3656240Z     tests::cube::autodiff::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups
2025-04-25T19:40:09.3656860Z     tests::cube::autodiff::f16_ty::ad_cos::tests::should_diff_cos
2025-04-25T19:40:09.3657610Z     tests::cube::autodiff::f16_ty::ad_sin::tests::should_diff_sin
2025-04-25T19:40:09.3658450Z     tests::cube::autodiff_checkpointing::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups
2025-04-25T19:40:09.3659150Z     tests::cube::autodiff_checkpointing::f16_ty::ad_cos::tests::should_diff_cos
2025-04-25T19:40:09.3659880Z     tests::cube::autodiff_checkpointing::f16_ty::ad_sin::tests::should_diff_sin
2025-04-25T19:40:09.3660640Z     tests::cube::kernel::conv2d::tests::conv2d_should_match_reference_backend_bias_regression
2025-04-25T19:40:09.3661330Z     tests::cube::tensor::f16_ty::cos::tests::should_support_cos_ops
2025-04-25T19:40:09.3662090Z     tests::cube::tensor::f16_ty::module_unfold4d::tests::test_unfold4d_shape
2025-04-25T19:40:09.3662890Z     tests::cube_fusion::autodiff::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups
2025-04-25T19:40:09.3663570Z     tests::cube_fusion::autodiff::f16_ty::ad_cos::tests::should_diff_cos
2025-04-25T19:40:09.3664310Z     tests::cube_fusion::autodiff::f16_ty::ad_sin::tests::should_diff_sin
2025-04-25T19:40:09.3665060Z     tests::cube_fusion::autodiff::f16_ty::ad_softmax::tests::test_log_softmax_grad
2025-04-25T19:40:09.3665940Z     tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups
2025-04-25T19:40:09.3666560Z     tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_cos::tests::should_diff_cos
2025-04-25T19:40:09.3667440Z     tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_sin::tests::should_diff_sin
2025-04-25T19:40:09.3668100Z     tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_softmax::tests::test_log_softmax_grad
2025-04-25T19:40:09.3668790Z     tests::cube_fusion::tensor::f16_ty::cos::tests::should_support_cos_ops
2025-04-25T19:40:09.3669370Z 
2025-04-25T19:40:09.3670390Z test result: FAILED. 5012 passed; 18 failed; 28 ignored; 0 measured; 0 filtered out; finished in 605.82s

Is this expected? If so, we would have to relax the tolerance for these tests on macOS.
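
One way to express a macOS-specific tolerance is a compile-time platform check. The following is only a minimal sketch, not the project's actual test configuration. `Tol` and `f16_cos_tolerance` are hypothetical names. The macOS value of 2.0e-3 is an assumed placeholder, picked only because it sits above the largest relative difference reported for the cos tests (1.70e-3).

```rust
/// Hypothetical tolerance pair for illustration; not the type used by the test suite.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Tol {
    pub rel: f64,
    pub abs: f64,
}

/// Tolerance for the f16 cos tests.
/// The default values are the ones printed in the CI log (rel = 1.00e-3, abs = 1.00e-5).
pub fn f16_cos_tolerance() -> Tol {
    if cfg!(target_os = "macos") {
        // Assumed placeholder: slightly above the 1.70e-3 relative difference seen on macOS.
        Tol { rel: 2.0e-3, abs: 1.0e-5 }
    } else {
        Tol { rel: 1.0e-3, abs: 1.0e-5 }
    }
}
```

Because `cfg!` is evaluated at compile time, other platforms keep the stricter values and would still catch a real regression there.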

2025-04-25T19:40:08.3993300Z thread 'tests::cube::autodiff::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.3993360Z Tensors are not approx eq:
2025-04-25T19:40:08.3993450Z   => Position 0: 5344 != 4096
2025-04-25T19:40:08.3993560Z      diff (rel = +1.32e-1, abs = +1.25e3), tol (rel = +4.50e-3, abs = +1.00e-5)
2025-04-25T19:40:08.3993610Z   => Position 1: 5344 != 4096
2025-04-25T19:40:08.3993710Z      diff (rel = +1.32e-1, abs = +1.25e3), tol (rel = +4.50e-3, abs = +1.00e-5)
2025-04-25T19:40:08.3993840Z note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

2025-04-25T19:40:08.4809500Z thread 'tests::cube::autodiff::f16_ty::ad_cos::tests::should_diff_cos' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.4809660Z Tensors are not approx eq:
2025-04-25T19:40:08.4809810Z   => Position 0: 9.1875 != 9.21875
2025-04-25T19:40:08.4810040Z      diff (rel = +1.70e-3, abs = +3.12e-2), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:08.4810230Z   => Position 2: -28.640625 != -28.71875
2025-04-25T19:40:08.4810430Z      diff (rel = +1.36e-3, abs = +7.81e-2), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:08.4810670Z   => Position 3: 49.875 != 49.75
2025-04-25T19:40:08.4810880Z      diff (rel = +1.25e-3, abs = +1.25e-1), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:08.4810990Z 
2025-04-25T19:40:08.4811220Z ---- tests::cube::autodiff::f16_ty::ad_sin::tests::should_diff_sin stdout ----
2025-04-25T19:40:08.4811320Z 
2025-04-25T19:40:08.4811630Z thread 'tests::cube::autodiff::f16_ty::ad_sin::tests::should_diff_sin' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.4811780Z Tensors are not approx eq:
2025-04-25T19:40:08.4811940Z   => Position 0: 8.875 != 8.8515625
2025-04-25T19:40:08.4812150Z      diff (rel = +1.32e-3, abs = +2.34e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:08.4812310Z   => Position 1: -5.0234375 != -4.9804688
2025-04-25T19:40:08.4812600Z      diff (rel = +4.30e-3, abs = +4.30e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:08.4812760Z   => Position 2: 8.875 != 8.8515625
2025-04-25T19:40:08.4812970Z      diff (rel = +1.32e-3, abs = +2.34e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:08.4813060Z   => Position 3: -5.0234375 != -4.9804688
2025-04-25T19:40:08.4813160Z      diff (rel = +4.30e-3, abs = +4.30e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:08.4813170Z 
2025-04-25T19:40:08.4813400Z ---- tests::cube::autodiff_checkpointing::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups stdout ----
2025-04-25T19:40:08.4813400Z 
2025-04-25T19:40:08.4813710Z thread 'tests::cube::autodiff_checkpointing::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.4813770Z Tensors are not approx eq:
2025-04-25T19:40:08.4813880Z   => Position 0: 5344 != 4096
2025-04-25T19:40:08.4813980Z      diff (rel = +1.32e-1, abs = +1.25e3), tol (rel = +4.50e-3, abs = +1.00e-5)
2025-04-25T19:40:08.4814030Z   => Position 1: 5344 != 4096
2025-04-25T19:40:08.4814130Z      diff (rel = +1.32e-1, abs = +1.25e3), tol (rel = +4.50e-3, abs = +1.00e-5)
2025-04-25T19:40:08.4814130Z 
2025-04-25T19:40:08.4814290Z ---- tests::cube::autodiff_checkpointing::f16_ty::ad_cos::tests::should_diff_cos stdout ----
2025-04-25T19:40:08.4814290Z 
2025-04-25T19:40:08.4814530Z thread 'tests::cube::autodiff_checkpointing::f16_ty::ad_cos::tests::should_diff_cos' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.4814580Z Tensors are not approx eq:
2025-04-25T19:40:08.4814620Z   => Position 0: 9.1875 != 9.21875
2025-04-25T19:40:08.4814720Z      diff (rel = +1.70e-3, abs = +3.12e-2), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:08.4814780Z   => Position 2: -28.640625 != -28.71875
2025-04-25T19:40:08.4814870Z      diff (rel = +1.36e-3, abs = +7.81e-2), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:08.4814920Z   => Position 3: 49.875 != 49.75
2025-04-25T19:40:08.4815010Z      diff (rel = +1.25e-3, abs = +1.25e-1), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:08.4815020Z 
2025-04-25T19:40:08.4815170Z ---- tests::cube::autodiff_checkpointing::f16_ty::ad_sin::tests::should_diff_sin stdout ----
2025-04-25T19:40:08.4815170Z 
2025-04-25T19:40:08.4815400Z thread 'tests::cube::autodiff_checkpointing::f16_ty::ad_sin::tests::should_diff_sin' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.4815500Z Tensors are not approx eq:
2025-04-25T19:40:08.4815550Z   => Position 0: 8.875 != 8.8515625
2025-04-25T19:40:08.4815770Z      diff (rel = +1.32e-3, abs = +2.34e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:08.4815830Z   => Position 1: -5.0234375 != -4.9804688
2025-04-25T19:40:08.4815920Z      diff (rel = +4.30e-3, abs = +4.30e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:08.4815970Z   => Position 2: 8.875 != 8.8515625
2025-04-25T19:40:08.4816060Z      diff (rel = +1.32e-3, abs = +2.34e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:08.4816110Z   => Position 3: -5.0234375 != -4.9804688
2025-04-25T19:40:08.4816210Z      diff (rel = +4.30e-3, abs = +4.30e-2), tol (rel = +5.00e-4, abs = +0.00e0)

2025-04-25T19:40:08.5483250Z thread 'tests::cube::kernel::conv2d::tests::conv2d_should_match_reference_backend_bias_regression' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.5483310Z Tensors are not approx eq:
2025-04-25T19:40:08.5483360Z   => Position 2: 0.25510266 != 0.25507012
2025-04-25T19:40:08.5483470Z      diff (rel = +6.38e-5, abs = +3.25e-5), tol (rel = +7.63e-6, abs = +1.88e-37)
2025-04-25T19:40:08.5483530Z   => Position 3: 0.32525328 != 0.32523832
2025-04-25T19:40:08.5483620Z      diff (rel = +2.30e-5, abs = +1.50e-5), tol (rel = +7.63e-6, abs = +1.88e-37)
2025-04-25T19:40:08.5483690Z   => Position 4: 1.209008 != 1.2090474
2025-04-25T19:40:08.5483800Z      diff (rel = +1.63e-5, abs = +3.95e-5), tol (rel = +7.63e-6, abs = +1.88e-37)
2025-04-25T19:40:08.5483850Z   => Position 6: 0.640136 != 0.64014834
2025-04-25T19:40:08.5483950Z      diff (rel = +9.64e-6, abs = +1.23e-5), tol (rel = +7.63e-6, abs = +1.88e-37)
2025-04-25T19:40:08.5484010Z   => Position 7: 0.2561154 != 0.2561429
2025-04-25T19:40:08.5484170Z      diff (rel = +5.37e-5, abs = +2.75e-5), tol (rel = +7.63e-6, abs = +1.88e-37)
2025-04-25T19:40:08.5484230Z 14 more errors...

2025-04-25T19:40:08.5489510Z thread 'tests::cube::tensor::f16_ty::cos::tests::should_support_cos_ops' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.5490120Z Tensors are not approx eq:
2025-04-25T19:40:08.5490850Z   => Position 5: 0.28442383 != 0.2836914
2025-04-25T19:40:08.5491630Z      diff (rel = +1.29e-3, abs = +7.32e-4), tol (rel = +1.00e-3, abs = +1.00e-5)

2025-04-25T19:40:08.9458910Z thread 'tests::cube_fusion::autodiff::f16_ty::ad_cos::tests::should_diff_cos' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.9458960Z Tensors are not approx eq:
2025-04-25T19:40:08.9459010Z   => Position 0: 9.1875 != 9.21875
2025-04-25T19:40:08.9459120Z      diff (rel = +1.70e-3, abs = +3.12e-2), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:08.9459340Z   => Position 2: -28.640625 != -28.71875
2025-04-25T19:40:08.9459440Z      diff (rel = +1.36e-3, abs = +7.81e-2), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:08.9459490Z   => Position 3: 49.875 != 49.75
2025-04-25T19:40:08.9459590Z      diff (rel = +1.25e-3, abs = +1.25e-1), tol (rel = +1.00e-3, abs = +1.00e-5)

2025-04-25T19:40:08.9521170Z thread 'tests::cube_fusion::autodiff::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:08.9521230Z Tensors are not approx eq:
2025-04-25T19:40:08.9521280Z   => Position 0: 5344 != 4096
2025-04-25T19:40:08.9521390Z      diff (rel = +1.32e-1, abs = +1.25e3), tol (rel = +4.50e-3, abs = +1.00e-5)
2025-04-25T19:40:08.9521440Z   => Position 1: 5344 != 4096
2025-04-25T19:40:08.9521540Z      diff (rel = +1.32e-1, abs = +1.25e3), tol (rel = +4.50e-3, abs = +1.00e-5)

2025-04-25T19:40:09.2338000Z thread 'tests::cube_fusion::autodiff::f16_ty::ad_sin::tests::should_diff_sin' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:09.2338160Z Tensors are not approx eq:
2025-04-25T19:40:09.2338370Z   => Position 0: 8.875 != 8.8515625
2025-04-25T19:40:09.2351350Z      diff (rel = +1.32e-3, abs = +2.34e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:09.2351890Z   => Position 1: -5.0234375 != -4.9804688
2025-04-25T19:40:09.2352110Z      diff (rel = +4.30e-3, abs = +4.30e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:09.2352290Z   => Position 2: 8.875 != 8.8515625
2025-04-25T19:40:09.2352470Z      diff (rel = +1.32e-3, abs = +2.34e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:09.2352710Z   => Position 3: -5.0234375 != -4.9804688
2025-04-25T19:40:09.2352900Z      diff (rel = +4.30e-3, abs = +4.30e-2), tol (rel = +5.00e-4, abs = +0.00e0)

2025-04-25T19:40:09.3052540Z thread 'tests::cube_fusion::autodiff::f16_ty::ad_softmax::tests::test_log_softmax_grad' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:09.3052590Z Tensors are not approx eq:
2025-04-25T19:40:09.3052640Z   => Position 1: -4.34375 != -4.3945313
2025-04-25T19:40:09.3052750Z      diff (rel = +5.81e-3, abs = +5.08e-2), tol (rel = +5.00e-3, abs = +1.00e-5)

2025-04-25T19:40:09.3098990Z thread 'tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_cos::tests::should_diff_cos' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:09.3099040Z Tensors are not approx eq:
2025-04-25T19:40:09.3099090Z   => Position 0: 9.1875 != 9.21875
2025-04-25T19:40:09.3099300Z      diff (rel = +1.70e-3, abs = +3.12e-2), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:09.3099360Z   => Position 2: -28.640625 != -28.71875
2025-04-25T19:40:09.3099460Z      diff (rel = +1.36e-3, abs = +7.81e-2), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:09.3099510Z   => Position 3: 49.875 != 49.75
2025-04-25T19:40:09.3099610Z      diff (rel = +1.25e-3, abs = +1.25e-1), tol (rel = +1.00e-3, abs = +1.00e-5)
2025-04-25T19:40:09.3099610Z 
2025-04-25T19:40:09.3099860Z ---- tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups stdout ----
2025-04-25T19:40:09.3099860Z 
2025-04-25T19:40:09.3100190Z thread 'tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_conv_transpose3d::tests::test_conv_transpose3d_complex_groups' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:09.3100240Z Tensors are not approx eq:
2025-04-25T19:40:09.3100280Z   => Position 0: 5344 != 4096
2025-04-25T19:40:09.3100380Z      diff (rel = +1.32e-1, abs = +1.25e3), tol (rel = +4.50e-3, abs = +1.00e-5)
2025-04-25T19:40:09.3100420Z   => Position 1: 5344 != 4096
2025-04-25T19:40:09.3100520Z      diff (rel = +1.32e-1, abs = +1.25e3), tol (rel = +4.50e-3, abs = +1.00e-5)

2025-04-25T19:40:09.3291330Z thread 'tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_sin::tests::should_diff_sin' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:09.3292370Z Tensors are not approx eq:
2025-04-25T19:40:09.3293130Z   => Position 0: 8.875 != 8.8515625
2025-04-25T19:40:09.3293970Z      diff (rel = +1.32e-3, abs = +2.34e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:09.3294690Z   => Position 1: -5.0234375 != -4.9804688
2025-04-25T19:40:09.3295490Z      diff (rel = +4.30e-3, abs = +4.30e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:09.3296200Z   => Position 2: 8.875 != 8.8515625
2025-04-25T19:40:09.3297010Z      diff (rel = +1.32e-3, abs = +2.34e-2), tol (rel = +5.00e-4, abs = +0.00e0)
2025-04-25T19:40:09.3297710Z   => Position 3: -5.0234375 != -4.9804688
2025-04-25T19:40:09.3298500Z      diff (rel = +4.30e-3, abs = +4.30e-2), tol (rel = +5.00e-4, abs = +0.00e0)

2025-04-25T19:40:09.3529420Z thread 'tests::cube_fusion::autodiff_checkpointing::f16_ty::ad_softmax::tests::test_log_softmax_grad' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:09.3529930Z Tensors are not approx eq:
2025-04-25T19:40:09.3530740Z   => Position 1: -4.34375 != -4.3945313
2025-04-25T19:40:09.3531450Z      diff (rel = +5.81e-3, abs = +5.08e-2), tol (rel = +5.00e-3, abs = +1.00e-5)

2025-04-25T19:40:09.3651110Z thread 'tests::cube_fusion::tensor::f16_ty::cos::tests::should_support_cos_ops' panicked at crates/burn-wgpu/src/lib.rs:127:5:
2025-04-25T19:40:09.3651670Z Tensors are not approx eq:
2025-04-25T19:40:09.3652470Z   => Position 5: 0.28442383 != 0.2836914
2025-04-25T19:40:09.3653270Z      diff (rel = +1.29e-3, abs = +7.32e-4), tol (rel = +1.00e-3, abs = +1.00e-5)
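
The `rel` values printed above are consistent with |a - b| / (|a| + |b|). For example, 9.1875 vs 9.21875 gives 0.03125 / 18.40625 ≈ 1.70e-3. That formula is inferred from the logged numbers, not taken from the assertion's source. The sketch below reproduces the reported differences under that assumption. `diffs` and `approx_eq` are hypothetical helpers, and the pass rule (within either tolerance) is also an assumption.

```rust
/// Returns (absolute difference, relative difference).
/// Assumption inferred from the log: rel = |a - b| / (|a| + |b|).
pub fn diffs(a: f64, b: f64) -> (f64, f64) {
    let abs = (a - b).abs();
    let denom = a.abs() + b.abs();
    // Guard against 0/0 when both values are zero.
    let rel = if denom == 0.0 { 0.0 } else { abs / denom };
    (abs, rel)
}

/// Assumed pass rule: a pair is accepted if it is within either tolerance.
pub fn approx_eq(a: f64, b: f64, rel_tol: f64, abs_tol: f64) -> bool {
    let (abs, rel) = diffs(a, b);
    abs <= abs_tol || rel <= rel_tol
}
```

Under these assumptions, `approx_eq(9.1875, 9.21875, 1.0e-3, 1.0e-5)` is false, matching the `should_diff_cos` failure. The conv_transpose3d pair (5344 vs 4096) has a relative difference of about 1.32e-1, roughly 30 times its 4.50e-3 tolerance, so a small tolerance adjustment would not cover that case.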

3 participants