ci: split out integration tests #3806

roeap · 2025-10-03T10:49:19Z

Description

This is another attempt at making our CI a bit more pleasant to work with. The basic idea: more, smaller jobs and less redundant work.

We moved the integration tests into their own job and streamlined the build job.

build/check: fmt, clippy (with --test), docs, on Linux only
build/build: previous build and check commands but without --test, on all OSes
build/test: unit tests on all OSes

In the integration workflow we now run one job for every crate that has integration tests: azure, aws, gcp, hdfs, lakefs.
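
For readers skimming the thread, a minimal sketch of what "one job per crate" could look like as a matrix. This is not the workflow from this PR: the job and step names are illustrative, `deltalake-gcp` is assumed by analogy with the other package names, and any service setup (emulators, credentials, feature flags) the real tests need is left out.

```yaml
# Hypothetical sketch, not the actual .github/workflows/integration.yml.
name: integration
on:
  pull_request:
  push:
    branches: [main]

jobs:
  integration:
    runs-on: ubuntu-latest
    strategy:
      # Let the other crates finish even if one of them fails.
      fail-fast: false
      matrix:
        # One entry per crate with integration tests.
        # `deltalake-gcp` is an assumed package name.
        package:
          - deltalake-azure
          - deltalake-aws
          - deltalake-gcp
          - deltalake-hdfs
          - deltalake-lakefs
    steps:
      - uses: actions/checkout@v4
      - name: Setup Rust toolchain
        uses: actions-rust-lang/setup-rust-toolchain@v1
      - name: Run integration tests
        run: cargo test --package ${{ matrix.package }}
```

Whether a matrix or separately written jobs is the better fit probably depends on how much per-crate setup differs.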

Signed-off-by: Robert Pack <[email protected]>

codecov · 2025-10-03T10:51:53Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.22%. Comparing base (c5eb4c2) to head (c4dadb6).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3806      +/-   ##
==========================================
- Coverage   76.07%   74.22%   -1.86%     
==========================================
  Files         145      145              
  Lines       45313    39500    -5813     
  Branches    45313    39500    -5813     
==========================================
- Hits        34474    29317    -5157     
+ Misses       9148     8771     -377     
+ Partials     1691     1412     -279


abhiaagarwal · 2025-10-03T11:26:46Z

I read this article the other day which might be helpful to avoid CI running out of space: https://kobzol.github.io/rust/2025/09/22/reducing-binary-size-of-rust-programs-with-debuginfo.html. You can compress debuginfo to keep tracebacks with RUSTFLAGS="-Clink-arg=-Wl,--compress-debug-sections=zlib"
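
As a rough sketch, the flag above could be set once at the workflow level. This assumes a Linux runner whose linker accepts `--compress-debug-sections=zlib`; other platforms would likely need a different flag or none at all.

```yaml
# Workflow-level fragment; assumes a GNU-compatible linker (Linux runners).
env:
  # Keep debuginfo (and thus usable backtraces) but compress the debug
  # sections at link time to reduce the size of the build artifacts.
  RUSTFLAGS: "-Clink-arg=-Wl,--compress-debug-sections=zlib"
```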


roeap · 2025-10-03T13:18:51Z

@abhiaagarwal - awesome, thanks! Will take a look.

@rtyler @fvaleye @ion-elgreco - not sure how far we would want to go (if we follow this approach at all), but for now I tried to get the failures fixed and reduce runtime ...

In the current config we run only Linux tests on PRs, and Windows / macOS only on main, Windows being by far the biggest offender. I think the caches are kicking in more effectively here, and most builds are done after about 2 minutes. The full unit tests still take about 5 minutes (10+ minutes on Windows).
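
One way to express "Linux on PRs, everything on main" is to compute the OS matrix from the event name. This is a sketch of that general pattern, not necessarily how this PR implements it; the job name and test command are placeholders.

```yaml
# Fragment of a workflow that triggers on both pull_request and push.
jobs:
  test:
    strategy:
      matrix:
        # Pull requests get Linux only; any other event (for example a
        # push to main) gets all three platforms.
        os: ${{ fromJSON(github.event_name == 'pull_request' && '["ubuntu-latest"]' || '["ubuntu-latest", "windows-latest", "macos-latest"]') }}
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Run unit tests
        run: cargo test
```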

In the many CI runs I observed, I never saw an out-of-memory error, though!

Thoughts?


roeap

Just leaving some comments in the hope of making this somewhat easier to review.

roeap · 2025-10-04T08:54:21Z

.github/actions/setup-env/action.yml

-    - name: checkout
-      uses: actions/checkout@v4
-


We need to check out the code before we can use local actions, so this step must have been unused anyhow.
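
To illustrate the ordering: a local action is resolved from the runner's workspace, so the calling job has to check out the repository first. A minimal sketch of the calling side (the step name is illustrative):

```yaml
# Fragment of a job's steps in a calling workflow.
steps:
  # Must come first: without it, ./.github/actions/setup-env does not
  # exist on the runner and the next step cannot be resolved.
  - uses: actions/checkout@v4
  - name: Setup environment
    uses: ./.github/actions/setup-env
```

A checkout inside the local action itself therefore adds nothing.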

roeap · 2025-10-04T08:56:26Z

.github/actions/setup-env/action.yml


-    - uses: Swatinem/rust-cache@v2
+    - name: Setup Rust toolchain
+      uses: actions-rust-lang/setup-rust-toolchain@v1


The actions-rs/toolchain action seems unmaintained, albeit still widely used. This action is maintained and has the additional benefit of also setting up Swatinem/rust-cache. My assumption is that they spend much more time thinking about how to do this than I am hoping to invest 😆.
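
A sketch of how the composite action might call it. The `name`, `description`, and the chosen input values are assumptions for illustration; `toolchain`, `components`, and `cache` are inputs of `actions-rust-lang/setup-rust-toolchain`.

```yaml
# Hypothetical .github/actions/setup-env/action.yml
name: setup-env
description: Set up the Rust toolchain and build cache
runs:
  using: composite
  steps:
    - name: Setup Rust toolchain
      uses: actions-rust-lang/setup-rust-toolchain@v1
      with:
        toolchain: stable
        components: clippy, rustfmt
        # Caching is on by default and is handled via Swatinem/rust-cache,
        # so no separate cache step is needed.
        cache: true
```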

roeap · 2025-10-04T09:01:16Z

.github/workflows/build.yml

-  RUSTFLAGS: -C debuginfo=line-tables-only
-  # Disable incremental builds by cargo for CI which should save disk space
-  # and hopefully avoid final link "No space left on device"
-  CARGO_INCREMENTAL: 0


I deliberately removed most of our Rust flags etc. to see how well we perform out of the box. So far we seem to be doing well.

One question came up along the way: could it be that CARGO_INCREMENTAL: 0 hurts the reusability of cached data? I'm not sure what cargo / rustc write out in the different modes, but if it is fewer, larger files vs. more, smaller files (which are larger in total), it feels intuitive that this might be the case.

Just speculating though :).

abhiaagarwal · 2025-10-04

Incremental cargo builds don't work with sccache, so it's the right decision.

Also, I think incremental builds are theoretically non-reproducible, so it's probably correct not to use them in CI. This action disables them automatically: https://github.com/actions-rust-lang/setup-rust-toolchain/blob/063a3b947b5c5bf7d5f87076c3e5e9784b776aa8/action.yml#L118

roeap · 2025-10-04

Thanks @abhiaagarwal - if I read the linked code correctly, this setting is also applied internally, so we don't need to set it again here. Or should we also set it explicitly?

abhi-airspace-intelligence · 2025-10-04

The action already sets CARGO_INCREMENTAL=0, so you don't need to set it explicitly.

roeap · 2025-10-04T09:03:58Z

.github/workflows/build.yml

+            --exclude deltalake \
+            --exclude deltalake-azure \
+            --exclude deltalake-aws \
+            --exclude deltalake-hdfs \
+            --exclude deltalake-lakefs


I was hoping that we would not build these packages, but looking at the logs it seems we still do. This was an attempt to get the build time on Windows down, but it does not seem to have much of an impact, so we ended up just removing them.

roeap · 2025-10-04T09:05:52Z

.github/workflows/dev_pr.yml

-  typos:
-    name: Spell Check
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - name: Check spelling
-        uses: crate-ci/typos@v1


This one is on me ... This file triggers on the pull request target, which is why we were checking for errors on the main branch and seeing failures whenever we changed files that had spelling errors on main, even if we fixed the errors in the PR.

Now this lives in a dedicated workflow that triggers on pull request.
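
For reference, a minimal sketch of such a dedicated workflow. The workflow `name` is illustrative; the job is the one removed from dev_pr.yml. With `pull_request`, the default checkout is the PR's merge commit, so the spell check sees the changes in the PR rather than the state of the base branch.

```yaml
# Hypothetical dedicated spell-check workflow.
name: typos
on:
  pull_request:

jobs:
  typos:
    name: Spell Check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Check spelling
        uses: crate-ci/typos@v1
```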

roeap · 2025-10-04T09:07:29Z

.github/workflows/integration.yml

+        run: |
+          gmake setup-dat
+          cargo test \
+            --package deltalake-aws \


I looked through our crates, and it seems that we are using a native-tls flag only in the AWS crate. We should make this a matrix if there are ever any more, or if I missed some here.
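
If more TLS variants ever need covering, the matrix could look roughly like this. The feature names are assumptions: `native-tls` is taken from the comment above, `rustls` is a guess at the alternative, and the real crate may need additional features enabled for its integration tests.

```yaml
# Hypothetical fragment; feature names are not verified against the crate.
jobs:
  aws:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        tls: [native-tls, rustls]
    steps:
      - uses: actions/checkout@v4
      - name: Setup Rust toolchain
        uses: actions-rust-lang/setup-rust-toolchain@v1
      - name: Run AWS integration tests
        run: |
          gmake setup-dat
          cargo test \
            --package deltalake-aws \
            --no-default-features \
            --features ${{ matrix.tls }}
```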

roeap · 2025-10-04T09:09:45Z

.github/workflows/python_build.yml

-      - name: Check Rust
-        run: make check-rust


Our Rust checks already cover the Python crate as well, so there is no need to repeat that check here. The alternative is to also install clippy in this workflow, in case someone feels this check adds value.

ion-elgreco · 2025-10-05T06:41:04Z

> @abhiaagarwal - awesome, thanks! Will take a look.
>
> @rtyler @fvaleye @ion-elgreco - not sure how far we would want to go (if we follow this approach at all), but for now I tried to get the failures fixed and reduce runtime ...
>
> In the current config we run only Linux tests on PRs, and Windows / macOS only on main, Windows being by far the biggest offender. I think the caches are kicking in more effectively here, and most builds are done after about 2 minutes. The full unit tests still take about 5 minutes (10+ minutes on Windows).
>
> In the many CI runs I observed, I never saw an out-of-memory error, though!
>
> Thoughts?

I'm fine with not running Windows and macOS in PRs.

ci: split out integration tests

6f96c48

Signed-off-by: Robert Pack <[email protected]>

roeap requested a review from rtyler as a code owner October 3, 2025 10:49

roeap added 3 commits October 3, 2025 13:01

ci: split out integration tests fixes

45c29cf

Signed-off-by: Robert Pack <[email protected]>

ci: split out integration tests fixes2

f1d4c9a

Signed-off-by: Robert Pack <[email protected]>

ci: split out integration tests fixes 3

7d39254

Signed-off-by: Robert Pack <[email protected]>

roeap added 12 commits October 3, 2025 13:29

ci: split out integration tests fixes 4

0e663c4

Signed-off-by: Robert Pack <[email protected]>

ci: split out integration tests fixes 5

37ff78a

Signed-off-by: Robert Pack <[email protected]>

ci: split out integration tests fixes 8

3293678

Signed-off-by: Robert Pack <[email protected]>

ci: split out integration tests fixes 7

4ad996e

Signed-off-by: Robert Pack <[email protected]>

ci: limit scope in default tests

d00eda6

Signed-off-by: Robert Pack <[email protected]>

ci: exclude redundant tests

63b8c4f

Signed-off-by: Robert Pack <[email protected]>

ci: exclude redundant tests fix

6fecdd4

Signed-off-by: Robert Pack <[email protected]>

ci: try fixing shell

3106d56

Signed-off-by: Robert Pack <[email protected]>

ci: exclude wrapper package in default tests

cbdbe93

Signed-off-by: Robert Pack <[email protected]>

ci: onlt linux on PRs

8561fa6

Signed-off-by: Robert Pack <[email protected]>

ci: only linux on PRs 2

09d2845

Signed-off-by: Robert Pack <[email protected]>

fix: needs

f3741ae

Signed-off-by: Robert Pack <[email protected]>

roeap requested review from fvaleye and ion-elgreco October 3, 2025 13:21

roeap added 2 commits October 3, 2025 15:23

chore: rename docs ci

be0b027

Signed-off-by: Robert Pack <[email protected]>

fix: run typos on pr not target

c4dadb6

Signed-off-by: Robert Pack <[email protected]>

roeap commented Oct 4, 2025

View reviewed changes

ion-elgreco approved these changes Oct 5, 2025

View reviewed changes

ion-elgreco enabled auto-merge (squash) October 5, 2025 06:42

roeap disabled auto-merge October 5, 2025 07:53

roeap merged commit b8e2170 into delta-io:main Oct 5, 2025
39 of 40 checks passed

roeap deleted the ci/split-integration branch October 5, 2025 07:53

fvaleye mentioned this pull request Oct 7, 2025

chore(ci): optimize caching to reduce cache count #3798

Closed
