ci: add SynapseML-Internal compatibility check to OSS pipeline #2542

Open
BrendanWalsh wants to merge 1 commit into master from brwals/internal-compat-check

@BrendanWalsh
Collaborator

Summary

Adds a non-blocking CI job (InternalCompat) to the OSS pipeline that validates that SynapseML-Internal compiles and passes unit tests against the current OSS build. This catches breaking changes before they reach Internal.

What it does

  1. publishM2 — builds OSS JARs and publishes to local Maven repo
  2. Retarget — seds Internal's build.sbt to use the OSS version from this build + adds Resolver.mavenLocal
  3. Compile — runs sbt compile Test/compile against retargeted Internal
  4. Unit tests — creates Internal's conda env (with Synapse-Conda feed auth), fetches AI service secrets from mmlspark-keys, and runs spark.aifunc tests (128 tests)
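
A minimal sketch of the Retarget step (step 2), for illustration only. The PR does not show the actual sed expression or Internal's build.sbt, so this assumes the OSS dependency version is declared on a line like `val synapsemlVersion = "..."`. That variable name and the `retarget_build` function are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# retarget_build FILE VERSION
# Points a build file at the given OSS version and makes sure sbt can
# resolve artifacts from the local Maven repository.
# ASSUMPTION: the version lives in `val synapsemlVersion = "..."`
# (hypothetical name; Internal's real build.sbt is not shown in this PR).
retarget_build() {
  local file="$1" version="$2"

  # Replace whatever version string is currently quoted on that line.
  # `-i.bak` is accepted by both GNU and BSD sed.
  sed -i.bak \
    "s|val synapsemlVersion = \"[^\"]*\"|val synapsemlVersion = \"${version}\"|" \
    "$file"
  rm -f "${file}.bak"

  # Append the local Maven resolver only once, so reruns are idempotent.
  if ! grep -q 'Resolver.mavenLocal' "$file"; then
    printf '\nresolvers += Resolver.mavenLocal\n' >> "$file"
  fi
}
```

Called as `retarget_build build.sbt "$OSS_VERSION"` from the Internal checkout, this would leave the file ready for `sbt compile Test/compile`. The grep guard keeps the resolver line from being duplicated if the step is retried.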

Design decisions

  • Always runs, never blocks (continueOnError: true), so failures surface as warnings rather than build failures
  • spark.aifunc only — other test packages (powerbi, ebm, predict) extend HasSparkSession which eagerly initializes FabricTestConstants, requiring Fabric credentials from fabrictest-cert-admin-kv (not available in the OSS pipeline)
  • No Java pin — Internal uses agent-default Java 11 (not Java 8 like OSS), so we match that
  • Disk cleanup — Internal's conda env pulls PyTorch/CUDA (~15GB); we free ~30GB by removing Android SDK, .NET, GHC, Boost, and Docker images
  • CREATE_SEMPY_WRITER=false — the SemPy parquet writer dotnet codegen step is not needed for compat testing
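
The disk cleanup decision above could look roughly like the following sketch. The directory paths are assumptions based on where these toolchains commonly live on Ubuntu hosted agents; the PR does not list the exact paths it removes. The `free_disk` function and `DRY_RUN` variable are hypothetical names, and the function defaults to a dry run so it is safe to try locally.

```shell
#!/usr/bin/env bash
set -euo pipefail

# ASSUMPTION: typical install locations on Ubuntu hosted agents for the
# Android SDK, .NET, GHC, and Boost. Verify against the actual agent image.
CLEANUP_DIRS=(
  /usr/local/lib/android
  /usr/share/dotnet
  /opt/ghc
  /usr/local/share/boost
)

# free_disk
# Removes large preinstalled toolchains and unused Docker images.
# With DRY_RUN=1 (the default) it only prints what it would do.
free_disk() {
  local dry_run="${DRY_RUN:-1}"
  local dir
  for dir in "${CLEANUP_DIRS[@]}"; do
    if [ "$dry_run" = "1" ]; then
      echo "would remove: $dir"
    else
      sudo rm -rf "$dir"
    fi
  done
  if [ "$dry_run" = "1" ]; then
    echo "would run: docker image prune --all --force"
  else
    docker image prune --all --force
  fi
}
```

Running `DRY_RUN=0 free_disk` before creating the conda env, followed by `df -h /`, would show how much space was actually reclaimed on a given agent.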

CI validation

  • Build #213700535 — ✅ all green, 128/128 spark.aifunc tests passed
  • Prior runs validated compile-only, disk space, feed auth, and Java version fixes

Changes

All changes are in pipeline.yaml:

  • Added ADO repo resource for SynapseML-Internal
  • Added InternalCompat job (~100 lines)

Copilot AI review requested due to automatic review settings April 4, 2026 01:09
@github-actions

github-actions Bot commented Apr 4, 2026

Hey @BrendanWalsh 👋!
Thank you so much for contributing to our repository 🙌.
Someone from SynapseML Team will be reviewing this pull request soon.

We use semantic commit messages to streamline the release process.
Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix.
This helps us to create release messages and credit you for your hard work!

Examples of commit messages with semantic prefixes:

  • fix: Fix LightGBM crashes with empty partitions
  • feat: Make HTTP on Spark back-offs configurable
  • docs: Update Spark Serving usage
  • build: Add codecov support
  • perf: improve LightGBM memory usage
  • refactor: make python code generation rely on classes
  • style: Remove nulls from CNTKModel
  • test: Add test coverage for CNTKModel

To test your commit locally, please follow our guide on building from source.
Check out the developer guide for additional guidance on testing your change.

@github-actions

github-actions Bot commented Apr 4, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Contributor

Copilot AI left a comment

Pull request overview

Adds an Azure DevOps pipeline job to continuously validate that the closed-source SynapseML-Internal repo still compiles and passes a targeted unit-test subset when built against the current OSS SynapseML artifacts, helping detect breaking changes earlier.

Changes:

  • Adds a SynapseML-Internal repository resource to the pipeline.
  • Introduces a new non-blocking InternalCompat job that publishes OSS artifacts to local Maven, retargets Internal to that version, compiles, creates the Internal conda environment, and runs spark.aifunc tests.
  • Publishes Internal test results while keeping the job non-gating (continueOnError: true).
Summary per file:

  • pipeline.yaml: Adds a repo resource and a new non-blocking CI job to compile/test SynapseML-Internal against locally published OSS artifacts.

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 1

Comment thread pipeline.yaml
Comment on lines +3 to +4
  - repository: self
    type: self

Copilot AI Apr 4, 2026

In the resources.repositories block, declaring the pipeline repo (repository: self) is typically unnecessary (the pipeline already has an implicit self repo for checkout: self). Consider removing this entry to reduce confusion and keep only the external SynapseML-Internal repository resource.

Suggested change
  - repository: self
    type: self

Collaborator Author

The explicit self declaration is needed here. The original pipeline had - repo: self (shorthand syntax). When restructuring to the full resources.repositories block to add the SynapseML-Internal external repo, the self entry must be included — ADO requires it in the list format for multi-repo checkout (checkout: self + checkout: SynapseML-Internal) to work correctly.

smamindl previously approved these changes Apr 7, 2026
@azure-pipelines

Azure Pipelines:
1 pipeline(s) require an authorized user to comment /azp run to run.

@BrendanWalsh BrendanWalsh force-pushed the brwals/internal-compat-check branch from fb74a9d to d700cf1 Compare April 21, 2026 20:50
Adds an optional InternalCompat job that validates OSS changes do not break
SynapseML-Internal consumers. The job:

- Reads OSS_VERSION from core/version.sbt (with empty-value guard)
- Publishes the OSS build to a local m2 repo
- Clones SynapseML-Internal and retargets it to the current OSS version
- Sets up a fresh conda env via the Synapse-Conda feed (PipAuthenticate)
- Activates with 'eval "$(conda shell.bash hook)" && conda activate ...'
- Pulls Fabric credentials from mmlspark-keys via fabric_kv.yml
- Runs Scala tests across spark.aifunc, powerbi, ebm, predict packages
- Runs Python tests from the Internal synapseml namespace
- Publishes JUnit test results

Rebased onto master to resolve conflicts with the newly-merged
ReleaseBranchCompat job (#2550); the two jobs are independent siblings
and now coexist cleanly in pipeline.yaml.
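
As an illustration of the "reads OSS_VERSION from core/version.sbt (with empty-value guard)" item in the commit message above, here is a minimal sketch. The contents of core/version.sbt are not shown in this PR, so the sketch assumes an sbt setting of the form `version := "x.y.z"` (optionally prefixed with `ThisBuild /`). The `read_oss_version` function name is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# read_oss_version FILE
# Prints the first quoted version assigned with `version := "..."` in FILE.
# Returns non-zero (the "empty-value guard") if the file is missing or no
# non-empty version string is found, so later steps never run with a blank
# OSS_VERSION.
# ASSUMPTION: FILE uses the `version := "x.y.z"` form; the real file format
# is not shown in this PR.
read_oss_version() {
  local file="$1" version

  if [ ! -f "$file" ]; then
    echo "error: $file not found" >&2
    return 1
  fi

  # [^"]+ requires at least one character, so `version := ""` does not match.
  version="$(sed -n -E 's/.*version[[:space:]]*:=[[:space:]]*"([^"]+)".*/\1/p' "$file" | head -n 1)"

  if [ -z "$version" ]; then
    echo "error: could not read a version from $file" >&2
    return 1
  fi
  printf '%s\n' "$version"
}
```

A pipeline step could then use `OSS_VERSION="$(read_oss_version core/version.sbt)"` and, under `set -e`, stop immediately when the guard trips instead of publishing or retargeting with an empty version.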
@BrendanWalsh BrendanWalsh force-pushed the brwals/internal-compat-check branch from d700cf1 to 2d79b17 Compare April 21, 2026 21:05
3 participants