Fix NuGet DLL Loading on Linux and macOS #27266

Merged
tianleiwu merged 12 commits into main from tlwu/fix_ort_nuget_dll_linux on Feb 12, 2026
@tianleiwu tianleiwu commented Feb 6, 2026

Summary

This PR addresses persistent native library loading issues in the ONNX Runtime NuGet package, specifically on macOS and Linux, by implementing a robust DllImportResolver. It also includes necessary pipeline and packaging adjustments to ensure required macOS artifacts are correctly located and validated during CI.

Problem

#27263 reports the error "Unable to load shared library 'onnxruntime.dll' or one of its dependencies". It was caused by #26415, which hard-coded onnxruntime.dll even for Linux and macOS (the correct filenames are libonnxruntime.so on Linux and libonnxruntime.dylib on macOS).

The NuGet test pipeline has been broken for a while, so we also need to fix the pipeline in order to test this change. It has the following issues:

  • The macOS NuGet package targets arm64, but the vmImage macOS-15 is x64.
  • The macOS NuGet test needs libcustom_op_library.dylib, but it is not copied from the artifacts to the test environment.
  • The macOS artifact contains libonnxruntime.dylib and libonnxruntime.1.24.1.dylib, where libonnxruntime.dylib is a symlink. This causes a problem because the latter file is excluded by the nuspec.
  • The macOS NuGet test uses models from the onnx repo. However, the latest onnx has some models with data types such as float8 that are not supported by C#, so those model tests fail.
  • The Linux NuGet test uses a Docker image built from Dockerfile.package_ubuntu_2404_gpu, but the docker build fails because of the versions of libnvinfer-headers-python-plugin-dev and libnvinfer-win-builder-resource10.

Changes

1. Robust C# DLL Resolution

The DllImportResolver has been enhanced to handle various deployment scenarios where standard .NET resolution might fail:

  • Platform-Specific Naming: Maps extension-less library names (onnxruntime, ortextensions) to appropriate filenames (onnxruntime.dll, libonnxruntime.so, libonnxruntime.dylib) based on the OS.
  • Multi-Stage Probing:
    1. Default Loading: Attempts NativeLibrary.TryLoad with the mapped name.
    2. NuGet runtimes Probing: If the above fails, it probes the runtimes/{rid}/native/ subdirectories relative to the assembly location, covering common RIDs (win-x64, linux-arm64, osx-arm64, etc.).
    3. Base Directory Fallback: As a final attempt, it looks in AppContext.BaseDirectory.
  • Case-Sensitivity Handling: Ensures lowercase extensions are used on Windows to prevent lookup failures on case-sensitive filesystems.
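
The probing order described above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the class name NativeResolverSketch, its helper names, and the way the RID is derived (a single RID for the current process, with no musl or other variants) are assumptions made for the example. It relies only on NativeLibrary.SetDllImportResolver, NativeLibrary.TryLoad, and RuntimeInformation from .NET.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical sketch; the names below are illustrative, not the PR's actual code.
public static class NativeResolverSketch
{
    // Map an extension-less name such as "onnxruntime" to the platform file name.
    public static string MapLibraryName(string name)
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return name + ".dll";
        if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX)) return "lib" + name + ".dylib";
        return "lib" + name + ".so"; // Linux and other Unix-like systems
    }

    // Build a RID for the current process, e.g. "linux-x64" or "osx-arm64".
    // Assumption: only the plain os-arch form is considered here.
    public static string CurrentRid()
    {
        string os = RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? "win"
                  : RuntimeInformation.IsOSPlatform(OSPlatform.OSX) ? "osx"
                  : "linux";
        string arch = RuntimeInformation.ProcessArchitecture.ToString().ToLowerInvariant();
        return os + "-" + arch;
    }

    // Signature matches the DllImportResolver delegate.
    public static IntPtr Resolve(string libraryName, Assembly assembly, DllImportSearchPath? searchPath)
    {
        string mapped = MapLibraryName(libraryName);

        // Stage 1: default loading with the mapped name.
        if (NativeLibrary.TryLoad(mapped, assembly, searchPath, out IntPtr handle)) return handle;

        // Stage 2: runtimes/{rid}/native/ relative to the assembly location.
        string asmDir = Path.GetDirectoryName(assembly.Location) ?? AppContext.BaseDirectory;
        string probe = Path.Combine(asmDir, "runtimes", CurrentRid(), "native", mapped);
        if (File.Exists(probe) && NativeLibrary.TryLoad(probe, out handle)) return handle;

        // Stage 3: fall back to AppContext.BaseDirectory.
        probe = Path.Combine(AppContext.BaseDirectory, mapped);
        if (File.Exists(probe) && NativeLibrary.TryLoad(probe, out handle)) return handle;

        // Nothing found: returning IntPtr.Zero lets the runtime apply its default behavior.
        return IntPtr.Zero;
    }

    // Register once per assembly, before the first P/Invoke into the library.
    public static void Register() =>
        NativeLibrary.SetDllImportResolver(typeof(NativeResolverSketch).Assembly, Resolve);
}
```

Calling NativeResolverSketch.Register() a second time for the same assembly would throw InvalidOperationException, so a real implementation would typically guard the registration, for example in a static constructor.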

2. macOS CI/Packaging Improvements

  • Templates (test_macos.yml):
    • Updated to extract artifacts from TGZ files.
    • Ensures libcustom_op_library.dylib is placed in the expected location (testdata/testdata) for end-to-end tests.
    • Initializes the ONNX submodule to provide required test data.
  • Node.js:
    • Restored the Node.js macOS test stage in c-api-noopenmp-test-pipelines.yml, configured to run on the ARM64 pool (AcesShared).
    • Updated test_macos.yml template to support custom agent pools (similar to the NuGet template).
  • Pipeline Config: Adjusted agent pool selection and demands for macOS jobs to ensure stable execution.
  • Binary Robustness: The copy_strip_binary.sh script now ensures libonnxruntime.dylib is a real file rather than a symlink, improving NuGet packaging reliability.

3. Test Refinements

  • Inference Tests: Skips a specific set of pretrained-model test cases on macOS that are currently known to be flaky or unsupported in that environment, preventing noise in the CI results.

Verification

Pipelines

  • Verified in NuGet_Test_MacOS.
  • Verified in NuGet_Test_Linux.
  • Verified in Windows test pipelines.

Net Effect

The C# bindings are now significantly more resilient to different deployment environments. The CI process for macOS is also more robust, correctly handling the artifacts required for comprehensive NuGet validation.

@tianleiwu tianleiwu marked this pull request as draft February 6, 2026 09:11
@tianleiwu tianleiwu marked this pull request as ready for review February 9, 2026 22:35
@tianleiwu tianleiwu changed the title from "Add DllImportResolver for nuget dll" to "Fix NuGet DLL Loading on Linux and macOS" Feb 9, 2026
Use AcesShared pool for arm64 macOS
probes runtimes/{RID}/native/ subfolders
copy dylib
@tianleiwu tianleiwu marked this pull request as draft February 10, 2026 22:11
@tianleiwu tianleiwu marked this pull request as ready for review February 11, 2026 17:12
@tianleiwu tianleiwu requested a review from Copilot February 11, 2026 17:27
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses cross-platform native library resolution for the C# ONNX Runtime NuGet packages by removing hardcoded .dll names and introducing custom resolution logic, alongside CI/packaging adjustments to ensure required macOS artifacts (notably the custom op library and dylib layout) are present during NuGet validation.

Changes:

  • Updated C# P/Invoke library names to use extension-less names and added a DllImportResolver to control native library loading behavior.
  • Adjusted macOS NuGet test pipeline and macOS packaging steps to ensure required test artifacts (e.g., libcustom_op_library.dylib) are available/located as expected.
  • Tweaked Linux packaging Docker image dependencies and related CI pipeline configuration.
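
As an illustration of the first change, a P/Invoke declaration with an extension-less library name might look like the sketch below. The class name NativeMethodsSketch, the LibraryName constant, and the TryGetApiBase helper are hypothetical and exist only for this example; OrtGetApiBase is the entry point exported by the ONNX Runtime C API. The return type is simplified to IntPtr here.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sketch; the class and constant names are illustrative.
public static class NativeMethodsSketch
{
    // Extension-less name: the runtime (or a registered DllImportResolver) maps it to
    // onnxruntime.dll, libonnxruntime.so, or libonnxruntime.dylib.
    public const string LibraryName = "onnxruntime";

    // Entry point exported by the ONNX Runtime C API.
    [DllImport(LibraryName, CharSet = CharSet.Ansi)]
    public static extern IntPtr OrtGetApiBase();

    // Returns true if the native library could be located and the call succeeded.
    public static bool TryGetApiBase(out IntPtr apiBase)
    {
        try
        {
            apiBase = OrtGetApiBase();
            return apiBase != IntPtr.Zero;
        }
        catch (DllNotFoundException)
        {
            // The library was not found by default probing or by any resolver.
            apiBase = IntPtr.Zero;
            return false;
        }
    }
}
```

With a hard-coded "onnxruntime.dll" in the attribute, the same call on Linux or macOS would fail with the "Unable to load shared library" error quoted in the Problem section, because no file of that name ships for those platforms.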

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| tools/ci_build/github/linux/docker/Dockerfile.package_ubuntu_2404_gpu | Adds additional TensorRT-related packages for the Ubuntu 24.04 GPU packaging image. |
| tools/ci_build/github/linux/copy_strip_binary.sh | Ensures libonnxruntime.dylib is a real file (not a symlink) for NuGet packaging robustness. |
| tools/ci_build/github/azure-pipelines/templates/mac-cpu-packaging-steps.yml | Copies libcustom_op_library.dylib into packaging/testdata locations for macOS artifacts. |
| tools/ci_build/github/azure-pipelines/nuget/templates/test_macos.yml | Updates macOS NuGet test job pool selection and unpacks macOS artifacts into testdata/; initializes ONNX submodule for test data. |
| tools/ci_build/github/azure-pipelines/c-api-noopenmp-test-pipelines.yml | Switches macOS NuGet tests to a specific pool/demands and removes the Node.js macOS stage from this pipeline. |
| csharp/test/Microsoft.ML.OnnxRuntime.Tests.NetCoreApp/InferenceTest.netcore.cs | Skips a set of pretrained-model testcases on macOS. |
| csharp/src/Microsoft.ML.OnnxRuntime/NativeMethods.shared.cs | Changes DllImport names to extension-less and adds a DllImportResolver to map/load native libraries. |

@tianleiwu tianleiwu requested a review from eserscor February 11, 2026 23:40
eserscor previously approved these changes Feb 12, 2026
@skottmckay
Contributor

skottmckay commented Feb 12, 2026

Is part of the problem that the props file for the nuget package does not have logic to copy files for non-windows platforms to the build output directory?

https://github.com/microsoft/onnxruntime/tree/main/csharp/src/Microsoft.ML.OnnxRuntime/targets/netstandard

Typically the build process would copy from the relevant runtimes directory in the package to the build output if that was set up.

@tianleiwu
Contributor Author

Is part of the problem that the props file for the nuget package does not have logic to copy files for non-windows platforms to the build output directory?

I added a Problem section to the PR description that lists all issues addressed by this PR.

@skottmckay
Contributor

Is part of the problem that the props file for the nuget package does not have logic to copy files for non-windows platforms to the build output directory?

I added a Problem section to the PR description that lists all issues addressed by this PR.

but would the runtimes probing be required if the props file copied the library from the applicable runtimes directory to the build output directory like we do on Windows?

2. NuGet runtimes Probing: If the above fails, it probes the runtimes/{rid}/native/ subdirectories relative to the assembly location, covering common RIDs (win-x64, linux-arm64, osx-arm64, etc.).

FWIW we're having to manually add this copy step in the FL Local samples for Linux and we wouldn't need to do that if the props file included logic for linux.

https://github.com/microsoft/Foundry-Local/blob/58b81a15d0bbe7a13398f74a20aa490a5545a1a5/samples/cs/GettingStarted/ExcludeExtraLibs.props#L36-L43

@tianleiwu
Contributor Author

tianleiwu commented Feb 12, 2026

but would the runtimes probing be required if the props file copied the library from the applicable runtimes directory to the build output directory like we do on Windows?

  1. NuGet runtimes Probing: If the above fails, it probes the runtimes/{rid}/native/ subdirectories relative to the assembly location, covering common RIDs (win-x64, linux-arm64, osx-arm64, etc.).

If we had props for Linux/macOS, we could simplify the runtime probing logic.

We can address the props issue in another pull request, since it is a different issue.
This PR mainly addresses the hard-coded "onnxruntime.dll" on Linux/macOS.

@tianleiwu tianleiwu enabled auto-merge (squash) February 12, 2026 08:17
@tianleiwu tianleiwu merged commit 1a71a5f into main Feb 12, 2026
99 of 102 checks passed
@tianleiwu tianleiwu deleted the tlwu/fix_ort_nuget_dll_linux branch February 12, 2026 09:08
tianleiwu added a commit that referenced this pull request Feb 12, 2026
tianleiwu added a commit that referenced this pull request Feb 13, 2026
This cherry-picks the following commits for the 1.24.2 release:
- #27096
- #27077
- #26677
- #27238
- #27213
- #27256
- #27278
- #27275
- #27276
- #27216
- #27271
- #27299
- #27294
- #27266
- #27176
- #27126
- #27252

---------

Co-authored-by: Xiaofei Han <xiaofeihan@microsoft.com>
Co-authored-by: Jiajia Qin <jiajiaqin@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: qti-monumeen <monumeen@qti.qualcomm.com>
Co-authored-by: Ankit Maheshkar <ankit.maheshkar@intel.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: guschmue <22941064+guschmue@users.noreply.github.com>
Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: angelser <32746004+angelser@users.noreply.github.com>
Co-authored-by: Angela Serrano Brummett <angelser@microsoft.com>
Co-authored-by: Misha Chornyi <99709299+mc-nv@users.noreply.github.com>
Co-authored-by: hariharans29 <9969784+hariharans29@users.noreply.github.com>
Co-authored-by: eserscor <erscor@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
Co-authored-by: Ti-Tai Wang <titaiwang@microsoft.com>
Co-authored-by: bmehta001 <bmehta001@users.noreply.github.com>
3 participants