
v2.0.0-rc.11

Released by @decahedron1 · 07 Jan 21:40 · tag rc11 · commit a873610

💖 If you find ort useful, please consider sponsoring us on Open Collective 💖

🤔 Need help upgrading? Ask questions in GitHub Discussions or in the pyke.io Discord server!


I'm sorry it took so long to get to this point, but the next big release of ort should be, finally, 2.0.0 🎉. I know I said that about one of the old alpha releases (if you can even remember those), but I mean it this time! Also, I would really like to not have to do another major release right after, so if you have any concerns about any APIs, please speak now or forever hold your peace!

A huge thank you to all the individuals who have contributed to the Collective over the years: Marius, Urban Pistek, Phu Tran, Haagen, Yunho Cho, Laco Skokan, Noah, Matouš Kučera, mush42, Thomas, Bartek, Kevin Lacker, & Okabintaro. You guys have made these past rc releases possible.

If you are a business using ort, please consider sponsoring me. Egress bandwidth from pyke.io has quadrupled in the last 4 months, and 90% of that comes from just a handful of businesses. I'm lucky enough that I don't have to pay for egress right now, but I don't expect that arrangement to last forever. pyke & ort have been funded entirely from my own personal savings for years, and (as I'm sure you're well aware 😂) everything is getting more expensive, so that definitely isn't sustainable.

Seeing companies that raise tens of millions in funding build large parts of their business on ort, ask for support, and then not give anything back just... seems kind of unfair, no?


ort-web

ort-web allows you to use the fully-featured ONNX Runtime on the Web! This time, it's hack-free and thus here to stay (it won't be removed, and then added back, and then removed again like last time!).

See the crate docs for info on how to port your application to ort-web; there is a little bit of work involved. For a very barebones sample application, see ort-web-sample.

Documentation for ort-web, like the rest of ort, will improve by the time 2.0.0 comes around. If you ever have any questions, you can always reach out via GitHub Discussions or Discord!

Features

  • 5d85209 Add WebNN & WASM execution providers for ort-web.
  • #430 (💖 @jhonboy121) Support statically linking to iOS frameworks.
  • #433 (💖 @rMazeiks) Implement more traits for GraphOptimizationLevel.
  • 6727c98 Make PrepackedWeights Send + Sync.
  • 15bd15c Make the TLS backend configurable with new tls-* Cargo features.
  • f3cd995 Allow overriding the cache dir with the ORT_CACHE_DIR environment variable.
  • 🚨 8b3a1ed Load the dylib immediately when using ort::init_from.
    • You can now detect errors from dylib loading and let your program react accordingly; see the sketch after this list.
  • 🚨 #484 (💖 @michael-p) Update ndarray to v0.17.
    • This means you'll need to upgrade your ndarray dependency to v0.17, too.
  • 0084d08 New ort::lifetime tracing target tracks when objects are allocated/freed to aid in debugging leaks; a setup sketch follows this list.
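
Since ort::init_from now loads the dynamic library eagerly, loading failures can be handled up front. A minimal sketch, assuming a local libonnxruntime path and that the environment builder's commit() returns a Result (the path and error handling are illustrative):

```rust
fn main() {
    // With rc.11, the dylib is loaded as soon as `commit()` runs, so a missing or
    // incompatible library surfaces here instead of at first session creation.
    if let Err(e) = ort::init_from("./libonnxruntime.so").commit() {
        eprintln!("could not load ONNX Runtime: {e}");
        std::process::exit(1);
    }
    // Safe to build sessions from this point on.
}
```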
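
To make use of the new ort::lifetime target, point your tracing subscriber's filter at it. A minimal sketch, assuming the tracing-subscriber crate with its env-filter feature enabled:

```rust
use tracing_subscriber::{fmt, EnvFilter};

fn main() {
    // Only emit events from the `ort::lifetime` target, at TRACE level, so
    // allocations and frees of ort objects are logged for leak debugging.
    fmt()
        .with_env_filter(EnvFilter::new("ort::lifetime=trace"))
        .init();

    // ... create sessions and run inference as usual ...
}
```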

Fixes

  • 2ee17aa Fix a memory leak in IoBinding.
  • 317be20 Don't store Environment as a static.
    • This fixes a mutex lock failed: Invalid argument crash on macOS when exiting the process.
  • 466025c Fix unexpected CPU usage when copying GPU tensors.
  • ecca246 Fix UB when extracting empty tensors.
  • 22f71ba Gate the ArrayExtensions trait behind the std feature, fixing #![no_std] builds.
  • af63cea Fix an illegal memory access on no_std builds.
  • #444 (💖 @pembem22) Fix Android link.
  • 1585268 Don't allow sessions to be created with non-CPU allocators.
  • #485 (💖 @mayocream) Fix load order when using cuda::preload_dylibs.
  • c5b68a1 Fix AsyncInferenceFut drop behavior.

Misc

  • Update ONNX Runtime to v1.23.2.
  • The MSRV is now Rust 1.88.
  • Binaries are now compressed using LZMA2, which reduces bandwidth by 30% compared to gzip; decompression is slower, though, so fetching and extracting binaries for the first time may take up to twice as long.
    • If you use ort in CI, please cache the ~/.cache/ort.pyke.io directory between runs.
  • ort's dependency tree has shrunk a little bit, so it should build a little faster!
  • b68c928 Overhaul build.rs
    • Warnings should now appear when binaries aren't available, and errors should look a lot nicer.
    • pkg-config support now requires the pkg-config feature.
  • 🚨 d269461 Make Metadata methods return Option<T> instead of Result<T>; see the migration sketch after this list.
  • 🚨 47e5667 Gate preload_dylib and cuda::preload_dylibs behind a new preload-dylibs feature flag instead of load-dynamic.
  • 🚨 3b408b1 Shorten execution_providers to ep and XXXExecutionProvider to XXX.
    • The old names are still re-exported to avoid breakage, but those re-exports will be removed in 2.0.0, so it's a good idea to migrate now; see the sketch after this list.
  • 🚨 38573e0 Simplify ThreadManager trait.
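
For the Metadata change, code that previously propagated an error for a missing field now matches on an Option instead. A minimal migration sketch, assuming an accessor like ModelMetadata::name() (the accessor name is illustrative):

```rust
use ort::session::Session;

fn print_model_name(session: &Session) -> ort::Result<()> {
    let metadata = session.metadata()?;
    // Metadata accessors now return `Option<T>`: a missing field is `None`, not an error.
    match metadata.name() {
        Some(name) => println!("model name: {name}"),
        None => println!("model has no name set")
    }
    Ok(())
}
```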
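
For the module renames, migrating is mostly a matter of updating import paths. A sketch using CUDA as an example (any other execution provider follows the same pattern; the exact new paths below are assumptions based on the rename described above):

```rust
// Old (still available through re-exports until 2.0.0):
// use ort::execution_providers::CUDAExecutionProvider;
// New:
use ort::ep::CUDA;
use ort::session::Session;

fn build_session() -> ort::Result<Session> {
    Session::builder()?
        .with_execution_providers([CUDA::default().build()])?
        .commit_from_file("model.onnx")
}
```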

ONNX Runtime binary changes

  • Now shipping iOS & Android builds!!! Thank you Raphael Menges!!!
  • Support for Intel macOS (x86_64-apple-darwin) has been dropped following upstream changes to ONNX Runtime & Rust.
    • Additionally, the macOS target has been raised to 13.4.
    • This means I can't debug macOS issues in my Hackintosh VM anymore, so expect little to no macOS support in general from now on. If you know where I can get a used 16GB Apple Silicon Mac Mini for cheap, please let me know!
  • ONNX Runtime is now compiled with --client_package_build, meaning default options will optimize for low-resource edge inference rather than high throughput.
    • This currently only disables spinning by default. For server deployments, re-enable inter- and intra-op spinning for best throughput; see the sketch after this list.
  • Now shipping TensorRT RTX builds on Windows & Linux!
  • x86_64 builds now target x86-64-v3, aka Intel Haswell/Broadwell and AMD Zen (any Ryzen) or later.
  • Linux builds are now built with Clang instead of GCC.
  • Various CUDA changes:
    • Kernels are now shipped compressed; this saves bandwidth & file size, but may slightly increase first-run latency. It will have no effect on subsequent runs.
    • Recently-added float/int matrix multiplication kernels aren't enabled. Quantized models will miss out on a bit of performance, but it was impossible to compile these kernels within the limitations of free GitHub Actions runners.
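
To recover server-class throughput under the new client_package_build defaults, spinning can be turned back on per session. A sketch assuming SessionBuilder exposes a with_config_entry-style method and ONNX Runtime's standard session config keys:

```rust
use ort::session::Session;

fn server_session(model_path: &str) -> ort::Result<Session> {
    Session::builder()?
        // client_package_build disables spinning by default; re-enable it for
        // throughput-oriented server deployments.
        .with_config_entry("session.intra_op.allow_spinning", "1")?
        .with_config_entry("session.inter_op.allow_spinning", "1")?
        .commit_from_file(model_path)
}
```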

ort-tract

  • Update tract to 0.22.
  • 2d40e05 ort-tract no longer claims it is ort-candle in ort::info().

ort-candle

  • Update candle to 0.9.

❤️🧡💛💚💙💜