💖 If you find ort useful, please consider sponsoring us on Open Collective 💖
🤔 Need help upgrading? Ask questions in GitHub Discussions or in the pyke.io Discord server!
I'm sorry it took so long to get to this point, but the next big release of ort should be, finally, 2.0.0 🎉. I know I said that about one of the old alpha releases (if you can even remember those), but I mean it this time! Also, I would really like to not have to do another major release right after, so if you have any concerns about any APIs, please speak now or forever hold your peace!
A huge thank you to all the individuals who have contributed to the Collective over the years: Marius, Urban Pistek, Phu Tran, Haagen, Yunho Cho, Laco Skokan, Noah, Matouš Kučera, mush42, Thomas, Bartek, Kevin Lacker, & Okabintaro. You guys have made these past rc releases possible.
If you are a business using ort, please consider sponsoring me. Egress bandwidth from pyke.io has quadrupled in the last 4 months, and 90% of that comes from just a handful of businesses. I'm lucky enough that I don't have to pay for egress right now, but I don't expect that arrangement to last forever. pyke & ort have been funded entirely from my own personal savings for years, and (as I'm sure you're well aware 😂) everything is getting more expensive, so that definitely isn't sustainable.
Seeing companies that raise tens of millions in funding build large parts of their business on ort, ask for support, and then not give anything back just... seems kind of unfair, no?
ort-web
ort-web allows you to use the fully-featured ONNX Runtime on the Web! This time, it's hack-free and thus here to stay (it won't be removed, and then added back, and then removed again like last time!)
See the crate docs for info on how to port your application to ort-web; there is a little bit of work involved. For a very barebones sample application, see ort-web-sample.
Documentation for ort-web, like the rest of ort, will improve by the time 2.0.0 comes around. If you ever have any questions, you can always reach out via GitHub Discussions or Discord!
Features
- `5d85209` Add WebNN & WASM execution providers for `ort-web`.
- #430 (💖 @jhonboy121) Support statically linking to iOS frameworks.
- #433 (💖 @rMazeiks) Implement more traits for `GraphOptimizationLevel`.
- `6727c98` Make `PrepackedWeights` `Send + Sync`.
- `15bd15c` Make the TLS backend configurable with new `tls-*` Cargo features.
- `f3cd995` Allow overriding the cache dir with the `ORT_CACHE_DIR` environment variable.
- 🚨 `8b3a1ed` Load the dylib immediately when using `ort::init_from`.
  - You can now detect errors from dylib loading and let your program react accordingly.
- 🚨 #484 (💖 @michael-p) Update `ndarray` to v0.17.
  - This means you'll need to upgrade your `ndarray` dependency to v0.17, too.
- `0084d08` New `ort::lifetime` tracing target tracks when objects are allocated/freed to aid in debugging leaks.
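The `ORT_CACHE_DIR` override follows the familiar "environment variable with a fallback" pattern. The sketch below only illustrates that pattern; it is not ort's actual build logic. `resolve_cache_dir`, the placeholder default path, and the choice to treat an empty value as unset are all assumptions made for the example.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical helper: pick the cache directory from an optional override,
/// falling back to a default when the override is unset or blank.
/// (Treating a blank value as "unset" is an assumption of this sketch.)
fn resolve_cache_dir(override_dir: Option<String>, default_dir: &str) -> PathBuf {
    match override_dir {
        Some(dir) if !dir.trim().is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(default_dir),
    }
}

fn main() {
    // The fallback path here is a placeholder, not necessarily ort's real default.
    let dir = resolve_cache_dir(env::var("ORT_CACHE_DIR").ok(), "/tmp/ort-cache-example");
    println!("binaries would be cached in {}", dir.display());
}
```

In CI, pointing the variable at a directory your cache step already persists should avoid re-downloading binaries on every run.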
Fixes
- `2ee17aa` Fix a memory leak in `IoBinding`.
- `317be20` Don't store `Environment` as a static.
  - This fixes a `mutex lock failed: Invalid argument` crash on macOS when exiting the process.
- `466025c` Fix unexpected CPU usage when copying GPU tensors.
- `ecca246` Fix UB when extracting empty tensors.
- `22f71ba` Gate the `ArrayExtensions` trait behind the `std` feature, fixing `#![no_std]` builds.
- `af63cea` Fix an illegal memory access on `no_std` builds.
- #444 (💖 @pembem22) Fix Android link.
- `1585268` Don't allow sessions to be created with non-CPU allocators.
- #485 (💖 @mayocream) Fix load order when using `cuda::preload_dylibs`.
- `c5b68a1` Fix `AsyncInferenceFut` drop behavior.
Misc
- Update ONNX Runtime to v1.23.2.
- The MSRV is now Rust 1.88.
- Binaries are now compressed using LZMA2, which reduces bandwidth by 30% compared to gzip but may double the time it takes to download binaries for the first time.
  - If you use `ort` in CI, please cache the `~/.cache/ort.pyke.io` directory between runs.
- `ort`'s dependency tree has shrunk a little bit, so it should build a little faster!
- `b68c928` Overhaul `build.rs`
  - Warnings should now appear when binaries aren't available, and errors should look a lot nicer.
  - `pkg-config` support now requires the `pkg-config` feature.
- 🚨 `d269461` Make `Metadata` methods return `Option<T>` instead of `Result<T>`.
- 🚨 `47e5667` Gate `preload_dylib` and `cuda::preload_dylibs` behind a new `preload-dylibs` feature flag instead of `load-dynamic`.
- 🚨 `3b408b1` Shorten `execution_providers` to `ep` and `XXXExecutionProvider` to `XXX`.
  - They are still re-imported as their old names to avoid breakage, but these re-imports will be removed and thus broken in 2.0.0, so it's a good idea to change them now.
- 🚨 `38573e0` Simplify `ThreadManager` trait.
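For the `Metadata` change, call sites that used to propagate an error with `?` now have to decide what `None` means for them. Here is a minimal sketch of the two usual migrations, using a stand-in `FakeMetadata` type rather than ort's real `Metadata`; the `description` getter and both helper functions are hypothetical and exist only for this example.

```rust
/// Stand-in for a metadata type whose getters return `Option<T>`.
/// This is NOT ort's `Metadata`; it only mimics the new return shape.
struct FakeMetadata {
    description: Option<String>,
}

impl FakeMetadata {
    fn description(&self) -> Option<String> {
        self.description.clone()
    }
}

/// Callers that are happy with a fallback can use `unwrap_or_else`.
fn describe(meta: &FakeMetadata) -> String {
    meta.description()
        .unwrap_or_else(|| "<no description>".to_string())
}

/// Callers that still want an error can convert the `Option` with `ok_or_else`.
fn require_description(meta: &FakeMetadata) -> Result<String, String> {
    meta.description()
        .ok_or_else(|| "model has no description".to_string())
}

fn main() {
    let meta = FakeMetadata { description: None };
    println!("{}", describe(&meta));
    println!("{:?}", require_description(&meta));
}
```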
ONNX Runtime binary changes
- Now shipping iOS & Android builds!!! Thank you Raphael Menges!!!
- Support for Intel macOS (`x86_64-apple-darwin`) has been dropped following upstream changes to ONNX Runtime & Rust.
  - Additionally, the macOS target has been raised to 13.4.
  - This means I can't debug macOS issues in my Hackintosh VM anymore, so expect little to no macOS support in general from now on. If you know where I can get a used 16GB Apple Silicon Mac Mini for cheap, please let me know!
- ONNX Runtime is now compiled with `--client_package_build`, meaning default options will optimize for low-resource edge inference rather than high throughput.
  - This currently only disables spinning by default. For server deployments, re-enable inter- and intra-op spinning for best throughput.
- Now shipping TensorRT RTX builds on Windows & Linux!
- x86_64 builds now target `x86-64-v3`, aka Intel Haswell/Broadwell and AMD Zen (any Ryzen) or later.
- Linux builds are now built with Clang instead of GCC.
- Various CUDA changes:
  - Kernels are now shipped compressed; this saves bandwidth & file size, but may slightly increase first-run latency. It will have no effect on subsequent runs.
  - Recently-added float/int matrix multiplication kernels aren't enabled. Quantized models will miss out on a bit of performance, but it was impossible to compile these kernels within the limitations of free GitHub Actions runners.
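If you deploy to older or unusual x86_64 hardware, you may want to check at startup whether the host roughly meets the `x86-64-v3` level before loading the prebuilt binaries. The sketch below probes only a subset of the features in that level (AVX, AVX2, BMI1, BMI2, FMA, LZCNT) using the standard library's runtime detection macro. `probe_v3_subset` and `missing_features` are hypothetical helper names, and nothing here is part of ort's API.

```rust
/// Returns the names of the features reported as unavailable.
fn missing_features<'a>(detected: &[(&'a str, bool)]) -> Vec<&'a str> {
    detected
        .iter()
        .filter(|(_, ok)| !*ok)
        .map(|(name, _)| *name)
        .collect()
}

/// Probe a subset of the x86-64-v3 feature level at runtime.
#[cfg(target_arch = "x86_64")]
fn probe_v3_subset() -> Vec<(&'static str, bool)> {
    vec![
        ("avx", std::is_x86_feature_detected!("avx")),
        ("avx2", std::is_x86_feature_detected!("avx2")),
        ("bmi1", std::is_x86_feature_detected!("bmi1")),
        ("bmi2", std::is_x86_feature_detected!("bmi2")),
        ("fma", std::is_x86_feature_detected!("fma")),
        ("lzcnt", std::is_x86_feature_detected!("lzcnt")),
    ]
}

/// On other architectures there is nothing to probe.
#[cfg(not(target_arch = "x86_64"))]
fn probe_v3_subset() -> Vec<(&'static str, bool)> {
    Vec::new()
}

fn main() {
    let missing = missing_features(&probe_v3_subset());
    if missing.is_empty() {
        println!("no missing features detected in the checked subset");
    } else {
        println!("missing features: {:?}", missing);
    }
}
```

A non-empty result suggests the prebuilt x86_64 binaries may not run on that machine, in which case building ONNX Runtime from source for your own target is likely the safer route.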
ort-tract
- Update `tract` to 0.22.
- `2d40e05` `ort-tract` no longer claims it is `ort-candle` in `ort::info()`.
ort-candle
- Update `candle` to 0.9.