Conversation

@xaviliz (Contributor) commented Sep 1, 2025

New feature: OnnxPredict algorithm

Feature

This PR makes additional changes to the Essentia library to build the ONNX Runtime inferencing library from source and implements a new algorithm, OnnxPredict, for running ONNX models (.onnx) with multiple inputs and outputs. A minimal usage sketch is shown after the implementation list below.

Implementation

  • Provide a new build script for the ONNX Runtime inferencing library.
  • Modify the Essentia build scripts to link with the onnxruntime dynamic library.
  • Implement the new algorithm OnnxPredict to run ONNX models in Essentia.
  • Implement unit tests in test_onnxpredict.py.
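
As a rough illustration only (not code from this PR), a standard-mode usage could look like the sketch below. Only "graphFilename" appears in the PR code; the "inputs"/"outputs" parameters, the "poolIn"/"poolOut" connectors, and the node names are assumptions modeled on TensorflowPredict.

#include <essentia/algorithmfactory.h>
#include <essentia/pool.h>
#include <string>
#include <vector>

using namespace essentia;
using namespace essentia::standard;

int main() {
  essentia::init();

  // Hypothetical node names; the real names depend on the ONNX model.
  std::vector<std::string> inputNames = {"melspectrogram"};
  std::vector<std::string> outputNames = {"embeddings"};

  AlgorithmFactory& factory = AlgorithmFactory::instance();
  Algorithm* predict = factory.create("OnnxPredict",
                                      "graphFilename", "discogs-effnet-bsdynamic-1.onnx",
                                      "inputs", inputNames,
                                      "outputs", outputNames);

  Pool poolIn, poolOut;
  // poolIn is expected to contain one tensor per input name before compute().
  predict->input("poolIn").set(poolIn);
  predict->output("poolOut").set(poolOut);
  predict->compute();

  delete predict;
  essentia::shutdown();
  return 0;
}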

Prerequisites

  • python >= 3.10
  • cmake >= 3.28

Testing

  • Builds successfully with ONNX Runtime v1.22.1 on macOS
    • ARM64
    • x86_64
  • Builds successfully with ONNX Runtime v1.22.1 on Linux
  • Multiple input inferencing
  • Multiple output inferencing
  • No runtime errors or compatibility issues

How to Test

Tested with onnxruntime v1.22.1 on:

  • macOS on an ARM64 machine with Python 3.13.4 and CMake 4.0.2
  • Linux (Docker) with Python 3.10.18 and CMake 4.1.0

How to build ONNX Runtime

After installing the Essentia dependencies in a virtual environment, install cmake:

python3 -m pip install cmake
which cmake

Then we can run the build script:

cd packaging/debian_3rdparty
bash build_onnx.sh

How to build OnnxPredict

On macOS:

source .env/bin/activate
python3 waf configure --fft=KISS --include-algos=OnnxPredict,Windowing,Spectrum,MelBands,UnaryOperator,TriangularBands,FFT,Magnitude,NoiseAdder,RealAccumulator,FileOutputProxy,FrameCutter --static-dependencies --pkg-config-path=/packaging/debian_3rdparty/lib/pkgconfig --with-onnx --lightweight= --with-python --pythondir=.env/lib/python3.13/site-packages
python3 waf -v && python3 waf install

On Linux:

python3 waf configure --fft=KISS --include-algos=OnnxPredict,Windowing,Spectrum,MelBands,UnaryOperator,TriangularBands,FFT,Magnitude,NoiseAdder,RealAccumulator,FileOutputProxy,FrameCutter --static-dependencies --with-onnx --lightweight= --with-python --pkg-config-path /usr/share/pkgconfig --std=c++14
python3 waf -v && python3 waf install

How to unittest

# prepare essentia audio repo
git clone https://github.com/MTG/essentia-audio.git test/essentia-audio
rm -rf test/audio && mv test/essentia-audio test/audio

# download effnet.onnx model for testing
curl https://essentia.upf.edu/models/feature-extractors/discogs-effnet/discogs-effnet-bsdynamic-1.onnx --output test/models/discogs-effnet-bsdynamic-1.onnx
python3 test/src/unittests/all_tests.py onnxpredict

@palonso (Contributor) left a comment:

Great work @xaviliz !!
I left some comments; some are questions about things I didn't understand.

OS=$(uname -s)
CONFIG=Release

if [ "$OS" = "Darwin" ]; then
@palonso (Contributor):

@xaviliz, since we are inside debian_3rdparty, should we remove or move somewhere else the MacOS support?

@xaviliz (Contributor, Author):

Yes, that's true. I kept it for testing purposes. Let me clean it up a bit.

@xaviliz (Contributor, Author):

It has been tested on Linux.

const char* OnnxPredict::name = "OnnxPredict";
const char* OnnxPredict::category = "Machine Learning";

const char* OnnxPredict::description = DOC("This algorithm runs a Onnx graph and stores the desired output tensors in a pool.\n"
@palonso (Contributor):

an ONNX graph?

@xaviliz (Contributor, Author):

It should be an ONNX model; there is no access to graphs in onnxruntime. This is fixed now.


// Do not do anything if we did not get a non-empty model name.
if (_graphFilename.empty()) return;
cout << "after return" << endl;
@palonso (Contributor):

Clean debug output

_env = Ort::Env(ORT_LOGGING_LEVEL_WARNING, "multi_io_inference"); // {"default", "test", "multi_io_inference"}

// Set graph optimization level - check https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html
_sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);
@palonso (Contributor):

Since there are different optimization options, I'm wondering if there is a chance that extended optimization doesn't work or affects model performance in some cases. I think this should be turned into a parameter that defaults to extended.

https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html#graph-optimization-levels

@xaviliz (Contributor, Author):

That's a good point; I am not sure how the optimizations could affect performance. Adding a new parameter sounds good to me. So, do you propose adding a boolean parameter for each optimization, or just a string to select one of them?

@xaviliz (Contributor, Author):

A new optimizationLevel parameter has been added as a string with choices {disable_all, basic, extended, all}, defaulting to extended. Maybe it would be nice to add some additional tests; what do you think about comparing outputs for the identity model across all optimization levels? (A sketch of the string-to-enum mapping follows below.)
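
For reference, a minimal sketch (not necessarily the PR's exact code) of mapping such a string parameter onto ONNX Runtime's GraphOptimizationLevel enum; the helper name parseOptimizationLevel is made up:

#include <onnxruntime_cxx_api.h>
#include <map>
#include <stdexcept>
#include <string>

// Map the optimizationLevel parameter choices onto ORT's enum values.
static GraphOptimizationLevel parseOptimizationLevel(const std::string& level) {
  static const std::map<std::string, GraphOptimizationLevel> levels = {
    {"disable_all", ORT_DISABLE_ALL},
    {"basic",       ORT_ENABLE_BASIC},
    {"extended",    ORT_ENABLE_EXTENDED},
    {"all",         ORT_ENABLE_ALL},
  };
  auto it = levels.find(level);
  if (it == levels.end())
    throw std::invalid_argument("OnnxPredict: invalid optimizationLevel '" + level + "'");
  return it->second;
}

// Usage before creating the session:
//   _sessionOptions.SetGraphOptimizationLevel(parseOptimizationLevel("extended"));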

// Set graph optimization level - check https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html
_sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);
// To enable model serialization after graph optimization set this
_sessionOptions.SetOptimizedModelFilePath("optimized_file_path");
@palonso (Contributor):

I think this is mainly intended for debugging purposes. Can we skip saving the optimized graph for efficiency?

https://onnxruntime.ai/docs/api/c/struct_ort_api.html#ad238e424200c0f1682947a1f342c39ca

@xaviliz (Contributor, Author):

Yes, we don't need to store the optimized graph in a model file.

return out;
}

void OnnxPredict::reset() {
@palonso (Contributor):

Shouldn't we reset the session and env too?

@xaviliz (Contributor, Author):

That's a good point. I couldn't find a reset method for the session and env in the C++ API like in TensorFlow, but let me try it using std::unique_ptr; maybe that could work. However, I am unsure whether we should do that after compute(), because if we reset the session at the end of configure(), session.Run() will fail. (A sketch of the unique_ptr approach is below.)
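
For illustration, a sketch of the std::unique_ptr idea with assumed member names (not the PR's actual fields): the session is (re)built in configure() and destroyed in reset(), so Run() stays valid between configure() and compute().

#include <onnxruntime_cxx_api.h>
#include <memory>
#include <string>

struct OnnxSessionHolder {
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "multi_io_inference"};
  Ort::SessionOptions options;
  std::unique_ptr<Ort::Session> session;

  void configure(const std::string& modelPath) {
    // (Re)create the session; a previously held session is destroyed first.
    session = std::make_unique<Ort::Session>(env, modelPath.c_str(), options);
  }

  void reset() {
    // Drop the session; configure() must run again before the next Run() call.
    session.reset();
  }
};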

const Pool& poolIn = _poolIn.get();
Pool& poolOut = _poolOut.get();

std::vector<std::vector<float>> input_datas; // <-- keeps inputs alive
@palonso (Contributor):

input_datas -> input_data?
I think data is already plural

// Step 2: Convert data to float32
input_datas.emplace_back(inputData.size());
for (size_t j = 0; j < inputData.size(); ++j) {
input_datas.back()[j] = static_cast<float>(inputData.data()[j]);
@palonso (Contributor):

Instead of force-casting the data to float, shouldn't we keep it in Real format (which is actually float32 by default) and make sure that the model runs with whatever type Real points to?

@xaviliz (Contributor, Author):

That's true when Real == float, but we need to fall back to casting when Real == double. So we should not try to make ONNX Runtime run on "whatever Real points to"; however, the redundant cast when Real == float could be avoided.

@xaviliz (Contributor, Author) commented Dec 23, 2025:

Fixed #1488! Essentia::Real is already float32 by default, so no need to cast ;)
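
A C++14-friendly sketch (the Linux build above uses --std=c++14) of avoiding the redundant copy: one overload passes a float buffer through unchanged, the other converts only when Real is double. asFloatBuffer is a hypothetical helper, not the PR's code.

#include <vector>

// Real == float: no conversion, reuse the existing buffer.
inline const float* asFloatBuffer(const std::vector<float>& input, std::vector<float>&) {
  return input.data();
}

// Real == double: fall back to a one-time conversion into the scratch buffer.
inline const float* asFloatBuffer(const std::vector<double>& input, std::vector<float>& scratch) {
  scratch.assign(input.begin(), input.end());
  return scratch.data();
}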

}

// Step 3: Create ONNX Runtime tensor
_memoryInfo = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
@palonso (Contributor):

Would it be possible to run the models on GPU if available?

@xaviliz (Contributor, Author):

Yes, it is (6131728). Maybe it would be nice to add some tests for these functionalities; I couldn't test them properly yet. (A sketch of enabling the CUDA execution provider when available follows below.)
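
As a sketch of enabling GPU execution only when it is compiled into onnxruntime (maybeEnableCuda is a made-up helper; the calls shown are the standard ORT C++ API, not the PR's exact code):

#include <onnxruntime_cxx_api.h>
#include <algorithm>
#include <string>
#include <vector>

// Append the CUDA execution provider only when onnxruntime was built with it;
// otherwise the session keeps the default CPU execution provider.
inline void maybeEnableCuda(Ort::SessionOptions& options, int deviceId) {
  std::vector<std::string> providers = Ort::GetAvailableProviders();
  bool hasCuda = std::find(providers.begin(), providers.end(),
                           "CUDAExecutionProvider") != providers.end();
  if (hasCuda) {
    OrtCUDAProviderOptions cudaOptions{};
    cudaOptions.device_id = deviceId;
    options.AppendExecutionProvider_CUDA(cudaOptions);
  }
}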

def _create_essentia_class(name, moduleName = __name__):
essentia.log.debug(essentia.EPython, 'Creating essentia.standard class: %s' % name)

# print(f"name: {name}")
@palonso (Contributor):

remove debug print

…hen cuda, metal or open_ml are not compiled in onnxruntime library
… level in ORT.

- Declared as a new string parameter with choices: {disable_all,basic,extended,all}, by default “extended”.
	- https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html#levels
- Set graph optimization level in ORT Session.
- `test_default_optimization_level()`: Check that the default optimization level is 'extended'.
- `test_set_valid_optimization_levels()`: Check that valid optimization levels can be set without errors.
- `test_set_invalid_optimization_level()`: Check that invalid optimization levels raise an error.
- CUDA tensors now use Ort::MemoryInfo::CreateCuda to allocate GPU memory.
- Metal and CoreML providers continue to use CPU tensors; data is managed internally by the provider.
- Execution provider is auto-selected based on availability and _deviceId.
@xaviliz (Contributor, Author) commented Dec 23, 2025

Thank you @palonso, all the changes/suggestions have been addressed.
Please review the changes and give me feedback when you can.
I think it is almost done.

Before merging it would be nice to:

  • Test the algorithm on Linux with the latest changes (it should work fine).
  • Test it on OSX with METAL and OPEN_ML.
  • Make a brew cask to test the Essentia installation with the onnxruntime package.
    • I think it was failing because it was built without some static-dependencies flag.
