azizb-xlnx/onnx-utils

ONNX Simplification Testing Scripts

This directory contains scripts to test and compare ONNX models before and after simplification using onnxsim. These scripts help you understand the impact of ONNX model optimization on file size, inference speed, and numerical accuracy.

Scripts Overview

1. test_onnx_simplification.py - Complete Test Suite

A comprehensive script that creates a complex PyTorch model, exports it to ONNX, simplifies it, and compares:

  • Model file sizes
  • Number of nodes and operations
  • Inference performance
  • Output accuracy
  • Support for dynamic batch sizes

Usage:

python test_onnx_simplification.py

Features:

  • Creates a complex model with attention, batch norm, and redundant operations
  • Exports to ONNX with proper optimization settings
  • Uses onnxsim for simplification
  • Benchmarks inference time with ONNX Runtime
  • Validates output accuracy between original and simplified models
  • Tests with multiple batch sizes
  • Generates detailed comparison reports
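The node and operator counts in the comparison can be gathered by walking the graph's node list. The following is a minimal sketch, not the script's actual code: `graph_stats` is a hypothetical helper, and stand-in objects are used so it runs without a model file. With the `onnx` package installed you would pass `onnx.load("model.onnx").graph.node` instead.

```python
from collections import Counter
from types import SimpleNamespace


def graph_stats(nodes):
    """Return (node_count, Counter of op types) for an iterable of graph nodes.

    Each node only needs an `op_type` attribute, as ONNX NodeProto objects have.
    """
    op_counts = Counter(node.op_type for node in nodes)
    return sum(op_counts.values()), op_counts


# Stand-in nodes (assumption: illustrative only, not a real model).
toy_nodes = [SimpleNamespace(op_type=t) for t in ("Conv", "Relu", "Conv", "Identity")]
node_count, op_counts = graph_stats(toy_nodes)
print(f"nodes={node_count}, operator types={len(op_counts)}")
```

Note that this counts only top-level nodes; nodes inside control-flow subgraphs (for example the branches of an `If`) would need a recursive walk.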

2. test_existing_model.py - Test Your Own Models

Tests simplification on existing ONNX models in your workspace.

Usage:

python test_existing_model.py --model path/to/your/model.onnx

Example:

python test_existing_model.py --model dynamic_shape_model_dynamo.onnx
python test_existing_model.py --model 290925_model_opset21.onnx

Features:

  • Works with any existing ONNX model
  • Automatically creates appropriate test inputs
  • Compares performance and accuracy
  • Saves simplified model for inspection
  • Handles dynamic shapes and multiple inputs/outputs
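One plausible way to create test inputs for a model with dynamic shapes is to replace every non-fixed dimension with a chosen batch size. The sketch below assumes the shape list has the form ONNX Runtime reports for an input (integers for fixed dimensions, `None` or a symbolic string for dynamic ones); `make_dummy_input` is a hypothetical name, and the script's own logic may differ.

```python
import numpy as np


def make_dummy_input(shape, batch_size=1, dtype=np.float32, seed=0):
    """Build a random array for an input whose shape may contain dynamic dims.

    Every entry that is not a positive integer (None, or a symbolic name such
    as "batch") is replaced by `batch_size`, which is a simplification.
    """
    concrete = tuple(
        d if isinstance(d, int) and d > 0 else batch_size for d in shape
    )
    rng = np.random.default_rng(seed)
    return rng.standard_normal(concrete).astype(dtype)


x = make_dummy_input(["batch", 3, 224, 224], batch_size=2)
print(x.shape, x.dtype)
```

Calling the helper with several `batch_size` values gives a simple way to exercise a dynamic batch dimension.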

3. demo_simplification.py - Quick Demo

A simple demonstration script that finds ONNX models in the current directory and shows the simplification benefits.

Usage:

python demo_simplification.py

Features:

  • Automatically finds ONNX models in current directory
  • Shows basic statistics (node count, file size)
  • Quick simplification without detailed benchmarking
  • Good for getting a quick overview of potential improvements
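Model discovery of this kind can be done with the standard library alone. This is a rough sketch under the assumption that only files directly in the given directory are considered; `find_onnx_models` is a hypothetical helper, not a function taken from the demo script.

```python
from pathlib import Path


def find_onnx_models(directory="."):
    """Return (path, size in MB) for each *.onnx file directly inside `directory`."""
    results = []
    for path in sorted(Path(directory).glob("*.onnx")):
        size_mb = path.stat().st_size / (1024 * 1024)
        results.append((path, size_mb))
    return results


for path, size_mb in find_onnx_models():
    print(f"{path.name}: {size_mb:.2f} MB")
```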

Installation

Install required dependencies:

pip install -r requirements.txt

Or manually:

pip install torch onnx onnxruntime onnxsim numpy

Understanding the Output

Model Comparison Table

Metric                         Original             Simplified           Improvement
--------------------------------------------------------------------------------
File Size (MB)                 5.23                 3.45                 34.0%
Number of Nodes                1247                 892                  28.5%
Operator Types                 23                   18                   21.7%
Inference Time (ms)            12.45                8.32                 33.2%

Key Metrics Explained

  • File Size Reduction: Smaller files load faster and take less disk space and memory
  • Node Reduction: Fewer nodes usually mean less per-operator overhead, though the speedup depends on which nodes were removed
  • Operator Type Reduction: A smaller set of operator types can be easier for a runtime to support and optimize
  • Inference Time: Direct measurement of the change in latency
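The Improvement column in the table above is consistent with a relative reduction, (original - simplified) / original. The sketch below reproduces the sample figures under that assumption; `improvement_pct` is a hypothetical name.

```python
def improvement_pct(original, simplified):
    """Relative reduction from `original` to `simplified`, in percent."""
    if original == 0:
        raise ValueError("original value must be non-zero")
    return (original - simplified) / original * 100.0


# Sample values copied from the comparison table above.
rows = {
    "File Size (MB)": (5.23, 3.45),
    "Number of Nodes": (1247, 892),
    "Operator Types": (23, 18),
    "Inference Time (ms)": (12.45, 8.32),
}
for name, (orig, simp) in rows.items():
    print(f"{name:<25}{improvement_pct(orig, simp):.1f}%")
```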

Accuracy Validation

The scripts check that the simplified model's outputs match the original model's outputs within a small tolerance (1e-5). This confirms that simplification has not changed the model's behavior on the tested inputs.
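A tolerance check like the one described above can be written with NumPy. The sketch assumes the 1e-5 tolerance is absolute (the relative tolerance is set to zero) and that outputs arrive as lists of arrays; `outputs_match` is a hypothetical name.

```python
import numpy as np


def outputs_match(original_outputs, simplified_outputs, atol=1e-5):
    """True if every output pair has the same shape and agrees within `atol`."""
    if len(original_outputs) != len(simplified_outputs):
        return False
    return all(
        a.shape == b.shape and np.allclose(a, b, rtol=0.0, atol=atol)
        for a, b in zip(original_outputs, simplified_outputs)
    )


reference = [np.array([1.0, 2.0, 3.0], dtype=np.float32)]
close = [reference[0] + 1e-6]  # perturbation below the tolerance
far = [reference[0] + 1e-3]    # perturbation above the tolerance
print(outputs_match(reference, close), outputs_match(reference, far))
```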

What Does ONNX Simplification Do?

ONNX simplification (onnxsim) performs several optimizations:

  1. Constant Folding: Computes constant expressions at optimization time
  2. Dead Code Elimination: Removes unused operations
  3. Operator Fusion: Combines multiple operations into single optimized operations
  4. Shape Inference: Improves shape information for better optimization
  5. Redundant Operation Removal: Eliminates identity operations and no-ops

Common Optimizations:

  • Remove Identity nodes that don't change data
  • Fold constants (e.g., a chain of shape-computing nodes on a fixed-size input collapses into one constant tensor)
  • Merge consecutive operations (e.g., multiple reshapes)
  • Eliminate unreachable code
  • Simplify mathematical expressions (e.g., x + 0 becomes just x)

Example Results

Typical improvements you might see:

Model Type      Size Reduction    Speed Improvement    Node Reduction
---------------------------------------------------------------------
CNN             20-40%            15-30%               25-45%
Transformer     15-35%            10-25%               20-40%
MLP             10-25%            5-20%                15-30%

Testing Your Models

To test your existing models:

  1. Single Model Test:

    python test_existing_model.py --model your_model.onnx

  2. Quick Overview:

    python demo_simplification.py

  3. Full PyTorch Pipeline:

    python test_onnx_simplification.py

Troubleshooting

Common Issues:

  1. "Module not found" errors: Install the missing dependencies

    pip install onnx onnxruntime onnxsim

  2. Simplification validation fails: Some models with complex dynamic shapes cannot be simplified safely. If validation fails, keep using the original model.

  3. Memory errors with large models: The scripts hold multiple copies of a model in memory. Very large models (> 1 GB) may need more RAM.

  4. Shape inference errors: Models with highly dynamic shapes can fail shape inference. Try smaller batch sizes or simpler models first.

Performance Tips:

  • Use GPU providers for faster inference (CUDAExecutionProvider requires the onnxruntime-gpu package):

    providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']

  • For very large models, reduce the number of benchmark runs

  • Test with realistic input sizes for your use case
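The number of benchmark runs is easiest to tune when timing is wrapped in a small helper. The sketch below is one possible shape for it, with untimed warm-up calls followed by timed runs; `benchmark` and its defaults are assumptions, and the workload is a stand-in. With ONNX Runtime the callable would be something like `lambda: session.run(None, inputs)`.

```python
import time


def benchmark(fn, warmup=3, runs=20):
    """Return the mean wall-clock time of `fn()` in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1000.0


# Stand-in workload so the sketch runs without a model.
mean_ms = benchmark(lambda: sum(range(10_000)), warmup=2, runs=10)
print(f"mean latency: {mean_ms:.3f} ms")
```

Lowering `runs` shortens the benchmark for large models at the cost of noisier averages.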

Advanced Usage

Custom Test Inputs

You can modify the scripts to use your own test data instead of random inputs:

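# In test_existing_model.py, replace the call to create_dummy_inputs()
import numpy as np

# 'input_name' must match the model's input name; your_real_data is your own array
test_inputs = {
    'input_name': your_real_data.astype(np.float32)
}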

Batch Size Testing

The scripts automatically test different batch sizes to verify dynamic shape support.

Output Analysis

For detailed output comparison, the scripts provide:

  • Maximum absolute difference between outputs
  • Mean absolute difference
  • Relative error percentage

This helps you understand if any numerical differences are significant for your application.
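The three statistics listed above can be computed along these lines. This is a sketch rather than the scripts' implementation: `diff_report` is a hypothetical name, and relative error is defined here as the maximum absolute difference divided by the maximum absolute value of the original output, which is only one of several reasonable definitions.

```python
import numpy as np


def diff_report(original, simplified, eps=1e-12):
    """Summarize the numerical difference between two same-shaped arrays."""
    original = np.asarray(original, dtype=np.float64)
    simplified = np.asarray(simplified, dtype=np.float64)
    abs_diff = np.abs(original - simplified)
    # eps guards against division by zero when the original output is all zeros.
    rel_pct = abs_diff.max() / (np.abs(original).max() + eps) * 100.0
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "relative_error_pct": float(rel_pct),
    }


report = diff_report([1.0, 2.0, 4.0], [1.0, 2.5, 4.0])
print(report)
```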

Files Generated

After running the tests, you'll find:

  • simplification_test/ directory with simplified models
  • onnx_simplification_test/ directory with complete test results
  • Detailed console output with comparison tables

Next Steps

  1. Run the demo on your existing models
  2. Analyze the improvements for your specific use case
  3. Integrate simplified models into your deployment pipeline
  4. Consider further optimizations like quantization or TensorRT conversion

Remember: Always validate that simplified models maintain acceptable accuracy for your specific application!
