Closed
Labels
model:transformer — issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.
stale — issues that have not been addressed in a while; categorized by a bot
Description
Describe the issue
Bug Report
Loading and optimizing the model with CUDA crashes, while the same optimization runs fine on the CPU.
The crash stack trace:
Traceback (most recent call last):
File "test", line 7, in <module>
optimized_model = optimizer.optimize_model(model_path, opt_level=1, use_gpu=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/software/onnxruntime/build/Linux/Release/onnxruntime/transformers/optimizer.py", line 381, in optimize_model
temp_model_path = optimize_by_onnxruntime(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/software/onnxruntime/build/Linux/Release/onnxruntime/transformers/optimizer.py", line 206, in optimize_by_onnxruntime
onnxruntime.InferenceSession(onnx_model, sess_options, providers=providers, **kwargs)
File "/software/onnxruntime/build/Linux/Release/onnxruntime/capi/onnxruntime_inference_collection.py", line 465, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/software/onnxruntime/build/Linux/Release/onnxruntime/capi/onnxruntime_inference_collection.py", line 537, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /software/onnxruntime/onnxruntime/contrib_ops/cuda/fused_conv.cc:67 onnxruntime::contrib::cuda::FusedConv<T>::FusedConv(const onnxruntime::OpKernelInfo&) [with T = float] [ONNXRuntimeError] : 1 : FAIL : No attribute with name:'activation'is defined.
To reproduce
- Download model here
- Run the test script:
from onnxruntime.transformers import optimizer
model_path = "model_with_activation.onnx"
optimized_model_path = "./opt.onnx"
optimized_model = optimizer.optimize_model(model_path, opt_level=1, use_gpu=True)
Notice:
- use_gpu=True & opt_level >=1 --> crash
- use_gpu=False --> run well
- opt_level = 0 --> run well
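Based on the observations above, a practical workaround is to keep opt_level at 0 when targeting the GPU so ONNX Runtime's CUDA-side fusion never runs. The helper below (choose_opt_level is a hypothetical name, not part of the onnxruntime API) sketches that rule:

```python
# Hedged workaround sketch: pick an opt_level that avoids the FusedConv crash.
# choose_opt_level is a hypothetical helper for illustration only.
def choose_opt_level(use_gpu: bool) -> int:
    # use_gpu=True with opt_level >= 1 crashes (FusedConv 'activation' error),
    # so fall back to 0 on GPU; CPU runs fine at level 1.
    return 0 if use_gpu else 1

# Usage with the transformers optimizer (requires onnxruntime and the model):
#   from onnxruntime.transformers import optimizer
#   optimized_model = optimizer.optimize_model(
#       "model_with_activation.onnx",
#       opt_level=choose_opt_level(use_gpu=True),
#       use_gpu=True,
#   )
print(choose_opt_level(True))   # 0
print(choose_opt_level(False))  # 1
```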
Urgency
No response
Platform
Linux
OS Version
Ubuntu 12.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
No response