Onnx transformers: Quantize option #6
base: master
Conversation
- changed framework type to "pt"
- added a usage example with the NER pipeline (see the sketch below)
- reading configuration from the model config
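For reference, a hedged sketch of what that usage example might look like. The `pipeline` import follows the onnx_transformers README; the `quantized` flag is the option this PR introduces:

```python
from onnx_transformers import pipeline

# NER pipeline running on ONNX Runtime; `quantized=True` is the new
# option from this PR and only takes effect together with `onnx=True`.
nlp = pipeline("ner", onnx=True, quantized=True)
print(nlp("My name is Wolfgang and I live in Berlin."))
```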
modelcard = config
# TODO: Disable modelcard (below 4 lines) if working with local models.
# searches modelcard.json
if not local_model:
We can keep it as it is. Does it break for local models? If not, let's keep it as it is.
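For context, a minimal sketch of the guard in that hunk, assuming the pipeline resolves the modelcard from a string identifier as in upstream transformers (`ModelCard` is the transformers class; the exact call is illustrative):

```python
from transformers import ModelCard

modelcard = config
if not local_model:
    # Only search the hub for modelcard.json when the model is remote;
    # local checkpoints typically ship without one, which is what the
    # new `local_model` flag guards against.
    if isinstance(modelcard, str):
        modelcard = ModelCard.from_pretrained(modelcard)
```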
@@ -694,6 +704,20 @@ def _forward(self, inputs, return_tensors=False):
        else:
            return predictions.numpy()

    def _create_quantized_graph(self, onnx_opt_model_path):
        # TODO: add a gpt2 option if needed
        opt_options = BertOptimizationOptions('bert')
Check the model type explicitly, and raise an assert or exception if the model is not BERT.
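A hedged sketch of that check, assuming the pipeline keeps its transformers config on `self.config` (the attribute name is an assumption, not taken from the diff):

```python
def _create_quantized_graph(self, onnx_opt_model_path):
    # Fail fast instead of silently applying BERT-specific graph
    # optimizations to an unsupported architecture.
    model_type = getattr(self.config, "model_type", None)
    if model_type != "bert":
        raise ValueError(
            f"Quantization currently supports only 'bert' models, got '{model_type}'."
        )
    opt_options = BertOptimizationOptions("bert")
    # ... rest of the optimization / quantization as in the diff above
```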
@@ -555,7 +559,13 @@ def __init__(
         logger.info(f"loading onnx graph from {self.graph_path.as_posix()}")
         self.onnx_model = create_model_for_provider(str(graph_path), "CPUExecutionProvider")
         self.input_names = json.load(open(input_names_path))
-        self.framework = "np"
+        self.framework = "pt"
This will cause other things to break. Are all tests passing? You can run the tests with `make test`.
Set `onnx` to `False` for standard torch inference.
Set `quantized` to `True` for quantize with Onnx. ( set `onnx` to True)
I've made the changes we discussed in the PR. I added a `local_model` option to the pipeline. It ignores `modelcard`, so local models without a modelcard can still be loaded. I kept the framework as torch. In some cases, like loading local models, I got the error `InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Unexpected input data type.` We can leave it like this to stay safe.
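That INVALID_ARGUMENT error usually means the session was fed tensors whose type doesn't match the graph's declared inputs, e.g. torch tensors or int32 arrays where the exported graph expects int64. A defensive conversion sketch (names are illustrative) that would make the inputs safe regardless of the tokenizer framework:

```python
import numpy as np
import torch

def to_onnx_inputs(tokenizer_output):
    # Coerce tokenizer output (torch tensors or numpy arrays) into the
    # int64 numpy arrays that onnxruntime's InferenceSession.run expects
    # for input_ids / attention_mask style inputs.
    converted = {}
    for name, value in tokenizer_output.items():
        if isinstance(value, torch.Tensor):
            value = value.cpu().numpy()
        converted[name] = np.asarray(value).astype(np.int64)
    return converted
```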