Bump urllib3 from 2.6.3 to 2.7.0 in the uv group across 1 directory by dependabot[bot] · Pull Request #122 · istupakov/onnx-asr

dependabot · 2026-05-11T17:16:27Z

Bumps the uv group with 1 update in the / directory: urllib3.

Updates urllib3 from 2.6.3 to 2.7.0

Release notes

2.7.0

🚀 urllib3 is fundraising for HTTP/2 support

urllib3 is raising ~$40,000 USD to release HTTP/2 support and ensure long-term sustainable maintenance of the project after a sharp decline in financial support. If your company or organization uses Python and would benefit from HTTP/2 support in Requests, pip, cloud SDKs, and thousands of other projects, please consider contributing financially to ensure HTTP/2 support is developed sustainably and maintained for the long haul.

Thank you for your support.

Security

Addressed high-severity security issues. Impact was limited to specific use cases detailed in the accompanying advisories; overall user exposure was estimated to be marginal.

Decompression-bomb safeguards of the streaming API were bypassed:

When HTTPResponse.drain_conn() was called after the response had been read and decompressed partially. (Reported by @Cycloctane)

During the second HTTPResponse.read(amt=N) or HTTPResponse.stream(amt=N) call when the response was decompressed using the official Brotli library. (Reported by @kimkou2024)

See GHSA-mf9v-mfxr-j63j for details.

HTTP pools created using ProxyManager.connection_from_url did not strip sensitive headers specified in Retry.remove_headers_on_redirect when redirecting to a different host. (GHSA-qccp-gfcp-xxvc reported by @christos-spearbit)
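For readers unfamiliar with what a decompression-bomb safeguard on a streaming API looks like, here is a minimal sketch of the general technique using only the standard-library `zlib` module. It is not urllib3's implementation; `bounded_decompress` and `max_output` are hypothetical names chosen for illustration. The idea is to cap how many bytes each decompress call may emit, so a small compressed payload cannot expand without limit.

```python
import zlib


def bounded_decompress(chunks, max_output):
    """Inflate zlib-compressed chunks, raising once output exceeds max_output bytes.

    Hypothetical helper for illustration; not part of urllib3.
    """
    decomp = zlib.decompressobj()
    out = bytearray()
    for chunk in chunks:
        data = chunk
        while data:
            # Allow one byte past the limit so an overflow is detectable
            # while each decompress call stays bounded.
            budget = max_output - len(out) + 1
            out += decomp.decompress(data, budget)
            if len(out) > max_output:
                raise ValueError("decompressed size exceeds limit")
            # Input that was not processed because of the output cap.
            data = decomp.unconsumed_tail
    out += decomp.flush()
    if len(out) > max_output:
        raise ValueError("decompressed size exceeds limit")
    return bytes(out)
```

For example, `bounded_decompress([zlib.compress(b"a" * 1000)], 100)` raises `ValueError` instead of materialising all 1000 bytes.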

Deprecations and Removals

Used FutureWarning instead of DeprecationWarning for better visibility of existing deprecation notices. Rescheduled the removal of deprecated features to version 3.0. (urllib3/urllib3#3763)

Removed support for end-of-life Python 3.9. (urllib3/urllib3#3720)

Removed support for end-of-life PyPy3.10. (urllib3/urllib3#4979)

Bumped the minimum supported pyOpenSSL version to 19.0.0. (urllib3/urllib3#3777)
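The switch to FutureWarning matters because Python's default warning filters hide DeprecationWarning unless it is triggered by code in `__main__`, whereas FutureWarning is shown by default. A project that wants such notices to fail its test suite could escalate them to errors. The sketch below uses only the standard library; `legacy_call` and `call_strict` are hypothetical names for illustration, not urllib3 functions.

```python
import warnings


def legacy_call():
    """Hypothetical stand-in for a deprecated library function."""
    warnings.warn("legacy_call() is deprecated", FutureWarning, stacklevel=2)
    return "ok"


def call_strict(func):
    """Run func with FutureWarning escalated to an exception."""
    with warnings.catch_warnings():
        # Inside this block, any FutureWarning is raised instead of printed.
        warnings.simplefilter("error", FutureWarning)
        return func()
```

With pytest, a similar effect is available through the `filterwarnings` ini option, for instance with the value `error::FutureWarning`.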

Bugfixes

Fixed a bug where HTTPResponse.read(amt=None) was ignoring decompressed data buffered from previous partial reads. (urllib3/urllib3#3636)

Fixed a bug where HTTPResponse.read() could cache only part of the response after a partial read when cache_content=True. (urllib3/urllib3#4967)

Fixed HTTPResponse.stream() and HTTPResponse.read_chunked() to handle amt=0. (urllib3/urllib3#3793)

Updated _TYPE_BODY type alias to include missing Iterable[str], matching the documented and runtime behavior of chunked request bodies. (urllib3/urllib3#3798)

Fixed LocationParseError when paths resembling schemeless URIs were passed to HTTPConnectionPool.urlopen(). (urllib3/urllib3#3352)

Fixed BaseHTTPResponse.readinto() type annotation to accept memoryview in addition to bytearray, matching the io.RawIOBase.readinto contract and enabling use with io.BufferedReader without type errors. (urllib3/urllib3#3764)
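The readinto() annotation fix reflects how io.BufferedReader drives a raw stream: it hands readinto() a writable buffer that need not be a bytearray. A minimal standard-library sketch of that contract follows; `BytesSource` is a hypothetical raw stream for illustration and is unrelated to urllib3's classes.

```python
import io


class BytesSource(io.RawIOBase):
    """Hypothetical raw stream serving a fixed payload (illustration only)."""

    def __init__(self, payload: bytes) -> None:
        super().__init__()
        self._payload = payload
        self._pos = 0

    def readable(self) -> bool:
        return True

    def readinto(self, buffer) -> int:
        # Accept any writable bytes-like object: bytearray, memoryview, ...
        view = memoryview(buffer)
        n = min(len(view), len(self._payload) - self._pos)
        view[:n] = self._payload[self._pos:self._pos + n]
        self._pos += n
        return n  # 0 signals end of stream


# BufferedReader fills its internal buffer by calling readinto() on the raw stream.
reader = io.BufferedReader(BytesSource(b"hello world"))
first = reader.read(5)
rest = reader.read()
```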

Changelog

Sourced from urllib3's changelog.

2.7.0 (2026-05-07)

Security

Addressed high-severity security issues. Impact was limited to specific use cases detailed in the accompanying advisories; overall user exposure was estimated to be marginal.

Decompression-bomb safeguards of the streaming API were bypassed:

When HTTPResponse.drain_conn() was called after the response had been read and decompressed partially.

During the second HTTPResponse.read(amt=N) or HTTPResponse.stream(amt=N) call when the response was decompressed using the official [Brotli](https://pypi.org/project/brotli/) library.

See [GHSA-mf9v-mfxr-j63j](https://github.com/urllib3/urllib3/security/advisories/GHSA-mf9v-mfxr-j63j) for details.

HTTP pools created using ProxyManager.connection_from_url did not strip sensitive headers specified in Retry.remove_headers_on_redirect when redirecting to a different host. ([GHSA-qccp-gfcp-xxvc](https://github.com/urllib3/urllib3/security/advisories/GHSA-qccp-gfcp-xxvc))
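To make the second advisory concrete: when a redirect crosses to a different host, headers listed in Retry.remove_headers_on_redirect are expected to be dropped before the request is re-sent. The sketch below shows that rule in isolation with the standard library. `headers_for_redirect` and the `SENSITIVE` set are hypothetical names for illustration, and comparing hostnames only is a simplification; none of this reproduces urllib3's internals.

```python
from urllib.parse import urlsplit

# Assumed set of sensitive header names, for illustration only.
SENSITIVE = frozenset({"authorization", "cookie", "proxy-authorization"})


def headers_for_redirect(headers, from_url, to_url, sensitive=SENSITIVE):
    """Return the headers that may follow a redirect from from_url to to_url."""
    # Simplification: only the (case-insensitive) hostname is compared.
    if urlsplit(from_url).hostname == urlsplit(to_url).hostname:
        return dict(headers)
    # Different host: drop every header whose name is in the sensitive set.
    return {k: v for k, v in headers.items() if k.lower() not in sensitive}
```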

Deprecations and Removals

Used FutureWarning instead of DeprecationWarning for better visibility of existing deprecation notices. Rescheduled the removal of deprecated features to version 3.0. ([#3763](https://github.com/urllib3/urllib3/issues/3763))

Removed support for end-of-life Python 3.9. ([#3720](https://github.com/urllib3/urllib3/issues/3720))

Removed support for end-of-life PyPy3.10. ([#4979](https://github.com/urllib3/urllib3/issues/4979))

Bumped the minimum supported pyOpenSSL version to 19.0.0. ([#3777](https://github.com/urllib3/urllib3/issues/3777))

Bugfixes

Fixed a bug where HTTPResponse.read(amt=None) was ignoring decompressed data buffered from previous partial reads. ([#3636](https://github.com/urllib3/urllib3/issues/3636))

Fixed a bug where HTTPResponse.read() could cache only part of the response after a partial read when cache_content=True.

... (truncated)

Commits

9a950b9 Release 2.7.0
5ec0de4 Merge commit from fork
2bdcc44 Merge commit from fork
f45b0df Fix a misleading example for ProxyManager (#4970)
577193c Switch to nightly PyPy3.11 in CI for now (#4984)
e90af45 Avoid infinite loop in HTTPResponse.read_chunked when amt=0 (#4974)
67ed74f Bump dev dependencies (#4972)
3abd481 Upgrade mypy to version 1.20.2 (#4978)
2b8725d Drop support for EOL PyPy3.10 (#4979)
2944b2a Upgrade setup-chrome and setup-firefox to fix warnings (#4973)
Additional commits viewable in compare view

Bumps the uv group with 1 update in the / directory: [urllib3](https://github.com/urllib3/urllib3).

Updates `urllib3` from 2.6.3 to 2.7.0
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](urllib3/urllib3@2.6.3...2.7.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.7.0
  dependency-type: indirect
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>

codecov-commenter · 2026-05-11T17:22:27Z

❌ 7 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 381 | 7 | 374 | 1 |

View the top 3 failed test(s) by shortest run time

tests/onnx_asr/test_recognize.py::test_empty_recognize[onnx-community/whisper-tiny]

Stack Traces | 0s run time

request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail

tests/onnx_asr/test_recognize.py::test_recognize[onnx-community/whisper-tiny]

tests/onnx_asr/test_recognize.py::test_recognize_batch[onnx-community/whisper-tiny]

tests/onnx_asr/test_recognize.py::test_recognize_with_timestamps[onnx-community/whisper-tiny]

Stack Traces | 0s run time

request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail

tests/onnx_asr/test_recognize.py::test_supported_only_mono_audio_error[onnx-community/whisper-tiny]

Stack Traces | 0s run time

request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail

tests/onnx_asr/test_recognize.py::test_wrong_sample_rate_error[onnx-community/whisper-tiny]

Stack Traces | 0s run time

request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail

tests/onnx_asr/test_recognize.py::test_file_not_found_error[onnx-community/whisper-tiny]

Stack Traces | 0.531s run time

request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail
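
The fallback selection inside `_create_inference_session` in the traces above repeats the same name-or-tuple membership check four times. A condensed sketch of that branch logic may make it easier to see which fallback list applied in this run. The helper names `has_provider` and `pick_fallback_providers` are hypothetical and are not part of onnxruntime; only the provider name strings and the branch order come from the trace.

```python
from typing import Any, List, Optional, Sequence


def has_provider(providers: Optional[Sequence[Any]], name: str) -> bool:
    """Return True if `name` is listed as a bare string or as the first item of a tuple."""
    return any(
        p == name or (isinstance(p, tuple) and p[0] == name)
        for p in (providers or [])
    )


def pick_fallback_providers(
    providers: Optional[Sequence[Any]], available: Sequence[str]
) -> List[str]:
    """Mirror the fallback branches shown in the trace (hypothetical helper).

    TensorRT RTX is checked before TensorRT, as in the traced code. Only the
    first TensorRT flavour found in `available` is considered, and CUDA is used
    as a fallback only when both CUDA and that flavour were requested.
    """
    for trt in ("NvTensorRTRTXExecutionProvider", "TensorrtExecutionProvider"):
        if trt in available:
            if has_provider(providers, "CUDAExecutionProvider") and has_provider(providers, trt):
                return ["CUDAExecutionProvider", "CPUExecutionProvider"]
            return ["CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


# Provider list taken from the locals printed in the trace. The available list
# is assumed to be the same three providers for the purpose of this sketch.
requested = ["CoreMLExecutionProvider", "AzureExecutionProvider", "CPUExecutionProvider"]
print(pick_fallback_providers(requested, requested))  # -> ['CPUExecutionProvider']
```

Under that assumption the fallback list is CPU only. The failure itself is raised later, in `sess.initialize_session`, so the fallback choice is not what triggers the `Missing required scale` error.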

To view more test analytics, go to the Test Analytics Dashboard

dependabot[bot] added the dependencies and python:uv labels on May 11, 2026

dependabot[bot] wants to merge 1 commit into main from dependabot/uv/uv-c30c77f42d

dependabot Bot commented on behalf of github May 11, 2026

codecov-commenter commented May 11, 2026 (edited)

1 participant

2.7.0 (2026-05-07)

❌ 7 Tests Failed:
