
Bump urllib3 from 2.6.3 to 2.7.0 in the uv group across 1 directory #122

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/uv/uv-c30c77f42d

Conversation

@dependabot
Contributor

@dependabot dependabot Bot commented on behalf of github May 11, 2026

Bumps the uv group with 1 update in the / directory: urllib3.

Updates urllib3 from 2.6.3 to 2.7.0

Release notes

Sourced from urllib3's releases.

2.7.0

🚀 urllib3 is fundraising for HTTP/2 support

urllib3 is raising ~$40,000 USD to release HTTP/2 support and ensure long-term sustainable maintenance of the project after a sharp decline in financial support. If your company or organization uses Python and would benefit from HTTP/2 support in Requests, pip, cloud SDKs, and thousands of other projects please consider contributing financially to ensure HTTP/2 support is developed sustainably and maintained for the long-haul.

Thank you for your support.

Security

Addressed high-severity security issues. Impact was limited to specific use cases detailed in the accompanying advisories; overall user exposure was estimated to be marginal.

  • Decompression-bomb safeguards of the streaming API were bypassed:

    1. When HTTPResponse.drain_conn() was called after the response had been read and decompressed partially. (Reported by @​Cycloctane)
    2. During the second HTTPResponse.read(amt=N) or HTTPResponse.stream(amt=N) call when the response was decompressed using the official Brotli library. (Reported by @​kimkou2024)

    See GHSA-mf9v-mfxr-j63j for details.

  • HTTP pools created using ProxyManager.connection_from_url did not strip sensitive headers specified in Retry.remove_headers_on_redirect when redirecting to a different host. (GHSA-qccp-gfcp-xxvc reported by @​christos-spearbit)

Deprecations and Removals

  • Used FutureWarning instead of DeprecationWarning for better visibility of existing deprecation notices. Rescheduled the removal of deprecated features to version 3.0. (urllib3/urllib3#3763)
  • Removed support for end-of-life Python 3.9. (urllib3/urllib3#3720)
  • Removed support for end-of-life PyPy3.10. (urllib3/urllib3#4979)
  • Bumped the minimum supported pyOpenSSL version to 19.0.0. (urllib3/urllib3#3777)
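
Since the existing deprecation notices are now emitted as FutureWarning, which Python's default warning filters display, a project may want to turn them into errors in its test suite to find affected call sites before 3.0. The sketch below uses only the standard `warnings` module; `legacy_call` is a hypothetical stand-in for a deprecated function and is not a urllib3 API.

```python
import warnings


def legacy_call() -> str:
    # Hypothetical deprecated function, for illustration only.
    warnings.warn("legacy_call() is deprecated", FutureWarning, stacklevel=2)
    return "ok"


def raises_future_warning() -> bool:
    """Return True if legacy_call() emits a FutureWarning."""
    with warnings.catch_warnings():
        # Escalate FutureWarning to an exception inside this block only.
        warnings.simplefilter("error", FutureWarning)
        try:
            legacy_call()
        except FutureWarning:
            return True
    return False


if __name__ == "__main__":
    print(raises_future_warning())
```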

Bugfixes

  • Fixed a bug where HTTPResponse.read(amt=None) was ignoring decompressed data buffered from previous partial reads. (urllib3/urllib3#3636)
  • Fixed a bug where HTTPResponse.read() could cache only part of the response after a partial read when cache_content=True. (urllib3/urllib3#4967)
  • Fixed HTTPResponse.stream() and HTTPResponse.read_chunked() to handle amt=0. (urllib3/urllib3#3793)
  • Updated _TYPE_BODY type alias to include missing Iterable[str], matching the documented and runtime behavior of chunked request bodies. (urllib3/urllib3#3798)
  • Fixed LocationParseError when paths resembling schemeless URIs were passed to HTTPConnectionPool.urlopen(). (urllib3/urllib3#3352)
  • Fixed BaseHTTPResponse.readinto() type annotation to accept memoryview in addition to bytearray, matching the io.RawIOBase.readinto contract and enabling use with io.BufferedReader without type errors. (urllib3/urllib3#3764)
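
To illustrate the `io.RawIOBase.readinto` contract mentioned in the last item, here is a minimal standard-library sketch of a raw stream whose `readinto()` accepts either a `bytearray` or a `memoryview`, which lets `io.BufferedReader` wrap it. `BytesSource` is a hypothetical class for illustration and is not part of urllib3.

```python
import io
from typing import Union


class BytesSource(io.RawIOBase):
    """Hypothetical raw stream that serves bytes from an in-memory buffer."""

    def __init__(self, data: bytes) -> None:
        super().__init__()
        self._data = data
        self._pos = 0

    def readable(self) -> bool:
        return True

    def readinto(self, b: Union[bytearray, memoryview]) -> int:
        # Copy as many bytes as fit into the caller's buffer.
        n = min(len(b), len(self._data) - self._pos)
        b[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n  # 0 signals end of stream


if __name__ == "__main__":
    reader = io.BufferedReader(BytesSource(b"hello world"))
    print(reader.read(5))
```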
Changelog

Sourced from urllib3's changelog.

2.7.0 (2026-05-07)

Security

Addressed high-severity security issues. Impact was limited to specific use cases detailed in the accompanying advisories; overall user exposure was estimated to be marginal.

  • Decompression-bomb safeguards of the streaming API were bypassed:

    1. When HTTPResponse.drain_conn() was called after the response had been read and decompressed partially.
    2. During the second HTTPResponse.read(amt=N) or HTTPResponse.stream(amt=N) call when the response was decompressed using the official [Brotli](https://pypi.org/project/brotli/) library.

    See [GHSA-mf9v-mfxr-j63j](https://github.com/urllib3/urllib3/security/advisories/GHSA-mf9v-mfxr-j63j) for details.

  • HTTP pools created using ProxyManager.connection_from_url did not strip sensitive headers specified in Retry.remove_headers_on_redirect when redirecting to a different host. ([GHSA-qccp-gfcp-xxvc](https://github.com/urllib3/urllib3/security/advisories/GHSA-qccp-gfcp-xxvc))
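
The intent of the second advisory, dropping sensitive headers when a redirect leaves the original host, can be sketched with the standard library alone. This is an illustration of the idea, not urllib3's implementation: `headers_for_redirect` and its default header list are assumptions made for this example, and the host comparison is deliberately simplified to the hostname.

```python
from typing import Dict, Iterable
from urllib.parse import urlsplit


def _host(url: str) -> str:
    # urlsplit() lowercases the hostname; fall back to "" if it is absent.
    return urlsplit(url).hostname or ""


def headers_for_redirect(
    headers: Dict[str, str],
    old_url: str,
    new_url: str,
    remove: Iterable[str] = ("authorization", "cookie", "proxy-authorization"),
) -> Dict[str, str]:
    """Return the headers to send after a redirect from old_url to new_url."""
    if _host(old_url) == _host(new_url):
        # Same host: keep everything.
        return dict(headers)
    # Different host: drop the listed headers, matching names case-insensitively.
    blocked = {name.lower() for name in remove}
    return {k: v for k, v in headers.items() if k.lower() not in blocked}


if __name__ == "__main__":
    sent = {"Authorization": "Bearer token", "Accept": "*/*"}
    print(headers_for_redirect(sent, "https://a.example/x", "https://b.example/y"))
```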

Deprecations and Removals

  • Used FutureWarning instead of DeprecationWarning for better visibility of existing deprecation notices. Rescheduled the removal of deprecated features to version 3.0. ([#3763](https://github.com/urllib3/urllib3/issues/3763))
  • Removed support for end-of-life Python 3.9. ([#3720](https://github.com/urllib3/urllib3/issues/3720))
  • Removed support for end-of-life PyPy3.10. ([#4979](https://github.com/urllib3/urllib3/issues/4979))
  • Bumped the minimum supported pyOpenSSL version to 19.0.0. ([#3777](https://github.com/urllib3/urllib3/issues/3777))

Bugfixes

  • Fixed a bug where HTTPResponse.read(amt=None) was ignoring decompressed data buffered from previous partial reads. ([#3636](https://github.com/urllib3/urllib3/issues/3636))
  • Fixed a bug where HTTPResponse.read() could cache only part of the response after a partial read when cache_content=True.

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore <dependency name> major version will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
  • @dependabot ignore <dependency name> minor version will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
  • @dependabot ignore <dependency name> will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
  • @dependabot unignore <dependency name> will remove all of the ignore conditions of the specified dependency
  • @dependabot unignore <dependency name> <ignore condition> will remove the ignore condition of the specified dependency and ignore conditions
    You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps the uv group with 1 update in the / directory: [urllib3](https://github.com/urllib3/urllib3).


Updates `urllib3` from 2.6.3 to 2.7.0
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](urllib3/urllib3@2.6.3...2.7.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.7.0
  dependency-type: indirect
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot added the dependencies (Pull requests that update a dependency file) and python:uv (Pull requests that update python:uv code) labels May 11, 2026
@codecov-commenter

codecov-commenter commented May 11, 2026

❌ 7 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 381 | 7 | 374 | 1 |
View the top 3 failed test(s) by shortest run time
tests/onnx_asr/test_recognize.py::test_empty_recognize[onnx-community/whisper-tiny]
Stack Traces | 0s run time
request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail
tests/onnx_asr/test_recognize.py::test_recognize[onnx-community/whisper-tiny]
Stack Traces | 0s run time
request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail
tests/onnx_asr/test_recognize.py::test_recognize_batch[onnx-community/whisper-tiny]
Stack Traces | 0s run time
request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail
tests/onnx_asr/test_recognize.py::test_recognize_with_timestamps[onnx-community/whisper-tiny]
Stack Traces | 0s run time
request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail
tests/onnx_asr/test_recognize.py::test_supported_only_mono_audio_error[onnx-community/whisper-tiny]
Stack Traces | 0s run time
request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail
tests/onnx_asr/test_recognize.py::test_wrong_sample_rate_error[onnx-community/whisper-tiny]
Stack Traces | 0s run time
request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail
tests/onnx_asr/test_recognize.py::test_file_not_found_error[onnx-community/whisper-tiny]
Stack Traces | 0.531s run time
request = <SubRequest 'model' for <Function test_file_not_found_error[onnx-community/whisper-tiny]>>

    @pytest.fixture(scope="module", params=models)
    def model(request: pytest.FixtureRequest) -> TextResultsAsrAdapter:
        match request.param:
            case "t-tech/t-one":
                return onnx_asr.load_model(request.param)
            case "onnx-community/whisper-tiny":
>               return onnx_asr.load_model(request.param, quantization="uint8")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/onnx_asr/test_recognize.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:347: in load_model
    return manager.create_asr(model, path, quantization=quantization, config=asr_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14....../site-packages/onnx_asr/loader.py:248: in create_asr
    resolver.model_type(resolver.resolve_model(quantization=quantization), self._create_preprocessor, config)
.venv/lib/python3.14.../onnx_asr/models/whisper.py:157: in __init__
    self._decoder = rt.InferenceSession(model_files["decoder"], **onnx_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:530: in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x10653bb60>
providers = ['CoreMLExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
provider_options = [{}, {}, {}], disabled_optimizers = set()

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Validate that TensorrtExecutionProvider and NvTensorRTRTXExecutionProvider are not both specified
        if providers:
            has_tensorrt = any(
                provider == "TensorrtExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                for provider in providers
            )
            has_tensorrt_rtx = any(
                provider == "NvTensorRTRTXExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                for provider in providers
            )
            if has_tensorrt and has_tensorrt_rtx:
                raise ValueError(
                    "Cannot enable both 'TensorrtExecutionProvider' and 'NvTensorRTRTXExecutionProvider' "
                    "in the same session."
                )
        # Tensorrt and TensorRT RTX can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "NvTensorRTRTXExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "NvTensorRTRTXExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "NvTensorRTRTXExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        elif "TensorrtExecutionProvider" in available_providers:
            if (
                providers
                and any(
                    provider == "CUDAExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                    for provider in providers
                )
                and any(
                    provider == "TensorrtExecutionProvider"
                    or (isinstance(provider, tuple) and provider[0] == "TensorrtExecutionProvider")
                    for provider in providers
                )
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        # Print a warning if user passed providers to InferenceSession() but the SessionOptions instance
        # already has provider information (e.g., via add_provider_for_devices()). The providers specified
        # here will take precedence.
        if self._sess_options is not None and (providers or provider_options) and self._sess_options.has_providers():
            warnings.warn(
                "Specified 'providers'/'provider_options' when creating InferenceSession but SessionOptions has "
                "already been configured with providers. InferenceSession will only use the providers "
                "passed to InferenceSession()."
            )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
            sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
    
        if disabled_optimizers is None:
            disabled_optimizers = set()
        elif not isinstance(disabled_optimizers, set):
            # convert to set. assumes iterable
            disabled_optimizers = set(disabled_optimizers)
    
        # initialize the C++ InferenceSession
>       sess.initialize_session(providers, provider_options, disabled_optimizers)
E       onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : qdq_actions.cc:137 TransposeDQWeightsForMatMulNBits Missing required scale: model.decoder.embed_tokens.weight_merged_0_scale for node: model.decoder.embed_tokens.weight_transposed_DequantizeLinear

.venv/lib/python3.14.../onnxruntime/capi/onnxruntime_inference_collection.py:636: Fail
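
The fallback-provider branch repeated in each trace above can be condensed into a short pure-Python sketch. It is only an illustration of the logic shown in `_create_inference_session`: the helper names `has_provider` and `fallback_providers` are hypothetical and not part of onnxruntime. The failure itself is raised later, in `sess.initialize_session`, so provider selection does not appear to be the cause.

```python
from typing import Optional, Sequence, Tuple, Union

# A provider entry is either a bare name or a (name, options) tuple.
Provider = Union[str, Tuple]


def has_provider(providers: Optional[Sequence[Provider]], name: str) -> bool:
    """Hypothetical helper: True if `name` is listed as a string or as a tuple's first item."""
    return any(
        p == name or (isinstance(p, tuple) and p[0] == name)
        for p in providers or ()
    )


def fallback_providers(
    providers: Optional[Sequence[Provider]], available: Sequence[str]
) -> list:
    """Hypothetical helper mirroring the if/elif/else chain shown in the trace."""
    # TensorRT RTX is checked first, then TensorRT, matching the order in the trace.
    for trt in ("NvTensorRTRTXExecutionProvider", "TensorrtExecutionProvider"):
        if trt in available:
            # CUDA is a fallback only when both CUDA and the TensorRT variant were requested.
            if has_provider(providers, "CUDAExecutionProvider") and has_provider(providers, trt):
                return ["CUDAExecutionProvider", "CPUExecutionProvider"]
            return ["CPUExecutionProvider"]
    # Every other configuration falls back to CPU only.
    return ["CPUExecutionProvider"]


requested = ["CoreMLExecutionProvider", "AzureExecutionProvider", "CPUExecutionProvider"]
print(fallback_providers(requested, available=requested))
```

With the providers listed in the trace, neither TensorRT variant is available, so the sketch takes the CPU-only branch.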

Labels

dependencies: Pull requests that update a dependency file
python:uv: Pull requests that update python:uv code
