support embedding models in model validation program#1036

Merged
Raghul-M merged 9 commits into opendatahub-io:main from edwardquarm:vision-model-inference
Jan 21, 2026
@edwardquarm edwardquarm commented Jan 19, 2026

Description

jira: https://issues.redhat.com/browse/RHOAIENG-46614

Support embedding models for model validation

  • Update OpenAI embeddings request payload to accept list input and use valid fields (input + encoding_format).
  • Fix embeddings response parsing to read from data instead of choices.
  • Adjust embedding inference helper to pass the correct query shape and rely on client model_name.
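As a rough illustration of the first two points, the sketch below builds an embeddings request body with a list `input` and reads the result from `data` rather than `choices`. The helper names `build_embeddings_payload` and `parse_embeddings_response` are hypothetical and are not taken from `utilities/plugins/openai_plugin.py`; the sample response only mimics the shape shown in the test log.

```python
from typing import Any, Optional


def build_embeddings_payload(
    model_name: str,
    texts: list[str],
    encoding_format: Optional[str] = None,
) -> dict[str, Any]:
    """Build a body for POST /v1/embeddings (hypothetical helper, not the plugin's code)."""
    payload: dict[str, Any] = {"model": model_name, "input": texts}
    # encoding_format is optional in the OpenAI embeddings API ("float" or "base64").
    if encoding_format is not None:
        payload["encoding_format"] = encoding_format
    return payload


def parse_embeddings_response(message: dict[str, Any]) -> dict[str, Any]:
    """Return the first entry of the 'data' list; embeddings responses have no 'choices'."""
    data = message.get("data")
    if not data:
        raise ValueError("embeddings response has no 'data' entries")
    return data[0]


payload = build_embeddings_payload("granite-embedding-en", ["Knowledge is power."], "float")

# Hand-written response with the same keys as the logged entry; values are illustrative.
sample_response = {
    "object": "list",
    "model": "granite-embedding-en",
    "data": [{"index": 0, "object": "embedding", "embedding": [0.1, -0.2, 0.3]}],
}
first_entry = parse_embeddings_response(sample_response)
print(payload)
print(first_entry["embedding"])
```

Keeping the parsing in one small function makes the `data` versus `choices` distinction easy to test without a running model server.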

How Has This Been Tested?

uv run pytest -vv tests/model_serving/model_runtime/model_validation/test_modelvalidation.py \
  --model_car_yaml_path=./tests/model_serving/model_runtime/model_validation/sample_modelcar_config.yaml \
  --vllm-runtime-image=registry.redhat.io/rhaiis/vllm-cuda-rhel9@sha256:094db84a1da5e8a575d0c9eade114fa30f4a2061064a338e3e032f3578f8082a \
  --supported-accelerator-type=Nvidia \
  --registry-host=registry.stage.redhat.io \
  --snapshot-update

Results:

--------------------------------------------------------------------------------------------------------------------------- CALL ----------------------------------------------------------------------------------------------------------------------------
Running embedding inference test
2026-01-19T17:15:44.132431 tests.model_serving.model_runtime.utils INFO granite-embedding-en-predictor-f74c49854-7kvgn
2026-01-19T17:15:44.132778 tests.model_serving.model_runtime.utils INFO Using port forwarding for inference on pod: granite-embedding-en-predictor-f74c49854-7kvgn
Sending HTTP request to http://localhost:8080/v1/embeddings with data: {'input': 'What are the key benefits of renewable energy sources compared to fossil fuels?', 'model': 'granite-embedding-en'}
Sending HTTP request to http://localhost:8080/v1/embeddings with data: {'input': "Translate the following English sentence into Spanish, German, and Mandarin: 'Knowledge is power.'", 'model': 'granite-embedding-en'}
2026-01-19T17:15:44.522573 utilities.plugins.openai_plugin INFO <Response [200]>
2026-01-19T17:15:44.523103 utilities.plugins.openai_plugin INFO {'index': 0, 'object': 'embedding', 'embedding': [-0.1943359375, -0.6171875, -0.470703125, -0.294921875, -0.76953125, 0.1572265625, 0.671875, -1.015625, 0.37890625, -0.07080078125, 0.2294921875, -0.220703125, -0.6171875, -0.1396484375, 0.09130859375, 2.140625, 0.69921875, 0.291015625, -0.5859375, -0.36328125, 0.48046875, -0.419921875, 0.7578125, 0.208984375, 0.51953125, 0.765625, -0.1767578125, -0.19140625, 0.1826171875, 1.265625, 0.306640625, -0.39453125, -1.0859375, 1.65625, -0.90234375, 0.7734375, -1.015625, -0.07470703125, -0.80078125, 0.455078125, 0.1376953125, -1.7421875, -1.0703125, 0.3359375, 0.0732421875, -1.1640625, -0.451171875, 0.482421875, -1.75, 0.388671875, 0.1279296875, -0.189453125, -0.197265625, -0.578125, -0.57421875, 1.015625, 0.384765625, 0.734375, 0.86328125, -0.41796875, 0.150390625, -0.64453125, -0.28125, -0.79296875, -0.77734375, 0.96484375, -1.171875, 0.416015625, -0.2353515625, 0.32421875, 0.220703125, 0.275390625, 0.61328125, -0.1318359375, -1.234375, 1.09375, 0.67578125, -0.35546875, -0.4765625, 0.4296875, 1.6015625, 0.4921875, -0.35546875, 1.328125, -0.435546875, -0.3046875, -0.6796875, 0.765625, 0.83203125, -1.1015625, -0.1591796875, -1.2890625, -1.2421875, 0.2265625, 0.400390625, 0.71484375, 1.09375, -0.400390625, 0.1611328125, 0.01104736328125, 1.3828125, 0.72265625, 0.51953125, -1.2421875, 1.3515625, 0.08740234375, -0.8515625, -0.5859375, -0.0281982421875, -0.051513671875, -0.1162109375, 0.56640625, -0.07958984375, 1.03125, 0.60546875, 0.1005859375, 0.310546875, -1.3671875, 0.80859375, -0.2138671875, 0.89453125, -0.1494140625, -0.25390625, -0.69140625, 0.328125, 0.15625, -0.6484375, 1.015625, -0.07275390625, -0.703125, 0.00171661376953125, 0.326171875, -0.181640625, 1.1328125, 0.44921875, -0.388671875, 0.365234375, 0.416015625, 1.1953125, 1.1328125, -0.267578125, 1.5078125, 1.0625, -0.1884765625, 1.109375, -0.427734375, 1.5, 0.421875, 0.193359375, 0.1416015625, 
-0.03759765625, -1.015625, 0.328125, 1.6171875, 0.298828125, 0.45703125, -0.404296875, 0.9765625, 0.095703125, 0.0162353515625, 0.8046875, 0.53515625, -0.66796875, -2.53125, -1.0625, -0.6171875, -0.54296875, -0.48046875, 1.109375, 1.2109375, 0.68359375, 0.82421875, -1.671875, 0.80078125, -0.1640625, 0.44921875, -0.1953125, -0.291015625, 0.2236328125, 0.26171875, -0.4453125, 0.486328125, -0.07666015625, 0.232421875, -0.36328125, -0.96875, 0.60546875, -0.050537109375, -0.62890625, 0.875, 0.67578125, 0.64453125, 1.171875, 0.1875, -0.00225830078125, 0.326171875, -0.1640625, -0.158203125, 0.14453125, 1.5546875, -1.203125, 0.177734375, 0.375, 1.265625, -1.0859375, 0.435546875, -0.03173828125, 0.302734375, -0.068359375, 1.3515625, 0.056640625, 0.40625, 0.859375, 0.51171875, 1.015625, -0.68359375, 0.271484375, 0.298828125, -0.390625, -1.25, 1.3515625, 0.671875, -0.412109375, -0.5078125, -0.1982421875, -0.578125, 0.77734375, -0.58984375, -1.8515625, -1.3359375, -0.478515625, 0.49609375, 0.97265625, 1.7421875, -1.078125, -0.1572265625, 0.2412109375, -0.00640869140625, 0.59765625, -0.203125, 0.318359375, -1.6953125, -0.62890625, 0.494140625, 1.4296875, 0.5, 0.1455078125, -0.1845703125, -0.73828125, 0.05029296875, -1.390625, -11.5, -0.625, -1.2265625, -0.068359375, 0.8125, -0.578125, 0.55078125, -0.58203125, -1.4296875, -1.703125, -1.7265625, -0.1044921875, 0.09716796875, 0.31640625, -0.306640625, -1.921875, -1.4609375, 2.109375, 0.30859375, 0.166015625, -0.96484375, -1.0234375, -0.60546875, 0.70703125, -2.0, -1.1640625, 0.9453125, 1.0859375, 0.734375, 0.2890625, -0.12060546875, -0.1953125, 0.1494140625, 0.97265625, -1.53125, -0.0625, 0.0439453125, -0.29296875, -0.50390625, 0.6796875, -1.0, -0.71875, -0.2119140625, 0.326171875, -0.05908203125, -0.6171875, 0.59765625, -1.0546875, 0.69921875, -0.53515625, -2.765625, 0.470703125, -0.1181640625, -0.48828125, 0.65234375, -1.53125, -0.41796875, 0.1728515625, 0.384765625, 0.859375, -0.609375, 0.2197265625, 0.6171875, 0.3828125, 
6.09375, -0.2001953125, 0.4140625, 0.439453125, -1.0078125, -0.01361083984375, -1.7109375, -0.2578125, 0.54296875, -0.23046875, 0.6328125, 0.578125, 0.0791015625, 0.53515625, 0.6796875, 1.078125, 0.1318359375, 1.9609375, 0.6953125, -0.54296875, -0.02294921875, -0.0791015625, -0.3671875, -0.71875, 0.376953125, 0.77734375, -0.27734375, -0.53515625, 0.2109375, -0.30078125, -0.04443359375, 1.0703125, 0.2216796875, -0.345703125, -0.25390625, -0.75390625, 0.8828125, 0.2734375, 0.058349609375, 0.82421875, -1.0546875, 0.2890625, -0.369140625, 0.08544921875, -0.03271484375, -0.9765625, -1.0859375, -0.1669921875, 0.26171875, 0.6796875, 0.62109375, 0.28515625, -0.765625, -0.240234375, 0.119140625, -0.224609375, -0.9453125, -1.078125, 0.63671875, 0.09228515625, 1.984375, 0.9453125, 1.1640625, -1.15625, 1.078125, 0.52734375, 0.017578125, 0.65234375, 0.03662109375, -0.78125, -0.451171875, -0.921875, 0.37109375, -0.68359375, 1.53125, -0.0361328125, -1.2265625, 0.291015625, 1.125, 1.1015625, -0.546875, -0.6640625, 0.06787109375, -0.09716796875, -0.72265625, 2.453125, 0.02490234375, 0.50390625, 1.546875, -0.859375, -0.44140625, -0.314453125, 0.22265625, 0.16796875, -1.0859375, -0.02978515625, 0.220703125, -0.1728515625, -1.484375, -0.162109375, 0.4296875, 0.042724609375, -0.326171875, -1.03125, 0.0634765625, -0.93359375, 0.287109375, 2.0625, 0.2353515625, 1.2109375, -1.171875, -0.5078125, -0.765625, -0.703125, 0.1484375, -0.1728515625, 0.5390625, -0.71484375, -0.51953125, -1.7890625, 0.74609375, 0.283203125, 1.015625, -1.015625, 0.55078125, -0.12890625, -0.271484375, 0.8984375, 0.349609375, -0.8125, -0.345703125, -0.8671875, -1.328125, 0.81640625, 2.859375, 0.365234375, -0.73046875, -0.5234375, 1.1171875, 0.2216796875, 1.5, -0.419921875, 0.1787109375, -0.53515625, 0.1513671875, -0.63671875, -0.890625, 0.9375, -0.482421875, -0.6640625, 0.2060546875, -0.5703125, -2.15625, -1.203125, -0.275390625, 0.47265625, -0.87109375, 0.032958984375, -0.2060546875, 0.390625, 0.9921875, 1.1640625, 
1.1015625, -0.796875, 0.1884765625, 0.248046875, -1.1328125, -1.2109375, -1.6796875, -0.279296875, 0.337890625, -0.404296875, -0.310546875, 0.1728515625, -0.439453125, -0.030029296875, -0.98828125, 0.076171875, 0.1259765625, -0.671875, 0.279296875, -0.470703125, 13.5625, -0.8671875, -0.7265625, 0.828125, 0.435546875, -0.054443359375, 1.171875, -0.040771484375, -0.6953125, -0.35546875, -0.0546875, -0.8046875, -1.9140625, 1.21875, 0.369140625, -0.36328125, -0.7734375, 0.35546875, -0.69140625, 0.5625, -0.1474609375, -0.0361328125, -0.7890625, -0.490234375, 0.400390625, 1.03125, -1.1640625, 0.2109375, 0.97265625, 0.484375, -0.1044921875, -0.890625, 1.09375, -2.109375, 0.494140625, 0.166015625, -0.5546875, -0.1123046875, -1.046875, 1.203125, -0.19140625, -0.50390625, -1.1953125, -0.54296875, -0.12158203125, -0.23828125, -0.1142578125, -0.79296875, -0.76953125, 0.375, -0.455078125, -0.26953125, -0.390625, 0.29296875, -0.421875, 0.11328125, -1.25, 0.478515625, -1.234375, -0.6796875, 1.296875, -0.470703125, 0.59375, -0.314453125, 0.96484375, -1.5625, -0.62890625, 0.208984375, 0.0458984375, -0.2353515625, -0.640625, 0.4609375, 0.08544921875, 0.515625, -0.765625, -0.466796875, -1.6953125, 0.63671875, 0.380859375, 0.0006103515625, 1.2578125, 0.32421875, -1.0703125, -0.53125, 0.48046875, -0.0927734375, 1.9609375, -0.474609375, -0.279296875, -0.984375, 0.3984375, -0.671875, -0.55859375, -2.203125, 0.0478515625, -0.9296875, 0.478515625, -0.059326171875, -1.1328125, -0.5390625, -0.259765625, 1.1484375, -0.0020294189453125, 0.353515625, -0.072265625, 0.8671875, -0.7421875, -1.421875, -0.90625, 1.9140625, 0.72265625, -0.43359375, 0.72265625, -1.09375, 0.30859375, 0.09716796875, 0.37890625, 0.6796875, 0.73046875, 0.07421875, -0.306640625, 1.09375, -0.10498046875, 1.4453125, 0.1396484375, 2.015625, 0.62109375, -1.890625, -0.96484375, -0.34765625, -0.09375, -0.5859375, 0.32421875, -0.024658203125, 0.380859375, 0.046630859375, 0.9296875, -0.5234375, -0.376953125, 1.71875, -0.35546875, 
0.032470703125, -0.055908203125, 0.625, 0.28515625, 1.1796875, -1.234375, -1.1171875, 0.9375, -0.84375, 1.3203125, -0.12451171875, 0.1669921875, 0.208984375, -0.52734375, 1.421875, -1.4375, -0.0001277923583984375, -0.2353515625, 0.13671875, -1.1015625, -0.2021484375, 1.296875, 0.318359375, -0.8671875, -0.8046875, -1.078125, 0.310546875, 0.89453125, 0.46875, -0.022216796875, -1.3515625, -0.578125, -0.84375, 0.51171875, -0.072265625, -0.5078125, -0.8671875, 0.1650390625, 0.34375, -0.1435546875, 0.06298828125, -0.06982421875, -0.1494140625, 0.71875, -0.283203125, -0.578125, 0.419921875, -0.1943359375, 0.2373046875, 0.26171875, 0.130859375, -0.1923828125, -0.671875, 0.203125, -0.578125, -0.220703125, 1.0, 1.2890625, 0.146484375, 0.8046875, -0.3203125, 0.1904296875, 0.65234375, -0.345703125, 0.15625, -0.7890625, 0.5546875, 0.62109375, 0.8671875, -0.85546875, 1.15625, 0.55859375, -1.15625, 1.3515625, 0.22265625, 0.70703125, 0.8203125, 0.11083984375, -0.0113525390625, -0.1884765625, -0.166015625, -0.59765625, 0.216796875, -0.166015625, -0.6328125, -0.76171875, 0.6328125, -0.9453125, -0.0751953125, -1.328125, 0.169921875, 2.25, 0.2373046875, 0.10400390625, 0.13671875, 0.369140625, -1.234375, -1.828125, 0.70703125, 0.099609375, 0.9296875, -0.5390625, 0.58984375, 1.1328125, -0.369140625, -0.55078125, 0.921875, 0.578125, 9.9375, 0.208984375, -0.65234375, 0.81640625, 1.171875, -0.68359375, -0.94921875, 0.88671875, -0.050537109375, -0.53125, -0.578125, 0.07763671875, -0.8203125, 0.44140625, 0.2294921875, 0.392578125, -0.55859375, -0.158203125, -0.1884765625, -0.37109375, 0.8984375, -0.365234375]}

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • New Features

    • Added embedding inference and validation to the testing framework.
    • Support for running embedding requests via URL or port-forward modes.
    • Integrated embedding flow into raw model inference testing and included a set of sample embedding queries.
  • Bug Fixes

    • Corrected embedding response parsing to extract embedding data from API responses.
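
The two modes mentioned above differ mainly in which base address the request is sent to. A minimal sketch, assuming a hypothetical `embeddings_url` helper and mode names (`"url"`, `"port-forward"`) that are not taken from the repository; the port-forward address matches the one in the test log:

```python
from typing import Optional

EMBEDDINGS_PATH = "/v1/embeddings"


def embeddings_url(mode: str, base_url: Optional[str] = None, local_port: int = 8080) -> str:
    """Pick the embeddings endpoint for a given access mode (illustrative only)."""
    if mode == "port-forward":
        # Port forwarding exposes the predictor pod on localhost.
        return f"http://localhost:{local_port}{EMBEDDINGS_PATH}"
    if mode == "url":
        if not base_url:
            raise ValueError("base_url is required in 'url' mode")
        # Avoid a double slash when the base URL ends with '/'.
        return base_url.rstrip("/") + EMBEDDINGS_PATH
    raise ValueError(f"unknown mode: {mode!r}")


print(embeddings_url("port-forward"))
print(embeddings_url("url", base_url="https://model.example.test/"))
```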

@github-actions

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, add /wip in a comment. To remove the WIP mark, comment /wip cancel on the PR.
  • To block merging of a PR, add /hold in a comment. To unblock merging, comment /hold cancel.
  • To mark a PR as approved, add /lgtm in a comment. To remove the approval, add /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified on the PR. To un-verify it, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name> on the PR. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR is created and linked to the current PR.
  • To build and push an image to Quay, add /build-push-pr-image in a comment. This creates an image with the tag
    pr-<pr_number> in the Quay repository. The image tag is deleted when the PR is merged or closed.
Supported labels

{'/cherry-pick', '/verified', '/hold', '/wip', '/lgtm', '/build-push-pr-image'}

@github-actions github-actions Bot added size/m and removed size/l labels Jan 20, 2026
@edwardquarm edwardquarm marked this pull request as ready for review January 20, 2026 20:48
@coderabbitai

coderabbitai Bot commented Jan 20, 2026

Walkthrough

Adds embedding inference support: introduces EMBEDDING_QUERY constant (duplicated in the module), implements embedding validation and execution (URL and port-forward modes), plugs embedding flow into raw inference validation, and adjusts OpenAI plugin request/response handling for embeddings.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Embedding Constants<br>tests/model_serving/model_runtime/model_validation/constant.py | Adds EMBEDDING_QUERY constant (list of { "text": ... } prompts). The constant is defined twice in the same module; the second definition overrides the first. |
| Embedding Inference Utilities<br>tests/model_serving/model_runtime/utils.py | Adds validate_embedding_inference_output(model_info, embedding_responses) and run_embedding_inference(...) supporting URL and port-forward modes; integrates embedding path into validate_raw_openai_inference_request() and imports EMBEDDING_QUERY. |
| OpenAI Plugin Updates<br>utilities/plugins/openai_plugin.py | For EMBEDDINGS endpoints, removes encoding_format and temperature from constructed request payloads and changes response parsing to use message["data"][0] instead of message["choices"][0]. |
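
To make the validation step more concrete, here is a minimal structural check on a single embedding entry of the shape logged in the PR description (`index`, `object`, `embedding`). This is only a sketch: `check_embedding_entry` is a hypothetical name, and it is not the repository's `validate_embedding_inference_output`, whose actual behavior is not shown on this page.

```python
import math
from typing import Any


def check_embedding_entry(entry: dict[str, Any]) -> list[float]:
    """Return the vector if the entry looks like a well-formed embedding (illustrative only)."""
    if entry.get("object") != "embedding":
        raise ValueError(f"unexpected object type: {entry.get('object')!r}")
    vector = entry.get("embedding")
    if not isinstance(vector, list) or not vector:
        raise ValueError("'embedding' must be a non-empty list")
    for value in vector:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric embedding value: {value!r}")
        if not math.isfinite(value):
            raise ValueError(f"non-finite embedding value: {value!r}")
    return vector


# First three values copied from the logged response; the real vector is much longer.
entry = {"index": 0, "object": "embedding", "embedding": [-0.1943359375, -0.6171875, -0.470703125]}
vector = check_embedding_entry(entry)
print(len(vector))
```

A check like this only guards the response shape; comparing the values themselves against a stored reference is a separate concern.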

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding support for embedding models in the model validation program, which aligns with the core modifications across all three affected files. |


Comment thread tests/model_serving/model_runtime/utils.py Outdated
Comment thread tests/model_serving/model_runtime/utils.py Outdated
Signed-off-by: Edward Arthur Quarm Jnr <equarmjn@redhat.com>
Signed-off-by: Edward Arthur Quarm Jnr <equarmjn@redhat.com>
@Raghul-M Raghul-M enabled auto-merge (squash) January 21, 2026 16:29
@Raghul-M Raghul-M merged commit bf9f169 into opendatahub-io:main Jan 21, 2026
9 checks passed
@github-actions

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.

mwaykole pushed a commit to mwaykole/opendatahub-tests that referenced this pull request Jan 23, 2026
…1036)

* support embedding models in model validation program

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* revert modelcar config file changes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove print statement

Signed-off-by: Edward Arthur Quarm Jnr <equarmjn@redhat.com>

* meaningful logging messages

Signed-off-by: Edward Arthur Quarm Jnr <equarmjn@redhat.com>

---------

Signed-off-by: Edward Arthur Quarm Jnr <equarmjn@redhat.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
4 participants