Skip to content

fix(iorails): Apply inference-time llm_params on top of Model.parameters in ModelEngine#2020

Merged
tgasser-nv merged 3 commits into
developfrom
fix/iorails-model-param-merge
Jun 12, 2026
Merged

fix(iorails): Apply inference-time llm_params on top of Model.parameters in ModelEngine#2020
tgasser-nv merged 3 commits into
developfrom
fix/iorails-model-param-merge

Conversation

@tgasser-nv

@tgasser-nv tgasser-nv commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Description

IORails was silently dropping llm_params sampling fields configured in a model's parameters (i.e. Model.parameters). It took GenerationOptions.llm_params at inference-time only.

This PR filters out reserved parameters (_RESERVED_LLM_PARAMETERS) in the Model.parameters dict and stores these per-model so they're used on every call. If GenerationOptions.llm_params keys are provided they overwrite the Model.parameters values.

Related Issue(s)

#2009 Introduced LLM span attributes including parameters.

Test Plan

Pre-commit

$ poetry run pre-commit run --all-files
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
ruff (legacy alias)......................................................Passed
ruff format..............................................................Passed
Insert license in comments...............................................Passed
pyright..................................................................Passed

Unit-test

$ make test
env -u OPENAI_API_KEY -u NVIDIA_API_KEY -u LIVE_TEST -u LIVE_TEST_MODE -u TEST_LIVE_MODE poetry run pytest -n auto --dist worksteal
========================================================== test session starts ==========================================================
platform darwin -- Python 3.13.2, pytest-8.4.2, pluggy-1.6.0
rootdir: /Users/tgasser/projects/nemo_guardrails_worktree/fix/iorails-model-param-merge
configfile: pytest.ini
testpaths: tests, docs/colang-2/examples, benchmark/tests
plugins: anyio-4.12.1, langsmith-0.7.12, xdist-3.8.0, httpx-0.35.0, asyncio-0.26.0, profiling-1.8.1, cov-7.0.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=function, asyncio_default_test_loop_scope=function
10 workers [4820 items]
................................................................................................................................. [  2%]
................................................................................................................................. [  5%]
......................................................................................................s.sss...................... [  8%]
................................................................................................................................. [ 10%]
................................................................................................................................. [ 13%]
................................................................................................................................. [ 16%]
.................................................................................ssssssss.........ssssss...ss...s................ [ 18%]
.....................s.s..s..sssss............................................................................................... [ 21%]
............s...................s...s............................................................................................ [ 24%]
......................................ss.............s...............s........s....s......s...................................... [ 26%]
....s.......................................................................................................ss................... [ 29%]
................................................................................................................................. [ 32%]
................................................................................................................................. [ 34%]
....................................................................s............................................................ [ 37%]
................................................................................................................................. [ 40%]
.....................................sss.ss...........................................sssssss........................ssssssssssss [ 42%]
ssssss.........................ss.......................................s........................................................ [ 45%]
...........ss......s............................................................................................................. [ 48%]
...ss......................s.s.sss............................................................s.................................. [ 50%]
...............................................................s.s.sss........................................................... [ 53%]
.............................................................................................s................................... [ 56%]
................................................................................................................................. [ 58%]
................................................................................................................................. [ 61%]
................................................................................................................................. [ 64%]
................................................................................................................................. [ 66%]
................................................................................................................................. [ 69%]
................................................................................................................................. [ 72%]
...........................................................s......................................................s.............s [ 74%]
sss.ssssss..............................................................................ss....................................... [ 77%]
........................................................................sssss.ss.............sssssss.s........................... [ 80%]
..............................s..s................................s.............................................................. [ 82%]
.........ssssssss.............................................................sss.........s.s.................................... [ 85%]
................s..............................s.s.........ss........s........................................................... [ 88%]
...........s...........................................ssss.sss..ss...ssss.s.....ssss...s........................................ [ 90%]
................................................................................................................................. [ 93%]
......ssss...s...............................................................................ssssss.ss........................... [ 96%]
......ss..........................s.............................................................................................. [ 99%]
...............................................                                                                                   [100%]
================================================== 4640 passed, 180 skipped in 37.63s ===================================================

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

….parameters (after filtering out reserved parameters)
@codecov

codecov Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@tgasser-nv tgasser-nv marked this pull request as ready for review June 12, 2026 16:12
@greptile-apps

greptile-apps Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a bug where Model.parameters sampling fields (e.g. temperature, max_tokens) were silently dropped by IORails — only per-call GenerationOptions.llm_params were forwarded to the inference endpoint. The fix introduces _RESERVED_LLM_PARAMETERS to strip transport/secret/streaming-control keys from Model.parameters, stores the rest as an immutable MappingProxyType on each ModelEngine, and merges it under per-call kwargs in both model_call and stream_model_call.

  • model_engine.py: Adds _RESERVED_LLM_PARAMETERS frozenset with documented rationale for each exclusion, and body_param_defaults: Mapping[str, Any] wrapping the filtered params in a MappingProxyType so the snapshot is immutable after __init__.
  • engine_registry.py: Both model_call and stream_model_call compute merged_params = {**engine.body_param_defaults, **kwargs} before passing to chat_completion/stream_chat_completion and set_llm_request_attributes, so the span reflects the actual sent parameters.
  • Tests: Comprehensive coverage in TestModelEngineBodyParamDefaults and TestEngineRegistryParameterDefaults validates filtering, precedence, and the stream duplicate-keyword guard.

Confidence Score: 5/5

Safe to merge — the change is a targeted bugfix with no observable regressions for callers that set no Model.parameters, and a correct improvement for those that do.

The merge logic is straightforward ({**defaults, **overrides}), reserved-key exclusions are well-documented and cover all keys that could collide with explicit body or stream arguments, MappingProxyType prevents shared-state mutation, and the test suite validates every combination including precedence order and the stream duplicate-keyword guard.

No files require special attention.

Important Files Changed

Filename Overview
nemoguardrails/guardrails/model_engine.py Adds _RESERVED_LLM_PARAMETERS frozenset and body_param_defaults (immutable MappingProxyType) to ModelEngine.__init__, correctly filtering transport/secret/streaming keys from Model.parameters before exposing them as per-request body defaults.
nemoguardrails/guardrails/engine_registry.py Both model_call and stream_model_call now compute merged_params = {**engine.body_param_defaults, **kwargs} before passing to the engine and span attributes, correctly giving per-call kwargs priority over config-level defaults.
tests/guardrails/test_model_engine.py New TestModelEngineBodyParamDefaults class covers all reserved-key categories (transport, secret, streaming-control, identity, client-only) and the sampling-pass-through case with isolated unit tests.
tests/guardrails/test_engine_registry.py New TestEngineRegistryParameterDefaults class covers both non-streaming and streaming paths including precedence ordering, reserved-key exclusion, and the stream duplicate-keyword guard.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant EngineRegistry
    participant ModelEngine

    Note over Caller,ModelEngine: Before this PR — Model.parameters dropped
    Caller->>EngineRegistry: "model_call(model_type, messages, **llm_params)"
    EngineRegistry->>ModelEngine: "chat_completion(messages, **llm_params)"
    Note right of ModelEngine: Model.parameters ignored

    Note over Caller,ModelEngine: After this PR — parameters merged
    Caller->>EngineRegistry: "model_call(model_type, messages, **llm_params)"
    EngineRegistry->>EngineRegistry: "merged = {**body_param_defaults, **llm_params}"
    Note right of EngineRegistry: body_param_defaults = Model.parameters minus _RESERVED_LLM_PARAMETERS
    EngineRegistry->>ModelEngine: "chat_completion(messages, **merged)"
    Note right of ModelEngine: Config defaults + per-call overrides both applied
Loading

Reviews (3): Last reviewed commit: "Add default_headers and default_query to..." | Re-trigger Greptile

Comment thread nemoguardrails/guardrails/model_engine.py
Comment thread nemoguardrails/guardrails/model_engine.py Outdated
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Model parameter defaults are now computed and filtered during engine initialization, retaining only sampling/body parameters while excluding reserved transport and streaming keys. When EngineRegistry invokes model calls, it merges these model-configured defaults with per-call kwargs, with per-call values taking precedence, and applies the merged parameters to both LLM span request attributes and the actual chat-completion call.

Changes

Model Parameter Defaults and Merging

Layer / File(s) Summary
ModelEngine reserved parameters and body_param_defaults
nemoguardrails/guardrails/model_engine.py, tests/guardrails/test_model_engine.py
Defines _RESERVED_LLM_PARAMETERS enumerating transport/retry, API key, streaming control, and identity/model/message fields that must not appear in request bodies. During ModelEngine.__init__, computes body_param_defaults by filtering Model.parameters to exclude reserved keys, exposing sampling parameters as per-request defaults. Tests verify sampling parameters are preserved and reserved/transport/secret keys are excluded.
EngineRegistry parameter merging in model_call and stream_model_call
nemoguardrails/guardrails/engine_registry.py, tests/guardrails/test_engine_registry.py
Both model_call and stream_model_call merge engine.body_param_defaults with per-call kwargs, with per-call values overriding defaults. Merged parameters populate LLM span request attributes and are passed to engine.chat_completion or engine.stream_chat_completion. Tests verify config defaults populate request bodies and span attributes, per-call overrides take precedence, and streaming path correctly excludes reserved keys while preserving sampling defaults.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Test Results For Major Changes ⚠️ Warning PR description provided has no test results/CI or regression/perf notes—only a checklist—despite changing LLM parameter merging behavior. Update PR description to include testing info (e.g., pytest/CI run, affected suites) and, if relevant, any performance/numeric/regression validation before/after.
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately describes the main change: applying inference-time llm_params on top of Model.parameters in ModelEngine, which is the core objective addressed across engine_registry.py and model_engine.py.
Docstring Coverage ✅ Passed Docstring coverage is 90.91% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/iorails-model-param-merge

Comment @coderabbitai help to get the list of available commands and usage tips.

@tgasser-nv tgasser-nv self-assigned this Jun 12, 2026
@tgasser-nv tgasser-nv requested review from Pouyanpi and cparisien June 12, 2026 16:42
@Pouyanpi Pouyanpi added this to the v0.23.0 milestone Jun 12, 2026

@Pouyanpi Pouyanpi left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍🏻

a side note:

IORails does not use the shared OpenAI-compatible client yet, so I don’t think we need to fully support default_headers or default_query here. But since these are client only options, can we add them to the reserved (filter) list so they don’t get forwarded as chat completion body fields? that keeps current behavior safe and leave proper client wiring for a (potential) future refactor.

@tgasser-nv tgasser-nv merged commit 81df835 into develop Jun 12, 2026
7 checks passed
@tgasser-nv tgasser-nv deleted the fix/iorails-model-param-merge branch June 12, 2026 21:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants