feat(compat): transitional compat layer to migrate from 0.21 to 0.22+ #1841

Merged
Pouyanpi merged 7 commits into develop from fix/llm-default-framework-shape-validator
May 6, 2026
Conversation

@Pouyanpi

@Pouyanpi Pouyanpi commented Apr 30, 2026

Collaborator

Description [Generated by Copilot]

Two-layer safeguard for users carrying 0.21 LangChain-style configs into the
0.22 default framework:

  1. Boot-time shape detector (nemoguardrails/_compat/langchain_kwargs.py).
    Replaces the previous hand-maintained per-provider alias map with a generic
    pattern matcher. Detects two shapes in model.parameters when the active
    framework is default:
    • LangChain BaseChatModel Python flags: streaming, disable_streaming,
      verbose, cache, callbacks, tags, metadata, name, model_kwargs.
    • Provider-prefixed credential/URL aliases via regex
      ^(?P<prefix>[a-zA-Z]\w*?)_(?P<canonical>api_key|base_url|api_base|endpoint)$,
      emitting a remediation rename to canonical api_key / base_url.
    Raises a ValueError at LLMRails construction with two paths surfaced:
    either adapt the config to OpenAI-compatible shape, or set
    NEMOGUARDRAILS_LLM_FRAMEWORK=langchain to keep 0.21 behavior. Sunset in 0.23.0.
  2. Wire-level HTTP 400/422 enrichment (nemoguardrails/llm/clients/_errors.py).
    When OpenAICompatibleClient receives a 400 or 422 whose body matches one
    of _UNKNOWN_PARAM_HINT_TOKENS, append a single hedged migration hint to
    the original provider error. Original message is preserved verbatim before
    the hint. Hint is hedged with "If you upgraded from 0.21..." so users on
    unrelated rejections can ignore it. Same sunset.
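
The alias rule from the first layer can be sketched in a few lines. This is a minimal illustration, not the code in nemoguardrails/_compat/langchain_kwargs.py: the regex is quoted from the description above, while `suggest_rename` and `_CANONICAL_NAME` are hypothetical names, and the collapse of `api_base` and `endpoint` to `base_url` follows the commit notes in this thread.

```python
import re
from typing import Optional

# Pattern quoted from the PR description: a non-empty prefix, an underscore,
# then one of the four credential/URL suffixes.
_PROVIDER_PREFIXED_ALIAS = re.compile(
    r"^(?P<prefix>[a-zA-Z]\w*?)_(?P<canonical>api_key|base_url|api_base|endpoint)$"
)

# Hypothetical lookup table: api_base and endpoint collapse to base_url.
_CANONICAL_NAME = {
    "api_key": "api_key",
    "base_url": "base_url",
    "api_base": "base_url",
    "endpoint": "base_url",
}


def suggest_rename(param_name: str) -> Optional[str]:
    """Return the canonical name to rename to, or None if this is not an alias."""
    match = _PROVIDER_PREFIXED_ALIAS.match(param_name)
    if match is None:
        return None
    return _CANONICAL_NAME[match.group("canonical")]


for name in ("openai_api_key", "azure_endpoint", "api_key", "top_k"):
    print(name, "->", suggest_rename(name))
```

Because the prefix group must contain at least one character before the underscore, the bare canonical names `api_key` and `base_url` do not match, and provider extensions such as `top_k` or `keep_alive` fall through because they lack one of the four suffixes.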

Why two layers

The boot detector catches the recognizable 0.21 shape of a config. The wire enrichment handles the long tail: legitimate provider extensions (NIM nvext, vLLM min_p/top_k/guided_*, Ollama keep_alive) pass through untouched; if the wire rejects something, the user gets the actual provider error message plus migration context with the field name preserved. No enumeration treadmill.

Token list provenance

Methodology used to arrive at the final token list: a probe script hits each provider's /chat/completions with deliberately-malformed payloads (LangChain BaseChatModel flags as request fields, plus snake_case credential aliases and a gibberish field), captures the actual error body, and classifies each result.

The probe script is not part of this PR. The results below are the empirical input that determined the shipping token list.

Empirical results (per provider, all 9 LangChain-shape payloads sent)

OpenAI: every malformed payload returned identical canonical phrasing:

{"error": {"message": "Unrecognized request argument supplied: <field>",
           "type": "invalid_request_error", "param": null, "code": null}}

Status 400. 9/9 cases identical.

NIM (NVIDIA Integrate API):

{"error": {"message": "Validation: Unsupported parameter(s): `<field>`",
           "type": "Bad Request", "code": 400}}

Status 400. 9/9 cases identical.

Groq:

{"error": {"message": "property '<field>' is unsupported",
           "type": "invalid_request_error"}}

Status 400. 9/9 cases identical.

Fireworks (accounts/fireworks/models/llama-v3p3-70b-instruct):

{"error": {"object": "error", "type": "invalid_request_error",
           "code": "invalid_request_error",
           "message": "Extra inputs are not permitted, field: '<field>', value: <value>"}}

Status 400. 9/9 cases identical. The phrase is pydantic v2's literal default validation error; any FastAPI/pydantic-v2-backed proxy is likely to share it.

OpenRouter: all 9 payloads returned status 200 with the unknown field silently dropped and the request forwarded to the upstream model. No validation rejection emitted. Permissive-proxy behavior. Boot-time shape detector covers LangChain-specific cases regardless.

Together AI: same permissive-proxy behavior. Payloads that completed returned status 200; the remainder were rate-limited (Together's 1 QPS account limit; the probe fires sequentially without throttling). No validation phrase to match.

Ollama (self-hosted, localhost:11434/v1, latest version, CPU-only): all 9 payloads returned status 200 with the unknown field silently dropped and the completion produced normally. Same permissive-proxy pattern as OpenRouter and Together AI. The boot-time shape detector is the only safety net for Ollama users with stale LangChain configs since the wire emits no signal.
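
Since the probe script itself is not in this PR, the following is only a guess at how its classification step might look. The function name `classify_probe_result` and the outcome labels are hypothetical. The one thing taken from the results above is that all four rejecting providers nest their text under `error.message`.

```python
import json
from typing import Optional, Tuple


def classify_probe_result(status_code: int, body: str) -> Tuple[str, Optional[str]]:
    """Return (outcome, provider_message) for one probe response."""
    if status_code == 200:
        # Permissive proxies drop the unknown field and complete normally.
        return ("accepted_silently", None)
    if status_code == 429:
        return ("rate_limited", None)
    try:
        message = json.loads(body)["error"]["message"]
    except (ValueError, KeyError, TypeError):
        # Body was not JSON, or did not follow the error.message layout.
        message = None
    if status_code in (400, 422):
        return ("rejected", message)
    return ("other", message)


openai_body = (
    '{"error": {"message": "Unrecognized request argument supplied: streaming", '
    '"type": "invalid_request_error", "param": null, "code": null}}'
)
print(classify_probe_result(400, openai_body))
print(classify_probe_result(200, "{}"))
```

Only the "rejected" outcomes carry a phrase that a token list can match, which is why the permissive providers contribute nothing to the final list.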

Final token list

_UNKNOWN_PARAM_HINT_TOKENS = (
    "unrecognized request argument",       # OpenAI
    "unsupported parameter",               # NIM
    "' is unsupported",                    # Groq (apostrophe-space anchored)
    "extra inputs are not permitted",      # Fireworks (pydantic v2)
)

Same set added to _UNSUPPORTED_PARAMS_KEYWORDS so all four classify consistently as LLMUnsupportedParamsError (matches the existing handling for unsupported parameter/unknown parameter).

Summary by CodeRabbit

  • New Features

    • Configuration validation now occurs during initialization to identify incompatibilities earlier, preventing unexpected runtime failures.
    • Error messages now include specific guidance when configuration parameters are invalid or unsupported.
  • Tests

    • Added comprehensive test coverage for the new validation and error messaging functionality.

Add a transitional validator at LLMRails construction that detects 0.21
LangChain Python-side flags (streaming, verbose, callbacks, model_kwargs,
nvidia_api_key, etc.) in model.parameters when the default framework is
active, raising a clear migration error instead of silently leaking them
to the wire as HTTP 400.

Lives under nemoguardrails/_compat/langchain_kwargs.py with an explicit
0.23.0 sunset; removed in 0.23 along with its single call site in
LLMRails._init_llms. The wire layer is intentionally untouched -
DefaultFramework keeps forwarding `parameters` verbatim, so NIM `nvext`,
vLLM `min_p`/`top_k`/`guided_*`, Ollama `keep_alive`, and other
provider extensions pass through unchanged.

User-facing error:

  Your config has a LangChain-only flag in `parameters` that the default
  framework doesn't forward:

    models[main]: remove `streaming`

  To keep 0.21 LangChain behavior instead, set NEMOGUARDRAILS_LLM_FRAMEWORK=langchain.
  (Migration check; removed in 0.23.0.)
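
A check that produces a message of this shape might look like the sketch below. It makes simplifying assumptions: models are represented as a plain mapping from model type to its `parameters` dict rather than real config objects, and `check_flags_sketch` is a hypothetical name, not the actual `check_langchain_kwargs`. The flag names come from the PR description.

```python
from typing import Any, List, Mapping

# LangChain BaseChatModel Python-side flags listed in the PR description.
_LANGCHAIN_FLAGS = frozenset(
    {
        "streaming", "disable_streaming", "verbose", "cache", "callbacks",
        "tags", "metadata", "name", "model_kwargs",
    }
)


def find_flag_violations(models: Mapping[str, Mapping[str, Any]]) -> List[str]:
    """Collect one remediation line per offending parameter."""
    violations = []
    for model_type, parameters in models.items():
        for key in parameters:
            if key in _LANGCHAIN_FLAGS:
                violations.append(f"models[{model_type}]: remove `{key}`")
    return violations


def check_flags_sketch(models: Mapping[str, Mapping[str, Any]], active_framework: str) -> None:
    """Raise ValueError for LangChain flags, but only under the default framework."""
    if active_framework != "default":
        return
    violations = find_flag_violations(models)
    if not violations:
        return
    lines = "\n".join(f"  {v}" for v in violations)
    raise ValueError(
        "Your config has a LangChain-only flag in `parameters` that the default\n"
        "framework doesn't forward:\n\n"
        f"{lines}\n\n"
        "To keep 0.21 LangChain behavior instead, set "
        "NEMOGUARDRAILS_LLM_FRAMEWORK=langchain.\n"
        "(Migration check; removed in 0.23.0.)"
    )


# Provider extensions are not in the flag set, so this call returns quietly.
check_flags_sketch({"main": {"nvext": {}, "min_p": 0.1}}, "default")
```

Keeping the check a no-op for any non-default framework is what lets the environment-variable escape hatch work without touching the config.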
The 0.21->0.22 LangChain config validator previously enumerated provider
aliases via a hand-maintained `_LANGCHAIN_PROVIDER_ALIASES` dict (only
nim / nvidia_ai_endpoints were mapped). That does not scale; every
LangChain provider has its own snake_case credential or URL kwargs.

Replace the dict with a regex-based shape detector that matches any
`<prefix>_<api_key|base_url|api_base|endpoint>` parameter and emits a
rename remediation. `api_base` and `endpoint` collapse to `base_url`,
matching the OpenAI-compatible client's canonical name. Canonical
`api_key` and `base_url` themselves are not flagged because the regex
requires a non-empty prefix.

Also reframe the error message to make the two remediation paths
(adapt to the default framework vs. keep 0.21 LangChain behavior via
NEMOGUARDRAILS_LLM_FRAMEWORK=langchain) explicit. The 0.23.0 sunset
notice is preserved.

Tests cover representative provider aliases (openai_api_key,
cohere_api_key, azure_endpoint, xyz_base_url, huggingfacehub_api_base),
confirm canonical names do not false-positive, confirm legitimate
provider extensions (nvext, min_p, keep_alive, top_k) pass through,
and confirm the error message contains both remediation paths plus
the sunset notice.
Complement the LLMRails-init validator with a wire-level safety net.
When the OpenAI-compatible client receives an HTTP 400 whose body
mentions one of the unknown-parameter tokens (`unknown parameter`,
`unrecognized`, `extra fields`, `additional properties`,
`is not allowed`), append a single migration hint pointing the user
at NEMOGUARDRAILS_LLM_FRAMEWORK=langchain. The provider's original
error is preserved verbatim; the hint is appended as additional
context so the rejected field name is still visible.

Detection is intentionally conservative: the hint is only appended on
status 400, only when one of the recognized tokens is present, and
never on 200, 401, 429, 5xx, context-window 400s, or generic 400s
(e.g. validation errors on temperature). The wire layer continues to
forward `parameters` verbatim; this is purely error-message
enrichment.

Sunset alignment: this hint will follow the 0.23.0 removal of the
LLMRails-init validator and the langchain framework toggle.
@greptile-apps

greptile-apps Bot commented Apr 30, 2026

Contributor

Greptile Summary

Adds a two-layer compatibility shim for users migrating from 0.21 LangChain-style configs to 0.22's default framework. Both layers are clearly scoped, well-tested, and sunset-tagged for 0.23.0.

  • Boot-time detector (_compat/langchain_kwargs.py): raises ValueError at LLMRails construction when model.parameters carries LangChain Python-side flags or provider-prefixed credential aliases, with actionable rename/remove advice and both remediation paths in the message.
  • Wire-level enrichment (llm/clients/_errors.py): appends a hedged 0.21 migration hint to 400/422 errors whose body matches empirically validated "unknown parameter" phrases from OpenAI, NIM, Groq, and Fireworks; stream_options rejections are correctly routed to include_usage_in_stream=False guidance only, with no hint overlap.

Confidence Score: 5/5

Safe to merge; both layers are additive error-enrichment paths that do not alter successful request handling or change any existing behavior for compliant configs.

All changed code paths are purely additive — they raise errors or append text to error messages. No happy-path logic is touched. The previous thread concerns (broad is-not-allowed token, 422 exclusion, stream_options double-hint) are all addressed in the current diff. Test coverage is thorough across both layers.

The _PROVIDER_PREFIXED_ALIAS regex in _compat/langchain_kwargs.py warrants a second look for the _endpoint suffix breadth, though the impact is limited to the transitional period before 0.23.0.

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemoguardrails/_compat/langchain_kwargs.py | New boot-time compat checker; regex for provider-prefixed aliases could match legitimate `_endpoint`-suffixed parameters, giving misleading rename advice (temporary/sunset check). |
| nemoguardrails/llm/clients/_errors.py | Wire-level 400/422 enrichment with migration hint; stream_options guard is correct, broad is-not-allowed token was removed, 422 is now included. |
| nemoguardrails/rails/llm/llmrails.py | Compat check correctly wired into `_init_llms`; main-model exclusion when constructor LLM is injected is sound and covered by tests. |
| tests/_compat/test_langchain_kwargs.py | Comprehensive parametric tests for the boot-time detector, covering false-positives, multi-violation aggregation, and canonical pass-through. |
| tests/llm/clients/test_openai_compatible_400_enrichment.py | New test file validates per-provider hint injection, stream_options exclusion, 422 inclusion, and original-message preservation; good coverage. |
| tests/test_llmrails.py | Two new integration tests verify constructor-LLM skip logic and main-model validation; cover the key behavioral branches. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[LLMRails.__init__] --> B[_init_llms]
    B --> C{self.llm injected?}
    C -- Yes --> D[models_to_check = non-main models]
    C -- No --> E[models_to_check = all models]
    D --> F[check_langchain_kwargs]
    E --> F
    F --> G{active_framework == default?}
    G -- No --> H[no-op]
    G -- Yes --> I{LangChain flags or provider aliases found?}
    I -- No --> H
    I -- Yes --> J[raise ValueError with remediation paths]
    K[HTTP 400 or 422 from provider] --> L[raise_for_status]
    L --> M[_classify_bad_request]
    M --> N{context_window keyword?}
    N -- Yes --> O[LLMContextWindowError]
    N -- No --> P{unsupported_params keyword?}
    P -- No --> Q[LLMBadRequestError]
    P -- Yes --> R{stream_options in message?}
    R -- Yes --> S[append stream_options guidance only]
    R -- No --> T[_maybe_append_migration_hint]
    T --> U{matches _UNKNOWN_PARAM_HINT_TOKENS?}
    U -- No --> V[original message unchanged]
    U -- Yes --> W[append 0.21 migration hint]
    S --> X[LLMUnsupportedParamsError]
    V --> X
    W --> X
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
nemoguardrails/_compat/langchain_kwargs.py:51
**`_endpoint` suffix matches legitimate provider-specific parameters**

`endpoint` in the canonical alternation will match any parameter ending with `_endpoint` (e.g., `deployment_endpoint`, `inference_endpoint`, `fine_tune_endpoint`). A user on a provider that exposes a legitimate wire-level `deployment_endpoint` parameter would see "rename `deployment_endpoint` to `base_url`" — incorrect migration advice for a parameter that isn't a LangChain alias. The other three suffixes (`_api_key`, `_api_base`, `_base_url`) are LangChain-conventional and lower risk; `_endpoint` is the broadest match here.

### Issue 2 of 2
nemoguardrails/llm/clients/_errors.py:51-66
**`"unknown parameter"` and `"unrecognized parameter"` classify correctly but never show migration hint**

`"unknown parameter"`, `"unrecognized parameter"`, and `"parameter not allowed"` are in `_UNSUPPORTED_PARAMS_KEYWORDS` so they raise `LLMUnsupportedParamsError`, but none appear in `_UNKNOWN_PARAM_HINT_TOKENS` so `_maybe_append_migration_hint` returns the message unchanged for those phrases. A user migrating from 0.21 on a provider that emits `"unknown parameter: streaming"` would get the right exception but no guidance. This is likely an intentional, empirically-driven omission, but if it is intentional a comment stating so would make the asymmetry less surprising to future readers.

Reviews (6): Last reviewed commit: "address review feedbacks" | Re-trigger Greptile

Comment thread nemoguardrails/llm/clients/_errors.py Outdated
Comment thread nemoguardrails/llm/clients/_errors.py Outdated
Comment thread nemoguardrails/_compat/langchain_kwargs.py Outdated
@Pouyanpi Pouyanpi marked this pull request as draft April 30, 2026 14:00
@coderabbitai

coderabbitai Bot commented Apr 30, 2026

Contributor
📝 Walkthrough

This pull request introduces a compatibility/migration helper system to manage LangChain-style parameter deprecation. It adds pre-initialization validation in LLMRails that checks for deprecated configuration patterns, detects provider-specific parameter aliases, and raises errors with clear migration guidance. HTTP 400 error responses are also enriched with migration hints when parameter-related issues are detected.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Compatibility Package: `nemoguardrails/_compat/__init__.py`, `nemoguardrails/_compat/langchain_kwargs.py` | Introduces new compatibility package with validation function `check_langchain_kwargs()` that identifies and reports deprecated LangChain parameters (streaming, verbose, etc.) and provider-prefixed aliases (`*_api_key`, `*_base_url`) in model configurations; marked for removal in version 0.23.0. |
| Error Enrichment: `nemoguardrails/llm/clients/_errors.py` | Augments HTTP 400 error messages that indicate unknown/rejected parameter fields with a migration hint specific to version 0.21, detecting parameter-related errors and appending guidance when applicable. |
| LLMRails Integration: `nemoguardrails/rails/llm/llmrails.py` | Integrates pre-initialization compatibility validation by calling `check_langchain_kwargs()` during `_init_llms()` with the default framework, failing early if deprecated parameters are detected. |
| Compatibility Test Suite: `tests/_compat/__init__.py`, `tests/_compat/test_langchain_kwargs.py` | Adds comprehensive pytest coverage for validation logic, including tests for skipped validation with non-default frameworks, detection of base flags and provider aliases, handling of canonical parameters, and aggregated error messages across multiple models. |
| Error Enrichment Tests: `tests/llm/clients/test_openai_compatible_400_enrichment.py` | Verifies HTTP 400 enrichment behavior with mocked OpenAI-compatible responses, confirming migration hints are appended for parameter-related errors while unrelated 400 errors remain unchanged. |

Sequence Diagram(s)

sequenceDiagram
    participant LLMRails
    participant ValidatorMod as Validator Module
    participant ErrorHandler as Error Handler
    participant HTTPClient as HTTP Client
    
    rect rgba(100, 150, 200, 0.5)
    Note over LLMRails,ValidatorMod: Pre-Initialization Validation
    LLMRails->>ValidatorMod: check_langchain_kwargs(models, active_framework)
    ValidatorMod->>ValidatorMod: Inspect model parameters
    alt Violations Found
        ValidatorMod-->>LLMRails: raise ValueError (migrations required)
    else Valid Configuration
        ValidatorMod-->>LLMRails: return (continue init)
    end
    end
    
    rect rgba(200, 150, 100, 0.5)
    Note over HTTPClient,ErrorHandler: Runtime Error Enrichment
    HTTPClient->>HTTPClient: HTTP 400 response
    HTTPClient->>ErrorHandler: Pass error message
    ErrorHandler->>ErrorHandler: Detect parameter-related patterns
    alt Parameter Issue Detected
        ErrorHandler->>ErrorHandler: Append migration hint
    end
    ErrorHandler-->>HTTPClient: Return enriched error
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 6.82% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Results For Major Changes | ⚠️ Warning | PR has major changes but PR description lacks actual test results/execution evidence, though it describes what tests cover. | Add test execution results to PR description including: number of tests passed per file, overall coverage metrics, and confirmation all tests pass with CI/CD evidence. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main objective of the changeset: introducing a transitional compatibility layer to help users migrate from version 0.21 to 0.22+. |


@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/llm/clients/test_openai_compatible_400_enrichment.py (1)

150-165: ⚡ Quick win

Add one regression test for “hint already present” to prevent duplicate append.

The implementation has de-dup logic; a direct test here would lock that behavior and avoid repeated hint fragments in future refactors.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llm/clients/test_openai_compatible_400_enrichment.py` around lines 150
- 165, Add a regression test that ensures the hint is not appended twice: create
a new async test (similar to
TestPreservesOriginalProviderError.test_appended_hint_does_not_replace_original)
that uses make_client and mock_httpx_post to return a 400 with an error.message
that already contains the _HINT_FRAGMENT, call await
client.chat_completion("gpt-4o", []), catch LLMUnsupportedParamsError, then
assert the original message (including the hint fragment) is present and that
message.count(_HINT_FRAGMENT) == 1 to ensure the dedup logic in the client
prevents duplicate hint appends.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemoguardrails/llm/clients/_errors.py`:
- Around line 52-58: The _UNKNOWN_PARAM_HINT_TOKENS list is too loose (contains
standalone words like "unrecognized" and "is not allowed") and causes
false-positive migration hints; replace those generic entries with tighter,
phrase-level matches (e.g., "unrecognized field", "is not allowed in",
"additional properties are not allowed", or add explicit surrounding
word-boundary/phrase checks) so the hint only triggers on true unknown-parameter
errors, and apply the same tightening to the other occurrence referenced in the
file (the similar token list used around the error-parsing logic at the later
block).

In `@nemoguardrails/rails/llm/llmrails.py`:
- Line 434: The compat validation call
check_langchain_kwargs(self.config.models, get_default_framework()) is still
validating the config `main` model even when _init_llms ignores `main` because a
constructor LLM was injected via self.llm; modify the logic so that before
calling check_langchain_kwargs you either skip the check when self.llm is
provided or remove/omit the `main` entry from self.config.models when self.llm
is not None, ensuring the compatibility check only runs against the effective
models used by _init_llms (refer to check_langchain_kwargs, _init_llms,
self.llm, and config.models).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e5de00c1-ddd9-432c-8463-ac4321d1146d

📥 Commits

Reviewing files that changed from the base of the PR and between 76d3af8 and e076a71.

📒 Files selected for processing (7)
  • nemoguardrails/_compat/__init__.py
  • nemoguardrails/_compat/langchain_kwargs.py
  • nemoguardrails/llm/clients/_errors.py
  • nemoguardrails/rails/llm/llmrails.py
  • tests/_compat/__init__.py
  • tests/_compat/test_langchain_kwargs.py
  • tests/llm/clients/test_openai_compatible_400_enrichment.py

Comment thread nemoguardrails/llm/clients/_errors.py
Comment thread nemoguardrails/rails/llm/llmrails.py Outdated
@codecov

codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 98.11321% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| nemoguardrails/llm/clients/_errors.py | 90.00% | 1 Missing ⚠️ |


@Pouyanpi Pouyanpi self-assigned this May 4, 2026
@Pouyanpi Pouyanpi marked this pull request as ready for review May 4, 2026 13:50
Consolidates the responses to four code-review comments on the shape
detector + wire-enrichment work.

1. Tighten _UNKNOWN_PARAM_HINT_TOKENS to probe-verified phrases only.
   Earlier drafts contained speculative variants ("unrecognized field",
   "is not allowed in", JSON Schema and pydantic v1 phrasings) that no
   real provider was observed using. A live probe against OpenAI, NIM,
   Groq, Fireworks, OpenRouter, Together AI, and Ollama produced four
   canonical phrases:

     - "unrecognized request argument"   OpenAI
     - "unsupported parameter"           NIM
     - "' is unsupported"                Groq (apostrophe-anchored
                                              to avoid matching
                                              "model is unsupported")
     - "extra inputs are not permitted"  Fireworks (pydantic v2 default)

   OpenRouter, Together AI, and Ollama are permissive proxies that
   accept unknown fields with status 200; the boot-time shape detector
   covers them without any wire-level token. _UNSUPPORTED_PARAMS_KEYWORDS
   gets the same four entries so all four classify as
   LLMUnsupportedParamsError consistently.

2. Cover HTTP 422 in _maybe_append_migration_hint. FastAPI/pydantic-
   based servers commonly return 422 for schema rejection; the previous
   gate `status_code != 400` skipped them. Extended to (400, 422).

3. Drop unused model_engine parameter from _violations_for in
   _compat/langchain_kwargs.py. The shape detector replaced the
   per-provider alias map; the parameter was no longer read.

4. Skip type=='main' from compat check when LLMRails is constructed
   with a constructor LLM. _init_llms ignores the config's main entry
   in that case (only logs a warning); validating it produced false-
   positive ValueErrors about a config that was about to be discarded.
   Other model types (content_safety, jailbreak_detection, ...) still
   get validated.
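
Item 4 amounts to filtering the model list before the compat check runs. A minimal sketch, assuming each model entry exposes a `type` attribute; `ModelSketch` and `models_to_check` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ModelSketch:
    """Simplified stand-in for a model entry in the rails config."""

    type: str
    parameters: Dict[str, Any] = field(default_factory=dict)


def models_to_check(models: List[ModelSketch], injected_llm: Optional[object]) -> List[ModelSketch]:
    """Drop the main entry when a constructor LLM overrides it."""
    if injected_llm is None:
        return list(models)
    return [m for m in models if m.type != "main"]


config_models = [
    ModelSketch("main", {"streaming": True}),  # stale 0.21 flag
    ModelSketch("content_safety", {}),
]
print([m.type for m in models_to_check(config_models, injected_llm=object())])
print([m.type for m in models_to_check(config_models, injected_llm=None)])
```

With an injected LLM the stale `main` entry never reaches the validator, while auxiliary models such as `content_safety` are still checked.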

Tests:
- Probe-payload-verbatim tests for each verified canonical phrasing.
- 400 and 422 tests for the migration-hint append.
- False-positive guard tests asserting the hint does NOT fire on
  Content type / plan-restriction / authentication-scheme phrasings.
- Constructor-LLM tests asserting the check stays silent with an injected
  LLM and a stale main entry, and raises without injection.
@Pouyanpi Pouyanpi force-pushed the fix/llm-default-framework-shape-validator branch from 1207503 to dba8d56 Compare May 4, 2026 13:50
@Pouyanpi Pouyanpi added the enhancement (New feature or request) label May 4, 2026
@Pouyanpi Pouyanpi modified the milestone: v0.22.0 May 4, 2026
@Pouyanpi Pouyanpi force-pushed the develop branch 2 times, most recently from a852192 to 76d3af8 Compare May 4, 2026 13:53
Pouyanpi added 2 commits May 4, 2026 15:58
…ord trim

The previous fixture "temperature is not supported" relied on a generic
"is not supported" entry in _UNSUPPORTED_PARAMS_KEYWORDS that was removed
because it produced false positives on unrelated 400s ("model is not
supported in your region", "image input is not supported", etc.).

Replace the synthetic fixture with the real OpenAI reasoning-model
rejection phrasing for `temperature`:

  Unsupported parameter: 'temperature' is not supported with this model

This matches the empirically-grounded "unsupported parameter" entry that
remains in the keyword list, classifies correctly as
LLMUnsupportedParamsError, and is faithful to what OpenAI actually emits
when o1/o3 models reject the temperature field.
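
The effect of the keyword trim on this fixture can be illustrated with a small matcher. The keyword tuple below is a subset assumed from the entries named in this thread, and `is_unsupported_param_message` is a hypothetical helper, not the function in `_errors.py`.

```python
# Assumed subset of the classification keywords after the trim; the bare
# "is not supported" entry is intentionally absent.
_KEYWORDS_AFTER_TRIM = ("unsupported parameter", "unknown parameter")


def is_unsupported_param_message(message: str) -> bool:
    """Case-insensitive substring match against the trimmed keyword list."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in _KEYWORDS_AFTER_TRIM)


real_fixture = "Unsupported parameter: 'temperature' is not supported with this model"
old_fixture = "temperature is not supported"
unrelated = ["model is not supported in your region", "image input is not supported"]

print(is_unsupported_param_message(real_fixture))
print(is_unsupported_param_message(old_fixture))
print([is_unsupported_param_message(m) for m in unrelated])
```

The real OpenAI phrasing still matches through its "Unsupported parameter:" prefix, while the synthetic fixture and the unrelated 400s no longer do.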
…narrowing

Two review comments addressed in one file:

1. _maybe_append_migration_hint call in the LLMBadRequestError branch
   was unreachable. Every entry in _UNKNOWN_PARAM_HINT_TOKENS is also in
   _UNSUPPORTED_PARAMS_KEYWORDS, so by contrapositive: when control
   reaches the BadRequestError branch (no classification keyword
   matched), no hint token can match either. Remove the dead call.

2. Removal of bare "is not supported" from _UNSUPPORTED_PARAMS_KEYWORDS
   is a deliberate narrowing to avoid false positives on non-param 400s
   ("model is not supported in your region", "image input is not
   supported for this model"). Empirical probe confirmed real OpenAI
   reasoning-model rejections always carry the "Unsupported parameter:"
   prefix matched by the first entry.
@Pouyanpi Pouyanpi requested a review from tgasser-nv May 4, 2026 14:13

@tgasser-nv tgasser-nv left a comment

Collaborator


Looks good! Just some cleanups and extra unit-tests needed before merging

Comment thread tests/llm/clients/test_openai_compatible_400_enrichment.py Outdated
Comment thread nemoguardrails/_compat/langchain_kwargs.py Outdated
Comment thread nemoguardrails/_compat/langchain_kwargs.py Outdated
Comment thread nemoguardrails/llm/clients/_errors.py Outdated
Comment thread nemoguardrails/llm/clients/_errors.py Outdated
@Pouyanpi Pouyanpi changed the title feat: transitional compat layer to migrate from 0.21 to 0.22+ feat(compat): transitional compat layer to migrate from 0.21 to 0.22+ May 6, 2026
@Pouyanpi Pouyanpi force-pushed the fix/llm-default-framework-shape-validator branch from eba9350 to 9bea8a5 Compare May 6, 2026 09:24
@Pouyanpi Pouyanpi merged commit c69efe5 into develop May 6, 2026
6 checks passed
@Pouyanpi Pouyanpi deleted the fix/llm-default-framework-shape-validator branch May 6, 2026 09:41
m-misiura pushed a commit to m-misiura/NeMo-Guardrails that referenced this pull request May 6, 2026
…NVIDIA-NeMo#1841)

Two-layer safeguard for users carrying 0.21 LangChain-style configs into the
0.22 default framework