feat(compat): transitional compat layer to migrate from 0.21 to 0.22+ #1841
Add a transitional validator at LLMRails construction that detects 0.21
LangChain Python-side flags (streaming, verbose, callbacks, model_kwargs,
nvidia_api_key, etc.) in model.parameters when the default framework is
active, raising a clear migration error instead of silently leaking them
to the wire as HTTP 400.
Lives under nemoguardrails/_compat/langchain_kwargs.py with an explicit
0.23.0 sunset; removed in 0.23 along with its single call site in
LLMRails._init_llms. The wire layer is intentionally untouched -
DefaultFramework keeps forwarding `parameters` verbatim, so NIM `nvext`,
vLLM `min_p`/`top_k`/`guided_*`, Ollama `keep_alive`, and other
provider extensions pass through unchanged.
User-facing error:
Your config has a LangChain-only flag in `parameters` that the default
framework doesn't forward:
models[main]: remove `streaming`
To keep 0.21 LangChain behavior instead, set NEMOGUARDRAILS_LLM_FRAMEWORK=langchain.
(Migration check; removed in 0.23.0.)
The 0.21->0.22 LangChain config validator previously enumerated provider aliases via a hand-maintained `_LANGCHAIN_PROVIDER_ALIASES` dict (only nim / nvidia_ai_endpoints were mapped). That does not scale; every LangChain provider has its own snake_case credential or URL kwargs. Replace the dict with a regex-based shape detector that matches any `<prefix>_<api_key|base_url|api_base|endpoint>` parameter and emits a rename remediation. `api_base` and `endpoint` collapse to `base_url`, matching the OpenAI-compatible client's canonical name. Canonical `api_key` and `base_url` themselves are not flagged because the regex requires a non-empty prefix. Also reframe the error message to make the two remediation paths (adapt to the default framework vs. keep 0.21 LangChain behavior via NEMOGUARDRAILS_LLM_FRAMEWORK=langchain) explicit. The 0.23.0 sunset notice is preserved. Tests cover representative provider aliases (openai_api_key, cohere_api_key, azure_endpoint, xyz_base_url, huggingfacehub_api_base), confirm canonical names do not false-positive, confirm legitimate provider extensions (nvext, min_p, keep_alive, top_k) pass through, and confirm the error message contains both remediation paths plus the sunset notice.
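A minimal sketch of the shape detector described above, assuming the regex quoted later in the PR description. The helper name `suggest_rename` and the `_COLLAPSE` mapping are hypothetical and are not taken from `langchain_kwargs.py`.

```python
import re
from typing import Optional

# Pattern as quoted in the PR description; the names below are illustrative.
_ALIAS_RE = re.compile(
    r"^(?P<prefix>[a-zA-Z]\w*?)_(?P<canonical>api_key|base_url|api_base|endpoint)$"
)

# `api_base` and `endpoint` collapse to the canonical `base_url`.
_COLLAPSE = {
    "api_key": "api_key",
    "base_url": "base_url",
    "api_base": "base_url",
    "endpoint": "base_url",
}


def suggest_rename(param: str) -> Optional[str]:
    """Return the canonical name for a provider-prefixed alias, else None."""
    match = _ALIAS_RE.match(param)
    if match is None:
        # Canonical names (no prefix) and provider extensions land here.
        return None
    return _COLLAPSE[match.group("canonical")]


if __name__ == "__main__":
    for name in ("openai_api_key", "azure_endpoint", "api_key", "min_p"):
        print(name, "->", suggest_rename(name))
```

Because the pattern requires at least one letter before the underscore, bare `api_key` and `base_url` never match. The same property explains the reviewer's concern about `_endpoint`: any `<something>_endpoint` name matches, whether or not it is a LangChain alias.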
Complement the LLMRails-init validator with a wire-level safety net. When the OpenAI-compatible client receives an HTTP 400 whose body mentions one of the unknown-parameter tokens (`unknown parameter`, `unrecognized`, `extra fields`, `additional properties`, `is not allowed`), append a single migration hint pointing the user at NEMOGUARDRAILS_LLM_FRAMEWORK=langchain. The provider's original error is preserved verbatim; the hint is appended as additional context so the rejected field name is still visible. Detection is intentionally conservative: the hint is only appended on status 400, only when one of the recognized tokens is present, and never on 200, 401, 429, 5xx, context-window 400s, or generic 400s (e.g. validation errors on temperature). The wire layer continues to forward `parameters` verbatim; this is purely error-message enrichment. Sunset alignment: this hint will follow the 0.23.0 removal of the LLMRails-init validator and the langchain framework toggle.
Greptile Summary
Adds a two-layer compatibility shim for users migrating from 0.21 LangChain-style configs to 0.22's default framework. Both layers are clearly scoped, well-tested, and sunset-tagged for 0.23.0.
| Filename | Overview |
|---|---|
| nemoguardrails/_compat/langchain_kwargs.py | New boot-time compat checker; regex for provider-prefixed aliases could match legitimate _endpoint-suffixed parameters, giving misleading rename advice (temporary/sunset check). |
| nemoguardrails/llm/clients/_errors.py | Wire-level 400/422 enrichment with migration hint; stream_options guard is correct, broad is-not-allowed token was removed, 422 is now included. |
| nemoguardrails/rails/llm/llmrails.py | Compat check correctly wired into _init_llms; main-model exclusion when constructor LLM is injected is sound and covered by tests. |
| tests/_compat/test_langchain_kwargs.py | Comprehensive parametric tests for the boot-time detector, covering false-positives, multi-violation aggregation, and canonical pass-through. |
| tests/llm/clients/test_openai_compatible_400_enrichment.py | New test file validates per-provider hint injection, stream_options exclusion, 422 inclusion, and original-message preservation; good coverage. |
| tests/test_llmrails.py | Two new integration tests verify constructor-LLM skip logic and main-model validation; cover the key behavioral branches. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[LLMRails.__init__] --> B[_init_llms]
B --> C{self.llm injected?}
C -- Yes --> D[models_to_check = non-main models]
C -- No --> E[models_to_check = all models]
D --> F[check_langchain_kwargs]
E --> F
F --> G{active_framework == default?}
G -- No --> H[no-op]
G -- Yes --> I{LangChain flags or provider aliases found?}
I -- No --> H
I -- Yes --> J[raise ValueError with remediation paths]
K[HTTP 400 or 422 from provider] --> L[raise_for_status]
L --> M[_classify_bad_request]
M --> N{context_window keyword?}
N -- Yes --> O[LLMContextWindowError]
N -- No --> P{unsupported_params keyword?}
P -- No --> Q[LLMBadRequestError]
P -- Yes --> R{stream_options in message?}
R -- Yes --> S[append stream_options guidance only]
R -- No --> T[_maybe_append_migration_hint]
T --> U{matches _UNKNOWN_PARAM_HINT_TOKENS?}
U -- No --> V[original message unchanged]
U -- Yes --> W[append 0.21 migration hint]
S --> X[LLMUnsupportedParamsError]
V --> X
W --> X
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
nemoguardrails/_compat/langchain_kwargs.py:51
**`_endpoint` suffix matches legitimate provider-specific parameters**
`endpoint` in the canonical alternation will match any parameter ending with `_endpoint` (e.g., `deployment_endpoint`, `inference_endpoint`, `fine_tune_endpoint`). A user on a provider that exposes a legitimate wire-level `deployment_endpoint` parameter would see "rename `deployment_endpoint` to `base_url`" — incorrect migration advice for a parameter that isn't a LangChain alias. The other three suffixes (`_api_key`, `_api_base`, `_base_url`) are LangChain-conventional and lower risk; `_endpoint` is the broadest match here.
### Issue 2 of 2
nemoguardrails/llm/clients/_errors.py:51-66
**`"unknown parameter"` and `"unrecognized parameter"` classify correctly but never show migration hint**
`"unknown parameter"`, `"unrecognized parameter"`, and `"parameter not allowed"` are in `_UNSUPPORTED_PARAMS_KEYWORDS` so they raise `LLMUnsupportedParamsError`, but none appear in `_UNKNOWN_PARAM_HINT_TOKENS` so `_maybe_append_migration_hint` returns the message unchanged for those phrases. A user migrating from 0.21 on a provider that emits `"unknown parameter: streaming"` would get the right exception but no guidance. This is likely an intentional, empirically-driven omission, but if it is intentional a comment stating so would make the asymmetry less surprising to future readers.
Reviews (6): Last reviewed commit: "address review feedbacks"
📝 Walkthrough
This pull request introduces a compatibility/migration helper system to manage LangChain-style parameter deprecation. It adds pre-initialization validation in LLMRails that checks for deprecated configuration patterns, detects provider-specific parameter aliases, and raises errors with clear migration guidance. HTTP 400 error responses are also enriched with migration hints when parameter-related issues are detected.
Changes
Sequence Diagram(s)
sequenceDiagram
participant LLMRails
participant ValidatorMod as Validator Module
participant ErrorHandler as Error Handler
participant HTTPClient as HTTP Client
rect rgba(100, 150, 200, 0.5)
Note over LLMRails,ValidatorMod: Pre-Initialization Validation
LLMRails->>ValidatorMod: check_langchain_kwargs(models, active_framework)
ValidatorMod->>ValidatorMod: Inspect model parameters
alt Violations Found
ValidatorMod-->>LLMRails: raise ValueError (migrations required)
else Valid Configuration
ValidatorMod-->>LLMRails: return (continue init)
end
end
rect rgba(200, 150, 100, 0.5)
Note over HTTPClient,ErrorHandler: Runtime Error Enrichment
HTTPClient->>HTTPClient: HTTP 400 response
HTTPClient->>ErrorHandler: Pass error message
ErrorHandler->>ErrorHandler: Detect parameter-related patterns
alt Parameter Issue Detected
ErrorHandler->>ErrorHandler: Append migration hint
end
ErrorHandler-->>HTTPClient: Return enriched error
end
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/llm/clients/test_openai_compatible_400_enrichment.py (1)
150-165: ⚡ Quick win
Add one regression test for “hint already present” to prevent duplicate append.
The implementation has de-dup logic; a direct test here would lock that behavior and avoid repeated hint fragments in future refactors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/llm/clients/test_openai_compatible_400_enrichment.py` around lines 150 - 165, Add a regression test that ensures the hint is not appended twice: create a new async test (similar to TestPreservesOriginalProviderError.test_appended_hint_does_not_replace_original) that uses make_client and mock_httpx_post to return a 400 with an error.message that already contains the _HINT_FRAGMENT, call await client.chat_completion("gpt-4o", []), catch LLMUnsupportedParamsError, then assert the original message (including the hint fragment) is present and that message.count(_HINT_FRAGMENT) == 1 to ensure the dedup logic in the client prevents duplicate hint appends.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nemoguardrails/llm/clients/_errors.py`:
- Around line 52-58: The _UNKNOWN_PARAM_HINT_TOKENS list is too loose (contains
standalone words like "unrecognized" and "is not allowed") and causes
false-positive migration hints; replace those generic entries with tighter,
phrase-level matches (e.g., "unrecognized field", "is not allowed in",
"additional properties are not allowed", or add explicit surrounding
word-boundary/phrase checks) so the hint only triggers on true unknown-parameter
errors, and apply the same tightening to the other occurrence referenced in the
file (the similar token list used around the error-parsing logic at the later
block).
In `@nemoguardrails/rails/llm/llmrails.py`:
- Line 434: The compat validation call
check_langchain_kwargs(self.config.models, get_default_framework()) is still
validating the config `main` model even when _init_llms ignores `main` because a
constructor LLM was injected via self.llm; modify the logic so that before
calling check_langchain_kwargs you either skip the check when self.llm is
provided or remove/omit the `main` entry from self.config.models when self.llm
is not None, ensuring the compatibility check only runs against the effective
models used by _init_llms (refer to check_langchain_kwargs, _init_llms,
self.llm, and config.models).
---
Nitpick comments:
In `@tests/llm/clients/test_openai_compatible_400_enrichment.py`:
- Around line 150-165: Add a regression test that ensures the hint is not
appended twice: create a new async test (similar to
TestPreservesOriginalProviderError.test_appended_hint_does_not_replace_original)
that uses make_client and mock_httpx_post to return a 400 with an error.message
that already contains the _HINT_FRAGMENT, call await
client.chat_completion("gpt-4o", []), catch LLMUnsupportedParamsError, then
assert the original message (including the hint fragment) is present and that
message.count(_HINT_FRAGMENT) == 1 to ensure the dedup logic in the client
prevents duplicate hint appends.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e5de00c1-ddd9-432c-8463-ac4321d1146d
📒 Files selected for processing (7)
nemoguardrails/_compat/__init__.py
nemoguardrails/_compat/langchain_kwargs.py
nemoguardrails/llm/clients/_errors.py
nemoguardrails/rails/llm/llmrails.py
tests/_compat/__init__.py
tests/_compat/test_langchain_kwargs.py
tests/llm/clients/test_openai_compatible_400_enrichment.py
Consolidates the responses to four code-review comments on the shape
detector + wire-enrichment work.
1. Tighten _UNKNOWN_PARAM_HINT_TOKENS to probe-verified phrases only.
Earlier drafts contained speculative variants ("unrecognized field",
"is not allowed in", JSON Schema and pydantic v1 phrasings) that no
real provider was observed using. A live probe against OpenAI, NIM,
Groq, Fireworks, OpenRouter, Together AI, and Ollama produced four
canonical phrases:
- "unrecognized request argument": OpenAI
- "unsupported parameter": NIM
- "' is unsupported": Groq (apostrophe-anchored to avoid matching "model is unsupported")
- "extra inputs are not permitted": Fireworks (pydantic v2 default)
OpenRouter, Together AI, and Ollama are permissive proxies that
accept unknown fields with status 200; the boot-time shape detector
covers them without any wire-level token.
_UNSUPPORTED_PARAMS_KEYWORDS gets the same four entries so all four classify as
LLMUnsupportedParamsError consistently.
2. Cover HTTP 422 in _maybe_append_migration_hint. FastAPI/pydantic-
based servers commonly return 422 for schema rejection; the previous
gate `status_code != 400` skipped them. Extended to (400, 422).
3. Drop unused model_engine parameter from _violations_for in
_compat/langchain_kwargs.py. The shape detector replaced the
per-provider alias map; the parameter was no longer read.
4. Skip type=='main' from compat check when LLMRails is constructed
with a constructor LLM. _init_llms ignores the config's main entry
in that case (only logs a warning); validating it produced false-
positive ValueErrors about a config that was about to be discarded.
Other model types (content_safety, jailbreak_detection, ...) still
get validated.
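Item 4 can be sketched as a small filter applied before the compat check runs. The `Model` dataclass and the `models_to_check` helper below are hypothetical stand-ins for illustration; the actual wiring inside `_init_llms` may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Model:
    # Hypothetical stand-in for one entry of the config's model list.
    type: str
    parameters: Dict[str, Any] = field(default_factory=dict)


def models_to_check(models: List[Model], injected_llm: Optional[object]) -> List[Model]:
    """Select the models whose parameters are actually going to be used."""
    if injected_llm is None:
        # No constructor LLM: every configured model is live, validate all.
        return list(models)
    # A constructor LLM replaces the config's main entry, so skip it.
    return [m for m in models if m.type != "main"]


if __name__ == "__main__":
    config = [Model("main", {"streaming": True}), Model("content_safety")]
    print([m.type for m in models_to_check(config, injected_llm=object())])
```

Filtering the list, rather than skipping the check entirely, keeps validation active for the non-main models that are still built from config.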
Tests:
- Probe-payload-verbatim tests for each verified canonical phrasing.
- 400 and 422 tests for the migration-hint append.
- False-positive guard tests asserting the hint does NOT fire on
Content type / plan-restriction / authentication-scheme phrasings.
- Constructor-LLM tests asserting the check stays silent with an injected LLM
plus a stale main entry, and raises without injection.
…ord trim
The previous fixture "temperature is not supported" relied on a generic
"is not supported" entry in _UNSUPPORTED_PARAMS_KEYWORDS that was removed
because it produced false positives on unrelated 400s ("model is not
supported in your region", "image input is not supported", etc.).
Replace the synthetic fixture with the real OpenAI reasoning-model
rejection phrasing for `temperature`:
Unsupported parameter: 'temperature' is not supported with this model
This matches the empirically-grounded "unsupported parameter" entry that
remains in the keyword list, classifies correctly as
LLMUnsupportedParamsError, and is faithful to what OpenAI actually emits
when o1/o3 models reject the temperature field.
…narrowing
Two review comments addressed in one file:
1. _maybe_append_migration_hint call in the LLMBadRequestError branch
was unreachable. Every entry in _UNKNOWN_PARAM_HINT_TOKENS is also in
_UNSUPPORTED_PARAMS_KEYWORDS, so by contrapositive: when control
reaches the BadRequestError branch (no classification keyword
matched), no hint token can match either. Remove the dead call.
2. Removal of bare "is not supported" from _UNSUPPORTED_PARAMS_KEYWORDS
is a deliberate narrowing to avoid false positives on non-param 400s
("model is not supported in your region", "image input is not
supported for this model"). Empirical probe confirmed real OpenAI
reasoning-model rejections always carry the "Unsupported parameter:"
prefix matched by the first entry.
tgasser-nv left a comment
Looks good! Just some cleanups and extra unit-tests needed before merging
…NVIDIA-NeMo#1841) Two-layer safeguard for users carrying 0.21 LangChain-style configs into the 0.22 default framework
Description [ Generated by Copilot]
Two-layer safeguard for users carrying 0.21 LangChain-style configs into the
0.22 default framework:

1. Boot-time shape detector (`nemoguardrails/_compat/langchain_kwargs.py`). Replaces the previous hand-maintained per-provider alias map with a generic pattern matcher. Detects two shapes in `model.parameters` when the active framework is `default`:
   - `BaseChatModel` Python flags: `streaming`, `disable_streaming`, `verbose`, `cache`, `callbacks`, `tags`, `metadata`, `name`, `model_kwargs`.
   - Provider-prefixed aliases matching `^(?P<prefix>[a-zA-Z]\w*?)_(?P<canonical>api_key|base_url|api_base|endpoint)$`, emitting a remediation rename to canonical `api_key` / `base_url`.

   Raises a `ValueError` at `LLMRails` construction with two paths surfaced: either adapt the config to OpenAI-compatible shape, or set `NEMOGUARDRAILS_LLM_FRAMEWORK=langchain` to keep 0.21 behavior. Sunset in 0.23.0.
2. Wire-level error enrichment (`nemoguardrails/llm/clients/_errors.py`). When `OpenAICompatibleClient` receives a 400 or 422 whose body matches one of `_UNKNOWN_PARAM_HINT_TOKENS`, append a single hedged migration hint to the original provider error. Original message is preserved verbatim before the hint. Hint is hedged with "If you upgraded from 0.21..." so users on unrelated rejections can ignore it. Same sunset.
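The flag half of the boot-time layer can be sketched as follows. The flag names and the message text are taken from this PR; the function names, the dict-based model representation, and the exact message assembly are assumptions for illustration, not the contents of `langchain_kwargs.py`.

```python
from typing import Any, Dict, List

# Flag names as listed in the PR description.
_LANGCHAIN_ONLY_FLAGS = frozenset({
    "streaming", "disable_streaming", "verbose", "cache", "callbacks",
    "tags", "metadata", "name", "model_kwargs",
})


def flag_violations(model_type: str, parameters: Dict[str, Any]) -> List[str]:
    """One remediation line per LangChain-only flag found in `parameters`."""
    return [
        f"models[{model_type}]: remove `{key}`"
        for key in parameters
        if key in _LANGCHAIN_ONLY_FLAGS
    ]


def check_flags(models: Dict[str, Dict[str, Any]], active_framework: str) -> None:
    """Raise ValueError listing every violation; no-op off the default framework."""
    if active_framework != "default":
        return
    violations: List[str] = []
    for model_type, parameters in models.items():
        violations.extend(flag_violations(model_type, parameters))
    if violations:
        header = ("Your config has a LangChain-only flag in `parameters` "
                  "that the default framework doesn't forward:")
        footer = ("To keep 0.21 LangChain behavior instead, set "
                  "NEMOGUARDRAILS_LLM_FRAMEWORK=langchain. "
                  "(Migration check; removed in 0.23.0.)")
        raise ValueError("\n".join([header, *violations, footer]))


if __name__ == "__main__":
    try:
        check_flags({"main": {"streaming": True, "temperature": 0.2}}, "default")
    except ValueError as err:
        print(err)
```

Collecting all violations before raising means a user with several stale flags sees the full list in one error instead of fixing them one restart at a time.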
Why two layers
The boot detector catches the recognizable 0.21 shape of a config. The wire enrichment handles the long tail: legitimate provider extensions (NIM `nvext`, vLLM `min_p`/`top_k`/`guided_*`, Ollama `keep_alive`) pass through untouched; if the wire rejects something, the user gets the actual provider error message plus migration context with the field name preserved. No enumeration treadmill.
Token list provenance
Methodology used to arrive at the final token list: a probe script hits each provider's `/chat/completions` with deliberately-malformed payloads (LangChain `BaseChatModel` flags as request fields, plus snake_case credential aliases and a gibberish field), captures the actual error body, and classifies each result.
The probe script is not part of this PR. The results below are the empirical input that determined the shipping token list.
Empirical results (per provider, all 9 LangChain-shape payloads sent)
OpenAI: every malformed payload returned identical canonical phrasing:
Status 400. 9/9 cases identical.
NIM (NVIDIA Integrate API):
Status 400. 9/9 cases identical.
Groq:
Status 400. 9/9 cases identical.
Fireworks (`accounts/fireworks/models/llama-v3p3-70b-instruct`):
Status 400. 9/9 cases identical. The phrase is pydantic v2's literal default validation error; any FastAPI/pydantic-v2-backed proxy is likely to share it.
OpenRouter: all 9 payloads returned status 200 with the unknown field silently dropped and the request forwarded to the upstream model. No validation rejection emitted. Permissive-proxy behavior. Boot-time shape detector covers LangChain-specific cases regardless.
Together AI: same permissive-proxy behavior: payloads that completed returned status 200; remainder rate-limited (Together's 1 QPS account limit; the probe fires sequentially without throttling). No validation phrase to match.
Ollama (self-hosted, `localhost:11434/v1`, latest version, CPU-only): all 9 payloads returned status 200 with the unknown field silently dropped and the completion produced normally. Same permissive-proxy pattern as OpenRouter and Together AI. The boot-time shape detector is the only safety net for Ollama users with stale LangChain configs since the wire emits no signal.
Final token list
Same set added to `_UNSUPPORTED_PARAMS_KEYWORDS` so all four classify consistently as `LLMUnsupportedParamsError` (matches the existing handling for `unsupported parameter` / `unknown parameter`).
Summary by CodeRabbit
New Features
Tests