fix: bug where empty strings overwrite user/system messages #1406
Signed-off-by: naymaraq <dkaramyan@nvidia.com>
📝 Walkthrough: OpenAI prompt-filling in Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
nemo_skills/inference/generate.py (1)
667-670: ⚡ Quick win: Add a regression test for `system_message=""` behavior on the OpenAI path.
The new branch at lines 668-670 changes semantics (an explicit empty string removes a leading system message). Please add a targeted test to lock this in and prevent regressions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemo_skills/inference/generate.py` around lines 667 - 670, Add a regression test that exercises the OpenAI dispatch path when self.cfg.system_message == "" to ensure the leading system message is removed from data_point["messages"]; specifically, construct a data_point with a leading {"role":"system", ...} message, set cfg.system_message to the empty string, invoke the code path that processes messages for the OpenAI backend (the method that checks self.cfg.system_message and pops data_point["messages"][0]), and assert the leading system message has been removed while other messages remain unchanged; mirror the existing test style for other model providers so this behavior is locked in.
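A minimal sketch of the kind of regression test requested above. It does not call the real code in `generate.py`; `apply_system_message` is a hypothetical stand-in that only mirrors the behavior described in this comment (an explicit empty string removes a leading system message), and message contents are assumed to be plain strings.

```python
import copy


def apply_system_message(messages, system_message):
    """Hypothetical stand-in for the system_message branch under review."""
    messages = copy.deepcopy(messages)  # work on a copy, leave the input alone
    if system_message == "" and messages and messages[0]["role"] == "system":
        messages.pop(0)  # explicit empty string drops the leading system message
    return messages


def test_empty_system_message_removes_leading_system():
    original = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 2 + 2?"},
    ]
    result = apply_system_message(original, "")
    # The system message is gone and the remaining messages are unchanged.
    assert result == [{"role": "user", "content": "What is 2 + 2?"}]
    # The caller's list was not modified.
    assert original[0]["role"] == "system"


test_empty_system_message_removes_leading_system()
```

In the real test, the stand-in would be replaced by whichever method in `nemo_skills/inference/generate.py` handles `self.cfg.system_message`.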
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nemo_skills/inference/generate.py`:
- Around line 667-672: The code that handles self.cfg.system_message currently
indexes data_point["messages"][0] without ensuring the list is non-empty,
causing an IndexError; update the logic in the block around
self.cfg.system_message so that before any access to data_point["messages"][0]
you check whether data_point["messages"] is non-empty and, if it is empty and
system_message is non-empty, insert the system message (using the same
{"role":"system","content": self.cfg.system_message} structure); similarly, when
system_message == "" ensure you only attempt to pop the first message if
data_point["messages"] is non-empty—adjust the branches around the existing pop
and insert calls to guard against empty lists.
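One way the guard described above could look, as a sketch rather than the actual patch. The function name `override_system_message` is hypothetical, and the replace-or-insert behavior for a non-empty `system_message` is an assumption based on this comment; only the emptiness checks are the point here.

```python
def override_system_message(messages, system_message):
    """Apply a system_message override in place without indexing an empty list.

    Assumed semantics (taken from the review discussion, not from the real code):
      None -> leave messages unchanged
      ""   -> remove a leading system message, if there is one
      text -> replace a leading system message, or insert one at the front
    """
    if system_message is None:
        return messages

    # bool(messages) short-circuits, so messages[0] is never read on an empty list.
    has_leading_system = bool(messages) and messages[0].get("role") == "system"

    if system_message == "":
        if has_leading_system:
            messages.pop(0)
    elif has_leading_system:
        messages[0]["content"] = system_message
    else:
        messages.insert(0, {"role": "system", "content": system_message})
    return messages


# An empty list no longer raises IndexError in either branch.
print(override_system_message([], "Be concise."))
print(override_system_message([], ""))
```

Computing `has_leading_system` once keeps every later branch free of direct `messages[0]` access, which is the failure mode flagged in this comment.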
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4abbd4f5-788f-4c73-8417-c26e2fee17df
📒 Files selected for processing (1)
nemo_skills/inference/generate.py
Jorjeous left a comment
Let's add a guard for an empty messages list when system_message is non-empty.
Otherwise LGTM; will approve once that is done.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: naymaraq <dkaramyan@nvidia.com>
🧹 Nitpick comments (1)
nemo_skills/inference/generate.py (1)
658-676: ⚡ Quick win: Add regression tests for these override semantics.
Please add focused tests for `user_message=""`, `system_message=""`, and `system_message=None` in the pure OpenAI path so this behavior doesn't regress.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemo_skills/inference/generate.py` around lines 658 - 676, Add focused unit tests exercising GenerationTask override semantics: 1) when cfg.user_message == "" ensure GenerationTask._set_message_text_content replaces the single user message content with an empty string and that the code still enforces exactly one user message; 2) when cfg.system_message == "" ensure an existing leading system message is removed (messages.pop(0)); and 3) when cfg.system_message is None ensure the existing system message is left unchanged. Implement these tests against the pure OpenAI path (i.e., run the code path that uses GenerationTask with the OpenAI client/stub) and assert the final data_point["messages"] contents match the expected outcomes after _set_message_text_content and _append_message_text_suffix logic has run.
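A rough sketch of the `user_message` half of these tests. `set_user_message` is a hypothetical stand-in, not `GenerationTask._set_message_text_content` itself (whose signature is not shown here); it assumes plain-string message contents and that `None` means "no override".

```python
def set_user_message(messages, user_message):
    """Hypothetical stand-in: overwrite the content of the single user message.

    None means no override; any string, including "", replaces the content.
    """
    if user_message is None:
        return messages
    user_indices = [i for i, m in enumerate(messages) if m["role"] == "user"]
    if len(user_indices) != 1:
        raise ValueError(f"expected exactly one user message, found {len(user_indices)}")
    messages[user_indices[0]]["content"] = user_message
    return messages


def make_messages():
    """Fresh fixture per assertion, since the helper mutates its input."""
    return [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "original question"},
    ]


# user_message="" is an explicit override: the content becomes an empty string.
assert set_user_message(make_messages(), "")[1]["content"] == ""

# user_message=None leaves every message untouched.
assert set_user_message(make_messages(), None) == make_messages()

# The exactly-one-user-message rule still applies when overriding with "".
two_users = make_messages() + [{"role": "user", "content": "follow-up"}]
try:
    set_user_message(two_users, "")
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for two user messages")
```

The `system_message=""` and `system_message=None` cases would follow the same pattern, asserting on the final `data_point["messages"]` after the real override logic has run.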
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7543d785-48ad-422a-8066-a0992edd349b
📒 Files selected for processing (1)
nemo_skills/inference/generate.py
Summary by CodeRabbit