Fix/json system prompt customization #2031
Open
Waqar53 wants to merge 4 commits into 567-labs:main from Waqar53:fix/json-system-prompt-customization (base: main)
Commits (4):
- f546140 fix(deps): move dev dependencies to correct section (Waqar53)
- 4f21280 feat(json): add json_system_prompt parameter for Mode.JSON customization (Waqar53)
- 2bfb3ca Merge branch 'main' into fix/json-system-prompt-customization (jxnl)
- 7249a4b fix: use str.replace() for safe placeholder substitution (Waqar53)
The added test file (@@ -0,0 +1,149 @@):

```python
"""
Tests for json_system_prompt customization feature.

Tests verify that:
- Users can customize the JSON mode system prompt
- {schema} placeholder is correctly substituted
- Empty string disables system prompt modification
- Default behavior is backward compatible
"""

import json
import pytest
from unittest.mock import MagicMock

from instructor.providers.openai.utils import handle_json_modes
from instructor.mode import Mode
from pydantic import BaseModel


class SimpleModel(BaseModel):
    """Test model for JSON schema generation."""

    name: str
    age: int


class TestJsonSystemPromptCustomization:
    """Tests for Issue #1514 - Customizable JSON mode system prompt."""

    def test_default_prompt_backward_compatible(self):
        """Default behavior should be unchanged (backward compatible)."""
        new_kwargs = {"messages": [{"role": "user", "content": "Extract data"}]}

        response_model, result_kwargs = handle_json_modes(
            SimpleModel, new_kwargs, Mode.JSON
        )

        # Should have inserted a system message
        assert result_kwargs["messages"][0]["role"] == "system"
        # Should contain the default "genius expert" phrase
        assert "genius expert" in result_kwargs["messages"][0]["content"]
        # Should contain the JSON schema
        assert (
            "SimpleModel" in result_kwargs["messages"][0]["content"]
            or "name" in result_kwargs["messages"][0]["content"]
        )

    def test_custom_prompt_with_schema_placeholder(self):
        """Custom prompt with {schema} placeholder should work."""
        custom_prompt = "You are a helpful assistant. Return JSON matching:\n{schema}"
        new_kwargs = {"messages": [{"role": "user", "content": "Extract data"}]}

        response_model, result_kwargs = handle_json_modes(
            SimpleModel, new_kwargs, Mode.JSON, json_system_prompt=custom_prompt
        )

        # Should have inserted a system message
        assert result_kwargs["messages"][0]["role"] == "system"
        # Should NOT contain the default phrase
        assert "genius expert" not in result_kwargs["messages"][0]["content"]
        # Should contain our custom text
        assert "helpful assistant" in result_kwargs["messages"][0]["content"]
        # Schema should be substituted
        assert "name" in result_kwargs["messages"][0]["content"]

    def test_empty_string_skips_system_prompt(self):
        """Empty string should skip system prompt modification entirely."""
        new_kwargs = {"messages": [{"role": "user", "content": "Extract data"}]}

        response_model, result_kwargs = handle_json_modes(
            SimpleModel, new_kwargs, Mode.JSON, json_system_prompt=""
        )

        # Should NOT have inserted a system message
        assert result_kwargs["messages"][0]["role"] == "user"
        # Original message should be unchanged
        assert result_kwargs["messages"][0]["content"] == "Extract data"

    def test_custom_prompt_appends_to_existing_system(self):
        """Custom prompt should append to an existing system message."""
        custom_prompt = "Respond with JSON: {schema}"
        new_kwargs = {
            "messages": [
                {"role": "system", "content": "You are a pirate."},
                {"role": "user", "content": "Tell me about treasure"},
            ]
        }

        response_model, result_kwargs = handle_json_modes(
            SimpleModel, new_kwargs, Mode.JSON, json_system_prompt=custom_prompt
        )

        # System message should be preserved and extended
        assert result_kwargs["messages"][0]["role"] == "system"
        assert "pirate" in result_kwargs["messages"][0]["content"]
        assert "JSON" in result_kwargs["messages"][0]["content"]

    def test_json_schema_mode_ignores_system_prompt(self):
        """JSON_SCHEMA mode uses response_format, not system prompt modification."""
        custom_prompt = "Custom prompt {schema}"
        new_kwargs = {"messages": [{"role": "user", "content": "Extract data"}]}

        response_model, result_kwargs = handle_json_modes(
            SimpleModel, new_kwargs, Mode.JSON_SCHEMA, json_system_prompt=custom_prompt
        )

        # Should have response_format set
        assert "response_format" in result_kwargs
        assert result_kwargs["response_format"]["type"] == "json_schema"
        # System message should NOT be inserted for JSON_SCHEMA mode
        assert result_kwargs["messages"][0]["role"] == "user"

    def test_md_json_mode_with_custom_prompt(self):
        """MD_JSON mode should work with a custom prompt."""
        custom_prompt = "Return markdown JSON: {schema}"
        new_kwargs = {"messages": [{"role": "user", "content": "Extract data"}]}

        response_model, result_kwargs = handle_json_modes(
            SimpleModel, new_kwargs, Mode.MD_JSON, json_system_prompt=custom_prompt
        )

        # Should have a system message with the custom prompt
        assert result_kwargs["messages"][0]["role"] == "system"
        assert "markdown JSON" in result_kwargs["messages"][0]["content"]

    def test_none_response_model_returns_early(self):
        """None response_model should return early without modification."""
        new_kwargs = {"messages": [{"role": "user", "content": "Hello"}]}

        response_model, result_kwargs = handle_json_modes(
            None, new_kwargs, Mode.JSON, json_system_prompt="custom"
        )

        assert response_model is None
        # Messages should be unchanged
        assert result_kwargs["messages"][0]["role"] == "user"


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
```
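The contract these tests pin down can be sketched as a small helper. The function name and the exact default wording below are illustrative, not the actual implementation in `instructor.providers.openai.utils` (the tests only assert that the default contains the phrase "genius expert"):

```python
import json
from typing import Optional


def build_system_message(schema: dict, json_system_prompt: Optional[str] = None) -> Optional[str]:
    """Sketch of the prompt selection behavior described by the tests above.

    - None -> default prompt (backward compatible)
    - ""   -> return None, i.e. skip system prompt modification entirely
    - else -> substitute the {schema} placeholder via str.replace()
    """
    schema_json = json.dumps(schema)
    if json_system_prompt is None:
        # Illustrative default wording; the tests assert it contains
        # the phrase "genius expert" and the schema.
        return (
            "As a genius expert, your task is to understand the content and "
            "provide the parsed objects in json that match the following "
            f"json_schema:\n{schema_json}"
        )
    if json_system_prompt == "":
        return None  # caller leaves the messages list untouched
    # str.replace() only touches the literal '{schema}' token, so other
    # braces in the prompt (e.g. inline JSON examples) stay intact.
    return json_system_prompt.replace("{schema}", schema_json)


# Example: a custom prompt with the placeholder substituted.
schema = {"properties": {"name": {"type": "string"}, "age": {"type": "integer"}}}
print(build_system_message(schema, "Return JSON matching:\n{schema}"))
```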
Format string crashes on prompts with curly braces
Medium Severity
Using `str.format()` for placeholder substitution causes a `KeyError` if `json_system_prompt` contains curly braces other than `{schema}`. Users may reasonably want to include JSON examples in their prompt, like `"Return JSON like: {"id": 1} matching {schema}"`, which will crash. Literal braces require escaping as `{{` and `}}`, but this limitation isn't documented. Consider using a safer substitution method like `str.replace("{schema}", json_schema)` instead.
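A minimal reproduction of the failure mode and the safer alternative (the prompt string below is illustrative):

```python
json_schema = '{"name": {"type": "string"}}'

# A reasonable user prompt mixing a literal JSON example with the placeholder.
prompt = 'Return JSON like: {"id": 1} matching {schema}'

# str.format() parses every brace pair as a replacement field, so the
# literal {"id": 1} is treated as a field named '"id"' and raises KeyError.
try:
    prompt.format(schema=json_schema)
    print("format() succeeded")
except KeyError as e:
    print(f"format() raised KeyError: {e}")

# str.replace() substitutes only the exact '{schema}' token and leaves
# every other brace in the string untouched.
print(prompt.replace("{schema}", json_schema))
```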