Fix prompt formatting biases affecting JSON output #2824

Open: 2 commits to merge into base: main

devin-ai-integration[bot]
Contributor

Fix prompt formatting biases affecting JSON output

Issue

This PR addresses issue #2823, which describes two prompt formatting biases that affect the JSON formatting of agent output:

  1. In base_tool.py, tool descriptions were using Python's native string representation for dictionaries (with single quotes), which biased agents to output JSON with single quotes instead of the standard double quotes.

  2. In translations/en.json, the "tools" entry included formatting instructions that did not explicitly require double quotes in JSON output.
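
The first bias can be reproduced outside the codebase. The following is a minimal sketch, assuming a hypothetical `args_schema` dictionary shaped like the one a tool description embeds; it only illustrates the difference between Python's `str()` and `json.dumps()` and is not the code in base_tool.py.

```python
import json

# Hypothetical argument schema, similar in shape to what a tool description embeds.
args_schema = {"query": {"description": "Search text", "type": "str"}}

python_repr = str(args_schema)       # Python repr: single-quoted keys and values
json_text = json.dumps(args_schema)  # JSON text: double-quoted keys and values


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(python_repr)  # {'query': {'description': 'Search text', 'type': 'str'}}
print(json_text)    # {"query": {"description": "Search text", "type": "str"}}
```

A strict JSON parser rejects the single-quoted form, which is why a prompt that shows it can nudge an agent toward output that fails to parse.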

Changes

  • Modified _generate_description method in base_tool.py to use json.dumps() for proper JSON formatting with double quotes
  • Updated the prompt format in translations/en.json to explicitly instruct using double quotes for JSON keys and values
  • Updated existing tests to check for proper JSON formatting
  • Added new tests to specifically verify JSON formatting in tool descriptions

Testing

  • Added a new test file tests/tools/test_json_formatting.py with tests that verify:
    • Tool descriptions use proper JSON formatting with double quotes
    • The JSON in tool descriptions can be parsed as valid JSON
    • No single quotes are present in the JSON output
  • Updated existing tests in test_tool_usage.py to check for double quotes instead of single quotes
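
The three checks listed above might look roughly like the sketch below. It assumes the description format quoted in the review further down ("Tool Name: ... Tool Arguments: ... Tool Description: ..."); `render_description` and `extract_args_json` are hypothetical helpers written for this illustration and are not the contents of tests/tools/test_json_formatting.py.

```python
import json


def render_description(name: str, args_schema: dict, description: str) -> str:
    """Hypothetical stand-in for the tool description format discussed in this PR."""
    args_json = json.dumps(args_schema)
    return f"Tool Name: {name}\nTool Arguments: {args_json}\nTool Description: {description}"


def extract_args_json(rendered: str) -> str:
    """Return the text between the 'Tool Arguments: ' and 'Tool Description: ' labels."""
    return rendered.split("Tool Arguments: ", 1)[1].split("\nTool Description: ", 1)[0]


def test_description_uses_valid_double_quoted_json():
    schema = {"query": {"description": "Search text", "type": "str"}}
    rendered = render_description("Search", schema, "Searches the web")
    args_json = extract_args_json(rendered)

    assert json.loads(args_json) == schema  # parses as valid JSON
    assert '"query"' in args_json           # keys are double-quoted
    assert "'" not in args_json             # no single quotes in this schema
```

The last assertion only holds when no field description contains an apostrophe, so it is best treated as a check on this particular fixture rather than on JSON output in general.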

All tests are passing, confirming the fixes work correctly.

Link to Devin run

https://app.devin.ai/sessions/a04244961d1a4e18bde9d5b8f995659e

Requested by

Joe Moura ([email protected])

Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@joaomdmoura
Collaborator

Disclaimer: This review was made by a crew of AI Agents.

Code Review Comment

Overview

This PR addresses JSON formatting issues in tool descriptions and makes their string representation consistent. It changes four files, with 85 insertions and 6 deletions.

Detailed Analysis

1. Improvement in JSON Serialization

  • File: src/crewai/tools/base_tool.py
  • Changes: Introduction of JSON serialization for tool arguments schema.
  • Strengths:
    • Explicit JSON import and serialization improve clarity and correctness.
  • Improvements Suggested:
    • Error Handling: It's imperative to add error handling for JSON serialization to gracefully manage exceptions. Below is an example for consideration:
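      import json
      import logging

      logger = logging.getLogger(__name__)

      def _generate_description(self):
          # Build the schema outside the try block so the fallback below can always reference it.
          args_schema = {
              name: {
                  "description": field.description,
                  "type": self._get_arg_annotations(field.annotation),
              }
              for name, field in self.args_schema.model_fields.items()
          }
          try:
              args_json = json.dumps(args_schema)
          except (TypeError, ValueError) as e:
              # json.dumps raises TypeError for unserializable values and ValueError for circular references.
              logger.warning(f"Failed to serialize args schema: {e}")
              args_json = str(args_schema)

          self.description = f"Tool Name: {self.name}\nTool Arguments: {args_json}\nTool Description: {self.description}"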
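
One caveat about the suggestion above: falling back to `str(args_schema)` would reintroduce the single-quoted Python representation that this PR sets out to remove. A possible alternative, sketched here as an assumption and not as part of the PR, is to retry with `json.dumps(..., default=str)`, which stringifies values that the encoder cannot handle while keeping the output valid JSON. The helper name `serialize_args_schema` is hypothetical.

```python
import json
import logging

logger = logging.getLogger(__name__)


def serialize_args_schema(args_schema: dict) -> str:
    """Serialize a schema to JSON, stringifying values the encoder cannot handle."""
    try:
        return json.dumps(args_schema)
    except TypeError as exc:
        # default=str converts unsupported values (e.g. a type object) to strings,
        # so the result stays parseable JSON. It does not help with non-string
        # keys or circular references, which would still raise.
        logger.warning("Retrying args schema serialization with default=str: %s", exc)
        return json.dumps(args_schema, default=str)


# Hypothetical schema whose "type" is a Python type object rather than a string.
schema = {"limit": {"description": "Max results", "type": int}}
args_json = serialize_args_schema(schema)
```

The stringified value may itself contain a single quote (for example `<class 'int'>`), but it sits inside a double-quoted JSON string, so the result still parses.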

2. Enhanced Documentation on Formatting

  • File: src/crewai/translations/en.json
  • Changes: Amendments clarifying string formatting requirements.
  • Strengths:
    • The added instructions on quote usage make the formatting requirements explicit and consistent.

3. Comprehensive Testing

  • Files: tests/tools/test_json_formatting.py (new) and tests/tools/test_tool_usage.py
  • Strengths:
    • Comprehensive test coverage for JSON formatting is present. Tests validate both tool descriptions and usage scenarios, including proper quote usage.
  • Recommendations:
    • Consider including more edge cases in tests, especially those involving special characters and nested structures. Here’s a suggested enhancement:
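      import json
      from pydantic import BaseModel, Field
      from crewai.tools import BaseTool

      class ComplexInput(BaseModel):  # hypothetical schema standing in for TestComplexInput
          query: str = Field(..., description='Text with "quotes" and \\ backslashes')
          filters: dict = Field(..., description="Nested filter structure")

      def test_tool_usage_render_edge_cases():
          """Test tool description rendering with special characters and nested structures"""
          class ComplexTool(BaseTool):
              name: str = "Complex Tool"
              description: str = "Tool with complex arguments"
              args_schema: type[BaseModel] = ComplexInput

              def _run(self, query: str, filters: dict) -> str:
                  return query

          tool = ComplexTool()
          # Assumes the generated description is available on tool.description.
          rendered = tool.description
          assert json.loads(rendered.split("Tool Arguments: ")[1].split("\nTool Description:")[0])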
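
To make the edge cases concrete, the sketch below round-trips a hypothetical schema containing embedded quotes, an apostrophe, a newline, non-ASCII text, and nested structures through `json.dumps` and `json.loads`. The schema contents are invented for illustration and do not come from the PR.

```python
import json

# Hypothetical schema exercising special characters and nesting.
schema = {
    "query": {"description": 'He said "hi" and it\'s fine\nnext line', "type": "str"},
    "filters": {"description": "Nested map", "type": "dict", "items": {"tags": ["a", "b"]}},
    "label": {"description": "Café ☕", "type": "str"},
}

encoded = json.dumps(schema)   # embedded quotes, newlines, and non-ASCII are escaped
decoded = json.loads(encoded)  # round trip restores the original values

print(decoded == schema)
```

Two consequences follow for the tests. Because `json.dumps` escapes newlines, the encoded text contains no raw newline, so splitting the description on the "\nTool Description:" label is not confused by multi-line field descriptions. And because an apostrophe inside a description is legitimate JSON string content, a blanket "no single quotes" assertion would reject valid output for such a schema.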

Overall Assessment

Strengths:

  1. Enhancements to JSON formatting provide consistency and reliability.
  2. Comprehensive testing ensures maintainability and reduces the risk of future errors.
  3. Improved documentation aids in understanding the requirements and expected formatting.

Recommendations:

  1. Implement error handling during JSON serialization.
  2. Expand edge case testing, particularly for special characters.
  3. Incorporate logging for any formatting issues encountered.
  4. Document JSON formatting requirements clearly in the README for future reference.

Conclusion

This PR is well-structured with a significant positive impact on tool argument handling and JSON output reliability. Once the recommendations are addressed, it should be ready for merging. Additionally, there are no apparent security concerns related to the changes made.

Linking related PRs would also give future readers context on earlier decisions in this area.
