_rm_titles modifies the state of @tool args schema (removes title property) #30456

Closed
@Yazan-Hamdan

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

schema_to_be_extracted = {
    "type": "object",
    "items": {
      "type": "object",
      "required": [],
      "properties": {
        "title": {
          "type": "string",
          "description": "item title"
        },
        "due_date": {
          "type": "string",
          "description": "item due date, could be as closing date, due date, open until or any other format that indicates the end date of that item"
        }
      },
      "description": "foo"
    },
    "description": "A list of data."
  }

from typing import Callable, Dict, Any, List

from langchain_core.tools import tool

def get_extraction_tools(schema_to_be_extracted: Dict[str, Any]) -> List[Callable]:
    """
    Get the extraction tools for extracting structured data.
    
    Returns:
        A list of LangChain tool functions for extraction
    """
    @tool(args_schema=schema_to_be_extracted)
    def extract_data(extracted_data: Dict[str, Any]) -> Dict[str, Any]:
        """
        Extract structured data from the provided content according to the JSON schema.
        
        Args:
            extracted_data: Dictionary containing the extracted structured data
        """
        return extracted_data
    
    return [extract_data]

system_prompt = """
You are a helpful assistant that extracts structured data from a given content.
"""

content = """
# Task List

| Title            | Start Date | Due Date   |
|-----------------|------------|------------|
| Project Alpha   | 2025-03-01 | 2025-03-15 |
| Design Update   | 2025-03-05 | 2025-03-20 |
| Code Review     | 2025-03-10 | 2025-03-17 |
| Marketing Plan  | 2025-03-12 | 2025-03-25 |
| Final Testing   | 2025-03-18 | 2025-03-28 |
"""

from langchain.chat_models import init_chat_model
from langchain_core.messages import SystemMessage, HumanMessage
from langgraph.func import entrypoint

@entrypoint()
async def extract_data(
    input: Dict[str, Any]
) -> Any:
    """
    Extract structured data from content using a LangChain chat model with a bound extraction tool.

    Args:
        input: Dictionary whose "schema_to_be_extracted" key holds the JSON schema defining the structure of the data to be extracted
    """
    schema_to_be_extracted: Dict[str, Any] = input.get("schema_to_be_extracted")
    
    # Get the appropriate LLM based on the model name
    language_model = init_chat_model(model="claude-3-5-sonnet-20241022")
    
    # Get the extraction tools
    extraction_tools = get_extraction_tools(schema_to_be_extracted)

    # Language model with tools
    language_model_with_tools = language_model.bind_tools(
        tools=extraction_tools,
        tool_choice="any"
    )
    
    messages = [
        SystemMessage(content=system_prompt),
        HumanMessage(content=content)
    ]
    
    # Invoke the model with the messages
    response = await language_model_with_tools.ainvoke(messages)
    return response


# Top-level await: run this in a notebook or another async context.
extraction_result = await extract_data.ainvoke(input={"schema_to_be_extracted": schema_to_be_extracted})
print(extraction_result.content[0]['input']['items'])
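
To make the symptom easier to spot than by reading the raw tool call, a small stdlib-only helper can list which properties declared in the schema are absent from each returned item. This is a minimal sketch: the helper name missing_properties is hypothetical, and it assumes the same shape as the schema above (property definitions under "items" → "properties").

```python
from typing import Any, Dict, List


def missing_properties(
    schema: Dict[str, Any], items: List[Dict[str, Any]]
) -> List[List[str]]:
    """For each item, list the declared property names that the item lacks."""
    # Property names declared under schema["items"]["properties"].
    declared = list(schema.get("items", {}).get("properties", {}))
    return [[name for name in declared if name not in item] for item in items]


# Trimmed copy of the schema from this issue (descriptions omitted).
schema = {
    "type": "object",
    "items": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "due_date": {"type": "string"},
        },
    },
}

# Two of the items actually returned in this report.
observed = [{"due_date": "2025-03-15"}, {"due_date": "2025-03-20"}]

print(missing_properties(schema, observed))  # [['title'], ['title']]
```

An empty inner list means the item carries every declared property.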

Error Message and Stack Trace (if applicable)

Here is the LangSmith trace.

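Description

The schema passed to @tool(args_schema=...) declares a title property on each item, but the tool call returned by the model contains only due_date. As the issue title states, _rm_titles removes the title property from the tool's args schema, so the property does not appear to reach the model.
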
Actual Results

[
    {"due_date": "2025-03-15"},
    {"due_date": "2025-03-20"},
    {"due_date": "2025-03-17"},
    {"due_date": "2025-03-25"},
    {"due_date": "2025-03-28"}
]

Expected Results

[
    {"due_date": "2025-03-15", "title": "Project Alpha"},
    {"due_date": "2025-03-20", "title": "Design Update"},
    {"due_date": "2025-03-17", "title": "Code Review"},
    {"due_date": "2025-03-25", "title": "Marketing Plan"},
    {"due_date": "2025-03-28", "title": "Final Testing"}
]
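
The difference between the two results is consistent with a title-stripping pass that cannot tell the JSON Schema title annotation apart from a user-defined property that happens to be named title. The sketch below illustrates that failure mode and one way around it. Both functions are hypothetical and written for this illustration; they are not langchain-core's actual _rm_titles implementation, and for brevity they only walk nested dicts (lists such as anyOf are left untouched).

```python
from typing import Any, Dict


def strip_titles_naive(node: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical: drop every key named 'title', wherever it appears."""
    out: Dict[str, Any] = {}
    for key, value in node.items():
        if key == "title":
            continue  # also drops a *property* named "title"
        out[key] = strip_titles_naive(value) if isinstance(value, dict) else value
    return out


def strip_titles_property_aware(
    node: Dict[str, Any], in_properties: bool = False
) -> Dict[str, Any]:
    """Hypothetical: drop 'title' annotations but keep properties named 'title'."""
    out: Dict[str, Any] = {}
    for key, value in node.items():
        if key == "title" and not in_properties:
            continue  # schema annotation, safe to drop
        if isinstance(value, dict):
            # Keys directly under a "properties" keyword are property names.
            child_is_map = key == "properties" and not in_properties
            out[key] = strip_titles_property_aware(value, child_is_map)
        else:
            out[key] = value
    return out


schema = {
    "title": "extract_data",  # annotation
    "type": "object",
    "properties": {
        "title": {"title": "Title", "type": "string"},  # property + annotation
        "due_date": {"title": "Due Date", "type": "string"},
    },
}

naive = strip_titles_naive(schema)
aware = strip_titles_property_aware(schema)

print(sorted(naive["properties"]))  # ['due_date']
print(sorted(aware["properties"]))  # ['due_date', 'title']
```

Both sketches build new dictionaries rather than editing the input, so the caller's schema is left unchanged either way.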

System Info

Package Information

langchain_core: 0.3.45
langchain: 0.3.20
langsmith: 0.3.15
langchain_anthropic: 0.3.9
langchain_google_genai: 2.0.10
langchain_openai: 0.3.9
langchain_text_splitters: 0.3.7
langgraph_sdk: 0.1.57

Labels

  • Ɑ: core (related to langchain-core)
  • 🤖:bug (related to a bug, vulnerability, unexpected error with an existing feature)
