Structured output parsing cut off by ' #27065

Open
5 tasks done
LudoArno opened this issue Oct 3, 2024 · 2 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@LudoArno

LudoArno commented Oct 3, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from typing import Optional

from pydantic import BaseModel, Field

from langchain_google_vertexai import ChatVertexAI


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")


llm = ChatVertexAI(model="gemini-1.5-flash", project="PROJECT_ID")
structured_llm = llm.with_structured_output(Joke)

structured_llm.invoke("Tell me a joke about cats")

Output: Joke(setup='Why don’t cats play poker?', punchline='Why don', rating=7)

Adding include_raw=True doesn't solve the issue
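
The output above contains a typographic apostrophe (’, U+2019) in `setup` that survived parsing, while `punchline` stops at "Why don". A minimal sketch, using only the standard library, for listing which apostrophe-like characters a string contains. The helper name `quote_chars` is hypothetical, and whether the two characters are treated differently upstream is an open question that the output does not confirm.

```python
import unicodedata


def quote_chars(text: str) -> dict[str, int]:
    """Count apostrophe-like characters in text, keyed by Unicode name."""
    # Straight apostrophe, right/left typographic quotes, and backtick.
    candidates = ("'", "\u2019", "\u2018", "`")
    return {unicodedata.name(ch): text.count(ch) for ch in candidates if ch in text}


# The setup string as it appears in the output above (typographic apostrophe).
setup = "Why don\u2019t cats play poker?"
print(quote_chars(setup))

# The same contraction written with a straight apostrophe.
print(quote_chars("Why don't"))
```

Running this on the full model text, when it is available, would show which character sits at the point where a field is cut.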

Error Message and Stack Trace (if applicable)

No response

Description

I followed the documentation to use structured output with Gemini, but the outputs are parsed incorrectly: string values are cut off at the first ' in the text.

System Info

$ python -m langchain_core.sys_info

System Information
------------------
> OS:  Windows
> OS Version:  10.0.19044
> Python Version:  3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]

Package Information
-------------------
> langchain_core: 0.3.6
> langchain: 0.3.1
> langchain_community: 0.3.1
> langsmith: 0.1.128
> langchain_google_vertexai: 2.0.3
> langchain_openai: 0.2.0
> langchain_text_splitters: 0.3.0

Optional packages not installed
-------------------------------
> langgraph
> langserve

Other Dependencies
------------------
> aiohttp: 3.10.6
> anthropic[vertexai]: Installed. No version info available.
> async-timeout: 4.0.3
> dataclasses-json: 0.6.7
> google-cloud-aiplatform: 1.64.0
> google-cloud-storage: 2.18.2
> httpx: 0.27.2
> httpx-sse: 0.4.0
> jsonpatch: 1.33
> langchain-mistralai: Installed. No version info available.
> numpy: 1.26.4
> openai: 1.43.0
> orjson: 3.10.7
> packaging: 24.1
> pydantic: 2.8.2
> pydantic-settings: 2.5.2
> PyYAML: 6.0.2
> requests: 2.32.3
> SQLAlchemy: 2.0.35
> tenacity: 8.5.0
> tiktoken: 0.7.0
> typing-extensions: 4.12.2

@dosubot (bot) added the 🤖:bug label on Oct 3, 2024
@arindam-giri

Looks like the single quote is causing the problem. Can you run another prompt whose response doesn't contain a single quote? For example: "Tell me an LLM joke".

@LudoArno
Author

LudoArno commented Oct 4, 2024

Sometimes it works: the model outputs \' instead of a plain single quote. However, even when I ask it to use \' or no quotes at all, it generally still writes a plain quote.

json_schema = {
    "title": "joke",
    "description": "Joke to tell user.",
    "type": "object",
    "properties": {
        "setup": {
            "type": "string",
            "description": "The setup of the joke",
        },
        "punchline": {
            "type": "string",
            "description": "The punchline to the joke",
        },
        "rating": {
            "type": "integer",
            "description": "How funny the joke is, from 1 to 10",
            "default": None,
        },
    },
    "required": ["setup", "punchline"],
}
from langchain_google_vertexai import ChatVertexAI

llm = ChatVertexAI(model="gemini-1.5-flash", project="PROJECT_ID")
structured_llm = llm.with_structured_output(json_schema, include_raw=True)
response = structured_llm.invoke("Tell me a joke about cats without any quote and single quote, use \' instead")
response
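
A side note on the prompt string above, offered as a minimal sketch: in a regular Python string literal, `\'` is an escape sequence for a plain single quote, so the model likely received `use ' instead` with no backslash. The variable names below are illustrative only.

```python
# In a regular (non-raw) string literal, \' is an escape sequence for '.
prompt = "use \' instead"
print(prompt)                     # the backslash is not part of the string
print(prompt == "use ' instead")  # True

# To keep a literal backslash, double it or use a raw string.
escaped = "use \\' instead"
raw = r"use \' instead"
print(escaped == raw)             # True
print(raw)                        # the backslash is preserved here
```

If the intent is to ask the model for a backslash-escaped quote, a raw string or a doubled backslash would put the backslash into the prompt. Whether that changes the model's behavior is not something this thread establishes.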

Output of structured_llm.invoke:

{'raw': AIMessage(content='', additional_kwargs={'function_call': {'name': 'joke', 'arguments': '{"rating": 8.0, "punchline": "Why don", "setup": "Why don"}'}}, response_metadata={'is_blocked': False, 'safety_ratings': [{'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability_label': 'NEGLIGIBLE', 'blocked': False, 'severity': 'HARM_SEVERITY_NEGLIGIBLE'}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability_label': 'NEGLIGIBLE', 'blocked': False, 'severity': 'HARM_SEVERITY_NEGLIGIBLE'}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability_label': 'NEGLIGIBLE', 'blocked': False, 'severity': 'HARM_SEVERITY_NEGLIGIBLE'}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability_label': 'NEGLIGIBLE', 'blocked': False, 'severity': 'HARM_SEVERITY_NEGLIGIBLE'}], 'usage_metadata': {'prompt_token_count': 49, 'candidates_token_count': 9, 'total_token_count': 58}, 'finish_reason': 'STOP'}, id='run-0f7e8284-5c75-4989-ab5e-2f5f27d0078b-0', tool_calls=[{'name': 'joke', 'args': {'rating': 8.0, 'punchline': 'Why don', 'setup': 'Why don'}, 'id': '7af76e9d-a07b-4a2e-bc31-432fb409a36d', 'type': 'tool_call'}], usage_metadata={'input_tokens': 49, 'output_tokens': 9, 'total_tokens': 58}),
 'parsed': {'rating': 8.0, 'punchline': 'Why don', 'setup': 'Why don'},
 'parsing_error': None}
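
The raw message above hints at where the truncation happens: the `arguments` string inside `function_call` already contains "Why don", so the text appears to be cut before the structured-output parsing step runs. A minimal sketch of that check, using only the standard library and values copied from the output above. The helper name `args_match_parsed` is hypothetical and not a LangChain API.

```python
import json

# Values copied from the output above.
raw_arguments = '{"rating": 8.0, "punchline": "Why don", "setup": "Why don"}'
parsed = {"rating": 8.0, "punchline": "Why don", "setup": "Why don"}


def args_match_parsed(arguments: str, parsed_output: dict) -> bool:
    """Return True if decoding the raw JSON arguments reproduces the parsed output."""
    return json.loads(arguments) == parsed_output


# True means the parser returned exactly what the raw message contained,
# so the truncated strings were already present in the raw function call.
print(args_match_parsed(raw_arguments, parsed))
```

With a live response, the same comparison could be made between `response["raw"].additional_kwargs["function_call"]["arguments"]` and `response["parsed"]`, the keys shown in the output above. A match would point the investigation at the model response or the Vertex AI integration rather than at the parser.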
