parse_json_markdown is unable to parse json strings with nested triple backticks #5428
Comments
Proposed fix:

import json
import re

def parse_json_markdown(json_string: str) -> dict:
    # Try to find JSON string within first and last triple backticks
    match = re.search(r"""```       # match first occurring triple backticks
                          (?:json)? # zero or one match of string json in non-capturing group
                          (.*)```   # greedy match to last triple backticks""", json_string, flags=re.DOTALL|re.VERBOSE)
    # If no match found, assume the entire string is a JSON string
    if match is None:
        json_str = json_string
    else:
        # If match found, use the content within the backticks
        json_str = match.group(1)
    # Strip whitespace and newlines from the start and end
    json_str = json_str.strip()
    # Parse the JSON string into a Python dictionary while allowing control characters by setting strict to False
    parsed = json.loads(json_str, strict=False)
    return parsed
@schinto It looks like the proposed fix doesn't work either. I have this output returned by the LLM:
{
"action": "Final Answer",
"action_input": "Sure! Here's an example Python code to create an S3 bucket using the Boto3 library:\n\n```python\nimport boto3\n\n# Create an S3 client\ns3 = boto3.client('s3')\n\n# Create a new S3 bucket\nbucket_name = 'your-bucket-name'\ns3.create_bucket(Bucket=bucket_name)\n\n# Print the bucket creation status\nresponse = s3.list_buckets()\nfor bucket in response['Buckets']:\n if bucket['Name'] == bucket_name:\n print('Bucket created successfully!')\n break\n```"
}
And the regex is always matching the second triple backticks (
@yassineselmi The output of the LLM should be enclosed in triple backticks, with the opening fence written as ```json. If these are missing, then the parse_json_markdown function may need further changes.
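To illustrate what such a further change might look like, here is a minimal sketch that tries to parse the whole reply as bare JSON first and only falls back to fence extraction when that fails. The helper name `parse_json_lenient` and the sample reply are hypothetical and not part of LangChain; the fallback reuses the greedy pattern from the proposed fix.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example itself contains no literal fence


def parse_json_lenient(text: str) -> dict:
    """Hypothetical helper: accept bare JSON, otherwise look inside the outer fences."""
    try:
        # Bare JSON (no enclosing fence) parses directly, even if a string
        # value inside it contains fenced code.
        return json.loads(text.strip(), strict=False)
    except json.JSONDecodeError:
        pass
    # Fallback: greedy match from the first fence to the last one.
    match = re.search(FENCE + r"(?:json)?(.*)" + FENCE, text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object or fenced block found")
    return json.loads(match.group(1).strip(), strict=False)


# Made-up reply: JSON whose string value contains a fenced Python snippet.
bare = (r'{"action": "Final Answer", "action_input": "'
        + FENCE + r'python\nprint(1)\n' + FENCE + '"}')
fenced = FENCE + "json\n" + bare + "\n" + FENCE

print(parse_json_lenient(bare)["action"])
print(parse_json_lenient(fenced) == parse_json_lenient(bare))
```

Trying the bare parse first avoids the case reported above, where the regex latches onto the fences inside the string value because no outer fence exists.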
Hi Team, I am facing a similar issue while using GraphSparqlQAChain in LangChain with RDF graph data. The model now creates correct SPARQL queries with the correct intent, but they are enclosed in triple backticks (```). As a result, the SPARQL query execution fails and no insights are generated from prompts. The generated SPARQL looks like:
ParseException: Expected {SelectQuery | ConstructQuery | DescribeQuery | AskQuery}, found '`' (at char 0), (line:1, col:1)
Can anyone kindly help me with this?
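One possible workaround for the SPARQL case is to strip a surrounding fence from the generated query before it reaches the SPARQL parser. The sketch below is only an illustration: `strip_code_fence` is a hypothetical helper, not a LangChain API, and it assumes the opening fence (with an optional language tag) is followed by a newline.

```python
import re

FENCE = "`" * 3  # built programmatically so the example itself contains no literal fence


def strip_code_fence(text: str) -> str:
    """Hypothetical helper: drop one surrounding fence and its optional language tag."""
    pattern = r"^\s*" + FENCE + r"[\w-]*\s*\n(.*?)\n?\s*" + FENCE + r"\s*$"
    match = re.match(pattern, text, flags=re.DOTALL)
    # Text without a surrounding fence is returned unchanged (apart from stripping).
    return match.group(1).strip() if match else text.strip()


# Made-up query text, fenced the way the comment above describes.
generated = FENCE + "sparql\nSELECT ?s WHERE { ?s ?p ?o }\n" + FENCE
print(strip_code_fence(generated))
```

How the cleaned string is then handed to the chain depends on the LangChain version in use and is left out here.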
Hi, @schinto I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, the Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!
Having this issue right now. Does anyone have a fix for this?
Also still having this issue.
Still facing the issue with JsonOutputParser(pydantic_object=JobDescriptionInfoExtract); the JsonOutputParser doesn't work properly with Pydantic.
Hey there, do you know how to make a LangChain agent always return a JSON response in this format, by enclosing it in triple backticks?
System Info
Langchain version 0.0.184, python 3.9.13
Function parse_json_markdown in langchain/output_parsers/json.py fails with input text string:
```json
{
"action": "Final Answer",
"action_input": "Here's a Python script to remove backticks at the beginning and end of a string:\n\n```python\ndef remove_backticks(s):\n return s.strip('`')\n\nstring_with_backticks = '`example string`'\nresult = remove_backticks(string_with_backticks)\nprint(result)\n```\n\nThis script defines a function called `remove_backticks` that takes a string as input and returns a new string with backticks removed from the beginning and end. It then demonstrates how to use the function with an example string."
}
```
Potential cause of error: match.group(2) in the function parse_json_markdown contains only the string up to the second occurrence of triple backticks:
{
"action": "Final Answer",
"action_input": "Here's a Python script to remove backticks at the beginning and end of a string:\n\n
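The truncation can be reproduced with a small experiment. The sketch below assumes the failing pattern has the shape of an optional `json` tag followed by a non-greedy body (suggested by the `match.group(2)` reference above, but not copied from the library source) and compares it with a greedy body. The sample reply is made up.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example itself contains no literal fence

# Made-up LLM reply: fenced JSON whose string value contains a fenced snippet.
inner = (r'{"action": "Final Answer", "action_input": "Run:\n\n'
         + FENCE + r'python\nprint(1)\n' + FENCE + r'\n\nDone."}')
reply = FENCE + "json\n" + inner + "\n" + FENCE

# Assumed shape of the failing pattern: the non-greedy body stops at the next fence.
lazy = re.search(FENCE + r"(json)?(.*?)" + FENCE, reply, re.DOTALL)
# Greedy body: runs to the last fence in the reply.
greedy = re.search(FENCE + r"(?:json)?(.*)" + FENCE, reply, re.DOTALL)

lazy_body = lazy.group(2).strip()
greedy_body = greedy.group(1).strip()

try:
    json.loads(lazy_body)
    lazy_parses = True
except json.JSONDecodeError:
    lazy_parses = False  # the body was cut off inside the string value

print(lazy_parses)
print(json.loads(greedy_body)["action"])
```

A greedy body has its own limitation: if a reply contains several separate fenced blocks, it spans all of them.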
Who can help?
No response
Information
Related Components
Reproduction
Called function parse_json_markdown in langchain/output_parsers/json.py with input text string:
```json
{
"action": "Final Answer",
"action_input": "Here's a Python script to remove backticks at the beginning and end of a string:\n\n```python\ndef remove_backticks(s):\n return s.strip('`')\n\nstring_with_backticks = '`example string`'\nresult = remove_backticks(string_with_backticks)\nprint(result)\n```\n\nThis script defines a function called `remove_backticks` that takes a string as input and returns a new string with backticks removed from the beginning and end. It then demonstrates how to use the function with an example string."
}
```
Expected behavior
Function parse_json_markdown should return the following JSON string:
{
"action": "Final Answer",
"action_input": "Here's a Python script to remove backticks at the beginning and end of a string:\n\n```python\ndef remove_backticks(s):\n return s.strip('`')\n\nstring_with_backticks = '`example string`'\nresult = remove_backticks(string_with_backticks)\nprint(result)\n```\n\nThis script defines a function called `remove_backticks` that takes a string as input and returns a new string with backticks removed from the beginning and end. It then demonstrates how to use the function with an example string."
}
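A different way to reach the expected result, sketched here as an assumption rather than as the library's approach, is to skip fence matching entirely and let the standard library's `json.JSONDecoder.raw_decode` read the first JSON object it can find. The helper name `parse_first_json_object` and the sample reply are hypothetical.

```python
import json

FENCE = "`" * 3  # built programmatically so the example itself contains no literal fence


def parse_first_json_object(text: str) -> dict:
    """Hypothetical helper: decode the first JSON object that appears in text."""
    decoder = json.JSONDecoder(strict=False)
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever follows it (such as the closing fence).
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found")


# Made-up reply shaped like the input in this issue, with a nested fenced snippet.
payload = (r'{"action": "Final Answer", "action_input": "Run:\n\n'
           + FENCE + r'python\nprint(1)\n' + FENCE + r'\n\nDone."}')
reply = FENCE + "json\n" + payload + "\n" + FENCE

result = parse_first_json_object(reply)
print(result["action"])
```

Because the JSON decoder tracks string boundaries itself, backticks inside a string value no longer matter; the trade-off is that any text after the first object is silently ignored.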