@XinyuWuu XinyuWuu commented Nov 7, 2025

User description

LLMs might return code wrapped with ``` or ```yml.


PR Type

Bug fix


Description

  • Enhanced try_fix_yaml to handle code snippets with ```, ```yaml, or ```yml prefixes

  • Updated regex pattern to match optional language identifiers (yaml/yml)

  • Added dynamic prefix detection to correctly remove code fence markers

  • Extended test coverage with three snippet format variations


Diagram Walkthrough

flowchart LR
  A["Code snippet with prefix"] --> B["Updated regex pattern"]
  B --> C["Detect prefix type"]
  C --> D["Remove correct prefix"]
  D --> E["Parse YAML successfully"]

File Walkthrough

Relevant files
Bug fix
utils.py
Enhance regex and prefix handling for code fences               

pr_agent/algo/utils.py

  • Updated snippet_pattern regex from r'```yaml([\s\S]*?)```(?=\s*$|")' to
    r'```(yaml|yml)?([\s\S]*?)```(?=\s*$|")' to match optional language
    identifiers
  • Added dynamic prefix detection logic that determines whether snippet
    starts with ```yaml, ```yml, or plain ```
  • Modified removeprefix() call to use detected prefix instead of
    hardcoded '```yaml'
+7/-2     
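
The utils.py change described above can be sketched roughly as follows. This is a minimal illustration, not the code from pr_agent/algo/utils.py: the helper name extract_snippet_body is hypothetical, and the sketch stops at extracting the fenced text (the PR then passes that text to yaml.safe_load).

```python
import re

# Pattern quoted in the PR: the language tag after the opening fence is optional.
SNIPPET_PATTERN = r'```(yaml|yml)?([\s\S]*?)```(?=\s*$|")'


def extract_snippet_body(response_text: str):
    """Hypothetical helper: return the fenced body, or None if no fence matches."""
    snippet = re.search(SNIPPET_PATTERN, response_text)
    if snippet is None:
        return None
    snippet_text = snippet.group()
    # Check the longer prefixes first so a tagged fence is not treated as a bare one.
    if snippet_text.startswith('```yaml'):
        prefix = '```yaml'
    elif snippet_text.startswith('```yml'):
        prefix = '```yml'
    else:
        prefix = '```'
    # Drop the detected opening fence, then the closing backticks.
    return snippet_text.removeprefix(prefix).rstrip('`')
```

All three fence styles should yield the same body, which is what the added test cases exercise. Note that str.removeprefix requires Python 3.9 or later.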
Tests
test_try_fix_yaml.py
Add test cases for multiple code fence formats                     

tests/unittest/test_try_fix_yaml.py

  • Renamed original test variable from review_text to review_text1 for
    clarity
  • Added review_text2 test case with ```yml prefix format
  • Added review_text3 test case with plain ``` prefix format
  • Updated assertions to test all three snippet format variations
+20/-2   

more robust try_fix_yaml
test extract snippet with prefix "```" or "```yml"
@qodo-merge-for-open-source
Contributor

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
ReDoS vulnerability

Description: The regex pattern r'```(yaml|yml)?([\s\S]*?)```(?=\s*$|")' is vulnerable to ReDoS (Regular
Expression Denial of Service) attacks due to nested quantifiers ([\s\S]*?) combined with
lookahead, which can cause catastrophic backtracking on malicious input with many
backticks.
utils.py [815-815]

Referred Code
snippet_pattern = r'```(yaml|yml)?([\s\S]*?)```(?=\s*$|")'
snippet = re.search(snippet_pattern, '\n'.join(response_text_lines_copy))
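
One way to examine the claim above empirically is to time the search on input made of many fences that the lookahead rejects. The sketch below is only a measurement probe under that assumption: the function name time_search and the input shape are hypothetical, and the code does not by itself establish whether the pattern is exploitable.

```python
import re
import time

SNIPPET_PATTERN = r'```(yaml|yml)?([\s\S]*?)```(?=\s*$|")'


def time_search(fence_count: int):
    """Hypothetical probe: time one search over input built from rejected fences."""
    # Every fence is followed by 'x', so the lookahead (?=\s*$|") fails at each
    # candidate closing fence and the search has to keep scanning.
    text = '```x ' * fence_count
    start = time.perf_counter()
    match = re.search(SNIPPET_PATTERN, text)
    return match, time.perf_counter() - start


if __name__ == "__main__":
    for n in (100, 200, 400):
        match, elapsed = time_search(n)
        print(n, match, f"{elapsed:.4f}s")
```

Comparing the timings as the fence count doubles gives a rough picture of how the cost grows on this kind of input.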
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Consistent Naming Conventions

Objective: All new variables, functions, and classes must follow the project's established naming
standards

Status: Passed

No Dead or Commented-Out Code

Objective: Keep the codebase clean by ensuring all submitted code is active and necessary

Status: Passed

Single Responsibility for Functions

Objective: Each function should have a single, well-defined responsibility

Status: Passed

When relevant, utilize early return

Objective: In a code snippet containing multiple logic conditions (such as 'if-else'), prefer an
early return on edge cases than deep nesting

Status: Passed

Robust Error Handling

Objective: Ensure potential errors and edge cases are anticipated and handled gracefully throughout
the code

Status:
Bare except clause: The code uses bare except: clauses without specifying exception types or logging errors,
which may silently suppress important exceptions.

Referred Code
try:
    data = yaml.safe_load(snippet_text.removeprefix(prefix).rstrip('`'))
    get_logger().info(f"Successfully parsed AI prediction after extracting yaml snippet")
    return data
except:
    pass
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-merge-for-open-source
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

Category    Suggestion    Impact
General
Simplify code by using regex group

Refactor the YAML parsing logic to directly use the second capture group from
the snippet_pattern regex, which simplifies the code by removing manual prefix
and suffix stripping.

pr_agent/algo/utils.py [820-831]

-snippet_text = snippet.group()
-prefix = (
-    '```yaml'
-    if snippet_text.startswith('```yaml')
-    else ('```yml' if snippet_text.startswith('```yml') else '```')
-)
 try:
-    data = yaml.safe_load(snippet_text.removeprefix(prefix).rstrip('`'))
+    # The YAML content is in the second capture group of the regex
+    yaml_content = snippet.group(2)
+    data = yaml.safe_load(yaml_content)
     get_logger().info(f"Successfully parsed AI prediction after extracting yaml snippet")
     return data
 except:
     pass
Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies that using the second capture group from the regex is a more direct and robust way to extract the YAML content, simplifying the code and improving readability.
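
A minimal sketch of the capture-group approach, assuming the same pattern as in the PR. The helper name fenced_body is hypothetical, and the YAML parsing step is left out so the example stays dependency-free.

```python
import re

SNIPPET_PATTERN = r'```(yaml|yml)?([\s\S]*?)```(?=\s*$|")'


def fenced_body(response_text: str):
    """Hypothetical helper: return capture group 2 (the text between the fences)."""
    snippet = re.search(SNIPPET_PATTERN, response_text)
    # Group 1 holds the optional language tag; group 2 holds the body.
    return snippet.group(2) if snippet else None
```

One behavioral difference from the prefix-stripping version: rstrip('`') removes every trailing backtick, whereas group(2) ends exactly at the closing fence, so a body whose last character is itself a backtick keeps that character.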

Medium
Learned best practice
Capture and log exceptions properly

Capture the exception into a variable and log it for debugging. Replace bare
except: with except Exception as e: and log the error details to help diagnose
parsing failures.

pr_agent/algo/utils.py [826-831]

 try:
     data = yaml.safe_load(snippet_text.removeprefix(prefix).rstrip('`'))
     get_logger().info(f"Successfully parsed AI prediction after extracting yaml snippet")
     return data
-except:
+except Exception as e:
+    get_logger().debug(f"Failed to parse yaml snippet: {str(e)}")
     pass
Suggestion importance[1-10]: 6

Why:
Relevant best practice - Improve exception handling by preserving context (raise from), capturing exceptions into variables, sanitizing logs to avoid secrets leakage, and converting returns of exception objects into proper raises.
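
The pattern in this suggestion can be sketched in isolation as below. Everything here is an assumption for illustration: load_or_none is a hypothetical helper, the standard logging module stands in for pr-agent's get_logger(), and the usage passes json.loads only so the sketch runs without PyYAML (in the PR the loader would be yaml.safe_load).

```python
import json
import logging

logger = logging.getLogger("yaml_fix_sketch")  # stand-in for pr-agent's get_logger()


def load_or_none(text, loader):
    """Hypothetical helper: run loader(text); on failure, log the error and return None."""
    try:
        return loader(text)
    except Exception as e:
        # Keep the failure visible at debug level instead of discarding it silently.
        logger.debug("Failed to parse yaml snippet: %s", e)
        return None


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(load_or_none('{"a": 1}', json.loads))
    print(load_or_none('{bad', json.loads))
```

Catching Exception rather than using a bare except also lets KeyboardInterrupt and SystemExit propagate, since those derive from BaseException.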

Low
  • Author self-review: I have reviewed the PR code suggestions, and addressed the relevant ones.

@DanaFineTLV
Collaborator

Hi @LawrenceMantin
Apologies for the delayed response, and thank you for your questions and contributions 🙏

We offer a free trial of Qodo for developers [www.qodo.ai]
and a free version of our paid product for open-source projects [https://www.qodo.ai/solutions/open-source/]. 🚀

We’re currently restructuring the project and contributing it to the community, with plans to move it under a foundation.
If you’re interested in taking part, please reach out to me at [email protected]
or via LinkedIn

We’ll also be launching an Ambassador Program soon — if you’d like to join, stay tuned for more details!
