docs: Update file-editing.mdx for fuzzy matching ladder + apply_patch tool #814
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,7 +5,7 @@ description: "Precise, conflict-safe find-and-replace across files, scoped to a | |
| icon: "file-pen" | ||
| --- | ||
|
|
||
| File editing tools provide secure, workspace-scoped file operations with **precise, conflict-safe** find-and-replace. Ambiguous matches fail loudly instead of editing the wrong occurrence, and a content-hash check prevents lost-update conflicts when files change between read and edit. | ||
| File editing tools provide secure, workspace-scoped file operations with **precise, conflict-safe** find-and-replace. Ambiguous matches fail loudly instead of editing the wrong occurrence, and a content-hash check prevents lost-update conflicts when files change between read and edit. A fuzzy matching ladder makes first-try edits succeed even when `old_string` drifts from the file by whitespace, indentation, or line endings. | ||
|
|
||
| ```mermaid | ||
| graph LR | ||
|
|
@@ -107,7 +107,7 @@ sequenceDiagram | |
| A->>E: edit_file(path, old, new, expected_hash=sha256) | ||
| E->>F: re-read + hash | ||
| F-->>E: current content | ||
| E->>E: ambiguity & hash checks | ||
| E->>E: fuzzy ladder + ambiguity & hash checks | ||
| E->>F: write preserved LF/CRLF/BOM | ||
| E-->>A: success + unified diff | ||
| ``` | ||
|
|
@@ -118,8 +118,9 @@ sequenceDiagram | |
| | **read_file** *(edit_tools)* | Low | No | Read content + SHA-256 hash for staleness checks | | ||
| | **read_file** *(file_tools)* | Low | No | Plain read returning a string | | ||
| | **list_files** | Low | No | Directory listings | | ||
| | **edit_file** | High | Recommended | Precise find-and-replace with ambiguity & staleness guards | | ||
| | **edit_file** | High | Recommended | Precise find-and-replace with fuzzy ladder, ambiguity & staleness guards | | ||
| | **write_file** | High | Recommended | Create/overwrite files | | ||
| | **apply_patch** | High | Recommended | Atomic multi-file Add/Update/Delete with rollback | | ||
|
|
||
| <Warning> | ||
| Two modules export `read_file` with different signatures: | ||
|
|
@@ -132,13 +133,216 @@ Use **edit_tools** when passing `expected_hash` to `edit_file`. | |
|
|
||
| --- | ||
|
|
||
| ## How Fuzzy Matching Works | ||
|
|
||
| `edit_file` walks a deterministic ladder of matching strategies and stops at the **first** strategy that produces a confident match. Exact matches always win; fuzzy strategies only engage when an exact substring is not found. Existing code keeps behaving exactly as before — only previously-failing edits now succeed. | ||
|
Comment on lines +136 to +138:
This section describes a 5-strategy fuzzy matching ladder for
||
|
|
||
| ```mermaid | ||
| graph TB | ||
| subgraph "Fuzzy Matching Ladder" | ||
| A[🤖 Agent old_string] --> S1[1. exact] | ||
| S1 -->|match| OK[✅ Apply] | ||
| S1 -->|miss| S2[2. line_trimmed] | ||
| S2 -->|match| OK | ||
| S2 -->|miss| S3[3. whitespace_normalised] | ||
| S3 -->|match| OK | ||
| S3 -->|miss| S4[4. indentation_flexible] | ||
| S4 -->|match| OK | ||
| S4 -->|miss| S5[5. block_anchor + similarity 0.7] | ||
| S5 -->|match| OK | ||
| S5 -->|miss| ERR[⚠️ String not found] | ||
| OK --> AMB{Single span?} | ||
| AMB -->|yes| W[💾 Write + diff] | ||
| AMB -->|no| AMBERR[⚠️ Ambiguous match] | ||
| end | ||
|
|
||
| classDef agent fill:#8B0000,stroke:#7C90A0,color:#fff | ||
| classDef step fill:#189AB4,stroke:#7C90A0,color:#fff | ||
| classDef ok fill:#10B981,stroke:#7C90A0,color:#fff | ||
| classDef warn fill:#F59E0B,stroke:#7C90A0,color:#fff | ||
|
|
||
| class A agent | ||
| class S1,S2,S3,S4,S5,OK,AMB step | ||
| class W ok | ||
| class ERR,AMBERR warn | ||
| ``` | ||
|
|
||
| | # | Strategy | Tolerates | Example divergence | | ||
| |---|----------|-----------|-------------------| | ||
| | 1 | `exact` | nothing | byte-for-byte match | | ||
| | 2 | `line_trimmed` | leading/trailing whitespace per line | ` return 1` vs `return 1` | | ||
| | 3 | `whitespace_normalised` | collapsed internal whitespace | `x  =  1` vs `x = 1` | ||
| | 4 | `indentation_flexible` | tabs vs spaces, depth | `\treturn 1` vs ` return 1` | | ||
| | 5 | `block_anchor` | structural drift (similarity ≥ 0.7) | first/last lines anchor a fuzzy block | | ||
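As a rough illustration of how the first three rungs in the table could behave, here is a minimal standalone sketch. It is not the library's implementation: `match_with_ladder`, `LINE_STRATEGIES`, and the per-line normalisers are hypothetical names chosen for this example, and rungs 4 and 5 are left out. The sketch only shows the "stop at the first rung that matches, then count spans" idea.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical per-line normalisers standing in for rungs 2 and 3.
LINE_STRATEGIES: List[Tuple[str, Callable[[str], str]]] = [
    ("line_trimmed", str.strip),
    ("whitespace_normalised", lambda s: re.sub(r"\s+", " ", s.strip())),
]


def match_with_ladder(content: str, old: str) -> Tuple[str, int]:
    """Return (strategy_name, match_count) for the first rung that matches."""
    if not old:
        raise ValueError("old must be non-empty")

    # Rung 1: an exact substring match always wins.
    exact_hits = content.count(old)
    if exact_hits:
        return "exact", exact_hits

    # Later rungs compare normalised windows of whole lines.
    file_lines = content.splitlines()
    old_lines = old.splitlines()
    width = len(old_lines)
    for name, normalise in LINE_STRATEGIES:
        target = [normalise(line) for line in old_lines]
        hits = sum(
            1
            for i in range(len(file_lines) - width + 1)
            if [normalise(line) for line in file_lines[i:i + width]] == target
        )
        if hits:
            return name, hits  # stop at the first rung with any match
    return "not_found", 0
```

A caller would treat a count of 1 as safe to apply, a count above 1 as an ambiguous match, and `"not_found"` as the "String not found" case.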
|
|
||
| **Confidence guards (block_anchor only):** | ||
| - Similarity threshold: `0.7` (constant `_BLOCK_ANCHOR_THRESHOLD` in source) | ||
| - Disproportionate-length guard: rejects blocks more than 2× or less than half the `old_string` line count | ||
| - Tie-breaking: two equally-scored candidates are treated as **ambiguous** (not silently picked) | ||
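The three guards above can be sketched as follows. This is an assumption-laden illustration, not the source: `pick_block` and `THRESHOLD` are hypothetical names, and `difflib.SequenceMatcher` is used as a stand-in similarity measure because the document does not say how similarity is computed.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

THRESHOLD = 0.7  # mirrors the documented similarity threshold


def _joined(lines: List[str]) -> str:
    return "\n".join(line.strip() for line in lines)


def pick_block(file_lines: List[str], old_lines: List[str]) -> Optional[Tuple[int, int]]:
    """Return (start, end_exclusive) of the best block, or None if none qualifies.

    Assumes old_lines is non-empty.
    """
    first, last = old_lines[0].strip(), old_lines[-1].strip()
    wanted = len(old_lines)
    scored: List[Tuple[float, int, int]] = []
    for start, line in enumerate(file_lines):
        if line.strip() != first:
            continue  # the first line of old_string anchors the block start
        for end in range(start, len(file_lines)):
            if file_lines[end].strip() != last:
                continue  # the last line of old_string anchors the block end
            length = end - start + 1
            # Disproportionate-length guard: skip blocks over 2x or under half the size.
            if length > 2 * wanted or length < wanted / 2:
                continue
            score = SequenceMatcher(
                None, _joined(file_lines[start:end + 1]), _joined(old_lines)
            ).ratio()
            if score >= THRESHOLD:
                scored.append((score, start, end + 1))
    if not scored:
        return None
    scored.sort(reverse=True)
    # Tie-breaking: equal top scores are ambiguous, never silently picked.
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        raise ValueError("Ambiguous match")
    return scored[0][1], scored[0][2]
```

Raising on a tie rather than taking the first candidate is the conservative choice: it costs the agent one retry with more context, but never edits the wrong block.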
|
|
||
| ```python | ||
| from praisonaiagents.tools.edit_tools import edit_file | ||
|
|
||
| # File uses tabs; old_string uses spaces — still succeeds | ||
| # because indentation_flexible normalises both | ||
| edit_file( | ||
| "src/utils.py", | ||
| old_string=" return value", | ||
| new_string=" return processed_value", | ||
| ) | ||
| ``` | ||
|
|
||
| <Note> | ||
| **Why this matters for coding agents:** LLM-generated `old_string` values routinely drift by whitespace, indentation, or line endings. The fuzzy ladder makes the first attempt succeed when the target is unambiguous, saving retry turns and tokens. | ||
| </Note> | ||
|
|
||
| --- | ||
|
|
||
| ## Multi-file Patches with `apply_patch` | ||
|
|
||
| `apply_patch` lets an agent Add, Update, and Delete multiple files in a single atomic call — all changes succeed together or none are committed. | ||
|
|
||
| ```mermaid | ||
| graph TB | ||
| Q[Need to edit files?] | ||
| Q --> A1{One file?} | ||
| A1 -->|Yes, one change| E[edit_file] | ||
| A1 -->|Yes, multiple changes| EM[apply_patch with one Update] | ||
| A1 -->|Multiple files| AP[apply_patch] | ||
| AP --> R[All-or-nothing atomic] | ||
|
|
||
| classDef q fill:#6366F1,stroke:#7C90A0,color:#fff | ||
| classDef tool fill:#10B981,stroke:#7C90A0,color:#fff | ||
| classDef note fill:#F59E0B,stroke:#7C90A0,color:#fff | ||
|
|
||
| class Q,A1 q | ||
| class E,EM,AP tool | ||
| class R note | ||
| ``` | ||
|
|
||
| <Steps> | ||
| <Step title="Agent Quick Start"> | ||
|
|
||
| ```python | ||
| from praisonaiagents import Agent | ||
|
|
||
| agent = Agent( | ||
| name="Refactor Agent", | ||
| instructions="Refactor across files atomically. Use apply_patch for multi-file changes.", | ||
| tools=["read_file", "search_files", "apply_patch", "edit_file"], | ||
| ) | ||
|
|
||
| agent.start("Rename UserService to AccountService across src/ and update its tests.") | ||
| ``` | ||
|
|
||
| </Step> | ||
|
|
||
| <Step title="Direct SDK use"> | ||
|
|
||
| ```python | ||
| from praisonaiagents.tools.edit_tools import apply_patch | ||
|
|
||
| patch = """*** Update File: src/service.py | ||
| @@ | ||
| class UserService: | ||
| === | ||
| class AccountService: | ||
| *** Update File: tests/test_service.py | ||
| @@ | ||
| from src.service import UserService | ||
| === | ||
| from src.service import AccountService | ||
| *** Delete File: docs/old_userservice.md | ||
| """ | ||
|
|
||
| result = apply_patch(patch) | ||
| print(result) # "Success: Applied patch to 3 file(s) ... <combined diff>" | ||
| ``` | ||
|
|
||
| </Step> | ||
| </Steps> | ||
|
|
||
| ### Patch Format Reference | ||
|
|
||
| | Header | Body format | Purpose | | ||
| |--------|------------|---------| | ||
| | `*** Add File: <path>` | Full file content lines until next header | Create a new file (errors if path already exists) | | ||
| | `*** Update File: <path>` | One or more `@@` hunks (`<old>\n===\n<new>`) | Modify file using fuzzy ladder for each hunk | | ||
| | `*** Delete File: <path>` | (no body) | Remove file (errors if path missing) | | ||
|
|
||
| Optional sentinels `*** Begin Patch` / `*** End Patch` are accepted and stripped. | ||
|
|
||
| **Update hunk syntax:** | ||
|
|
||
| ``` | ||
| *** Update File: path/to/file | ||
| @@ | ||
| <old block to find> | ||
| === | ||
| <new block to replace it with> | ||
| @@ | ||
| <another old block> | ||
| === | ||
| <another new block> | ||
| ``` | ||
|
|
||
| Each `@@` hunk runs through the same fuzzy ladder as `edit_file`, so whitespace/indentation drift in the old block is tolerated. | ||
|
|
||
| ### Atomicity Guarantees | ||
|
|
||
| ```mermaid | ||
| sequenceDiagram | ||
| participant A as 🤖 Agent | ||
| participant AP as ✏️ apply_patch | ||
| participant FS as 📁 Filesystem | ||
|
|
||
| A->>AP: apply_patch(patch_text) | ||
| AP->>AP: Phase 1 — parse + validate | ||
| Note over AP: Compute new content, check existence,<br/>resolve fuzzy hunks | ||
| AP->>FS: Phase 2 — commit (staged temp files) | ||
| alt All commits succeed | ||
| FS-->>AP: ✅ Each os.replace OK | ||
| AP-->>A: Success + combined diff | ||
| else Any commit fails | ||
| AP->>FS: Roll back (LIFO restore backups) | ||
| FS-->>AP: Restored | ||
| AP-->>A: Error: ... | ||
| end | ||
| ``` | ||
|
|
||
| | Behaviour | How it works | | ||
| |-----------|-------------| | ||
| | All-or-nothing | Phase 1 validates every operation and computes new content; Phase 2 commits with staged temp files and `os.replace` | | ||
| | Rollback on failure | If any commit step raises, applied operations are reversed in LIFO order via backup paths | | ||
| | BOM preservation | UTF-8 BOM detected on Update is reapplied on write | | ||
| | Line-ending preservation | CRLF files stay CRLF, LF files stay LF (matches `edit_file`) | | ||
| | UTF-16 rejection | Update on a UTF-16 file fails with a clear error | | ||
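The commit-and-rollback behaviour in the table can be approximated with the standard library alone. The sketch below is an assumption about the general shape, not the actual `apply_patch` code: `commit_all` is a hypothetical helper that takes already-validated content (the output of a Phase 1), stages each write in a temp file, swaps it in with `os.replace`, and restores backups in LIFO order if any step raises.

```python
import os
import tempfile
from typing import Dict, List, Optional, Tuple


def commit_all(new_contents: Dict[str, Optional[str]]) -> None:
    """Write every path (None means delete); on any failure restore all originals."""
    undo: List[Tuple[str, Optional[str]]] = []  # (path, backup path or None if new)
    try:
        for path, text in new_contents.items():
            folder = os.path.dirname(path) or "."
            backup = None
            if os.path.exists(path):
                fd, backup = tempfile.mkstemp(dir=folder)
                os.close(fd)
                os.replace(path, backup)  # move the original aside
            undo.append((path, backup))
            if text is not None:
                fd, staged = tempfile.mkstemp(dir=folder)
                with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
                    handle.write(text)
                os.replace(staged, path)  # atomic swap into place
    except Exception:
        # Roll back in LIFO order so later operations are undone first.
        for path, backup in reversed(undo):
            if backup is not None:
                os.replace(backup, path)
            elif os.path.exists(path):
                os.remove(path)
        raise
    # Everything committed: the backups are no longer needed.
    for _, backup in undo:
        if backup is not None:
            os.remove(backup)
```

Staging in the same directory as the target keeps `os.replace` on one filesystem, which is what makes each individual swap atomic.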
|
|
||
| ### `apply_patch` Error Messages | ||
|
|
||
| | Trigger | Message | | ||
| |---------|---------| | ||
| | Empty / no operations | `Error: Patch contains no operations` | | ||
| | Malformed header / orphan body | `Error: Invalid patch: Unexpected line in patch (expected a section header): ...` | | ||
| | Add target already exists | `Error: Cannot add '<path>': file already exists` | | ||
| | Delete target missing | `Error: Cannot delete '<path>': file not found` | | ||
| | Update target missing | `Error: Cannot update '<path>': file not found` | | ||
| | Empty hunk old-block | `Error: Empty hunk in update for '<path>'` | | ||
| | Hunk not found | `Error: Hunk not found in '<path>': '<preview>'` | | ||
| | Ambiguous hunk | `Error: Ambiguous hunk in '<path>': '<preview>' matches N locations` | | ||
| | UTF-16 file | `Error: Cannot update '<path>': UTF-16 encoding is not supported. Please convert the file to UTF-8.` | | ||
| | Success | `Success: Applied patch to N file(s)\n\n<combined diff>` | | ||
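Because every outcome in the table is a plain string with a recognisable prefix, a caller can branch on it without parsing. The helper below is a hypothetical sketch: `next_action` and the action names it returns are invented for this example, and the retry policy is an assumption rather than documented behaviour.

```python
def next_action(result: str) -> str:
    """Map an apply_patch result string to a suggested follow-up step."""
    if result.startswith("Success:"):
        return "done"
    if "Ambiguous hunk" in result:
        return "add_context"  # widen the old block so it matches one location
    if "Hunk not found" in result:
        return "reread_file"  # the old block has drifted too far; read again
    return "abort"            # structural errors are not fixed by retrying


print(next_action("Error: Hunk not found in 'src/a.py': 'class X'"))
```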
|
|
||
| --- | ||
|
|
||
| ## Configuration Options | ||
|
|
||
| ### File Editing Functions | ||
|
|
||
| | Function | Args | Returns | Notes | | ||
| |----------|------|---------|-------| | ||
| | `edit_file` | `filepath`, `old_string`, `new_string`, `replace_all=False`, `expected_hash=None` | `str` | High-risk; fails on ambiguous match unless `replace_all=True` | | ||
| | `edit_file` | `filepath`, `old_string`, `new_string`, `replace_all=False`, `expected_hash=None` | `str` | High-risk; fuzzy ladder + fails on ambiguous match unless `replace_all=True` | | ||
| | `apply_patch` | `patch: str` | `str` | High-risk; atomic multi-file Add/Update/Delete with rollback | | ||
| | `read_file` *(edit_tools)* | `filepath` | `Tuple[str, str]` | `(content, sha256_hex)` for staleness checks | | ||
| | `read_file` *(file_tools)* | `filepath`, `encoding='utf-8'` | `str` | Simple read, no hash | | ||
| | `search_files` | `directory`, `pattern`, `file_pattern='*'` | JSON string | Case-insensitive substring search | | ||
|
|
@@ -176,6 +380,10 @@ edit_file("config.py", "DEBUG = False", "DEBUG = True", expected_hash=h) | |
| | Ambiguous match | `Error: Ambiguous match - '{preview}' occurs {N} times. Please provide more surrounding context to make the match unique, or use replace_all=True to replace all occurrences.` | | ||
| | Success | `Success: Made {N} replacement(s) in {filepath}\n\nDiff:\n{diff}` | | ||
|
|
||
| <Warning> | ||
| An "Ambiguous match" error can also fire when **fuzzy strategies** produce more than one candidate location (e.g. whitespace-normalised matches at two places). Fix by adding more surrounding context to `old_string`. | ||
| </Warning> | ||
|
|
||
| <Warning> | ||
| If a file contains mixed line endings, any CRLF present causes the file to be normalised to CRLF on save. | ||
| </Warning> | ||
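The CRLF rule in the warning above, together with the BOM preservation mentioned elsewhere in this page, can be sketched like this. The functions `detect_style` and `restore_style` are hypothetical names for this example and assume the file was read as text without newline translation (for instance with `newline=""`); the real implementation may differ.

```python
from typing import Tuple

BOM = "\ufeff"


def detect_style(raw: str) -> Tuple[bool, str]:
    """Return (has_bom, newline) for text read without newline translation."""
    has_bom = raw.startswith(BOM)
    # Any CRLF present means the whole file is treated as CRLF.
    newline = "\r\n" if "\r\n" in raw else "\n"
    return has_bom, newline


def restore_style(text: str, has_bom: bool, newline: str) -> str:
    """Reapply the detected BOM and line-ending style to edited text."""
    body = text.lstrip(BOM).replace("\r\n", "\n")
    if newline == "\r\n":
        body = body.replace("\n", "\r\n")
    return (BOM if has_bom else "") + body
```

With this shape, a file containing `"a\r\nb\nc\n"` is detected as CRLF and written back with CRLF on every line, which is the normalisation the warning describes.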
|
|
@@ -218,6 +426,26 @@ result = edit_file("config.py", "DEBUG = False", "DEBUG = True", expected_hash=h | |
| # Returns stale-file error if content changed — re-read and retry | ||
| ``` | ||
|
|
||
| ### Atomic Multi-file Rename | ||
|
|
||
| ```python | ||
| from praisonaiagents.tools.edit_tools import apply_patch | ||
|
|
||
| result = apply_patch("""*** Begin Patch | ||
| *** Update File: src/auth.py | ||
| @@ | ||
| class UserService: | ||
| === | ||
| class AccountService: | ||
| *** Update File: tests/test_auth.py | ||
| @@ | ||
| from src.auth import UserService | ||
| === | ||
| from src.auth import AccountService | ||
| *** End Patch | ||
| """) | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Best Practices | ||
|
|
@@ -243,6 +471,14 @@ Include surrounding context so the match is unambiguous; use `replace_all=True` | |
| CRLF files stay CRLF, LF files stay LF, UTF-8 BOM is preserved. UTF-16 files are rejected with a clear error. | ||
| </Accordion> | ||
|
|
||
| <Accordion title="Use apply_patch for multi-file changes"> | ||
| Use `apply_patch` when changes span multiple files and must succeed/fail together (rename, refactor, dependency bump). Use `edit_file` when changing one file in one place — it returns a focused diff and avoids patch syntax overhead. | ||
| </Accordion> | ||
|
|
||
| <Accordion title="Patch hunk format is not unified diff"> | ||
| The patch hunk format uses `@@` to separate hunks and `===` to separate old from new — this is not unified diff format. Add/Delete sections must not contain `@@`/`===` markers. | ||
| </Accordion> | ||
|
|
||
| <Accordion title="Workspace Security"> | ||
| File operations respect workspace boundaries. Paths outside the workspace are rejected to prevent directory traversal. | ||
| </Accordion> | ||
|
|
||
Review comment:
The documentation references `apply_patch` as an available tool in `praisonaiagents.tools.edit_tools`. However, `apply_patch` is not implemented or exported in `praisonaiagents/tools/edit_tools.py` in this codebase. Importing or using this tool will result in an `ImportError`. Please ensure the implementation of `apply_patch` is included in this pull request, or remove/defer this documentation until the feature is implemented.