fix: handle UnicodeDecodeError race condition in copaw-worker #791

Open
nillikechatchat wants to merge 3 commits into agentscope-ai:main from nillikechatchat:fix/copaw-worker-unicode-decode-race

@nillikechatchat
Contributor

Summary

Fix for Issue #728: UnicodeDecodeError race condition in copaw-worker.

Problem

When mc mirror reports that file synchronization is complete, the file may not yet be fully written to disk, so a read can observe truncated content. If the truncation falls inside a multi-byte UTF-8 character at the end of the file, decoding fails with UnicodeDecodeError.
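
To illustrate the failure mode, here is a minimal sketch that decodes a byte string cut off in the middle of a multi-byte character. The sample text is hypothetical and stands in for the tail of a file such as SOUL.md; it is not taken from the repository.

```python
# Hypothetical file content that ends in a multi-byte UTF-8 character.
text = "SOUL 设定"
data = text.encode("utf-8")

# Simulate a read that sees the file before its last byte is on disk.
partial = data[:-1]

try:
    partial.decode("utf-8")
    outcome = "decoded"
except UnicodeDecodeError as exc:
    # The decoder refuses an incomplete multi-byte sequence.
    outcome = f"UnicodeDecodeError: {exc.reason}"

# Once the full content is visible, the same decode succeeds.
assert data.decode("utf-8") == text
print(outcome)
```

Note that a file truncated exactly on a character boundary decodes without error, so this exception only signals one kind of partial read.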

Solution

Added _read_text_with_retry() helper function in both worker.py and sync.py that:

  • Catches UnicodeDecodeError during file reads
  • Retries up to 5 times with 100ms delay between attempts
  • Logs debug info on retries, warning on final failure

Files Changed

  • copaw/src/copaw_worker/worker.py - Added retry logic for SOUL.md, AGENTS.md, HEARTBEAT.md reads
  • copaw/src/copaw_worker/sync.py - Added retry logic for team_id.txt and team_leader.txt reads

Fixes #728

nillikechatchat and others added 2 commits May 10, 2026 04:26
…codeDecodeError race

This fixes the race condition between mc mirror reporting completion and the file
being fully written to disk. The race can cause partial reads of multi-byte UTF-8
characters at the end of the file, resulting in UnicodeDecodeError crashes during
worker startup.

Fixes: agentscope-ai#728

Changes:
- Add _read_text_with_retry() helper function in both worker.py and sync.py
- The function retries up to 5 times with 100ms delay on UnicodeDecodeError
- Apply retry mechanism to all read_text() calls for AGENTS.md, SOUL.md, and openclaw.json
- Add debug logging for retry attempts to improve observability
@github-actions
Contributor

github-actions Bot commented May 11, 2026

📊 CI Metrics Report

Summary

| Metric | Current | Baseline | Change |
| --- | --- | --- | --- |
| LLM Calls | 79 | 86 | -7 ↓ -8.1% |
| Input Tokens | 2917313 | 3825223 | -907910 ↓ -23.7% |
| Output Tokens | 17131 | 19253 | -2122 ↓ -11.0% |
| Total Tokens | 2934444 | 3844476 | -910032 ↓ -23.7% |

By Role

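| Role | Metric | Current | Baseline | Change |
| --- | --- | --- | --- | --- |
| 🧠 Manager | LLM Calls | 69 | 75 | -6 ↓ -8.0% |
| 🧠 Manager | Input Tokens | 2685156 | 3572217 | -887061 ↓ -24.8% |
| 🧠 Manager | Output Tokens | 15069 | 16994 | -1925 ↓ -11.3% |
| 🧠 Manager | Total Tokens | 2700225 | 3589211 | -888986 ↓ -24.8% |
| 🔧 Workers | LLM Calls | 10 | 11 | -1 ↓ -9.1% |
| 🔧 Workers | Input Tokens | 232157 | 253006 | -20849 ↓ -8.2% |
| 🔧 Workers | Output Tokens | 2062 | 2259 | -197 ↓ -8.7% |
| 🔧 Workers | Total Tokens | 234219 | 255265 | -21046 ↓ -8.2% |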

Per-Test Breakdown

| Test | Mgr Calls | Wkr Calls | Δ Calls | Mgr In | Wkr In | Mgr Out | Wkr Out | Δ Tokens | Trend |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 02-create-worker | 15 | 0 | -2 ↓ -11.8% | 480807 | 0 | 2812 | 0 | -52952 ↓ -9.9% | ✅ improved |
| 03-assign-task | 10 | 5 | 0 — 0% | 356303 | 114248 | 1626 | 807 | -110907 ↓ -19.0% | — unchanged |
| 04-human-intervene | 14 | 0 | +1 ↑ +7.7% | 467979 | 0 | 2992 | 0 | -72647 ↓ -13.4% | ⚠️ regressed |
| 05-heartbeat | 8 | 0 | +1 ↑ +14.3% | 316120 | 0 | 1844 | 0 | -24399 ↓ -7.1% | ⚠️ regressed |
| 06-multi-worker | 22 | 5 | -7 ↓ -20.6% | 1063947 | 117909 | 5795 | 1255 | -649127 ↓ -35.3% | ✅ improved |

Trends

2 test(s) improved (fewer LLM calls)
⚠️ 2 test(s) regressed (more LLM calls)


Generated by HiClaw CI on 2026-05-11 10:51:04 UTC


📦 Download debug logs & test artifacts

@github-actions
Contributor

github-actions Bot commented May 11, 2026

❌ Integration Tests Failed (llm-interaction / mgr=copaw / wk=copaw)

Commit: 20e4f9d
Workflow run: #1018

Test Results
No test output captured.
Debug Log (tail)
No debug logs available.

📦 Download full debug logs & test artifacts

@github-actions
Contributor

❌ Integration Tests Failed (llm-interaction-2 / mgr=copaw / wk=copaw)

Commit: 1f4341f
Workflow run: #1013

Test Results
No test output captured.
Debug Log (tail)
No debug logs available.

📦 Download full debug logs & test artifacts

@github-actions
Contributor

❌ Integration Tests Failed (llm-interaction / mgr=copaw / wk=hermes)

Commit: 1f4341f
Workflow run: #1013

Test Results
No test output captured.
Debug Log (tail)
No debug logs available.

📦 Download full debug logs & test artifacts

@flystar32 flystar32 self-requested a review May 11, 2026 08:09

Development

Successfully merging this pull request may close these issues.

[Bug] copaw-worker raises UnicodeDecodeError at startup when reading SOUL.md/AGENTS.md: file-stability race between mc mirror and read_text
