Skip to content

Fix non-range HTTP download: downloaded bytes exceeding total file size#1307

Closed
Copilot wants to merge 5 commits into
mainfrom
copilot/fix-download-issue-without-content-length
Closed

Fix non-range HTTP download: downloaded bytes exceeding total file size#1307
Copilot wants to merge 5 commits into
mainfrom
copilot/fix-download-issue-without-content-length

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 11, 2026

When a server doesn't return Content-Length or doesn't support Accept-Ranges, pause/resume causes downloaded bytes to grow unboundedly past the actual file size (e.g. 81 MB / 35 MB at 231%).

Root cause

On resume, the server resends from byte 0 (no Range support), but the code wrote at the previously accumulated offset (conn.Chunk.Downloaded), appending a second copy instead of overwriting. Each pause/resume cycle roughly doubled the file. This bug occurs sporadically because it requires the user to pause and resume a download from a server that doesn't support Range requests.

Non-range retry writes at wrong offset

Each retry in downloadChunkOnce makes a request without Range header (server doesn't support it), so the server sends from byte 0. But the code writes at conn.Chunk.Downloaded (accumulated from prior attempts), appending instead of overwriting. Every retry roughly doubles the file.

Fix: Reset conn.Chunk.Downloaded and conn.Downloaded to 0 at the start of downloadChunkOnce and runConnectionFallback for non-range downloads, since the server restarts the transfer each time.

No size limit enforcement for non-range downloads

downloadChunkOnce only checks chunk.remain() when Range=true. When Size > 0 but Range=false, data is written without bound.

Fix: Add size limit checks in all three download loops (downloadChunkOnce, runConnectionWithResolveResp, runConnectionFallback) when Size > 0, regardless of range support.

Premature completion masking the error

onDownloadComplete uses chunk.remain() to detect incomplete chunks. For non-range chunks (0,0), any downloaded bytes make remain() negative, so a timed-out connection is incorrectly treated as successfully completed.

Fix: For non-range downloads, use conn.Completed (set only on EOF) instead of chunk.remain() to determine completion.

Data race in expandConnections

go f.waitForCompletion() launched an orphaned goroutine that could race with doStart() writing f.ctx on resume. This goroutine was redundant—the downloadLoop slow-start loop already calls waitForCompletion() after checking stateSteady.

Test

  • TestFetcher_DownloadNoRangePauseResume — targeted test that uses StartTestCustomServer() (no Content-Length, no Accept-Ranges) to simulate real-world servers like GitHub's zip downloads, performs pause/resume, then asserts that the downloaded file size does not exceed the source file size and that the file content matches via MD5.
Original prompt

This section details on the original issue you should resolve

<issue_title>bug</issue_title>
<issue_description>Image
so it is a bug?</issue_description>

<agent_instructions>当http下载服务器没有返回Content-Length的情况下,文件下载会出现无法停止的情况,表现为文件总大小已经更新,但是仍然在下载并且已下载字节数会超过文件总大小,看看这个bug什么情况下会发生,并且修复,另外加上对应的单元测试案例</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@monkeyWie It seems so.

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI and others added 2 commits March 11, 2026 02:50
When HTTP server doesn't return Content-Length or doesn't support Range
requests, the download can accumulate extra bytes on pause/resume because:
1. Each retry/resume re-downloads from the beginning but writes at the
   accumulated offset
2. No size limit enforcement for non-range downloads
3. onDownloadComplete incorrectly uses chunk.remain() for non-range
   downloads, which gives negative values for chunk(0,0)

Fixes:
- Reset chunk state in downloadChunkOnce and runConnectionFallback for
  non-range downloads since server will send from beginning
- Add size limit checks in download loops when Size is known
- Fix onDownloadComplete to use conn.Completed for non-range downloads
  instead of unreliable chunk.remain()
- Add test cases for non-range download, pause/resume, and size limit

Co-authored-by: monkeyWie <13160176+monkeyWie@users.noreply.github.com>
… goroutine

The go f.waitForCompletion() goroutine in expandConnections was redundant
since the downloadLoop's slow start loop already handles stateSteady by
calling waitForCompletion directly. The orphaned goroutine could race with
doStart() when writing f.ctx during pause/resume.

Co-authored-by: monkeyWie <13160176+monkeyWie@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix bug with file download when Content-Length is missing Fix non-range HTTP download: downloaded bytes exceeding total file size Mar 11, 2026
@monkeyWie monkeyWie marked this pull request as ready for review March 11, 2026 05:26
@monkeyWie monkeyWie self-requested a review as a code owner March 11, 2026 05:26
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 52.94118% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.58%. Comparing base (1907f40) to head (b8e517b).

Files with missing lines Patch % Lines
internal/protocol/http/fetcher.go 46.42% 24 Missing and 6 partials ⚠️
internal/test/httptest.go 83.33% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1307      +/-   ##
==========================================
- Coverage   73.70%   73.58%   -0.13%     
==========================================
  Files          48       48              
  Lines        6192     6253      +61     
==========================================
+ Hits         4564     4601      +37     
- Misses       1208     1228      +20     
- Partials      420      424       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@monkeyWie
Copy link
Copy Markdown
Member

@copilot 不对哦,我抓包看那个http请求都没有返回Content-Length,所以你这个单测也模拟的不对,你可以试试这个链接:https://github.com/GopeedLab/gopeed/archive/refs/heads/main.zip

…l bug scenario

Remove StartTestNoRangeServer and its tests since the real bug is about
servers that don't return Content-Length at all (like GitHub's zip downloads),
not servers with Content-Length but no Range support.

The chunked tests (TestFetcher_DownloadChunkedContinue and
TestFetcher_DownloadChunkedNotExceedSize) correctly simulate the real-world
bug using StartTestCustomServer which serves files without Content-Length.

Co-authored-by: monkeyWie <13160176+monkeyWie@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 11, 2026

@copilot 不对哦,我抓包看那个http请求都没有返回Content-Length,所以你这个单测也模拟的不对,你可以试试这个链接:https://github.com/GopeedLab/go...

已修复,移除了 StartTestNoRangeServer(它错误地返回了 Content-Length),以及使用它的 3 个测试。现在测试都使用 StartTestCustomServer(),它不返回 Content-Length 也不支持 Range,正确模拟了 GitHub zip 下载的场景。核心 bug 修复保持不变——TestFetcher_DownloadChunkedContinueTestFetcher_DownloadChunkedNotExceedSize 验证了 pause/resume 后文件内容正确(MD5 校验)。提交: 8b02844

@monkeyWie
Copy link
Copy Markdown
Member

@copilot Chunked下载之前已经有单测了,不需要再添加了,要分析这种bug偶发出现的根本原因然后修复,然后如果能针对性的对于这个bug编写一个单测才有意义

…test

Remove TestFetcher_DownloadChunkedContinue (redundant with existing tests).
Replace TestFetcher_DownloadChunkedNotExceedSize with
TestFetcher_DownloadNoRangePauseResume which specifically targets the bug:
- Documents the root cause (non-range resume writes at accumulated offset
  instead of byte 0)
- Asserts downloaded file size does not exceed source file size (the exact
  bug symptom from the issue)
- Verifies file integrity via MD5

Co-authored-by: monkeyWie <13160176+monkeyWie@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 12, 2026

@copilot Chunked下载之前已经有单测了,不需要再添加了,要分析这种bug偶发出现的根本原因然后修复,然后如果能针对性的对于这个bug编写一...

已移除冗余的 TestFetcher_DownloadChunkedContinue,替换为一个针对性的单测 TestFetcher_DownloadNoRangePauseResume

根本原因: 非 Range 下载在 pause/resume 后,服务器从 byte 0 重新发送数据,但代码在之前累积的偏移量处写入,导致数据被追加而不是覆盖,文件大小无限增长。

单测验证: 使用不返回 Content-Length 的服务器,暂停后恢复,断言下载文件大小不超过源文件大小(即 bug 的具体表现),并校验 MD5。

提交: 0bd3067

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug

2 participants