Skip to content

bfd: fix partial write handling in bwritev()#2975

Open
3idey wants to merge 1 commit intocheckpoint-restore:criu-devfrom
3idey:fix-bwritev-partial-writes
Open

bfd: fix partial write handling in bwritev()#2975
3idey wants to merge 1 commit intocheckpoint-restore:criu-devfrom
3idey:fix-bwritev-partial-writes

Conversation

@3idey
Copy link
Copy Markdown
Contributor

@3idey 3idey commented Mar 19, 2026

When bfd is unbuffered, bwritev() used writev() directly, which can return partial writes and leave checkpoint images incomplete.

This patch fixes this by ensuring writev() is retried until all iovecs are fully written or an error occurs.

Changes:

  • Removes const from the iov parameter in bwritev() (and updates the header) to allow direct pointer manipulation.
  • Handles partial writes by temporarily adjusting the iov_base and iov_len of the current iovec and advancing the iov pointer and cnt as data is successfully written.
  • Avoids zero-progress loops and extra memory allocations by tracking the offset inline.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 80.95238% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.12%. Comparing base (b7f6b72) to head (ca472b8).
⚠️ Report is 674 commits behind head on criu-dev.

Files with missing lines Patch % Lines
criu/bfd.c 80.95% 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           criu-dev    #2975      +/-   ##
============================================
- Coverage     57.76%   57.12%   -0.64%     
============================================
  Files           142      154      +12     
  Lines         37664    40326    +2662     
  Branches          0     8845    +8845     
============================================
+ Hits          21758    23038    +1280     
- Misses        15906    17028    +1122     
- Partials          0      260     +260     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens CRIU’s bfd unbuffered write path to ensure bwritev() reliably writes the full iovec payload (retrying writev() as needed), preventing incomplete checkpoint images due to partial writev() results.

Changes:

  • Add bfd_writev_all() helper to retry writev() until all iovecs are consumed or an error occurs.
  • In unbuffered bwritev(), duplicate the iovec array so progress-tracking modifications don’t affect the caller’s iovecs.
  • Return early for cnt == 0 and treat writev() returning 0 as an error to avoid no-progress loops.

@3idey 3idey force-pushed the fix-bwritev-partial-writes branch from b11037d to b2230a8 Compare March 20, 2026 05:08
@3idey
Copy link
Copy Markdown
Contributor Author

3idey commented Mar 29, 2026

@avagin Could you please take a look at this?

@3idey 3idey force-pushed the fix-bwritev-partial-writes branch 2 times, most recently from c675610 to 18de5f8 Compare March 30, 2026 08:12
@3idey 3idey force-pushed the fix-bwritev-partial-writes branch 2 times, most recently from 3ccf9c8 to 52a8ef9 Compare March 30, 2026 21:44
@3idey 3idey requested a review from avagin March 30, 2026 23:31
When bfd is unbuffered, bwritev() used writev() directly, which can return
partial writes and leave checkpoint images incomplete.

Fix this by retrying writev() until all iovecs are written or an error occurs.
Temporarily modify the first iovec's base and length to track progress on
partial writes, then restore it before the next iteration.

Changes the function signature to accept non-const iovec, as we need to
modify it to track progress (same pattern as write_all() with buffers).

Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
@3idey 3idey force-pushed the fix-bwritev-partial-writes branch from 8366a67 to ca472b8 Compare March 31, 2026 13:19
@3idey
Copy link
Copy Markdown
Contributor Author

3idey commented Apr 1, 2026

@avagin updated! let me know if this looks good to you.

* bytes from previous partial writes, then restore it before
* the next iteration.
*/
iov[0].iov_base = (char *)iov[0].iov_base + off;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to convert it to (char *)?


written += ret;
while (ret) {
if (iov[0].iov_len > ret) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here is a bug. we adjusted iov_len after the write syscall. We probably need to do ret += off before this loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants