Skip to content

page-xfer: write zero pages when parent hole is missing#2997

Draft
rst0git wants to merge 2 commits intocheckpoint-restore:criu-devfrom
rst0git:maps_newpage
Draft

page-xfer: write zero pages when parent hole is missing#2997
rst0git wants to merge 2 commits intocheckpoint-restore:criu-devfrom
rst0git:maps_newpage

Conversation

@rst0git
Copy link
Copy Markdown
Member

@rst0git rst0git commented Apr 6, 2026

During incremental dump with pre-dumps, pages allocated between pre-dumps may have their soft-dirty bit clear (the kernel zero-fills new mmap'd pages without setting it). CRIU classifies these as PE_PARENT holes, but since the page is newly allocated, no parent has it.

This causes restore to fail with "Missing in parent pagemap" (#2995) because write_pagemap_loc() wrote a broken PE_PARENT entry referencing data that does not exist in any parent image.

To fix this, write zero-filled PE_PRESENT pages instead, which is correct because new mmap'd pages with a clear soft-dirty bit are guaranteed to be zero-filled by the kernel.

rst0git added 2 commits April 6, 2026 16:07
During incremental dump with pre-dumps, pages allocated between
pre-dumps may have their soft-dirty bit clear (the kernel zero-fills
new mmap'd pages without setting it). CRIU classifies these as
PE_PARENT holes, but since the page is newly allocated, no parent
has it.

Previously this caused restore to fail with "Missing in parent
pagemap" because write_pagemap_loc() wrote a broken PE_PARENT
entry referencing data that does not exist in any parent image.

Write zero-filled PE_PRESENT pages instead, which is correct because
new mmap'd pages with a clear soft-dirty bit are guaranteed to be
zero-filled by the kernel.

Assisted-by: Claude Code:claude-opus-4-6

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Pages allocated between a pre-dump and the final dump may have their
soft-dirty bit clear (the kernel zero-fills new anonymous pages
without setting it). CRIU classifies these as PE_PARENT holes, but
no parent image contains them.

Add a transition test that mmaps new anonymous pages after the
pre-dump completes, then verifies they survive checkpoint/restore.

Run with: sudo ./test/zdtm.py run -t zdtm/transition/maps_newpage --pre 1

Assisted-by: Claude Code:claude-opus-4-6

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes incremental dump (pre-dump) restore failures where CRIU emits PE_PARENT entries for pages that are not present in any parent pagemap by falling back to writing zero-filled PE_PRESENT pages, and adds a ZDTM transition test to exercise the scenario.

Changes:

  • criu/page-xfer.c: When a PE_PARENT hole is missing in the parent pagemap, emit a PE_PRESENT entry and write zero page data instead of failing the dump.
  • test/zdtm/transition/maps_newpage.c + .desc: Add a new transition test for “soft-dirty cleared after pre-dump” pages.
  • test/zdtm/transition/Makefile: Register the new test target.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
criu/page-xfer.c Adds a fallback path to avoid broken PE_PARENT references by writing zero-filled present pages.
test/zdtm/transition/maps_newpage.c New ZDTM transition test covering the missing-parent-hole scenario.
test/zdtm/transition/maps_newpage.desc Enables the test with pre-dump-notify and required flags.
test/zdtm/transition/Makefile Adds maps_newpage to the transition test list.

Comment on lines +367 to +369
ret = write(img_raw_fd(xfer->pi),
zero_page, PAGE_SIZE);
if (ret != PAGE_SIZE) {
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The zero-page write uses plain write() and treats any short write/EINTR as fatal (ret != PAGE_SIZE). Since the pages image FD is a regular file, write() can legally return partial progress; this should use write_all() (or write_img_buf()) and handle/propagate errors consistently to avoid spurious failures and partially written images on interruption.

Suggested change
ret = write(img_raw_fd(xfer->pi),
zero_page, PAGE_SIZE);
if (ret != PAGE_SIZE) {
ret = write_all(img_raw_fd(xfer->pi),
zero_page,
PAGE_SIZE);
if (ret < 0) {

Copilot uses AI. Check for mistakes.
Comment on lines +345 to +376
* Page not found in parent. This can happen
* when a page was allocated between pre-dumps
* and the soft-dirty bit is clear (kernel
* zero-fills new pages without setting it).
* Write zero pages instead of a broken parent
* reference that would fail on restore.
*/
unsigned long i, nr = iov->iov_len / PAGE_SIZE;
static const char zero_page[PAGE_SIZE];

pr_warn("Hole %p - %p not found in parent, "
"writing %lu zero page(s)\n",
iov->iov_base,
iov->iov_base + iov->iov_len, nr);

pe.flags = PE_PRESENT;
flags = PE_PRESENT;

if (pb_write_one(xfer->pmi, &pe, PB_PAGEMAP) < 0)
return -1;

for (i = 0; i < nr; i++) {
ret = write(img_raw_fd(xfer->pi),
zero_page, PAGE_SIZE);
if (ret != PAGE_SIZE) {
pr_perror("Can't write zero page");
return -1;
}
}

cnt_add(CNT_PAGES_WRITTEN, nr);
return 0;
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fallback turns a PE_PARENT hole missing in the parent into PE_PRESENT and writes a zero-filled page payload without verifying the dumped process page contents are actually zero. That changes semantics from “fail restore” to “restore succeeds but memory is silently zeroed”, which can corrupt non-zero pages (e.g., if soft-dirty was cleared after writes). Consider either (a) dumping the actual page contents, or (b) making this path conditional on a proven zero-page condition and logging explicitly that data is being discarded.

Suggested change
* Page not found in parent. This can happen
* when a page was allocated between pre-dumps
* and the soft-dirty bit is clear (kernel
* zero-fills new pages without setting it).
* Write zero pages instead of a broken parent
* reference that would fail on restore.
*/
unsigned long i, nr = iov->iov_len / PAGE_SIZE;
static const char zero_page[PAGE_SIZE];
pr_warn("Hole %p - %p not found in parent, "
"writing %lu zero page(s)\n",
iov->iov_base,
iov->iov_base + iov->iov_len, nr);
pe.flags = PE_PRESENT;
flags = PE_PRESENT;
if (pb_write_one(xfer->pmi, &pe, PB_PAGEMAP) < 0)
return -1;
for (i = 0; i < nr; i++) {
ret = write(img_raw_fd(xfer->pi),
zero_page, PAGE_SIZE);
if (ret != PAGE_SIZE) {
pr_perror("Can't write zero page");
return -1;
}
}
cnt_add(CNT_PAGES_WRITTEN, nr);
return 0;
* A missing parent hole cannot be safely turned
* into a present zero page here. Without
* verifying the dumped task memory contents, that
* would silently discard non-zero data and
* corrupt the restored address space.
*/
pr_err("Hole %p - %p not found in parent, "
"refusing to synthesize zero pages\n",
iov->iov_base,
iov->iov_base + iov->iov_len);
return -1;

Copilot uses AI. Check for mistakes.
Comment on lines +77 to +89
/*
* Map new pages in a separate VMA and write a pattern.
* This creates real page backing (not the kernel zero page).
*/
new_mem = mmap(NULL, new_size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (new_mem == MAP_FAILED) {
pr_perror("mmap new");
test_wait_pre_dump_ack();
return 1;
}
memset(new_mem, 0xCD, new_size);

Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test writes 0xCD into the “new pages” mapping and then asserts the restored pages are 0x00, which enshrines “data loss is expected” behavior. This seems inconsistent with the PR description’s rationale that these pages are guaranteed zero-filled; if the intended scenario is “newly faulted but never written” pages, consider faulting them in without modifying contents and asserting they stay zero after restore so the test validates the stated guarantee rather than intentional corruption.

Copilot uses AI. Check for mistakes.
@rst0git rst0git marked this pull request as draft April 6, 2026 15:41
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.17%. Comparing base (f943d4f) to head (6c795de).

Files with missing lines Patch % Lines
criu/page-xfer.c 0.00% 12 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           criu-dev    #2997      +/-   ##
============================================
- Coverage     57.21%   57.17%   -0.04%     
============================================
  Files           154      154              
  Lines         40400    40408       +8     
  Branches       8857     8860       +3     
============================================
- Hits          23115    23104      -11     
- Misses        17021    17044      +23     
+ Partials        264      260       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@avagin
Copy link
Copy Markdown
Member

avagin commented Apr 7, 2026

the kernel zero-fills new mmap'd pages without setting it

We should detect zero pages... Can you be more precise when it can happen? How to reproduce this case from user-space?

/*
* After a pre-dump, soft-dirty bits are cleared. If new pages are
* then written and soft-dirty is cleared again (e.g. by a device
* driver or by writing to /proc/self/clear_refs), CRIU sees
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

soft-dirty is cleared again

Only CRIU has to clear soft-dirty bits, otherwise we can't guarantee consistent memory snapshot. I don't like the idea of using /proc/self/clear_refs in the test.

@avagin
Copy link
Copy Markdown
Member

avagin commented Apr 7, 2026

Can it be a kernel bug like: torvalds/linux@29f4059 or torvalds/linux@6707915

@avagin
Copy link
Copy Markdown
Member

avagin commented Apr 8, 2026

Can it be a kernel bug like: torvalds/linux@29f4059 or torvalds/linux@6707915

I think you triggered the second issue. Here is a test to reproduce it: avagin@daf5e4d. I think we need to send the kernel fix to stable trees.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants