Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 33 additions & 3 deletions criu/page-xfer.c
Original file line number Diff line number Diff line change
Expand Up @@ -341,9 +341,39 @@ static int write_pagemap_loc(struct page_xfer *xfer, struct iovec *iov, u32 flag
if (xfer->parent != NULL) {
ret = check_pagehole_in_parent(xfer->parent, iov);
if (ret) {
pr_err("Hole %p - %p not found in parent\n",
iov->iov_base, iov->iov_base + iov->iov_len);
return -1;
/*
* Page not found in parent. This can happen
* when a page was allocated between pre-dumps
* and the soft-dirty bit is clear (kernel
* zero-fills new pages without setting it).
* Write zero pages instead of a broken parent
* reference that would fail on restore.
*/
unsigned long i, nr = iov->iov_len / PAGE_SIZE;
static const char zero_page[PAGE_SIZE];

pr_warn("Hole %p - %p not found in parent, "
"writing %lu zero page(s)\n",
iov->iov_base,
iov->iov_base + iov->iov_len, nr);

pe.flags = PE_PRESENT;
flags = PE_PRESENT;

if (pb_write_one(xfer->pmi, &pe, PB_PAGEMAP) < 0)
return -1;

for (i = 0; i < nr; i++) {
ret = write(img_raw_fd(xfer->pi),
zero_page, PAGE_SIZE);
if (ret != PAGE_SIZE) {
Comment on lines +367 to +369
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The zero-page write uses plain write() and treats any short write/EINTR as fatal (ret != PAGE_SIZE). Since the pages image FD is a regular file, write() can legally return partial progress; this should use write_all() (or write_img_buf()) and handle/propagate errors consistently to avoid spurious failures and partially written images on interruption.

Suggested change
ret = write(img_raw_fd(xfer->pi),
zero_page, PAGE_SIZE);
if (ret != PAGE_SIZE) {
ret = write_all(img_raw_fd(xfer->pi),
zero_page,
PAGE_SIZE);
if (ret < 0) {

Copilot uses AI. Check for mistakes.
pr_perror("Can't write zero page");
return -1;
}
}

cnt_add(CNT_PAGES_WRITTEN, nr);
return 0;
Comment on lines +345 to +376
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fallback turns a PE_PARENT hole missing in the parent into PE_PRESENT and writes a zero-filled page payload without verifying the dumped process page contents are actually zero. That changes semantics from “fail restore” to “restore succeeds but memory is silently zeroed”, which can corrupt non-zero pages (e.g., if soft-dirty was cleared after writes). Consider either (a) dumping the actual page contents, or (b) making this path conditional on a proven zero-page condition and logging explicitly that data is being discarded.

Suggested change
* Page not found in parent. This can happen
* when a page was allocated between pre-dumps
* and the soft-dirty bit is clear (kernel
* zero-fills new pages without setting it).
* Write zero pages instead of a broken parent
* reference that would fail on restore.
*/
unsigned long i, nr = iov->iov_len / PAGE_SIZE;
static const char zero_page[PAGE_SIZE];
pr_warn("Hole %p - %p not found in parent, "
"writing %lu zero page(s)\n",
iov->iov_base,
iov->iov_base + iov->iov_len, nr);
pe.flags = PE_PRESENT;
flags = PE_PRESENT;
if (pb_write_one(xfer->pmi, &pe, PB_PAGEMAP) < 0)
return -1;
for (i = 0; i < nr; i++) {
ret = write(img_raw_fd(xfer->pi),
zero_page, PAGE_SIZE);
if (ret != PAGE_SIZE) {
pr_perror("Can't write zero page");
return -1;
}
}
cnt_add(CNT_PAGES_WRITTEN, nr);
return 0;
* A missing parent hole cannot be safely turned
* into a present zero page here. Without
* verifying the dumped task memory contents, that
* would silently discard non-zero data and
* corrupt the restored address space.
*/
pr_err("Hole %p - %p not found in parent, "
"refusing to synthesize zero pages\n",
iov->iov_base,
iov->iov_base + iov->iov_len);
return -1;

Copilot uses AI. Check for mistakes.
}
}
}
Expand Down
1 change: 1 addition & 0 deletions test/zdtm/transition/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ TST_NOFILE = \
socket-tcp6 \
shmem \
lazy-thp \
maps_newpage \
pid_reuse \
pidfd_store_sk \
rseq01 \
Expand Down
139 changes: 139 additions & 0 deletions test/zdtm/transition/maps_newpage.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,139 @@
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#include "zdtmtst.h"

const char *test_doc = "Test pages with cleared soft-dirty after pre-dump";
const char *test_author = "Radostin Stoyanov <rstoyanov@fedoraproject.org>";

/*
* After a pre-dump, soft-dirty bits are cleared. If new pages are
* then written and soft-dirty is cleared again (e.g. by a device
* driver or by writing to /proc/self/clear_refs), CRIU sees
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

soft-dirty is cleared again

Only CRIU has to clear soft-dirty bits, otherwise we can't guarantee consistent memory snapshot. I don't like the idea of using /proc/self/clear_refs in the test.

* "present, not dirty" and classifies them as PE_PARENT. But the
* parent has no data for these pages, causing restore to fail with
* "Hole not found in parent".
*
* This test simulates the scenario by:
* 1. Mapping and writing init pages (will be in pre-dump)
* 2. Letting the pre-dump run
* 3. Mapping new pages in a separate VMA and writing a pattern
* 4. Clearing soft-dirty via /proc/self/clear_refs
* 5. Re-dirtying init pages so they are dumped normally
* 6. Letting the final dump run — the new pages appear "clean"
* but have no parent data
*/

#define INIT_PAGES 4

Check warning on line 30 in test/zdtm/transition/maps_newpage.c

View workflow job for this annotation

GitHub Actions / build

#define NEW_PAGES 4

static int clear_soft_dirty(void)
{
int fd;
ssize_t ret;

fd = open("/proc/self/clear_refs", O_WRONLY);
if (fd < 0) {
pr_perror("open clear_refs");
return -1;
}
/* 4 = clear soft-dirty bits */
ret = write(fd, "4", 1);
close(fd);
if (ret != 1) {
pr_perror("write clear_refs");
return -1;
}
return 0;
}

int main(int argc, char **argv)
{
unsigned char *init_mem, *new_mem = NULL;
size_t init_size = INIT_PAGES * PAGE_SIZE;
size_t new_size = NEW_PAGES * PAGE_SIZE;
int i;

test_init(argc, argv);

/* Map and write init pages — these go into the pre-dump */
init_mem = mmap(NULL, init_size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (init_mem == MAP_FAILED) {
pr_perror("mmap init");
return 1;
}
memset(init_mem, 0xAB, init_size);

test_daemon();

/* Wait for pre-dump to complete */
if (test_wait_pre_dump())
goto skip;

/*
* Map new pages in a separate VMA and write a pattern.
* This creates real page backing (not the kernel zero page).
*/
new_mem = mmap(NULL, new_size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (new_mem == MAP_FAILED) {
pr_perror("mmap new");
test_wait_pre_dump_ack();
return 1;
}
memset(new_mem, 0xCD, new_size);

Comment on lines +77 to +89
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test writes 0xCD into the “new pages” mapping and then asserts the restored pages are 0x00, which enshrines “data loss is expected” behavior. This seems inconsistent with the PR description’s rationale that these pages are guaranteed zero-filled; if the intended scenario is “newly faulted but never written” pages, consider faulting them in without modifying contents and asserting they stay zero after restore so the test validates the stated guarantee rather than intentional corruption.

Copilot uses AI. Check for mistakes.
/*
* Clear soft-dirty for all pages. Now CRIU sees the new
* pages as "present, not dirty" → PE_PARENT, but the
* parent has no data for them.
*/
if (clear_soft_dirty()) {
test_wait_pre_dump_ack();
return 1;
}

/*
* Re-dirty init pages so CRIU dumps them as PE_PRESENT.
* Without this, they would also go to PE_PARENT (which
* would work since the parent has them, but let's keep
* the test focused on the new pages).
*/
memset(init_mem, 0xAB, init_size);

test_wait_pre_dump_ack();

skip:
test_waitsig();

/* Verify init pages kept their pattern */
for (i = 0; i < (int)init_size; i++) {
if (init_mem[i] != 0xAB) {
fail("init_mem[%d] = 0x%02x, expected 0xAB",
i, init_mem[i]);
return 1;
}
}

/*
* The new pages will be zero after restore: CRIU thought
* they were in the parent, so the fix writes zero pages.
* The 0xCD data is lost, but the process doesn't crash.
*/
if (new_mem) {
for (i = 0; i < (int)new_size; i++) {
if (new_mem[i] != 0x00) {
fail("new_mem[%d] = 0x%02x, expected 0x00",
i, new_mem[i]);
return 1;
}
}
}

pass();
return 0;
}
1 change: 1 addition & 0 deletions test/zdtm/transition/maps_newpage.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
{'flavor': 'h', 'flags': 'suid pre-dump-notify'}
Loading