Skip to content

config: avoid NULL dereference on network-lock in RPC config parsing#3051

Merged
rst0git merged 1 commit into
checkpoint-restore:criu-devfrom
sidneychang:network-lock-rpc-null-argv
Jun 16, 2026
Merged

config: avoid NULL dereference on network-lock in RPC config parsing#3051
rst0git merged 1 commit into
checkpoint-restore:criu-devfrom
sidneychang:network-lock-rpc-null-argv

Conversation

@sidneychang

Copy link
Copy Markdown
Contributor

Description

CRIU can re-read a configuration file while running in RPC mode. In this
path, cr-service.c calls parse_options() without CLI arguments:

parse_options(0, NULL, ..., PARSING_RPC_CONF);

This is used by flows such as runc checkpoint, where runc can pass
/etc/criu/runc.conf to CRIU as the RPC config file.

Commit 2e30db5c3d46 added --network-lock handling and a restore
warning that checks argv[optind] after option parsing. If the RPC
config file contains network-lock iptables, parse_options() sets
has_network_lock_opt and then reaches that warning. In the RPC config
parsing path, however, argv is NULL, so dereferencing argv[optind]
crashes criu swrk.

Guard the restore warning with argv && optind < argc before reading
argv[optind]. This keeps the existing CLI behavior unchanged, while
skipping a CLI-only check when no CLI subcommand exists.

Reproducer

From the CRIU source tree, generate the Python RPC bindings:

protoc --proto_path=test/others/rpc \
       --python_out=/tmp \
       test/others/rpc/rpc.proto

Create a minimal config file containing network-lock:

  cat >/tmp/criu-network-lock.conf <<'EOF'
  weak-sysctls
  network-lock iptables
  shell-job
  ext-unix-sk
  EOF

Create a minimal RPC reproducer:

  cat >/tmp/repro_network_lock_rpc.py <<'PY'
  import os
  import socket
  import subprocess
  import sys
  import tempfile

  import rpc_pb2 as rpc

  def run_test(criu_bin: str, cfg_path: str):
      with tempfile.TemporaryDirectory(prefix="criu-nl-test-") as tmp:
          imgs = os.path.join(tmp, "imgs")
          os.mkdir(imgs)

          s1, s2 = socket.socketpair(socket.AF_UNIX,
          socket.SOCK_SEQPACKET)
          swrk = subprocess.Popen(
              [criu_bin, "swrk", str(s1.fileno())],
              pass_fds=[s1.fileno()],
              stdout=subprocess.DEVNULL,
              stderr=subprocess.DEVNULL,
          )
          s1.close()

          req = rpc.criu_req()
          req.type = rpc.DUMP
          req.opts.leave_running = True
          req.opts.log_level = 4
          req.opts.log_file = "dump.log"
          req.opts.images_dir_fd = os.open(imgs, os.O_DIRECTORY)
          req.opts.shell_job = True
          req.opts.config_file = cfg_path

          try:
              s2.send(req.SerializeToString())
              try:
                  data = s2.recv(4096)
              except ConnectionResetError:
                  data = b""
          finally:
              s2.close()

          rc = swrk.wait(timeout=5)
          if data:
              resp = rpc.criu_resp()
              resp.ParseFromString(data)
              detail = f"rpc_type={resp.type} success={resp.success}
              err={resp.cr_errmsg!r}"
          else:
              detail = "no RPC response (connection closed)"

          print(f"criu={criu_bin}")
          print(f"return={rc} ({detail})")

  if __name__ == "__main__":
      run_test(sys.argv[1], sys.argv[2])
  PY

Run it against an unpatched CRIU binary:

  PYTHONPATH=/tmp python3 /tmp/repro_network_lock_rpc.py /usr/sbin/
  criu /tmp/criu-network-lock.conf

On an unpatched CRIU 4.2 binary, this crashes the service worker:

  criu=/usr/sbin/criu
  return=139 (no RPC response (connection closed))

Run the same reproducer against the patched binary:

  PYTHONPATH=/tmp python3 /tmp/repro_network_lock_rpc.py ./criu/criu /
  tmp/criu-network-lock.conf

With this patch applied, the same RPC flow no longer crashes. For
example, I observed:

  criu=./criu/criu
  return=1 (rpc_type=1 success=False err='Error (compel/src/lib/
  infect.c:259): The criu itself is within dumped tree.\n')

The RPC error above is expected for this minimal reproducer because it
does not provide a valid dump target PID. The purpose here is only to
verify that re-parsing a config_file containing network-lock no longer crashes
criu swrk.

Why this is safe

The guarded block only emits a warning for the CLI restore command
when --network-lock is provided. In RPC config parsing there is no CLI
subcommand in argv, so the check cannot make a valid restore/dump
decision there.

With this patch:

  • CLI parsing still warns for criu restore --network-lock ...;
  • RPC config parsing no longer dereferences argv == NULL;
  • network-lock can be parsed from an RPC config file without crashing
    the service worker.

Tests

  • make -j$(nproc)
  • make lint

No ZDTM test is added because this crash depends on the RPC config-file
parsing path. The code change is limited to guarding a CLI-only
argv[optind] access.

Related

This is related to earlier RPC/config handling issues such as #1029.

Fixes: 2e30db5 ("criu: add --network-lock option to allow nftables alternative")

@sidneychang sidneychang force-pushed the network-lock-rpc-null-argv branch from eba835b to 489051d Compare June 16, 2026 05:52

@adrianreber adrianreber left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks good. I restarted the failing CI runs. They do not seem related to this change.

@codecov-commenter

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.06%. Comparing base (130df83) to head (489051d).
⚠️ Report is 3 commits behind head on criu-dev.

Additional details and impacted files
@@            Coverage Diff            @@
##           criu-dev    #3051   +/-   ##
=========================================
  Coverage     57.05%   57.06%           
=========================================
  Files           154      154           
  Lines         40534    40535    +1     
  Branches       8881     8881           
=========================================
+ Hits          23128    23130    +2     
+ Misses        17052    17051    -1     
  Partials        354      354           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@rst0git rst0git left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Commit 2e30db5 added --network-lock handling and a CLI restore
warning that checks argv[optind] after option parsing.

However, CRIU can also call parse_options() from the RPC config parsing
path with argc == 0 and argv == NULL. This path was added by 99d52ec
for req->config_file handling.

If an RPC configuration file contains "network-lock iptables",
has_network_lock_opt is set and the restore warning dereferences
argv[optind], crashing criu swrk.

Guard the restore warning with argv && optind < argc before accessing
argv[optind].

Fixes: 2e30db5 ("criu: add --network-lock option to allow nftables alternative")

Signed-off-by: Zhang Xuedong <2190206983@qq.com>
@rst0git rst0git force-pushed the network-lock-rpc-null-argv branch from 489051d to 8aa144e Compare June 16, 2026 08:38
@rst0git

rst0git commented Jun 16, 2026

Copy link
Copy Markdown
Member

@sidneychang Thank you for fixing this! I've updated the order to start with most-selective predicate first (has_network_lock_opt).

@rst0git rst0git merged commit 6e75573 into checkpoint-restore:criu-dev Jun 16, 2026
16 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants