sk-queue: port SCM_CREDENTIALS checkpoint/restore support from OpenVZ #3026
3idey wants to merge 18 commits into
dd596e2 to 1c7688a
e334db4 to c32c6de
Hi @3idey,
When you are porting somebody else's code, you can't just drop their authorship and Signed-off-by tags ;-)
You should use git cherry-pick to make a perfect copy of the original commit or git format-patch + git am.
What you are expected to do here:
- identify needed commits from OpenVZ CRIU fork
- cherry-pick them one by one
- add your Signed-off-by tag in addition to the original ones
To make it even better, you can use git cherry-pick -x to keep the original commit hash, so we can compare it with the OpenVZ fork more easily.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           criu-dev    #3026    +/-   ##
==========================================
+ Coverage     57.01%   57.04%   +0.03%
==========================================
  Files           154      154
  Lines         40452    40594    +142
  Branches       8866     8891     +25
==========================================
+ Hits          23064    23158     +94
- Misses        17034    17082     +48
  Partials        354      354
Hi @mihalicyn,
All commits are cherry-picked with the
Please clean up OpenVZ-specific parts from the commit messages (subject prefixes, Jira links, etc.).
The ucred in a netlink message contains the same pid, and it doesn't matter from which pidns it is read.
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
(cherry picked from commit 66f8f7b2ae51a2cef4044df478389137735e1d6e)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
Checkpointing and restoring ucreds on unix sockets is not trivial: the @pid returned by the kernel upon message fetching is mapped to the current user namespace, so it is not the same as the one that would be received inside the container. So we need to map the real pid returned into the virtual one seen from inside, but we know virtual pids only once the complete process tree is built. Thus we queue such packets into a separate queue and, when the process tree is built, walk over it and transform the real pids into virtual ones.
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
(cherry picked from commit 8f3061c360381894a97edca2ada85466f95e1d88)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
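The deferred translation described in this commit message can be sketched roughly as below. This is a minimal illustration, not CRIU's code: `struct pid_pair`, `struct queued_pkt`, `real_to_virt()` and `fixup_queued_pids()` are hypothetical names, and the process tree is reduced to a flat array of real/virtual pid pairs.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-ins; CRIU's real structures differ. */
struct pid_pair {
	pid_t real; /* pid as seen by the dumping process */
	pid_t virt; /* pid as seen inside the container */
};

struct queued_pkt {
	pid_t ucred_pid; /* real pid when queued, virtual pid after fixup */
	struct queued_pkt *next;
};

/* Look up the virtual pid for a real pid; returns -1 when it is unknown. */
static pid_t real_to_virt(const struct pid_pair *tree, size_t n, pid_t real)
{
	for (size_t i = 0; i < n; i++)
		if (tree[i].real == real)
			return tree[i].virt;
	return -1;
}

/*
 * Walk the deferred queue once the process tree is complete and rewrite
 * each real pid to its virtual counterpart. Packets whose sender is not
 * in the tree are left untouched; their number is returned.
 */
static int fixup_queued_pids(struct queued_pkt *head,
			     const struct pid_pair *tree, size_t n)
{
	int unresolved = 0;

	for (struct queued_pkt *p = head; p; p = p->next) {
		pid_t virt = real_to_virt(tree, n, p->ucred_pid);

		if (virt < 0) {
			unresolved++;
			continue;
		}
		p->ucred_pid = virt;
	}
	return unresolved;
}
```

A caller would build the queue while reading socket buffers and invoke `fixup_queued_pids()` only after every task has been collected, which is why the packets cannot be written to the image immediately.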
Test that SCM_CREDENTIALS ancillary messages queued in a unix socket are correctly saved and restored by CRIU.
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
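For readers unfamiliar with the message type being tested, the following is a minimal sketch of queuing and reading back one SCM_CREDENTIALS message on a socketpair. It is not the ZDTM test from this PR; the helper names `send_with_creds()`, `recv_creds()` and `creds_roundtrip()` are made up for illustration, and the code assumes Linux with glibc-style headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* struct ucred, SCM_CREDENTIALS */
#endif
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one struct ucred. */
union cred_ctl {
	char buf[CMSG_SPACE(sizeof(struct ucred))];
	struct cmsghdr align;
};

/* Send one byte with an explicit SCM_CREDENTIALS control message. */
static int send_with_creds(int sk)
{
	struct ucred creds = { .pid = getpid(), .uid = getuid(), .gid = getgid() };
	char data = 'x';
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union cred_ctl ctl;
	struct msghdr mh = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = ctl.buf, .msg_controllen = sizeof(ctl.buf),
	};
	struct cmsghdr *ch;

	memset(&ctl, 0, sizeof(ctl));
	ch = CMSG_FIRSTHDR(&mh);
	ch->cmsg_level = SOL_SOCKET;
	ch->cmsg_type = SCM_CREDENTIALS;
	ch->cmsg_len = CMSG_LEN(sizeof(creds));
	memcpy(CMSG_DATA(ch), &creds, sizeof(creds));

	return sendmsg(sk, &mh, 0) == 1 ? 0 : -1;
}

/* Receive one message and copy out its SCM_CREDENTIALS payload. */
static int recv_creds(int sk, struct ucred *out)
{
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union cred_ctl ctl;
	struct msghdr mh = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = ctl.buf, .msg_controllen = sizeof(ctl.buf),
	};
	struct cmsghdr *ch;

	if (recvmsg(sk, &mh, 0) != 1)
		return -1;
	for (ch = CMSG_FIRSTHDR(&mh); ch; ch = CMSG_NXTHDR(&mh, ch)) {
		if (ch->cmsg_level == SOL_SOCKET && ch->cmsg_type == SCM_CREDENTIALS) {
			memcpy(out, CMSG_DATA(ch), sizeof(*out));
			return 0;
		}
	}
	return -1;
}

/* Queue a credentials message on a socketpair, then read it back. */
static int creds_roundtrip(struct ucred *out)
{
	int sk[2], one = 1, ret = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sk))
		return -1;
	/* The receiver enables SO_PASSCRED so the kernel delivers the cmsg. */
	if (!setsockopt(sk[1], SOL_SOCKET, SO_PASSCRED, &one, sizeof(one)) &&
	    !send_with_creds(sk[0]))
		ret = recv_creds(sk[1], out);
	close(sk[0]);
	close(sk[1]);
	return ret;
}
```

In a checkpoint/restore test, the dump would happen between the send and the receive, so the message sits in the socket queue while CRIU runs.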
Reproducer for the following problem: when a unix socket contains an SCM with SCM_CREDENTIALS cmsg_type, we have to postpone saving this packet to the CR_FD_SK_QUEUES image because we need information about the process whose creds were transferred in this message. But if this SkPacketEntry already contains data from SCM_RIGHTS messages (file descriptors were transferred through the unix socket), we will lose this data because of an incorrect queue_packet_entry function implementation. This problem leads to an inconsistent CRIU dump, and on restore we will get:
Error (criu/files.c:230): Empty list on file desc id 13(5)
Error (criu/files.c:232): BUG at criu/files.c:232
Reviewed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
(cherry picked from commit 6aa13217721db06b7cbd873e615580f0a455d5a7)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
This allows us to catch a long-standing bug in vzcriu where scm-s were not dumped properly: the scm-s of the second packet contained the scm-s of both the first and the second packet. This happens when several SCM_RIGHTS packets and at least one SCM_CREDENTIALS packet are sent to a unix socket at the same time.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
(cherry picked from commit 65ea504a97506355c0a5a4d9612a4827a58d6631)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
(cherry picked from commit a9b546c74f6c1e817b0790bd9621cc137bb719f3)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
We need to know the type of unsupported control messages.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
(cherry picked from commit 4ab24ee8840b4d879147442f4c65bb7eac77a8bb)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
ucred.pid is zero for kernel messages, and in this case it is impossible to restore it.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
(cherry picked from commit 70948c510699aa8a2d28fc1669ad25dd81a6b102)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
nl_pktinfo control messages are used to get the extended destination group number.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
(cherry picked from commit 150afafe6f859ad8ca627fd6e08c2b127d2e0ad7)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
A process may send a creds message and then exit, so the message will be detected on checkpoint, but we won't be able to queue it back because the process is already dead and such sending is forbidden by the kernel. So just ignore such packets.
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
(cherry picked from commit e0e73bcbb8e07e60dbfd4e23cebe2f477a930704)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
For debugging's sake.
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
(cherry picked from commit 318009a0ddf7150e7967f2dc52752cbb469cacd2)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
The queue may hold a dead process's data; in that case we should ignore the packet.
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
(cherry picked from commit f854da2284c2c0163055322804aa616b17a34c28)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
When a unix socket contains an SCM with SCM_CREDENTIALS cmsg_type, we have to postpone saving this packet to the CR_FD_SK_QUEUES image because we need information about the process whose creds were transferred in this message. But if this SkPacketEntry already contains data from SCM_RIGHTS messages (file descriptors were transferred through the unix socket), we will lose this data because of an incorrect queue_packet_entry function implementation. This problem leads to an inconsistent CRIU dump, and on restore we will get:
Error (criu/files.c:230): Empty list on file desc id 13(5)
Error (criu/files.c:232): BUG at criu/files.c:232
Reviewed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
(cherry picked from commit e6a9ce32ad14d18ff3160e5339e4582885d740fe)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
…packet
If an SkPacketEntry contains both SCM_CREDENTIALS and SCM_RIGHTS messages, then only SCM_RIGHTS will be put into the socket queue.
Reviewed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
(cherry picked from commit 1aa1b1cb430e39271e7fbd44bd8609198a7f3a20)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
When several SCM_RIGHTS packets and at least one SCM_CREDENTIALS packet are sent simultaneously into the same unix socket, SCM_RIGHTS are dumped incorrectly. This happens because, before this patch, the SkPacketEntry was reused when dumping subsequent packets, so when dumping SCM_RIGHTS for any packet after the first, its pe.scm also contained the SCM_RIGHTS of the previous packet. To fix this, let's not reuse SkPacketEntry. While at it, let's also split out dump_sk_queue_packet(), a separate function for dumping one packet, to make the code more readable, and drop the meaningless "ret = -E2BIG" and related stuff. Note that after this change pe.ucred is also not reused, as it probably never should have been.
Fixes: e4956b0 ("sk-queue: Add support of ucreds in unix sockets")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
(cherry picked from commit 6832ccd499c1bef9bc220cdf42202c468fe5a37c)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
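The reuse problem described in this commit message can be illustrated with a small sketch. It does not reproduce CRIU's dump code: `struct pkt_entry` is a simplified, hypothetical stand-in for SkPacketEntry, and `collect_rights()` and `dump_queue()` are invented names. The `reuse_entry` flag switches between the buggy shared entry and the fixed per-packet entry.

```c
#include <stddef.h>

#define MAX_FDS 8

/* Hypothetical, simplified stand-in for SkPacketEntry. */
struct pkt_entry {
	size_t n_fds;
	int fds[MAX_FDS];
};

/* One queued packet carrying some SCM_RIGHTS descriptors. */
struct packet {
	const int *fds;
	size_t n_fds;
};

/* Append a packet's descriptors to an entry. */
static void collect_rights(struct pkt_entry *pe, const int *fds, size_t n)
{
	for (size_t i = 0; i < n && pe->n_fds < MAX_FDS; i++)
		pe->fds[pe->n_fds++] = fds[i];
}

/*
 * Record how many descriptors end up attached to each packet.
 * reuse_entry != 0: one entry is shared by all packets (buggy pattern),
 *                   so later packets inherit earlier descriptors.
 * reuse_entry == 0: each packet gets a freshly zeroed entry (fixed pattern).
 */
static void dump_queue(const struct packet *pkts, size_t n,
		       size_t *counts, int reuse_entry)
{
	struct pkt_entry shared = { 0 };

	for (size_t i = 0; i < n; i++) {
		struct pkt_entry fresh = { 0 };
		struct pkt_entry *pe = reuse_entry ? &shared : &fresh;

		collect_rights(pe, pkts[i].fds, pkts[i].n_fds);
		counts[i] = pe->n_fds;
	}
}
```

With two packets carrying two and one descriptors, the shared entry reports three descriptors for the second packet, while the per-packet entry reports one.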
Printing an info message when we descend into nested fds makes the logs of unix socket dumping easier to understand.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
(cherry picked from commit ae7cee5ef93efba90466bbf67d0e0367a05cc830)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
Here is how we dump nested unix sockets inside unix sockets:
+-> dump_one_unix_fd <-+
+-> dump_sk_queue |
+-> dump_sk_queue_packet |
+-> dump_packet_cmsg |
+-> dump_one_file |
+-> do_dump_gen_file +
+-> queue_packet_entry
+-> list_add_tail(packets_list)
So in the images, a deeper unix socket goes strictly before a shallower
one.
Before this patch on restore we also had the same reverse order due to
list_add_tail used when adding to packets_list. Because of it we first
handle deeper sockets in prepare_scms and later shallow ones.
This is counterintuitive because in case of nested unix sockets without
any actual open fds on them we need to find out which task should create
them (file_master()->task) and in unix_note_scm_rights we determine this
owner from queuer which can also have it yet unset because it would be
handled later in prepare_scms.
Let's reverse the walk order on restore in prepare_scms.
Note: we can't just reverse the order in the list, as restore_sk_queue
relies on the packets of one socket being in their original order.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
(cherry picked from commit db11de636842c29ca21e9176cd43b785c7379c56)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
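The distinction between reversing the walk and reversing the list can be sketched as follows. This is an illustration under simplifying assumptions, not CRIU's implementation: the packet list is modelled as an array in image order, and `struct pkt`, `walk_prepare()` and `walk_restore()` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical packet record: the socket it is queued on and its position. */
struct pkt {
	int sk_id; /* owning socket */
	int seq;   /* order within that socket's queue */
};

/*
 * Prepare pass: visit the list back to front, so packets of shallow
 * sockets (stored last in the image) are handled before nested ones.
 * The visited socket ids are written to sk_order.
 */
static size_t walk_prepare(const struct pkt *list, size_t n, int *sk_order)
{
	size_t k = 0;

	for (size_t i = n; i-- > 0;)
		sk_order[k++] = list[i].sk_id;
	return k;
}

/*
 * Restore pass: the list itself was never reordered, so the packets of
 * one socket are still replayed front to back in their original order.
 */
static size_t walk_restore(const struct pkt *list, size_t n, int sk_id,
			   int *seq_order)
{
	size_t k = 0;

	for (size_t i = 0; i < n; i++)
		if (list[i].sk_id == sk_id)
			seq_order[k++] = list[i].seq;
	return k;
}
```

Only the traversal direction differs between the two passes; storing the list in reverse would instead break the per-socket ordering that the restore pass depends on.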
Fake unix queuers can reach SCM preparation before they have a master fdinfo attached. Defer the final owner assignment until after prepare_scms() so unix_note_scm_rights() can resolve the real socket owner in the closed-sender case.
Fixes: 67c8a5e ("sk-queue: order unixsk sent to other unixsk after it")
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
I found a unix fake queuer restore bug that caused scm10 to fail. The fix makes CRIU finish setting up fake queuers after SCM data is prepared, so closed-sender unix sockets restore correctly. Anything I should add or fix?
Port SCM_CREDENTIALS checkpoint/restore support from the OpenVZ CRIU fork [1][2].
Unix domain sockets can carry SCM_CREDENTIALS ancillary messages via
sendmsg(). Previously CRIU could not checkpoint/restore processes that
had such messages queued in their socket buffers.
Changes:
- images/sk-packet.proto: add sk_ucred_entry message to store pid, uid,
  gid carried by SCM_CREDENTIALS control messages. Add optional ucred
  field to sk_packet_entry at field 5 (renumbered from 128 in OpenVZ
  to follow upstream sequential convention).
- criu/sk-queue.c: add dump_sk_creds() to serialize SCM_CREDENTIALS
  ancillary messages during dump. Add restore path in send_one_pkt()
  to reconstruct the ucred cmsg before sendmsg(). Defer image writing
  for packets with real PIDs via queue_packet_entry(), resolving
  virtual PIDs in sk_queue_post_actions() after the process tree is
  collected.
- criu/cr-dump.c: call sk_queue_post_actions() in the dump post-actions
  sequence to resolve deferred PIDs.
- criu/sk-unix.c: fix dump_sk_queue() call to pass flags and val.
- test/zdtm/static/sk-unix-cred.c: add ZDTM test that verifies
  SCM_CREDENTIALS are correctly saved and restored.
This is part of Google Summer of Code 2026 work on adding SCM_CREDENTIALS
and SCM_PIDFD checkpoint/restore support to CRIU.
[1] https://bitbucket.org/openvz/criu.ovz/src/918653a0/criu/sk-queue.c
[2] https://bitbucket.org/openvz/criu.ovz/src/918653a0/criu/sk-unix.c