ci: port CentOS Stream test to GitHub Actions using Lima #2967
Open
adrianreber wants to merge 2 commits into checkpoint-restore:criu-dev
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@             Coverage Diff              @@
##           criu-dev    #2967      +/-   ##
============================================
- Coverage     57.76%   57.16%   -0.61%
============================================
  Files           142      154      +12
  Lines         37664    40278    +2614
  Branches          0     8831    +8831
============================================
+ Hits          21758    23024    +1266
- Misses        15906    16990    +1084
- Partials          0      264     +264
The thread-bomb test frequently fails during setup with
pthread_create() returning EAGAIN (errno 11). The test creates
1024 threads in a tight loop from main(), and each of those
threads immediately spawns another thread that joins its
predecessor, resulting in a burst of ~2048 simultaneous thread
creations with 64KB stacks.
This burst causes transient EAGAIN errors from clone() due to
kernel resource pressure (VMA allocator contention, temporary
memory fragmentation, etc.). The failure is not related to hard
resource limits — ulimit, threads-max, max_map_count and cgroup
pids limits are all well above the required values. The failure
occurs both inside and outside containers and is worse on hosts
with fewer resources.
Measured failure rates on a 16GB / 9-CPU host:
Before fix: 65% failure rate (13/20 outside container)
After fix: ~2.5% failure rate (1/40), and that failure was
a C/R issue, not a pthread_create EAGAIN
Fix this by adding a pthread_create_retry() wrapper that retries
pthread_create() up to 50 times with a 10ms delay when it returns
EAGAIN. This gives the kernel time to reclaim resources between
attempts while keeping the total worst-case retry time under one
second per thread creation.
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Adrian Reber <areber@redhat.com>
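The retry wrapper described in the commit message above could look roughly like the following. This is a minimal sketch, not the code from the patch: only the name pthread_create_retry(), the limit of 50 retries, the 10ms delay and the 64KB stack size come from the commit message. The macro names and the worker() and spawn_and_join() helpers are hypothetical and exist only for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Values taken from the commit message; the macro names are assumptions. */
#define RETRY_MAX	50			/* retries after the first attempt */
#define RETRY_DELAY_NS	(10 * 1000 * 1000L)	/* 10ms between attempts */

/*
 * Call pthread_create() and retry only when it reports EAGAIN.
 * Any other result (success or a different error) is returned at once.
 * Worst case: 50 sleeps of 10ms, so about 500ms of added delay.
 */
static int pthread_create_retry(pthread_t *thread, const pthread_attr_t *attr,
				void *(*start)(void *), void *arg)
{
	struct timespec delay = { .tv_sec = 0, .tv_nsec = RETRY_DELAY_NS };
	int attempt, ret;

	for (attempt = 0;; attempt++) {
		/* pthread_create() returns the error number, it does not set errno. */
		ret = pthread_create(thread, attr, start, arg);
		if (ret != EAGAIN || attempt >= RETRY_MAX)
			return ret;
		nanosleep(&delay, NULL);
	}
}

/* Hypothetical thread body used only by the usage example below. */
static void *worker(void *arg)
{
	return arg;
}

/* Hypothetical usage: create one thread with a 64KB stack and join it. */
int spawn_and_join(void)
{
	pthread_attr_t attr;
	pthread_t tid;
	int ret;

	ret = pthread_attr_init(&attr);
	if (ret)
		return ret;

	ret = pthread_attr_setstacksize(&attr, 64 * 1024);
	if (!ret)
		ret = pthread_create_retry(&tid, &attr, worker, NULL);
	pthread_attr_destroy(&attr);
	if (ret)
		return ret;

	return pthread_join(tid, NULL);
}
```

Retrying only on EAGAIN keeps real failures such as EINVAL or EPERM visible immediately, and the bounded loop means a genuinely exhausted system still fails the test after roughly half a second per thread instead of hanging.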
Move the CentOS Stream 9 based test from Cirrus CI to GitHub
Actions using Lima VMs. Expand coverage to a matrix of CentOS
Stream 9 and 10 on x86_64.
Extract the common Lima VM setup steps (Lima install, image
caching, KVM enablement, VM start, source copy) into a reusable
composite action at .github/actions/lima-vm-setup.
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Adrian Reber <areber@redhat.com>
Move the CentOS Stream 9 based test from Cirrus CI to GitHub Actions using Lima VMs. Expand coverage to a matrix of CentOS Stream 9 and 10 on both x86_64 and aarch64.
The new job follows the same Lima-based pattern introduced for the Fedora Rawhide test but runs the local CI target directly in the VM instead of inside a container.
Depends on #2961