Skip to content

libct/test: Disable GC on test run to catch leaking fds#5243

Merged
cyphar merged 1 commit intoopencontainers:mainfrom
rata:fix-flaky-TestFdLeaks
Apr 16, 2026
Merged

libct/test: Disable GC on test run to catch leaking fds#5243
cyphar merged 1 commit intoopencontainers:mainfrom
rata:fix-flaky-TestFdLeaks

Conversation

@rata
Copy link
Copy Markdown
Member

@rata rata commented Apr 13, 2026

libct/test: Disable GC on test run to catch leaking fds

This test is racy for a long time now. All the logs I could find in CI
seem to be dangling symlinks, like the test shows "23 -> ". This means
the fd was closed before we did the call to readlink().

Let's try to disable the GC. This should get rid of the "fds are getting
closed before we read them" part.

Updates: #4297

@rata rata requested a review from kolyshkin April 13, 2026 10:46
@rata rata force-pushed the fix-flaky-TestFdLeaks branch from 4834716 to 3e6e7a1 Compare April 13, 2026 10:51
Comment thread libcontainer/integration/exec_test.go Outdated
}
}

dst, err := linux.Readlinkat(procSelfFd, fd1)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Why not add error handing to the previous Readlinkat?
  2. Shouldn't we only ignore ENOENT and report other errors?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Grr, I screw it up, I had several changes and clearly I applied the diff incorrectly. Sorry, will fix now

Copy link
Copy Markdown
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I guess the only issue with the test is missing error checking from readlinkat.

I'd rather use a step-by-step approach at fixing this, starting with one simple fix (e.g. error checking and ignoring ENOENT) and seeing it if helps.

@kolyshkin
Copy link
Copy Markdown
Contributor

Another idea: we can also try to call runtime.GC so that any fds that are to be closed by finalizers etc will be closed.

@rata
Copy link
Copy Markdown
Member Author

rata commented Apr 15, 2026

@kolyshkin heh, it's weird, but I thought about that too. Because it will close fds that we are leaking too. I think we can disable the GC for the test and that should do it.

But honestly, at this point I doubt if we should test in a completely different way. Let's see what comes out of this GC idea...

@rata rata force-pushed the fix-flaky-TestFdLeaks branch from 3e6e7a1 to 7c089d0 Compare April 15, 2026 16:12
@rata rata changed the title libct/test: Do more runs and ignore fds without magic link libct/test: Disable GC on test run to catch leaking fds Apr 15, 2026
@rata
Copy link
Copy Markdown
Member Author

rata commented Apr 15, 2026

@kolyshkin reworked it to use the GC instead. Let me know what you think. I think ignoring readlink is weak, as we could be leaking fds and the GC closing them for us. But now with GC disabled, I guess the original test shouldn't be flaky and still reliable

Copy link
Copy Markdown
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@kolyshkin
Copy link
Copy Markdown
Contributor

OK, the first fedora run (with the new patch disabling GC) failed with an unrelated test (#5013), and the second run timed out.

This test is racy for a long time now. All the logs I could find in CI
seem to be dangling symlinks, like the test shows "23 -> ". This means
the fd was closed before we did the call to readlink().

Let's try to disable the GC. This should get rid of the "fds are getting
closed before we read them" part.

Updates: opencontainers#4297

Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
@kolyshkin kolyshkin force-pushed the fix-flaky-TestFdLeaks branch from 7c089d0 to 748af2e Compare April 16, 2026 00:08
Copy link
Copy Markdown
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of disabling GC, this probably makes our tests for leaks even more strict while also reducing flakiness. Thanks!

@cyphar cyphar merged commit c5077eb into opencontainers:main Apr 16, 2026
55 checks passed
@kolyshkin kolyshkin mentioned this pull request Apr 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants