Contributor

@hehe7318 hehe7318 commented Oct 25, 2025

Description of changes

Implement cross-worker directory stack sharing for AD integration tests

Add file-based registry with file locking to coordinate directory stack sharing across parallel pytest workers. This prevents premature deletion of shared AD directory stacks when multiple OS tests run concurrently in the same region.

  • Add FileLock-based cross-process synchronization
  • Implement atomic registry file operations in /var/tmp/.pcluster_tests
  • Replace in-memory reference counting with persistent file registry
  • Ensure directory stacks are only deleted when all tests complete

Fixes an issue where the first test to complete would delete the shared directory stack, causing the remaining tests to fail when they tried to access its secrets.
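The reference-counting idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it uses the standard-library `fcntl.flock` (Unix only) as a stand-in for the FileLock-based locking, and the registry path and the `locked_registry`, `acquire`, and `release` names are hypothetical.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager

# Hypothetical location for this sketch; the PR keeps its registry under /var/tmp/.pcluster_tests.
REGISTRY_DIR = os.path.join(tempfile.gettempdir(), "directory_stack_registry_sketch")
REGISTRY_FILE = os.path.join(REGISTRY_DIR, "directory_stacks.json")
LOCK_FILE = REGISTRY_FILE + ".lock"


@contextmanager
def locked_registry():
    """Yield the registry dict under an exclusive lock and persist it on a clean exit."""
    os.makedirs(REGISTRY_DIR, exist_ok=True)
    with open(LOCK_FILE, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until other workers release the lock
        try:
            registry = {}
            if os.path.exists(REGISTRY_FILE):
                with open(REGISTRY_FILE) as f:
                    registry = json.load(f)
            yield registry
            # Write to a temporary file and rename it, so a reader never sees a partial file.
            tmp_path = REGISTRY_FILE + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(registry, f)
            os.replace(tmp_path, REGISTRY_FILE)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)


def acquire(key):
    """Register one more user of `key`; return True if the caller is the first user."""
    with locked_registry() as registry:
        registry[key] = registry.get(key, 0) + 1
        return registry[key] == 1


def release(key):
    """Drop one user of `key`; return True if the caller was the last user."""
    with locked_registry() as registry:
        remaining = registry.get(key, 0) - 1
        if remaining <= 0:
            registry.pop(key, None)
            return True
        registry[key] = remaining
        return False
```

In this sketch a worker would create the stack when `acquire` returns True and delete it when `release` returns True. A real implementation would also need to make later workers wait until the first one has finished creating the stack, which the sketch leaves out.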

Tests

  • Integration test passed with 8 tests running in parallel, covering 2 directory types (SimpleAD and MicrosoftAD) and 4 different OSs.

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@hehe7318 hehe7318 requested review from a team as code owners October 25, 2025 20:06
@hehe7318 hehe7318 added the skip-changelog-update (disables the check that enforces changelog updates in PRs) and 3.x labels Oct 25, 2025
Contributor

@gmarciani gmarciani left a comment

What about using the existing SharedFixture mechanism, which already implements the locking?

class SharedFixture:
    """
    Define the methods to implement fixtures that can be shared across multiple pytest-dist processes.

).is_true()


class DirectoryStackSharedFixture(SharedFixture):
Contributor

[BLOCKING] I think the easiest way to use SharedFixture is through the @xdist_session_fixture decorator.

See example:

@xdist_session_fixture(autouse=True)

Why can't we reuse the same approach we are currently using for shared VPC stacks?

Contributor Author

Thanks for pointing that out, you are right. Done; a new test run is in progress. This time I did exactly the same thing as vpc_stack. However, the behavior is now different.
Previously, before the latest change, we created a stack only when a test needed one.
Now we pre-create stacks in all regions and for all AD types, although we only need a few of them. That is also what vpc_stack does.
I am trying to figure out whether there is a way to create a stack only when it is actually needed.
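One possible direction for on-demand creation, as a rough sketch only: wrap stack creation in a factory that creates a stack the first time a given region and directory type is requested, and reuses it afterwards. The `LazyStackFactory` class and the `create_stack` callable are hypothetical names, and the cache here lives in a single process; sharing it across pytest workers would still need the lock-protected shared state discussed in this PR.

```python
import threading


class LazyStackFactory:
    """Create a stack only the first time a (region, directory_type) pair is requested."""

    def __init__(self, create_stack):
        # Hypothetical callable: (region, directory_type) -> stack name.
        self._create_stack = create_stack
        self._stacks = {}
        self._lock = threading.Lock()

    def get(self, region, directory_type):
        """Return the stack for this pair, creating it on first use."""
        key = (region, directory_type)
        with self._lock:
            if key not in self._stacks:
                self._stacks[key] = self._create_stack(region, directory_type)
            return self._stacks[key]

    def created(self):
        """Return a copy of the stacks created so far, e.g. for teardown."""
        with self._lock:
            return dict(self._stacks)
```

With this shape, a session-scoped fixture could hand out the factory instead of pre-built stacks, so only the pairs that tests actually request are created, and teardown iterates over `created()`.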


codecov bot commented Nov 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.17%. Comparing base (cb39901) to head (7e2aac2).
⚠️ Report is 118 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7067      +/-   ##
===========================================
- Coverage    90.18%   90.17%   -0.01%     
===========================================
  Files          182      183       +1     
  Lines        16467    16515      +48     
===========================================
+ Hits         14851    14893      +42     
- Misses        1616     1622       +6     
Flag Coverage Δ
unittests 90.17% <ø> (-0.01%) ⬇️


…ad of pre-create stacks in all regions for all directory types.
logging.warning("Error releasing shared fixture: %s", e)


@xdist_session_fixture(autouse=True)
Contributor

How do you prevent the creation of an AD stack in every region if this is a session fixture with autouse enabled?
