#586: Investigate erratic and unpredictable CI error regarding synthetic case (NEW) #590

Draft
wants to merge 8 commits into
base: develop

Conversation

cwschilly
Contributor

Fixes #586

@cwschilly cwschilly force-pushed the 586-investigate-erratic-and-unpredictable-CI-error-regarding-synthetic-case-NEW branch from d2a2b79 to 96a3368 February 7, 2025 21:28
@cwschilly cwschilly marked this pull request as draft February 7, 2025 21:29
@cwschilly cwschilly self-assigned this Feb 7, 2025
@ppebay
Contributor

ppebay commented Feb 8, 2025

Running the corresponding case 100 times (with only 8 iterations and 1 object per transfer):

for run in {1..100}; do python LBAF_app.py -c synthetic-acceptance.yaml; (cat ../../../output/imbalance.txt; echo) >> res; done

on macOS 14.3 with Python 3.9.19 yields:

awk '{ sum += $1 } END { print sum }' res
0

In other words, all 100 runs correctly converged to an imbalance of 0.0.
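
The awk sum above gives a single aggregate; if a run ever did fail, it would not say which one. A minimal Python sketch of a per-run summary follows, assuming the `res` file produced by the loop above holds one imbalance value per run (blank lines are skipped). The function name `summarize_imbalances` and the `tol` parameter are hypothetical and are not part of LBAF.

```python
from pathlib import Path


def summarize_imbalances(text, tol=0.0):
    """Return (n_runs, failing), where failing lists (run_index, value) pairs.

    Assumes one imbalance value per non-blank line; tol is a hypothetical
    tolerance below which an imbalance is treated as zero.
    """
    values = [float(line.split()[0]) for line in text.splitlines() if line.strip()]
    failing = [(i, v) for i, v in enumerate(values, start=1) if abs(v) > tol]
    return len(values), failing


if __name__ == "__main__":
    res = Path("res")  # file name taken from the shell loop above
    if res.exists():
        n_runs, failing = summarize_imbalances(res.read_text())
        print(f"{n_runs} runs, {len(failing)} with non-zero imbalance")
        for run, value in failing:
            print(f"  run {run}: imbalance = {value}")
```

With all runs at 0.0, the list of failing runs is empty, which matches the awk result of 0.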

This suggests that the problem lies with the CI system itself, and specifically with Python 3.8.
@cwschilly

@ppebay
Contributor

ppebay commented Feb 9, 2025

In keeping with the above comment, I created a Python 3.8 environment and performed the same test (100 runs):

for run in {1..100}; do python LBAF_app.py -c synthetic-acceptance.yaml; (cat ../../../output/imbalance.txt; echo) >> res; done

and obtained the same result:

awk '{ sum += $1 } END { print sum }' res
0

i.e., all 100 runs passed the test.

Specifically, the Python version is:

 python --version
Python 3.8.20

@cwschilly can we verify if this is the same version that runs in CI?

@cwschilly
Contributor Author

cwschilly commented Feb 10, 2025

@ppebay The CI uses Python 3.8.18. I have been able to recreate the problem locally by running tox with Python 3.8.19, so I don't think it's just a problem with CI.

@cwschilly cwschilly marked this pull request as ready for review February 17, 2025 20:12
@cwschilly cwschilly requested a review from ppebay February 17, 2025 20:12
@cwschilly cwschilly marked this pull request as draft February 18, 2025 20:30
@cwschilly
Contributor Author

After changes to the ordering of ranks, Python 3.8 acceptance passed several times in a row. However, Python 3.11 failed: https://github.com/DARMA-tasking/LB-analysis-framework/actions/runs/13397121375/job/37418861102?pr=590
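
The thread does not show what the ordering change was. As a purely illustrative sketch, assuming the instability comes from visiting ranks in an order that is not fixed from run to run (for example, iterating over a set), sorting by a stable key before iterating makes the traversal reproducible. The `Rank` class, its fields, and `ranks_in_stable_order` are hypothetical and are not taken from LBAF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rank:
    """Hypothetical stand-in for a rank; not the LBAF class."""
    rank_id: int
    load: float


def ranks_in_stable_order(ranks):
    """Return the ranks sorted by id, independent of the container's own order."""
    return sorted(ranks, key=lambda r: r.rank_id)


# Set iteration order is an implementation detail; the sorted order is not.
ranks = {Rank(3, 1.0), Rank(0, 4.0), Rank(2, 0.5), Rank(1, 2.0)}
print([r.rank_id for r in ranks_in_stable_order(ranks)])  # prints [0, 1, 2, 3]
```

Whether this matches the actual cause of the Python 3.11 failure is not established by the comments above.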

Successfully merging this pull request may close these issues.

Investigate erratic and unpredictable CI error regarding synthetic case
2 participants