[nightshift] investigate slow/flaky CI tests #5762
Open
claude-nightshift[bot] wants to merge 1 commit into
`test_high_char_5gram_jaccard_pair_clusters_end_to_end` runs the full normalize → minhash → fuzzy_dups pipeline on a synthetic J≈0.97 pair to verify that high-Jaccard pairs share one `dup_cluster_id` with exactly one canonical. The same end-to-end path with the same assertions is already exercised by `test_fuzzy_dups_single_source_schema_and_pair` on the fox-corpus high-Jaccard pair (also well above the LSH threshold at b=26, r=11). LSH-level recall across the Jaccard band is independently covered by the parametrized `test_high_char_5gram_jaccard_pairs_share_lsh_bucket`. Removing the redundant test trims ~32s from the unit-test job's slowest tail without weakening coverage.
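The shared property both end-to-end tests assert can be sketched as below. This is only an illustration, not the code in `test_fuzzy.py`: the row layout and the `id` and `is_canonical` field names are assumptions made for the example; only `dup_cluster_id` appears in this PR.

```python
def check_pair_clusters(rows, doc_ids):
    """Check that doc_ids share one cluster with exactly one canonical row.

    `rows` is a list of dicts. The `id` and `is_canonical` keys are
    hypothetical names for this sketch; `dup_cluster_id` is from the PR.
    """
    selected = [row for row in rows if row["id"] in doc_ids]
    cluster_ids = {row["dup_cluster_id"] for row in selected}
    # Property 1: every document of the pair lands in the same cluster.
    assert len(cluster_ids) == 1, f"expected one cluster, got {cluster_ids}"
    # Property 2: exactly one member of that cluster is marked canonical.
    canonical = [row for row in selected if row["is_canonical"]]
    assert len(canonical) == 1, f"expected one canonical, got {len(canonical)}"
    return next(iter(cluster_ids))


# Hypothetical output rows for the fox-corpus pair named in the summary.
example_rows = [
    {"id": "test_contaminated_1", "dup_cluster_id": "c1", "is_canonical": True},
    {"id": "test_high_overlap", "dup_cluster_id": "c1", "is_canonical": False},
]
shared = check_pair_clusters(example_rows, {"test_contaminated_1", "test_high_overlap"})
```

Because the check depends only on the output rows, it applies equally to the synthetic pair and the fox-corpus pair, which is why keeping one end-to-end test is enough to cover it.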
Summary
The Nightshift CI test audit flagged four slow unit tests in the `Marin - Unit` workflow. Investigation showed that one of them, `tests/processing/classification/deduplication/test_fuzzy.py::test_high_char_5gram_jaccard_pair_clusters_end_to_end` (~33s), provides no unique coverage:
- It runs the full normalize → minhash → fuzzy_dups pipeline on a synthetic J≈0.97 pair to verify that high-Jaccard pairs share one `dup_cluster_id` with exactly one canonical.
- `test_fuzzy_dups_single_source_schema_and_pair` already runs the same end-to-end pipeline (including connected components and per-source attr-output emission) on the fox-corpus fuzzy pair (`test_contaminated_1`/`test_high_overlap`, also well above the LSH collision threshold at b=26, r=11) and asserts the same shared-`dup_cluster_id`/exactly-one-canonical properties.
- LSH-level recall across the Jaccard band is independently covered by the parametrized `test_high_char_5gram_jaccard_pairs_share_lsh_bucket` (which exercises the synthetic-pair construction at b=26, r=11 across `target_j ∈ {0.95, 0.97, 0.99}` × 20 seeds).
Removing the redundant test trims ~32s from the unit-test job's slowest tail without weakening coverage.
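
To make "well above the LSH collision threshold at b=26, r=11" concrete, here is a minimal sketch of the textbook banding formula. It assumes the pipeline uses standard banded MinHash LSH with independent hash functions, which this PR does not show; the function names are illustrative and not part of the codebase.

```python
def lsh_collision_probability(jaccard: float, bands: int, rows: int) -> float:
    """Idealized chance that two documents share at least one LSH band.

    Textbook banding model: a band matches with probability jaccard**rows,
    and bands are treated as independent.
    """
    return 1.0 - (1.0 - jaccard ** rows) ** bands


def approximate_threshold(bands: int, rows: int) -> float:
    """Common rule-of-thumb for where the S-curve rises: (1/b)**(1/r)."""
    return (1.0 / bands) ** (1.0 / rows)


if __name__ == "__main__":
    b, r = 26, 11  # values quoted in the PR description
    print(f"approximate threshold: {approximate_threshold(b, r):.3f}")
    for target_j in (0.95, 0.97, 0.99):
        p = lsh_collision_probability(target_j, b, r)
        print(f"J={target_j}: collision probability {p:.12f}")
```

Under this model the threshold sits near J≈0.74, and every `target_j` in the parametrized test has a collision probability within about 1e-9 of 1, so pairs in the 0.95 to 0.99 band are expected to share a bucket.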
The other three slow candidates (`test_fuzzy_dups_single_source_schema_and_pair`, `test_connected_components_happy_path`, `test_ghalogs_public_normalize_steps_write_datakit_normalized_train_partition`) each provide unique end-to-end coverage of a real pipeline path, and their cost is dominated by intrinsic `ZephyrContext`/`StepRunner` setup, so no in-scope speedup justified touching them in this pass.
Test plan
- `./infra/pre-commit.py --files tests/processing/classification/deduplication/test_fuzzy.py`: green.
- `uv run pytest tests/processing/classification/deduplication/test_fuzzy.py`: 71 passed, 1 xfailed (data_integration tests included).