e2e: correct `TestSingleAffinities` behavior #25943

pkazmierczak · 2025-05-28T14:55:01Z

TestSingleAffinities never expected a node with affinity score set to 0 in
the set of returned nodes. However, since #25800, this can happen. What the
test should be checking for instead is that the node with the highest normalized
score has the right affinity.
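
For illustration only, the corrected check could look roughly like the sketch below. It is not the code from this PR: `nodeScore`, `topByNormScore`, and `winnerHasAffinity` are hypothetical names, and the struct is a simplified stand-in for the score metadata recorded on an allocation. The idea is that a node with a `node-affinity` score of 0 may appear in the list, as long as it is not the node with the highest normalized score.

```go
package main

import "fmt"

// nodeScore is a simplified, hypothetical stand-in for the per-node score
// metadata recorded on an allocation's metrics.
type nodeScore struct {
	NodeID    string
	Scores    map[string]float64 // per-scorer values, e.g. "node-affinity"
	NormScore float64            // final normalized score for the node
}

// topByNormScore returns the entry with the highest normalized score.
// The boolean is false when the slice is empty.
func topByNormScore(scores []nodeScore) (nodeScore, bool) {
	if len(scores) == 0 {
		return nodeScore{}, false
	}
	best := scores[0]
	for _, s := range scores[1:] {
		if s.NormScore > best.NormScore {
			best = s
		}
	}
	return best, true
}

// winnerHasAffinity reports whether the highest-scoring node carries the
// expected node-affinity score. Other nodes are allowed to score 0.
func winnerHasAffinity(scores []nodeScore, want float64) bool {
	best, ok := topByNormScore(scores)
	return ok && best.Scores["node-affinity"] == want
}

func main() {
	scores := []nodeScore{
		{NodeID: "node-a", Scores: map[string]float64{"node-affinity": 0}, NormScore: 0.25},
		{NodeID: "node-b", Scores: map[string]float64{"node-affinity": 1}, NormScore: 0.75},
	}
	fmt.Println(winnerHasAffinity(scores, 1))
}
```

An assertion that no returned node has an affinity score of 0 would fail on this input, while the check on the top-scoring node passes.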

Internal ref: https://hashicorp.atlassian.net/browse/NMD-797

This is the very first e2e test we run, and since I cannot reproduce this problem, my guess is we might be checking the alloc metric score too quickly on the GHA runner?

tgross

This doesn't seem like it's fixing the right problem. Isn't the allocation being created in the same plan that records the score metadata? If this is flaky, is it possible that we're scheduling onto a host with an unexpected score because of the result limiter?

pkazmierczak · 2025-05-28T15:10:26Z

> result limiter

what's a result limiter?

pkazmierczak · 2025-05-28T15:28:05Z

My suspicions are around client readiness. I cannot make this test fail no matter how hard I try if I manually run it against the e2e cluster. But it's the very first test that we run. Could it be that a dc1 node is just not ready when this is run? Or does it not make sense?

tgross · 2025-05-28T15:43:18Z

> what's a result limiter?

Sorry, I mean the LimitIterator. The scheduler doesn't check all the nodes; it only checks a number up to a certain limit (2 for the generic scheduler for non-spread workloads) and then picks the best of those options. So I'm suggesting that we're iterating over 2 nodes, and finding that the best score includes a 0 for node-affinity rather than 1. Which strongly makes me think we're hitting a side-effect of #25800. It would be interesting to look at the rest of the scores.

> My suspicions are around client readiness. I cannot make this test fail no matter how hard I try if I manually run it against the e2e cluster. But it's the very first test that we run. Could it be that a dc1 node is just not ready when this is run? Or does it not make sense?

That's possible but that still implies that it's about which particular set of nodes is ready and which order we're looking at them.
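
To make the ordering point concrete, here is a minimal sketch of the "look at only the first N nodes, then pick the best of those" behavior described above. It is an assumption-laden toy, not Nomad's LimitIterator: `candidate` and `pickWithLimit` are hypothetical names, and it ignores anything else the real iterator does.

```go
package main

import "fmt"

// candidate is a hypothetical, simplified scored node.
type candidate struct {
	NodeID string
	Score  float64
}

// pickWithLimit considers only the first `limit` candidates in iteration
// order and returns the best-scoring one among them. Candidates beyond the
// limit are never looked at, even if they would score higher.
func pickWithLimit(nodes []candidate, limit int) (candidate, bool) {
	if limit > len(nodes) {
		limit = len(nodes)
	}
	if limit <= 0 {
		return candidate{}, false
	}
	best := nodes[0]
	for _, n := range nodes[1:limit] {
		if n.Score > best.Score {
			best = n
		}
	}
	return best, true
}

func main() {
	// Same three nodes, two different iteration orders, limit of 2.
	orderA := []candidate{{"n1", 0.2}, {"n2", 0.3}, {"n3", 0.9}}
	orderB := []candidate{{"n3", 0.9}, {"n1", 0.2}, {"n2", 0.3}}

	a, _ := pickWithLimit(orderA, 2)
	b, _ := pickWithLimit(orderB, 2)
	fmt.Println(a.NodeID, b.NodeID)
}
```

With a limit of 2, the first order never reaches `n3`, so the winner depends on which nodes are ready and in which order they are visited.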

tgross

LGTM!

github-actions · 2025-09-28T02:22:06Z

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

e2e: retry on TestSingleAffinities

28d1845

This is the very first e2e test we run, and since I cannot reproduce this problem, my guess is we might be checking the alloc metric score too quickly on the GHA runner?

pkazmierczak requested review from jrasell and tgross May 28, 2025 14:55

pkazmierczak requested review from a team as code owners May 28, 2025 14:55

pkazmierczak added the theme/e2e label May 28, 2025

vercel bot deployed to Preview – nomad-ui May 28, 2025 14:55

tgross reviewed May 28, 2025

pkazmierczak marked this pull request as draft May 28, 2025 15:05

test refactor

6a50469

pkazmierczak changed the title ~~e2e: retry on TestSingleAffinities~~ e2e: correct TestSingleAffinities behavior May 30, 2025

vercel bot deployed to Preview – nomad-ui May 30, 2025 16:24

pkazmierczak marked this pull request as ready for review May 30, 2025 16:25

pkazmierczak requested a review from tgross May 30, 2025 16:25

tgross approved these changes May 30, 2025

pkazmierczak merged commit 348177d into main May 30, 2025
34 checks passed

pkazmierczak deleted the b-affinity-single-e2e branch May 30, 2025 17:46

github-actions bot locked as resolved and limited conversation to collaborators Sep 28, 2025
