Summary
tests/test_e2e_live.py::TestChangePassword::test_change_password_success_invalidates_other_sessions fails intermittently (~30% of runs). The assertion that trips is at line 1204:
me_a = a.get("/api/v1/auth/me")
assert me_a.status_code == 200, f"client A should keep working; got {me_a.status_code}"
Client A — the session that performed the password change and whose cookies were just rotated — occasionally gets 401 on its next protected call, when it should keep working. Client B (the other session) is correctly invalidated; that half of the test is stable.
Evidence (same commit, different outcomes)
| Run | Result |
| --- | --- |
| main 2786564 (CI -- Full Stack) | FAIL |
| PR #100 (next 16) rebased | FAIL |
| PR #99 (aiodns) rebased | PASS |
| PR #96 (python) | PASS |
The bump PRs are unrelated to auth — the test fails/passes regardless of the PR content, which is the signature of a flaky test, not a regression. It is not blocking merges (main is not branch-protected; E2E is not a required check), but it makes CI noisy and erodes trust in the suite.
Static analysis — what it is not
The obvious suspect is the same-second iat / password_changed_at watermark race, but that path is sound:
- change_password sets password_changed_at = next_second = floor(now) + 1 (auth.py:965-968) and mints the rotated tokens with iat_override=next_second (auth.py:1002).
- get_current_user compares in epoch seconds: if iat_seconds < pwc_seconds: 401 (dependencies.py:126-130).
- For client A's rotated token, iat == pwc == floor(now)+1, so iat < pwc is False → it passes. By construction the watermark cannot 401 client A's rotated token.
So the 401 means client A's me_a request is being sent with the stale (pre-rotation) access-token cookie, whose iat is the original login second < pwc → correctly bounced.
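The watermark argument above can be checked with a few lines of arithmetic. This is a minimal sketch, not the code in auth.py or dependencies.py: the function names `next_second` and `is_rejected` and the timestamps are made up for illustration, and only the `floor(now) + 1` rule and the `iat < pwc` comparison come from the analysis above.

```python
import math


def next_second(now: float) -> int:
    """Watermark value: the start of the second after `now`."""
    return math.floor(now) + 1


def is_rejected(iat_seconds: int, pwc_seconds: int) -> bool:
    """Mirror of the quoted check: a token is bounced only when iat < pwc."""
    return iat_seconds < pwc_seconds


# Hypothetical wall-clock time of the password change.
now = 1_700_000_000.73
# Hypothetical login time: both sessions logged in two minutes earlier.
login_iat = math.floor(now) - 120

pwc = next_second(now)   # password_changed_at
rotated_iat = pwc        # rotated tokens are minted with iat_override=next_second

print(is_rejected(rotated_iat, pwc))  # False: the rotated token passes
print(is_rejected(login_iat, pwc))    # True: any pre-rotation token gets 401
```

Even a token minted in the same second as the password change has iat = floor(now), which is still below the watermark, so only a request carrying a pre-rotation token can produce the observed 401.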
Likely root cause
Cookie rotation on the 204 No Content response is occasionally not picked up by the client before me_a fires:
- change_password returns 204 and relies on the Set-Cookie headers attached to the injected Response (_build_token_response → _set_auth_cookies). The endpoint docstring already warns that Set-Cookie on 204 is fragile across Starlette versions ("Constructing a fresh Response(...) with headers=response.headers drops them on the floor — never do that here").
- If the new access_token cookie isn't applied to the httpx.Client jar in time, me_a reuses the old token and gets a (correct) 401.
This is consistent with the intermittency: it depends on Set-Cookie parsing/timing on the 204, not on wall-clock second boundaries.
Suggested investigation / fix directions
- Confirm the cookie hypothesis: in the test, after the change-password 204, assert that a.cookies.get("access_token") actually changed before calling me_a. If it didn't, the bug is the 204 Set-Cookie handling (server or client side).
- Server-side option: return 200 with an explicit body from change_password instead of 204, so Set-Cookie is carried on a response shape that every Starlette/httpx version handles uniformly. (Behaviour change: weigh against the current 204 contract.)
- Test-side option: if the production 204 behaviour is correct for real browsers and only the test client is racing, make the test deterministic: re-read the rotated cookie explicitly, or add a single retry/refresh step that mirrors what the real SPA api-client does on a 401.
- Either way, lock the outcome with a deterministic assertion so this can't silently flake again.
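The first bullet (confirming the cookie hypothesis) could look roughly like the sketch below. It is an assumption-laden illustration, not the existing test code: `cookie_rotated`, `change_password_and_check`, and the `do_change_password` callable are hypothetical names, and the only thing taken from the test is that the client exposes `cookies.get("access_token")` and the endpoint returns 204.

```python
def cookie_rotated(before, after):
    """True only if a cookie value exists after the call and differs from before."""
    return after is not None and after != before


def change_password_and_check(client, do_change_password, name="access_token"):
    """Run the change-password call and fail loudly if the jar was not updated.

    `client` is anything with a `cookies.get(name)` accessor (for example the
    httpx.Client used in the test); `do_change_password` is a hypothetical
    callable that performs the request and returns the response.
    """
    before = client.cookies.get(name)
    response = do_change_password(client)
    after = client.cookies.get(name)

    assert response.status_code == 204, (
        f"change-password should return 204; got {response.status_code}"
    )
    # If this trips, the 401 on the next call comes from a stale cookie,
    # not from the password_changed_at watermark.
    assert cookie_rotated(before, after), (
        f"{name} cookie was not rotated by the 204 response"
    )
    return response
```

Calling a helper like this before me_a would turn an intermittent 401 into a failure that names the stale cookie directly, which separates the Set-Cookie hypothesis from any remaining server-side cause.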
Scope
Separate from the dependabot PRs (#99, #100) and from the v2.5.x release work — surfaced during PR triage. Filing so it isn't lost.