Fix flaky tests (homepage-settings.spec.js duplicate Sample Page / Sample page row match, publish-panel.spec.js focus assertion before panel close completed)#77893
Open
danluu wants to merge 1 commit intoWordPress:trunkfrom
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is part of an AI fuzzing project, where an AI wrote a fuzzer and then triages bugs from the fuzzer and creates fixes. See #77716 for the tracking issue. As of this writing, there have been no known false positives from this project, but there have been some issues, which are documented in #77716. I expect we’ll see false positives at some point (and may even have one that’s been filed in a PR that hasn’t been inspected by a code owner yet).
What?
Gutenberg CI tests seem to be somewhat flaky. Of the PRs I've submitted recently, it feels like most PRs that have no "real" CI failures fail on initial submission and need to be re-run once or twice to pass CI. I haven't actually run the numbers; maybe it's 25% or 30% or something. In any case, a decent number of CI runs fail due to flakes.
This PR attempts to fix #68892, #77385, and #77721. It probably make sense to clean up more than just these, but since I'm not familiar with this codebase, as with the RTC fuzzing bugs, I wanted to start with a small PR to get people's opinions on automated AI flake fixing before submitting more or larger PRs.
AI TEXT
The revised plan is a PR boundary, not a complete flake program. Its purpose is to select fixes where the cause is local, the reviewer can understand the change from one spec file, and the original behavior remains covered by the same end-to-end assertion. That is why the best first PR is limited to
homepage-settings.spec.js: the failure mechanism is a fixture and locator problem, and the fix can isolate the page fixtures and scope row selection without deleting coverage or changing the user-visible behavior under test.The plan intentionally rejects fixes that depend on later cleanup, monitoring, or policy enforcement. It does not use skips, quarantines, broad retries, weakened assertions, or test deletion because those reduce the chance that the PR actually improves correctness. A fix is acceptable only when it replaces a race with a deterministic condition tied to the operation being tested, such as an exact locator, an already-observable UI state, or a clearly completed editor action.
The excluded areas are not dismissed as unimportant. RTC synchronization, router readiness, Openverse/media behavior, upload lifecycle semantics, REST retry policy, and reporter hygiene can all be real sources of flakiness. They are excluded from the first PR because each has a wider review surface and a higher risk of either changing product behavior or adding infrastructure that needs ongoing ownership. Those should be handled as separate focused changes only when there is concrete causal evidence and the fix preserves or strengthens the existing coverage.
Revised Fix Plan
requestUtilsretry layer, dashboard, bot rule, skip/quarantine mechanism, or any other change that requires ongoing follow-up.socket hang up.--repeat-eachfor the modified tests, and verify the original behavior is still asserted. Do not make post-merge monitoring or future cleanup part of the PR's correctness.Low-Risk PR Shape
homepage-settings.spec.jsonly. It is a clear test-fixture/locator bug: isolate page fixtures and scope rows exactly while preserving the same homepage/posts-page action assertions.publish-panel.spec.jsonly if the change is a narrow wait for panel closure oraria-expanded=falsebefore the existing focus assertion. Keep it test-only.post-content-focus-mode.spec.jsandclassic.spec.jsas follow-up PRs unless the first PR remains trivially small. They are still local, but they touch different editor flows and are easier to review separately.Concrete Small-Scope Fix Selection
Checked again on 2026-05-02, the only fixes that fit the low-risk idea are the two test-only fixes below. They are direct responses to the latest retained failure examples, they preserve the existing assertions, and they do not require shared helpers, product changes, retries, skips, quarantines, or follow-up ownership.
test/e2e/specs/site-editor/homepage-settings.spec.js/should show correct homepage actions based on current homepage or posts pageHomepage,Sample page, andDraft page, then derive row locators from exact page titles rather than broadgetByLabel()matches.Sample PageandSample pageas two matching rows. Fixture reset removes leaked page rows; exact row selection prevents case/substring title collisions from selecting the wrong DataViews row. The same homepage and posts-page menu assertions remain in place.test/e2e/specs/editor/various/publish-panel.spec.js/should move focus back to the Publish panel toggle button when cancelingCancel, wait for the publish panel to actually close, preferably by waiting for the top-bar Publish toggle to havearia-expanded="false", then keep the existingtoBeFocused()assertion.aria-expanded="true"while the test is already asserting focus. Waiting for the close state ties the assertion to the operation under test instead of sampling focus while the panel is still open.Actual PR Branch Fix Coverage
The created code branch is
try/test-flakes-prat commit0aba5a3831f(Stabilize small-scope flaky e2e tests). It changes onlytest/e2e/specs/site-editor/homepage-settings.spec.jsandtest/e2e/specs/editor/various/publish-panel.spec.js.That branch fixes these flakes:
homepage-settings.spec.js, then replaces broadgetByLabel()row filtering with exact, case-sensitive row locators. This directly targets the observed strict-mode failure where bothSample PageandSample pagematched the same intended row lookup.aria-expanded="false"after clickingCancel, then keeps the originaltoBeFocused()assertion. This directly targets the observed failure where the test asserted focus while the panel was still open (aria-expanded="true").That branch does not fix #77720 (
post-content-focus-mode.spec.js) or #77705 / #36123 (classic.spec.js). Those remain intentionally excluded from the small PR because they need focused review of broader editor flows.Do not include
post-content-focus-mode.spec.jsin this PR even though its latest failure is a local-looking wait forInitial content. That flake lives inside the "Show template" rendering mode and template-part embedding path; a correct fix probably needs a Post Content readiness helper and should be reviewed as a focused editor-flow PR.Do not include
classic.spec.jsin this PR even though explicitly selecting the Classic block after undo might address the latest toolbar timeout. The test covers a long TinyMCE/media-modal/upload/convert/undo flow and has a long historical flake issue; keep it separate so reviewers can evaluate whether the fix actually preserves the media conversion and undo coverage rather than merely making the toolbar button clickable.END AI TEXT