
[BUG] Test isolation: TestYamlModeDashboardRegistration register/remove pair fails under default xdist scheduling #1196

Description

@Patch76

Summary

tests/src/e2e/workflows/filesystem/test_yaml_config.py::TestYamlModeDashboardRegistration::test_remove_dashboard_entry fails non-deterministically in CI (most reliably on the 4-worker ubuntu-24.04-arm matrix, occasionally observable on ubuntu-latest-3-workers). Root cause is a test-isolation bug × CI-command interaction, not a flake.

Mechanics

  • test_register_dashboard_entry (line 680) writes lovelace.dashboards.ha-mcp-test-dash into configuration.yaml via ha_config_set_yaml.
  • test_remove_dashboard_entry (line 710) tries to remove the same key. Both share URL_PATH = "ha-mcp-test-dash" as a class attribute (line 677), so remove is order-dependent on register.
  • The HA testcontainer fixture ha_container_with_fresh_config is scope="session" (tests/src/e2e/conftest.py:259). Under pytest-xdist, session-scoped fixtures are per-worker — each worker boots its own container with its own configuration.yaml.
  • .github/workflows/pr.yml:179-180 runs the E2E suite as pytest tests/src/e2e/ -n${{ matrix.pytest_workers }} --tb=short -v, without --dist loadscope. The default --dist load distributes tests across workers with no class affinity.
  • When register lands on worker A and remove lands on worker B, worker B's configuration.yaml never saw the registration → Dashboard 'ha-mcp-test-dash' not found under lovelace.dashboards.
  • Probability scales with worker count: arm (4 workers) higher than ubuntu-latest (3 workers).
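The failure mode above can be reproduced in miniature without Home Assistant. The sketch below is a toy model, not the project's code: `Worker`, `run_pair`, and the placeholder dashboard entry are hypothetical names standing in for an xdist worker with its own session-scoped container and private configuration.yaml.

```python
URL_PATH = "ha-mcp-test-dash"


class Worker:
    """Toy stand-in for one xdist worker and its private HA container."""

    def __init__(self, name):
        self.name = name
        # Each worker's container has its own configuration.yaml (modeled as a dict).
        self.config = {"lovelace": {"dashboards": {}}}

    def register(self, url_path):
        # Placeholder entry; the real test writes its own dashboard definition.
        self.config["lovelace"]["dashboards"][url_path] = {"mode": "yaml"}

    def remove(self, url_path):
        dashboards = self.config["lovelace"]["dashboards"]
        if url_path not in dashboards:
            raise KeyError(
                f"Dashboard '{url_path}' not found under lovelace.dashboards"
            )
        del dashboards[url_path]


def run_pair(register_worker, remove_worker):
    """Return True if remove succeeds, False if it hits the 'not found' error."""
    register_worker.register(URL_PATH)
    try:
        remove_worker.remove(URL_PATH)
    except KeyError:
        return False
    return True


gw0 = Worker("gw0")
same_worker_ok = run_pair(gw0, gw0)               # both tests on one worker
split_ok = run_pair(Worker("gw0"), Worker("gw1"))  # tests split across workers
print(same_worker_ok, split_ok)
```

The pair only passes when both calls hit the same `Worker`, which is the affinity that the default scheduling does not guarantee.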

Reproduction

Live failure: https://github.com/homeassistant-ai/ha-mcp/actions/runs/25624197082/job/75216186120 (PR #1191 head c765653, arm-only, ubuntu-latest passed). The PR diff itself is comment/docstring cleanup — no logic change touches tools_yaml_config.py or test_yaml_config.py.

AGENTS.md documents the local invocation as cd tests && uv run pytest src/e2e/ -n2 --dist loadscope -v --tb=short. Local runs don't manifest because --dist loadscope keeps same-class tests on the same worker.
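As a rough illustration of why loadscope hides the problem, the sketch below groups test node IDs by everything before the final `::`, which is the class for test methods. This is a simplified model of the grouping rule, assumed here for illustration rather than copied from pytest-xdist; `scope_of` and `group_by_scope` are hypothetical helpers.

```python
from collections import defaultdict


def scope_of(nodeid):
    # Simplified scope rule: drop the last "::" component, leaving the
    # class for test methods (or the module for plain test functions).
    return nodeid.rsplit("::", 1)[0]


def group_by_scope(nodeids):
    """Bucket node IDs so that each bucket would be sent to one worker."""
    groups = defaultdict(list)
    for nodeid in nodeids:
        groups[scope_of(nodeid)].append(nodeid)
    return dict(groups)


MODULE = "tests/src/e2e/workflows/filesystem/test_yaml_config.py"
CLASS = "TestYamlModeDashboardRegistration"
nodeids = [
    f"{MODULE}::{CLASS}::test_register_dashboard_entry",
    f"{MODULE}::{CLASS}::test_remove_dashboard_entry",
]

groups = group_by_scope(nodeids)
for scope, members in groups.items():
    print(scope, len(members))
```

Both tests fall into a single bucket, so under this model they always travel to the same worker together.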

Fix options (maintainer's call)

  1. CI command alignment — add --dist loadscope to pr.yml:180. Matches the documented local command. Reuse-friendly: keeps the test as written.
  2. Self-contained tests — fold register-then-remove into a single test body, or use a fixture that registers before each remove. More code, more coverage redundancy, but no CI-config dependency.
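  3. @pytest.mark.xdist_group(name="yaml_dashboards") on TestYamlModeDashboardRegistration: explicit class-level grouping. Note that pytest-xdist honors xdist_group only when run with --dist loadgroup, so the CI command would still need that flag; once set, the grouping no longer depends on class or module boundaries.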

Option 1 is the smallest delta; option 3 is the most defensive.
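For option 2, a minimal sketch of the folded test might look like the following. `FakeYamlConfig` is a hypothetical in-memory stand-in for the configuration.yaml that the real tests edit through ha_config_set_yaml; the actual test would call the tool against the container instead, and the placeholder entry is not the real dashboard definition.

```python
URL_PATH = "ha-mcp-test-dash"


class FakeYamlConfig:
    """Hypothetical stand-in for lovelace.dashboards in configuration.yaml."""

    def __init__(self):
        self.dashboards = {}

    def add_dashboard(self, url_path, entry):
        self.dashboards[url_path] = entry

    def remove_dashboard(self, url_path):
        if url_path not in self.dashboards:
            raise KeyError(
                f"Dashboard '{url_path}' not found under lovelace.dashboards"
            )
        del self.dashboards[url_path]


def test_register_then_remove_dashboard_entry():
    # Register and remove in one test body, so the pair always runs on the
    # same worker regardless of how tests are distributed.
    yaml_config = FakeYamlConfig()

    yaml_config.add_dashboard(URL_PATH, {"mode": "yaml"})
    assert URL_PATH in yaml_config.dashboards

    yaml_config.remove_dashboard(URL_PATH)
    assert URL_PATH not in yaml_config.dashboards


# Runs as a plain function; pytest would also collect it by name.
test_register_then_remove_dashboard_entry()
```

The trade-off is the one noted in option 2: registration is exercised twice if the standalone register test is kept, but the remove path no longer depends on another test having run first.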

Out of scope for

PR #1191 (perf: dedupe lovelace/dashboards/list). Surfaced there only because a probabilistic re-trigger landed on the bad scheduling combination.

Labels: bug, ci, triaged
