Using mutmut with Django: a working recipe (conftest + pyproject) — also closes the gap in #456 #504

@willianantunes

Opening this as an issue because Discussions aren't enabled on the repo — happy to close/move it if there's a better venue. This is intended as a reference for other Django users who hit the same integration problems I did, and a prompt for whether any of it deserves to land in the official docs.

Relation to #456. This is effectively a complete answer to the open question in #456 ("Is there a recommended way to use mutmut with Django projects?"). @Otto-AA's reply there correctly points at paths_to_mutate=<root> + a CLI glob (per #414), which is necessary but not sufficient for a Django app — the OP's second failure mode ("64 passed … we could not find any test case for any mutant") is caused by Django not being bootstrapped inside mutants/ when pytest runs there. The three pieces below (root conftest.py with an idempotent bootstrap, also_copy, and pytest_add_cli_args_test_selection) close that gap.

TL;DR

mutmut works great against a Django codebase, but getting there requires three non-obvious pieces to click together:

  1. A root conftest.py that bootstraps Django idempotently across the many pytest.main() calls mutmut makes per process.
  2. A [tool.mutmut] config that copies that conftest into mutants/, keeps the whole project tree as paths_to_mutate (so the type checker can still resolve intra-project imports), and narrows pytest collection to a single test file per target.
  3. A CLI glob passed to mutmut run so only the target module's mutants are actually executed, even though every file was generated.

Full working setup below — battle-tested on a Django 5.2 + Python 3.14 + pyrefly project, reaching ~83% effective mutation score on service modules.


The problem

Django's test story assumes python manage.py test (unittest-based DiscoverRunner). mutmut drives tests via pytest.main(). Three friction points:

  • Django isn't configured at pytest-collection time, so importing any django.test.TestCase subclass explodes.
  • mutmut invokes pytest.main() multiple times per process (for stats, clean baseline, forced-fail check, per-mutant forks). A naive django.setup() + setup_databases() at module import would run over and over, tearing down and recreating the test DB each time.
  • If you restrict paths_to_mutate to a single file to "make it fast", your type-check command can no longer resolve imports from sibling modules, so every mutant fails type-check for the wrong reason.

The fix

1. Root conftest.py

"""
Root conftest.py loaded only by pytest (used indirectly via mutmut).

Bootstraps Django and the test database so `django.test.TestCase`
subclasses run under plain pytest without pytest-django. Copied into
`mutants/conftest.py` by mutmut via `[tool.mutmut].also_copy`. Django's
own test runner does not load pytest conftests, so this file is inert
under `python manage.py test`.

Why guard on `_TestState` instead of a module-level flag: mutmut invokes
`pytest.main()` multiple times per process (stats, clean run, forced-fail,
per-mutant forks). Between invocations pytest re-imports conftests, which
resets module-level variables but leaves Django's already-imported modules
(and therefore `django.test.utils._TestState`) intact. Using Django's own
state as the idempotency check survives the re-import. `keepdb=True` makes
`setup_databases` idempotent on disk as well.
"""

from __future__ import annotations

import os

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "onsen.settings")

import django  # noqa: E402

django.setup()

from django.test.runner import DiscoverRunner  # noqa: E402
from django.test.utils import _TestState  # noqa: E402
from django.test.utils import setup_test_environment  # noqa: E402

if not hasattr(_TestState, "saved_data"):
    setup_test_environment()
    _runner = DiscoverRunner(verbosity=0, keepdb=True)
    _runner.setup_databases()

Key insights:

  • Idempotency check uses django.test.utils._TestState, not a module-level _bootstrapped = True flag. Pytest re-imports the conftest between pytest.main() calls, wiping module globals — but Django's _TestState class attribute persists because Django's own modules are not re-imported. Checking hasattr(_TestState, "saved_data") is a reliable "did we already call setup_test_environment()?" probe.
  • keepdb=True makes setup_databases() reuse the existing test DB on subsequent calls instead of dropping and recreating it. Combined with the _TestState guard, Django is fully bootstrapped exactly once per process.
  • pytest-django is intentionally not used. The project's canonical test runner is python manage.py test; adding pytest-django would introduce a parallel test-DB lifecycle. This conftest is the minimum shim to let mutmut's pytest-driven pipeline coexist with Django's unittest runner.
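The difference between the two guards can be reproduced without Django. The sketch below is only an illustration: `FakeTestState` is a hypothetical stand-in for `django.test.utils._TestState`, and running the conftest body in a fresh namespace is an assumed approximation of pytest re-importing the conftest between `pytest.main()` calls.

```python
import types


class FakeTestState:
    """Hypothetical stand-in for django.test.utils._TestState (imported once per process)."""


# Counts how often each guard lets the "expensive" setup run.
setup_calls = {"flag_guard": 0, "state_guard": 0}

CONFTEST_SOURCE = '''
# Guard 1: module-level flag. A fresh import starts with an empty namespace,
# so the flag is never found and setup runs again.
if not globals().get("_bootstrapped"):
    setup_calls["flag_guard"] += 1
    _bootstrapped = True

# Guard 2: attribute on a class that lives outside the conftest module.
if not hasattr(FakeTestState, "saved_data"):
    setup_calls["state_guard"] += 1
    FakeTestState.saved_data = object()
'''


def import_conftest_fresh() -> None:
    """Run the conftest body in a brand-new module namespace, as a re-import would."""
    module = types.ModuleType("fake_conftest")
    module.__dict__.update(FakeTestState=FakeTestState, setup_calls=setup_calls)
    exec(CONFTEST_SOURCE, module.__dict__)


for _ in range(3):  # three simulated pytest.main() invocations in one process
    import_conftest_fresh()

print(setup_calls)  # {'flag_guard': 3, 'state_guard': 1}
```

The flag guard fires on every simulated import, while the guard on the long-lived class fires once, which is the behaviour the conftest relies on.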

2. pyproject.toml

[tool.mutmut]
paths_to_mutate = ["onsen"]
pytest_add_cli_args_test_selection = [
    "tests/onsen/apps/core/service/test_gateway_payment_manager.py",
]
also_copy = ["conftest.py"]
type_check_command = ["pyrefly", "check", "--output-format=json"]
debug = false

Why each key matters:

  • paths_to_mutate = ["onsen"] — the whole app tree is copied to mutants/ so the type checker (pyrefly here, but mypy would be the same) can resolve intra-project imports (e.g. a view importing its sibling form). Narrowing this to a single file breaks type checking for every mutant that touches an import. We scope the actual mutation testing via the CLI glob instead (next section).
  • pytest_add_cli_args_test_selection — narrows pytest collection to the one test file relevant to the target module. Without this, pytest collects the entire tests/ tree and blows up on unrelated files that import Django-specific helpers at module scope (django.utils.timezone, settings accessors, etc.). This one line is the difference between "mutmut works" and "every run fails at collection".
  • also_copy = ["conftest.py"] — ships the root conftest into mutants/ so Django bootstraps against the mutated source tree, not the original. Required for any target whose tests hit the ORM or signals. Harmless for pure-function targets, so leave it on.
  • type_check_command = ["pyrefly", "check", "--output-format=json"] — pre-rejects ~15–20% of generated mutants as type-invalid before they even reach pytest. Massive speedup. The JSON output makes mutmut's parsing reliable.
  • debug = false — flip to true when diagnosing bootstrap failures; it prints the pytest invocation per run.

3. Running with a CLI glob

mutmut run "onsen.apps.core.service.gateway_payment_manager*"

Even though the whole onsen/ tree gets generated into mutants/, the glob scopes execution to the target module. Trivial mutants in unrelated files are created but skipped, so a run takes minutes instead of hours while type checking still sees a coherent project.
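To make the "generate everything, execute a subset" idea concrete, here is a rough sketch of glob-style selection over mutant names. The mutant names are invented for illustration, and `fnmatchcase` is used as an assumed approximation of the matching, not as a statement of how mutmut implements it.

```python
from fnmatch import fnmatchcase

# Hypothetical mutant names; real names come from mutmut's generated output.
mutant_names = [
    "onsen.apps.core.service.gateway_payment_manager.x_charge__mutmut_1",
    "onsen.apps.core.service.gateway_payment_manager.x_refund__mutmut_4",
    "onsen.apps.core.views.x_index__mutmut_2",
]

pattern = "onsen.apps.core.service.gateway_payment_manager*"

# Only mutants under the target module are selected; the rest exist but are skipped.
selected = [name for name in mutant_names if fnmatchcase(name, pattern)]

print(len(selected), "of", len(mutant_names), "mutants selected")
```

Quoting the pattern on the command line matters for the same reason: unquoted, the shell may try to expand the `*` before mutmut sees it.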

4. Use --noinput when running Django's own test runner

Because conftest.py builds its DiscoverRunner with keepdb=True and never tears the databases down, the test DB is left on disk. If you then run python manage.py test to sanity-check your changes, Django finds the existing test DB and prompts interactively before dropping it, which hangs in CI or Docker. Pass --noinput:

python manage.py test --noinput tests.onsen.apps.core.service.test_gateway_payment_manager

Observed result

On a ~600-line service module with 157 mutants:

  • 103 killed by tests
  • 27 caught by pyrefly (type-invalid)
  • 27 survived: all logger-argument mutants (logger.debug("msg %s", x) → logger.debug(None, x)) or mutations on a no-op dict.pop line

Effective score = (killed + type_caught) / total = 82.8%.
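The arithmetic behind that figure, using only the counts listed above, can be checked in a few lines. The second ratio (kill rate among mutants that actually reached pytest) is an extra comparison added here, not a metric mutmut reports under that name.

```python
killed, type_caught, survived = 103, 27, 27
total = killed + type_caught + survived  # 157 mutants

# Effective score: mutants stopped by either the tests or the type checker.
effective = (killed + type_caught) / total

# Kill rate among mutants that passed type checking and reached pytest.
tests_only = killed / (killed + survived)

print(f"effective: {effective:.1%}")    # effective: 82.8%
print(f"tests only: {tests_only:.1%}")  # tests only: 79.2%
```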

The logger-argument survivors are the only consistent "accept as survivor" bucket I've found — killing them requires asserting exact log-message strings, which is brittle and violates parameterized-logging conventions. Anything else in practice has been a real test gap or a real latent bug.

Questions / possible doc additions

  1. Is there a more idiomatic way to bootstrap Django for mutmut? The _TestState guard works but feels like leaning on a Django implementation detail. Is there a supported "run me once per process" hook I'm missing?
  2. Would a short "mutmut + Django" section in the README be welcome? I'd be happy to open a PR with a distilled version of the above (roughly the three code blocks and the four bullet-point rationales) if it's in scope for the project. If accepted, this could also serve as the canonical answer to #456 ("Mutation testing doesn't work with Django app structure - import path issues ?") and close it.
  3. paths_to_mutate + CLI-glob pattern. This two-step (generate everything, execute a subset) isn't obvious from the docs; #456 is a recent example of a user hitting exactly this. Is there a leaner way to keep type-check context while narrowing execution, or is this the intended pattern?

Thanks for a genuinely great tool — the type-check integration alone saved us hours on every run.
