Using mutmut with Django: a working recipe (conftest + pyproject)
Opening this as an issue because Discussions aren't enabled on the repo — happy to close/move it if there's a better venue. This is intended as a reference for other Django users who hit the same integration problems I did, and a prompt for whether any of it deserves to land in the official docs.
Relation to #456. This is effectively a complete answer to the open question in #456 ("Is there a recommended way to use mutmut with Django projects?"). @Otto-AA's reply there correctly points at paths_to_mutate=<root> + a CLI glob (per #414), which is necessary but not sufficient for a Django app — the OP's second failure mode ("64 passed … we could not find any test case for any mutant") is caused by Django not being bootstrapped inside mutants/ when pytest runs there. The three pieces below (root conftest.py with an idempotent bootstrap, also_copy, and pytest_add_cli_args_test_selection) close that gap.
TL;DR
mutmut works great against a Django codebase, but getting there requires three non-obvious pieces to click together:
A root conftest.py that bootstraps Django idempotently across the many pytest.main() calls mutmut makes per process.
A [tool.mutmut] config that copies that conftest into mutants/, keeps the whole project tree as paths_to_mutate (so the type checker can still resolve intra-project imports), and narrows pytest collection to a single test file per target.
A CLI glob passed to mutmut run so only the target module's mutants are actually executed, even though every file was generated.
Full working setup below — battle-tested on a Django 5.2 + Python 3.14 + pyrefly project, reaching ~83% effective mutation score on service modules.
The problem
Django's test story assumes python manage.py test (unittest-based DiscoverRunner). mutmut drives tests via pytest.main(). Three friction points:
Django isn't configured at pytest-collection time, so importing any django.test.TestCase subclass explodes.
mutmut invokes pytest.main() multiple times per process (for stats, clean baseline, forced-fail check, per-mutant forks). A naive django.setup() + setup_databases() at module import would run over and over, tearing down and recreating the test DB each time.
If you restrict paths_to_mutate to a single file to "make it fast", your type-check command can no longer resolve imports from sibling modules, so every mutant fails type-check for the wrong reason.
The fix
1. Root conftest.py
"""Root conftest.py loaded only by pytest (used indirectly via mutmut).

Bootstraps Django and the test database so `django.test.TestCase`
subclasses run under plain pytest without pytest-django. Copied into
`mutants/conftest.py` by mutmut via `[tool.mutmut].also_copy`. Django's
own test runner does not load pytest conftests, so this file is inert
under `python manage.py test`.

Why guard on `_TestState` instead of a module-level flag: mutmut invokes
`pytest.main()` multiple times per process (stats, clean run, forced-fail,
per-mutant forks). Between invocations pytest re-imports conftests, which
resets module-level variables but leaves Django's already-imported modules
(and therefore `django.test.utils._TestState`) intact. Using Django's own
state as the idempotency check survives the re-import. `keepdb=True` makes
`setup_databases` idempotent on disk as well.
"""

from __future__ import annotations

import os

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "onsen.settings")

import django  # noqa: E402

django.setup()

from django.test.runner import DiscoverRunner  # noqa: E402
from django.test.utils import _TestState  # noqa: E402
from django.test.utils import setup_test_environment  # noqa: E402

if not hasattr(_TestState, "saved_data"):
    setup_test_environment()
    _runner = DiscoverRunner(verbosity=0, keepdb=True)
    _runner.setup_databases()
Key insights:
Idempotency check uses django.test.utils._TestState, not a module-level _bootstrapped = True flag. Pytest re-imports the conftest between pytest.main() calls, wiping module globals — but Django's _TestState class attribute persists because Django's own modules are not re-imported. Checking hasattr(_TestState, "saved_data") is a reliable "did we already call setup_test_environment()?" probe.
keepdb=True makes setup_databases() reuse the existing test DB on subsequent calls instead of dropping and recreating it. Combined with the _TestState guard, Django is fully bootstrapped exactly once per process.
pytest-django is intentionally not used. The project's canonical test runner is python manage.py test; adding pytest-django would introduce a parallel test-DB lifecycle. This conftest is the minimum shim to let mutmut's pytest-driven pipeline coexist with Django's unittest runner.
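The difference between the two guards can be reproduced without Django. The following is a minimal sketch, assuming that a conftest re-import behaves like executing the same source in a fresh namespace; `fake_test_utils` is a hypothetical stand-in for `django.test.utils`, and the `calls` counter exists only for this illustration.

```python
import sys
import types

# Stand-in for django.test.utils: imported once, then cached in sys.modules.
state_mod = types.ModuleType("fake_test_utils")
exec("class _TestState:\n    pass\n", state_mod.__dict__)
sys.modules["fake_test_utils"] = state_mod

# Counts how often each variant would have run the expensive bootstrap.
calls = {"flag": 0, "state": 0}

# Variant 1: a module-level flag, reset on every re-import.
FLAG_CONFTEST = """
_bootstrapped = False
if not _bootstrapped:
    calls["flag"] += 1
    _bootstrapped = True
"""

# Variant 2: guard on an attribute of a class living in a cached module.
STATE_CONFTEST = """
from fake_test_utils import _TestState
if not hasattr(_TestState, "saved_data"):
    calls["state"] += 1
    _TestState.saved_data = object()
"""


def reimport(source: str) -> None:
    """Execute the source in a fresh namespace, as a re-import would."""
    exec(source, {"calls": calls})


# Simulate four pytest.main() invocations in one process.
for _ in range(4):
    reimport(FLAG_CONFTEST)
    reimport(STATE_CONFTEST)

print(calls)
```

The flag variant bootstraps on every simulated invocation, while the state-based guard bootstraps once, which is the behaviour the `_TestState` check relies on.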
2. pyproject.toml
Why each key matters:
paths_to_mutate = ["onsen"]: the whole app tree is copied to mutants/ so the type checker (pyrefly here, but mypy would be the same) can resolve intra-project imports (e.g. a view importing its sibling form). Narrowing this to a single file breaks type checking for every mutant that touches an import. We scope the actual mutation testing via the CLI glob instead (next section).
pytest_add_cli_args_test_selection: narrows pytest collection to the one test file relevant to the target module. Without this, pytest collects the entire tests/ tree and blows up on unrelated files that import Django-specific helpers at module scope (django.utils.timezone, settings accessors, etc.). This one line is the difference between "mutmut works" and "every run fails at collection".
also_copy = ["conftest.py"]: ships the root conftest into mutants/ so Django bootstraps against the mutated source tree, not the original. Required for any target whose tests hit the ORM or signals. Harmless for pure-function targets, so leave it on.
type_check_command = ["pyrefly", "check", "--output-format=json"]: pre-rejects ~15–20% of generated mutants as type-invalid before they even reach pytest, which is a massive speedup. The JSON output makes mutmut's parsing reliable.
debug = false: flip to true when diagnosing bootstrap failures; it prints the pytest invocation per run.
3. Running with a CLI glob
mutmut run "onsen.apps.core.service.gateway_payment_manager*"
Even though the whole onsen/ tree gets generated into mutants/, the glob scopes execution to the target module. Trivial mutants in unrelated files are created but skipped, so a run takes minutes instead of hours while type checking still sees a coherent project.
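To illustrate the "generate everything, execute a subset" idea, here is a small sketch of glob-based selection. The mutant names are hypothetical, and the standard-library fnmatch is used as a stand-in for whatever matching mutmut performs internally, so treat it as a model of the behaviour rather than mutmut's implementation.

```python
from fnmatch import fnmatchcase

# The same glob that is passed to `mutmut run`.
pattern = "onsen.apps.core.service.gateway_payment_manager*"

# Hypothetical mutant names, as if the whole onsen/ tree had been generated.
generated = [
    "onsen.apps.core.service.gateway_payment_manager.x_charge__mutmut_1",
    "onsen.apps.core.service.gateway_payment_manager.x_refund__mutmut_4",
    "onsen.apps.core.views.x_index__mutmut_2",
    "onsen.apps.billing.forms.x_clean__mutmut_7",
]

# Only names matching the glob are selected for execution.
selected = [name for name in generated if fnmatchcase(name, pattern)]
skipped = [name for name in generated if name not in selected]

print("run:    ", selected)
print("skipped:", skipped)
```

Only the two gateway_payment_manager entries are selected; the others still exist on disk, which is what keeps the type checker's view of the project coherent.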
4. Use --noinput when running Django's own test runner
Because conftest.py calls setup_databases(keepdb=True), the test DB is left on disk. If you then run python manage.py test to sanity-check your changes, it will prompt interactively to drop the reused DB and hang in CI or Docker. Pass --noinput:
python manage.py test --noinput tests.onsen.apps.core.service.test_gateway_payment_manager
Observed result
On a ~600-line service module with 157 mutants:
103 killed by tests
27 caught by pyrefly (type-invalid)
27 survived — all logger-argument mutants (logger.debug("msg %s", x) → logger.debug(None, x)) or mutations on a no-op dict.pop line
Effective score = (killed + type_caught) / total = 82.8%.
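The arithmetic behind that figure, spelled out as a short check using the counts listed above (the variable names are only for illustration):

```python
# Counts from the observed run.
killed = 103        # mutants killed by tests
type_caught = 27    # mutants rejected by the type checker
survived = 27       # mutants that survived

total = killed + type_caught + survived

# Effective score counts both test kills and type-check rejections.
effective_score = (killed + type_caught) / total

print(f"{total} mutants, effective score {effective_score:.1%}")
```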
The logger-argument survivors are the only consistent "accept as survivor" bucket I've found — killing them requires asserting exact log-message strings, which is brittle and violates parameterized-logging conventions. Anything else in practice has been a real test gap or a real latent bug.
Questions / possible doc additions
Is there a more idiomatic way to bootstrap Django for mutmut? The _TestState guard works but feels like leaning on a Django implementation detail. Is there a supported "run me once per process" hook I'm missing?
Would a short "mutmut + Django" section in the README be welcome? I'd be happy to open a PR with a distilled version of the above — roughly the three code blocks and the four bullet-point rationales — if it's in scope for the project. If accepted, this could also serve as the canonical answer to Mutation testing doesn't work with Django app structure - import path issues ? #456 and close it.
paths_to_mutate + CLI-glob pattern. This two-step (generate everything, execute a subset) isn't obvious from the docs — Mutation testing doesn't work with Django app structure - import path issues ? #456 is a recent example of a user hitting exactly this. Is there a leaner way to keep type-check context while narrowing execution, or is this the intended pattern?
Thanks for a genuinely great tool — the type-check integration alone saved us hours on every run.