@Otto-AA
Collaborator

@Otto-AA Otto-AA commented Apr 2, 2025

Why

LibCST seems well maintained and currently parses all Python 3 versions (3.0 up to 3.13). This would fix #281 and allow new mutations (e.g. for match case).

Additionally, the LibCST abstractions seem more suitable for source code modifications. For instance, I didn't need to handle any whitespace while implementing the mutators. This makes the code simpler.

In the future, this change could also allow using type analysis for mutations via the LibCST type inference provider.

Status

  • mutations reimplemented with LibCST (they pass nearly all previous tests)
  • mutation setup adapted to LibCST (module generation from mutations, etc.)
  • # pragma comments
  • performance tuning (though imo not a high priority, as test running will always dominate mutation generation)

@boxed
Owner

boxed commented Apr 2, 2025

Ooh! Very cool. I looked at this briefly but gave up due to time constraints and API differences. I would love this!

@Otto-AA
Collaborator Author

Otto-AA commented Apr 2, 2025

If you have time, I would have a few questions for the further steps, @boxed :

  1. Can I simplify the trampoline setup? Or preferably not?

Currently it clones a function for every mutation. I'd tend to instead insert if-else and if-then-else statements.

For instance, a = 2 could be mutated to a = 3 if _mutant_enabled(<id>) else 2. This seems simpler to me and produces less code. It could also track coverage through the _mutant_enabled function.

Maybe this would make it hard to mutate method signatures and definitions in the global scope; I'm not sure about these yet.

  2. Is there a good way to E2E test?

The trampoline setup is not perfectly covered by the unit tests (e.g. I removed some code and no test complained). Do you have a suggestion on how to test it?

I would default to going through the source code and writing some more tests, which seems reasonable but takes some time 😅

  3. Does mutmut work on the mutmut code?

I've tried to run it but it puts all mutants into the not checked category, i.e. exit code None from the test runner.
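
As a rough illustration of the inline-conditional idea from question 1: the sketch below guards a mutated expression with a runtime switch. The names `_mutant_enabled`, `activate`, and `COVERED`, as well as the integer mutant ids, are hypothetical and not part of mutmut.

```python
# Hypothetical sketch: inline mutants guarded by a runtime switch.
_active_mutant = None  # id of the mutant under test, or None
COVERED = set()        # ids of mutation sites that were reached


def activate(mutant_id):
    """Select which mutant (if any) is enabled."""
    global _active_mutant
    _active_mutant = mutant_id


def _mutant_enabled(mutant_id):
    """Record that the site was reached and report whether it is active."""
    COVERED.add(mutant_id)
    return _active_mutant == mutant_id


def compute():
    # Original statement: a = 2
    a = 3 if _mutant_enabled(1) else 2
    return a
```

Because every mutation site calls `_mutant_enabled`, the same function can double as a coverage probe, which is the point made above.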

@Otto-AA
Collaborator Author

Otto-AA commented Apr 2, 2025

And don't worry if you have no time to review it currently :)

@boxed
Owner

boxed commented Apr 3, 2025

  1. Having the entire function cloned has other advantages, like being able to look at diffs between the original and the mutant with standard tools. It also means that when I look at a mutant diff, I can load the original and the mutant from disk and diff them, so I know they match. Your idea works for simpler things, but I think it will be a lot harder for more complex scenarios. But I might be confused about that.

Global symbol mutation is something I dropped in mutmut 3. Mutmut 2 could do this because it modified the source in place and restarted the execution every time.

  2. I don't have any great ideas here, sorry. It's probably just a bunch of hard work :P

  3. I would think no. But that would certainly be an interesting challenge. Maybe you'd have to first copy mutmut to a new package name so that mutmut doesn't mutate itself. Python only has one namespace for modules, so one has to work around that.

Otto-AA added 6 commits April 6, 2025 09:04
Previously, mutmut would fail on a function call that
contains the kwarg `orig` or `mutants`, e.g. `foo(orig=123)`.

The trampoline setup would try to call
`_mutmut_trampoline(foo_orig, foo_mutants, *args, **kwargs)`
which now has the argument `orig` twice (once as a
positional arg and once as a kwarg) and thus raise
a TypeError.

Previously, mutated class methods were
always killed, because no `self` arg
was passed to their calls and thus
raised a TypeError for invalid number
of arguments.

The class methods in the mutants dict are
stored without reference to any class instance
and therefore not bound to any instance.
We need to pass the `self` arg explicitly
in those cases.

Note that the call to the original method
is a bound method (the method is looked up
via the `self` parameter), thus we do not
pass the `self` arg in this case.
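
Both failures described in these commit messages can be reproduced with plain Python. The sketch below uses a simplified trampoline; the function names, the mutants dict layout, and `safe_trampoline` (one possible way to avoid the clash, not necessarily what the commit does) are assumptions for illustration.

```python
def trampoline(orig, mutants, *args, **kwargs):
    """Simplified trampoline that always forwards to the original."""
    return orig(*args, **kwargs)


def safe_trampoline(orig, mutants, call_args, call_kwargs):
    """Variant that takes the call's args/kwargs as plain parameters."""
    return orig(*call_args, **call_kwargs)


def foo_orig(orig=None):
    return orig


def kwarg_clash_message():
    """Return the TypeError message raised by foo(orig=123), or None."""
    try:
        trampoline(foo_orig, {}, orig=123)  # `orig` is given twice
    except TypeError as exc:
        return str(exc)
    return None


class Greeter:
    def greet(self, name):
        return "hello " + name


# Looked up on the class, the method is a plain function without an instance.
mutants = {"greet_mutant_1": Greeter.greet}


def call_mutant_without_self():
    """Return True if calling the stored mutant without `self` fails."""
    try:
        mutants["greet_mutant_1"]("Ada")
    except TypeError:
        return True
    return False


def call_mutant_with_self(instance):
    """Pass the instance explicitly, as an unbound function requires."""
    return mutants["greet_mutant_1"](instance, "Ada")
```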
@Otto-AA Otto-AA changed the title from "[WIP] Use LibCST instead of parso" to "Use LibCST instead of parso" Apr 8, 2025
@Otto-AA
Collaborator Author

Otto-AA commented Apr 8, 2025

It's ready for review now :)

If you need some explanations / changes to make it easier to review, feel free to ask. It may be reasonable to look at the diffs of each commit individually.

Commits

Two commits for fixes (identical to #375, you can close the other one). They are not necessary for the LibCST implementation; I could remove them and adapt the test cases.

Two commits that add tests for more edge cases and E2E testing.

One commit that removes unused code from the code base (which made it easier for me to understand the logic; not necessary for this PR).

Then a big commit that replaces parso with LibCST.

And finally a commit that adds multiprocessing to the mutant generation.

E2E snapshot testing

The E2E test runs the function used by mutmut run on a small test project (e2e_projects/my_lib/). It then checks if any of the mutant stats from the .meta files changed compared to the previous snapshot stored at tests/e2e/snapshots/my_lib.json.

This is helpful to see if the whole trampoline setup works as expected, or is accidentally broken (which is how I found the class method bug fixed in the 2nd commit). When code changes should influence the mutants of the E2E test, one can delete my_lib.json and it will be re-created with the current snapshot.
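
The snapshot workflow can be sketched roughly like this. It is only a minimal illustration: `check_snapshot` is a hypothetical helper, and the shape of the stats (a JSON-compatible dict) is an assumption, not the layout of the actual .meta files.

```python
import json
from pathlib import Path


def check_snapshot(snapshot_path, current):
    """Compare `current` with the stored snapshot.

    If no snapshot exists yet, write `current` as the new snapshot and
    report success, which mirrors the "delete and re-create" workflow.
    """
    path = Path(snapshot_path)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(current, indent=2, sort_keys=True))
        return True
    expected = json.loads(path.read_text())
    return expected == current
```

A test would then call something like `assert check_snapshot("tests/e2e/snapshots/my_lib.json", stats)`, where `stats` is whatever the run produced.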

Parso -> LibCST

I've kept the trampoline setup as before, including the generated function names. I did change the implementation of the mutations and of the code creation.

Node mutations

The mutations from __init__.py are now rewritten in node_mutation.py.

Each mutation function takes a node and returns an iterable of mutated nodes. The list mutation_operators contains all mutations and specifies on which node types which function should be called.

File mutation

Some code from __main__.py is now in file_mutation.py:

  • the source code parsing
  • iteration over the CST and calling of the functions in node_mutation.py
  • creation of a new source code with mutated functions and trampolines (using trampoline_templates.py)

The main function here is mutate_file_contents(filename, code), which first creates a list of mutations and then combines them with trampolines into a single mutated source file.

The class MutationVisitor is responsible for iterating over all nodes and creating mutations.

The function combine_mutations_to_source takes a module and the mutations for it, incorporates all the mutations into this module with trampolines, and outputs the resulting code.

The code in trampoline_templates.py is only moved from __main__.py but not modified.

Known differences

Performance

The LibCST implementation is significantly slower at creating mutated source code.

Without multiprocessing on a project of mine mutmut run takes:

  • with parso: 1.2s mutant generation (800 mutants, 35s rest of mutmut run)
  • with LibCST: 7.5s mutant generation (1000 mutants, 50s rest of mutmut run)

EDIT: With multiprocessing, the LibCST version gets down to 2s.
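
Parallelising the generation per file could look roughly like the sketch below. The worker is a placeholder (it only counts lines), and `generate_all` with its `max_children` parameter is a hypothetical helper, not the code from this PR.

```python
from multiprocessing import Pool


def generate_mutants_for_file(item):
    """Placeholder worker: stands in for parsing and mutating one file.

    `item` is a (filename, source_code) pair; the line count is only a
    stand-in for the real per-file result.
    """
    filename, code = item
    return filename, len(code.splitlines())


def generate_all(files, max_children=None):
    """Process files in up to `max_children` worker processes.

    `files` maps filename -> source code. None means "use all CPUs".
    """
    with Pool(processes=max_children) as pool:
        return dict(pool.map(generate_mutants_for_file, files.items()))
```

Files are independent of each other, so a per-file split needs no shared state between workers. On platforms that spawn processes, `generate_all` should be called from under an `if __name__ == "__main__":` guard.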

The LibCST implementation seems hard to optimize; I don't think it will get much better than this. A lot of the time is spent traversing the CST and deep-replacing nodes with mutants, both of which seem necessary and are implemented within LibCST. I've already improved some other parts compared to a previous version.

Mutations

Added a mutation for match statements (mainly as a PoC that match parsing works even when running with Python 3.9). For each case in the match, we create a mutation without this case (similar to how we remove arguments in calls).
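
The "one mutation per removed case" pattern can be sketched on plain lists. `without_each` is a hypothetical helper; in the LibCST version the items would be the case nodes of a match statement, and the guard against producing an empty match is an assumption of this sketch.

```python
def without_each(items):
    """Yield one copy of `items` per element, each with that element removed."""
    items = list(items)
    if len(items) < 2:
        return  # assumption: never produce a match statement without cases
    for index in range(len(items)):
        yield items[:index] + items[index + 1:]
```

For three cases this yields three mutants, each missing a different case.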

No decorator removal. Previously it removed decorators on inner functions (e.g. @dec on a def bar: ... nested inside a def foo: ...). I think this could also be implemented if we provide the MutationVisitor as an additional argument to the node_mutation functions. But this kind of decorator removal occurs only rarely, so I didn't make the effort to try it.

No slice replacement with None. Previously it would replace a[b] with a[None]. At least for normal lists this always raises an Exception, so I did not re-implement this.

A more frequent replacement of args with None (foo(a, b) becomes foo(None, b) and foo(a, None)). I don't know why some args were not replaced with None previously.

No mutation of complex default params. For instance, def foo(a = A("abc")): ... was previously mutated to def foo(a = A("XXabcXX")): .... Now it is not mutated. I changed this, because these default params are executed at import time and thus the mutation could raise Exceptions even when not being enabled. However, it still mutates strings, numbers and variables as default params (e.g. def foo(a = 2): ... will mutate to def foo(a = 3): ...).
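
The import-time problem can be shown in a few lines. This is only an illustration; `validated` is a hypothetical stand-in for a constructor like `A(...)` that rejects unexpected input.

```python
calls = []


def validated(text):
    """Stand-in for A(...): records the call and rejects unexpected input."""
    calls.append(text)
    if text != "abc":
        raise ValueError(f"unexpected value: {text!r}")
    return text


# The default is evaluated once, when the `def` statement runs,
# not when foo is called.
def foo(a=validated("abc")):
    return a


def define_mutant():
    """Executing the mutated `def` already runs the mutated default."""
    def foo_mutant(a=validated("XXabcXX")):
        return a
    return foo_mutant
```

Here `calls` already contains `"abc"` before `foo` is ever called, and `define_mutant()` raises `ValueError` without any mutant being enabled. A module-level mutant copy would raise in the same way at import time.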

Removed dict synonyms, because they were not used anyway.

Whitespace differences in the mutated code.

Function hashes

Previously it created a hash_by_function_name dictionary. This is not in use and I think the potential performance optimization that uses it has not been implemented. So I did not re-implement this and simply set it to {}.

@Otto-AA Otto-AA marked this pull request as ready for review April 8, 2025 08:16
@boxed
Owner

boxed commented Apr 8, 2025

Amazing work! I've invited you to be a maintainer. After this change you will be the foremost expert on the mutmut code base I think :P

@boxed
Owner

boxed commented Apr 8, 2025

LibCST being slower is a bit sad. Could we maybe do something like mocking/proxying to avoid the deep copying?

The LibCST API was what made me not pursue it the first time around too, as it felt clunky... and I see that some of that reaction I had was warranted. But the ability to support modern python is well worth it of course.

@boxed boxed merged commit d1adfc2 into boxed:main Apr 8, 2025
5 checks passed
@Otto-AA
Collaborator Author

Otto-AA commented Apr 9, 2025

> Amazing work! I've invited you to be a maintainer. After this change you will be the foremost expert on the mutmut code base I think :P

Thank you, looking forward to working on this project!

How would you prefer contributions from my side? Would you prefer if I ask you for all changes in advance? And to create PRs rather than directly pushing to main?

I could also push changes without asking if they seem straightforward (small bug fixes or features; for instance, I noticed that there is a --max-children parameter which should also be respected by the multiprocessing for mutant generation), and ask for your point of view when I'm unsure about some changes.

@Otto-AA
Collaborator Author

Otto-AA commented Apr 9, 2025

> LibCST being slower is a bit sad. Could we maybe do something like mocking/proxying to avoid the deep copying?

I've created a separate issue for this discussion.

@boxed
Owner

boxed commented Apr 9, 2025

I think you can just commit straight in. You've already shown that you have more time/energy right now than I have :)

I'm of course here for discussions if you want a sounding board. Btw, maybe you should join the mutation testing discord where there is a mutmut channel?

Development

Successfully merging this pull request may close these issues.

Support python 3.10+

2 participants