Why do you say most mutations will completely break an application? This is certainly not my experience.
Mutation testing systems normally use fairly stable operators (e.g. changing a > to a >=). In most locations in the code, changes such as these will have only a subtle effect.
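To make the "subtle effect" concrete, here is a minimal sketch of what such a boundary change does. The class and method names are made up for illustration, and the mutant is written by hand rather than generated by any tool.

```java
// Hypothetical example class; not taken from pitest or from this thread.
final class DiscountRule {

    // Original code: the discount applies strictly above the threshold.
    static boolean original(int quantity) {
        return quantity > 10;
    }

    // Hand-written stand-in for the mutant: > has been changed to >=.
    static boolean mutant(int quantity) {
        return quantity >= 10;
    }

    public static void main(String[] args) {
        // The two versions only disagree at the boundary value itself.
        for (int q : new int[] {0, 9, 10, 11, 100}) {
            System.out.println(q + ": original=" + original(q) + " mutant=" + mutant(q));
        }
    }
}
```

The two versions agree for every input except exactly 10, so the application keeps working and only a test that checks the boundary value will notice the change.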
Along with high computational cost, equivalent mutations are one of the things identified in academic research as preventing the widespread use of mutation testing. There is no general method for distinguishing an equivalent mutation from a normal surviving one except for getting a human to have a look at it.
Having built pitest to address the concerns around computational cost, it was a pleasant surprise to find out that in practice equivalent mutations are not much of a problem.
This isn't entirely by accident - the default set of mutation operators is carefully designed to make equivalent mutations unlikely (they don't/can't guarantee not to create them, but they make them as unlikely as possible).
There's a trade-off here. Pitest has a smaller set of operators than a lot of research-focussed systems. A larger set of operators would catch more issues, but would also create a larger proportion of equivalent mutants (and also take longer to run).
There are more operators you can enable if you wish to change this balance - an operator that changes constants as you describe is one of them.
I rarely encounter equivalent mutants using the default operators and I know of some rollouts of pitest where they break the build on anything less than 100% mutation coverage.
I have no figures to back this up, but I strongly suspect the % of equivalent mutants will be highly dependent on coding style and the domain in which the code operates.
> I have no figures to back this up, but I strongly suspect the % of equivalent mutants will be highly dependent on coding style and the domain in which the code operates.
Agreed. In everything I've written, at least, I can identify multiple places where such equivalent mutants do exist, even with just the default operators. (In particular, hashCode methods: very few mutators break the contract of a hashCode method, though most of them do remove entropy.) But I can easily see that not being the case with other coding styles.
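A rough illustration of the hashCode case, assuming a simple value class. `Point` and `mutatedHashCode` are made-up names, and the mutant is a hand-written stand-in for an arithmetic mutation (+ changed to -), not actual tool output.

```java
// Hypothetical value class used only to illustrate the point above.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }

    // Stand-in for a mutated hashCode: the + has been replaced with -.
    int mutatedHashCode() {
        return 31 * x - y;
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        // Equal objects still get equal hash codes under both versions.
        System.out.println(a.hashCode() == b.hashCode());
        System.out.println(a.mutatedHashCode() == b.mutatedHashCode());
    }
}
```

Both versions return equal values for equal objects, so a test that only checks the equals/hashCode contract cannot tell them apart. The only way to kill the mutant is to assert on the exact hash value, which most people (reasonably) don't want to do.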
Are the mutations it performs documented anywhere? I had to look through the source to see what's done.
No - pitest has no problems with the use of interfaces, reflection etc.
The sentence you quoted relates to an optional feature that allows the tests that will be run against a mutated class to be limited to those within a certain "distance" (i.e. number of method calls) from it.
The feature is little used, but is useful in some very specific circumstances.
Even if you do enable it, it would only cause a problem if the class under test was only referred to within the test by some interface that it implemented (which would be very unusual). The fact that the class's dependencies were declared as interfaces would cause no problems.
What about checking the history file into its own repo? Clone it as part of the build-environment setup, and commit/push after validation. You can even use this across a group of people and share each other's mutations.
Or even simpler, why not have build actions copy it to/from a specific location outside the build target dir? You could even make it a shared network drive, if you didn't care about the possibility of losing someone's changes if two people run tests at the same time.
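Something along these lines would do for the copy step. This is only a sketch: `HistoryFileSync` is a made-up helper, both paths are whatever you pass in, and nothing here is part of pitest or any build tool.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies a history file between the build dir and a
// location that survives a clean build.
final class HistoryFileSync {

    // Copies 'from' to 'to' if 'from' exists; returns whether a copy happened.
    static boolean copyIfPresent(Path from, Path to) {
        if (!Files.isRegularFile(from)) {
            return false; // e.g. first run, nothing saved yet
        }
        try {
            Path parent = to.getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            // Last writer wins: no locking, as noted above.
            Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: HistoryFileSync <from> <to>");
            return;
        }
        boolean copied = copyIfPresent(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(copied ? "copied" : "nothing to copy");
    }
}
```

Run it once before the tests (shared location to build dir) and once after (build dir back to the shared location). With a shared drive, two people finishing at the same time will simply overwrite each other, which is the risk mentioned above.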
I'd recommend using one of the build tool integrations first - they're part of the core project. The IDE integrations still needed a bit of work last time I looked at them.