But are those steps actually doing anything that can be tested? My experience with these sorts of codebases has always been that most of the functions do little more than call other functions, so testing them ends up either testing exactly the same behaviour in several places, or mocking so heavily that the test becomes pointless.
Or worse, I've seen people break functions apart in such a way that you now need to maintain some sort of class-level state between the function calls to get the correct behaviour. This is almost impossible to test meaningfully because of the number of possible states and the orders in which they can be reached - you might correctly test individual cases, but you'll never cover all possible behaviours with that sort of system.
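To make the state problem concrete, here's a made-up sketch. `ReportBuilder`, its methods, and `build_report` are all hypothetical names invented for illustration, not taken from any real codebase. The steps only work if they're called in one particular order, and nothing in the signatures tells you that:

```python
class ReportBuilder:
    """Hypothetical example: the steps communicate through instance state."""

    def __init__(self):
        self.rows = None
        self.total = None

    def load(self, rows):
        self.rows = list(rows)

    def summarise(self):
        # Hidden precondition: load() must already have run.
        self.total = sum(self.rows)

    def render(self):
        # Hidden precondition: summarise() must already have run.
        return f"total={self.total}"


def build_report(rows):
    """The same logic with the data passed explicitly: no call order to get wrong."""
    total = sum(rows)
    return f"total={total}"
```

With three methods you already have several call orders to think about, and calling `render()` on a fresh instance quietly returns a string built from `None` rather than failing. The plain function has exactly one behaviour to test.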
> testing exactly the same behaviour in several places
I think that's actually fine. Particularly if you're taking a testing-pyramid-style approach, where you write a lot of tests for some piece of low-level logic and then a few tests for the higher-level piece that uses it, I don't see any problem with the higher-level test covering a codepath that's also covered by the lower-level test. If anything, I find it makes failures easier to understand and debug - you know the behaviour of the lower-level piece hasn't changed, because otherwise the lower-level test would have failed, so the bug can only be in the higher-level component itself.
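As a rough sketch of what I mean (`parse_price` and `order_total` are invented names, purely for illustration): lots of cases against the low-level function, and only a couple against the function built on top of it.

```python
def parse_price(text):
    """Low-level piece (hypothetical): turn '$1.50' or '2' into whole cents."""
    cleaned = text.strip().lstrip("$")
    dollars, _, cents = cleaned.partition(".")
    # Pad the fractional part to two digits so '.5' means 50 cents.
    return int(dollars or 0) * 100 + int((cents + "00")[:2])


def order_total(price_strings):
    """Higher-level piece (hypothetical): its only logic is summing parsed prices."""
    return sum(parse_price(p) for p in price_strings)


def test_parse_price():
    # Base of the pyramid: many cases for the low-level logic.
    assert parse_price("$1.50") == 150
    assert parse_price("2") == 200
    assert parse_price(".5") == 50
    assert parse_price(" 0.05 ") == 5


def test_order_total():
    # Top of the pyramid: a few cases. These run through parse_price again,
    # which duplicates some coverage, but that overlap is harmless.
    assert order_total(["$1.50", "2"]) == 350
    assert order_total([]) == 0
```

If `test_order_total` fails while `test_parse_price` passes, the first place to look is the summing logic rather than the parsing.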