To simplify: there is a difference between "static typing is better than dynamic typing" and "all static typing is always better than all dynamic typing". It's basically the difference between ∃ and ∀.

Saying that "static typing is better than dynamic typing" is the first kind of claim (∃): there exists some static type system that is better than dynamic typing. Saying that "all static type systems are better than any dynamic system" is the second kind (∀). All the paper ever says is the first: "Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking." Note how it never claims to apply to all possible static type systems; rather, it just says that tests are an inadequate replacement for type systems in general (i.e. there exists some type system that catches more errors than tests). That is exactly the first, existential kind of claim.
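If it helps, here is a rough sketch of the ∃ vs ∀ distinction in Python, using `any` and `all`. The systems and their scores are made up purely for illustration (they are not from the paper, and "A", "B", "X" are hypothetical placeholders); a higher number just stands for "catches more errors".

```python
# Made-up scores for illustration only: higher = "catches more errors".
# "A", "B" and "X" are hypothetical systems, not real languages.
static_systems = {"A": 3, "B": 7}
dynamic_systems = {"X": 5}

# Existential claim (∃): SOME static system beats every dynamic system.
exists_better = any(
    all(s > d for d in dynamic_systems.values())
    for s in static_systems.values()
)

# Universal claim (∀): EVERY static system beats every dynamic system.
all_better = all(
    s > d
    for s in static_systems.values()
    for d in dynamic_systems.values()
)

print(exists_better, all_better)
```

With these invented numbers the existential claim holds while the universal one fails, so the first can be true without the second.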

In summary: a being better than b does not mean that every a is always better than every b. The fact that static typing is better than dynamic typing does not imply that Java is always better than Python; it merely implies that some statically typed language is better than Python.