To simplify: there is a difference between "static typing is better than dynamic typing" and "all static typing is always better than all dynamic typing". It's basically the difference between ∃ and ∀.
Saying that "static typing is better than dynamic typing" is like the former: there exists some static typing system that is better than dynamic typing. Saying that "all static type systems are better than any dynamic system" is like the second. All the paper ever says is the first: "Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking." Note how it never claims to apply for all possible static type systems; rather, it just says that tests are an inadequate replacement for type systems in general (i.e. there exists some type system that catches more errors than tests). This is exactly like my first example.
In summary: a being better than b does not mean that all a is always better than all b. Just because static typing is better than dynamic typing does not imply that Java is always better than Python; it merely implies that some statically typed language is better than Python.
I agree with your characterization in your first paragraph, but I agree with davesims that the conclusions are too strong. If one has to do the level of analysis of the conclusions that you present in your second paragraph, then they are poorly worded. I find davesims' interpretation a reasonable one, which leads me to agree that the conclusions need to be tempered and clarified.
Saying that "static typing is better than dynamic typing" is like the former: there exists some static typing system that is better than dynamic typing. Saying that "all static type systems are better than any dynamic system" is like the second. All the paper ever says is the first: "Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking." Note how it never claims to apply for all possible static type systems; rather, it just says that tests are an inadequate replacement for type systems in general (i.e. there exists some type system that catches more errors than tests). This is exactly like my first example.
In summary: a being better than b does not mean that all a is always better than all b. Just because static typing is better than dynamic typing does not imply that Java is always better than Python; it merely implies that some statically typed language is better than Python.