I'm sure the people who made those optimisations ran careful benchmarks before they made them. And, most likely, they do work under the right circumstances.
However, to make a generalisation/optimisation of my own: in 99% of cases, trying to write overly clever code like that is a waste of time and does not result in faster code - though it does obscure your code and make it harder to optimise later, when you actually need to. It would probably be better for 90+% of the people who stumble on this article to never have read it.
The sort, OK, but the URINameSpace lookups? I think all the other parts of parsing XML take so much more time that in this case the "optimization" is really nonsense. Even the time spent optimizing that code was probably a waste - premature optimization. (I have never written an XML parser with namespaces, though.)
XML parsing can be done with a simple tag stack and a linear walk through the document. It's very cheap. And a quick loop over 6 (or whatever) strings in L1 cache can easily be faster than a generalized hash table lookup. Given that the namespace tag lookup isn't a huge deal anyway (just walk up the stack doing a string comparison at each level), any optimization needs to be pretty lean to start with. I'd guess that the hash thing is probably a wash; this is the only technique likely to work well.
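For what it's worth, the "walk up the stack" lookup really is only a few lines. Here's a minimal sketch in C, assuming prefix/URI bindings are pushed as elements open and popped as they close (all names here are hypothetical, not from the article's parser):

    #include <string.h>
    #include <stddef.h>

    /* One prefix -> URI binding, pushed when an element declares it. */
    typedef struct {
        const char *prefix; /* e.g. "xsl" */
        const char *uri;    /* e.g. "http://www.w3.org/1999/XSL/Transform" */
    } NsBinding;

    /* Walk from the innermost binding outward; the most recently
       declared (deepest) binding for a prefix wins. One strcmp per
       level -- for typical documents that's a handful of short
       strings sitting in L1 cache. */
    static const char *ns_lookup(const NsBinding *stack, size_t depth,
                                 const char *prefix)
    {
        for (size_t i = depth; i-- > 0; )
            if (strcmp(stack[i].prefix, prefix) == 0)
                return stack[i].uri;
        return NULL; /* unbound prefix */
    }

That loop is the baseline any hash table has to beat, which is why a generalized hash lookup can easily come out a wash.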
Key words there: "I think." Yet the code was taken from an actual XML parser, where one of the goals is performance. The optimization is likely the result of experience and data.