I added the original fast implementation in this CL https://go.dev/cl/2828 because I found it useful, clear, and efficient in tuple "less" comparisons like this:
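A minimal sketch of such a comparison, assuming a hypothetical `person` type with two string fields (the type, field, and function names are illustrative only, not taken from the CL):

```go
package main

import (
	"fmt"
	"strings"
)

// person is a hypothetical two-field sort key, used only for illustration.
type person struct {
	last, first string
}

// less orders by last name, then by first name.
// Each strings.Compare call returns a three-way result (-1, 0, or +1),
// so one call per field is enough to decide both "differs?" and "which way?".
func less(a, b person) bool {
	if c := strings.Compare(a.last, b.last); c != 0 {
		return c < 0
	}
	return strings.Compare(a.first, b.first) < 0
}

func main() {
	a := person{last: "Smith", first: "Ann"}
	b := person{last: "Smith", first: "Bob"}
	fmt.Println(less(a, b), less(b, a))
}
```

The same pattern extends to longer tuples: one `strings.Compare` per field, returning as soon as a result is nonzero.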
Compared to the != then < approach, it makes only a single pass over the string data. To this day I have never understood the justification for intentionally making it slower, or why the style of code above isn't reasonable.
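For contrast, the != then < approach might look like the following sketch, again with a hypothetical `entry` type and illustrative names. When the first fields differ only after a long shared prefix, the inequality test and the ordering test can each scan that prefix, which is the second pass a three-way comparison avoids.

```go
package main

import "fmt"

// entry is a hypothetical two-field sort key, used only for illustration.
type entry struct {
	key, sub string
}

// lessTwoStep orders by key, then by sub, using only the built-in operators.
// If the keys differ, a.key is examined once by != and again by <.
func lessTwoStep(a, b entry) bool {
	if a.key != b.key {
		return a.key < b.key
	}
	return a.sub < b.sub
}

func main() {
	a := entry{key: "prefix-aaa", sub: "x"}
	b := entry{key: "prefix-aab", sub: "x"}
	fmt.Println(lessTwoStep(a, b), lessTwoStep(b, a))
}
```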