Yes, this library is of uneven quality and would benefit from a few hours of focused attention from a specialist. For example, the various HighestBit functions below the one you called out also look inefficient compared with something built on one of the `__builtin_clz` intrinsics, even though there's an earlier use of `__builtin_clz`...
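To make the comparison concrete, here is a rough sketch of what a clz-based version could look like. The names `highest_bit32` and `highest_bit64` and the "return -1 for zero" convention are my own placeholders, not the library's actual signatures, and this assumes GCC or Clang with a 32-bit `unsigned int`.

```c
#include <stdint.h>

/* Hypothetical helpers, not the library's API.
 * Return the 0-based index of the highest set bit, or -1 if x is 0.
 * __builtin_clz / __builtin_clzll are undefined for a zero argument,
 * so zero is handled explicitly before calling them. */
static inline int highest_bit32(uint32_t x) {
    /* Assumes unsigned int is 32 bits wide. */
    return x ? 31 - __builtin_clz(x) : -1;
}

static inline int highest_bit64(uint64_t x) {
    /* Assumes unsigned long long is 64 bits wide. */
    return x ? 63 - __builtin_clzll(x) : -1;
}
```

On most targets each of these should compile down to a single instruction or two plus the zero check, rather than a loop or a chain of shifts and compares.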
I think a header like this often comes into being because someone has a narrow need for one operation that is deemed reusable and abstract enough to share. Nobody needs the 64-bit version of a 32-bit bit utility at the time, so it doesn't get written. Then somebody comes along later and tries to fill in the rest.