The comparison isn't meant to prove that .NET is always faster than C in all circumstances; it's meant to demonstrate that the advice to call out to C from .NET is outdated and now worse than the naive approach.
Can C wizards write faster code? I'm sure they can, but I bet it takes longer than writing a.SequenceEqual(b) and moving on to the next feature, safe in the knowledge that the standard library is taking care of business.
"Your standard library is more heavily optimised" isn't exactly a gotcha. Yes, the JIT nature of .NET means that it can leverage processor features at runtime, but that is a benefit of being JIT-compiled.
> Does memcmp do all of these things? Is msvcrt.dll checking at runtime which extensions the CPU supports?
It's possible for a C implementation to check the CPU at dynamic link time (when the DLL is loaded) and select which memcmp gets linked.
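To make that concrete, here is a rough sketch of the idea in plain C. All the names (my_memcmp, memcmp_generic, memcmp_wide, cpu_has_avx2) are made up for illustration, the "wide" variant is only a placeholder, and the selection happens lazily on first call rather than at link time, so this is not how MSVCRT or any particular libc actually does it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; a sketch of CPU-based dispatch only. */
typedef int (*memcmp_fn)(const void *, const void *, size_t);

/* Baseline variant: plain byte-by-byte comparison. */
static int memcmp_generic(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a, *q = b;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != q[i])
            return p[i] < q[i] ? -1 : 1;
    }
    return 0;
}

/* Placeholder for a vectorized variant; here it just defers to the
   C library so the sketch stays portable. */
static int memcmp_wide(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

/* Ask the CPU what it supports. The builtin only exists on GCC/Clang
   targeting x86; elsewhere we report "no" and keep the generic path. */
static int cpu_has_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;
#endif
}

static memcmp_fn selected; /* NULL until the first call */

/* Picks a variant once, then always calls through the pointer.
   Not thread-safe as written; a real loader resolves this before
   any user code runs. */
int my_memcmp(const void *a, const void *b, size_t n)
{
    if (!selected)
        selected = cpu_has_avx2() ? memcmp_wide : memcmp_generic;
    return selected(a, b, n);
}
```

Callers only ever see my_memcmp; which variant runs behind it depends on the machine the binary happens to be loaded on.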
The most heavily used libc string functions also have a tendency to use SIMD when the data sizes and offsets align, and fall back to the slow path for any odd/unaligned bytes.
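A simplified illustration of that fast-path/slow-path shape, using 8-byte words as a stand-in for real SIMD registers so it compiles anywhere. bytes_equal is a hypothetical name, it only answers equal/not-equal rather than memcmp's ordering, and actual libc implementations are considerably more involved.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the first n bytes of a and b are equal, 0 otherwise. */
int bytes_equal(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a, *q = b;

    /* Fast path: compare 8 bytes per iteration. Copying into locals
       keeps the loads well-defined for unaligned pointers; compilers
       typically turn each memcpy into a single load. */
    while (n >= sizeof(uint64_t)) {
        uint64_t x, y;
        memcpy(&x, p, sizeof x);
        memcpy(&y, q, sizeof y);
        if (x != y)
            return 0;
        p += sizeof x;
        q += sizeof y;
        n -= sizeof x;
    }

    /* Slow path: whatever tail is left (fewer than 8 bytes). */
    while (n--) {
        if (*p++ != *q++)
            return 0;
    }
    return 1;
}
```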
I don't know to what extent MSVCRT is using these techniques. Probably some.
Also, it's common for a compiler to recognize references to common string functions and not even emit a call to a shared library, but provide an inline implementation.
The [Intrinsic] annotation is present because such comparisons on strings/arrays/spans are specially recognized in the compiler to be unrolled and inlined whenever one of the arguments has constant length or is a constant string or a span which points to constant data.
memcmp is also supposed to be heavily optimized for comparing arrays of bytes since, well, that is literally all that it does.
msvcrt.dll is the C runtime from VC++6 days; a modern (as in, compiled against VC++ released in the last 10 years) C app would use the universal runtime, ucrt.dll. That said, stuff like memcpy or memcmp is normally a compiler intrinsic, and the library version is there only so that you can take a pointer to it and do other such things that require an actual function.
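For example, passing memcmp as a callback is one case where the compiler cannot substitute an inline expansion and needs the real library function. A small sketch, where count_mismatched and cmp_fn are hypothetical names invented for this illustration:

```c
#include <stddef.h>
#include <string.h>

/* Same signature as memcmp, so memcmp itself can be passed in. */
typedef int (*cmp_fn)(const void *, const void *, size_t);

/* Hypothetical helper: counts how many fixed-size records differ
   between two buffers, using whatever comparator the caller supplies. */
size_t count_mismatched(const void *a, const void *b,
                        size_t count, size_t size, cmp_fn cmp)
{
    const unsigned char *p = a, *q = b;
    size_t mismatches = 0;
    for (size_t i = 0; i < count; i++) {
        if (cmp(p + i * size, q + i * size, size) != 0)
            mismatches++;
    }
    return mismatches;
}
```

Calling count_mismatched(x, y, 4, 2, memcmp) takes the address of memcmp, so an out-of-line definition has to exist somewhere for the linker to find.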
Does memcmp do all of these things? Is msvcrt.dll checking at runtime which extensions the CPU supports?
Because I don't think msvcrt.dll is recompiled per machine.
I think a better test would be to create a DLL in C, expose a custom version of memcmp, and compile that with all the vectorization enabled.