What kind of private internals do you have in mind? You absolutely can hand-roll your own comparison routine; it's just hard to beat the existing implementation, especially once you start considering culture-sensitive comparison (which may defer to e.g. ICU).
There are no private SIMD APIs, save for a sequence comparison intrinsic for unrolling against known lengths, which JIT/ILC does for spans and strings.
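For the plain ordinal case, a hand-rolled routine built only on public APIs could look roughly like the sketch below. `SequenceEqualSimd` is a hypothetical name, not a CoreLib API, and the sketch assumes .NET 7 or later for the `Vector128` helpers. It is meant to show the shape of such a routine, not to outperform the built-in `SequenceEqual` on spans.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

Console.WriteLine(SequenceEqualSimd("Hello, World!", "Hello, World!")); // True
Console.WriteLine(SequenceEqualSimd("Hello, World!", "Hello, World?")); // False

// Hypothetical helper: ordinal equality over UTF-16 code units.
static bool SequenceEqualSimd(ReadOnlySpan<char> a, ReadOnlySpan<char> b)
{
    if (a.Length != b.Length)
        return false;

    // Reinterpret chars as ushort so they can be loaded into vectors.
    ReadOnlySpan<ushort> x = MemoryMarshal.Cast<char, ushort>(a);
    ReadOnlySpan<ushort> y = MemoryMarshal.Cast<char, ushort>(b);
    ref ushort px = ref MemoryMarshal.GetReference(x);
    ref ushort py = ref MemoryMarshal.GetReference(y);

    int i = 0;

    // Vector loop: compare 8 code units at a time.
    for (; i <= x.Length - Vector128<ushort>.Count; i += Vector128<ushort>.Count)
    {
        var left = Vector128.LoadUnsafe(ref px, (nuint)i);
        var right = Vector128.LoadUnsafe(ref py, (nuint)i);
        if (left != right) // true if any lane differs
            return false;
    }

    // Scalar tail: fewer than 8 elements remain.
    for (; i < x.Length; i++)
    {
        if (x[i] != y[i])
            return false;
    }

    return true;
}
```

Culture-sensitive comparison is a different problem entirely and is not something a loop like this covers.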
IIRC (it's been a month or so since I looked into it), I couldn't access the underlying array in a way SIMD liked. If memory serves, the actual string class does it through private members that are only available internally, to guarantee you don't change the string data.
A string can give you a `ReadOnlySpan<char>`, from which you can either take a `ref readonly char` "byref" pointer, which all the vector APIs work with, or use the unsafe variant and make this byref mutable (just don't write to it) with `Unsafe.AsRef`.
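A minimal sketch of both options, using only public APIs. The variable names are made up for illustration, and `Unsafe.Add` is shown only to demonstrate that the byref supports pointer-style arithmetic:

```csharp
using System;
using System.Runtime.CompilerServices;

var text = "Hello, World!";
ReadOnlySpan<char> span = text; // implicit conversion, no copy or allocation

// Option 1: read-only byref to the first character.
ref readonly char first = ref span[0];

// Option 2: the same location as a mutable byref.
// Strings are immutable, so never write through it.
ref char start = ref Unsafe.AsRef(in first);

// Byref arithmetic: step 7 elements forward and read the value there.
char c = Unsafe.Add(ref start, 7);
Console.WriteLine(c); // W
```

Neither option copies the string data; both byrefs point directly into the string's own memory.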
Because pretty much every type that has linear memory can be represented as a span, every span is amenable to pointer (byref) arithmetic, which you then use to write a SIMD routine, e.g.:
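    using System;
    using System.Runtime.InteropServices;
    using System.Runtime.Intrinsics;

    var text = "Hello, World! Hello, World!";
    // Reinterpret the UTF-16 chars as ushort so they can be loaded into a vector.
    var span = MemoryMarshal.Cast<char, ushort>(text);
    ref readonly var ptr = ref span[0];
    // Loads the first 8 code units (128 bits); the string must have at least 8.
    var chunk = Vector128.LoadUnsafe(in ptr);
    var needle = Vector128.Create((ushort)',');
    var comparison = Vector128.Equals(chunk, needle);
    // One mask bit per lane; the lowest set bit is the first match.
    var offset = uint.TrailingZeroCount(comparison.ExtractMostSignificantBits());
    Console.WriteLine(text[..(int)offset]); // Hello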
If you have doubts regarding codegen quality, take a look at: https://godbolt.org/z/b97zjfTP7
The vector API calls above are lowered to lines 17-22 of the assembly output there.
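To scan a whole string rather than only its first eight characters, the single load can be wrapped in a loop with a scalar tail. The following is a rough sketch under some assumptions: `IndexOfChar` is a made-up helper name, it targets .NET 7 or later, and the plain tail loop is the simplest option rather than what CoreLib does.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

Console.WriteLine(IndexOfChar("Hello, World! Hello, World!", ','));    // 5
Console.WriteLine(IndexOfChar("no comma in this longer string", ',')); // -1

// Hypothetical helper: index of the first occurrence of value, or -1.
static int IndexOfChar(ReadOnlySpan<char> text, char value)
{
    ReadOnlySpan<ushort> span = MemoryMarshal.Cast<char, ushort>(text);
    ref ushort start = ref MemoryMarshal.GetReference(span);
    var needle = Vector128.Create((ushort)value);
    int i = 0;

    // Vector loop: compare 8 UTF-16 code units per iteration.
    int lastVectorStart = span.Length - Vector128<ushort>.Count;
    for (; i <= lastVectorStart; i += Vector128<ushort>.Count)
    {
        var chunk = Vector128.LoadUnsafe(ref start, (nuint)i);
        uint mask = Vector128.Equals(chunk, needle).ExtractMostSignificantBits();
        if (mask != 0)
            return i + (int)uint.TrailingZeroCount(mask); // lowest set bit = first match
    }

    // Scalar tail: fewer than 8 elements remain.
    for (; i < span.Length; i++)
    {
        if (span[i] == value)
            return i;
    }

    return -1;
}
```

The bounds are checked once per iteration by the loop condition, which is what makes the unchecked `LoadUnsafe` safe here.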
Oh interesting, I'll have to give that a try then. My concern was avoiding a reallocation by doing it another way, but if the readonly span works I can see how it would get you there. I need to see if I still have that project to test it out, appreciate the heads up. SIMD is something I really want to get better with.
If you go through the guide at the first link, it will pretty much set you up with the basics to work on vectorization, and once done, you can look at what CoreLib does as a reference (just keep in mind it tries to squeeze all the performance for short lengths too, so the tail/head scalar handlers and dispatch can be high-effort, more so than you may care about). The point behind the way .NET does it is to have the same API exposed to external consumers as the one CoreLib uses itself, which is why I was surprised by your initial statement.
No offense taken, just clarifying. SIMD can seem daunting, especially if you look at intrinsics in C/C++, and I hope the approach in C# will popularize it. Good luck with your experiments!
I appreciate you taking the time to talk me through this, SIMD has been an interest of mine for a while. I ran into issues and then when I went and looked at how the actual string class did it I stopped since they were doing tricks that required said access to the internal data. But this gives me a path to explore. I was already planning on looking at the links you supplied.