There are, by definition, no Unicode characters that don't fit in UTF-16.
UTF-16 has surrogate pairs; it's an extension of UCS-2, which doesn't.
Incidentally, this is why UTF-16 is a poor choice for a character encoding: you take twice the memory but you don't actually get O(1) indexing, you only think you do, and then your program breaks badly when someone inputs a character that needs a surrogate pair.
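A minimal sketch of the failure mode, assuming Python 3: encode a string containing a character outside the Basic Multilingual Plane and count UTF-16 code units vs. characters.

```python
# Sketch (assuming CPython 3) of why UTF-16 code-unit indexing is not
# character indexing once a surrogate pair shows up.
import struct

s = "a\U0001F600b"          # 'a', U+1F600 (outside the BMP), 'b'
utf16 = s.encode("utf-16-le")

print(len(s))               # 3 characters
print(len(utf16) // 2)      # 4 UTF-16 code units: U+1F600 needs two

units = struct.unpack("<4H", utf16)
print(hex(units[1]), hex(units[2]))  # the high/low surrogate pair

# Naive "code unit 2 is the third character" logic lands on the low
# surrogate instead of 'b'.
```

Any language with UTF-16 strings (Java, JavaScript, C#) has the same trap: `length` and `charAt` count code units, not characters.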
See also elsewhere in the thread: https://news.ycombinator.com/item?id=8066284