
Sure, UTF-8 is bulky for Chinese and UTF-16 is bulky for English, but in practice, high-codepoint languages are rarely mixed with low-codepoint ones. Notice that when you send an email, many mail programs select the most concise encoding that covers every character in the message, and that is usually not UTF-8 or UTF-16.
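To put rough numbers on that, here is a minimal Python sketch. The sample strings, the candidate charset list, and the helper names `encoded_size` and `pick_most_concise` are assumptions made for illustration; real mail clients use their own charset lists and selection rules.

```python
def encoded_size(text, encoding):
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


def pick_most_concise(text, candidates=("ascii", "iso-8859-1", "gb2312", "utf-8")):
    """Return the candidate charset that encodes `text` in the fewest bytes.

    Charsets that cannot represent every character are skipped.
    Ties go to the earlier candidate. The candidate list is hypothetical.
    """
    best = None
    for name in candidates:
        try:
            size = encoded_size(text, name)
        except UnicodeEncodeError:
            continue  # this charset cannot cover the whole message
        if best is None or size < best[1]:
            best = (name, size)
    if best is None:
        raise ValueError("no candidate charset can encode the text")
    return best[0]


english = "Hello, world"
chinese = "你好世界"

# "utf-16-le" is used so the byte-order mark is not counted.
for label, sample in (("english", english), ("chinese", chinese)):
    print(label,
          "utf-8:", encoded_size(sample, "utf-8"),
          "utf-16:", encoded_size(sample, "utf-16-le"),
          "picked:", pick_most_concise(sample))
```

For the English sample UTF-8 is half the size of UTF-16, for the Chinese sample UTF-16 is the smaller of the two, and in both cases a legacy charset is at least as small as either.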


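Counterexample: high-codepoint text in HTML or XML 1.0 vocabularies, where the tag and attribute names are typically ASCII and so get mixed with the high-codepoint content.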
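A quick sketch of that counterexample in Python. The HTML snippet and the helper name `utf8_vs_utf16` are made up for illustration; the actual ratio depends on how much markup surrounds the text.

```python
def utf8_vs_utf16(document):
    """Return (utf8_bytes, utf16_bytes) for `document`, excluding any BOM."""
    return len(document.encode("utf-8")), len(document.encode("utf-16-le"))


# Hypothetical page: ASCII markup wrapped around a short Chinese string.
text_only = "你好世界"
page = '<html><body><p class="greeting">' + text_only + "</p></body></html>"

text_utf8, text_utf16 = utf8_vs_utf16(text_only)
page_utf8, page_utf16 = utf8_vs_utf16(page)

print("text only:", text_utf8, "bytes UTF-8 vs", text_utf16, "bytes UTF-16")
print("with markup:", page_utf8, "bytes UTF-8 vs", page_utf16, "bytes UTF-16")
```

On its own the Chinese text is smaller in UTF-16, but once the ASCII markup is counted the same content is smaller in UTF-8, because every markup character costs two bytes in UTF-16 and one in UTF-8.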



