
It is worth reading the history of the proposal. The final form is superior to the others, so someone was doing a lot of editing!

Take the second and final forms, where the multiple letters were dropped in favor of a single "v" marking the bits of the encoded character.
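To make the "v" notation concrete, here is a rough sketch in Python that encodes a code point by filling the "v" slots of a byte template. The templates are the standard UTF-8 layout capped at 4 bytes, not a copy of the tables in the proposal, and the names `TEMPLATES` and `encode_with_template` are made up for illustration.

```python
# Byte templates in "v" notation: each "v" is one payload bit of the code point.
# Assumption: standard UTF-8 layout, limited to 4 bytes as in RFC 3629.
TEMPLATES = [
    (0x7F, "0vvvvvvv"),
    (0x7FF, "110vvvvv 10vvvvvv"),
    (0xFFFF, "1110vvvv 10vvvvvv 10vvvvvv"),
    (0x10FFFF, "11110vvv 10vvvvvv 10vvvvvv 10vvvvvv"),
]


def encode_with_template(code_point: int) -> bytes:
    """Encode one code point by pouring its bits into the "v" slots."""
    if code_point < 0 or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable code point")
    for max_code_point, template in TEMPLATES:
        if code_point <= max_code_point:
            # Zero-pad the code point to exactly as many bits as there are "v"s.
            payload = iter(format(code_point, f"0{template.count('v')}b"))
            filled = "".join(next(payload) if ch == "v" else ch for ch in template)
            return bytes(int(byte, 2) for byte in filled.split())
    raise ValueError("code point above U+10FFFF")
```

For example, `encode_with_template(0x20AC)` should give the same three bytes as `"€".encode("utf-8")`.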

I also chuckle at the initial implementation's note about the desire to delete support for the 4/5/6-byte versions. Someone was still laboring under the UCS/UTF-16 delusion that 16 bits was sufficient.



They pretty much got their wish: the 5- and 6-byte sequences are gone, along with roughly half of the 4-byte range!

The RFC that restricted it: https://www.rfc-editor.org/rfc/rfc3629#page-11

A UTF-8 playground: https://utf8-playground.netlify.app/
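For the "half of byte 4" part, a quick back-of-the-envelope check in Python. This is a sketch, assuming the usual 21 payload bits in a 4-byte sequence and the U+10FFFF ceiling from RFC 3629; the constant names and `is_valid_utf8` are mine, not from the RFC.

```python
# A 4-byte sequence carries 3 + 6 + 6 + 6 = 21 payload bits.
FOUR_BYTE_MIN = 0x10000       # smallest code point that needs 4 bytes
FOUR_BYTE_RAW_MAX = 0x1FFFFF  # largest value 21 bits can hold
UNICODE_MAX = 0x10FFFF        # ceiling imposed by RFC 3629

raw_values = FOUR_BYTE_RAW_MAX - FOUR_BYTE_MIN + 1  # what 4 bytes could express
allowed_values = UNICODE_MAX - FOUR_BYTE_MIN + 1    # what is still permitted
surviving_fraction = allowed_values / raw_values    # a bit over one half


def is_valid_utf8(data: bytes) -> bool:
    """Check validity using Python's built-in strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


last_allowed = b"\xf4\x8f\xbf\xbf"        # U+10FFFF, still valid
first_rejected = b"\xf4\x90\x80\x80"      # would be U+110000, now invalid
old_five_byte = b"\xf8\x88\x80\x80\x80"   # 5-byte form from the older definition
```

Here `is_valid_utf8(last_allowed)` is true while the other two are rejected, and `surviving_fraction` comes out at about 0.52.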



