It turns out this mistake is common enough that the resulting encoding even has a name, CESU-8: http://en.wikipedia.org/wiki/CESU-8
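To make the difference concrete, here is a minimal Python sketch of what CESU-8 output looks like next to real UTF-8. It assumes the mistake in question is the usual one: encoding each UTF-16 surrogate as its own three-byte sequence instead of encoding the code point as a whole. The helper name `cesu8_encode` is made up for illustration and is not part of any library.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper, for illustration only)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # BMP characters are encoded exactly as in UTF-8.
            # (A lone surrogate in the input would raise here.)
            out += ch.encode("utf-8")
        else:
            # Split the code point into a UTF-16 surrogate pair...
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)
            low = 0xDC00 | (cp & 0x3FF)
            # ...and encode each surrogate as its own 3-byte sequence.
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
    return bytes(out)


emoji = "\U0001F600"
print(emoji.encode("utf-8").hex(" "))   # f0 9f 98 80
print(cesu8_encode(emoji).hex(" "))     # ed a0 bd ed b8 80
```

Text limited to the Basic Multilingual Plane comes out byte-for-byte identical in both encodings, which is probably why the mistake tends to go unnoticed until a supplementary character such as an emoji shows up.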