
It seems to me the best approach would be to compress the contents with a Huffman code or some other entropy encoding. All this business of restricted character sets is just an ad hoc way of reducing the size of each symbol, and we've got much more mature solutions for that.


For entropy codes to be effective on such short strings, you need a shared initial probability table. And if you have that, you are effectively back at special encoding modes for each character set.
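
To put rough numbers on this, here is a minimal Python sketch comparing a Huffman code whose table travels with the message against one built from a table agreed on in advance. The sample message, the 12-bits-per-entry table cost, and the uniform 37-symbol shared alphabet are all assumptions chosen for illustration, not taken from any real format.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    if len(freqs) == 1:
        # A lone symbol still needs one bit per occurrence.
        return {sym: 1 for sym in freqs}
    # Heap entries are (weight, tiebreaker, {symbol: depth so far}).
    # The unique tiebreaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def payload_bits(message, lengths):
    """Total bits needed to encode message with the given code lengths."""
    return sum(lengths[ch] for ch in message)


message = "HELLO WORLD 2024"   # assumed sample payload
raw_bits = 8 * len(message)    # baseline: one byte per character

# Case 1: the table is built from the message and must be sent along with it.
own_lengths = huffman_code_lengths(Counter(message))
# Assumed table cost: 8 bits for the symbol plus 4 bits for its code length.
table_bits = len(own_lengths) * (8 + 4)
self_described_bits = payload_bits(message, own_lengths) + table_bits

# Case 2: both sides already share a table (hypothetical: uniform weights
# over digits, uppercase letters and space), so only the payload is sent.
shared_alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ "
shared_lengths = huffman_code_lengths({ch: 1 for ch in shared_alphabet})
shared_bits = payload_bits(message, shared_lengths)

print("raw:", raw_bits)
print("huffman + transmitted table:", self_described_bits)
print("huffman with shared table:", shared_bits)
```

Under these assumptions the transmitted table alone costs more than the uncompressed message, while the shared table spends 5 or 6 bits per character. That second case is essentially a fixed-width code over a restricted alphabet, which is the parent's "special encoding modes" point in another form.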



