Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Once you know how to compress 32-bit parameters to ternary, compressing ternary to binary is the easy part. :)

They would keep re-compressing the model in its entirety, recursively until the whole thing was a single bit, but the unpacking and repacking during inference is a bitch.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: