Hacker News

I'm storing a lot of text documents (.html) that contain long, similar sections, so they are not exact copies but "partial copies".

Does anyone know whether the fast dedup also works for this? Is there anything else I could use instead?


