
I wouldn't worry about that too much, as someone has already done something similar for Reddit (https://towardsdatascience.com/using-nlp-to-identify-reddito...) and released their code publicly (https://github.com/jabraunlin/reddit-user-id).

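Given the technique used, I don't see why something simple and local wouldn't defeat it. The "easiest" approach would be to use the same weighting as a negative metric when rewriting: score each candidate rewrite with the identifier's own features and keep the one it matches least.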
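To make the "negative metric" idea concrete, here is a rough local sketch. It is not the method from the linked repo: it assumes character trigram counts as a stand-in for whatever weighting the identifier really uses, and the function names (`char_ngrams`, `cosine`, `least_identifying`) are made up for illustration. Given a few candidate rewrites, it keeps the one least similar to your known writing.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def least_identifying(candidates, known_texts, n=3):
    """Return the candidate rewrite least similar to the author's known texts.

    Assumption: trigram cosine similarity approximates the identifier's score.
    A real attack model may weight features very differently.
    """
    profile = Counter()
    for text in known_texts:
        profile.update(char_ngrams(text, n))
    # Similarity is used as a *negative* metric: lower is better.
    return min(candidates, key=lambda c: cosine(char_ngrams(c, n), profile))


known = ["I reckon that's fine, tbh.", "tbh I reckon it works"]
rewrites = ["tbh I reckon that is fine", "That seems acceptable to me."]
print(least_identifying(rewrites, known))
```

The candidates would have to come from somewhere else (a local paraphraser, a thesaurus pass, or editing by hand); this only ranks them. If you could run the actual identifier locally, you would swap its score in for `cosine`.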
