Hacker News

I tried using OCR to scrape Facebook profiles by simulating web browsing behavior. It helps a lot in avoiding account blocks, but it is still too slow to be practical.


What's your reason for scraping so many people's personal data?


I'm really just curious about this approach and wanted to test it, since most older scraping methods fail on Facebook data. My take is that it is possible with enough resources, since it is actually pretty hard to distinguish this from real usage.


Try simulating it with headless Chrome (accessing the page directly); it works fast.



