
Sorry for the confusion. They are used for "merging" scraped data from various sources, not in the scraping process itself. For example, they help in figuring out if similar-sounding listings on related websites refer to the same "thing".

If you're interested, take a look at this paper (and related ones): http://www.cs.ubc.ca/~murphyk/Papers/kv-kdd14.pdf
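To make the "merging" idea concrete, here is a minimal sketch of deciding whether two scraped listings refer to the same "thing". It is not the method from the linked paper; it swaps in plain string similarity from Python's standard library, and the names `normalize`, `similarity`, `same_listing`, and the 0.8 threshold are all illustrative assumptions.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two listing strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_listing(a: str, b: str, threshold: float = 0.8) -> bool:
    """Guess whether two listings describe the same thing.

    The threshold is an assumed value, not a tuned one.
    """
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    # Hypothetical listings scraped from two related sites.
    site_a = "Joe's Pizza, 123 Main St"
    site_b = "joe's pizza   123 main st"
    print(same_listing(site_a, site_b))
    print(same_listing(site_a, "Blue Bottle Coffee"))
```

A real system would likely compare several fields (name, address, phone) and learn how to weight them rather than rely on one hand-picked threshold.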




That makes more sense. Thanks! I'll check out the paper. I was hoping you had some revolutionary new scraping method.



