
I'd need to look up the individual scrapers for a fair comparison, since I tend to forget or mix up the challenges. Some scrapers have been around for 10 years and have needed only minor updates.

In general, the more native HTML elements and descriptive CSS classes a page uses, the easier scraping gets. It is a disadvantage when large parts of a doc page are built with JavaScript, e.g. when the whole nav is generated dynamically, since the nav is typically the source for categorization/grouping on devdocs.
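
To illustrate the point about native elements and descriptive classes, here is a rough sketch in Python using only the standard library. It is not how the devdocs scrapers are actually written; the sample markup, the class names (`nav-group`, `nav-entry`), and the helper names `NavParser` and `extract_entries` are all made up for the example.

```python
from html.parser import HTMLParser

# Hypothetical static doc page: the nav is plain HTML with descriptive classes.
SAMPLE_PAGE = """
<nav class="sidebar">
  <h3 class="nav-group">Strings</h3>
  <a class="nav-entry" href="/strings/split">split</a>
  <a class="nav-entry" href="/strings/join">join</a>
  <h3 class="nav-group">Arrays</h3>
  <a class="nav-entry" href="/arrays/map">map</a>
</nav>
"""


class NavParser(HTMLParser):
    """Collects (group, name, href) tuples from a static nav (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.entries = []      # collected (group, name, href) tuples
        self._group = None     # most recently seen group heading
        self._capture = None   # "group" or "entry" while inside such an element
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "h3" and "nav-group" in classes:
            self._capture = "group"
            self._text = []
        elif tag == "a" and "nav-entry" in classes:
            self._capture = "entry"
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while inside a heading or entry we care about.
        if self._capture:
            self._text.append(data)

    def handle_endtag(self, tag):
        text = "".join(self._text).strip()
        if self._capture == "group" and tag == "h3":
            self._group = text
            self._capture = None
        elif self._capture == "entry" and tag == "a":
            self.entries.append((self._group, text, self._href))
            self._capture = None


def extract_entries(html):
    """Return every nav entry with the group heading it appeared under."""
    parser = NavParser()
    parser.feed(html)
    parser.close()
    return parser.entries


if __name__ == "__main__":
    for group, name, href in extract_entries(SAMPLE_PAGE):
        print(group, name, href)
```

If the same nav were only built at runtime by a script, the fetched HTML would contain an empty container and this approach would find no entries, so there would be nothing to group by without executing the JavaScript first.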


