That's the great thing about HtmlAgilityPack: extracting data from HTML is really easy. I might even say it's easier than if I had the page in some table-based data system.
Unlike APIs, HTML class and tag names provide no stability guarantees. The site owner can break your parser whenever they want, for any reason. They can do that with an API too, but usually won't, since some guarantee of stability is the point of an API.
True, but the analysis was done on files downloaded over the span of two or three days. If someone had decided to change the CSS class of an infobox during that time, I'd have noticed, investigated and adjusted my code appropriately.
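For illustration, noticing that kind of change can be as simple as failing loudly when the expected class is missing. This is only a rough sketch in Python with the standard-library `html.parser`, not HtmlAgilityPack and not the code used for the analysis; the class name `infobox` and the names `InfoboxFinder` and `check_infobox` are assumptions made up for the example.

```python
from html.parser import HTMLParser


class InfoboxFinder(HTMLParser):
    """Counts start tags whose class attribute contains the expected class."""

    def __init__(self, expected_class):
        super().__init__()
        self.expected_class = expected_class  # hypothetical class name to look for
        self.matches = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if self.expected_class in classes:
            self.matches += 1


def check_infobox(html, expected_class="infobox"):
    """Return the number of matching elements, or raise if there are none."""
    finder = InfoboxFinder(expected_class)
    finder.feed(html)
    finder.close()
    if finder.matches == 0:
        # A missing class is treated as a sign that the site's markup changed
        raise ValueError(
            f"no element with class {expected_class!r}; markup may have changed"
        )
    return finder.matches
```

Running something like `check_infobox(page_html)` on each downloaded file before parsing it means a renamed class stops the run instead of silently producing empty results.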
Scraping, especially on a large scale, can put a noticeable strain on servers.
Bulk downloads (database dumps) are much cheaper to serve for someone crawling millions of pages.
It gets even more significant if generating a reply is resource-intensive (I'm not sure whether Wikipedia qualifies, but complex templates may cause this).