We're actively using this approach at scale, although we're still improving it :) You can try out a simplified version in our playground: https://www.kadoa.com/add
Gave this a go. Just so happened that I had the page of an eBay seller open. Wondered if it could manage to do something as simple as extracting all 240 listed products on that page. Instead of determining that the most important data on this page would be the products, it identified these properties: categoryName, subCategories, link.
Yeah, I tried it on a type of website that I commonly write scrapers for, and I'm not sure I can do anything with these results.
AI + web scraping is hard. I've tried and given up, but that doesn't mean it's impossible; it just means I'm not a good engineer. So I'll stay tuned to the Kadoa project.
Absolutely not knocking this project. It was just a somewhat unexpected result from such a simple site. I asked GPT-4 to write a scraper just to compare, and it produced some quite usable boilerplate.