I have not used this specific library, but it's far from unrealistic and hardly a money pit. An LLM can fit in nicely with scraping libraries. Sure, if you are crawling the web like Google, it makes no sense, but if you have a hit list, this can be a cost-effective way to avoid spending engineering hours maintaining the crawler.
There are phenomenal web scraping tools that can crudely "preprocess" the document a bit, slashing the outer HTML fluff while preserving the small subset of actual data. From there, 8k tokens (or whatever your context limit is) go really far.
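To make that preprocessing step concrete, here is a rough sketch using only Python's standard library. It is not what any particular scraping tool does. The names `SKIP_TAGS`, `TextExtractor`, and `preprocess` are made up for illustration, the list of tags to drop is a guess you would tune per site, and the 4-characters-per-token figure is only a crude rule of thumb, not a real tokenizer.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as fluff (an assumption; tune per site).
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "aside"}


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def preprocess(html, max_tokens=8000, chars_per_token=4):
    """Strip markup and fluff, then cut the text to a rough token budget.

    chars_per_token is a heuristic stand-in for a real tokenizer.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "\n".join(parser.chunks)
    return text[: max_tokens * chars_per_token]


if __name__ == "__main__":
    page = (
        "<html><head><style>p{}</style><script>var x=1;</script></head>"
        "<body><nav>Home</nav><p>Price: $10</p><footer>c</footer></body></html>"
    )
    print(preprocess(page))
```

Whatever comes out of `preprocess` is what you would paste into the prompt. On a typical product or article page most of the bytes are scripts, styles, and navigation, so even something this crude can shrink the input a lot before the model ever sees it.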