Awesome! The problem with extracting a schema automatically is that you don't know upfront what will come out of it, and it could change on every run. What I'm trying to do is enable scraping webpages in a structured (and type-safe!) manner.
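To make "structured and type-safe" concrete, here's a rough sketch of the idea. This is not the actual llm-scraper API: `Schema`, `Infer`, `parseWithSchema` and `storySchema` are names I made up for illustration, and the JSON string stands in for whatever the model returns. The point is only that the schema is fixed before the run, so the result type is known at compile time and anything that doesn't match is rejected.

```typescript
// Hypothetical schema-first extraction: the shape is declared up front,
// so the result type is known before the scraper ever runs.
type FieldType = "string" | "number" | "boolean";
type Schema = Record<string, FieldType>;

// Map a schema literal to the TypeScript type of a valid result.
type Infer<S extends Schema> = {
  [K in keyof S]: S[K] extends "string" ? string
    : S[K] extends "number" ? number
    : boolean;
};

// Validate untrusted JSON (e.g. a model's raw output) against the schema.
// Fields that are not in the schema are dropped; mismatches throw.
function parseWithSchema<S extends Schema>(schema: S, raw: string): Infer<S> {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  const fields: Schema = schema;
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(fields)) {
    const expected = fields[key];
    const value = record[key];
    if (typeof value !== expected) {
      throw new Error(`field "${key}": expected ${expected}, got ${typeof value}`);
    }
    out[key] = value;
  }
  return out as Infer<S>;
}

// The schema is written once, by hand, and does not change between runs.
const storySchema = { title: "string", points: "number" } as const;

// `story` is typed as { title: string; points: number } at compile time.
const story = parseWithSchema(
  storySchema,
  '{"title":"Show HN","points":42,"extra":true}',
);
console.log(story.title, story.points);
```

With an auto-extracted schema you'd have no `storySchema` to hang the type on, which is exactly the issue: the compiler can't help you if the shape is only discovered at runtime.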
In my experience, Anthropic models are more steerable (they require less prompting) than OpenAI's. For example, in code generation I'd tell GPT-4 not to include any comments, yet sometimes it would just ignore this. I haven't experienced this with Opus yet.
https://github.com/mishushakov/llm-scraper