Local models hallucinated a lot more that gpt4o-mini, so I stayed with OpenAI. On top of that, I paid around 14€ for inference on ~200 examples on OVH and inference was much slower. I am planning on getting everything running on Mistral or Llama though.
I used sqlite everywhere so datasette was good for visualizing scraped and extracted data. Simon released structured generation for llm a few days after I did the project though, so I haven't tried yet.
I used sqlite everywhere so datasette was good for visualizing scraped and extracted data. Simon released structured generation for llm a few days after I did the project though, so I haven't tried yet.