Local models hallucinated a lot more than gpt-4o-mini, so I stayed with OpenAI. On top of that, I paid around 14€ for inference on ~200 examples on OVH, and inference was much slower. I am still planning to get everything running on Mistral or Llama, though.
I used SQLite everywhere, so Datasette was great for visualizing the scraped and extracted data. Simon released structured generation for llm a few days after I finished the project, though, so I haven't tried it yet.
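To illustrate the SQLite + Datasette workflow: since Datasette can browse any SQLite file, persisting extracted records is enough to get a UI for free. This is a minimal sketch with stdlib `sqlite3`; the `chef_moves` table and its columns are my own illustration, not the project's actual schema.

```python
import sqlite3

def save_records(db_path, records):
    """Insert extracted chef/restaurant records into a SQLite table
    that Datasette can then browse directly."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS chef_moves (
               chef TEXT, restaurant TEXT, role TEXT, source_review TEXT
           )"""
    )
    # executemany accepts a list of dicts when using named placeholders
    conn.executemany(
        "INSERT INTO chef_moves VALUES (:chef, :restaurant, :role, :source_review)",
        records,
    )
    conn.commit()
    conn.close()

save_records("reviews.db", [
    {"chef": "Neil Mahatsry", "restaurant": "Grenat",
     "role": "chef", "source_review": "grenat-review"},
])
```

After that, `datasette reviews.db` opens the table in the browser.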
The data is not structured at all initially, and the critics mention a chef's CV only in passing. For instance, take this example:
> At Grenat, Antoine Joannier and Neil Mahatsry are bathed in an ardent red glow, much like the pomegranate-toned walls of their space. After working together at La Brasserie Communale, where they first met, the duo is now firing on all cylinders in the heart of Marseille, where Antoine tends to guests seated around blonde wood tables, delivering dishes ignited by Neil behind the bar. From oysters to prime cuts of red meat, […]
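For a passage like that, this is a sketch of the kind of structured record the model is asked to produce; the field names here are my own illustration, not the exact schema I used.

```python
import json

# Hypothetical target record for the Grenat passage above;
# roles are inferred from "Antoine tends to guests" / "dishes ignited by Neil".
extracted = json.loads("""
{
  "restaurant": "Grenat",
  "city": "Marseille",
  "chefs": [
    {"name": "Antoine Joannier", "role": "front of house"},
    {"name": "Neil Mahatsry", "role": "chef"}
  ],
  "prior_restaurants": [
    {"name": "La Brasserie Communale",
     "worked_there": ["Antoine Joannier", "Neil Mahatsry"]}
  ]
}
""")
```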
I tried using NER models and the results were not great. Furthermore, these models do not extract relationships between entities (other models exist for that, though). I haven't tried fine-tuning at all!
There is also a lot of variation in how a chef's prior restaurants are presented, which makes this a good use case for LLMs.
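One way to deal with that variation is to call it out in the extraction prompt itself. A minimal prompt-building sketch; the wording and field names are illustrative, not the exact prompt from the project.

```python
# Fields we ask the model to return as JSON (illustrative, not the real schema).
FIELDS = ["restaurant", "city", "chefs", "prior_restaurants"]

def build_prompt(review: str) -> str:
    """Build an extraction prompt that warns the model about the many
    ways critics mention a chef's prior restaurants in passing."""
    return (
        "Extract the following fields from the restaurant review as JSON: "
        + ", ".join(FIELDS)
        + ". Prior restaurants are often mentioned only in passing "
        "('after working together at X', 'formerly of Y'), so read carefully, "
        "and return an empty list when none are named.\n\n"
        "Review:\n" + review
    )

prompt = build_prompt("At Grenat, Antoine Joannier and Neil Mahatsry ...")
```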
LLMs have without a doubt replaced NER models and libraries like spaCy, at least for my use cases: creating ontologies and populating knowledge graphs.
I agree the spatialization could be better. I used one of the algorithms in Gephi-lite directly. Do you have a favorite spatialization algorithm to recommend?
I made a mistake; I had checked the other link in your post ("You can explore the visualization here: [Interactive Culinary Network]") instead of the main link.