What I really want to do is make it model-agnostic. SDXL was an easy choice at the time, but you could easily swap in a local model or any hosted visual model with an endpoint. The core idea is just tying an LLM to an image model and tying both of those to a force-directed graph, so really anything could be an input (or an output - you could also do it with text).