I actually think you're onto something there. The "MicroLLMs Architecture" could mirror how microservices revolutionized web architecture.
Instead of one massive model trying to do everything, you'd have specialized models for OCR, code generation, image understanding, etc. Then a "router LLM" would direct queries to the appropriate specialized model and synthesize responses.
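Roughly the shape I have in mind, just as a sketch (the specialist registry, the keyword-based classify(), and route() are all made up stand-ins, not a real API):

```python
from typing import Callable

# Hypothetical registry of small specialist models.
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "ocr": lambda q: f"[OCR specialist handles: {q}]",
    "code": lambda q: f"[code specialist handles: {q}]",
    "vision": lambda q: f"[vision specialist handles: {q}]",
}

def classify(query: str) -> str:
    # In practice this would be a small router model; keyword matching stands in here.
    q = query.lower()
    if "scan" in q or "receipt" in q:
        return "ocr"
    if "function" in q or "bug" in q:
        return "code"
    return "vision"

def route(query: str) -> str:
    task = classify(query)                 # router decides which specialist is needed
    answer = SPECIALISTS[task](query)      # only that specialist is invoked
    return f"Q: {query}\nA: {answer}"      # router synthesizes the final reply

print(route("extract the line items from this scanned receipt"))
```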
The efficiency gains could be substantial - why run a 1T parameter model when your query just needs a lightweight OCR specialist? You could dynamically load only what you need.
The challenge would be in the communication protocol between models and managing the complexity. We'd need something like a "prompt bus" for inter-model communication with standardized inputs/outputs.
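One possible shape for that "prompt bus" envelope, purely illustrative (all field names are invented), is a single message schema that every model consumes and emits:

```python
from dataclasses import dataclass, field, asdict
import json, uuid

@dataclass
class BusMessage:
    task: str        # e.g. "ocr", "codegen", "vision"
    payload: str     # the prompt or an intermediate result
    source: str      # which model emitted it
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)  # follow one request across models
    meta: dict = field(default_factory=dict)                         # confidence, token budget, etc.

# Every model on the bus reads and writes the same envelope.
msg = BusMessage(task="ocr", payload="page_004.png", source="router")
print(json.dumps(asdict(msg), indent=2))
```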
Has anyone here started building infrastructure for this kind of model orchestration yet? This feels like it could be the Kubernetes moment for AI systems.
This is already done with agents. Some agents only have tools and a single model; others orchestrate with other LLMs to handle more advanced use cases. It's a pretty obvious solution when you think about how to get good performance out of a model on a complex task when useful context length is limited: just run multiple models, each with its own context, and give them a supervisor model, just like how humans organize themselves in real life.
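A toy version of that supervisor pattern, where call_llm() is just a stand-in for a real model call:

```python
def call_llm(context: list[str]) -> str:
    # Stand-in for a real model call; returns a fake summary of the last message.
    return f"summary({context[-1][:40]})"

def worker(subtask: str) -> str:
    context = [subtask]          # each worker starts with its own fresh context
    return call_llm(context)     # and hands back a compact summary, not a full transcript

def supervise(task: str, subtasks: list[str]) -> str:
    summaries = [worker(s) for s in subtasks]   # workers run with isolated contexts
    return call_llm([task] + summaries)         # supervisor reasons over summaries only

print(supervise("summarize this 500-page report",
                ["chapters 1-5", "chapters 6-10", "appendices"]))
```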
I'm doing this personally for my own project: essentially an agent graph that starts with the image output, orients and cleans it, does a first pass with the tesseract LSTM "best" models to produce PDF/hOCR/ALTO, then passes the results to other LLMs and models based on their strengths to further refine towards markdown and LaTeX. My goal is less about RAG database population and more about preserving the structure, data, and analysis in a form that isn't manually typeset. There seems to be pretty limited tooling out there, since the goal generally seems to be the obvious, immediately commercial one of producing RAG-amenable forms that defer the "heavy" side of chart/graphic/tabular reproduction to some future time.
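The first stage looks roughly like this; the tesseract flags are real, but the tessdata path is just an example install location and refine_with_llm() is a placeholder for whatever downstream models you wire in:

```python
import subprocess
from pathlib import Path

def ocr_pass(image: Path, outbase: Path) -> Path:
    # --oem 1 selects the LSTM engine; point --tessdata-dir at the tessdata_best models
    # (path below is an assumed install location).
    subprocess.run(
        ["tesseract", str(image), str(outbase),
         "--oem", "1", "--tessdata-dir", "/usr/share/tessdata_best",
         "pdf", "hocr", "alto"],
        check=True,
    )
    return outbase.with_suffix(".hocr")    # hOCR keeps layout info for the next stage

def refine_with_llm(hocr_path: Path) -> str:
    # Placeholder: hand the structured hOCR to whichever model handles tables,
    # equations, and figures best, and ask for markdown/LaTeX back.
    return hocr_path.read_text(encoding="utf-8")

markdown = refine_with_llm(ocr_pass(Path("page_004.png"), Path("page_004")))
```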