Interestingly, I’m currently scanning the hundreds of journal papers my grandfather authored in medicine and thinking through what to do about graphs. I was expecting to do some form of multiphase, agent-based generation of LaTeX or SVG rather than a verbal summary of the graphs. At least in his generation of authorship, the papers already clearly explained the graphs. Naturally, I was pretty excited to see your post, but when I looked at the examples, what I saw was effectively a more verbose form of
```  ```
I’m assuming this is partially because your use case is targeting RAG under various assumptions, but also partially because multimodal models aren’t near what I would need to be successful?
We need to update the examples on the front page. Currently, for things that are considered charts/graphs/figures, we convert to a description. For things like logos or images, we do an image tag. You can also choose to exclude them.
The difference with this is that it took the entire page as an image tag (it's just a table of text in my document) rather than being more selective.
I do like that they give you coordinates for the images, though. We need to do something like that.
Give the actual tool a try. Would love to get your feedback for that use case. It gives you 100 free credits initially, but if you email me (ali@doctly.ai), I can give you an extra 500 (this goes for anyone else here too).