Happy to help. My current focus is on LLMs and how to understand them (pros, cons, how to use them safely, and where they can fit into your workflow), so opportunities to talk through these things are useful for me.
> I suspect though that the reason the tool was "outrageously slow" in your experiment is that you gave a very general grammar
Actually, even smallish ones caused problems, but jsonformer (a similar tool) worked fine. I'm not sure what the issue is with this one; I couldn't get it to complete. I'm also not sure whether I still have the hacked-together code I used to get the JSON. I was using very small models, which didn't help, but my internet is slow and I couldn't download anything decent in the time, so some of the testing was "here's an LLM's JSON-ish output, fix it to match this exact schema". Smaller models needed more hand-holding; GPT-2 had no idea how to deal with it.
For jsonformer, the grammar was near-identical to what I posted before; I fixed a couple of typos, I think.
Personally, the flow of:

- Reason about the problem
- Write in English
- Convert to JSON
- Use a tool like this to fix broken JSON

is a workflow I think is very widely applicable (you can use a different model for any step, too).
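The last step of that flow can be sketched in a few lines. This is a toy best-effort repair, not the algorithm any particular tool uses; the specific fixes (code fences, single quotes, trailing commas) are just common failure modes I'd expect from small models:

```python
import json
import re

def repair_jsonish(text: str) -> dict:
    """Best-effort repair of JSON-ish LLM output (toy sketch only)."""
    # Strip markdown code fences the model may have wrapped around the JSON.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Two common LLM mistakes: single quotes and trailing commas.
    fixed = text.replace("'", '"')
    fixed = re.sub(r",\s*([}\]])", r"\1", fixed)
    return json.loads(fixed)

broken = "{'name': 'Sir Parsnip', 'weapons': ['rusty trowel',],}"
print(repair_jsonish(broken))
# prints: {'name': 'Sir Parsnip', 'weapons': ['rusty trowel']}
```

A grammar-constrained tool does this properly at generation time; a repair pass like this is the cheap after-the-fact version.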
> again- I'd like to see the prompt that led to that, please
Sure, that was from GPT-4, which was actually fine-to-decent when given the JSON Schema.
Here's the original prompt; the full response included a complete backstory:
> fun but not over the top character from the middle ages, with relevant weapons and a backstory. Game theme is a world populated by anthropomorphic vegetables
https://chat.openai.com/share/4037c8b3-d1bf-4e66-b98d-b518aa...
It's a shame you can't use some of these tools with GPT-4; it's in a class of its own.
> Also, it's obvious that while you'll get valid json like that, you have no guarantee that the contents will always match your request
Yeah, absolutely. You need to be doing something simple enough that the LLM in use can reliably generate sensible output; tools like this then let you integrate that into other systems. How best to use LLMs really comes down to picking a good one for the use case and how critical errors are. Proposing D&D characters is a very low-risk option (human oversight, no automatic application, errors are mostly just annoying, and fixing them is easy).
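That "valid JSON isn't enough" point is easy to demonstrate. Here's a minimal sanity check, deliberately not the full jsonschema library, with hypothetical field names for the character-generation example:

```python
import json

# Hypothetical required fields for the vegetable-character use case --
# valid JSON alone doesn't guarantee these exist or have the right types.
REQUIRED = {"name": str, "weapons": list, "backstory": str}

def looks_sane(raw: str) -> bool:
    """Reject output that parses as JSON but doesn't match what we asked for."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and all(
        isinstance(obj.get(key), typ) for key, typ in REQUIRED.items()
    )

good = '{"name": "Sir Parsnip", "weapons": ["trowel"], "backstory": "..."}'
bad = '{"name": 42, "weapons": "trowel"}'  # valid JSON, wrong contents
print(looks_sane(good), looks_sane(bad))
# prints: True False
```

In a real pipeline you'd validate against the actual JSON Schema, but the principle is the same: the grammar tool guarantees syntax, and a check like this (plus human oversight) covers the semantics.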