After reading the docs for the new ChatGPT function calling yesterday, it's structured and/or typed data for GPT input or output that's the key feature of these new models. The ReAct flow of tool selection that it provides is secondary.
As this post notes, you don't even need to use the full flow of passing a function result back to the model: getting structured data from ChatGPT in itself has a lot of fun and practical use cases. You could coax previous versions of ChatGPT to "output results as JSON" with a system prompt, but in practice results were mixed, although even with this finetuned model the docs warn that there could still be parsing errors.
IIRC, there's a way to "force" LLMs to output proper JSON by adding some logic to the top token selection. I.e. in the randomness function (which OpenAI calls temperature) you'd never choose a next token that results in broken JSON. The only reason it wouldn't work would be if the output exceeds the token limit. I wonder if OpenAI is doing something like this.
Note that you don’t necessarily need to have the AI output any JSON at all — simply have it answer when asked for the value of a specific JSON key, and handle the JSON structure part in your own, hallucination-free code: https://github.com/manuelkiessling/php-ai-tool-bridge
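That approach can be sketched in a few lines of Python (a rough illustration, not the linked project's code; it assumes the `openai` package's ChatCompletion interface, and the keys and questions are made up). The model only ever answers plain-text questions, and the JSON itself is assembled deterministically:

```python
import json
import openai

def ask_model(question: str) -> str:
    # One plain-text question per call; the model never sees any JSON.
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],
        temperature=0,
    )
    return resp["choices"][0]["message"]["content"].strip()

def extract_contact(text: str) -> str:
    # The JSON structure lives entirely in our own code, so it cannot be
    # malformed or hallucinated; only the *values* come from the model.
    questions = {
        "name": "What is the person's full name in the following text? Answer with the name only.",
        "email": "What is the person's email address in the following text? Answer with the address only.",
    }
    answers = {key: ask_model(f"{q}\n\n{text}") for key, q in questions.items()}
    return json.dumps(answers, indent=2)
```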
Would be nice if you could send the whole back-and-forth interaction for each key in a single request. As it stands, this approach turns into lots of requests that re-send the entire context and ends up slow. I wish I could just send a Microsoft Guidance template program and have it processed in a single pass.
For various reasons, token selection may be implemented as upweighting/downweighting instead of an outright ban of invalid tokens. (Maybe it helps training?) Then the model could still generate malformed JSON. I think it is premature to infer from "can generate malformed JSON" that OpenAI is not using token selection restriction.
> I assume OpenAI’s implementation works conceptually similar to jsonformer, where the token selection algorithm is changed from “choose the token with the highest logit” to “choose the token with the highest logit which is valid for the schema”.
But that only applies to the whole generation. So if you want to constrain things one token at a time (as you would to force the output to follow a grammar), you have to make fresh calls and request only one token each time, which makes things more or less impractical if you want true guarantees. A few months ago I built this anyway to suss out how much more expensive it was [1]
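To make the cost concrete, the one-token-at-a-time approach looks roughly like the sketch below (assuming the `openai` package's Completion interface; `allowed_next_token_ids` is a hypothetical callback standing in for whatever grammar or parser you are enforcing). Every step is a full request that re-sends the entire prefix, which is where the expense and latency come from:

```python
import openai

def constrained_generate(prompt: str, allowed_next_token_ids, max_steps: int = 200) -> str:
    """Generate one token per API call, restricting each step to the
    token ids allowed by an external grammar (hypothetical callback)."""
    text = ""
    for _ in range(max_steps):
        allowed = allowed_next_token_ids(text)
        if not allowed:  # the grammar says we're finished
            break
        # A +100 logit bias effectively forces selection from the biased
        # tokens (note: the API only accepts a limited number of bias
        # entries per request).
        bias = {str(tok): 100 for tok in allowed}
        resp = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt + text,
            max_tokens=1,          # one token per round trip
            logit_bias=bias,
            temperature=0,
        )
        text += resp["choices"][0]["text"]
    return text
```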
I think the problem is that tokens are not characters. So even if you had access to a JSON parser state that could tell you whether or not a given character is valid as the next character, I am not sure how you would translate that into tokens to apply the logit biases appropriately. There would be a great deal of computation required at each step to scan the parser state and generate the list of prohibited or allowable tokens.
But if one could pull this off, it would be super cool. Similar to how Microsoft’s guidance module uses the logit_bias parameter to force the model to choose between a set of available options.
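For reference, that logit_bias trick looks roughly like this (a sketch, assuming the `openai` and `tiktoken` packages; `choose` is a made-up helper, and the approach only works cleanly when every option encodes to a single token):

```python
import openai
import tiktoken

def choose(prompt: str, options: list[str], model: str = "gpt-3.5-turbo") -> str:
    """Strongly bias the model toward answering with exactly one of `options`."""
    enc = tiktoken.encoding_for_model(model)
    bias = {}
    for opt in options:
        tokens = enc.encode(opt)
        assert len(tokens) == 1, f"{opt!r} is not a single token"
        bias[str(tokens[0])] = 100  # maximum positive bias

    resp = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1,        # exactly one token back
        logit_bias=bias,     # in practice, only the option tokens survive
        temperature=0,
    )
    return resp["choices"][0]["message"]["content"]

# e.g. choose("Is this review positive or negative? 'Great product!'",
#             [" positive", " negative"])
```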
You simply sample tokens starting with the allowed characters and truncate if needed. It's pretty efficient; there's an implementation here: https://github.com/1rgs/jsonformer
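A toy illustration of that answer (not jsonformer's actual code, just the idea): the parser-side constraint is expressed as a set of allowed next characters plus a hypothetical `still_valid` callback wrapping the parser state; tokens are filtered by their first character, and a multi-character token is truncated back to the longest prefix the parser accepts.

```python
def pick_token(token_probs: dict[str, float], allowed_first_chars: set[str],
               still_valid) -> str:
    """Pick the most likely token whose first character the parser allows,
    then truncate it to the longest prefix that `still_valid` accepts."""
    candidates = {t: p for t, p in token_probs.items()
                  if t and t[0] in allowed_first_chars}
    best = max(candidates, key=candidates.get)
    # Tokens span multiple characters, so keep only the prefix that
    # keeps the document valid.
    for i in range(1, len(best) + 1):
        if not still_valid(best[:i]):
            return best[:i - 1]
    return best

# Example: the parser expects a digit next, and the model's top tokens are:
probs = {' "hello': 0.4, "12,": 0.35, "1}": 0.25}
print(pick_token(probs, allowed_first_chars=set("0123456789"),
                 still_valid=lambda s: s.isdigit()))   # -> "12"
```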
It's not temperature, but sampling. The output of an LLM is a probability distribution over tokens. To get concrete tokens, you sample from that distribution. Unfortunately, the OpenAI API does not expose the distribution. You only get the sampled tokens.
As an example, in the linked post the JSON schema defines the recipe ingredient unit as one of grams/ml/cups/pieces/teaspoons. The LLM may output the distribution grams (30%), cups (30%), pounds (40%). Picking the most likely token, "pounds", would generate an invalid document. Instead, you can use the schema to filter tokens and sample from the filtered, renormalized distribution, which is grams (50%), cups (50%).
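In code, that filtering step is just masking the disallowed tokens and renormalizing (numbers taken from the comment above):

```python
# Model's raw next-token distribution for the "unit" field.
distribution = {"grams": 0.30, "cups": 0.30, "pounds": 0.40}

# Units permitted by the JSON schema.
allowed = {"grams", "ml", "cups", "pieces", "teaspoons"}

# Unconstrained greedy decoding picks "pounds" -> invalid document.
print(max(distribution, key=distribution.get))  # pounds

# Constrained decoding: drop tokens the schema forbids, renormalize,
# then sample from what remains.
filtered = {t: p for t, p in distribution.items() if t in allowed}
total = sum(filtered.values())
filtered = {t: p / total for t, p in filtered.items()}
print(filtered)  # {'grams': 0.5, 'cups': 0.5}
```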
Not traditional temperature, maybe the parent worded it somewhat obtusely. Anyway, to disambiguate...
I think it works something like this: you let something akin to a JSON parser run alongside the output sampler. The first token must be either '{' or '['; so if you see that '[' has the highest probability, you select it and ignore all other tokens, even those with high probability.
The second token must be ... and so on and so on.
That guarantees non-broken (or at least parseable) JSON.
What's the implication of this new change for Microsoft Guidance, LMQL, Langchain, etc.? It looks like much of their functionality (controlling model output) just became obsolete. Am I missing something?
If anything, this removes a major roadblock for libraries/languages that want to employ LLM calls as a primitive, no? Although I fear the vendor lock-in intensifies here, also given how restrictive and specific the Chat API is.
Either way, as part of the LMQL team, I am actually pretty excited about this, also with respect to what we want to build going forward. This makes language model programming much easier.
`Although I fear the vendor lock-in intensifies here, also given how restrictive and specific the Chat API is.`
Eh, would be pretty easy to write a wrapper that takes a functions-like JSON Schema object and interpolates it into a traditional "You MUST return ONLY JSON in the following format:" prompt snippet.
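Something like this, for instance (a rough sketch; `functions` is the same list-of-JSON-Schema shape the new API takes, the prompt wording is just illustrative, and the example definition mirrors the get_current_weather function from OpenAI's announcement):

```python
import json

def functions_to_prompt(functions: list[dict]) -> str:
    """Turn OpenAI-style function definitions into a plain prompt snippet
    for models without a native `functions` parameter."""
    described = "\n\n".join(
        f"Function: {f['name']}\n"
        f"Description: {f.get('description', '')}\n"
        f"Parameters (JSON Schema): {json.dumps(f['parameters'])}"
        for f in functions
    )
    return (
        "You have access to the following functions:\n\n"
        f"{described}\n\n"
        "To call a function, you MUST return ONLY a JSON object in the "
        "following format and nothing else:\n"
        '{"name": "<function name>", "arguments": {<arguments matching the schema>}}'
    )

weather_fn = {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}
print(functions_to_prompt([weather_fn]))
```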
It's only been added to the OpenAI interface. Function calling is really useful when used with agents. Adding it to agents would require some redesign, as the tool instructions should be removed from the prompt templates in favor of function definitions in the API request. The response parsing code would also be affected.
I just hope they won't come up with yet another agent type.
That example needs a bit of work, I think. In Step 3, they're not really using the returned function_name; they're just assuming it's the only function that's been defined, which I guess is equivalent for this simple example with just one function, but it's less instructive. In Step 4, I believe they should also have sent the function definition block a second time, since model calls in the API are memory-less and independent. They didn't, although the model appears to guess what's needed anyway in this case.
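A sketch of the round trip with both of those fixes applied (dispatch on the returned function_name, and re-send the function definitions on the second call). This assumes the `openai` package's ChatCompletion interface and a stubbed-out local get_current_weather, so treat it as illustrative rather than the cookbook's exact code:

```python
import json
import openai

def get_current_weather(location, unit="celsius"):
    # Stand-in for a real weather lookup.
    return json.dumps({"location": location, "temperature": "22", "unit": unit})

FUNCTIONS = [{
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}]
AVAILABLE = {"get_current_weather": get_current_weather}

messages = [{"role": "user", "content": "What's the weather like in Boston?"}]
first = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613", messages=messages, functions=FUNCTIONS)
msg = first["choices"][0]["message"]

if msg.get("function_call"):
    # Step 3 fix: dispatch on the name the model actually returned.
    name = msg["function_call"]["name"]
    args = json.loads(msg["function_call"]["arguments"])
    result = AVAILABLE[name](**args)

    # Step 4 fix: the API is stateless, so re-send the conversation
    # *and* the function definitions along with the function result.
    messages += [msg, {"role": "function", "name": name, "content": result}]
    second = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613", messages=messages, functions=FUNCTIONS)
    print(second["choices"][0]["message"]["content"])
```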
That SQL example is going to result in a catastrophe somewhere when someone uses it in their project. It is encouraging something very dangerous when allowed to run on untrusted inputs.
OpenAI's demo for function calling is not a Hello World, to put it mildly: https://github.com/openai/openai-cookbook/blob/main/examples...