Q: "Tell me about Stonks GPT?"

A: "Stonks GPT is a trading name for Stocks & Breads, a UK-based provider of bread and breakfast products. The company was founded in 1894 and is headquartered in London. It is a leading provider of bread products in the UK, with a portfolio of over 50 brands. Stonks GPT is a subsidiary of Unilever PLC, which owns the brands including Pepperidge Farm, Lipton, and Marmite. The company's core business is the production of bread and breakfast products, with a focus on quality and freshness."

There are a lot of hallucinations here. I'd recommend really specifying in the prompt that if it doesn't know or is unsure, it should say so rather than make up facts. I understand it's alpha right now and this is all free UAT, but for trading-related products, accuracy and trust in the product are vital.


Is it even technically possible for the model to know it's hallucinating?


You won't catch everything, but I've found that if you specify it's okay to respond that it doesn't know the answer (essentially, accuracy on a smaller subset of answers beats responding to everything), it will often say it doesn't know rather than make something up.
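Something along these lines, in whatever prompt template is already in place (a rough sketch only; the wording isn't a magic phrase and call_llm is a stand-in for the actual client/chain being used):

    # Rough sketch: the instruction wording is illustrative, and call_llm
    # stands in for whatever LLM client or chain you're already using.
    from typing import Callable

    SYSTEM_PROMPT = (
        "You answer questions about financial assets using only the context below. "
        "If the context does not contain the answer, or you are unsure, reply exactly: "
        '"I don\'t know." Do not guess or invent facts.'
    )

    def answer(question: str, context: str, call_llm: Callable[[str], str]) -> str:
        prompt = f"{SYSTEM_PROMPT}\n\nContext:\n{context}\n\nQuestion: {question}"
        return call_llm(prompt)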

Another option, assuming this is an LLMChain that adds the most similar (n) embedding(s) to the prompt before passing to the LLM: run basic entity recognition to pull the asset name out of the query first, then build the prompt dynamically so that if the most similar results don't contain that asset name, you skip the examples and respond that you don't know.
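A rough sketch of that gating, assuming there's already a retriever and some NER step in place (retrieve_chunks and extract_asset_names are hypothetical stand-ins, not any particular library's API):

    # Hypothetical helpers: retrieve_chunks(query, n) -> top-n context chunks,
    # extract_asset_names(query) -> asset names found in the query.
    from typing import Callable, List

    def build_prompt(query: str,
                     retrieve_chunks: Callable[[str, int], List[str]],
                     extract_asset_names: Callable[[str], List[str]],
                     n: int = 3) -> str:
        chunks = retrieve_chunks(query, n)
        assets = extract_asset_names(query)
        # If none of the retrieved chunks mention the asset being asked about,
        # don't hand the model look-alike examples it can riff on.
        if assets and not any(a.lower() in c.lower() for a in assets for c in chunks):
            return (f"Question: {query}\n"
                    "No relevant data was found for this asset. "
                    'Reply exactly: "I don\'t know."')
        context = "\n\n".join(chunks)
        return f"Context:\n{context}\n\nQuestion: {query}"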


So no then


I feel like "hallucinating" is the wrong word. It is essentially predicting the next word based upon its neural net and training. If this training does not have the right data, it will predict things that are wrong.

While knowing might be impossible, it seems like the model could provide a confidence level and only provide answers that exceed some threshold. It'd be a bit like asking a human "are you sure about that?"

And in practice, I really don't think it is that different. We humans effectively make things up all the time. Sometimes we are well aware of our educated guesses and sometimes we are less aware.

It isn't realistic to expect an artificial intelligence to be vastly better than human intelligence in this regard.


The catch is that the models don't have confidence. They cannot distinguish between knowing something for sure, guessing something and hallucinating false knowledge.

Perhaps researchers will find some clever solution to mitigate this, but for now hallucinating is a pretty good word precisely because the model doesn't give confidence.


The Internal State of an LLM Knows When It's Lying

https://arxiv.org/abs/2304.13734


It's not surprising that this information is somehow encoded in the internal state, but it is surprising that they were able to read it out to some degree.

I don't think the model can access this information, but an external "lie detector" would be interesting. Thanks for the paper.
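For anyone curious what such an external detector could look like, here's a toy sketch in the spirit of the paper's idea (model choice, layer, last-token pooling, and the two-example "dataset" are all my assumptions, not the paper's actual setup):

    # Train a small probe on the LM's hidden activations for statements with
    # known truth labels, then use it as an external "lie detector".
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from sklearn.linear_model import LogisticRegression

    tok = AutoTokenizer.from_pretrained("gpt2")   # stand-in model
    lm = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
    lm.eval()

    def activation(text: str, layer: int = -1) -> torch.Tensor:
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            hidden = lm(ids).hidden_states[layer]   # (1, seq_len, hidden_dim)
        return hidden[0, -1]                        # last-token activation

    # Toy labels; in reality you'd need a large set of labeled statements.
    statements = [("Paris is the capital of France.", 1),
                  ("Paris is the capital of Italy.", 0)]
    X = torch.stack([activation(s) for s, _ in statements]).numpy()
    y = [label for _, label in statements]

    probe = LogisticRegression(max_iter=1000).fit(X, y)  # the external detector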


That seems wrong... don't LLMs run on probability distributions? Won't they have confidence built in?


I don't think token probabilities can be taken as confidence that the knowledge is correct. I'd read them more as a measure of how naturally the tokens fit into the sentence.
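To make that concrete, here's a minimal sketch of what "confidence from token probabilities" would look like, and why it's really a fluency score (the model name and example sentences are just for illustration):

    # Average token log-probability of a sentence under a causal LM.
    # This measures how natural the text looks to the model, not whether the
    # facts in it are correct: a fluent false sentence can score just as well.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")   # stand-in model
    lm = AutoModelForCausalLM.from_pretrained("gpt2")
    lm.eval()

    def mean_logprob(text: str) -> float:
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = lm(ids).logits
        # log-prob of each token given the preceding tokens
        logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
        token_lp = logprobs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
        return token_lp.mean().item()

    print(mean_logprob("Paris is the capital of France."))
    print(mean_logprob("Paris is the capital of Italy."))  # fluent but false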


Makes sense, but I wonder if there's some way to retrofit that functionality onto it.


Product market fit

