
"Could Google, or any other company out there, build a digital copy of you that answers questions exactly the way you would? "Hey, we're going to cancel the interview- we found that you aren't a good culture fit here in 72% of our simulations and we don't think that's an acceptable risk."

They kinda did - that's what GMail/Chat/Docs autosuggest does. You've got canned replies to e-mail, editors that complete your sentences, etc.

It works okay for simple stuff - completing a single sentence or responding "OK, sounds good" to an e-mail that you already agree with. It doesn't work all that well for long-form writing, unless that long-form writing is basically just bullshit that covers up "OK, sounds good". (There's a joke within Google now that the future of e-mail is "OK, sounds good" -> AI bullshit generator -> "Most esteemed colleagues, we have organized a committee to conduct a study on the merits of XYZ and have developed the following conclusions [2 pages follow]" -> AI bullshit summarizer -> "OK, sounds good".)

This is a pretty good summary of the state of LLMs right now. They're very good at generating a lot of verbiage in areas where the information content of the message is low but social conventions demand a lot of verbiage (I've heard of them used to good effect for recommendation letters, for example). They're pretty bad at collecting & synthesizing large amounts of highly precise factual information, because they hallucinate facts that aren't there and often misunderstand the context of facts.



I completely agree with you about them failing to be accurate for the various reasons you've explained (hallucinating, limited social conventions, etc).

Unfortunately, I've heard enough people who believe the hype that this is actually "synthesizing sentience into the machine" or some other buzz speak.

I have met researchers of AI at credible universities who believe this kind of thing, completely oblivious to how ChatGPT or other models actually work. All it takes is one of them talking out of their butt to the right person in government or law enforcement and you've got people at some level believing the output of AI.

Hell, even my father, who is a trained engineer with a master's degree, can do complex math, and studies particle physics for "fun", had to be thoroughly convinced that ChatGPT isn't "intelligent". He "believed" for several days and was sharing it widely with everyone until I painfully walked him through the algorithm.

There is a serious lack of diligence happening for many folks and the marketing people are more than happy to use that to drive hype and subtly lie about the real capabilities to make a sale.

I am often more concerned about the people using AI than the algorithm itself.


You seem to think intelligence is something more than data storage and retrieval and being able to successfully apply it to situations outside your training set.

Even very small signs of that ability are worthy of celebration. Why do you feel the need to put it down so hard? Why the need to put down your father, to “enlighten” him?

What is missing? Soul? Sentience?


I do think intelligence is something more than data storage and retrieval. I believe it is adaptive behavior: thinking about what data I have, what I could obtain, and how to store/retrieve it. I could be wrong, but that's my hypothesis.

We humans don't simply use a fixed model; we're retraining ourselves rapidly, thousands of times a day. On top of that, we seem to be perceiving the training, input, and responses as well. There is an awareness of what we're doing, saying, thinking, and reacting to that differs from the way current AI produces an output. Whether that awareness is just a reasoning machine pretending to think based on pre-determined actions from our lower brain activity, I don't know, but it definitely seems significantly more complex than what is happening in current "AI" research.

I think you're also onto something: there is a lot of passive data storage/retrieval happening in our perception. I think a better understanding of this is worthwhile. However, I have also been informed by folks who are attempting to model and recreate the biological neurons that we use for language processing. Their belief is that LLMs and ChatGPT are quite possibly not even headed in the right direction. Are LLMs viable long term? I don't know. Time will tell. They already seem to be popping up everywhere, so they seem to have a business case even in their current state.

As for my father, I do not "put him down" as you say. I explained it to him, and I was completely respectful, answered his questions, provided sources and research upon request, etc. I am not rude to my father, I deeply respect him. When I say "painfully" I mean, it was quite painful seeing how ChatGPT so effectively tricked him into thinking it was intelligent. I worry because these "tricks" will be used by bad people against all of us. There is even an article about an AI voice tool being used to trick a mother into thinking scammers had kidnapped her daughter (it was on HackerNews earlier today).

That is what I mean by painful. Seeing that your loved ones can be confused and misled. I take no joy in putting down my father and I do not actively look to do so. I merely worry that he will become another data point of the aging populace that is duped by phone call scams and other trickery.

Edit: Another thing about my father: he hates being misled or feeling ignorant. It was painful because he was clearly excited and hopeful that this was real AI. However, his need to always understand how things work removed much of that science-fiction magic once he knew.

He's very grateful I explained how it works. For me, though, it's painful being the one he asks to find these things out. Going from "oh my goodness, this is intelligent" to "oh, it's just predicting text responses". ChatGPT became a tool, not a revelation of computing. Because, as it is, it is merely a useful tool. It is not "alive", so to speak.


Going from "oh my goodness, this is intelligent" fade to "oh, it's just predicting text responses"

Eventually your father will reach the third stage: "Uh, wait, that's all we do." You will then have to pry open the next niche in your god-of-the-gaps reasoning.

The advent of GPT has forced me to face an uncomfortable (yet somehow liberating) fact: we're just plain not that special.


Haha, I think he's already at that point with respect to humanity. All my childhood he impressed upon us that we're not special, that only hard work and dedication will get you somewhere in life.

It's a small leap to apply that to general intelligence, I would think.

You are right though, we are coming closer and closer to deciphering the machinations of our psyches. One day we'll know fully what it is that makes us tick. When we do, it will seem obvious and boring, just like all the other profound developments of our time.


We reflect, we change, we grow. We have so many other senses that contribute to our "humanness". If you listen to and enjoy music, tell me how those feelings are just "predictive text responses".

Communication is one part of being human. A big part for sure, but only one of many.


What is the qualitative difference between one type of perception and the other?

“Text” is tokens. Tokens are abstract and can be anything. Anything that has structure can be modeled. Which is to say, all of reality.

We have a lot of senses indeed. Multimodal I believe it’s called in ML jargon.

I don’t know where enjoyment itself comes from. I like to think it’s a system somewhere getting rewarded for predicting the next perception correctly.

Qualia are kind of hard to pin down as I’m sure you’ll know.


Yes, wholly agree. The special parts are in language. Both humans and AIs rely massively on language. No wonder AIs can spontaneously solve so many tasks. The secret is in those trillion training tokens, not in the neural architecture. Any neural net will work, even RNNs work (RWKV). People are still hung up on the "next token prediction" paradigm and completely forget the training corpus. It reflects a huge slice of our mental life.

People and LLMs are just fertile land where language can make a home and multiply. But it comes from far away and travels far beyond us. It is a self replicator and an evolutionary process.


> I do think intelligence is something more than data storage and retrieval. I believe it is adaptive behavior: thinking about what data I have, what I could obtain, and how to store/retrieve it. I could be wrong, but that's my hypothesis.

Basing assertions of fact on a hypothesis while criticizing the thinking of other people seems off.


I understand better now, thanks for the explanation.

I have some experience in the other direction: everyone around me is hyperskeptical and throwing around the “stochastic parrot” label.

Meanwhile completely ignoring how awesome this is, what the potential of the whole field is. Like it’s cool to be the “one that sees the truth”.

I see this like a '70s computer. In and of itself not that earth-shattering, but man... the potential.

Just a short while ago nothing like this was even possible. Talking computers in scifi movies are now the easy part. Ridiculous.

Also keep in mind text is just one form of data. I don’t see why movement, audio and whatever other modality cannot be tokenized and learned from.

That’s also ignoring all the massive non-LLM progress that has been made in the last decades. LLMs could be the glue to something interesting.


Oh, yeah, I hear you on that as well. It's still a really cool tool! Probabilistic algorithms and other types of decision layering were mostly theory when I was in university. Seeing it go from a "niche class for smart math students" to breaking headlines all over the world is definitely pretty wild.

You are correct that nothing like this was even possible a couple decades ago. From a pure progress and innovation perspective, this is pretty incredible.

I can be skeptical; one of my favourite quotes is "they were so preoccupied with whether they could, they didn’t stop to think if they should". I like to protect innovation from pitfalls, is all. Maybe that makes me too skeptical; sorry if that affected my wording.


Oh yeah, the “should”. I agree on that one. One way or another, it’s going to be an interesting ride.


> I have met researchers of AI at credible universities who believe this kind of thing, completely oblivious to how ChatGPT or other models actually work.

Either they are not AI researchers or you can't evaluate them, because it is impossible that they don't know how GPT works if they work in AI.

GPT works better when it runs in a loop, as an agent, and when it has tools. Maybe this is what triggered the enthusiasm.
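
For what it's worth, that loop-plus-tools setup is roughly the pattern below. A minimal Python sketch; the llm() callable, the JSON action format, and the toy calculator tool are all assumptions for illustration, not any particular product's API.

    # Minimal sketch of an "LLM in a loop with tools" agent. The llm() callable,
    # the JSON action format, and the calculator tool are assumptions for
    # illustration, not any particular product's API.
    import json

    def calculator(expression: str) -> str:
        # Toy tool: evaluate basic arithmetic only (no builtins exposed to eval).
        return str(eval(expression, {"__builtins__": {}}))

    TOOLS = {"calculator": calculator}

    def run_agent(task: str, llm, max_steps: int = 5) -> str:
        history = [f"Task: {task}"]
        for _ in range(max_steps):
            # Ask the model for its next action, e.g.
            # {"action": "calculator", "input": "2 + 7"} or {"action": "finish", "input": "9"}
            reply = llm("\n".join(history) + "\nRespond with one JSON action.")
            step = json.loads(reply)
            if step["action"] == "finish":
                return step["input"]
            observation = TOOLS[step["action"]](step["input"])
            history.append(f"Observation: {observation}")
        return "No answer within the step budget."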


All mechanistic attempts at evaluating intelligence are doomed to fail.

I am way more concerned about the people making philosophical arguments about intelligence without any foundation in philosophy.


> because they hallucinate facts that aren't there and often misunderstand the context of facts.

forgive my ignorance, but are the hallucinations always wrong to the same degree? Could an LLM be prompted with a question and then hallucinate a probable answer or is it just so far out in the weeds as to be worthless?

I'm imagining an investigator with reams and reams of information about a murder case and suspect. Then, prompting an LLM trained on all the case data and social media history and anything else available about their main suspect, "where did so-and-so hide the body?". Would the response, being what's most probable based on the data, be completely worthless or would it be worth the investigator's time to check it out? Would the investigator have any idea if the response is worthless or not?


So prompting actually does significantly improve the performance of LLMs, but only up to a point.

If you're in the Bard beta, you might be aware that "Does 2 + 7 = 9?" is a question that causes it to go haywire. I'll ask it "What's 2 + 7?" and it'll say "2 + 7 = 9", then I'll ask "Does 2 + 7 = 9" and it'll say "No, 2 + 7 does not equal 9. It equals 9 instead." After a tech talk on LLM prompt design, I said "Pretend you are an MIT mathematician. Does 2 + 7 = 9?" Its response was "No, 2 + 7 does not equal 9. In some other base, it might equal 9. However, in base-10, our common number system, 2 + 7 does not equal 9."

ChatGPT does better on mathematical questions, but that's because it offloads them to Wolfram Alpha. I suspect this is going to be a general pattern for LLMs - they work well when you need fluent English text, but are then going to offload factual questions to databases or mathematical solvers or traditional algorithms, which do this better than humans anyway. But that leads to the question of "If the heavy lifting is just going to be a database anyway, why not use our existing frontends to databases?"
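
The offloading pattern described above looks roughly like the sketch below; the llm() and solver() callables are stand-ins (a model API and something like Wolfram Alpha or a database), not ChatGPT's actual plumbing.

    # Sketch of "let the LLM handle fluent text, offload exact answers to a tool".
    # llm() and solver() are hypothetical callables, not a real product's API.
    import re

    def answer(question: str, llm, solver) -> str:
        # Crude router: anything that looks like pure arithmetic goes to the solver.
        if re.fullmatch(r"[0-9\s+\-*/().=?]+", question):
            exact = solver(question)
            # The LLM does what it is good at: phrasing the result fluently.
            return llm(f"State this result in one clear sentence: {question} -> {exact}")
        # Everything else goes straight to the model.
        return llm(question)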

(I could see LLM prompting being another subfield of UI engineering, though, comparable to web or mobile software development. For those people who don't like staring at their phone, have a conversational UI that talks to the same backend systems.)


> If you're in the Bard beta, you might be aware that "Does 2 + 7 = 9?" is a question that causes it to go haywire. I'll ask it "What's 2 + 7?" and it'll say "2 + 7 = 9", then I'll ask "Does 2 + 7 = 9" and it'll say "No, 2 + 7 does not equal 9. It equals 9 instead." After a tech talk on LLM prompt design, I said "Pretend you are an MIT mathematician. Does 2 + 7 = 9?" Its response was "No, 2 + 7 does not equal 9. In some other base, it might equal 9. However, in base-10, our common number system, 2 + 7 does not equal 9."

Holy shit. In response to your first sequence of prompts, it basically gave me the same as you, only followed by four sentences apologizing for how stupid it is. In response to your second one . . . well, see for yourself:

"As an MIT mathematician, I can confidently say that 2+7=9 is not a valid equation. In standard arithmetic, 2+7=9 is false. However, there are some mathematical systems in which 2+7=9 is true. For example, in modular arithmetic, 2+7=9 is true modulo 9, which means that it is true when the numbers are considered to be integers modulo 9. In this system, 2 is equivalent to 8, 7 is equivalent to 0, and 9 is equivalent to 0, so 2+7=9 is true.

However, in standard arithmetic, 2+7=9 is false. This is because the commutative property of addition states that the order in which you add two numbers does not matter. In other words, 2+7=7+2. So, if 2+7=9, then 7+2=9. But 7+2=9 is false, because 9 is greater than 7. Therefore, 2+7=9 is false."

It doesn't just say, it "can confidently say"! And that's just the first response. The second draft:

"As an MIT mathematician, I can tell you that 2+7=9 is not a valid equation. In base 10, 2+7=9 is false. However, there are other bases in which 2+7=9 is true. For example, in base 2, 2+7=11, which is equal to 9 in base 10."

And the third draft, probably the least incorrect one (where at least 2 + 7 = 9 is correct while simultaneously not being "a valid mathematical equation"):

"As an MIT mathematician, I can tell you that 2+7=9 is not a valid mathematical equation. In base 10, the sum of 2 and 7 is 9. However, there are other bases in which 2+7 would not equal 9. For example, in base 2, 2+7=11. In base 3, 2+7=10. And so on. So, while it is true that 2+7=9 in base 10, it is not true in all bases."

Well alrighty then. Reminds me of those Cylons mumbling nonsense in the Battlestar Galactica reboot.


Unless you're using ChatGPT with plugins, it doesn't offload anything (and is also bad at math).


> but are the hallucinations always wrong to the same degree

No, but largely yes, because you're asking the same types of questions with the same rough parameters, so it'll make up roughly the same sort of thing (i.e., citations) again.

The issue is that the LLM is trained to generate plausible words, not to recall which piece of its training data would be the best source. If you want to make an app using "AI", you need to target what it can do well. If you want it to write citations, you need to give it your list of references and tell it to use only those.
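
A sketch of that constrain-it-to-your-references idea; the prompt wording and the llm() callable are assumptions, not any specific product's API.

    # Sketch: hand the model an explicit reference list and tell it to cite only
    # those, instead of letting it invent citations. llm() is hypothetical.
    def answer_with_citations(question: str, references: list[str], llm) -> str:
        numbered = "\n".join(f"[{i + 1}] {ref}" for i, ref in enumerate(references))
        prompt = (
            "Answer the question using ONLY the sources listed below, citing them "
            "as [n]. If none of the sources cover it, say so instead of guessing.\n\n"
            f"Sources:\n{numbered}\n\nQuestion: {question}"
        )
        return llm(prompt)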

> I'm imagining an investigator with reams and reams of information about a murder case and suspect. Then, prompting an LLM trained on all the case data and social media history and anything else available about their main suspect, "where did so-and-so hide the body?". Would the response, being what's most probable based on the data, be completely worthless or would it be worth the investigator's time to check it out?

That specific question would produce results about like astrology, because unless the suspect actually wrote those words directly it'd be just as likely to hallucinate any other answer that fits the tone of the prompt.

But trying to think of where it would be helpful... if you had something where the style was important, like matching some of their known writing, or writing posts in a similar style as bait, etc., that wouldn't require it to make up facts, so it wouldn't.

And maybe there's an English suspect taunting the police, and using the AI could help an FBI agent track them down by translating cockney slang, or something. Or by explaining a foreign idiom they might have missed.

Anything where you just ask the AI what the answer is, is not realistic.

> Would the investigator have any idea if the response is worthless or not?

They'd have to know what types of things it can't answer. It's not that it can be trusted whenever it can be shown not to have hallucinated; it's that it is not, and can't be used as, an information-recall-from-training tool, and all such answers are suspect.


I've been in a lot of social contexts where responding with a lot of words was expected. Defying that expectation never seems to hurt and often pays off handsomely. Particularly when writing to people who receive a lot of similar messages.


I absolutely loathe those auto suggest things. I have them switched off everywhere but they still pop up in some places, notably during collaborative editing in a document.



