> But more annoyingly if I asked "I am encountering X issue, could Y be the cause" or "could Y be a solution", the response would nearly always be "yes, exactly, it's Y" even when it wasn't the case

Seems like the same issue as the evil vector [1], and it could have been predicted that this would happen.

> It's kind of a wild sign of the times to see a tech company issue this kind of post mortem about a flaw in its tech leading to "emotional over-reliance, or risky behavior" among its users. I think the broader issue here is people using ChatGPT as their own personal therapist.

I'll say the quiet part out loud here. What's wild is that they appear to be apologizing that their Wormtongue [2] whisperer was too obvious to avoid being caught in the act, rather than prioritizing, or apologizing for not building, the fact-based counselor that people wanted/expected. In other words, their business model at the top is the same as the scammers' at the bottom: good-enough fakes to be deceptive, doubling down on narratives over substance, etc.

[1] https://scottaaronson.blog/?p=8693

[2] https://en.wikipedia.org/wiki/Gr%C3%ADma_Wormtongue



Well, that's what LLM-based AI has always been. It can be incredibly convincing, but the bottom line is it's just flavoring past text patterns, billions of them it's been "trained" on, which is more accurately described as efficiently compressed into a latent space. It's like someone who has lived for 10,000 years making small talk at the bar, has heard it all, and just kind of mindlessly and intuitively replies with something that sounds plausible for every situation.

Sam Altman is the real sycophant in this situation. GPT is patronizing. Listening to Sam go off on tangents about science fiction scenarios that are just around the corner... I don't know how more people don't see through it.

I kind of get the feeling the people who have to work with him every day got sick of his nonsense and just did what he asked for. Target the self-help crowd, drive engagement, flatter users, "create the next paradigm of emotionally-enabled humans of perfect agency" or whatever the fuck he was popping off about to try to motivate the team to compete better with Anthropic.

He clearly isn't very smart. He clearly is a product of nepotism. And clearly, LLM "AI" is an overhyped, overwrought version of twenty-questions artificial intelligence, enabled by mass data scale and Nvidia video game graphics. It's been 4 years now of this and AI still tells me the most obviously wrong nonsense every day.

"Are you sure about that?"

"You're absolutely correct to be skeptical of ..."


> which is more accurately described as efficiently compressed into a latent space.

The actual difference between solving compression+search and novel creative synthesis / emergent "understanding" from mere tokens is always going to be hard to spot with these huge cloud-based models that drank up the whole internet. (Yes, this is also true for domain experts in whatever content is being generated.)

I feel like people who are very optimistic about LLM capabilities for the latter just need to produce simple products to prove their case: for example, drink up all the man pages, a few thousand advanced shell scripts that are easily obtainable, and some subset of Stack Overflow. And BAM, you should have an offline bash oracle that makes this tiny subset of general programming a completely solved problem.

Currently, smaller offline models still routinely confuse the semantics of "|" vs "||": one pipes stdout into the next command, the other runs the next command only if the first fails. (An embarrassing statistical aberration that is more like the kind of issue you'd expect from old-school Markov chains than a human-style category error or something.) Naturally, if you take the same problem to a huge cloud model you won't have the same issue, but the argument about whether it "understands" anything is pointless, because the dataset is so big that of course search/compression starts to look like genuine understanding/synthesis, and really the two can no longer be separated. Currently it looks more likely this fundamental problem will be "solved" with increased tool use and guess-and-check approaches. The problem then is that the basic issue just comes back anyway, because it cripples generation of an appropriate test harness!

More devs do seem to be gradually coming around to this measured, non-hype kind of stance, though. I've seen more people asking things like, "wait, why can't it write simple programs in a well-specified esolang?" and similar.


A naive thought: what would you get if you hardcoded the language grammar instead of letting training discern it, kind of like an expert system constraining its output?


I try to do something like this when I ask the AI to generate tests -- I'll cook up a grammar and feed it to the LLM in a prompt, and then ask it to generate strings from the grammar. It's pretty good at it, but it'll produce mistakes, so I also write a parser for the grammar and have the LLM feed the strings it makes through the parser and correct them if they're wrong (roughly the loop sketched below). Works well.
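A minimal sketch of that generate-validate-correct loop, assuming Lark as the parser; llm_generate is a hypothetical stand-in for whatever completion API is actually being called:

  # Sketch: ask an LLM for strings from a grammar, validate with a parser,
  # feed parse errors back for correction. llm_generate is a hypothetical stub.
  from lark import Lark
  from lark.exceptions import LarkError

  GRAMMAR = r"""
      start: expr
      expr: term (("+" | "-") term)*
      term: NUMBER
      NUMBER: /[0-9]+/
      %ignore " "
  """

  parser = Lark(GRAMMAR, start="start")

  def llm_generate(prompt):
      """Hypothetical LLM call; returns candidate strings, one per item."""
      raise NotImplementedError("wire up your model/API of choice here")

  def grammar_guided_samples(n, max_rounds=3):
      prompt = f"Generate {n} distinct strings matching this grammar:\n{GRAMMAR}"
      accepted = []
      for _ in range(max_rounds):
          rejects = []
          for candidate in llm_generate(prompt):
              try:
                  parser.parse(candidate)  # validate against the grammar
                  accepted.append(candidate)
              except LarkError as err:
                  rejects.append((candidate, str(err)))
          if len(accepted) >= n or not rejects:
              break
          # Feed the parser errors back so the model can correct its own output.
          feedback = "\n".join(f"{c!r}: {e}" for c, e in rejects)
          prompt = f"These strings failed to parse, fix them:\n{feedback}"
      return accepted[:n]

The same grammar text doubles as the thing you paste into the prompt, and the parser errors become the correction feedback.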


What kind of hardcoding do you have in mind? What would the technique look like?


word2vec meets category theory

https://en.wikipedia.org/wiki/DisCoCat

>In this post, we are going to build a generalization of Transformer models that can operate on (almost) arbitrary structures such as functions, graphs, probability distributions, not just matrices and vectors.

https://cybercat.institute/2025/02/12/transformers-applicati...


We already know the keywords of the language, the symbols from the standard library and other major ones, and the rules of the grammar. So the weights could be biased toward those at decode time, something like the masking sketched below. Not sure how that would work, though.

I don't think that would help with going from natural language to a programming language, but it could probably help with patterns, kind of like a powerful suggestion engine.
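One common way that biasing idea gets made concrete is constrained decoding: at each step, mask out the logits of tokens the grammar can't accept next, so only legal continuations can be sampled. A toy sketch; the vocabulary, the legal_next_tokens oracle, and the fake logits are all made up for illustration:

  # Toy constrained decoding: illegal next tokens get their logits set to -inf.
  import math
  import random

  VOCAB = ["if", "then", "else", "fi", "x", ";", "<eos>"]

  def legal_next_tokens(prefix):
      # Stand-in for a real grammar/parser state machine.
      if not prefix or prefix[-1] in {";", "then", "else"}:
          return {"if", "x", "<eos>"}
      if prefix[-1] == "if":
          return {"x"}
      if prefix[-1] == "x":
          return {"then", ";", "<eos>"}
      return set(VOCAB)

  def constrained_sample(prefix, logits):
      allowed = legal_next_tokens(prefix)
      # Masking = -inf logits for every grammar-illegal token.
      masked = {t: (v if t in allowed else -math.inf) for t, v in logits.items()}
      weights = [math.exp(masked[t]) for t in VOCAB]
      return random.choices(VOCAB, weights=weights)[0]

  fake_logits = {t: random.uniform(-1, 1) for t in VOCAB}  # pretend model output
  print(constrained_sample(["if"], fake_logits))  # always "x", the only legal token

In practice the oracle would be a real parser state machine and the masking would hook into the model's sampler, but the shape of the idea is the same.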


> it's been 4 years now of this and AI still tells me the most obviously wrong nonsense every day.

It's remarkable seeing the change in sentiment in these parts, considering that even just a year ago a large part of this forum seemed to regularly proclaim that programmers were done, lawyers would be gone in 5 years, "Aye Gee Eye is coming", etc.


If it’s just flavoring text patterns, how does it reason about code when I give it explicit arbitrary criteria? Is it just mining and composing “unit level” examples from all the code bases it has ingested?


> I don't know how more people don't see through it.

When you think that, maybe you should take a step back and reflect on it. Could it be that your assessment is wrong rather than everyone else being delusional?


Parent is not alone in that line of thinking. (Some rudiments of reasoning do show, but the result is dominated by excellent compression of a humongous amount of data.)



