Hacker News | gallerdude's comments

I really liked Dario's metaphor that in the '80s, we could have said someday we'll have "supercomputers", which can do all the calculations we did except WAY faster. When, in reality, the AIs just get smarter over time, even if the frontier is jagged. AGI is just vibes for "smart enough, consistently enough".

AGI means not needing to retrain the model; it should be able to learn on the fly. That's the true meat of AGI. Any CEO or exec making any remark about AGI should be forced to define what they mean by AGI in that moment, or be completely shunned by the industry, since it seems they can just reframe what they meant by AGI later if they don't define it in that moment.

I was a baby when the Internet Revolution happened. I was in high school and college when the Mobile Revolution steamrolled everything. It’s been interesting to see this one, as an adult working in the world. I wonder how far it will go.

Further than the doomers think, but not enough to pay off the investors of the original boom. I say that as someone who has been an early believer in the internet (first website in the 90s), mobile data (slurping down the 'net, IRC, and IMs via EDGE data), smartphones (N80ie), streaming media (RIP Windows MCE), the list goes on.

Models were always going to be the commodity, just like the most popular and viable use cases at present are less job-replacement than "let's analyze huge data sets for patterns we're missing, and adjust accordingly" or "probabilistically generate deterministic software for me for X function/task". One-offs simply aren't profitable when models are interchangeable commodities, hence that brief attempt to pivot to "pay by outcome" before giddily embracing the classic consumption-based-billing playbook.


> Further than the doomers think, but not enough to pay off the investors of the original boom

Not an uncommon event - not only did this happen to many companies who were big in the original internet boom (e.g. Sun Microsystems, as well as all the Boo, Pets.com etc), it also happened to the railway boom of the previous century, and even the Channel Tunnel.


If GPT-5.5 Pro really was Spud, and two years of pretraining culminated in one release, WOW, you cannot feel it at all from this announcement. If OpenAI wants to know why they feel like they’ve fallen behind the vibes of Anthropic, they need to look no further than their marketing department. This makes everything feel like a completely linear upgrade in every way.


Clearly they felt a big backlash when version 5 was released. Now they are afraid of another response like this. And effectively, for the user it will likely only be a small update.


Also the naming department. You can tell that this is the AI company Microsoft chose to back because their naming scheme is as bad as .NET's.


I actually have no problem with the 5.x line... but if Pro really was an entirely new pretrain, they did a horrible job conveying that.


1. You can't understand the nuances, but there is a general pattern: new inventions may make us slightly less proficient at specifics, yet more powerful overall

2. Imagine a hunter-gatherer time travels to 2026. You go to a cafe for lunch with him, and he learns that food is cheap, delicious, and abundant. He sees your house, and thinks it's amazing compared to his cave. He thinks that 2026 must be absolute paradise. You explain to him, well kinda, but also not really. Is the hunter-gatherer right?


Alternatively he sees that you live in your house alone and feel lonely all the time. Maybe you have a small family and a few friends but it's nothing compared to the tribal life he knows.

He sees you spend your day working but rarely get to go outside or do anything active. Even when you're not working you sit behind a desk staring at a screen.

He wonders why you bother with all the technology when it made your life worse. Is he right?


I agree partially, but that also misses the wonder he would have for: relaxing bathtubs, funny livestreams, wireless earbuds, huge libraries, and even globes.

And yeah, you could make a list of struggles we have today he never did. But that’s kind of my point - it’s complicated.


Yeah, I think we're actually in agreement on the point about it being complicated. In reality I think different people would react differently, but they would all have mixed feelings. So it's impossible to ask "would they be right?" in a sense. Their feelings would be as valid as they would be varied.


Alternatively, he sees you alone and thinks how excellent it is to not have to deal with tribesmen: the elders and their rules, the children and their needs, the other hunters and their mind-numbing chatter …

This future man has paradise indeed.


The hunter-gatherer will wonder why you spend so much time working. He only spends 2-3 hours a day gathering and preparing food, maybe an hour maintaining tools and shelter; with the rest dedicated to leisure and social activities.


> 1. You can't understand the nuances, but there is a general pattern: new inventions may make us slightly less proficient at specifics, yet more powerful overall

No. It's not a phenomenon with a pattern. Maybe there's a coincidental pattern to some subset of inventions, but there's no logical reason it would apply to some arbitrary next invention (e.g. the pattern of biotechnology inventions allowing us to live longer and healthier lives... until some guy invents an experimental pathogen that wipes out the species).

> 2. Imagine a hunter gatherer is time travelled to 2026....

You're kinda missing my point. Many people smugly assume the present is better than the past, and can point to cherry-picked this-and-that to feel confident about their claim. But almost every modern person has no sense of what was lost, and what prior generations mourned losing. There's a temptation to smugly dismiss the thoughts of those who lived through those transitions as stupid and ignorant, but they have insight that's no longer available to us first hand.

Some of these inventions we're so proud of having may not have resulted in a net-positive effect on our lives, but we don't have the experience to realize that anymore (like someone in a community that's been living knee-deep in shit all the time doesn't have the experience to realize it's a terrible life compared to his distant ancestors').


I’ve been thinking about AI robotics lately… if internally at labs they have a GPT-2, GPT-3 “equivalent” for robotics, you can’t really release that. If a robot unloading your dishwasher breaks one of your dishes once, this is a massive failure.

So there might be awesome progress behind the scenes, just not ready for the general public.


I ended up watching Bicentennial Man (1999) with Robin Williams over the weekend. If you haven't seen it, I thought it was a good and timely thing to watch, and it's kid friendly. Without giving away the plot, the scene where it was unloading the dishwasher...take my money!


> If a robot unloading your dishwasher breaks one of your dishes once, this is a massive failure.

That's a bit exaggerated, no? Early Roombas would get tangled in socks, drag pet poop all over the floor, break glass stuff and so on, and yet the market accepted that, evolved, and now we have plenty of cleaning robots from various companies, including cheap spying ones from China.

I actually think that there's a lot of value in being the first to deploy bots into homes, even if they aren't perfect. The amount of data you'd collect is invaluable, and by the looks of it, can't be synth generated in a lab.

I think the "safer" option is still the "bring them to factories first, offices next and homes last", but anyway I'm sure someone will jump straight to home deployments.


It's called "VLA" (vision-language-action) models: https://huggingface.co/models?pipeline_tag=robotics

VLA models essentially take a webcam screenshot + some text (think "put the red block in the right box") and output motor control instructions to achieve that.

Note: "Gemini Robotics-ER" is not a VLA, though Gemini does have a VLA model too: "Gemini Robotics".

A demo: https://www.youtube.com/watch?v=DeBLc2D6bvg


I have broken dishes loading and unloading the dishwasher. Am I a massive failure?

My non-AI dishwasher can't even always keep the water inside. Nothing is perfect.


If someone paid 100 grand for you to load and unload the dishwasher, and the research to be able to do it cost hundreds of billions, decades of research, hundreds of thousands of researchers, and that was the ONLY thing you could do, yes, you WOULD be a massive failure.


From an economic standpoint, industry is the most relevant market by far anyway. It's easier, as the environment is a lot more controlled, professionals configure and maintain the robots, and they buy in bulk and have more money.

My concern with a household robot is not the dishwasher but the TV screen, the glass door, glass table, animals (fish/aquarium) etc. that the robot might walk through, reach through or fall onto.


> If a robot unloading your dishwasher breaks one of your dishes once, this is a massive failure.

Depending on what the rate of breaking dishes is, this would be a massive improvement on me, a human being, since I break a really important dish I needed to use like ~2x per month on average.


You really break a dish once every 2 weeks? That seems exceptionally clumsy.

Not here to shame you for it, for the record.


> That seems exceptionally clumsy

That's me ;_;


There's not enough internet-scale data for robotics. The gap is huge! So anyone who claims to have a GPT-like model is not being honest.


This would have been an amazing release 6 months ago. But the industry moves so fast, this is a trite release. Maybe it’s best for Meta to sell their superintelligence division. I don’t think Zuck’s vision is particularly compelling.


A new model comparable (ish) to the Claude/Gemini/GPT flagships is a big deal for the industry and for Meta even if it doesn't set the new frontier.


I’m not sure. If it was open source, certainly. But 4th place doesn’t really matter if you have nothing different to add.


If the model is truly on par with Opus 4.6/Gemini 3.1/GPT 5.4 (beyond benchmarks) this still puts MSL in the frontier lab category, which is no small feat given that they pretty much rebooted last year

Many labs aren't able to keep up with the frontier, xAI, Mistral


Fourth place means you're not reliant on any of the external providers for internal AI use, which is important for organizational health and negotiating with those other providers.


I’m not sure it’s useful for negotiating, the capex to build it was surely orders of magnitude more than it would cost to just use one of the other frontier models.

It’s like someone negotiating by saying, “I’ll waste even MORE money to build something worse if you don’t give me a deal.”

I’m not discounting there may be other advantages to doing it. I just don’t think negotiating is one.


Why would you use this instead of the other more proven models? Unless it's significantly cheaper. The general population mostly wants it free, and the more professional users are willing to pay for good/better responses.


You wouldn't use this as an API. You would "use" this inside the meta properties. Have a shop on fb marketplace? Now you have copy, images, support, chat, translations, erp, esp, fps and all the other acronyms :) and so on for your mom and pop shop @200$/mo. Probably worse than say claude/gemini but it's right there, one button away. "Click here to upgrade to AI++" or something.


But rolling your own can’t be that much cheaper than buying it from a leading lab. Especially when you consider the amount of spending on datacenters.


Leading labs are going to be tightening the screws. Otherwise why not just run the entire company on a public cloud?


I won't use it, but I'm excited to see it for the same reason why I'm excited to see a near-frontier open-source release: more competition pushes prices down and reduces monopoly/cartel risk. I won't use Muse or Grok or GLM at this point but they're good for the ecosystem.


Their new Contemplating mode gives this model a Deep Research ability (akin to existing models from GPT and Gemini) that might make it quite comparable to the just-announced Mythos.


Mythos is a much bigger pre train, Contemplating is not the same thing.


> Mythos is a much bigger pre train

Do we have data to substantiate that claim?


It's pretty common knowledge. Spud is the only other PT comparable with Mythos.

Both Spud and Mythos can also scale via inference time compute.

Meta simply did not have enough compute online, long enough ago, to have a similar PT.


> might make it quite comparable to the just-announced Mythos

Do we have data to substantiate that claim?


I never understood why Meta decided to join the race. They don’t sell compute like Google or Microsoft. Why not let others do the hard work and integrate their LLMs in your systems if needed? I assume it’s because they have Instagram, Facebook, WhatsApp, and Threads data and feel they should be the ones using it for training, but it’s really not obvious how having a frontier AI lab benefits their business.


Adtech Money. They've got GPUs, they've got the infrastructure, and they've got the advertisement platform, and the point is getting AI that can exploit the adtech and create a flywheel effect, maximizing return from the data they collect from Insta, WhatsApp, Facebook, etc.

It's not just about LLMs, it's about being able to model consumers and markets and psychology and so on. Meta is also big in the manipulation side of things, any sort of cynical technological exploitation of humans you can imagine but that is technically legal, they're doing it for profit.


> I never understood why meta decided to join the race.

I can think of at least two reasons. Price and customizability. If they train their own models on their own data, they potentially have a better model at a better price, and they're not at the mercy of Anthropic's decisions when they decide to raise prices. Additionally, if you use someone else's model, you use it the way they create it and permit you to use it. In a couple years, who has any idea how these models are used. Arguably, a company the size of Meta should be in control of their AI models.


You basically have to be involved if you're meta. Even if there's only 5% chance this AI stuff is as disruptive as the labs claim it is, you can't afford to miss out. Even if you're lagging frontier, you must develop the competency internally. Otherwise you ignored a 5% chance of total annihilation, probably even exposing you to shareholder lawsuits.


Because there's a realistic chance this is the only important software technology moving forward, and it commoditizes Meta's entire business, which is software.


Meta’s business is human attention, human connections, and all derived data. They can use AIs for their systems, but the question is why do they feel the need to spend billions on training and running their own frontier model


Zuck is trying to convince himself he's good, and not just lucky.


From what I heard Meta is spending hundreds of millions each month in Claude credits for developers. So that’s a huge saving if they have own models that match Opus.


Spending tons of money on Claude and the recent token benchmarks came WELL after Meta's huge investments in compute infrastructure for AI as well as the long history of language model development inside science divisions at the company.


LLMs/Chat-based systems will reach a point where Facebook, WhatsApp, Threads, Instagram, etc. are all unnecessary. The idea of opening a browser or a specific app to do a thing will seem antiquated. You can do it all with your chat-based agent. Meta wants to be part of that.


I don't think everyone only wants to talk to machines going forward...?


I don't want to do it now. But that seems to be where we are being headed, like lemmings running for the cliff.


They have realized that the real money is in sitting between us and reality arbitrating what we see or know.


Sure but they have the platforms, they don’t need their own frontier models for that


The platforms will be irrelevant at some point. "Posting to Facebook" won't be a thing.


A few things:

1) meta was doing this at scale before openAI

2) decent ML is critical to categorising content at scale; the more accurate and fast the categorisation, the finer the recommendations can be (i.e. instead of "woman, outside" as the tags for a video: woman, age, hair colour, location, subjects in view, main subject of video, video style). Doing that as fast as possible with as little energy as possible is mission critical

3) The llama leak basically evaporated the moat around openAI who _could_ have become a competitor

4) for the AR stuff, all of these models (and visual models) are required to make the platform work. They also need complete ownership so that it can be distilled to make it run on tiny hardware

5) dick swinging

6) they genuinely want to become an industrial behemoth, so robots, hardware, etc are now all in scope.


I think they just want to be a winner in the “next thing.” They hit social networking, but missed mobile operating systems and didn’t compellingly win at social media. Eventually an ambitious person with a bazillion dollars wants a clear win, right?


Only thanks to Meta we have competitive local LLMs. Without LLama nothing decent would have been released. Commoditize your complements in action.


AI NPCs to fill in the empty Metaverse?


First and most importantly is the fact they have a lot of very valuable data they wouldn't want to siphon to a competitor. This data is a key strategic asset in the space where they do business.

Secondly though, I think it has to do with the fact Meta is big enough to worry about vertical integration and full control of their business.

The whole reason they've been trying to make AR/VR happen for over a decade now is the assumption of a worst case and best case scenario. The worst case is Apple and Google want them gone. This isn't as far-fetched as it seems: Google has historically been Meta's biggest competitor and even tried to release its own social network back when Meta was threatening them. If either pulls Meta apps from their respective stores, it'd be an immense blow to Meta; their whole trillion-dollar business depends on competitors' platforms.

Meta tried making inroads into the phone business but failed; it is a very crowded market after all. So they changed their strategy. Instead of playing catch-up, they'd invent "the next iPhone" and be the first to a brand new market. This is the best case scenario; they invent a new platform where they can be dominant from day 1 and stop depending on competitors' hardware, not only removing that risk factor for them, but also unlocking a new market they can control.

AI ties into all this because it appears to be key for this next platform to happen. You will communicate with these smart glasses via voice, hand gestures, or subtle movements that a model will have to interpret. The features that could make them stand out as more than just a screen on your face are all AI related; object detection, world understanding, context awareness, etc. If all this were done via a 3rd party Meta would effectively be back on square one: a competitor could easily yank away its model access, or sell it to a competitor. Meta would be again at the mercy of others.

Compared to other big-tech players, I think it's easy to see how Meta is in a riskier position. There's little Google or Microsoft can do to kill the iPhone. There's little Apple or Google can do to kill Amazon's online store. There's little Amazon or Apple can do to kill Microsoft's business deals. Google and Meta are primarily in the business of capturing people's data, attention, and selling ads, and both Google and Apple could do quite some damage to Meta. Beyond expanding it, it's important for them to invest in ways to protect their money-printing machine.


I’m sure there’s more to it than this, but it feels like Zuck has pet interests like VR and now AI.


But no account support, that's boring

Or any quality control (people missing posts)

Or banning the people who should be banned while leaving everyone else alone

This is Zuck: https://news.ycombinator.com/item?id=4151433 or https://news.ycombinator.com/item?id=10791198


You don't understand why Zuck, who paid $1B for Instagram when they had no revenue and 7 employees because he is paranoid about platform shifts, decided to join the race for (what seems quite possibly) the biggest platform shift in human history?


He also tried and failed to buy Snapchat, and then copied their feature on all their big products: Instagram, Facebook and even WhatsApp.


The way you put it, I understand it less. lol


One word: control. It's the same reason Facebook became Meta


Pumps up the stock price.


Because Zuck has chronic FOMO, he's said as much himself


To download all those torrents, obviously.


But then how will Zuck win the billionaire dick measuring contest?


> I don’t think Zuck’s vision is particularly compelling.

But he has to do it anyways, otherwise Meta can be disrupted easily.

Google and Apple have hardware and distribution channels for their products

Amazon has the marketplace and cloud

Microsoft has enterprise and cloud

Meta is always looking for ways to stay afloat


Meta has 3.5 billion daily active users


and has competitors like: TikTok, SnapChat, YouTube, Netflix, X, HBO, Amazon Prime, all fighting for the attention time.

They are worried something like Sora can disrupt them quickly


Tbf there was a 5.3 codex


My job may have become part of the training data with how much coverage there is around it. Perhaps another career would be a better test of LLM capabilities.


Have you ever heard of a black swan?


We used to get one annual release which was 2x as good, now we get quarterly releases which are 25% better. So annually, we’re now at 2.4x better.
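
A quick way to check that compounding, as a minimal sketch: it assumes four releases a year and that each one is exactly 25% better than the last (both numbers are taken from the comment above, not measured from anything).

```python
# Compound four quarterly releases, each assumed to be 25% better than the previous one.
quarterly_gain = 1.25      # assumption from the comment: +25% per release
releases_per_year = 4      # assumption: one release per quarter

annual_gain = quarterly_gain ** releases_per_year
print(f"{annual_gain:.2f}x per year")  # 1.25^4 = 2.44140625, so roughly 2.4x
```

So 2.4x is the rounded-down figure; the same arithmetic with three releases a year would give a bit under 2x.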


The weirdest thing about this AI revolution is how smooth and continuous it is. If you look closely at differences between 4.6 and 4.5, it’s hard to see the subtle details.

A year ago today, Sonnet 3.5 (new), was the newest model. A week later, Sonnet 3.7 would be released.

Even 3.7 feels like ancient history! But in the gradient of 3.5 to 3.5 (new) to 3.7 to 4 to 4.1 to 4.5, I can’t think of one moment where I saw everything change. Even with all the noise in the headlines, it’s still been a silent revolution.

Am I just a believer in an emperor with no clothes? Or, somehow, against all probability and plausibility, are we all still early?


If you've been using them, each new step is very noticeable, and so is the mindshare. Around Sonnet 3.7 Claude Code-style coding became usable, and very quickly gained a lot of market share. Opus 4 could tackle significantly more complexity. Opus 4.6 has been another noticeable step up for me: suddenly I can let CC run significantly more independently, allowing multiple parallel agents where previously too much babysitting was required for that.


I think this is where there's a huge distinction between ability/performance/benchmark figures and utility. You can have smooth improvements to performance, but marked step changes in utility as they cross thresholds where you're able to use them for new tasks.


> If you've been using them, each new step is very noticeable, and so is the mindshare. Around Sonnet 3.7 Claude Code-style coding became usable

Yet I vividly remember the complaints about how 3.7 was a regression compared to 3.5 with people advising to stay on 3.5.

Conversely, Sonnet 4 was well received so it's not just a story about how complainers make the most noise.


In terms of real work, it was the 4 series models. That raised the floor of Sonnet high enough to be "reliable" for common tasks and Opus 4 was capable of handling some hard problems. It still had a big reward hacking/deception problem that Codex models don't display so much, but with Opus 4.5+ it's fairly reliable.


Honestly, 4.5 Opus was the game changer. From Sonnet 4.5 to that was a massive difference.

But I'm on Codex GPT 5.3 this month, and it's also quite amazing.


I had not used Claude much until an hour ago since probably before GPT5. I had only been using Gemini the last 3 months.

Sonnet 4.6 extended on the free plan is just incredible. I am just completely floored by it. The conversation I just had with it was nuts. It was from Dario mentioning something like a 20% chance Claude is conscious or something crazy like that. I have always tried that conversation with previous models but it got boring so fast.

There is something with the way it can organize context without getting lost that completely blows Gemini away.

Maybe even more so that it was the first time it felt like a model pushed back a little and the answers were not just me ultimately steering it into certain answers. For the free plan that is nuts.

In terms of being conscious, it is the first time I would say I am not 100% certain it is just a very useful, very smart, stochastic parrot. I wouldn't want to say more than that but 15-20% doesn't sound so insane to me as it did 2 hours ago.


> Or, somehow, against all probability and plausibility, are we all still early?

What does this even mean? It's obvious we're still early and I think it's a very common opinion.

