Hacker News | new | past | comments | ask | show | jobs | submit | kromem's comments | login

Seems very strawmanned.

There's currently a bit of an 80/20 rule with AI where it does great automating 80% of an overlapping problem domain and chokes on it 20% of the time.

The idea of someone giving 100% of their work to Claude as in the examples is dumb. But so is someone doing 100% of the busywork themselves.

Don't waste your own time and your client's money for the sake of some nonsense purity ideal. Learn to thread the needle of changing times.

Cause they are gonna keep changing.


We are seeing the same patterns: move to advertisement, anti-open source strategies, aggressive acquihires.

Companies driving AI are dinosaurs, funded by dinosaurs or aiming to become like them.

It literally feels like nothing has changed in 30 years.

There is absolutely no knowledge gap in learning to use AI tools. Only non-developers have issues, and they dominated the discourse with make-believe fantasy problems. Writing text, organizing markdown documents, writing good specifications... this is just not hard at all. Any developer can pick it up in a week, it's not a matter of adaptation, it's a matter of choice.


The only constant is change -- Heraclitus


Anecdotally, the common theme I'm starting to hear more often now is that people who use “AI” at work despise it when it replaces humans outright, but love it when it saves them from mundane, repetitive crap that they have to do.

These companies are not selling the world on a vision where LLMs are a companion tool; instead, they are selling the world on the idea that this is the new AI coworker. That 80/20 rule you're calling out is explained away with words like “junior employee.”


I think it's also important to see that even IF some are selling it as a companion tool, that's only for the meantime. That is, it's your companion now because they need you working next to it to make it better, so that it can become an "AI employee" once it's trained on your companionship.


A number of the Claudes have pretty good 0-shot awareness of my post history from just my username.

Though nothing like grok 4, which probably has a better memory of it than I do, and will even regularly name drop a certain post from years ago in conversations.

It's a huge time saver though, and means I can even in a fresh context establish a rapport with a model extremely quickly. Just a few years earlier than I was expecting that level of latent space fidelity to occur.

Like, sure we can add memory features for context management, but anyone with a post history should probably *also* keep in mind that there's literally years worth of memory on tap for interactions with models, and likely at ever higher fidelity and recall. Latent spaces are wild.


With ChatGPT, the memory feature, particularly in combination with RLHF sampling from user chats that had memory enabled, led to an amplification problem, which in that case amplified sycophancy.

In Anthropic's case, it's probably also going to lead to an amplification problem, but given the amount of overcorrection for sycophancy, I suspect it's going to amplify aggressiveness and paranoia toward the user (which we've already started to see with the 4.5 models due to the amount of adversarial training).


So a thing with claude.ai chats is that once they run long enough, a long context reminder (LCR) gets injected on every single turn.

That injection (for various reasons) will essentially eat up a massive amount of the model's attention budget and most of the extended thinking trace if present.

I haven't really seen lower quality of responses with modern Claudes with long context for the models themselves, but in the web/app with the LCR injections the conversation goes to shit very quickly.

And yeah, LCRs becoming part of the memory is one (of several) things that's probably going to bite Anthropic in the ass with the implementation here.


Latent space reasoners are a thing, and honestly we're probably already seeing emergent latent space reasoners starting to end up embedded into the weights as new models train on extensive reasoning synthetics.

If Othello-GPT can build a board in latent space given just the moves, can an exponentially larger transformer build a reasoner in their latent space given a significant number of traces?


The response is 1,000% written by 4o. Very clear tells, and in line with many other samples from the past few days.


Don't underestimate the importance of multi-user human/AI interactions.

Right now OAI's synthetic data pipeline is very heavily weighted to 1-on-1 conversations.

But models are being deployed into multi-user spaces that OAI doesn't have access to.

If you look at where their products are headed right now, this is very much the right move.

Expect it to be TikTok style media formats.


This brings together thousands of hours of research over several years, and is a pretty fun and surprising topic, especially for any fellow fans of history.

And as unbelievable as you may think the title to be, I can pretty much guarantee you'll find it much more believable by the end of the post.


For throwing that much shade, it does a piss-poor job of actually backing up its claims or citing evidence.

Evans definitely had issues with how he went about things and with his analysis. For example, the "snake goddess" holds snakes in a pose remarkably similar to wooden snake props found in Egypt 300 years earlier.

But this article is pretty damn empty of actual substance.


In video games that have procedural generation, there's often a seed function that deterministically defines a continuous geometry.

But in order to track state changes from free agents, when you get close to that geometry the engine converts it to discrete units.

This duality of a continuous foundation becoming discrete units around the point of observation/interaction is not the result of dueling models, but of a unified system.

I sometimes wonder whether we'd struggle with interpreting QM the same way if there weren't a paradigm blindness, with the interpretations all predating the advances in information-systems modeling.
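The pattern described above can be sketched in a few lines. This is a hypothetical toy, not any real game engine: `terrain_height` stands in for the seed function (a pure function of position and seed, so the "continuous" world needs no storage), and `World.observe` discretizes it into mutable chunks only where an agent actually looks, which is where state changes from free agents get tracked.

```python
import math

SEED = 42  # hypothetical world seed

def terrain_height(x: float, seed: int = SEED) -> float:
    """The 'continuous geometry': a deterministic function of position and seed.
    Any point can be evaluated lazily, so nothing is stored up front."""
    return math.sin(seed + x * 0.1) * 10 + math.sin(seed * 3 + x * 0.37) * 3

class World:
    """Discretizes the continuous field into mutable chunks near the observer."""

    def __init__(self, chunk_size: int = 16):
        self.chunk_size = chunk_size
        self.chunks = {}  # only observed chunks get discrete, mutable state

    def observe(self, x: float) -> list:
        key = int(x) // self.chunk_size
        if key not in self.chunks:
            # Collapse the continuous function into discrete units once,
            # at the point of observation.
            start = key * self.chunk_size
            self.chunks[key] = [round(terrain_height(start + i))
                                for i in range(self.chunk_size)]
        return self.chunks[key]

    def modify(self, x: float, new_height: int) -> None:
        # Agent-driven state changes are tracked on the discrete units,
        # not on the underlying continuous function.
        chunk = self.observe(x)
        chunk[int(x) % self.chunk_size] = new_height

world = World()
world.observe(5.0)     # discretized on first observation
world.modify(5.0, 99)  # the mutation persists in the discrete layer
assert world.observe(5.0)[5] == 99
```

One foundation, two presentations: far away, the world is just the function; up close, it's discrete, stateful units.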

