Hacker News | rglover's comments

Considering the scope, this could be more easily resolved by just stripping ", Republic of" from that specific string (assuming "Moldova" on its own is sufficient).

Our leaders are lost people.

Was bracing for another rug pull around all this, but kudos to Dario and co for their continued vigilance. Refreshing to see.


> When you spend two years making useless Arduino projects, you develop instincts about electronics, materials, and design that you can’t get from a tutorial. When vibe coding goes straight to production, you lose that developmental space. The tool is powerful enough to produce real output before the person using it has developed real judgment.

The crux of the problem. The only way to truly know is to get your hands dirty. There are no shortcuts, only future liabilities.


Then again, sophisticated manufactured electronics had long been cheap and available by the time somebody thought to create Arduino as a platform in the first place.

And even today, people hack on assembly and ancient mainframe languages and demoscene demos and Atari ROMs and the like (mainly for fun but sometimes with the explicit intention of developing that flavor of judgment).

I predict with high confidence that not even Claude will stop tinkerers from tinkering.

All of our technical wizardry will become anachronistic eventually. Here I stand, Ozymandias, king of motorcycle repair, 16-bit assembly, and radio antennae bent by hand…


Nah.

There are corners of the industry where people still write ASM by hand when necessary, but for the vast, vast majority it's neither necessary (because compilers are great) nor worthwhile (because it's so time consuming).

Most code is written in high-level, interpreted languages with no particular attention paid to its performance characteristics. Despite the frustration of those of us who know better, businesses and users seem to choose velocity over quality pretty consistently.

LLM output is already good enough to produce working software that meets the stated requirements. The tooling used to work with them is improving rapidly. I think we're heading towards a world where actually inspecting and understanding the code is unusual (like looking at JVM/Python bytecode is today).
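For what it's worth, bytecode is already inspectable today; it's just that almost nobody reads it in day-to-day work. A quick Python sketch of what that inspection looks like:

```python
import dis
import io

def add(a, b):
    return a + b

# Disassemble the function's bytecode into a string buffer.
# It's perfectly readable if you go looking, but (per the point
# above) most developers never do.
buf = io.StringIO()
dis.dis(add, file=buf)
listing = buf.getvalue()
print(listing)  # each line: offset, opcode name, argument
```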

Future liabilities? Not any more than we're currently producing, but produced faster.


Compilers take a formal language and translate it to another formal language. In most cases there is no ambiguity, it’s deterministic, and most importantly it’s not chaotic.

That is, changing one word in the source code doesn't tend to produce vastly different output, or changes to completely unrelated code.
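A minimal illustration of that determinism, using CPython's own compile step (nothing LLM-specific is involved):

```python
import hashlib

SRC = "def add(a, b):\n    return a + b\n"

# Compiling the same formal-language source twice yields
# byte-identical bytecode: the translation is deterministic,
# so a small input change can't cascade into unrelated output.
code1 = compile(SRC, "<src>", "exec")
code2 = compile(SRC, "<src>", "exec")
h1 = hashlib.sha256(code1.co_code).hexdigest()
h2 = hashlib.sha256(code2.co_code).hexdigest()
assert h1 == h2
```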

Because the LLM is working from informal language, it is by necessity making thousands of small (and not so small) decisions about how to translate the prompt into code. There are far more decisions here than can reasonably be fixed in tests/specs. So any change to the prompt/spec is likely to result in unintended changes to observable behavior that users will notice and be confused by.

You’re right that programmers regularly churn out unoptimized code. But that’s very different from churning out a bubbling morass where every little thing that isn’t bolted down is constantly changing.

The ambiguity in translation from prompt to code means that the code is still the spec and needs to be understood. Combine that with prompt instability and we’ll be stuck understanding code for the foreseeable future.


The problem you describe is real, but I think it can be addressed by improving tooling without any improvement in available LLM technology.

How? Are you thinking of adversarial AI reviewers, runtime tests (also by AI), or something else?

Guess I just don't see how you can take the human out of the loop and replace them with non-deterministic AIs and informal prompts / specs.


Humans are also non-deterministic, though. Why does replacing one non-deterministic actor with another matter here?

I'm not particularly swayed by arguments of consciousness, whether AI is currently capable of "thinking", etc. Those may matter right now... but how long will they continue to matter for the vast majority of use cases?

Generally speaking, my feeling is that most code doesn't need to be carefully-crafted. We have error budgets for a reason, and AI is just shifting how we allocate them. It's only in certain roles where small mistakes can end your company - think hedge funds, aerospace, etc. - where there's safety in the non-determinism argument. And I say this as someone who is not in one of those roles. I don't think my job is safe for more than a couple of years at this point.


> Generally speaking, my feeling is that most code doesn't need to be carefully-crafted. We have error budgets for a reason, and AI is just shifting how we allocate them. It's only in certain roles where small mistakes can end your company - think hedge funds, aerospace, etc. - where there's safety in the non-determinism argument.

That's a bit shortsighted. There have been cries of software becoming needlessly bloated and inefficient for as long as computers have existed (Wirth, of course, but countless others too). Do you visit any gamer communities? They constantly blame careless waste of resources and lack of optimization for AAA games performing badly on even state-of-the-art hardware, or constantly requiring you to upgrade your gaming rig.

I don't think the only scenario is boring CRUD or line of business software, where indeed performance often doesn't matter, and most of it can now be written by an AI.


Even in CRUD line of business software, lack of performance causes enormous problems that the current software development culture glosses over.

Just one example I've seen time and again. You take an application that if optimized could run on a single server (maybe 2 if you absolutely have to have zero downtime deployments), but because no one cares about performance it runs on 10 or more. You now have a complexity avalanche that rapidly blows up. Then you need more hierarchy to handle the additional organizational complexity etc...

Then people start breaking out pieces of the app so they can scale them separately and before long you're looking at 200 engineers to do a job that certainly doesn't need that many people.

I realize I'm ignoring a whole lot of other issues that result in this kind of complexity, but lack of performance contributes to this a lot more than people want to admit.


Agreed. I wanted to give some credence to the fact many cookie-cutter CRUD apps can absorb a ton of inefficiencies until they truly burst at the seams, but yeah, even in that case software bloat and bad use of resources matters.

I find it intriguing seeing this new batch of dev-types completely giving up on the matter. The conversation of machine vs developer efficiency is not new, but completely giving up on any sane use of resources is something relatively new, I think. Especially coming from some in the HN crowd. Maybe these are new people, so I can chalk it up to generational turnover?


I've been in industry for >15 years and have given up on sane use of resources because it's been actively disincentivised everywhere I've worked.

Sanity is for my personal projects.


It has nothing to do with whether small mistakes are allowable or not. It’s about customers needing a consistent product.

The in-code tests and the expectations/assumptions about the product that your users have are wildly different. If you allow agents to make changes restricted only by those tests, they’re going to constantly make changes that break customer workflows and cause noticeable jank.

Right now agents do this at a rate far higher than humans. This is empirically demonstrable by the fact that an agent requires tests to keep from spinning out of control when writing more than a few thousand lines and a human does not. A human is capable of writing tens of thousands of lines with no tests, using only reason and judgement. An agent is not.

They clearly lack the full capability of human reason, judgment, taste, and agency.

My suspicion is that something close enough to AGI that it can essentially do all white-collar jobs is required to solve this.


> adversarial AI reviewers, runtime tests (also by AI), or something else?

And spec management, change previews, feedback capture at runtime, skill libraries, project scaffolding, task scoping analysis, etc.

Right now this stuff is all rudimentary, DIY, or non-existent. As the more effective ways to use LLMs become clearer, I expect we'll see far more polished, tightly-integrated tooling built to use LLMs in those ways.


Agents require tests to keep from spinning out of control when writing more than a few thousand lines, but we know that tests are wildly insufficient to describe the state of the actual code.

You are essentially saying that we should develop other methods of capturing the state of the program to prevent unintended changes.

However there’s no reason to believe that these other systems will be any easier to reason about than the code itself. If we had these other methods of ensuring that observable behavior doesn’t change, and they were substantially easier than reasoning about the code directly, they would be very useful for human developers as well.

The fact that we’ve not developed something like this in 75 years of writing programs, says it’s probably not as easy as you’re making it out.


> Agents require tests to keep from spinning out of control when writing more than a few thousand lines, but we know that tests are wildly insufficient to describe the state of the actual code.

Provide them with a mature, well-structured codebase to work within. Break the work down into tasks sized such that it's unlikely they'll spin out of control. Limit the scope/nature of changes such that they're changing one thing at a time rather than trying to one-shot huge programs. Use static analysis to identify affected user-facing flows and flag for human review. Provide the human-in-the-loop with fully functional before and after dev builds. Allow the human-in-the-loop to provide direct feedback within the dev build. Track the feedback the same way you track other changes. And, yes, have some automated tests that ensure core functionality matches requirements.
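As one hedged sketch of the "automated tests that ensure core functionality" piece: a characterization test that pins observable behavior, so an agent's change elsewhere gets flagged for review instead of silently shipping. All names here are invented for illustration.

```python
# Illustrative only: `slugify` stands in for any user-facing
# behavior the codebase owner wants held constant.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

def test_slug_behavior_is_pinned():
    # Golden cases: if an agent's refactor drifts from these,
    # the failure routes the change to a human in the loop.
    assert slugify("Hello World") == "hello-world"
    assert slugify("  One  Two  ") == "one-two"

test_slug_behavior_is_pinned()
```

The point isn't that such tests catch everything, only that they bound the blast radius of an unsupervised change.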

I think everything I've listed there can be built with existing technology.

> You are essentially saying that we should develop other methods of capturing the state of the program to prevent unintended changes.

I think you're imagining something far more sophisticated than what I'm actually suggesting. I also think you're setting a higher bar for agents to clear than what's actually required in practice.

Tests don't need to catch every issue, agents should be expected to make some mistakes (as humans do).

> However there’s no reason to believe that these other systems will be any easier to reason about than the code itself. If we had these other methods of ensuring that observable behavior doesn’t change and they were substantially easier than reasoning about the code directly, they would be very useful for human developers as well.

There are lots of powerful static analysis tools out there that can be helpful in improving correctness and reducing the incidence of regressions. IME most human developers tend to eschew tools that are unfamiliar, have steep learning curves, or require extra effort when writing code.

> The fact that we’ve not developed something like this in 75 years of writing programs, says it’s probably not as easy as you’re making it out.

I think the cost/benefit of what I'm describing has changed. We've only had LLMs capable of reliably producing working code changes for around a year.


"users seem to choose velocity over quality pretty consistently"

When do they have a real choice, without vendor lock-in or other pressure?

Windows 11 is 4 years old but until a few months ago barely managed to overtake Windows 10, despite upgrades that were only "by choice" in the most user-hostile sense imaginable (the dark patterns were so misleading I know multiple people who didn't notice that they "agreed" to the upgrade, and since the prompt pops up repeatedly it only takes a single wrong click to mess up). It doesn't look like people are very excited about the "velocity".

In the gaming industry AAA titles being thrown on the market in an unfinished state tends to also not go over well with the users, but there they have more power to make a choice as the market is huge and games aren't necessary tools, and such games rarely recover after a failed launch.


You're absolutely right -- that's the crux of the problem. There are no shortcuts, only future liabilities.

If you didn't catch it, this is a joke calling out the comment above it for using a couple obvious LLM-isms. The comment above may have been a joke, too. It's hard to tell any more.

> It's hard to tell any more.

Wait, I think I have the answer!

"You're in a desert, walking along in the sand when all of a sudden you look down and see a tortoise. It's crawling toward you. You reach down and flip the tortoise over on its back. The tortoise lays on its back, its belly baking in the hot sun, beating its legs trying to turn itself over. But it can't. Not without your help. But you're not helping. Why is that?"


What do you mean I'm not helping?

I mean you're not helping. Why is that?

What's a tortoise? What desert? How come I'd be there?

Maybe you're fed up. Maybe you want to be by yourself. Who knows?

Hm... Why? Ah! Because you are also a tortoise

Tortoises have been observed righting other tortoises that have become stuck. https://www.youtube.com/shorts/DZ57D608fiM (two tortoises helping a third) this has a terrible voiceover but you get the idea

They're just questions

> You're absolutely right

Bot detected


But crucially they used "--" and not "—" which means they're safe. Unless it's learning. I may still be peeved that my beloved em dash has been tainted. :(

Of course they'll learn. LLM bots have been spotted on HN using that hipster all lower case style of writing.

i can write like this if i want. or if i were a clever ai bot.


No need to be clever, just add the instruction to write in that way.

I think that's the joke.

I found the key insight -- when a human tries to sound like an LLM, that's perceived by other humans as humor.

The issue is clear

Not sarcasm. Not cynicism. Just pure humor.

Oh my God, this is peak GPT.

Never admit when someone else is right. They'll forget they were right and begin to think they won a fight.

Or something. You're right.


Couldn't one rebut that Arduino is plug-and-play without getting your hands dirty in lower-level electronics?

The article addresses this by making the point that prototypes != production. Arduino is great for prototyping (the author's opinion; I have limited experience) but not for production-level manufacturing.

LLMs are effectively (from this article's pov) the "Arduino of coding" but due to their nature, are being misunderstood/misrepresented as production-grade code printers when really they're just glorified MVP factories.

They don't have to be used this way (I use LLMs daily to generate a ton of code, but I do it as a guided, not autonomous process which yields wildly different results than a "vibed" approach), but they are because that's the extent of most people's ability (or desire) to understand them/their role/their future beyond the consensus and hype.


I think even calling them MVP factories is a bit much. They're demo factories. Minimum Viable Products shouldn't have glaring security vulnerabilities and blatant inefficiency; they might just be missing nice-to-have features.

No, because it isn’t. A plug-and-play Arduino is useless without some level of circuit-building expertise.

Yep. Increases output but reduces understanding.

It doesn't have to reduce understanding. It completely depends on how you use it. See, for example, this study

https://arxiv.org/html/2601.20245v2


Published by Anthropic. It's a bit like a "study" by Coca-Cola "proving" that one can lose weight by just drinking their product rather than exercising. Sure, it's not impossible, but that's definitely not the normal usage.

I agree some skepticism is warranted. However I think we need to avoid essentialist thinking about what effects AI usage will have on a person.

> that's definitely not the normal usage

The way I look at it, AI use -- proper AI use -- is something that needs to be taught / learned. It's not unlike other "computer literacy" skills. There are ways of using it more or less effectively to achieve your goals.


HA, hey, before you code I hope you roll your own silicon, because otherwise it's just shortcuts.

This is such high minded bullshit.


I might be tilting at a strawman of your definition of vibe coding - apologies in advance if so.

But LLM-aided development is helping me get my hands dirty.

Last weekend, I encountered a bug in my Minecraft server. I run a small modded server for my kids and me to play on, and a contraption I was designing was doing something odd.

I pulled down the mod's codebase, the fabric-api codebase (one of the big modding APIs), and within an hour or so, I had diagnosed the bug and fixed it. Claude was essential in making this possible. Could I have potentially found the bug myself and fixed it? Almost certainly. Would I have bothered? Of course not. I'd have stuck a hopper between the mod block and the chest and just hacked it, and kept playing.

But, in the process of making this fix, and submitting the PR to fabric, I learned things that might make the next diagnosis or tweak that much easier.

Of course it took human judgment to find the bug, characterize it, test it in-game. And look! My first commit (basically fully written by Claude) took the wrong approach! [1]

Through the review process I learned that calling `toStack` wasn't the right approach, and that we should just add a `getMaxStackSize` to `ItemVariantImpl`. I got to read more of the codebase, I took the feedback on board, made a better commit (again, with Claude), and got the PR approved. [2]

They just merged the commit yesterday. Code that I wrote (or asked to have written, if we want to be picky) will end up on thousands of machines. Users will not encounter this issue. The Fabric team got a free bugfix. I learned things.

Now, again - is this a strawman of your point? Probably a little. It's not "vibe coding going straight to production." Review and discernment intervened to polish the commit, expertise of the Fabric devs was needed. Sending the original commit straight to "production" would have been less than ideal. (arguably better than leaving the bug unfixed, though!)

But having an LLM help doesn't have to mean that less understanding and instinct is built up. For this case, and for many other small things I've done, it just removed friction and schlep work that would otherwise have kept me from doing something useful.

This is, in my opinion, a very good thing!

[1]: https://github.com/FabricMC/fabric-api/pull/5220/changes/3e3...

[2]: https://github.com/FabricMC/fabric-api/pull/5220/changes


Simple != easy

A good point. I have a feeling once the dust settles most new jobs will be some form of teleoperation.

Excited to give this a try, looks really well done.

An OIDC-type solution, but for parents, might work here.

Basically, kids can sign up for an account, triggering a notification to parents. The parent either approves or rejects the sign-in. Parents can revoke on demand and see kids' login usage across various apps/services. That gets parental restrictions into the login flow without making it a PITA.
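A rough sketch of that approval flow (pure illustration; every identifier here is made up, and no real OIDC library is involved):

```python
from dataclasses import dataclass, field

@dataclass
class ParentGate:
    """Hypothetical broker sitting between a kid's signup and the app."""
    pending: dict = field(default_factory=dict)   # request_id -> child
    approved: set = field(default_factory=set)    # children with active access

    def request_signup(self, child: str) -> str:
        # Child attempts signup; parent would be notified out-of-band.
        req = f"req-{len(self.pending) + 1}"
        self.pending[req] = child
        return req

    def parent_decides(self, req: str, approve: bool) -> None:
        # Parent approves or rejects the pending request.
        child = self.pending.pop(req)
        if approve:
            self.approved.add(child)

    def revoke(self, child: str) -> None:
        # Parent can revoke access on demand at any time.
        self.approved.discard(child)

gate = ParentGate()
r = gate.request_signup("kid1")
gate.parent_decides(r, approve=True)
assert "kid1" in gate.approved
gate.revoke("kid1")
assert "kid1" not in gate.approved
```

A real version would hang these hooks off the identity provider's consent step rather than a standalone object, but the shape of the flow is the same.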


I think it has very little to do with the assistant factor and more to do with the loneliness factor (at least in the West, people are getting lonelier, not less lonely). In other words: sell it to them as a friendly companion/assistant, playing on emotions, while creating a sea of surveillance drones you can license back to the powers that be at a premium.

It's a hell of a mousetrap.

Starts playing Somewhere Over the Rainbow.

