Hacker News | karolist's comments

you must be new here


The post reads like it was written by someone who has read a lot about AI rather than actually tried to build a startup with the AI they advocate so strongly. I'm still bound by system design, UX, pricing, and feature decisions; if not by the speed of code output, then certainly by review time. Yes, iterating is faster, but we're nowhere near agentic AI loops spitting out working products. Technically it's possible, but then you've just spent that time planning and writing the spec up front, which you'd otherwise interleave with dev time. If the product is a simple CRUD database skin, then yeah, the chances of success are lower, I think, but that isn't the type of startup the post seems to be writing about.


It's gotten to the point where, when I see yet another Thought Leadership article about software development, I search the page for the word "will". If I see unqualified predictions of the future (AI will change this and Agents will do that and developers will need to do thus), I think I can safely ignore the article. Who has the hubris to make such strong and unwavering statements about a future nobody can see?


A local "thought leader" wrote a ridiculous piece about software leadership in a post-LLM world, when clearly they'd never actually used an LLM to build anything, or actually led a team using LLMs. Lots of hand-waving but obviously no real experience.

As gp says, there's a big difference between theory and practice here, and a lot of the things we needed when we weren't using LLMs are still needed when we are, but it takes a bit of actual practice to work this out. It's still not at the stage where an Ideas Guy can make a real working product without someone on the team actually knowing how to develop software.

At least in my experience, so far. But the world is changing fast.


The last "Thought Leader" worth listening to was put to death by Athens.

Some that came after might be worthy of the title, but those who claim it for themselves aren't.


> Who has the hubris to make such strong and unwavering statements about a future nobody can see?

LinkedIn influencers


How come all the talking heads telling us AI has made whatever we've learned or built obsolete never point the lens back at themselves? I personally think knowledge from entrepreneurs who learned their lessons in previous decades is valuable, but if what they're spouting is true, it doesn't make any sense to listen to them; i.e., these thought leaders are desperate to tell you of their own irrelevance. Same thing for every CTO who tells me AI replaces developers. That's a stretch, but if we could do that, don't you think it would be trivial to replace your job?


Someone is still needed to decide what the AI should do, and to harness and manage it. A CTO can do that with existing skills, and the resulting organization should converge quickly on near-zero need or want for human SWEs.

So goes the thinking, anyway. It's why my couple decades of experience and I still occasionally get to hear from rando cold recruiters desperate to sell someone a "pivot to AI," probably thinking they can lowball me by holding my mortgage over my head in order to screw three times the work out of me that they'd pay for.

I was in this business too long.


Have you ever sweated bullets at 3 a.m. while Claude spins in circles, unable to fix production without breaking five other things?

You will!


Yeah… also, it's just weird. Interfaces are important; they contain information and affordances. Not everything should become a chatbot.


Interfaces will definitely evolve. Visual graphs and representations are useful, but speaking to an agent will become mainstream, as it's faster than typing. The ability for agents to code on the fly will also open up different interfaces. For instance, you could say "show me the impact of our marketing campaign X over this time period" and out come graphs that were coded on the spot. Drawing might even make a comeback: when designing a website, you just cross out things you don't want, draw boxes where you want things, and talk at the same time, saying what you want in each box. Some people are even using virtual reality. Not everything will become a chatbot, but interfaces are definitely going to evolve, with chat being an integral part of them.


You're describing some Hollywood version of SF, not the real world. Speaking is not faster than pressing a key or turning a knob (try operating a CAD program or a DAW without a keyboard and mouse). And for most reports and infographics, you mostly need to design a few dashboards and then almost never change them, because those are the core metrics you need to monitor. And the ability to sketch even a simple wireframe relies on a lot of knowledge that most people don't want to burden themselves with.


In the real world a verbal description to start off a design from a template is _very much_ a competitive advantage.

I dabble in music production, and having a DAW that could help guide some parts of the process would be extremely useful for getting me out of certain creative ruts.


Yeah, it's crazy to think an opaque chatbot will be preferable to a well designed UI for most users. People don't like badly designed UIs, but I'm pretty sure most people under 40 prefer a well designed UI to a customer service agent. We call customer service because the website doesn't do what we want, not because we don't want to use the website.


I usually find chatbot interfaces completely infuriating. I know the UX paradigm of click-reduction went the way of the dodo (and rightfully so, because it was based on bullshit research), but I think it's funny that completely removing the user's agency and visibility into any process, and turning 3-click processes into 200-keystroke processes, is the hot shit right now.


I'm glad this is the top comment. I'm ambivalent about a bunch of writing I've seen from Steve Blank - some of his stuff I've loved and some I thought was awful.

But this I just thought was vacuous. I agree with what you wrote, but more to the point, I didn't find any real advice about how a startup should actually change that passed my sniff test. I left the tech startup world about 2 years ago myself, and I'm glad I did, because I just think there are way fewer differentiable opportunities now. That is, even if I accept what Blank says is true, what are all these 2+ year old startups supposed to do - just create some model wrapper/RAG chatbot product like the million other startups out there?

Even in defense, like the article says, there are now a bajillion drone companies, and it looks like a race to the bottom. The most successful plan at this point just looks like the grifter plan, e.g. getting the current president to tweet out your stock ticker.

I'm honestly curious what folks think are good startup business plans these days. Even startups that looked like they were "knock it out of the park" successes, like Cursor and Lovable, seem to have no moat to me. I see very few startups (particularly the "We're AI for X!" ones that got a ton of funding in the past two years) with defensible positions.


Are you familiar with Steve Blank? What you’re describing really isn’t his MO at all.


I have a lot of respect for Steve Blank, but my heuristic by now is to ignore any breathless posts that state “teams are doing X with AI, if you are not doing the same you’re behind”.

The much more useful posts are “my team and I are doing X with AI”. Of course, the challenge there is that the ones who are truly getting a competitive edge through AI are usually going to be too busy building to blog about it.


I really enjoyed reading his articles a while back. He wrote an article about Silicon Valley's roots in microchip development. I can't remember the details, but I reached out to point out how important Autonetics, which was based in Los Angeles, was to chip development. From my perspective, what made Silicon Valley significant was its connection to Wall Street. I wanted to engage on the idea that venture capital might be the real product of Silicon Valley.

He could have ignored the email or engaged on the topic I introduced. Instead he sent me a wikilink to Autonetics. I was left with the feeling that he had no real interest in the topic he wrote about. It was really no big deal. He is a busy guy and doesn't need to engage with strangers. I never read anything by him again because I was left with the feeling he is just phoning these posts in.


I'm not, but this is not a great introduction. It's handwavy and makes the assumption that AI dev tools are much farther along than they are. I have seen this a lot lately; the farther up the management chain and farther away from putting hands on code, the more confident people seem to be in the power of AI tools.

For big, complex real-world problems and big, complex real-world codebases, the AIs are helpful but not yet earth-shattering. And that helpfulness seems to have plateaued as of late.

I am extremely skeptical of posts like this.


I will take a lot more hand-waving from the 70-something-year-old Stanford professor who co-created the far-up-the-chain management paradigms that run a good chunk of the economy. That context kinda changes things, but what do I know.


Based on his own arguments a 70-something Stanford prof has no more knowledge, experience or credibility than someone who started 18 months ago.

These guys don't get to have it both ways.



That thing he created says you should take your assumptions out into the real world and validate them, ya?

So hand-waving about how easy it is to have an MVP in days, without actual experience in doing that, seems ironic.

Now, maybe he's saying this based on companies he's funded that have had great success with this approach. But it's curious that the only concrete example of a company mentioned is one that's six years old and not operating like that. And in fact, many of the ways he thinks that company went wrong seem completely unrelated to AI?

> Chris is now starting to raise his first large fundraising round. In looking at his investor deck I realized that while he’s been heads down, the world has changed around him – by a lot. The software moat he built with his 5-year investment in autonomy development is looking less unique every day. Autonomous drones and ground vehicles in Ukraine have spawned 10s, if not 100s, of companies with larger, better funded development teams working on the same problem.

> While Chris has been fighting for adoption for this niche market (one that is ripe for disruption, but the incumbents still control), the market for autonomy in an adjacent market – defense – has boomed. In the last five years VC Investment in defense startups has gone from zero to $20 billion/year. His product would be perfect for contested logistics and medical evacuation. But he had literally no clue these opportunities in the defense market had occurred.

> While there’s still a business to be had (Chris’s team has done amazing system integration with an existing airborne platform that makes his solution different from most), – it’s not the business he started.

"Being heads down without paying enough attention to the market for 6 (!!) years" doesn't seem like an AI-caused issue.

Meanwhile, the core suggestion doesn't seem to fix that; it seems almost completely perpendicular.

> You can now test multiple versions of the same business at once (or simultaneously be testing different businesses). While you can be simultaneously testing five pricing models, ten messages or twenty UX flows, the “user interface” may no longer be a screen at all. Testing might be to find prompt(s) to AI Agent(s) deliver needed outcomes.

Ok, but this person didn't even seem to be paying enough attention to the market for the one version he already had?

And while this claim about parallel development being a huge unlock is the most interesting thing here, it also sounds a bit glib. Getting your foot in the door is the hardest thing early on, and now you're trying to run six versions of your company at once? Each time you get a foot in the door sales-wise, are you trying to make them use all 6 versions, or are you only gonna get feedback on 1? Would you want to pay money to be a beta tester of 6 different products simultaneously, with reason to believe that 5 of them will probably evaporate overnight?


So you’re saying he’s majorly complicit in the ultracapitalist dystopia the US has turned into?


> the ultracapitalist dystopia the US has turned into

Seriously, where do ideas like this come from? An "ultracapitalist" country that has about as much redistributive social spending as other developed economies[0]? A "dystopia" that millions of people from all over the world clamor to get into every year?

[0]: https://www.piie.com/publications/policy-briefs/2016/true-le...


This is disingenuous at best.

The man is an economist, not a crony operating at the federal level (one does not imply the other, and I know nothing of the man's background).


So in other words... he doesn't actually use the tools he's firmly convinced will automate the building of software.

I don't agree with the parent; I think capitalism is doing a lot of great things for us and will continue to, even with AI. But man I'm tired of these hot takes from people with limited practical experience.


> But man I'm tired of these hot takes from people with limited practical experience.

I hear you there. There's a reason I limit what I say online. I know very little about very little.


Steve Blank the startup whisperer and Steven Blank the economist are two very different people.


I think the AI backlash is strong enough that "AI-Free" might be a powerful marketing tool, whether that is fair or not.


Another option I like: organic.


I like: hand crafted.


I like: rawdogged.


"hand made" still seems applicable, too


You are assuming a linear future while we are in an exponential.

One year ago models could barely write a working function.


GPT-4o is 23 months old.

One year ago, the models were only slightly less competent than today. There were models writing entire apps 3 years ago. Competent function writing has basically been a given in every model since GPT-3.

Much of the progress in the past year has been around the harnesses, MCPs, and skills. The models themselves are not getting better exponentially; if anything, progress has slowed significantly since the 2023-2024 releases.


> One year ago, the models were only slightly less competent than today.

That has not been my experience. This weekend I pointed Claude Code + Opus 4.6 + effort=max at a PRD describing Docusign-like software: the exact same document I gave to Claude Code + Opus 4.5 + Ultrathink around 6 months ago.

The touch-ups I needed after it completed the implementation were around a tenth of what 4.5 required. It is a pretty startling difference.


Agree with this. Opus 4.6 thinks of things I didn't even put in the spec, but absolutely need. It thinks around all the edge cases and gotchas. And I love the way modern AI UIs stop in their tracks and have you answer a bunch of questions about all the ambiguities you left in the spec.

They still do dumb shit from time to time, but it's getting rarer.


Yeah I've been able to get great Python functions out of everything since the ChatGPT 4 API in early-to-mid 2023.

It takes far less manual prompting to make it have consistent output, work well with other languages, etc. But if you watch the "thinking" logs it looks an awful lot like the "prompt engineering" you'd do by hand back then. And the output for tricky cases still sometimes goes sideways in obviously-naive-ways. The most telling thing in my experience is all the grepping, looping, refining - it's not "I loaded all twenty of these files into context and have such a perfect understanding of every line's place in the big picture that I can suggest a perfect-the-first-time maximally-elegant modification." It's targeted and tactical. Getting really good at its tactics for that stuff, though!

I can get more done now than a year ago because taking me out of the annoying part of that loop is very helpful.

But there's still a very curious gap that the tool that can quickly and easily recognize certain type of bugs if you ask them directly will also happily spit out those sorts of bugs while writing the code. "Making up fake functions" doesn't make it to the user much anymore, but "not going to be robust in production but technically satisfies the prompt" still does, despite it "knowing better" when you ask it about the code five seconds later.


One year before 1969 we had never been to the moon. In the 70s, credible scientists and physicists predicted that large Martian colonies would exist before the year 2000.

If a metric goes from 0 to 2 it doesn't mean it's on a long-lived exponential trajectory.


Even if there is a growth pattern, that doesn't say how long it will continue. Some things grow for a while and then hit a ceiling. Sometimes they are a fad that dies (when was the last time you bought a pet rock?), and sometimes everyone ends up owning one, so you get a few replacement sales but no growth (you have a washing machine and won't buy another until the old one wears out).



[flagged]


What if I told you... that using 1960s technology it would be easier to just go to the moon than it would be to fake it? Have you SEEN a 1960s movie with SFX?


The conspiracy theorists like to involve Kubrick, because he did such a good job on 2001: A Space Odyssey (1968).


> One year ago models could barely write a working function.

This is a false claim.

Claude Code was released over a year ago.

Models have improved a lot recently, but if you think 12 months ago they could barely write a working function you are mistaken.


Sigmoids look a lot like exponentials early on.

We can’t say for sure yet which trajectory we are on.
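
To make that concrete, a minimal sketch of why the two are indistinguishable early on (a generic logistic curve, not a claim about any particular AI metric):

    f(t) = \frac{L}{1 + e^{-k(t - t_0)}}         % logistic with ceiling L
    t \ll t_0 \implies e^{-k(t - t_0)} \gg 1
    \implies f(t) \approx L \, e^{k(t - t_0)}    % pure exponential growth

The ceiling L only becomes visible as you approach the midpoint t_0, which is exactly why early data can't settle the question.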


This comment is getting punished for the incorrect timeline (I would know, I've been harping on about AI getting good at coding for ~2 years now!) but I do think it is directionally correct. Just over 3 years ago, (publicly available) AI could not write code at all. Today it can write whole modules and project scaffoldings and even entire apps, not to mention all the other stuff agents can do today. Considering I didn't think I'd see this kind of stuff in my lifetime, this is a blink of an eye.

Even if a lot of the improvements we see today are due to things outside the models themselves -- tools, harnesses, agents, skills, availability of compute, better understanding of how to use AI, etc. -- things are changing very quickly overall. It would be a mistake to just focus on one or two things, like models or benchmarks, and ignore everything else that is changing in the ecosystem.


I agree it's directionally correct, but only in the ways that don't matter to this discussion. If 2026->2029 AI is as much of an improvement as 2023->2026 AI, is anything we learn about how to leverage it in 2026 going to stay relevant?


Seems extremely disingenuous to say that one year ago models could barely write a working function. In fact, there were plenty capable of writing a working function with the right context fed in, exactly as today.


can you link to that gist? I'd be interested to read through it


I forked pi-mono to freeze it.

Here is a session of pi analyzing the coding-agent package itself.

https://ontouchstart.github.io/.pi/agent/sessions/--pi-mono-...

It was created via a non-interactive CLI command in a docker container connected to a local llama.cpp server with a very limited model. $0 token cost.



My reading of that isn't that the harness matters so much as the overall platform environment that agents operate in and the approach taken by the team.

    > Before Blitzy starts any work on code generation, the platform launches collaborative agents to deeply analyze the repository – mapping dependencies, understanding conventions, and capturing domain logic. This documentation process can take hours or days. When prompted to add a feature, refactor code or fix bugs, Blitzy replies with a highly detailed technical specification.
The same approach could be taken with any harness with a skill to perform this step first before starting work.
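
As a sketch of what that could look like (a hypothetical SKILL.md, not Blitzy's actual implementation):

    ---
    name: repo-analysis
    description: Map dependencies, conventions, and domain logic before any code generation
    ---
    Before writing or changing any code:
    1. Map the repository's dependency graph and module boundaries.
    2. Document the naming and architectural conventions you find.
    3. Capture the domain logic in a detailed technical specification.
    4. Present the spec for review before touching any files.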



What exactly are you pointing out? I read the link and the linked thread and it's not clear what position is being presented.

I don't see evidence that the harness -- rather than the approach to information indexing and agent tooling -- makes much of a difference.

You can make a case "this harness bakes X in" (or in the case of pi "this harness bakes nothing in; you choose your own adventure"), but at the end of the day, skills are just markdown files and CLIs and shell scripts can be used by any harness; they are portable. CC allows override of the system prompt[0] and I would guess most harnesses have similar facilities. I don't see how the harness is going to be the bigger impact versus the configured tooling (skills, scripts, plugins).

The extraordinary claim here is that if I configured pi and CC, Codex, etc. with the same system prompt, same tools, and same skills, pi would outperform CC and Codex. That's what it means to say the harness matters. That just doesn't seem right; rather, it's the configuration of tools, skills, and default prompt that matters.

[0] https://code.claude.com/docs/en/cli-reference#system-prompt-...
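
To make that concrete, a minimal sketch (assuming the system-prompt override flag from the linked CLI reference; the prompt file name is hypothetical):

    # hand Claude Code the same system prompt you'd give any other harness
    claude --system-prompt "$(cat shared-system-prompt.md)"

The same file could then be wired into pi, Codex, etc. through whatever equivalent override they expose, at which point any remaining performance gap would be down to the harness itself.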


My point is pi-coding-agent [1] is a very well designed and implemented open source project that we all can learn from as software engineers. His blog post about his decision making [2] is also very well written.

I should've given original links instead of noisy HN threads.

[1] https://github.com/badlogic/pi-mono/blob/main/packages/codin...

[2] https://mariozechner.at/posts/2025-11-30-pi-coding-agent/

pi-coding-agent itself is not a product yet, and it won't have much value to end users other than those using the products built on top of it.


You can have an opinion about a tool as a user without ever having the ability to create such a tool yourself; that's literally what every tech and auto reviewer does.


Sure, and the less you understand about the tool’s fundamental capabilities, the less useful your opinion is. The best reviewers have deep knowledge.


You can use this logic to say all products are perfect and any criticisms of them by users are moot because their creator knows them best.


This, at best, reads like bullet-point talking points fed to a prompt; given an output length restriction, it's been padded to fit the space, diluting the message to the point that only an LLM could follow it.


Ditto; I don't see myself upgrading in the near future. The 64GB M1 Max I paid $2,499 for at the end of 2023 still feels like a new machine; nothing I do can slow it down. Apple kept the OS updated for around 6 years in the Intel era, and I don't see how they can drop support for this one, tbh. I'm still paying for AppleCare since I depend on it so much.


To replace Kubernetes, you inevitably have to reinvent Kubernetes. By the time you build in canaries, blue/green deployments, and rolling updates with precise availability controls, you've just built a bespoke version of k8s. I'll take the industry standard over a homegrown orchestration tool any day.
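
For a sense of what you get for free: the rolling-update availability controls are a few lines in a stock Deployment (a minimal sketch; names and numbers are placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 10
      selector:
        matchLabels: {app: my-app}
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 1   # never fewer than 9 pods serving traffic
          maxSurge: 2         # at most 12 pods exist mid-rollout
      template:
        metadata:
          labels: {app: my-app}
        spec:
          containers:
          - name: app
            image: my-app:v2  # bump the tag to trigger a rolling update

Canaries and blue/green take extra machinery on top (e.g. Argo Rollouts), which is exactly the part a homegrown tool ends up rebuilding.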


We've used ECS back when we were on AWS, and now GCE.

We didn't have to invent any homegrown orchestration tool. Our infra is hundreds of VMs across 4 regions.

Can you give an example of what you needed to do?


Really? What deploys your code now? I'm an SRE; walk me through it at a high level. How do I roll back?


It used to be Google Deployment Manager, but that's going away soon, so now it's Terraform.

To roll back, you tell GCE to use the previous image; it handles all the rollover for you.

Our deployment process looks like this:

- Jenkins: build the code into Debian packages hosted on JFrog

- Jenkins: build a machine image with Ansible and Packer

- Jenkins: deploy the new image to either test or prod.

Test deployments create a new Instance Group that isn't automatically attached to any load balancer. You attach it manually once you've confirmed everything has started OK.
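
For the rollback step above, the command is roughly (a sketch; group and template names are hypothetical):

    # point the managed instance group back at the previous template;
    # GCE replaces instances gradually within the surge/unavailable limits
    gcloud compute instance-groups managed rolling-action start-update my-mig \
      --version=template=my-template-v41 \
      --zone=us-central1-a \
      --max-unavailable=0 --max-surge=3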


ECS deployments. Automatically rolls back on failure. Not sexy but it works reliably.
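
For what it's worth, the auto-rollback there is the deployment circuit breaker; a minimal sketch (cluster, service, and task names are hypothetical):

    # roll out a new task definition; ECS reverts to the last healthy
    # deployment automatically if the new one fails to stabilize
    aws ecs update-service \
      --cluster my-cluster --service my-service \
      --task-definition my-task:42 \
      --deployment-configuration \
      "deploymentCircuitBreaker={enable=true,rollback=true}"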


I've moved the SaaS I'm developing to SeaweedFS, and it was rather painless to do. I should also move away from the minio-go SDK to the generic AWS one some day. No hard feelings toward the MinIO team from my side, though.
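
For anyone planning the same SDK move: the generic client mostly just needs its endpoint pointed at the SeaweedFS S3 gateway. A minimal sketch with aws-sdk-go-v2 (the endpoint is an assumption; 8333 is SeaweedFS's default S3 port):

    package main

    import (
        "context"
        "log"

        "github.com/aws/aws-sdk-go-v2/aws"
        "github.com/aws/aws-sdk-go-v2/config"
        "github.com/aws/aws-sdk-go-v2/service/s3"
    )

    func main() {
        cfg, err := config.LoadDefaultConfig(context.TODO())
        if err != nil {
            log.Fatal(err)
        }
        client := s3.NewFromConfig(cfg, func(o *s3.Options) {
            // point the generic client at the SeaweedFS S3 gateway
            o.BaseEndpoint = aws.String("http://localhost:8333")
            // SeaweedFS expects path-style bucket addressing
            o.UsePathStyle = true
        })
        out, err := client.ListBuckets(context.TODO(), &s3.ListBucketsInput{})
        if err != nil {
            log.Fatal(err)
        }
        for _, b := range out.Buckets {
            log.Println(*b.Name)
        }
    }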


The number of times in the past year that a competitor's benchmarks claimed something was close to Claude and it turned out to be even remotely close in practice: 0.


I honestly feel like people are brainwashed by Anthropic propaganda when it comes to Claude. I think Codex is just way better, and Kimi 2.5 (and, I think, GLM 5 now) are perfectly fine as Claude replacements.


So much money is on the line for the US hyperscalers that they probably pay for "pushes" on social media. Maybe Chinese companies are doing the same.


I would say that's more certain than just a "probably". I would bet that some of the ridiculous fear-mongering about language models trying to escape their servers, blackmail their developers, or spontaneously participate in a social network is actually clandestine marketing. The technology is certainly amazing and very useful, but I don't think any of these Terminator stories were boosted by the algorithms on their own.


> I think codex is just way better

Codex was super slow till 5.2 codex. Claude models were noticeably faster.


Was this text run through an LLM before posting? I recognize that writing style, honestly. Or did we simply speak to machines enough that we now speak like machines?


Yes, this is absolutely ChatGPT-speak. I see it everywhere now; it's inescapable. At least this one appears to be largely human-authored and to have some substance, which is generally not the case when I see these LLM-isms.

