
There's such a huge disconnect between people reading headlines and developers who are actually trying to use AI day to day in good faith. We know what it is good at and what it's not.

It's incredibly far away from making any significant change in a mature codebase. In fact, I've become so bearish on the technology after trying to use it for this that I think there's going to have to be some other breakthrough, something other than LLMs. It just doesn't feel right around the corner. Completing small chunks of mundane code, explaining code, doing very small mundane changes? Very good at that.



I think that LLMs are only going to make people with real tech/programming skills much more in demand, as younger programmers skip straight into prompt engineering and never develop themselves technically beyond the bare minimum needed to glue things together.

The gap between people with deep, hands-on experience who understand how a computer works and prompt engineers will become so insanely wide.

Somebody needs to write that operating system the LLM runs on. Or your bank's backend system that securely stores your money. Or the mission-critical systems powering the airplane you're flying on next week... pretending that this will all be handled by LLMs is so insanely out of touch with reality.


I think we who are already in tech have this gleeful fantasy that new tools impair newcomers in a way that will somehow serve us, the incumbents.

But in reality pretty much anyone who enters software starts off cutting corners just to build things instead of working their way up from nand gates. And then they backfill their knowledge over time.

My first serious foray into software wasn't even Ruby. It was Ruby on Rails. I built some popular services without knowing how anything worked. There was always a gem (lib) for it. And Rails especially insulated the workings of anything.

An S3 avatar upload system was `gem install carrierwave` and then `mount_uploader :avatar, AvatarUploader`. It added an avatar <input type="file"> control to the User form.
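
For contrast, here's a rough sketch of the kind of thing that one-liner was insulating me from - written in Python with boto3 purely for illustration (the bucket name, key layout, and validation rules are all made up, and CarrierWave's internals look nothing like this):

    # Roughly what an avatar-upload helper has to do under the hood.
    # Illustrative only: bucket name, key layout, and limits are invented.
    import uuid
    import boto3

    ALLOWED_TYPES = {"image/png", "image/jpeg"}
    MAX_BYTES = 2 * 1024 * 1024  # arbitrary 2 MB cap to limit abuse

    def upload_avatar(user_id: int, fileobj, content_type: str) -> str:
        """Validate the image and store it in S3, returning its object key."""
        if content_type not in ALLOWED_TYPES:
            raise ValueError("unsupported image type")
        data = fileobj.read(MAX_BYTES + 1)
        if len(data) > MAX_BYTES:
            raise ValueError("file too large")

        key = f"avatars/{user_id}/{uuid.uuid4()}"
        boto3.client("s3").put_object(
            Bucket="my-app-avatars",  # hypothetical bucket
            Key=key,
            Body=data,
            ContentType=content_type,
        )
        return key

The gem did all of this (plus the form wiring) for me before I had any idea it needed doing.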

But it's not satisfying to stay at that level of ignorance very long, especially once you've built a few things, and you keep learning new things. And you keep wanting to build different things.

Why wouldn't this be the case for people using LLMs, like it was for everyone else?

It's like presuming that StackOverflow will keep you as a question-asker your whole life when nobody here would relate to that. You get better, you learn more, and you become the question-answerer. And one day you sheepishly look at your question history in amazement at how far you've come.


> Why wouldn't this be the case for people using LLMs, like it was for everyone else?

I feel like it's a bit different this time because LLMs aren't just an abstraction.

To make an analogy: Ruby on Rails serves a role similar to highways—it's a quick path to get where you're going, but once you learn the major highways in a metro area you can very easily break out and explore and learn the surface streets.

LLMs are a GPS, not a highway. They tell you what to do and where to go, and if you follow them blindly you will not learn the layout of the city, you'll just learn how to use the GPS. I find myself unable to navigate a city by myself until I consciously force myself off of Google Maps, and I don't find that having used GPS directions gives me a leg up in understanding the city—I'm starting from scratch no matter how many GPS-assisted trips I've taken.

I think the analogy helps both in that the weaknesses in LLM coding are similar and also that it's not the end of the world. I don't need to know how to navigate most cities by memory, so most of the time Google Maps is exactly what I need. But I need to recognize that leaning on it too much for cities that I really do benefit from knowing by heart is a problem, and intentionally force myself to do it the old-fashioned way in those cases.


I think also a critical weakness is that LLMs are trained on the code people write ... and our code doesn't annotate what was written by a human and what was suggested by a tool. In your analogy, this would be like if your sat nav system suggests that you turn right where other people have turned right ... because they were directed to turn by their sat nav.


In fact, I'm pretty sure this already happens and the results are exactly what you'd expect. Some of the "alternate routes" Google Maps has suggested for me in the past are almost certainly due to other people making unscheduled detours for gas or whatever, and the algorithm thinks "oh this random loop on a side street is popular, let's suggest it". And then anyone silly enough to follow the suggestion just adds more signal to the noise.


Google Maps has some strange feedback loops. I frequently drive across the Bay Bridge to Delaware beaches. There are 2 or 3 roughly equal routes with everyone going to the same destination. Google will find a "shorter" route every 5 minutes. Naturally, Maps is smart enough to detect traffic, but not smart enough to equally distribute users to prevent it. It creates a traffic jam on route A, then tells all the users to use route B which causes a jam there, and so on.
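
The feedback loop is easy to reproduce in a toy model: if every driver who checks the app is sent down whichever route currently looks fastest, the system oscillates instead of settling. A minimal sketch in Python (the travel-time constants and re-route interval are invented; this is not how Maps actually works):

    # Toy model of the "send everyone to the currently fastest route" loop.
    # Constants are made up purely to show the oscillation.
    def travel_time(base_minutes: float, cars: int) -> float:
        return base_minutes + 0.02 * cars   # congestion penalty per car

    routes = {"A": 60.0, "B": 62.0}   # base travel times in minutes
    load = {"A": 0, "B": 0}

    for minute in range(0, 30, 5):    # re-route every 5 minutes
        times = {r: travel_time(base, load[r]) for r, base in routes.items()}
        best = min(times, key=times.get)
        load[best] += 300             # this window's drivers all pile onto `best`
        print(f"t={minute:2d}m  send everyone to {best}  "
              f"A={times['A']:.0f}min  B={times['B']:.0f}min")

Route A gets recommended, jams up, then B gets recommended and jams up, and so on - exactly the ping-pong described above.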


It hadn't even occurred to me that there are places where enough people are using Google Maps while driving to cause significant impact on traffic patterns. Being car-free (and smartphone-free) really gives a different perspective.


Are you Theodore Kaczynski's ghost? :)

Seriously, what job do you do that allows you to not have a smartphone?


Not OP: I have a smartphone for my own personal use but don't use it for work at all. If my employer wants me to use specific phone apps they can provide one for me like they do a laptop.


Yeah, so you've got a smartphone; the dude was saying he doesn't have a smartphone.

Having no smartphone makes it a real pain in the ass to do most things nowadays; work is one of the biggest problems, but it's far from the only one. To even connect to my bank account on my computer I need a phone.


Cell phones exist which are not smartphones, and everyone who uses phones for 2FA is happy to send 2FA codes to a "dumb" phone. They only have your phone number, after all.


Well, my bank doesn't give me the option of using SMS codes. It's mandatory to use the app on a verified phone and enter a PIN.


This is also problematic in cases where navigation apps are not updated and drivers start taking routes they are no longer authorized to take.


Long before any use of LLMs, OsmAnd would direct you, if you were driving past Palo Alto, to take a congested offramp to the onramp that faced it across the street. There is no earthly reason to do that; just staying on the freeway is faster and safer.

So it's not obvious to me that patently crazy directions must come from watching people's behavior. Something else is going on.


In Australia the routes seem to be overly influenced by truck drivers, at least out of the cities. Maps will recommend you take some odd town bypass when just going down Main Street is easier.

I imagine what you saw is some other frequent road users making choices that get ranked higher.


> if you were driving past Palo Alto, to take a congested offramp to the onramp that faced it across the street

If you're talking about that left turn into Alma with the long wait instead of going into the Stanford roundabout and then the overpass, it still does that.


I've seen this type of thing with OsmAnd too. My hypothesis is that someone messed up when drawing the map, and made the offramp an extension of the highway. But I haven't actually verified this.


OsmAnd doesn't use traffic data. You can enable traffic map layers by feeding it a reverse-engineered URL, though.


I'm not talking about use of traffic data. In the abstract, assuming you are the only person in the world who owns a car, that route would be a very bad recommendation. Safety concerns would be lower, but still, there's no reason you'd ever do that.


Safety concerns would probably actually be higher, since the most dangerous places on roads are areas where traffic crosses and conflicts (the road you cross to get from the offramp to the onramp).


An example I notice frequently on interstate highways that go through large cities is Google Maps suggesting you get off at an exit, take the access road and skip one or no exits, then get back on at the next ramp. It does it especially often during rush hour traffic. Denver is where I've noticed it the most, but it's not limited to that area by any means.


That already happens. Maps directs you to odd nonsense detours somewhat frequently now, to the point that you get better results by overriding the machine. It's going the way of web search.


> It's going the way of web search.

This is an interesting idea. There's an obvious force directing search to get worse, which is the adversarial desire of certain people to be found.

But no such force exists for directions. Why would they be getting worse?


Probably my cynicism, but it's that the more stores you drive past, the more likely you are to stop off and buy something.


Exactly! This is an amazing observation and analogy.


The problem now is that the LLM GPS will lead you to the wrong place once a day on average, and then you still need to either open the map, study where you are, and figure out the route, or refine the destination address and pray it will bring you to the correct place. Such a great analogy!


Strangely this reminds me of exactly how you would navigate in parts of India before the Internet became ubiquitous.

The steps were roughly: Ask a passerby how to get where you want to go. They will usually confidently describe the steps, even if they didn't speak your language. Cheerfully thank them and proceed to follow the directions. After a block or two, ask a new passerby. Follow their directions for a while and repeat. Never follow the instructions fully. This triangulation served to naturally filter out faulty guidance and hucksters.

Never thought that would one day remind me of programming.


Indeed. My experience in India is that people are friendly and helpful and try to help you in a very convincing way, even when they don't know the answer. Not so far off the LLM user experience.


Try asking your GPS for the Western blue line stop on the Chicago L. (There are two of them and it will randomly pick one)


What is “your GPS” meant to be here? With Google Maps and Apple Maps it consistently picks the closest one (this being within minutes of both but much closer to one), which seems reasonable. Maybe not ideal, as when either of these apps brings up a disambiguation prompt for a supermarket chain or similar, but I'm not witnessing randomness.


To be clear, above I was talking about LLMs. Randomness in real GPS usage is something I have never encountered in 15 years or so of using Google Maps. 99 percent of the time it brings/brought me exactly where I want to be, even around road works or traffic jams. It seems some people have totally different experiences, so odd.


Perhaps they have improved their heuristic for this one, though perhaps it was actually Uber/Lyft that randomly picks one when given as a destination...


I'm the kind of guy who decently likes maps, and I pay attention to where I'm going and also to the map before, during, and after using a GPS (Google Maps). I do benefit from Google Maps in learning my way around a place. It depends on how you use it. So if people use LLMs to code without trying to learn from it and just copy and paste, then yeah, they're not going to learn the skills themselves. But if they are paying attention to the answers they are getting from the LLMs, adjusting things themselves, etc., then they should be able to learn from that as well as they can from online code snippets, modulo the (however occasional) bad examples from the LLM.


> I do benefit from Google maps in learning my way around a place.

Tangent: I once got into a discussion with a friend who was surprised I had the map (on a car dashboard display) locked to North-is-up instead of relative to the car's direction of travel.

I agreed that it's less-convenient for relative turn decisions, but rationalized that setting as making it easier to learn the route's correspondence to the map, and where it passed relative to other landmarks beyond visual sight. (The issue of knowing whether the upcoming turn was left-or-right was addressed by the audio guidance portion.)


It's neat to hear that I'm not the only one who does this. It makes a night-and-day difference for me.

When the map is locked north, I'm always aware of my location within the larger area, even when driving somewhere completely new.

Without it, I could never develop any associations between what I'm seeing outside the windshield and a geospatial location unless I was already familiar with the area.


> LLMs are a GPS, not a highway.

I love these analogies and I think this one is apt.

To adapt another which I saw here to this RoR thread: if you're building furniture, then LLMs are power tools while frameworks are IKEA flat-packs.


It's the best analogy of the month. I don't think cab drivers today are the same as the cab drivers of the past who knew the city by heart.

So, it’s been a privilege gentlemen, writing apps from scratch with you.

walks off the programming Titanic with a giant violin


One small change to using a GPS radically impacts how much you know about the area -- do you use the default, always-forward view, or do you use the slightly-less-usable always-north setting? If you use the latter, you will find that you learn far more about the city layout while still benefitting from the GPS functionality itself.

I think LLMs are similar. Sure, you can vibe code and blindly accept what the LLM gives you. Or, you can engage with it as if pair programming.


> LLMs are a GPS, not a highway. They tell you what to do and where to go

It still gives you code you can inspect. There is no black box. Curious people will continue being curious.


The code you can inspect is analogous to directions on a map. Some have noted in this thread that for them directions on a map actually do help them learn the territory. I have found that they absolutely do not help me.

That's not for lack of curiosity; it seems to be something about the way I'm wired: making decisions about where to navigate helps me learn in a way that following someone else's decisions does not.


You have to study the map to learn from it. Zoom in and out on surroundings, look up unfamiliar landmarks, et cetera. If you just follow the GPS or copy paste the code no, you won’t learn.


The problem is that coders taking this approach are predominantly ones who lack the relevant skill - ones who are taking that approach because they lack that skill.


The ones that until now copied and pasted everything from Stack Overflow.


For now LLMs in coding are more like an LLM version of a GPS, not the GPS itself.

Like, imagine you'd ask an LLM for turn-by-turn directions and then follow those directions ;). That's how it feels when LLMs are used for coding.

Sometimes it's amazing; sometimes the generated code is a swamp of technical debt. Still, a decade ago it was completely impossible. And the sky is the limit.


Difference here being that you actually learned the information about Ruby on Rails, whereas the modern programmer doesn't learn anything. They are but a clipboard-like vessel that passes information from an LLM onto a text editor, rarely ever actually reading and understanding the code. And if something doesn't work, they don't debug the code, they debug the LLM for not getting it right. The actual knowledge here never gets stored in the brain, making any future learning or evolving impossible.

I've had to work with developers that are over dependent on LLM's, one didn't even know how to undo code, they had to ask an LLM to undo. Almost as if the person is a zombie or something. It's scary to witness. And as soon as you ask them to explain their rationale for the solution they came up with - dead silence. They can't, because they never actually _thought_.


> I've had to work with developers that are over dependent on LLM's, one didn't even know how to undo code, they had to ask an LLM to undo.

Some also get into a loop where they ask the LLM to rewrite what they have, and the result ends up changing in a subtle undetected way or loses comments.


Difference here being that you actually learned the information about computers, whereas the modern programmer doesn't learn anything. They are but a typist-like vessel that passes information from an architect onto a text editor, rarely ever actually reading and understanding the compiled instructions. And if something doesn't work, they don't debug the machine code, they complain about the compiler for not getting it right. The actual knowledge here never gets stored in the brain, making any future learning or evolving impossible.

I've had to work with developers that are over dependent on high-level languages. One didn't even know how to trace execution in machine code; they had to ask a debugger. Almost as if the person is a zombie or something. It's scary to witness. And as soon as you ask them to explain their memory segmentation strategy - dead silence. They can't, because they never actually _thought_.


No, it really isn't at all comparable like that (and other discussion in the thread makes it clear why). Users of high-level languages clearly still do write code in those languages, that comes out of their own thought rather than e.g. the GoF patterns book. They don't just complain about compilers; they actually do debug the high-level code, based on the compiler's error messages (or, more commonly, runtime results). When people get their code from LLMs, however, you can see very often that they have no idea how to proceed when the code is wrong.

Debugging is a skill anyone can learn, which applies broadly. But some people just don't. People who want correct code to be written for them are fundamentally asking something different than people who want writing correct code to be easier.


Abstractions on top of abstractions on top of turtles...

It'll be interesting to see what kinds of new tools come out of this AI boom. I think we're still figuring out what the new abstraction tier is going to be, but I don't think the tools to really work at that tier have been written yet.


I think you're right; I can see it in the accelerating growth curve of my good Junior devs; I see grandOP's vision in my bad Junior devs. Optimistically, I think this gives more jr devs more runway to advance deeper into more sophisticated tech stacks. I think we're gonna need more SW devs, not fewer, as these tools get better: things that were previously impossible will be possible.


> I think we're gonna need more SW devs, not fewer

Code is a liability. What we really care about is the outcome, not the code. These AI tools are great at generating code, but are they good at maintaining the generated code? Not from what I've seen.

So there's a good chance we'll see people using tools to generate a ton of instant legacy code (because nobody in house has ever understood it) which, if it hits production, will require skilled people to figure out how to support it.


We will see all of it: lots of poor code, lots of neutral code (LLMs cranking out reasonably well written boilerplate), and even some improved code (by devs who use LLMs to ferret out inefficiencies and bugs in their existing, human-written codebase).

This is no different from what we see with any tool or language: the results are highly dependent on the experience and skills of the operator.


You've missed my core point if you think those aren't different. Before AI there was always someone who understood the code/system.

In a world where people are having machines build the entire system, there is potentially no human that has ever understood it. Now, we are talking about a yet-unseen future; I have yet to see a real-world system that did not have a human driving the design. But maintaining a system that nobody has ever understood could be ultra-hardmode.


Humans will always have a hand in the design because they need to explain the real-world constraints to the AI. Sure, the code it produces may be complex, but if the AI is really as smart as you're claiming it will eventually be, then it will also have the ability to explain how the code works in plain English (or your human language of choice). Even today, LLMs are remarkably good at summarizing what code does.

Philosophical question: how is LLM-produced code that nobody has ever understood any different from human-written legacy code that nobody alive today understands?


> Philosophical question: how is LLM-produced code that nobody has ever understood any different from human-written legacy code that nobody alive today understands?

- There is zero option of paying an obscene amount of money to find the person and make the problem 'go away'

- There is a non-zero possibility that the code is not understandable by any developer you can afford. By this I mean that the system exhibits the desired behavior, but is written in such a way that only someone like Mike Pall* can understand.

* Mike Pall is a robot from the future


> Code is a liability

Another way I've seen this expressed, which resonates with me, is "All code is technical debt."


As one HNer aptly put it: coding is to SWE as cutting is to surgery.


These AI tools are also not good at answering PagerDuty to fix a production problem that is a result of the code they imagined.


> Code is a liability

This is so true! Actual writing of the code is such a small step in overall running of a typical business/project, and the less of it the better.


> more sophisticated tech stacks

Please don't do this, pick more boring tech stacks https://news.ycombinator.com/item?id=43012862 instead. "Sophisticated" tech stacks are a huge waste, so please save the sophisticated stuff for the 0.1% of the time where you actually need it.


Sophistication doesn't imply any increase or decrease in "boringness".


The dictionary definition of 'sophisticated' is "changed in a deceptive or misleading way; not genuine or pure; unrefined, adulterated, impure." Pretty much the polar opposite of "boring" in a technology context.


No, this is not "the" dictionary definition.

This definition is obsolete according to Wiktionary: https://en.wiktionary.org/wiki/sophisticated (Wiktionary is the first result that shows up when I type your words)


That is an extremely archaic definition that's pretty far from modern usage, especially in a tech context


No clue what dictionary you looked at but this is not at all what dictionaries actually say.


An edge case in startups is something that provides a competitive advantage. When you run a startup, you have to do something different from the way everyone else does, or you’ll get the same results everyone else does. My theory is that some part of a startup’s operations should be cutting edge. HR processes, programming stack, sales cycle, something.


That's great advice when you're building a simple CRUD app - use the paved roads for the 10^9th instance.

It's terrible advice when you're building something that will cause that boring tech to fall over. Or when you've reached the limits of that boring tech and are still growing. Or when the sophisticated tech lowers CPU usage by 1% and saves your company millions of dollars. Or when that sophisticated tech saves your engineers hours and your company tens of millions. Or just: when the boring tech doesn't actually do the things you need it to do.


"Boring" tech stacks tend to be highly scalable in their own right - certainly more so than the average of trendy newfangled tech. So what's a lot more likely is that the trendy newfangled tech will fail to meet your needs and you'll be moving to some even newer and trendier tech, at surprisingly high cost. The point of picking the "boring" choice is that it keeps you off that treadmill.


I'm not disagreeing with anything you said here - reread my comment.

Sometimes you want to use the sophisticated shiny new tech because you actually need it. Here's a recent example from a real situation:

The linux kernel (a boring tech these days) has a great networking stack. It's choking on packets that need to be forwarded, and you've already tuned all the queues and the cpu affinities and timers and polling. Do you -

a) buy more servers and network gear to spread your packets across more machines? (boring and expensive and introduces new ongoing costs of maintenance, datacenter costs, etc).

b) Write a kernel module to process your packets more efficiently? (a boring, well known solution; introduces engineer costs to make and maintain, as well as downtime because the new shiny module is buggy)

c) Port your whole stack to a different OS (risky, but choosing a different boring stack should suffice... if you're certain that it can handle the load without kernel code changes/modules).

d) Write a whole userspace networking system (trendy and popular - your engineers are excited about this, expensive in eng time, risks lots of bugs that are already solved by the kernel just fine, have to re-invent a lot of stuff that exists elsewhere)

e) Use ebpf to fast path your packets around the kernel processing that you don't need? (trendy and popular - your engineers are excited about this, inexpensive relative to the other choices, introduces some new bugs and stability issues til the kinks are worked out)

We sinned and went with (e). That new-fangled tech met our needs quite well - we still had to buy more gear but far less than projected before we went with (e). We're actually starting to reach limits of ebpf for some of our packet operations too so we've started looking at (d) which has come down in costs and risk as we understand our product and needs better.

I'm glad we didn't go the boring path - our budget wasn't eaten up with trying to make all that work and we could afford to build features our customers buy instead.

We also use postgres to store a bunch of user data. I'm glad we went the boring path there - it just works and we don't have to think about it, and that lack of attention has afforded us the chance to work on features customers buy instead.

The point isn't "don't choose boring". It's: blindly choosing boring instead of evaluating your actual needs and options from a knowledgeable place is unwise.


None of these seem all that 'trendy' to me. The real trendy approach would be something like leaping directly to a hybrid userspace-kernelspace solution using something like https://github.com/CloudNativeDataPlane/cndp and/or the https://www.kernel.org/doc/html/latest/networking/af_xdp.htm... addressing that the former is built on. Very interesting stuff, don't get me wrong there - but hardly something that can be said to have 'stood the test of time' like most boring tech has. (And I would include things like eBPF in that by now.)


I have similar examples from other projects of using io_uring and af_xdp with similar outcomes. In 2020 when the ebpf decision was made it was pretty new and trendy still too... in a few cases each of these choices required us to wait for deployment until some feature we chose to depend on landed in a mainline kernel. Things move a bit slower that far down the stack, so new doesn't mean "the js framework of the week", but it's still the trendy unproven thing vs the well-known path.

The point is still: evaluate the options for real - using the new thing because it's new and exciting is equally as foolish as using the boring thing because it's well-proven... if those are your main criteria.


Today I learned that some tech stacks are sophisticated, I suppose those are for the discerning developer.


I agree with this stance. Junior developers are going to learn faster than previous generations, and I'm happy for it.

I know that is confronting for a lot of people, but I think it is better to accept it, and spend time thinking about what your experience is worth. (A lot!)


> Junior developers are going to learn faster than previous generations, and I'm happy for it.

How? Students now are handing in LLM homework left and right. They are not nurturing the resolve to learn. We are training a cohort of young people who will give up without trying hard, and end up learning nothing.


>Junior developers are going to learn faster than previous generations, and I'm happy for it.

I would have agreed, until I started seeing the kinds of questions they're asking.


> "But it's not satisfying to stay at that level of ignorance very long"

It's not about satisfaction: it's literally dangerous and can bankrupt your employer, cause immense harm to your customers and people at home, and make you unhirable as an engineer.

Let's take your example of "an S3 avatar upload system", which you consider finished after writing 2 lines of code and a couple of packages installed. What makes sure this can't be abused by an attacker to DDOS your system, leading to massive bills from AWS? What happens after an attacker abuses this system and takes control of your machines? What makes sure those avatars are "safe-for-work" and legal to host in your S3 bucket?

People using LLMs and feeling all confident about it are the equivalent of hobby carpenters after watching a DIY video on YouTube and building a garden shed over the weekend. You're telling me they're now qualified to go build buildings and bridges?

> "It's like presuming that StackOverflow will keep you as a question-asker your whole life when nobody here would relate to that."

I meet people like this during job interviews all of the time, if I'm hiring for a position. Can't tell you how many people with 10+ years of industry experience I met recently that can't explain how to read data from a local file, from the machine's file system.


At present, LLMs are basically Stack Overflow with infinite answers on demand... of Stack Overflow quality and relevance. Prompting is the new Googling. It's a critical base skill, but it's not sufficient.

The models I've tried aren't that great at algorithm design. They're abysmal at generating highly specific, correct code (e.g. kernel drivers, consensus protocols, locking constructs.) They're good plumbers. A lot of programming is plumbing, so I'm happy to have the help, but they have trouble doing actual computer science.

And most relevantly, they currently don't scale to large codebases. They're not autonomous enough to pull a work item off the queue, make changes across a 100kloc codebase, debug and iterate, and submit a PR. But they can help a lot with each individual part of that workflow when focused, so we end up in the perverse situation where junior devs act as the machine's secretary, while the model does most of the actual programming.

So we end up de-skilling the junior devs, but the models still can't replace the principal devs and researchers, so where are the principal devs going to come from?


>The models I've tried aren't that great at algorithm design. They're abysmal at generating highly specific, correct code (e.g. kernel drivers, consensus protocols, locking constructs.) They're good plumbers. A lot of programming is plumbing, so I'm happy to have the help, but they have trouble doing actual computer science.

I tend towards tool development, so this suggests a fringe benefit of LLMs to me: if my users are asking LLMs to help with a specific part of my API, I know that's the part that sucks and needs to be redesigned.


> Why wouldn't this be the case for people using LLMs, like it was for everyone else?

Because of the mode of interaction.

When you dive into a framework that provides a ton of scaffolding, and "backfill your knowledge over time" (guilty! using Nikola as an SSG has been my entry point to relearn modern CSS, for example), you're forced to proceed by creating your own loop of experimentation and research.

When you interact with an LLM, and use forums to figure out problems the LLM didn't successfully explain to you (about its own output), you're in chat mode the whole time. Even if people are willing to teach you to fish, they won't voluntarily start the lesson, because you haven't shown any interest in it. And the fish are all over the place - for now - so why would you want to learn?

>It's like presuming that StackOverflow will keep you as a question-asker your whole life when nobody here would relate to that.

Of course nobody on HN would relate to that first-hand. But as someone with extensive experience curating Stack Overflow, I can assure you I have seen it second-hand many times.


> But in reality pretty much anyone who enters software starts off cutting corners just to build things instead of working their way up from nand gates.

The article is right in a zoomed-in view (fundamental skills will be rare and essential), but in the big picture the critique in the comment is better (folks rarely start on nand gates). Programmers of the future will have less need to know code syntax the same way current programmers don't have to fuss with hardware-specific machine code.

The people who still do hardware-specific code, are they currently in demand? The marketplace is smaller, so results will vary and will probably, as the article suggests, be less satisfactory for the participant with the time-critical need or demand.


> The article is right in a zoomed-in view.

First of all, I think the problems of the industry were long overdue. It started with Twitter, and it proved it can be done. AI just made it easier psychologically because it's much easier to explore and modify existing code and not freak out: "omg, omg, omg, we've lost this guy and only he understands the code and we're so lost without him". AI just removes the incentives to hoard talent.

I also think of excel/spreadsheets and how it did in fact change accounting industry forever. Every claim the author makes about software developers could have been made about accounting after the advent of electronic spreadsheets.

I don't want to even get started on the huge waste and politics in the industry. I'm on the 3rd rewrite of a simple task that removes metrics in Grafana, which saves the team maybe $50 monthly. If the team was cut in half, I'm sure we'd simply not do half the bullshit "improvements" we do.


Great points. I see my own journey from an offshore application support contractor to a full-time engineer as similar; I learned a lot along the way. Along the journey I've seen folks who held good/senior engineering roles just stagnate or move into management roles.

The industry is now large enough to have all sorts of people. Growing, stagnating, moving out, moving in, getting laid off, retiring early, or just plain retiring, etc.


This is a great point. I remember my first computer programming class was Java. The teacher said “just type public static void main(String[] args) at the top”. I asked why and he said it didn't matter for now, just memorize that part. That was great advice. At that point it was more important to get a feel for how computers behave and how programs are structured at a high level. I just kept typing that cryptic line mindlessly on top of all my programs so that I could get to the other stuff. Only many months later did I look into the mysterious line and understand what all the keywords meant.

It’s funny now that I haven’t programmed Java for more than a decade and the “public static void main” incantation is still burned into my memory.


I agree, and I also share your experience (guess I was a bit earlier with PHP).

I think what's left out though is that this is the experience of those who are really interested and for whom "it's not satisfying" to stay there.

As tech has turned into a money-maker, people aren't doing it for the satisfaction, they are doing it for the money. That appears to cause more corner cutting and less learning what's underneath instead of just doing the quickest fix that SO/LLM/whatever gives you.


I'm not so sure. I think a junior dev on my team might be being held back by AI; he's good at using it, but he was really struggling to do something very basic. In my view he just needs to learn that syntax and play around with it in a throwaway console app. But I think AI is a crutch that may distract from doing that. Then again, it is utterly fantastic at explaining small bits of code, so it could be an excellent teacher too.


Who the hell, in today's market, is going to hire an engineer with a tenuous grasp on foundational technological systems, with the hope that one day they will backfill?!


Yeah, my recollection of the past couple decades is many companies felt like: "Someone else will surely train up the junior developers, we'll just hire them away after they know what they're doing." This often went with an oddly-bewildered: "Wow, why is it so hard to find good candidates?"

I don't see how that trend would change much just because junior developers can use LLMs as a crutch. (Well, except when it helps them cheat at an interview that wasn't predictive of what the job really needed.)


> And then they backfill their knowledge over time.

If only. There are too many devs who've learnt to write JS or Python and simply won't change. I've seen one case where someone ported an existing 20k-line C++ app to a browser app in the most unsuitable way with Emscripten, where 1,100 lines of TypeScript do a much better job.


> But it's not satisfying to stay at that level of ignorance very long

That's the difference. This is how you feel because you like programming to some extent. Having worked closely with them, I can tell you there are many people going into bootcamps who flat out dislike programming and just heard it pays well. Some of them get jobs, but they don't want to learn anything. They just want to do the bare minimum that doesn't get them fired. They are not curious even about tasks they are supposed to do.

I don't think this is inherently wrong, as I don't feel like gatekeeping the profession if their bosses feel they add value. But this is a classic case of losing the junior > expert pipeline. We could easily find ourselves in a spot in 30 years where AI coding is rampant but there are no experts who actually know what it does.


There have been people entering the profession for (purported) money and not love of the craft for at least as long as the 20 years I've been in it. So long as there are also people who still genuinely enjoy it and take pride in doing the job well, then the junior->expert pipeline isn't lost.

I buy that LLMs may shift the proportion of those two camps. But doubt it will really eliminate those who genuinely love building things with code.


I'm not sure it takes more than a shift, though. There aren't 0 people in training to be a doctor, but we have a shortage for sure and it causes huge problems.


> I think we who are already in tech have this gleeful fantasy that new tools impair newcomers in a way that will somehow serve us, the incumbents, in some way.

Well put. There’s a similar phenomenon in industrial maintenance. The “grey tsunami.” Skilled electricians, pipefitters, and technicians of all stripes are aging out of the workforce. They’re not being replaced, and instead of fixing the pipeline, many factories are going out of business, and many others are opting to replace equipment wholesale rather than attempt repairs. Everybody loses, even the equipment vendors, who in the long run have fewer customers left to sell to.


I very much relate to the sentiment of starting out with simple tools and then backfilling knowledge gaps as I went. For me it was Excel -> Access shared DB forms -> Django web application framework -> etc. From spreadsheets, to database design, to web application development, to scaling HTTP services, and on and on it goes.


> But it's not satisfying to stay at that level of ignorance very long

I agree, but have found that for a lot of people that is totally satisfying enough. Most people don't care to really understand how the code works.


That's fine and all but I'm not sure the nand-gate folks are out of a job either.


Or assuming software needs to be of a certain quality.

Software engineers 15 years ago would have thought it crazy to ship a full browser with every desktop app. That’s now routine. Wasteful? Sure. But it works. The need for low level knowledge dramatically decreased.


Isn’t this kind of thing the story of tech though?

Languages like Python and Java come around, and old-school C engineers grouse that the kids these days don’t really understand how things work, because they’re not managing memory.

Modern web-dev comes around and now the old Java hands are annoyed that these new kids are just slamming NPM packages together and polyfills everywhere and no one understands Real Software Design.

I actually sort of agree with the old C hands to some extent. I think people don’t understand how a lot of things actually work. And it also doesn’t really seem to matter 95% of the time.


I don't think the value of senior developers is so much in knowing how more things work, but rather that they've learnt (over many projects of increasing complexity) how to design and build larger more complex systems, and this knowledge mostly isn't documented for LLMs to learn from. An LLM can do the LLM thing and copy designs it has seen, but this is cargo-cult behavior - copy the surface form of something without understanding why it was built that way, and when a different design would have been better for a myriad of reasons.

This is really an issue for all jobs, not just software development, where there is a large planning and reasoning component. Most of the artifacts available to train an LLM on are the end result of reasoning, not the reasoning process themselves (the day by day, hour by hour, diary of the thought process of someone exercising their journeyman skills). As far as software is concerned, even the end result of reasoning is going to have very limited availability when it comes to large projects since there are relatively few large projects that are open source (things like Linux, gcc, etc). Most large software projects are commercial and proprietary.

This is really one of the major weaknesses of LLM-as-AGI, or LLM-as-human-worker-replacement - their lack of ability to learn on the job and pick up a skill for themselves as opposed to needing to have been pre-trained on it (with the corresponding need for training data). In-context learning is ephemeral and anyways no substitute for weight updates where new knowledge and capabilities have been integrated with existing knowledge into a consistent whole.


Just because there are these abstractions layers that happened in the past does not mean that it will continue to happen that way. For example, many no-code tools promised just that, but they never caught on.

I believe that there's a "optimal" level of abstraction, which, for the web, seems to be something like the modern web stack of HTML, JavaScript and some server-side language like Python, Ruby, Java, JavaScript.

Now, there might be tools that make a developer's life easier, like a nice IDE, debugging tools, linters, autocomplete and also LLMs to a certain degree (which, for me, still is a fancy autocomplete), but they are not abstraction layers in that sense.


I love that you brought no-code tools into this because I think it's interesting it never worked correctly.

My guess is: on one side, things like Squarespace and Wix get super, super good at building sites that don't feel like Squarespace and Wix (I'm not sure I'd want to be a pure "website dev" right now - although I think Squarespace squashed a lot of that long ago) - and then very, very nice tooling for "real engineers" (whatever that means).

I'm pretty handy with tech - I mean, the last time I built anything real was the 90s, but I know how most things work pretty well. I sat down to ship an app last weekend; with no sleep and Monday rolling around, GCP was giving me errors, and I hadn't realized one of the files the LLMs wrote looked like code but was all placeholder.

I think this is basically what the Anthropic report says: automation issues happen via displacement, and displacement is typically fine, except the displacement this time is happening very rapidly (I read in a different report that roughly 80 years' worth of traditional displacement is expected to happen in ~10 years with AI).


Excel is a "no-code" system and people seem to like it. Of course, sometimes it tampers with your data in horrifying ways because something you entered (or imported into the system from elsewhere) just happened to look kinda like a date, even though it was intended to be something completely different. So there's that.


> Excel is a "no-code" system and people seem to like it.

If you've found any Excel gurus who don't spend most of their time in VBA, you have a really unusual experience.


I've worked in finance for 20 years and this is the complete opposite of my experience. Excel is ubiquitous and drives all sorts of business processes in various departments. I've seen people I would consider Excel gurus, in that they are able to use Excel much more productively than normal users, but I've almost never seen anyone use VBA.


Huge numbers of accountants and lawyers use excel heavily knowing only the built in formula language. They will have a few "gurus" sprinkled around who can write macros but this is used sparingly because the macros are a black box and make it harder to audit the financial models.


Excel is a programming system with pure functions, imperative code (VBA/Python recently), database (cell grid, sheets etc.) and visualization tools.

So, not really "no-code".


That’s technically correct but it’s also wrong.

The no-code part of Excel is that most functions are implemented for the user, and the user doesn't have to know anything about software development to create what they need, and doesn't need a software developer to do stuff for them.


Excel is hardly "no-code". Any heavy use of Excel I've seen uses formulas, which are straight-up code.


But any heavy use of "no-code" apps also ends up looking this way, with "straight-up code" behind many of the wysiwyg boxes.


Right, but "no-code" implies something: programming without code. Excel is not that in any fashion. It's either programming with code or an ordinary spreadsheet application without code. You'd really have to stretch your definitions to consider it "no-code" in a way that wouldn't apply to pretty much any office application.


I would disagree. Every formula you enter into a cell is "code". Moreover, more complex worksheets require VBA.


> Modern web-dev comes around and now the old Java hands are annoyed that these new kids are just slamming NPM packages together and polyfills everywhere and no one understands Real Software Design.

The real issue here is that a lot of the modern tech stacks are crap, but won for other reasons, e.g. JavaScript is a terrible language but became popular because it was the only one available in browsers. Then you got a lot of people who knew JavaScript so they started putting it in places outside the browser because they didn't want to learn another language.

You get a similar story with Python. It's essentially a scripting language and poorly suited to large projects, but sometimes large projects start out as small ones, or people (especially e.g. mathematicians in machine learning) choose a language for their initial small projects and then lean on it again because it's what they know even when the project size exceeds what the language is suitable for.

To slay these beasts we need to get languages that are actually good in general but also good at the things that cause languages to become popular, e.g. to get something better than JavaScript to be able to run in browsers, and to make languages with good support for large projects to be easier to use for novices and small ones, so people don't keep starting out in a place they don't want to end up.


With WebAssembly you can write in any language, even C++.

Unfortunately it doesn't expose the DOM, so you still need JavaScript.


My son is a CS major right now, and since I've been programming my whole adult life, I've been keeping an eye on his curriculum. They do still teach CS majors from the "ground up" - he took system architecture, assembly language and operating systems classes. While I kind of get the sense that most of them memorize enough to pass the tests and get their degree, I have to believe that they end up retaining some of it.


I think this is still true of a solid CS curriculum.

But it’s also true that your son will probably end up working with boot camp grads who didn’t have that education. Your son will have a deeper understanding of the world he’s operating in, but what I’m saying is that from what I’ve seen it largely hasn’t mattered all that much. The bootcampers seem to do just fine for the most part.


Yes, they remember the concepts, mostly. Not the details. But that's often enough to help with reasoning about higher-level problems.


I always considered my education to be a "Table of Contents" listing for what I'd actually learn later


And also these old C hands don't seem to get paid (significantly) more than a regular web-dev who doesn't care about hardware, memory, performance etc. Go figure.


Pay is determined by the supply and demand for labor, which encompass many factors beyond the difficulty of the work.

Being a game developer is harder than being an enterprise web services developer. Who gets paid more, especially per hour?


They do where I'm from, and spend most of their time cleaning up the messes that the regular web-devs create...


The real hardcore experts should be writing libraries anyway, to fully take advantage of their expertise in a tiny niche and to amortize the cost of studying their subproblem across many projects. It has never been easier to get people to call your C library, right? As long as somebody can write the Python interface…

Numpy has delivered so many FLOPs for BLAS libraries to work on.

Does anyone really care if you call their optimized library from C or Python? It seems like a sophomoric concern.
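
To make that concrete, here's about all the Python it takes to hand serious floating-point work to whoever wrote the underlying BLAS - a minimal sketch, assuming a stock NumPy install:

    # The Python here is just plumbing; the heavy lifting in the matrix
    # multiply is dispatched to whatever optimized BLAS (OpenBLAS, MKL, ...)
    # this NumPy build is linked against.
    import numpy as np

    a = np.random.rand(2000, 2000)
    b = np.random.rand(2000, 2000)

    c = a @ b            # double-precision GEMM, done by the BLAS library
    np.show_config()     # prints which BLAS/LAPACK implementation is in use

Nobody running that cares whether the fast path was written in C, Fortran, or hand-tuned assembly - which is rather the point.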


I think the problem is that with the over-reliance on LLMs, that expertise of writing the foundational libraries that even other languages rely on, is going away. That is exactly the problem.


Yeah, every programmer should write at least a CPU emulator in their language of choice; it's such an undervalued exercise that will teach you so much about how stuff really works.
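
For anyone wondering how small that exercise can start, here's a minimal sketch of a toy accumulator machine in Python - the four-instruction ISA is invented, but the fetch/decode/execute shape is the whole lesson:

    # Toy CPU emulator: a made-up accumulator machine with four instructions.
    # The value of the exercise is seeing fetch/decode/execute spelled out.
    LOAD, ADD, STORE, HALT = range(4)

    def run(program, memory):
        acc, pc = 0, 0                  # accumulator and program counter
        while True:
            op, arg = program[pc]       # fetch
            pc += 1
            if op == LOAD:              # decode + execute
                acc = memory[arg]
            elif op == ADD:
                acc += memory[arg]
            elif op == STORE:
                memory[arg] = acc
            elif op == HALT:
                return memory

    # Compute memory[2] = memory[0] + memory[1]
    mem = [2, 3, 0]
    prog = [(LOAD, 0), (ADD, 1), (STORE, 2), (HALT, 0)]
    print(run(prog, mem))               # -> [2, 3, 5]

From there it's a short walk to real instruction sets, flags, and addressing modes.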


You can go to the next step. I studied computer engineering not computer science in college. We designed our own CPU and then implemented it in an FPGA.

You can go further and design it out of discrete logic gates. Then write it in Verilog. Compare the differences and which made you think more about optimizations.


"in order to bake a pie you must first create the universe", at some point, reaching to lower and lower levels stops being useful.


Sure.

Older people are always going to complain about younger people not learning something that they did. When I graduated in 1997 and started working, I remember some topics that were not taught, and the older engineers were shocked I didn't know them from college.

We keep creating new knowledge. It is impossible to fit everything into a 4 year curriculum without deemphasizing some other topic.

I learned Motorola 68000 assembly language in college. I talked to a recent computer science graduate and he had never seen assembly before. I also showed him how I write static HTML in vi the same way I did in 1994 for my simple web site and he laughed. He showed me the back end to their web site and how it interacts with all their databases to generate all the HTML dynamically.


The universe underneath the pie is mostly made up of invariant laws that must be followed.

The OS, libraries, web browser, runtime, and JavaScript framework underneath your website are absolutely riddled with bugs, and knowing how to identify and fix them makes you a better engineer. Many junior developers get hung up on the assumption that the function they're calling is perfect, and are incapable of investigating whether that's the truth.

This is true of many of the shoulders-of-giants we're standing on, including the stack beneath python, rust, whatever...


In fairness, creating a universe is pretty useful.


When I was a kid I "wrote" (mostly copied from a programming magazine) a 4-bit CPU emulator on my TI-99/4a. Simple as it was, it was the light bulb coming on for me about how CPUs actually worked. I could then understand the assembly language books that had been impenetrable to me before. In college when I first started using "C", pointers made intuitive sense. It's a very valuable exercise.


Notably, I don't think there was a mass disemployment of "old C hands". They just work on different things.


I wonder about this too - and also wonder what the difference of order is between the historical shifts you mention and the one we're seeing now (or will see soon).

Is it 10 times the "abstracting away complexity and understanding"? 100, 1000, [...]?

This seems important.

There must be some threshold beyond which (assuming most new developers are learning using these tools) fundamental ability to understand how the machine works and thus ability to "dive in and figure things out" when something goes wrong is pretty much completely lost.


> There must be some threshold beyond which (assuming most new developers are learning using these tools) fundamental ability to understand how the machine works and thus ability to "dive in and figure things out" when something goes wrong is pretty much completely lost.

For me this happened when working on some Spring Boot codebase thrown together by people who obviously had no idea what they were doing (which maybe is the point of Spring Boot; it seems to encourage slopping a bunch of annotations together in the hope that it will do something useful). I used to be able to fix things when they went wrong, but this thing is just so mysterious and broken in such ridiculous ways that I can never seem to get to the bottom of it.


> Languages like Python and Java come around, and old-school C engineers grouse that the kids these days don’t really understand how things work

Everything has a place, you most likely wouldn't write an HPC database in Python, and you wouldn't write a simple CRUD recipe app in C.

I think the same thing applies to using LLMS, you don't use the code it generates to control a power plant or fly an airplane. You use it for building the simple CRUD recipe app where the stakes are essentially zero.


$1 for the pencil, $1000 for the line.

That’s the 5% when it does matter.


Yes this is what people like to think. It’s not really true in practice.


And that last 5% is what you're paying for


But not really. Looking around my shop, I’m probably the only one around who used to write a lot of C code. No one is coming to ask me about obscure memory bugs. I’m certainly not getting paid better than my peers.

The knowledge I have is personally gratifying to me because I like having a deeper understanding of things. But I have to tell you I thought knowing more would give me a deeper advantage than it has in actual practice.


You're providing value every time you kill a bad idea "because things don't actually work that way" or shave a loop, you're just not tracking the value and neither is your boss.

To your employer, hiring people who know things (i.e. you) has given them a deeper advantage in actual practice.


I would argue that your advantage right now is that YOU are in the one position they can't replace with LLMs, because your knowledge requires exact fine detail on pointers and everything and needs that exact expertise. You might have gotten the same pay as your peers, but you also carry additional stability.


Is that because the languages being used at your shop have largely abstracted away memory bug issues? If you were to get a job writing embedded systems, or compilers, or OSes, wouldn't your knowledge be more highly valued and sought after (assuming you were one of the more senior devs)?


If you have genuine systems programming knowledge, usually the answer is to innovate on a particular toolchain or ship your own product (I understand you may not like business stuff though.)


LLMs are a much bigger jump in productivity than moving to a high level language.


Lately, I've been asking ChatGPT for answers to problems that I've gotten stuck on. I have yet to receive a correct answer from it that actually increases my productivity.


I don't know what to say.

I've been able to get code working in libraries that I'm wholly unfamiliar with pretty rapidly by asking the LLM what to do.

As an example, this weekend I got a new mechanical keyboard. I like to use caps+hjkl as arrows and don't want to remap in software because I'll connect to multiple computers. Turns out there's a whole open source system for this called QMK that requires one to write C to configure the keyboard.

It's been over a decade since I touched a Makefile and I never really knew C anyway, but I was able to get the keyboard configured and also add some custom RGB lighting pretty easily by going back and forth with the LLM.


It is just very random. LLMs help me write a synthesizer using an odd synth technique in an obscure musical programming language with no problem, and help me fix my broken Linux system with no problem, but then they can't do anything right with the Python library Pyro. I think that's why people have such different experiences. It all depends, somewhat randomly, on how what you want to solve lines up with what the models are good at.


At least for the type of coding I do, if someone gave me the choice between continuing to work in a modern high-level language (such as C#) without LLM assistance, or switching to a low-level language like C with LLM assistance, I know which one I would choose.


Likewise, under no circumstances would I trade C for LLM-aided assembly programming. That sounds hellish. Of course it could (probably will?) be the case that this may change at some point. Innovations in higher-level languages aren't returning productivity improvements at anywhere close to the same rate as LLMs are, and in any case LLMs probably benefit from improvements to higher-level languages as well.


Programming is an interface to the machine. AI, even what we have now (LLMs, agents, RAG), will absorb all of that. It has many flaws but is still much better than most programmers.

All future programmers will be using it.

As for the programmers who don't want to use it: I think there will be literally billions of lines of unbelievably bad code generated by generations 1 through 100 of these AIs and by junior programmers, and it will need to be corrected and fixed.


> It has many flaws but is still much better than most programmers.

This says more about most programmers than about any given LLM.


There's no need for tens of millions of OS kernel devs; most of us are writing business-logic CRUD apps.

Also, it's not entirely clear to me why LLMs should get extremely good at web app development but not OS development; as far as I can see it's the amount and quality of training data that counts.


> as far as I can see it's the amount and quality of training data that counts

Well, there's your reason. OS code is not as in demand or prevalent as CRUD web app code, so there's less relevant data to train your models on.


The OS code that exists is much higher quality, so the signal-to-noise ratio is much better.


I think arguably there's still a quantity issue, but I'm no expert on LLMs. Plus I hear the Windows source code is a bit of a nightmare. But for every Windows there's a TempleOS, I suppose.


It is far more likely that everything, and not just IT, but everything collapses than we make it to the point you mention.

LLMs replace entry-level people who invested in education. Those people would have the beginning knowledge, but there's no means for them to become better, because the opportunities are non-existent; those positions have been replaced. It's a sequential pipeline failure of talent development. In the meantime, the mid- and senior-level people cannot pass their knowledge on; they age out and die.

What happens when you hit a criticality point where production, which is dependent on these systems, can no longer continue?

The knowledge implicit in production is lost, the economic incentives have been poisoned. The distribution systems are destroyed.

How do you bootstrap recovery, not over centuries but in weeks or months, of something that effectively took several centuries to build in the first place?

If this isn't sufficient to explain the core of the issue, check out the Atari/Nintendo crash, which isn't nearly as large as this but illustrates the dangers of destroying your distributor networks.

If you pay attention to the details, you'll see Atari's crash was fueled by debt financing, and in the process they destroyed their distributor networks with catastrophic losses. After that crash, Nintendo couldn't get shelf space; no distributor would risk the loss without a guarantee. They couldn't advertise their product as video games. They had to trojan-horse the perception of what they were selling, and guarantee it. There is a documentary on Amazon that covers this, Playing with Power. Check it out.


One of my first bosses was a big Perl guy. I checked on what he was doing 15 years later and he was one of 3 people at Windstream handling backbone packet management rules.

You just don’t run into many people comfortable with that technology anymore. It’s one of the big reasons I go out of my way to recruit talks on “old” languages to be included at the Carolina Code Conference every year.


We've been in this world for decades.

Most developers couldn't write an operating system to save their life. Most could not write more than a simple SQL query. They sling code in some opinionated dev stack that abstracts the database and don't think too hard about the low-level details.


They'll probably go a step further and use an ORM instead of writing queries.

Since ORMs generally write crap, unoptimized SQL for all but the simplest of queries, this will lead to performance issues once things scale up.
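
As a sketch of what I mean (hypothetical Author/Book models, SQLAlchemy against an in-memory SQLite with echo=True so the generated SQL is printed): the ORM's default lazy loading quietly turns one logical query into N+1 round trips unless you explicitly ask for the join.

  from sqlalchemy import ForeignKey, create_engine, select
  from sqlalchemy.orm import (DeclarativeBase, Mapped, Session, joinedload,
                              mapped_column, relationship)

  class Base(DeclarativeBase):
      pass

  class Author(Base):          # hypothetical model, just for illustration
      __tablename__ = "authors"
      id: Mapped[int] = mapped_column(primary_key=True)
      name: Mapped[str]
      books: Mapped[list["Book"]] = relationship(back_populates="author")

  class Book(Base):            # hypothetical model, just for illustration
      __tablename__ = "books"
      id: Mapped[int] = mapped_column(primary_key=True)
      title: Mapped[str]
      author_id: Mapped[int] = mapped_column(ForeignKey("authors.id"))
      author: Mapped[Author] = relationship(back_populates="books")

  engine = create_engine("sqlite://", echo=True)  # echo=True prints every SQL statement
  Base.metadata.create_all(engine)

  with Session(engine) as session:
      session.add_all([Author(name="Ann", books=[Book(title="A1")]),
                       Author(name="Bob", books=[Book(title="B1")])])
      session.commit()

      # Naive ORM usage: one SELECT for the books, then one extra SELECT per
      # author the moment book.author is touched -- the classic N+1 pattern.
      for book in session.scalars(select(Book)):
          print(book.title, book.author.name)

      # Explicit eager load: a single SELECT with a JOIN instead.
      for book in session.scalars(select(Book).options(joinedload(Book.author))):
          print(book.title, book.author.name)

The ORM isn't wrong, exactly; it does what you asked. The fix is one line, but only if you know enough SQL to recognize the symptom in the first place.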


I agree. It's the current generation's version of what happened with the advent of Javascript frameworks about 15 years ago, when suddenly web devs stopped learning how computers actually work. There will always be high demand for software engineers who actually know what they're doing, can debug complex code bases, and can make appropriate decisions about how to apply technology to business problems.

That said, AI agents are absolutely going to put a bunch of lower end devs out of work in the near term. I wouldn't want to be entering the job market in the next couple of years....


> There will always be high demand for software engineers who actually know what they're doing

Unfortunately they won't be found, due to horrible tech interviews focused on "culture" (*-isms), leetcode under the gun, or resumes thrown in the trash at first sight for lack of a full degree. AMHIK.


> I wouldn't want to be entering the job market in the next couple of years....

I bet there's a software dev employment boom about 5 years away once it becomes obvious competent people are needed to unwind and rework all the llm generated code.


Except juniors are not going to be the competent people you're looking for to unwind those systems. Personally, no matter how it plays out, I feel like the entry-level market in this field is going to take a hit. It will become much more difficult and competitive.


The "prompt" engineering is also going to create a ton of cargocult tips and tricks- endless shell scripts, that do nothing but look spectacular, with one or two important commands at the end. Fractal classes, that nobody knows why they exist. Endless boilerplate.

And the AI will be trained on this, and thus the cluelessness gets reinforced and baked in. Omnissiah, hear our prayers in the terminal, for we are but -h less man (bashes keyboard with a ritual wrench).


> I think that LLMs are only going to make people with real tech/programming skills much more in demand, as younger programmers skip straight into prompt engineering and never develop themselves technically beyond the bare minimum needed to glue things together.

This is exactly what the article says in point 3.


I hired a junior developer for a couple of months and was incredibly impressed with what he was able to accomplish with a paid ChatGPT subscription on a greenfield project for me. He'd definitely struggle with a mature code base, but you have to start somewhere!


Yeah, I agree fully.

Real programming of course won't go away. But in the public eye it has lost its mystique, as seemingly anyone can code now. Of course that isn't true, and no one has managed to create anything of substance by prompting alone.


How do we define real programming? I'm working on Python and JS codebases in my startup, so very high-level stuff. However, reasoning well about everything that goes on in our code is no small feat for an LLM (or a human). If it's able to take our requirements, understand the business logic, and just start refactoring and creating new features on a codebase that is quite big, well yeah, that sounds like AGI to me. In that case I don't see why it won't be able to hack on kernels.


The fact that you don't see why is the issue. Both Python and JS are very permissive and their runtime environments are very good. More often than not, you're just dealing with logic bugs and malformed domain data. A kernel codebase like Linux is one where there are many motivated individuals trying every trick to get the computer to do something, and you're usually dealing with leaner abstractions because general safety logic is not performant enough. It's a bit like the difference between a children's playground and a construction site.


> More often than not, you're just dealing with logic bugs

Definitely. More often than not you're dealing with logic bugs. So the thing solving them will sometimes have to be able to reason quite well across large code bases (not every bug of course, but quite often), to the point that I don't really see how it's different from general intelligence if it can do that well. And if it gets to the point that it's AGI-ish, I don't see why it can't do kernel work (or at the very least reduce the number of jobs dramatically in that space as well). Perhaps you can automate the 50% of the job where we're not really thinking at all as programmers, but the other 50% (or less, or more, debatable) involves planning, reasoning, debugging, thinking. Even if all you do is Python and JS.


> So the thing solving them will sometimes have to be able to reason quite well across large code bases

The codebase only describes what the software can do currently, never the why. And you can't reason without both. The why is the primary vector of change, and it may completely redefine the what. Even the what has many possible interpretations; the code is only one specific how. Going from the why, to the what, to a specific how is the core tenet of programming. Then you add concerns like performance, reliability, maintainability, security...

Once you have a mature codebase, outside of refactoring and new features, you mostly have to edit a few lines for each task. Finding the lines to work on requires careful investigation, and you need to test carefully afterwards to ensure that no other operations have been affected. We already have good deterministic tools to help with that.


I agree with this. An AI that can fully handle web dev is clearly AGI. Maybe the first AGI can't fully handle OS kernel development, just as many humans can't. But if/once AGI is achieved it seems highly unlikely to me that it will stay at the "can do web dev but can't do OS kernel dev" level for very long.


> Somebody needs to write that operating system the LLM runs on. Or your bank's backend system that securely stores your money. Or the mission critical systems powering this airplane you're flying next week... to pretend like this will all be handled by LLMs is so insanely out of touch with reality.

When they do this, I really want to know they did this. Like an organic food label. Right now AI is this buzzword that companies self-label with for marketing, but when that changes, I still want to see who's using AI to handle my data.


The enshittification will come for the software engineers themselves eventually, because so many businesses out there only have their shareholders in mind and not their customers, and if a broken product or a promise of a product is enough to boost the stock then why bother investing in the talent to build it properly?

Look at Google and Facebook - absolute shithouse services now that completely fail to meet the basic functionality they had ~20 years ago. Google still rakes in billions rendering ads in the style of a search engine and Facebook the same for rendering ads in the format of a social news feed. Why even bother with engineering anything except total slop?


> as younger programmers skip straight into prompt engineering and never develop themselves technically beyond the bare minimum needed to glue things together.

I'm not worried about that at all. Many young people are passionate, eager to learn and build things. They won't become suddenly dumb and lazy because they have this extra tool available to them. I think it's the opposite. They'll be better than their seniors because they'll have AI help them improving and learn faster.


Have you _seen_ the tragedy that is occurring in primary and secondary education right now with students using LLMs for the bulk of their coursework? Humans, and most forms of life in general, are lazy. They take the lowest energy route to a solution that works, whether that solution is food, shelter, or the answer to a homework question or problem at work. To some degree, this is good: An experienced <animal/student/engineer> has well-defined mental pathways towards getting what they need in as little time/energy as possible. I myself have dozens of things that I don't remember offhand, but that I remember a particular google query will get me to what I need (chmod args being the one that comes to mind). This leaves mental resources available for more important or difficult-to-acquire knowledge, like the subtle nuances of a complex system or cat pictures.

The problem is a lack of balance, and in some instances skipping the entirety of Critical Reasoning. Why go through the effort of working your way through a problem when you would rather be doing <literally anything else> with your time? Iterate on this to the extreme, with what feels like a magic bullet that can solve anything, and your skills *will* atrophy.

Of course there are exceptions to this trend. Star pupils exist in any generation who will go out of their way to discover answers to questions they have, re-derive understanding of things just for the sake of it, and apply their passions towards solving problems they decide are worth solving. The issue is the _average_ person, given an _average_ (e.g. if in America, under-funded) education, with an _average_ mentor, will likely choose the path of least resistance.


> people with real tech/programming skills much more in demand, as younger programmers skip straight into prompt engineering

Perhaps Python will become the new assembly code. :-)


Would technical depth change the fundamental supply and demand, though? If we view AI as a powerful automation tool, it's possible that overall demand will be lowered so much that demand for deep technical expertise goes down as well. Take the EE industry, for instance: the technical expertise required to get things done is vast and deep, yet demand has not been so good compared to the software industry.


I think I've seen the comparison with respect to training data, but it's interesting to think of the presence of LLMs as a sort of barrier to developing skills, akin to pre-WW2 low-background-radiation steel (which, fun fact, isn't actually that relevant anymore, since background radiation levels have dropped significantly since the partial end of nuclear testing).


It’s the “CTO’s nephew” trope but at 100x the cost.


This is so on point IMO. I feel like there is no better time than now to learn low-level languages, since the hype will in the end resolve into insanely technical people carrying all the major weight.


I don’t think “prompt engineering” will remain its own field.

It’s just modern SEO and SEO will eat it, eventually.

Prompt engineering as a service makes more sense than having on-staff people anyway, since your prompt’s effectiveness can change from model to model.

Have someone else deal with platform inconsistencies, like always.


You say this as if newcomers can't use the LLM to more deeply understand these topics in addition to using it as glue. This mindset is a fallacy: newcomers are as adept and passionate as any other generation. They have better tools and can compete just the same.


> younger programmers skip straight into prompt engineering and never develop themselves technically beyond the bare minimum needed to glue things together

This was true before LLMs, though. A lot of people just glue JavaScript libraries together.


I'm aware of a designer, no coding skills, who is going to turn his startup idea into an MVP using LLMs. If that ever got serious, they would need an actual engineer to maintain and improve things.


I think a good analogy is being able to drive a car vs understanding the engine of a car.

Both are useful, but you wouldn't hire a mechanic to drive you around.


If cars were as unreliable as most software, you'd need a mechanic to drive you around.


Governor Gavin Newsom proposed to use AI agents to manage the California budget.

Using AI to write critical code doesn't seem far-fetched to me.

Doing it right now would be suicidal, of course, but in a few years, when AI is even better? It sure is coming.

We're already speaking seriously about AI surgeons, and we're already using AI to do radiologists' jobs and have found it to be more reliable.

Some job are really at risk in the near future, that's obvious.

I'm no developer, so I have no bias toward it.


On a recent AllIn podcast[1], there was a fascinating discussion between Aaron Levie and Chamath Palihapitiya about how LLMs will (or will not) supplant software developers, which industries, total addressable markets (TAMs), and current obstacles preventing tech CEOs from firing all the developers right now. It seemed pretty obvious to me that Chamath was looking forward to breaking his dependence on software developers, and predicts AI will lead to a 90% reduction in the market for software-as-a-service (and the related jobs).

Regardless of point of view, it was an eye opening discussion to hear a business leader discussing this so frankly, but I guess not so surprising since most of his income these days is from VC investments.

[1] https://youtu.be/hY_glSDyGUU?t=4333


yup. the good news is this should make interviewing easier; bad news is there'll be fewer qualified candidates.

the other thing, though, is that you and I know that LLMs can't write or debug operating systems, but the people who pay us and see LLMs writing prose and songs? hmm


When I see people trying to define which programmers will enjoy continued success as AI continues to improve, I often see No True Scotsman used.

I wish more would try to describe what the differentiating skills and circumstances are instead of just saying that real programmers should still be in demand.

I think maybe raw talent is more important than how much you "genuinely love coding" (https://x.com/AdamRackis/status/1888965636833083416) or how much of a real programmer you are. This essay captures raw talent pretty well IMO: https://www.joelonsoftware.com/2005/07/25/hitting-the-high-n...


>I think that LLMs are only going to make people with real tech/programming skills much more in demand, as younger programmers skip straight into prompt engineering and never develop themselves technically beyond the bare minimum needed to glue things together.

My experience with Stack Overflow, the Python forums, etc. etc. suggests that we've been there for a year or so already.

On the one hand, it's revolutionary that it works at all (and I have to admit it works better than "at all").

But when it doesn't work, a significant fraction of those users will try to get experienced humans to fix the problem for them, for free - while also deluding themselves that they're "learning programming" through this exercise.


> Or the mission critical systems powering this airplane you're flying next week... to pretend like this will all be handled by LLMs is so insanely out of touch with reality.

Found the guy who's never worked for a large publicly-traded company :) Do you know what's out of touch with reality? Thinking that $BIG_CORP execs who are compensated based on the last three months of stock performance will do anything other than take shortcuts and cut corners given the chance.


> Or the mission critical systems powering this airplane you're flying next week... to pretend like this will all be handled by LLMs is so insanely out of touch with reality.

Airplane manufacturers have proved themselves more than willing to sacrifice safety for profits. What makes you think they would stop short of using LLMs?


Part of the problem is that many working developers are still in companies that don't allow experimentation with the bleeding edge of AI on their code base, so their experiences come from headlines and from playing around on personal projects.

And on the first 10,000 lines of code, the best in class tools are actually pretty good. Since they can help define the structure of the code, it ends up shaped in a way that works well for the models, and it still basically all fits in the useful context window.

What developers who can't use it on large warty codebases don't see is how poorly even the best tools do on the kinds of projects that software engineers typically work on for pay. So they're faced with headlines that oversell AI capabilities and positive experiences with their own small projects and they buy the hype.


My company allowed us to use it but most developers around me didn't reach out to the correct people to be able to use it.

Yes I find it incredibly helpful and try to tell them.

But it's only helpful in small contexts, auto completing things, small snippets, generating small functions.

Any large-scale change, the kind most of these AI companies claim it's capable of, just falls straight on its face. I've tried many times, and with every new model. It can't do it well enough to trust in any codebase bigger than a few tens of thousands of lines of code.


I've found it very easy to "generate" yourself into a corner: a total mess with no clear control flow that ends up more convoluted than it needs to be, by a mile.

If you're in mostly (or totally) unfamiliar territory, you can end up in a mess, fast.

I was playing around with writing a dead-simple websocket server in go the other evening and it generated some monstrosity with multiple channels (some unused?) and a tangle of goroutines etc.

Quite literally copying the example from Gorilla's source tree and making small changes would have gotten me 90% of the way there; instead I ended up with a mostly opaque pile of code that *looks good* from a distance but is barely functional.

(This wasn't a serious exercise, I just wanted to see how "far" I could get with Copilot and minimal intervention)


Yeah, I've found it's good for getting something basic started from scratch, but oftentimes if I try to iterate, it starts hallucinating very fast and forgetting what it was even doing after a short while.

Newer models have gotten better at this, and it takes longer before they start producing gibberish, but all of them have their limits.

And given the size of lots of enterprise codebases like the ones I'm working in, it's just too far away from being useful enough to replace many programmers, in my opinion. I'm convinced the CEOs who say AI is replacing programmers are just using it as an excuse to downsize while keeping investors happy.


That is also my experience. I use ChatGPT to help me iterate on a Godot game project, and it does not take more than a handful of prompts for it to forget or hallucinate something we previously established. I need to constantly remind it about code it suggested a while ago or things I asked for in the past, or it completely ignores the context and focuses just on the latest ask.

It is incredibly powerful for getting things started, but as soon as you have a sketch of a complex system going, it loses its grasp on the full picture and does not account for the state outside the small asks you make. This is even more evident when you need to correct it about something or request a change after a large prompt. It just throws all the other stuff out the window and hyperfocuses only on that one piece of code that needs changing.

This has been the case since GPT-3, and even their most recent model (forgot the name, the reasoning one) has this issue.


In a similar situation at my workplace.

What models are you using that make you feel comfortable trusting them to understand and operate on 10-20k LOC?

Using the latest and greatest from OpenAI, I've seen output become unreliable with as little as ~300 LOC on a pretty simple personal project. It will drop features as new ones are added, make obvious mistakes, refuse to follow instructions no matter how many different ways I try to tell it to fix a bug, etc.

Tried taking those 300 LOC (generated by o3-mini-high) to cursor and didn't fare much better with the variety of models it offers.

I haven't tried OpenAI's APIs yet - I think I read that they accommodate quite a bit more context than the web interface.

I do find OpenAI's web-based offerings extremely useful for generating short 50-200 LOC support scripts, generating boilerplate, creating short single-purpose functions, etc.

Anything beyond this just hasn't worked all that well for me. Maybe I just need better or different tools though?


I usually use Claude 3.5 Sonnet, since it's still the one I've had the best luck with for coding tasks.

When it comes to 10k LOC codebases, I still don't really trust it with anything. My best luck has been small personal projects where I can sort of trust it to make larger-scale changes, though "larger scale" within what is still a small project.

I've found it best for generating tests and for autocompletion; especially if you give it context via function names and parameter names, I find it can oftentimes complete a whole function I was about to write, using the interfaces available to it in files I've visited recently.
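
For example (a made-up function, just to show the kind of scaffolding I mean): once the name, parameters, and docstring are typed, the body is the sort of thing the completion model will often fill in correctly from the surrounding files.

  from datetime import datetime

  # Hypothetical example: everything above the body is what I type; the body is
  # what autocomplete will often propose from names, types, and nearby context.
  def filter_active_subscriptions(subscriptions: list[dict], as_of: datetime) -> list[dict]:
      """Return subscriptions that have started and not yet ended as of `as_of`."""
      return [
          s for s in subscriptions
          if s["start_date"] <= as_of
          and (s["end_date"] is None or s["end_date"] > as_of)
      ]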

But besides that, I don't really use it for much outside of starting a new feature from scratch or getting help putting a plan together before starting work on something I may be unfamiliar with.

We have access to all models available through copilot including o3 and o1, and access to chatgpt enterprise, and I do find using it via the chat interface nice just for architecting and planning. But I usually do the actual coding with help from autocompletion since it honestly takes longer to try to wrangle it into doing the correct thing than doing it myself with a little bit of its help.


This makes sense. I've mostly been successful doing these sorts of things as well and really appreciate the way it saves me some typing (even in cases where I only keep 40-80% of what it writes, this is still a huge savings).

It's when I try to give it a clear, logical specification for a full feature and expect it to write everything that's required to deliver that feature (or the entirety of a slightly-more-than-trivial personal project) that it falls over.

I've experimented trying to get it to do this (for features or personal projects that require maybe 200-400 LOC) mostly just to see what the limitations of the tool are.

Interestingly, I hit a wall with GPT-4 on a ~300 LOC personal project that o3-mini-high was able to overcome. So, as you'd expect - the models are getting better. Pushing my use case only a little bit further with a few more enhancements, however, o3-mini-high similarly fell over in precisely the same ways as GPT-4, only a bit worse in the volume and severity of errors.

The improvement between GPT-4 and o3-mini-high felt nominally incremental (which I guess is what they're claiming it offers).

Just to say: having seen similar small bumps in capability over the last few years of model releases, I tend to agree with other posters that it feels like we'll need something revolutionary to deliver on a lot of the hype being sold at the moment. I don't think current LLM models / approaches are going to cut it.


Did you have to do any preparation steps before you asked a model to do the large-scale change, or were there no steps involved? For example, did you simply ask for the change or did you give a model a chance to learn about the codebase? I am genuinely asking; I'm curious because I haven't had a chance to use those models at work.


There are simply no models that can keep in context the amount of info required in enterprise codebases before starting to forget or hallucinate.

I've tried to give it relevant context myself (a tedious task in itself to be honest) and even tools that claim to automatically be able to do so fail wonderfully at bigger than toy project size in my experience.

The codebase I'm working on day to day at this moment is give or take around 800,000 lines of code and this isn't even close to our largest codebase since its just one client app for our monolith.

Even trivial changes require touching many files. It would honestly take any average programmer less time to implement something themselves than trying to convince an LLM to complete it.


The largest context that I'm aware of an open-source model (e.g. Qwen) managing is 1M tokens. This should translate to ~30kLoC. I'd envision that this could in theory work even on large codebases. It certainly depends on the change to be made, but I can imagine that ~30kLoC of context is large enough for most module-specific changes. Possibly the models that you're using have a much smaller context window?

Then again, and I am repeating myself from other comments I made here in the topic, there's also Devin, which pre-processes the codebase before you can do anything else. That kinda makes me wonder if the current limitations that people observe in using those tools are really representative of the current state of the art.


If you don't mind me asking, what size of codebases do you typically work on? As mentioned I've tried using all the available commercial models and none work better than as a helpful autocomplete, test, and utility function generator. I'm sure maybe big players like Meta, OpenAI, MS, etc do have the capability of expanding its context for their own internal projects and training specifically on their code, but most of the rest of us can't feasibly do that since we don't own our own AI moat.

Even on my personal projects and smaller internal projects that are small toy projects or utility tools I sometimes struggle to get them to build anything significant. I'm not saying its impossible, but I always find it best at starting things from scratch, and small tools. Maybe its just a sign that AI would be best for microservices.

I've never used Devin, so I can't speak to it, but I do recall seeing that it was overhyped at best and struggled to do anything it was purported to be able to do in demos. Not saying that this is still true.

I would be interested in seeing how Devin performs on a large open-source project in real time (since, if I recall, their demos were not real-time demonstrations), just to evaluate its capabilities.


Several million lines of code. I can't remember any project that I was involved with that was less than 5MLoC. C++ system-level programming.

Overhyped or not, Devin is using something else under the hood, since it pre-processes your whole codebase. It's not "realtime" since it simulates the CoT, meaning that it "works" on the patch the very same way a developer would, and therefore it will give you a resulting PR in a few hours, AFAIR. I agree that a workable example on a more complex codebase would be more interesting.

> I've tried using all the available commercial models and none work better than as a helpful autocomplete, test, and utility function generator

That's why I mentioned Qwen: I think commercial AI models do not have such a large context window size. Perhaps, therefore, the experience would have been different.


And you have had luck with models like the one you mentioned and Devin generating significant amounts of code in these codebases? I would love to have this, for the productivity gains it should allow, but I've just never been able to demonstrate what the big AI coding services claim to be able to do at a large scale.

What they already do is a decent productivity boost but not nearly as much as they claim to be capable of.


As I already said in my first comment, I haven't used those models and any of them would have been forbidden at my work.

My point was rather that you might be observing suboptimal results only because you haven't used the models which are more fit, at least hypothetically, for your use case.


I've heard pretty mixed opinions about the touted capabilities of Devin.

https://www.itpro.com/software/development/the-worlds-first-...


That's good news for us I suppose.


> For example, did you simply ask for the change or did you give a model a chance to learn about the codebase.

I've tried it both with personal projects and work.

My personal project/benchmark is a 3D snake game. O3 is by far the best, but even with a couple hundred lines of code it wrote itself, it loses coherence and can't produce changes that involve changing 2 lines of code within a span of 50. It either cannot comprehend that it needs to touch multiple places, or it re-writes huge chunks of code and breaks other functionality.

At work, it's fine for writing unit tests on straightforward tasks that it most likely has seen examples of before. On domain-specific tasks it's not so good, and those tasks usually involve multiple file edits in multiple modules.

The denser the logic, the smaller the context where LLMs seem to be coherent. And that's funny, because LLMs seem to deal much better with changing code humans wrote than the code the LLMs wrote themselves.

Which makes me wonder -- if we're all replaced by AI, who will write the frameworks and programming languages themselves?


Thanks but IIUC you're describing a situation where you're simply using a model without giving it a chance to learn from the whole codebase? If so, then I was asking for the opposite where you would ingest the whole codebase and then let the model spit out the code. This in theory should enable the AI model to build a model of your code.

> if we're all replaced by AI, who will write the frameworks and programming languages themselves?

What for? There's enough programming languages and there's enough of the frameworks. How about using an AI model to maintain and develop existing complex codebases? IMHO if AI models become more sophisticated and are able to solve this, then the answer is pretty clear who will be doing it.


Not OP, but I've had the same experience, and that's with tools that purport to handle the context for you.

And frankly, if you can't automate context, then you don't have an AI tool that can realistically replace a programmer. If I have to manually select which of my 10000 files are relevant to a given query, then I still need to be in the loop and will likely end up doing almost as much work as I would have just writing the code.


I see that you deleted your previous response which was unnecessarily snarky while my question was genuine and simple I suppose.

> And frankly, if you can't automate context,

How about ingesting the whole codebase into the model? I have seen that this is possible with at least one such tool (Devin), which I believe uses a GPT model underneath, meaning that other providers could automate this step too. I am curious if that would help in generating more legit large-scale changes.


> I see that you deleted your previous response which was unnecessarily snarky while my question was genuine and simple I suppose.

You edited your comment to clarify that you were asking from a place of ignorance as to the tools. Your original comment read as snarky and I responded accordingly, deleting it when I realized that you had changed yours. :)

> How about ingesting the whole codebase into the model? I have seen that this is possible with at least one such tool (Devin), which I believe uses a GPT model underneath, meaning that other providers could automate this step too. I am curious if that would help in generating more legit large-scale changes.

It doesn't work. Even the models that claim to have really large context windows get very distracted if you don't selectively pick relevant context. That's why I always talk about useful context window instead of just plain context window—the useful context window is much lower and how much you have depends on the type of text you're feeding it.


I don't think my comment read as snarky, but I was surprised to see the immediate downvote, which presumably came from you, so I only added the last sentence. This is a stupid way of disagreeing and attempting to shut down the discussion without merit.

> It doesn't work. Even the models that claim to have really large context windows get very distracted if you don't selectively pick relevant context.

I thought Devin is able to pre-process the whole codebase, which could take up to a single day for larger codebases, so it must be doing something, e.g. indexing the code? If so, this isn't a context-window-specific thing; it's something else, and it makes me wonder how that works.


> I don't think my comment read as snarky, but I was surprised to see the immediate downvote, which presumably came from you, so I only added the last sentence.

I can't downvote you because you are downthread of me. HN shadow-disables downvotes on all child and grandchild comments.

I'm the one who upvoted you to counteract the downvote. :)


And then they hugged and became lifelong friends :)


You never know - one moment arguing on HN and the second moment you know, drinking at the bar lamenting on how AI is gonna replace us :)


Ok, sorry about that.


> How about ingesting the whole codebase into the model?

You keep referring to this vague idea of "ingesting the whole codebase". What does this even mean? Are you talking about building a codebase-specific RAG, fine-tuning a model, injecting the entire codebase into the system context, etc.?


It is vague because the implementation details you are asking me for are closed source, for obvious reasons. I can only guess what it does, but that's beside the point. The point is rather that Devin or a 1M-context-window Qwen model might be better or more resilient to the "lack of context" than what the others were suggesting.


Some codebases grown with AI assistance must be getting pretty large now, I think an interesting metric to track would be percent of code that is AI generated over time. Still isn't a perfect proxy for how much work the AI is replacing though, because of course it isn't the case that all lines of code would take the same amount of time to write by hand.


Yeah, that would be very helpful to track. Anecdotally, I have found in my own projects that the larger they get the less I can lean on agent/chat models to generate new code that works (without needing enough tweaks that I may as well have just written it myself). Having been written with models does seem to help, but it doesn't get over the problem that eventually you run out of useful context window.

What I have seen is that autocomplete scales fine (and Cursor's autocomplete is amazing), but autocomplete supplements a software engineer, it doesn't replace them. So right now I can see a world where one engineer can do a lot more than before, but it's not clear that that will actually reduce engineering jobs in the long term as opposed to just creating a teller effect.


It might not just be helpful but required one day. Depending on how the legality around AI-generated code plays out, it's not out of the question that companies using it will have to keep track of and check the provenance and history of their code, like many companies already do for any open source code that may leak into their project. My company has an "open source review" process to help ensure that developers aren't copy-pasting GPL'ed code or including copyleft libraries into our non-GPL licensed products. Perhaps one day it will be common to do an "AI audit" to ensure all code written complied with whatever the future regulatory landscape shapes up to be.


the kinds of projects that software engineers typically work on for pay

This assumes a typical project is fairly big and complex. Maybe I'm biased the other way, but I'd guess 90% of software engineers are writing boilerplate code today that could be greatly assisted by LLM tools. E.g., PHP is still one of the top languages, which means a lot of basic WordPress stuff that LLMs are great at.


The question isn't whether the code is complex algorithmically, the question is whether the code is:

* Too large to fit in the useful context window of the model,

* Filled with bunch of warts and landmines, and

* Connected to external systems that are not self-documenting in the code.

Most stuff that most of us are working on meets all three of these criteria. Even microservices don't help, if anything they make things worse by pulling the necessary context outside of the code altogether.

And note that I'm not saying that the tools aren't useful, I'm saying that they're nowhere near good enough to be threatening to anyone's job.


I'm surprised to see a huge disconnect between how I perceive things and the vast majority of comments here.

AI is obviously not good enough to replace programmers today. But I'm worried that it will get much better at real-world programming tasks within years or months. If you follow AI closely, how can you be dismissive of this threat? OpenAI will probably release a reasoning-based software engineering agent this year.

We have a system that is similar to top humans at competitive programming. This wasn't true 1 year ago. Who knows what will happen in 1 year.


Nobody can tell you whether progress will continue at current, faster or slower rates - humans have a pretty terrible track record at extrapolating current events into the future. It's like how movies in the 80's made predictions about where we'll be in 30 years time. Back to the Future promised me hoverboards in 2015 - I'm still waiting!


Compute power increases and algorithmic efficiency improvements have been rapid and regular. I'm not sure why you thought that Back to the Future was a documentary film.


Unless you have a crystal ball there is nothing that can give you certainty that will continue at the same or better rate. I’m not sure why you took the second half of the comment more seriously than the first.


Nobody has certainty about the future. We can only look at what seems most likely given the data.


When I see stuff like https://news.ycombinator.com/item?id=42994610 (continued in https://news.ycombinator.com/item?id=42996895), I think the field still has fundamental hurdles to overcome.


Why do you think this is a fundamental hurdle, rather than just one more problem that can be solved? I don't have strong evidence either way, but I've seen a lot of 'fundamental insurmountable problems' fall by the wayside over the past few years. So I'm not sure we can be that confident that a problem like this, for which we have very good classic algorithms, is a fundamental issue.


This kind of error doesn't really matter in programming where the output can be verified with a feedback loop.


This is not about the numerical result, but about the way it reasons. Testing is a sanity check, not a substitute for reasoning about program correctness.


It's the opposite. I don't think it'll legitimately replace programmers within a decade. I DO think that companies will try it a lot in the months and years ahead anyway, and that programmers will be the only ones suffering the consequences of such actions.


People somehow have expectations that are both too high and too low at the same time. They expect (demand) that current language models completely replace a human engineer in any field without making mistakes (this is obviously way too optimistic), while at the same time they ignore how rapid the progress has been and how much these models can now do that seemed impossible just 2 years ago, delivering huge value when used well, and they assume no further progress (this seems too pessimistic, even if progress is not guaranteed to continue at the same rate).


ChatGPT 4 was released 2 years ago. Personally I don't think things have moved on significantly since then.


Really now. I think that deserves a bit more explanation, given that the cost per token has dropped by several orders of magnitude, we have seen large changes on all benchmarks (including entirely new capabilities), multimodality has been a fact since 4o, and test-time compute with reasoning models has been making big strides since o1... It seems on the surface that a lot is happening. In fact, I wanted to share one of the benchmark overviews, but none include ChatGPT 4 anymore since it is no longer competitive.


Benchmarks are meaningless in and of themselves, they are supposed to be a proxy for usefulness. I have used Sonnet 3.5, ChatGPT-3, ChatGPT-3.5, ChatGPT-4, ChatGPT-4o, o1, o3-mini, o3-mini-high nearly daily for software development. I am not saying AI isn't cool or useful but I am experiencing diminishing returns in model quality (I do appreciate the cost reductions). The sorts of things I can have AI do really haven't changed that much since I got access to my first model. The delta between having no LLM to an LLM feels an order of magnitude bigger at least than the delta between the first LLM and now.


it's bigger, shinier, faster, but still doesn't fly


Exactly. I have been waiting for GPT-5 to see the delta, but after GPT-4 things seem to have stalled.


This seems like a bizarre claim on the surface, see also my other message above.

https://epoch.ai/data/ai-benchmarking-dashboard


It depends on what you work on in the software field. Many of these LLMs have pretty small context windows. In the real world, when my company wants to develop a new feature or change the business logic, that is a cross-cutting change (many repos/services). I work at a large org, for background. No LLM will be automating this for a long time to come, especially if you're in a specific domain that is niche.

If your project is very small, and it’s possible to feed your entire code base into an LLM in the near future, then you’re in trouble.

Also, the problem is that the LLM output is only as good as the prompt. 99% of the time the LLM won't be thinking about how to make your API change backwards compatible for existing clients, how to do a zero-downtime migration, how to follow security best practices, or how to handle a high volume of API traffic. Etc.

Not to mention, what the product team _thinks_ they want (business logic) is usually not what they really want. Happens ALL THE TIME friend. :) It’s like the offshoring challenge all over again. Communication with humans is hard. Communication with an LLM is even harder. Writing the code is the easiest part of my job!

I think some software development jobs will definitely be at risk in the next 10-15 years. Thinking this will happen in one year's time is myopic, in my opinion.


> If you follow AI closely, how can you be dismissive of this threat?

Just use a state of the art LLM to write actual code. Not just a PoC or an MVP, actual production ready code on an actual code base.

It’s nowhere close to being useful, let alone replacing developers. I agree with another comment that LLMs don’t cut it, another breakthrough is necessary.


https://tinyurl.com/mrymfwwp

We will see; maybe models do get good enough, but I think we are underestimating those last few percent of improvement.


It's a bit paradoxical. A smart enough AI, and there is no point in worrying, because almost everyone will be out of a job.

The problem case is the somewhat odd scenario where there is an AI that's excellent at software dev, but not most other work, and we all have to go off and learn some other trade.


A "causal model" is needed to fix bugs ie, to "root-cause" a bug.

LLMs don't yet have a causal model of how something works built in. What they do have is pattern matching over a large index and generation of plausible answers from that index. (Aside: the plausible snippets are of questionable licensing lineage, as the indexes could contain public code with restrictive licensing.)

Causal models require machinery which is symbolic, which is able to generate hypotheses and test and prove statements about a world. LLMs are not yet capable of this and the fundamental architecture of the llm machine is not built for it.

Hence, while they are a great productivity boost as a semantic search engine, and a plausible snippet generator, they are not capable of building (or fixing bugs in) a machine which requires causal modeling.


>Causal models require machinery which is symbolic, which is able to generate hypotheses and test and prove statements about a world. LLMs are not yet capable of this and the fundamental architecture of the llm machine is not built for it.

Prove that the human brain does symbolic computation.


We don't know what the human brain does, but we know it can produce symbolic theories or models of abstract worlds (in the case of math) or real worlds (in the case of science). It can also produce the "symbolic" Turing machine, which serves as an abstraction for all the computation we use (CPU/GPU/etc.).


Agreed, and I haven't yet seen any single instance of a company firing software engineers because AI is replacing them (even if by increasing productivity of another set of software engineers): I've asked this a number of times, and while it's a common refrain, I haven't really seen any concrete news report saying it in so many words.

And to be honest, if any company is firing software engineers hoping AI replaces their production, that is good news since that company will soon stop existing and treating engineers like shit which it probably did :)


Yes. I think part of the problem is how good it is at starting from a blank slate and putting together an MVP type app. As a developer, I have been thoroughly impressed by this. Then non-devs see this and must think software engineers are doomed. What they don't see is how terrible LLMs are at working with complex, mature codebases and the hallucinations and endless feedback loops that go with that.


The tech to quickly spin up MVP apps has been around for a while now. It gets you from a troubling blank slate to something with structure, something you can shape and build on.

I am of course talking about

  npx create-{template name}
Or your language of choice's equivalent (or git clone template-repo).


Yes, but the LLM-driven MVPs are not only boilerplate but actual functioning apps. The "create-" approach is somewhat good, but it's usually throwaway code that you redo properly later, while my LLM-made boilerplate is the actual first few steps toward getting the boring parts done. It also needs refactoring and polishing, but it's an order of magnitude better than the "MVP helper tooling" that came before.


A friend of mine reached out with some code ChatGPT wrote for him to trade crypto. It had so much random crap in it; lines would say "AI enhanced trading algo" and it was just an np.random.randint call. It was pulling in random deps that weren't even used.

I get it, though. Like, I'm terrible at working with IMUs and I want to just get something going, but I can't; there's that wall I need to overcome/learn, e.g. the math behind it. Same with programming: it helps to have the background, knowing how to read code and how it works.
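
To give a flavor of it, here's a hypothetical reconstruction of the kind of line I mean (not his actual code): a grand-sounding comment sitting on top of a coin flip.

  import numpy as np

  # "AI enhanced trading algo" -- in reality just a uniform random choice:
  # no data, no model, no strategy.
  def ai_enhanced_trading_signal() -> str:
      return "BUY" if np.random.randint(0, 2) == 1 else "SELL"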


I used Claude to help write a crypto trading bot. It helped me push out thousands of lines a day. What would've taken months took a couple of weeks. Obviously you still need experienced pilots, but unless we find an absolute fuckload of new work to do (not unlikely, looking at history), it's hard for me to see anything other than far fewer developers being needed.


The only thing I use it for is small, self-contained snippets of code for problems that require APIs I don't quite remember off the top of my head. The LLM spits out the calls I need to make or the attributes/config I need to set, and I go check the docs to confirm.

Like "How to truncate text with CSS alone" or "How to set an AWS EC2 instance RAM to 2GB using terraform"


> Now completing small chunks of mundane code, explaining code, doing very small mundane changes. Very good at.

I would not trust them until they can do the news properly. Just read the source, Luke.

"AI chatbots unable to accurately summarise news, BBC finds" - https://www.bbc.com/news/articles/c0m17d8827ko


I use AI coding assistants daily, and whenever there is a task that those tools cannot do correctly/quickly enough so that I need to fallback to editing things by myself, I spend a bit of time thinking what is so special about the tasks.

My observation is that LLMs do repetitive, boring tasks really well, like boilerplate code and common logic/basic UI that thousands of people have already written. Well, in some sense, jobs where developers spend a lot of time writing generic code are already at risk of being outsourced.

The tasks that need a ton of tweaking, or that are not worth asking AI about at all, are those that are very specific to a particular product and need to meet specific requirements that often come from discussions or meetings. Well, I guess in theory if we had transcripts of everything, AI could write code the way you want, but I doubt that's happening any time soon.

I have since become less worried about the pace at which AI will replace human programmers -- there is still a lot that these tools cannot do. But for sure people need to watch out and be aware of what's happening.


I dunno if it's always good at explaining code. It tends to take everything at face value and is unable to opinionatedly reject bs when it's presented with it. Which in the majority of cases is bad.


This is also my problem. When I ask someone a technical question and I didn't provide context on some abstraction (which is common, because abstractions can be very deep), they'll say, "Hmm, not sure... can you check what this is supposed to do?"

LLMs don't do this; they confidently hallucinate the abstraction out of thin air or use their outdated knowledge store, sending the wrong usage or wrong input parameters.


Same. LLMs are interesting, but on their own they're a dead end. I think something needs to actually experience the world in 3D in real time to understand what it is actually coding or doing tasks for.


I don’t know that it needs to experience the world in real-time, but when the brain thinks about things it’s updating its own weights. I don’t think attention is a sufficient replacement for that mechanism.

Reasoning LLMs feel like an attempt to stuff the context window with additional thoughts, which does influence the output, but is still a proxy for plasticity and aha-moments that can generate.


>I think this is true only if there is a novel solution that is in a drastically different direction than similar efforts that came before.

That's a good point; we don't do that right now. It's all very crystallized.


> actually experience the world in 3d in real time

AKA embodiment. Hubert L. Dreyfus discussed this extensively in "Why Heideggerian AI Failed and How Fixing it Would Require Making it More Heideggerian": http://dx.doi.org/10.1080/09515080701239510


> LLMs are interesting but on their own are a dead end.

I don't think that anyone is advocating for LLMs to be used "on their own". Isn't it like saying that airplanes are useless "on their own" in 1910, before people had a chance to figure out proper runways and ATC towers?


there was that post about "vibe coding" here the other day if you want to see what the OP is talking about


You mean Karpathy's post discussed on https://twitter.com/karpathy/status/1886192184808149383 ?

If so, I quite enjoyed that as a way of considering how LLM-driven exploratory coding has now become feasible. It's not quite there yet, but we're getting closer to a non-technical user being able to create a POC on their own, which would then be a much better point for them in engaging an engineer. And it will only get better from here.


Technology to allow business people to create POCs has been around for a long time.


All previous examples have been of the "no code" variety, where you press buttons that control presets the creators of the authoring tool prepared for you. This is the first time you can talk to it and have it write arbitrary code for you. You can argue that it's not a good idea, but it is a novel development.


A no code solution at its most basic level is nothing more or less than a compiler.

You wouldn’t argue that writing in a high level language doesn’t let you produce arbitrary code because the compiler is just spitting out presets its author prepared for you.

There are 2 main differences between using an LLM to build an app for you and using a no code solution with a visual language.

1. The source code is English (which is definitely more expressive).

2. The output isn’t deterministic (even with temperature set to 0 which is probably not what you want anyway)

Both 1 and 2 are terrible ideas. I’m not sure which is worse.
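
To make point 2 concrete, here's a minimal sketch of the knob in question, assuming the OpenAI Python client (the model name and prompt are placeholders). Even with temperature pinned to 0, providers only promise best-effort reproducibility, so the same request can still come back with different code:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Write a function that reverses a linked list."}],
        temperature=0,  # least-random sampling: output varies less, but isn't guaranteed identical
    )
    print(resp.choices[0].message.content)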


I just outright disagree. What this vibe coding is a substitute for is finding a random dev on Fiverr, which inherently suffers from your "1 and 2". And I'd argue that vibe coding already offers more bang for your buck than the median dev on Fiverr.


Low code/no code solutions were already a substitute for finding a random dev on Fiverr, which was almost always a terrible way to solve almost any problem.

The median dev on Fiverr is so awful that almost anything is more bang for your buck.


I think of it as an enabler that reduces my dependency on junior developers. Instead of delegating simple stuff to them, I now do it myself with about the same amount of overhead (have to explain what I want, have to triple check the results) on my side but less time wasted on their end.

A lot of micromanaging is involved either way. And most LLMs suffer from a severe case of Groundhog Day: you can't expect them to remember anything over time. Every conversation starts from scratch. If it's not in your recent context, specify it again. Etc. Quite tedious, but it still beats me doing it manually. For some things.

For at least the next few years, customers are going to expect that you won't waste their time on stuff they could have just asked an LLM to do for them. I've recently watched two non-technical CPO and CEO types figure out how to get a few simple projects done with LLMs. One is actually tackling Rust programs now. The point isn't that the code is good, but that neither of them would have dreamed of doing anything themselves a few years ago. The scope of what you can get done quickly is increasing.

LLMs are worse at modifying existing code than they are at creating new code. Every conversation is a new conversation: Groundhog Day, every day. Modifying something with a lot of history and context requires larger context windows and tools to fill them. The tools are increasingly becoming the bottleneck, because without context the whole thing derails, and micromanaging a lot of context is a chore.

And a big factor here is that huge context windows are costly, so service providers have an incentive to cut corners there. Most of the value for me these days comes from LLM tool improvements that let me type less. "Fix this" now means "fix the thing under my cursor in my open editor, with the full context of that file". I've been doing this a lot for the past few weeks.
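
As a rough illustration of what a "fix this" shortcut has to gather behind the scenes, here's a hypothetical sketch; the helper name and prompt layout are made up, not any particular tool's implementation:

    from pathlib import Path

    def build_fix_prompt(file_path: str, cursor_line: int, instruction: str = "Fix this.") -> str:
        """Hypothetical helper: expand a terse 'fix this' into a prompt carrying
        the full file plus the line the user's cursor is on."""
        source = Path(file_path).read_text()
        lines = source.splitlines()
        target = lines[cursor_line - 1] if 0 < cursor_line <= len(lines) else ""
        return (
            f"{instruction}\n\n"
            f"The user's cursor is on line {cursor_line}: {target}\n\n"
            f"Full contents of {file_path}:\n{source}"
        )

The point isn't the prompt format; it's that every invocation has to re-ship this context, which is why corner-cutting on context windows hurts so much.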


> It's incredibly far away from doing any significant change in a mature codebase

The COBOL crisis at Y2K comes to mind.


Is this the same Cobol crisis we have now?

https://www.computerweekly.com/news/366588232/Cobol-knowledg...


I think it's more likely that we'll see a rise in workflows that AI is good at, rather than AI rising to meet the challenges of our more complex workflows.

Let the user pair with an AI to edit and hot-reload the subset of the code that needs to be closely adapted to the problem domain, with the AI fine-tuned for the task at hand. If that doesn't cut it, have the user file issues when they need an engineer to alter the interface they and the AI are working against.

I guess this would resemble how MySpace used to do it, where you'd get a text box for custom edits, but you couldn't change the interface.


As context sizes get larger (and accuracy holds across that size) and speeds increase, especially inference, it will start solving these large, complex codebases.

I think people lose sight of how much better it has gotten in just a few years.


AI will create more jobs, if anything, as "engineers" out of their depth create massive amounts of unmaintainable legacy code.


It's Access databases all over again.


> I'm thinking there's going to have to be some other breakthrough or something other than LLM's.

We actually _need_ a breakthrough for the promises to materialize, otherwise we will have yet another AI Winter.

Even though there seems to be some emergent behavior (some evidence that LLMs can, for example, create an internal chess representation by themselves when asked to play), that's not enough. We'll end up with diminishing returns. Investors will get bored of waiting and this whole thing comes crashing down.

We'll get a useful tool in our toolbox, as we do in every AI cycle.


+1. I've tried really hard to replace even some parts of my job with AI ever since the GPT-3 era, unsuccessfully. All it does for me is let me enter unfamiliar domains (SwiftUI, for example), but then I'm on my own. In domains where I already have expertise it just doesn't work well. So it's a productivity booster, sure, but I don't see it replacing anyone doing non-bullshit work. I don't even see a trend line pointing in that direction.



> It's incredibly far away from doing any significant change in a mature codebase

A lot of the use cases involve building something that has already been built before: a web app, a popular algorithm, etc. I think the real threat to us programmers is stagnation. If we have no new use cases to develop, only marginal changes to introduce, then we can surely use AI to generate our code from the vast amount of previous work.


Sorry for inadvertently advertising, but I met a guy who used v0.dev to make impressive websites with professional success (admittedly he had used React before, so he was experienced). It's more than arguable that his company will fire devs or hire fewer of them. Of course, in a decade or so there'll be a skill gap, unless LLMs can fill that gap too.


The huge disconnect is that the skill set for using LLMs effectively for code is not the same as the skill set of standard software engineering. There is a heavy intersection, and I'd say you can't be effective at LLM-driven software development without being an effective software engineer, but being an effective software engineer does not by any means make somebody good at LLM-driven development.

Very talented engineers, coworkers I would place above myself in skill, seem stumped by it, while I have realized at least a 10x productivity gain.

The claim that LLMs are not being applied to mature, complex codebases is pure fantasy. For example: https://arxiv.org/abs/2501.06972, where Google uses LLMs to accelerate the migration of mature, complex production systems.


> It's incredibly far away from doing any significant change in a mature codebase.

I agree with this completely. However, the problem I think the article gets at is still real, because junior engineers also can't make significant changes to a mature codebase when they first start out. They used to do the "easy stuff", which freed the rest of us up to do bigger stuff. But:

1. Companies like mine don't hire juniors anymore

2. With Copilot I can be so much more productive that I don't need juniors to do "the easy stuff" because Copilot can easily do that in 1/1000th the time a junior would.

3. So now who is going to train those juniors to get to the level where we need them to be to make those "significant changes"?


> So now who is going to train those juniors to get to the level where we need them to be to make those "significant changes"?

Founders will cash out long before that becomes an issue. Alternatively, the hype is true and they will obsolete programmers, also solving the issue above…

This is quite devious if you think about it: the pipeline of new devs withers, and in every case they're the only ones with an immediate fix.


> Now completing small chunks of mundane code, explaining code, doing very small mundane changes. Very good at.

This is the only current threat. The time you save as a developer using AI on mundane stuff will get filled by something else, possibly more mundane stuff.

A small company with only 2-5 seniors may not be able to drop anyone. A company with 100 seniors might be able to drop 5-10 in total, spread across teams.

The first cuts will come at scaled companies. However, it's difficult to tell whether companies are cutting people just to save money or actually realizing productivity gains from AI at this point.


Especially since the zero-interest bonanza led to over-hiring of resume-driven developers. Half of AWS is torching energy by running bloat that shouldn't even be there.


I don't think companies realize AI is not free. With 100+ devs, the OpenAI, Anthropic, and Gemini API costs add up, plus the hidden overhead nobody talks about.

There's too much speculation that productivity will increase substantially, especially when most companies' IT is so broken and archaic.


I think you're discounting efficiency gains: through a series of individually minor breakthroughs in LLM tech, I think we could end up with things like 100M+ token context windows.

We've already seen this sort of incrementalism over the past couple of years: the initial buzz started with little more than a 2,048-token context window, and we're now seeing 1M-token models that are significantly more capable.


Marketing is really good at its job.

That, coupled with new money and retail investors thinking they're in a gold rush, and you get the environment we're in.


I’m more keen on formal methods to do this than LLMs. I take the view that we need more precise languages that require us to write less code that obviously has no errors in it. LLMs are primed to generate more code using less precise specifications; resulting in code that has no obvious errors in it.


> We know what it is good at and what it's not.

We know what it's good at today. And we can be pretty sure it won't be any worse at it in the future. Five years ago the state of the art was basically the output of a Markov chain. In five years we might be in another place entirely.


Been heavily testing Cursor, Windsurf and VSCode w/CP lately. Most low-level stuff works surprisingly well. But for anything slightly more complex I just end up wasting 90% of my credits watching the AI chase its own tail.


My favorite has been everyday people claiming Elmo will find all sorts of corruption on Fed databases using AI. Trained on what datasets? What biases? Etc.


I agree, but too many serious people are hinting that we're very close for me to ignore it anymore. Sure, when Sam Altman or Zuckerberg say we're close, I don't know if I can believe them, because obviously those dudes will say anything to sell or pump the stock price. But how about Demis Hassabis? He doesn't strike me that way at all. Same for Geoff Hinton, Bengio, and a couple of others.


People investing their lives in the field are inherently biased. This is not to diminish them; it's just a fact of the matter. Nobody knows how general intelligence really works, nor even how to reliably test for it, so it's all speculation.


Market hype is all it is.


They all sound like crypto bros talking about AI. It's really frustrating to talk to them, just like crypto bros.


They're the same people in my experience.


It's the same energy for sure.


A breakthrough is exactly what everyone is banking on. OpenAI was surprised by GPT-3: they were just dumping data into an LLM and ended up with something way better than they expected.

Everyone is hoping (probably delusionally) that bigger and more impressive breakthroughs will keep turning up if we just keep tweaking the models and increasing the size of the data sets.


You missed the companies selling AI consulting projects, with the disconnect between sales team, customer, folks on the customer side, consultants doing the delivery, and what actually gets done.


My take is that AI's ability to generate new code will prove so valuable, it will not matter if it is bad at changing existing code. And that the engineers of the distant future (like, two years from now) will not bother to read the generated code, as long as it runs and passes the tests (which will also be AI-generated).

I try to use AI daily, and every month I see it generating larger and more complex chunks of code on the first shot. It's almost there. We just need to adopt the new paradigm, build the tooling, and embrace the new, weird future of software development.


> I try to use AI daily

You should reflect on the consequences of relying too much on it.

See https://www.404media.co/microsoft-study-finds-ai-makes-human...


I don't buy that it makes me dumber. It just makes me worse at some things I used to do, while making me better at some others. Often it doesn't feel like coding anymore; it's more like training to be a lawyer or something. But that's my bet.



