And here in largely vegetarian India, everyone is now pushing for more protein and meats because a vegetable-heavy diet has been awful for our public health
Even if Indians ate 2x the meat that they do now, they wouldn't consume anywhere near as much as Americans do. Increasing meat consumption in America is not necessary.
India would do well to consume more protein, and the US would do well to consume less
If you want to actually get full and satiated with a largely vegetarian diet, you will eventually resort to carbs
And this is for a culture that really knows how to make smashingly good vegetarian dishes
I love my vegetables, but a vegetable-heavy diet is clearly not something that everyone can or should do. The people I know who retain their health with vegetarian/vegan diets are usually really well-versed in nutrition
If you look at a lot of the Indian vegetarian dishes, you'll find things like potatoes fried in butter being a staple.
Chickpeas and yogurt do make a showing, but a lot of Indian dishes are devoid of vegetarian protein sources. You need a lot more beans/nuts if you want to eat healthy as a vegetarian.
> a lot of Indian dishes are devoid of vegetarian protein sources
What about legumes -- daal (pulses) and chickpeas? They have plenty of protein as far as plant sources go. Also: Paneer. What I find in practice: you get a tiny amount of legumes/paneer, and a huge amount of carbs.
Has your government published any science on this? Being completely serious, I'd like to read it. Is India mostly vegetarian because of lack of access to farms/meat, religious reasons, finances, or what? I didn't know it was largely vegetarian. I don't know that I had any idea of the ratio, or that it would be different from any other country.
Apparently the Mediterranean is also largely vegetarian; at least the eponymous diet is.
Most branches of Hinduism condemn meat eating, so this has created significant pressure against meat production (just as you'll find little pork production in the Middle East and North Africa). This is not universal, of course, because historically many regions of India had large meat-eating Muslim populations as well.
Note that this is typically lacto-ovo-vegetarianism, not veganism.
Well, I never read the article because of the paywall, but there is a WSJ headline today about a $160k mechanic job at Ford that can't be filled because there's no labor.
It's very good at following instructions. You can build dedicated agents for different tasks (backend, API design, database design) and make it follow design and coding patterns.
It's verbose by default but a few hours of custom instructions and you can make it code just like anyone
What's even the point of this comment if you self-admittedly don't have access to the flagship tool that everyone has been using to make these big bold coding claims?
I believe part of why Claude Code is so great is that it has the chance to catch its own mistakes. It can run compilers, linters, and browsers and check its own output. If it makes a mistake, it takes one or two extra iterations until it gets it right.
I really think a lot of people tried AI coding earlier, got frustrated at the errors and gave up. That's where the rejection of all these doomer predictions comes from.
And I get it. Coding with Claude Code really was prompting something, getting errors, and asking it to fix them. Which was still useful, but I could see why a skilled coder adding a feature to a complex codebase would just give up.
Opus 4.5 really is at a new tier however. It just...works. The errors are far fewer and often very minor - "careless" errors, not fundamental issues (like forgetting to add "use client" to a Next.js client component).
This was me. I was a huge AI coding detractor on here for a while (you can check my comment history). But, in order to stay informed and not just be that grouchy curmudgeon all the time, I kept up with the models and regularly tried them out. Opus 4.5 is so much better than anything I've tried before, I'm ready to change my mind about AI assistance.
I even gave -True Vibe Coding- a whirl. Yesterday, from a blank directory and text file list of requirements, I had Opus 4.5 build an Android TV video player that could read a directory over NFS, show a grid view of movie poster thumbnails, and play the selected video file on the TV. The result wasn't exactly full-featured Kodi, but it works in the emulator and actual device, it has no memory leaks, crashes, ANRs, no performance problems, no network latency bugs or anything. It was pretty astounding.
Oh, and I did this all without ever opening a single source file or even looking at the proposed code changes while Opus was doing its thing. I don't even know Kotlin and still don't know it.
I have a few Go projects now and I speak Go as well as you speak Kotlin. I predict that we'll see some languages really pull ahead of others in the next few years based on their advantages for AI-powered development.
For instance, I always respected types, but I'm too lazy to go spend hours working on types when I can just do ruby-style duck typing and get a long ways before the inevitable problems rear their head. Now, I can use a strongly typed language and get the advantages for "free".
> I predict that we'll see some languages really pull ahead of others in the next few years based on their advantages for AI-powered development.
Oh absolutely. I've been using Python for the past 15 or so years for everything.
I've never written a single line of Rust in my life, and all my new projects are Rust now, even the quick-script-throwaway things, because it's so much better at instantly screaming at claude when it goes off track. It may take it longer to finish what I asked it to do, but requires so much less involvement from me.
I will likely never start another new project in python ever.
EDIT: Forgot to add that paired with a good linter, this is even more impressive. I told Claude to come up with the most masochistic clippy configuration possible, where even a tiny mistake is instantly punished and exceptions have to be truly exceptional (I have another agent that verifies this each run).
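For a sense of what that looks like, here's a minimal hand-written sketch of the idea as crate-level lint attributes (my real config is a Claude-generated Cargo.toml/clippy.toml, so treat these specific lints as illustrative, not as what Claude actually picked):

    // Illustrative only: a small, strict selection of lint levels.
    // Any warning or flagged pattern fails the build outright.
    #![deny(warnings)]            // every compiler warning becomes an error
    #![deny(clippy::all)]         // the default clippy lint groups
    #![deny(clippy::pedantic)]    // stricter style and correctness lints
    #![deny(clippy::unwrap_used)] // no .unwrap() sneaking into production code
    #![deny(clippy::expect_used)] // handle the error properly instead

    fn main() {
        println!("if this passes clippy, it earned it");
    }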
I just wish there was cargo-clippy for enforcing architectural patterns.
And with types, it makes it easier for rounds of agents to pick up mistakes at compile time, statically. Linting and sanity-checking untyped languages only goes so far.
I've not seen LLMs one-shot Perl-style regexes, and JavaScript can still have ugly runtime WTFs.
Going to one-up you though -- here's a literal one-liner that gets me a polished media center with beautiful interface and powerful skinning engine. It supports Android, BSD, Linux, macOS, iOS, tvOS and Windows.
Hah! I actually initiated the project because I'm a long time XBMC/Kodi user. I started using it when it was called XBMC, on an actual Xbox 1. I am sick and tired of its crashing, poor playback performance, and increasingly bloated feature set. It's embarrassing when I have friends or family over for movie night, and I have to explain "Sorry folks, Kodi froze midway through the movie again" while I frantically try to re-launch/reboot my way back to watching the movie. VLC's playback engine is much better but the VLC app's TV UX is ass. This application actually uses the libVLC playback engine under the hood.
I think anecdotes like this may prove very relevant in the next few years. AI might make bad code, but a project of bad code that's still way smaller than a bloated alternative, and has a UX tailored to your exact requirements, could be compelling.
A big part of the problem with existing software is that humans seem to be pretty much incapable of deciding a project is done and stop adding to it. We treat creating code like a job or hobby instead of a tool. Nothing wrong with that, unless you're advertising it as a tool.
Yea, after this little experiment, I feel like I can just go through every big, bloated, slow, tech-debt-ridden software I use and replace it with a tiny, bespoke version that does only what I need and no more.
The old adage about how "users use 10% of your software's features, but they each use a different 10%" can now be solved by each user just building that 10% for themselves.
How do you know “it has no memory leaks, crashes, ANRs, no performance problems, no network latency bugs or anything” if you built it just yesterday? Isn’t it a bit too early for claims like this? I get it’s easy to bring ideas to life but aren’t we overly optimistic?
Part of the "one day" development time was exhaustively testing it. Since the tool's scope is so small, getting good test coverage was pretty easy. Of course, I'm not guaranteeing through formal verification methods that the code is bug free. I did find bugs, but they were all areas that were poorly specified by me in the requirements.
I decided to vibe code something myself last week at work. I've been wanting to create a POC that involves a coding agent creating custom Bokeh plots that a user can interact with and ask follow-up questions about. All of this had to be served using the HoloViews Panel library.
At work I only have access to Claude using the GitHub Copilot integration, so this could be the cause of my problems. Claude was able to get the first iteration up pretty quickly. At that stage the app could create a plot and you could interact with it and ask follow-up questions.
Then I asked it to extend the app so that it could generate multiple plots and the user could interact with all of them one at a time. It made a bunch of changes but the feature was never implemented. I asked it to do it again but got the same outcome. I completely accept that it could all be because I am using VS Code Copilot or my prompting skills are not good, but the LLM got 70% of the way there and then completely failed.
> At work I only have access to Claude using the GitHub Copilot integration, so this could be the cause of my problems.
You really need to at least try Claude Code directly instead of using Copilot. My work gives us access to Copilot, Claude Code, and Codex. Copilot isn't close to the other, more agentic products.
Do they manage context differently or have different system prompts? I would assume a lot of that would be the same between them. I think GH Copilot's biggest shortcoming is that they are too token-cheap, aggressively managing context to the detriment of the results. Watching Claude read a 500-line file in 100-line chunks just makes me sad.
I recently replaced my monitor with one that could be vertically oriented, because I'm just using Claude Code in the terminal and not looking at file trees at all
but I do want a better way to glance at and keep up with what it's doing in longer conversations, for my own mental context window
Ah, but you’re at the beginning stage, young grasshopper. Soon you will be missing that horizontal ultrawide monitor as you spin up 8 different Claude agents in parallel sessions.
oh I noticed! I've begun doing that on my laptop. I just started going down all my list of sideprojects one by one, then two by two, a Claude Code instance in a terminal window for each folder. It's a bit mental
I'm finding that branding and graphic design is the most arduous part, and that's what I'm hoping to accelerate soon. I'm heavily AI-assisted there too and I'm evaluating MCP servers to help, but so far I do actually have to focus on just that part, as opposed to babysitting.
Thanks for posting this. It's a nice reminder that despite all the noise from hype-mongers and skeptics in the past few years, most of us here are just trying to figure this all out with an open mind and are ready to change our opinions when the facts change. And a lot of people in the industry that I respect on HN or elsewhere have changed their minds about this stuff in the last year, having previously been quite justifiably skeptical. We're not in 2023 anymore.
If you were someone saying at the start of 2025 "this is a flash in the pan and a bunch of hype, it's not going to fundamentally change how we write code", that was still a reasonable belief to hold back then. At the start of 2026 that position is basically untenable: it's just burying your head in the sand and wishing for AI to go away. If you're someone who still holds it you really really need to download Claude Code and set it to Opus and start trying it with an open mind: I don't know what else to tell you. So now the question has shifted from whether this is going to transform our profession (it is), to how exactly it's going to play out. I personally don't think we will be replacing human engineers anytime soon ("coders", maybe!), but I'm prepared to change my mind on that too if the facts change. We'll see.
I was a fellow mind-changer, although it was back around the first half of last year when Claude Code was good enough to do things for me in a mature codebase under supervision. It clearly still had a long way to go but it was at that tipping point from "not really useful" to "useful". But Opus 4.5 is something different - I don't feel I have to keep pulling it back on track in quite the way I used to with Sonnet 3.7, 4, even Sonnet 4.5.
For the record, I still think we're in a bubble. AI companies are overvalued. But that's a separate question from whether this is going to change the software development profession.
The AI bubble is kind of like the dot-com bubble in that it's a revolutionary technology that will certainly be a huge part of the future, but it's still overhyped (i.e. people are investing without regard for logic).
We were enjoying cheap second hand rack mount servers, RAM, hard drives, printers, office chairs and so on for a decade after the original dot com crash. Every company that went out of business liquidated their good shit for pennies.
I'm hoping after AI comes back down to earth there will be a new glut of cheap second hand GPUs and RAM to get snapped up.
Right. And same for railways, which had a huge bubble early on. Over-hyped on the short time horizon. Long term, they were transformative in the end, although most of the early companies and early investors didn’t reap the eventual profits.
At the time it was overhyped because just by adding .com to your company's name you could increase your valuation regardless of whether or not you had anything to do with the internet. Is that not stupid?
I think my comparison is apt; being a bubble and a truly society-altering technology are not mutually exclusive, and by virtue of it being a bubble, it is overhyped.
There was definitely a lot of stupid stuff happening. IMO the clearest accurate way to put it is that it was overhyped for the short term (hence the crazy high valuations for obvious bullshit), and underhyped for the long term (in the sense that we didn't really foresee how broadly and deeply it would change the world). Of course, there's more nuance to it, because some people had wild long-term predictions too. But I think the overall, mainstream vibe was to underappreciate how big a deal it was.
> Oh, and I did this all without ever opening a single source file or even looking at the proposed code changes while Opus was doing its thing. I don't even know Kotlin and still don't know it.
This is what people are still doing wrong. Tools in a loop people, tools in a loop.
The agent has to have the tools to detect whatever it just created is producing errors during linting/testing/running. When it can do that, I can loop again, fix the error and again - use the tools to see whether it worked.
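To make that concrete, here's a toy sketch of such a loop, in Rust just as an example; `ask_agent_to_fix` is a hypothetical stand-in for however you hand the output back to the model, not a real API:

    use std::process::Command;

    // Toy "tools in a loop": run the project's own checks, hand any failures
    // back to the agent, and try again until the checks pass.
    fn ask_agent_to_fix(errors: &str) {
        // In a real setup this would send the errors to the coding agent
        // and let it edit the code before the next attempt.
        eprintln!("handing {} bytes of errors back to the agent", errors.len());
    }

    fn main() {
        for attempt in 1..=5 {
            let out = Command::new("cargo")
                .args(["test", "--quiet"])
                .output()
                .expect("failed to run cargo");
            if out.status.success() {
                println!("checks passed on attempt {attempt}");
                return;
            }
            // Compiler errors land on stderr, test failures on stdout; capture both.
            let mut errors = String::from_utf8_lossy(&out.stderr).into_owned();
            errors.push_str(&String::from_utf8_lossy(&out.stdout));
            ask_agent_to_fix(&errors);
        }
        eprintln!("still red after 5 attempts; time for a human to look");
    }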
I _still_ encounter people who think "AI programming" is pasting stuff into ChatGPT on the browser and they complain it hallucinates functions and produces invalid code.
Last weekend I was debugging some blocking issue on a microcontroller with embassy-rs, where the whole microcontroller would lock up as soon as I started trying to connect to an MQTT server.
I was having Opus investigate it and I kept building and deploying the firmware for testing.. then I just figured I'd explain how it could do the same and pull the logs.
Off it went, for the next ~15 minutes it would flash the firmware multiple times until it figured out the issue and fixed it.
There was something so interesting about seeing a microcontroller on the desk being flashed by Claude Code, with LEDs blinking indicating failure states. There's something about it not being just code on your laptop that felt so interesting to me.
But I agree, absolutely, red/green test or have a way of validating (linting, testing, whatever it is) and explain the end-to-end loop, then the agent is able to work much faster without being blocked by you multiple times along the way.
This is kind of why I'm not really scared of losing my job.
While Claude is amazing at writing code, it still requires human operators. And even experienced human operators are bad at operating this machinery.
Tell your average Joe - the one who thinks they can create software without engineers - what "tools-in-a-loop" means, and they'll make the same face they made when you tried explaining iterators to them, before LLMs.
Explain to them how a type system, E2E or integration tests help the agent, and suddenly they have to learn all the things they would be required to learn to write it on their own.
I have been out of the loop for a couple of months (vacation). I tried Claude Opus 4.5 at the end of November 2025 with the corporate Github Copilot subscription in Agent mode and it was awful: basically ignoring code and hallucinating.
My team is using it with Claude Code and say it works brilliantly, so I'll be giving it another go.
How much of the value comes from Opus 4.5, how much comes from Claude Code, and how much comes from the combination?
I strongly concur with your second statement. Anything other than agent mode in GH copilot feels useless to me. If I want to engage Opus through GH copilot for planning work, I still use agent mode and just indicate the desired output is whatever.md. I obviously only do this in environments lacking a better tool (Claude Code).
I suspect that's the other thing at play here; many people have only tried Copilot because it's cheap with all the other Microsoft subscriptions many companies have. Copilot frankly is garbage compared to Cursor/Claude, even with the same exact models.
My issue for a long time now hasn't been that the code they write works or doesn't work. My issues all stem from the fact that it works, but does the wrong thing.
> My issues all stem from the fact that it works, but does the wrong thing
It's an opportunity, not a problem, because it means there's a gap in your specifications and then in your tests.
I use Aider, not Claude, but I run it with Anthropic models. And what I found is that comprehensively writing up the documentation for a feature, spec-style, before starting eliminates a huge amount of what you're referring to. It serves a triple purpose: (a) you get the documentation, (b) you guide the AI, and (c) it's surprising how often this helps to refine the feature itself. Sometimes I invoke the AI to help me write the spec as well, asking it to prompt for areas where clarification is needed etc.
This is how Beads works, especially with Claude Code. What I do is tell Claude to always create a Bead when I tell it to add something, or about something that needs to be added. Then I start brainstorming, and even ask it to do market research on what top apps are doing for x, y or z. Then I ask it to update the bead (I call them tasks) and finally, when it's got enough detail, I tell it to do all of these in parallel.
There are several rubs with that operating protocol extending beyond the "you're holding it wrong" claim.
1) There exists a threshold, only identifiable in retrospect, past which it would have been faster to locate or write the code yourself than to navigate the LLM's correction loop or otherwise ensure one-shot success.
2) The intuition and motivations of LLMs derive from a latent space that the LLM cannot actually access. I cannot get a reliable answer on why the LLM chose the approaches it did; it can only retroactively confabulate. Unlike human developers who can recall off-hand, or at least review associated tickets and meeting notes to jog their memory. The LLM prompter always documenting sufficiently to bridge this LLM provenance gap hits rub #1.
3) Gradually building prompt dependency where one's ability to take over from the LLM declines and one can no longer answer questions or develop at the same velocity themselves.
4) My development costs increasingly being determined by the AI labs and hardware vendors they partner with. Particularly when the former will need to increase prices dramatically over the coming years to break even with even 2025 economics.
Since we asked you to stop hounding another user in this manner and you've continued to do it repeatedly, I've banned the account. This is not what Hacker News is for, and you've done it almost 50 times (!), almost 30 of which have been after we first asked you to stop. That is extreme, and totally unacceptable.
Many people - simonw is the most visible of them, but there are countless others - have given up trying to convince folks who are determined not to be convinced, and are simply enjoying their increased productivity. This is not a competition or an argument.
Maybe they are struggling to convince others because they are unable to produce evidence that is able to convince people?
My experience scrolling X and HN is a bunch of people going "omg opus omg Claude Code I'm 10x more productive" and that's it. Just hand wavy anecdotes based on their own perceived productivity. I'm open to being convinced but just saying stuff is not convincing. It's the opposite, it feels like people have been put under a spell.
I'm following The Primeagen; he's doing a series where he tries these tools on stream and follows people's advice on how to use them best. He's actually quite a good programmer so I'm eager to see how it goes. So far he isn't impressed and thus neither am I. If he cracks it and unlocks significant productivity then I will be convinced.
>> Maybe they are struggling to convince others because they are unable to produce evidence that is able to convince people?
Simon has produced plenty of evidence over the past year. You can check their submission history and their blog: https://simonwillison.net/
The problem with people asking for evidence is that there's no level of evidence that will convince them. They will say things like "that's great but this is not a novel problem so obviously the AI did well" or "the AI worked only because this is a greenfield project, it fails miserably in large codebases".
It's true that some people will just continually move the goalposts because they are invested in their beliefs. But that doesn't mean that the skepticism around certain claims isn't relevant.
Nobody serious is disputing that LLMs can generate working code. They dispute claims like "Agentic workflows will replace software developers in the short to medium term", or "Agentic workflows lead to 2-100x improvements in productivity across the board". This is what people are looking for in terms of evidence, and there just isn't any.
Thus far, we do have evidence that AI (at least in OSS) produces a 19% decrease in productivity [0]. We also have evidence that it harms our cognitive abilities [1]. Anecdotally, I have found myself lazily reaching for LLM assistance when encountering a difficult problem instead of thinking deeply about the problem. Anecdotally, I also struggle to be more productive using AI-centric agentic workflows in my areas of expertise.
We want evidence that "vibe engineering" is actually more productive across the entire lifespan of a software project. We want evidence that it produces better outcomes. Nobody has yet shown that. It's just people claiming that because they vibe coded some trivial project, all of software development can benefit from this approach. Recently a principal engineer at Google claimed that Claude Code wrote their team's entire year's worth of work in a single afternoon. They later walked that claim back, but most do not.
I'm more than happy to be convinced, but it's becoming extremely tiring to hear the same claims being parroted without evidence, and then you get called a luddite when you question it. It's also tiring when you push them on it and they blame it on the model you use, and then the agent, and then the way you handle context, and then the prompts, and then "skill issue". Meanwhile all they have to show is some slop that could be hand coded in a couple hours by someone familiar with the domain. I use AI, and I was pretty bullish on it for the last two years, but the combination of it simply not living up to expectations + the constant barrage of what feels like a stealth marketing campaign parroting the same thing over and over (the new model is way better, unlike the other times we said that) + the amount of absolute slop code that seems to continue to increase + companies like Microsoft producing worse and worse software as they shoehorn AI into every single product (Office was renamed to Copilot 365) has made me very sensitive to it, much in the same way I was very sensitive to the claims being made by certain VC-backed webdev companies regarding their product + framework in the last few years.
I'm not even going to bring up the economic, social, and environmental issues because I don't think they're relevant, but they do contribute to my annoyance with this stuff.
> Thus far, we do have evidence that AI (at least in OSS) produces a 19% decrease in productivity
I generally agree with you, but I'd be remiss if I didn't point out that it's plausible that the slowdown observed in the METR study was at least partially due to the subjects' lack of experience with LLMs. Someone with more experience performed the same experiment on themselves, and couldn't find a significant difference between using LLMs and not [0]. I think the more important point here is that programmers' subjective assessment of how much LLMs help them is not reliable, and biased towards the LLMs.
I think we're on the same page re: that study. Actually, your link made me think about the ongoing debate around IDEs vs stuff like Vim. Some people swear by IDEs and insist they drastically improve their productivity; others dismiss them or even claim they make them less productive. Sound familiar? I think it's possible these AI tools are simply another way to type code, and the differences, averaged out, end up being a wash.
IDEs vs vim makes a lot of sense. AI really does feel like using an IDE in a certain way
Using AI for me absolutely makes it feel like I'm more productive. But when I look back on my work at the end of the day and look at what I got done, it would be ludicrous to say it was multiple times the amount of my output pre-AI.
Despite all the people replying to me saying "you're holding it wrong", I know the fix for it doing the wrong thing: specify in more detail what I want. The problem with that is twofold:
1. How much to specify? As little as possible is the ideal, if we want to maximize how much it can help us. A balance here is key. If I need to detail every minute thing I may as well write the code myself
2. If I get this step wrong, I still have to review everything, rethink it, go back and re-prompt, costing time
When I'm working on production code, I have to understand it all to confidently commit. It costs time for me to go over everything, sometimes multiple iterations. Sometimes the AI uses things I don't know about and I need to dig into it to understand it
AI is currently writing 90% of my code. Quality is fine. It's fun! It's magical when it nails something one-shot. I'm just not confident it's faster overall
I think this is an extremely honest perspective. It's actually kind of cool that it's gotten to the point it can write most code - albeit with a lot of handholding.
This is why you use this AI bubble (it IS a bubble) to get the VC-funded AI models at dirt cheap prices and CREATE tools for yourself.
Need a very specific linter? AI can do it. Need a complex Roslyn analyser? AI. Any kind of scripting or automation that you run on your own machine. AI.
None of that will go away or suddenly stop working when the bubble bursts.
Within just the last 6 months I've built so many little utilities to speed up my work (and personal life) it's completely bonkers. Most went from "hmm, might be cool to..." to a good-enough script/program in an evening while doing chores.
Even better, start getting the feel for local models. Current gen home hardware is getting good enough and the local models smart enough so you can, with the correct tooling, use them for surprisingly many things.
> Even better, start getting the feel for local models. Current gen home hardware is getting good enough and the local models smart enough so you can, with the correct tooling, use them for surprisingly many things.
Are there any local models that are at least somewhat comparable to the latest-and-greatest (e.g. Opus 4.5, Gemini 3), especially in terms of coding?
A risk I see with this approach is that when the bubble pops, you'll be left dependent on a bunch of tools which you don't know how to maintain or replace on your own, and won't have/be able to afford access to LLMs to do it for you.
The "tools" in this context are literally a few hundred lines of Python or Github CI build pipeline, we're not talking about 500kLOC massive applications.
I'm building tools, not complete factories :) The AI builds me a better hammer specifically for the nails I'm nailing 90% of the time. Even if the AI goes away, I still know how the custom hammer works.
I thought that initially, but I don't think the skills AI weakens in me are particularly valuable
Let's say AI becomes too expensive - I more or less only have to sharpen up being able to write the language. My active recall of the syntax, common methods and libraries. That's not hard or much of a setback
Maybe this would be a problem if you're purely vibe coding, but I haven't seen that work long term
Open source models hosted by independent providers (or even yourself, which if the bubble pops will be affordable if you manage to pick up hardware on fire sales) are already good enough to explain most code.
> 1) There exists a threshold, only identifiable in retrospect, past which it would have been faster to locate or write the code yourself than to navigate the LLM's correction loop or otherwise ensure one-shot success.
I can run multiple agents at once, across multiple code bases (or the same codebase but multiple different branches), doing the same or different things. You absolutely can't keep up with that. Maybe the one singular task you were working on, sure, but the fact that I can work on multiple different things without the same cognitive load will blow you out of the water.
> 2) The intuition and motivations of LLMs derive from a latent space that the LLM cannot actually access. I cannot get a reliable answer on why the LLM chose the approaches it did; it can only retroactively confabulate. Unlike human developers who can recall off-hand, or at least review associated tickets and meeting notes to jog their memory. The LLM prompter always documenting sufficiently to bridge this LLM provenance gap hits rub #1.
Tell the LLM to document in comments why it did things. Human developers often leave, and then nobody with knowledge of the codebase or its "whys" is even around to give details. Devs are notoriously terrible about documentation.
> 3) Gradually building prompt dependency where one's ability to take over from the LLM declines and one can no longer answer questions or develop at the same velocity themselves.
You can't develop at the same velocity, so drop that assumption now. There are all kinds of lower abstractions that you build on top of that you probably can't explain currently.
> 4) My development costs increasingly being determined by the AI labs and hardware vendors they partner with. Particularly when the former will need to increase prices dramatically over the coming years to break even with even 2025 economics.
You aren't keeping up with the actual economics. This shit is technically profitable; the unprofitable part is the ongoing battle between LLM providers to have the best model. They know software in the past has often been winner-take-all, so they're all trying to win.
> With the latest models if you're clear enough with your requirements you'll usually find it does the right thing on the first try
That's great that this is your experience, but it's not a lot of people's. There are projects where it's just not going to know what to do.
I'm working in a web framework that is a Frankenstein-ing of Laravel and October CMS. It's so easy for the agent to get confused because, even when I tell it this is a different framework, it sees things that look like Laravel or October CMS and suggests solutions that are only for those frameworks. So there's constant made up methods and getting stuck in loops.
The documentation is terrible, you just have to read the code. Which, despite what people say, Cursor is terrible at, because embeddings are not a real way to read a codebase.
I'm working mostly in a web framework that's used by me and almost nobody else (the weird little ASGI wrapper buried in Datasette) and I find the coding agents pick it up pretty fast.
One trick I use that might work for you as well:
"Clone GitHub.com/simonw/datasette to /tmp then look at /tmp/docs/datasette for documentation and search the code if you need to"
Try that with your own custom framework and it might unblock things.
If your framework is missing documentation tell Claude Code to write itself some documentation based on what it learns from reading the code!
> I'm working mostly in a web framework that's used by me and almost nobody else (the weird little ASGI wrapper buried in Datasette) and I find the coding agents pick it up pretty fast
Potentially because there is no baggage with similar frameworks. I'm sure it would have an easier time with this if it was not spun off from other frameworks.
> If your framework is missing documentation tell Claude Code to write itself some documentation based on what it learns from reading the code!
If Claude cannot read the code well enough to begin with, and needs supplemental documentation, I certainly don't want it generating the docs from the code. That's just compounding hallucinations on top of each other.
I find Claude Code is so good at docs that I sometimes investigate a new library by checking out a GitHub repo, deleting the docs/ and README and having Claude write fresh docs from scratch.
In a circuitous way, you can rather successfully have one agent write a specification and another one execute the code changes. Claude Code has a planning mode that lets you work with the model to create a robust specification that can then be executed, asking the sort of leading questions for which it already seems to know it could make an incorrect assumption. I say 'agent' but I'm really just talking about separate model contexts, nothing fancy.
Cursor's planning functionality is very similar and I have found that I can even use "cheap" models like their Composer-1 and get great results in the planning phase, and then turn on Sonnet or Opus to actually produce the plan. 90% of the stuff I need to argue about is during the planning phase, so I save a ton of tokens and rework just making a really good spec.
It turns out that Waterfall was always the correct method, it's just really slow ;)
And if you've told it too many times to fix it, tell it someone has a gun to your head; for some reason it almost always gets it right the very next time.
Yeah, if anyone can truly afford the AI empire. Remember all these "leading" companies are running it at a loss, so most companies paying for it are severely underpaying the cost of it all. We would need an insane technological breakthrough of unlimited memory and power before I start to worry, and at that point, I'll just look for a new career.
I think it's worth understanding why. Because that's not everyone's experience and there's a chance you could make a change such that you find it extremely useful.
There's a lesser chance that you're working on a code base that Claude Code just isn't capable of helping with.
The more explicit/detailed your plan, the more context it uses up, the less accurate and generally functional it is. Don't get me wrong, it's amazing, but on a complex problem with large enough context it will consistently shit the bed.
The human still has to manage complexity. A properly modularized and maintainable code base is much easier for the LLM to operate on — but the LLM has difficulty keeping the code base in that state without strong guidance.
Putting “Make minimal changes” in my standard prompt helped a lot with the tendency of basically all agents to make too many changes at once. With that addition it became possible to direct the LLM to make something similar to the logical progression of commits I would have made anyway, but now don’t have to work as hard at crafting.
Most of the hype merchants avoid the topic of maintainability because they’re playing to non-technical management skeptical of the importance of engineering fundamentals. But everything I’ve experienced so far working with LLMs screams that the fundamentals are more important than ever.
It usually works well for me. With very big tasks I break the plan into multiple MD files with the relevant context included and work through in individual sessions, updating remaining plans appropriately at the end of each one (usually there will be decision changes or additions during iteration).
It takes a lot of plan to use up the context and most of the time the agent doesn't need the whole plan, they just need what's relevant to the current task.
This was me. I have done a full 180 over the last 12 months or so, from "they're an interesting idea, and technically impressive, but not practically useful" to "holy shit I can have entire days/weeks where I don't write a single line of code".
> I really think a lot of people tried AI coding earlier, got frustrated at the errors and gave up. That's where the rejection of all these doomer predictions comes from.
It's not just the deficiencies of earlier versions, but the mismatch between the praise from AI enthusiasts and the reality.
I mean maybe it is really different now and I should definitely try uploading all of my employer's IP on Claude's cloud and see how well it works. But so many people were as hyped by GPT-4 as they are now, despite GPT-4 actually being underwhelming.
Too much hype for disappointing results leads to skepticism later on, even when the product has improved.
I feel similar, I'm not against the idea that maybe LLMs have gotten so much better... but I've been told this probably 10 times in the last few years working with AI daily.
The funny part about rapidly changing industries is that, despite the fomo, there's honestly not any reward to keeping up unless you want to be a consultant. Otherwise, wait and see what sticks. If this summer people are still citing Opus 4.5 as a game-changing moment and have solid, repeatable workflows, then I'll happily change up my workflow.
Someone could walk into the LLM space today and wouldn't be significantly at a loss for not having paid attention to anything that had happened in the last 4 years other than learning what has stuck since then.
> The funny part about rapidly changing industries is that, despite the fomo, there's honestly not any reward to keeping up unless you want to be a consultant.
I've lived through multiple incredibly rapid changes in tech throughout my career, and the lesson always learned was there is a lot of wasted energy keeping up.
Two big examples:
- The period from early MVC JavaScript frontends (Backbone.js etc.) to the time of the great React/Angular wars. I completely stepped out of the webdev space during that time period.
- The rapid expansion of Deep Learning frameworks where I did try to keep up (shipped some Lua torch packages and made minor contributions to Pylearn2).
In the first case, missing 5 years of front-end wars had zero impact. After not doing webdev work at all for 5 years, I was tasked with shipping a React app. It took me a week to catch up, and everything was deployed in roughly the same time as it would have been by someone who had spent years keeping up with changes.
In the second case, where I did keep up with many of the developing deep learning frameworks, it didn't really confer any advantage. Coworkers who started with PyTorch fresh out of school were just as proficient, if not more so, at building models. Spending energy keeping up offered no value other than feeling "current" at the time.
Can you give me a counterexample of where keeping up with a rapidly changing, unstable area has conferred a benefit to you? Most of FOMO is really just fear. Again, unless you're trying to sell yourself specifically as a consultant on the bleeding edge, there's no reason to keep up with all these changes (other than finding it fun).
You moved out of webdev for 5 years, not everybody else had that luxury. I'm sure it was beneficial to those people to keep up with webdev technologies.
If everything changes every month, then stuff you learn next month would be obsolete in two months. This is a response to people saying "adapt or be left behind". There's so much thrashing that if you're not interested with the SOTA, you can just wait for everything to calm down and pick it up then.
> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
> Opus 4.5 really is at a new tier however. It just...works.
Literally tried it yesterday. I didn't see a single difference with whatever model Claude Code was using two months ago. Same crippled context window. Same "I'll read 10 irrelevant lines from a file", same random changes etc.
Create a markdown document of your task (or use CLAUDE.md), put it in "plan mode" which allows Claude to use tool calls to ask questions before it generates the plan.
When it finishes one part of the plan, have it create another markdown document - "progress.md" or whatever - with the whole plan and what is completed at that point.
Type /clear (no more context window), tell Claude to read the two documents.
Repeat until even a massive project is complete - with those 2 markdown documents and no context window issues.
> ... Proceeds to explain how it's crippled and all the workarounds you have to do to make it less crippled.
No - that's not what I did.
You don't need an extra-long context full of irrelevant tokens. Claude doesn't need to see the code it implemented 40 steps ago in a working method from Phase 1 if it is on Phase 3 and not using that method. It doesn't need reasoning traces for things it already "thought" through.
This other information is cluttering, not helpful. It is making signal to noise ratio worse.
If Claude needs to know something it did in Phase 1 for Phase 4 it will put a note on it in the living markdown document to simply find it again when it needs it.
Again, you're basically explaining how Claude has a very short limited context and you have to implement multiple workarounds to "prevent cluttering". Aka: try to keep context as small as possible, restart context often, try and feed it only small relevant information.
What I very succinctly called "crippled context" despite claims that Opus 4.5 is somehow "next tier". It's all the same techniques we've been using for over a year now.
I get by because I also have long-term memory, and experience, and I can learn. LLMs have none of that, and every new session is rebuilding the world anew.
And even my short-term memory is significantly larger than the at most 50% of the 200k-token context window that Claude has. It runs out of context when my short-term memory is probably not even 1% full, for the same task (and I'm capable of more context-switching in the meantime).
And so even the "Opus 4.5 really is at a new tier" runs into the very same limitations all models have been running into since the beginning.
> For LLMs long term memory is achieved by tooling. Which you discounted in your previous comments.
My specific complaint, which is an observable fact about "Opus 4.5 is next tier": it has the same crippled context that degrades the quality of the model as soon as it fills 50%.
EMM_386: no-no-no, it's not crippled. All you have to do is keep track across multiple files, clear out context often, feed very specific information not to overflow context.
Me: so... it's crippled, and you need multiple workarounds
scotty79: After all it's the same as your own short-term memory, and <some unspecified tooling (I guess those same files)> provide long-term memory for LLMs.
Me: Your comparison is invalid because I can go have lunch, and come back to the problem at hand and continue where I left off. "Next tier Opus 4.5" will have to be fed the entire world from scratch after a context clear/compact/in a new session.
Unless, of course, you meant to say that "next tier Opus model" only has 15-30 second short term memory, and needs to keep multiple notes around like the guy from Memento. Which... makes it crippled.
If you refuse to use what you call workarounds and I call long term memory, then you end up with a guy from Memento and regardless of how smart the model is it can end up making the same mistakes. And that's why you can't tell the difference between a smarter and a dumber one while others can.
I evaluated the claim that Opus is somehow next tier/something different/amazeballs future at face value. It still has all the same issues and needs all the same workarounds as whatever I was using two months ago (I had a bit of a coding hiatus between the beginning of December and now).
> then you end up with a guy from Memento and regardless of how smart the model is
Those models are, and keep being the guy from memento. Your "long memory" is nothing but notes scribbled everywhere that you have to re-assemble every time.
> And that's why you can't tell the difference between a smarter and a dumber one while others can.
If it was "next tier smarter" it wouldn't need the exact same workarounds as the "dumber" models. You wouldn't compare the context to the 15-30 second short-term memory and need unspecified tools [1] to have "long-term memory". You wouldn't have the model behave in an indistinguishable way from a "dumber" model after half of its context windows has been filled. You wouldn't even think about context windows. And yet here we are
[1] For each person these tools will be a different collection of magic incantations. From scattered .md files to slop like Beads to MCP servers providing access to various external storage solutions to custom shell scripts to ...
BTW, I still find "superpowers" from https://github.com/obra/superpowers to be the single best improvement to Claude (and other providers), even if it's just another in a long series of magic chants I've evaluated.
> Those models are, and keep being the guy from memento. Your "long memory" is nothing but notes scribbled everywhere that you have to re-assemble every time.
That's exactly how the long term memory works in humans as well. The fact that some of these scribbles are done chemically in the same organ that does the processing doesn't make it much better. Human memories are reassembled at recall (often inaccurately). And humans also scribble when they try to solve a problem that exceeds their short term memory.
> If it was "next tier smarter" it wouldn't need the exact same workarounds as the "dumber" models.
This is akin to refusing to call a processor next tier because it still needs RAM, a bus to communicate with it, and an SSD as well. You think it should have everything in cache to be worthy of being called next tier.
It's fine to have your own standards for applying words. But expect further confusion and miscommunication with other people if you don't intend to realign.
> That's exactly how the long term memory works in humans as well.
Where this is applicable is when you go away from a problem for a while. And yet I don't lose the entire context and have to rebuild it from scratch when I go for lunch, for example.
Models have to rebuild the entire world from scratch for every small task.
> This is akin to refusing to call a processor next tier because it still needs RAM, a bus to communicate with it, and an SSD as well.
You're so lost in your own metaphor that it makes no sense.
> You think it should have everything in cache to be worthy of being called next tier.
No. "Next tier" implies something significantly and observably better. I don't. And here you are trying to tell me "if you use all the exact same tools that you have already used before with 'previous tier models' you will see it is somehow next tier".
If your "next tier" needs an equator-length list of caveats and all the same tools, it's not next tier is it?
BTW. I'm literally coding with this "next tier" tool with "long memory just like people". After just doing the "plan/execute/write notes" bullshit incantations I had to correct it:
You're right, I fucked up on all three counts:
1. FileDetails - I should have WIRED IT UP, not deleted it. It's a useful feature to preview file details before playing. I treated "unused" as "unwanted" instead of "not yet connected".
2. Worktree not merged - Complete oversight. Did all the work but didn't finish the job.
3. _spacing - Lazy fix. Should have analyzed why it exists and either used it or removed the layout constraint entirely.
So next tier. So long memory. So person-like.
Oh. Within about 10 seconds after that it started compacting the "non-crippled" context window and immediately forgot most of what it had just been doing. So I had to clear out the context and teach it the world from the start again.
Edit. And now this amazing next tier model completely ignored that there already exists code to discover network interfaces, and wrote bullshit code calling CLI tools from Rust. So once again it needed to be reminded of this.
> It's fine to have your own standards for applying words. But expect further confusion and miscommunication with other people if you don't intend to realign.
I mean, just like crypto bros before them, AI bros do sure love to invent their own terminology and their own realities that have nothing to do with anything real and observable.
> "You're right, I fucked up on all three counts:"
It very well might be that AI tools are not for you, if you are getting such poor results with your methods of approaching them.
If you would like to improve your outcomes at some point, ask people who achieve better results for pointers and try them out. Here's a freebie, never tell AI it fucked up.
200k+ tokens is a pretty big context window if you are feeding it the right context. Editors like Cursor are really good at indexing and curating context for you; perhaps it'd be worth trying something that does that better than Claude CLI does?
> a pretty big context window if you are feeding it the right context.
Yup. There's some magical "right context" that will fix all the problems. What is that right context? No idea; I guess I need to read yet another 20,000-word post describing magical incantations that you should or shouldn't do in the context.
The "Opus 4.5 is something else/nex tier/just works" claims in my mind means that I wouldn't need to babysit its every decision, or that it would actually read relevant lines from relevant files etc. Nope. Exact same behaviors as whatever the previous model was.
Oh, and that "200k tokens context window"? It's a lie. The quality quickly degrades as soon as Claude reaches somewhere around 50% of the context window. At 80+% it's nearly indistinguishable from a model from two years ago. (BTW, same for Codex/GPT with it's "1 million token window")
1) define problem
2) split problem into small independently verifiable tasks
3) implement tasks one by one, verify with tools
With humans 1) is the spec, 2) is the Jira or whatever tasks
With an LLM usually 1) is just a markdown file, 2) is a markdown checklist, Github issues (which Claude can use with the `gh` cli) and every loop of 3 gets a fresh context, maybe the spec from step 1 and the relevant task information from 2
I haven't run into context issues in a LONG time, and if I have, it's usually been either intentional (it's a problem where compacting won't hurt) or an error on my part.
Yes and no. I've worked quite a bit with juniors, offshore consultants and just in companies where processes are a bit shit.
The exact same method that worked for those happened to also work for LLMs, I didn't have to learn anything new or change much in my workflow.
"Fix bug in FoobarComponent" is enough of a bug ticket for the 100x developer in your team with experience with that specific product, but bad for AI, juniors and offshored teams.
Thus, giving enough context in each ticket to tell whoever is working on it where to look and a few ideas what might be the root cause and how to fix it is kinda second nature to me.
Also my own brain is mostly neurospicy mush, so _I_ need to write the context to the tickets even if I'm the one on it a few weeks from now. Because now-me remembers things, two-weeks-from-now me most likely doesn't.
The problem with LLMs (similar to people :) ) is that you never really know what works. I've had Claude one-shot "implement <some complex requirement>" with little additional input, and then completely botch even the smallest bug fix with explicit instructions and context. And vice versa :)
I realize your experience has been frustrating. I hope you see that every generation of model and harness is converting more hold-outs. We're still a few years from hard diminishing returns assuming capital keeps flowing (and that's without any major new architectures which are likely) so you should be able to see how this is going to play out.
It's in your interest to deal with your frustration and figure out how you can leverage the new tools to stay relevant (to the degree that you want to).
Regarding the context window, Claude needs thinking turned up for long context accuracy, it's quite forgetful without thinking.
Note how nothing in your comment addresses anything I said. Except the last sentence that basically confirms what I said. This perfectly illustrates the discourse around AI.
As for the snide and patronizing "it's in your interest to stay relevant":
1. I use these tools daily. That's why I don't subscribe to willful wide-eyed gullibility. I know exactly what these tools can and cannot do.
The vast majority of "AI skeptics" are the same.
2. In a few years when the world is awash in barely working incomprehensible AI slop my skills will be in great demand. Not because I'm an amazing developer (I'm not), but because I have experience separating wheat from the chaff
The snide and patronizing tone is your projection. It kinda makes me sad when the discourse is so poisoned that I can't even encourage someone to protect their own future from something that's obviously coming (technical merits aside, purely based on social dynamics).
It seems the subject of AI is emotionally charged for you, so I expect friendly/rational discourse is going to be a challenge. I'd say something nice but since you're primed to see me being patronizing... Fuck you? That what you were expecting?
It's not me who decided to barge in, assume their opponent doesn't use something or doesn't want to use something, and offer unsolicited advice.
> It kinda makes me sad when the discourse is so poisoned that I can't even encourage someone to protect their own future from something that's obviously coming
See. Again. You're so in love with your "wisdom" that you can't even see what you sound like: snide, patronizing, condescending. And completely missing the whole point of what was written. You are literally the person who poisons the discourse.
Me: "here are the issues I still experience with what people claim are 'next tier frontier model'"
You: "it's in your interests to figure out how to leverage new tools to stay relevant in the future"
Me: ... what the hell are you talking about? I'm using these tools daily. Do you have anything constructive to add to the discourse?
> so I expect friendly/rational discourse is going to be a challenge.
It's only a challenge to you because you keep being in love with your voice and your voice only. Do you have anything to contribute to the actual rational discourse, or are you going to attack my character?
> I'd say something nice but since you're primed to see me being patronizing... Fuck you?
Ah. The famous friendly/rational discourse of "they attack my use of AI" (no one attacked you), "why don't you invest in learning tools to stay relevant in the future" (I literally use these tools daily, do you have anything useful to say?) and "fuck you" (well, same to you).
> That what you were expecting?
What I was expecting is responses to what I wrote, not you riding in on a high horse.
You were the one complaining about how the tools aren't giving you the results you expected. If you're using these tools daily and having a hard time, either you're working on something very different from the bulk of people using the tools and your problems are legitimate, or you aren't and it's a skill issue.
If you want to take politeness as being patronizing, I'm happy to stop bothering. My guess is you're not a special snowflake, and you need to "get good" or you're going to end up on unemployment complaining about how unfair life is. I'd have sympathy but you don't seem like a pleasant human being to interact with, so have fun!
> You were the one complaining about how the tools aren't giving you the results you expected.
They are not giving me the results people claim they give. It is distinctly different from not giving the results I want.
> If you're using these tools daily and having a hard time, either you're working on something very different from the bulk of people using the tools and your problems or legitimate, or you aren't and it's a skill issue.
Indeed. And the rational/friendly discourse you claim you're having would start with trying to figure that out. Did you? No, you didn't. You immediately assumed your opponent is a clueless idiot who is somehow against AI and is incapable of learning or something.
> If you want to take politeness as being patronizing, I'm happy to stop bothering.
No. It's not politeness. It's smugness. You literally started your interaction in this thread with a "git gud or else" and even managed to complain later that "you dislike it when they attack your use of AI as a skill issue". While continuously attacking others.
> you don't seem like a pleasant human being to interact with
Says the person who has contributed nothing to the conversation except his arrogance, smugness, holier-than-thou attitude, engaged in nothing but personal attacks, complained about non-existent grievances and when called out on this behavior completed his "friendly and rational discourse" with a "fuck you".
Personally I'm sympathetic to people who don't want to have to use AI, but I dislike it when they attack my use of AI as a skill issue. I'm quite certain the workplace is going to punish people who don't leverage AI though, and I'm trying to be helpful.
> but I dislike it when they attack my use of AI as a skill issue.
No one attacked your use of AI. I explained my own experience with the "Claude Opus 4.5 is next tier". You barged in, ignored anything I said, and attacked my skills.
> the workplace is going to punish people who don't leverage AI though, and I'm trying to be helpful.
The only thing I disagreed with in your post is your objectively incorrect statement regarding Claude's context behavior. Other than that I'm just trying to encourage you to make preparations for something that I don't think you're taking seriously enough yet. No need to get all worked up, it'll only reflect on you.
And, conversely, when we read a comment like yours, it sounds like someone who's afraid of computers, would maybe have decried the bicycle and automobile, and really wishes they could just go live in a cabin in the woods.
(And it's fine to do so, just don't mail bombs to us, ok?)
> There's some magical "right context" that will fix all the problems.
All I can tell you is that in my own lived experience, I've had some fantastic results from AI, and it comes from telling it "look at this thing here, ok, i want you to chain it to that, please consider this factor, don't forget that... blah blah blah" like how I would have spelled things out to a junior developer, and then it really does stand a really solid chance of turning out what I've asked for. It helps a lot that I know what to ask for; there's no replacing that with AI yet.
So, your own situation must fall into one of these coarse buckets:
- You're doing something way too hard for AI to have a chance at yet, like real science / engineering at the frontier, not just boring software or infra development
- Your prompts aren't specific enough, you're not feeding it context, and you're expecting it to one-shot things perfectly instead of having to spend an afternoon prompting and correcting stuff
- You're not actually using and getting better at the tools, so you're just shouting criticisms from the sidelines, perhaps as sour grapes because policy doesn't allow it or your company can't afford to have you get into it.
IDK. I hope it's the first one and you're just doing Really Hard Things, but if you're doing normal software developer stuff and not seeing a productivity advantage, it's a fucking skill issue.
I'm not familiar with any form of intelligence that does not suffer from a bloated context. If you want to try and improve your workflow, a good place to start is using sub-agents so individual task implementations do not fill up your top level agents context. I used to regularly have to compact and clear, but since using sub-agents for most direct tasks, I hardly do anymore.
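To make the idea concrete, here is a rough sketch of the pattern (not Claude Code's actual internals), assuming the Anthropic Python SDK, a placeholder model id, and made-up task names: each task runs against a fresh message list, and only a short summary flows back up, so the top-level context stays small.

    import anthropic

    client = anthropic.Anthropic()
    MODEL = "claude-opus-4-5"  # placeholder model id

    def run_subagent(task: str) -> str:
        """Handle one task in an isolated context; return only a brief summary."""
        work = client.messages.create(
            model=MODEL,
            max_tokens=2048,
            messages=[{"role": "user", "content": task}],
        )
        detail = work.content[0].text
        summary = client.messages.create(
            model=MODEL,
            max_tokens=256,
            messages=[{"role": "user",
                       "content": f"Summarise the result in three bullet points:\n\n{detail}"}],
        )
        return summary.content[0].text

    # The orchestrator only ever sees summaries, never the sub-agents' full transcripts.
    top_level_context = []
    for task in ["Implement the rate limiter", "Write tests for the rate limiter"]:
        top_level_context.append({"task": task, "summary": run_subagent(task)})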
2. It's the same workarounds we've been doing forever
3. It's indistinguishable from "clear context and re-feed the entire world of relevant info from scratch" we've had forever, just slightly more automated
That's why I don't understand all the "it's new tier" etc. It's all the same issues with all the same workarounds.
That's because Opus has been out for almost 5 months now lol. It's the same model, so I think people have been vibe coding with a heavy dose of wine this holiday and are now convinced it's the future.
Actually, I've been saying that even models from 2+ years ago were extremely good, but you needed to "hold them right" to get good results, else you might cut yourself on the sharp edges of the "jagged frontier" (https://www.hbs.edu/faculty/Pages/item.aspx?num=64700) Unfortunately, this often necessitated you to adapt yourself to the tool, which is a big change -- unfeasible for most people and companies.
I would say the underlying principle was ensuring a tight, highly relevant context (e.g. choose the "right" task size and load only the relevant files or even code snippets, not the whole codebase; more manual work upfront, but almost guaranteed one-shot results.)
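A minimal sketch of that principle, with a hypothetical task and made-up file paths, just to show how small the prompt stays when you hand over only what you believe is relevant:

    from pathlib import Path

    TASK = "Fix rounding of line-item totals when multiple tax rates apply."
    RELEVANT = ["src/billing/invoice.py", "src/billing/tax.py", "tests/test_invoice.py"]

    def build_prompt(repo_root: str) -> str:
        # Only the hand-picked files go into the prompt, not the whole codebase.
        parts = [f"Task: {TASK}", "Only the files below are in scope."]
        for rel in RELEVANT:
            parts.append(f"--- {rel} ---\n{(Path(repo_root) / rel).read_text()}")
        return "\n\n".join(parts)

    print(build_prompt("."))  # a tight, on-topic prompt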
With newer models the sharper edges have largely disappeared, so you can hold them pretty much any which way and still get very good results. I'm not sure how much of this is from the improvements in the model itself vs the additional context it gets from the agentic scaffolding.
I still maintain that we need to adapt ourselves to this new paradigm to fully leverage AI-assisted coding, and the future of coding will be pretty strange compared to what we're used to. As an example, see Gas Town: https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...
FWIW, Gas Town is strange because Steve is strange (in a good way).
It's just the same agent swarm orchestration that most agent frameworks are using, but with quirky marketing. All of that is just based on the SDLC [PM/Architect -> engineer planning group -> engineer -> review -> qa/evaluation] loop most people here should be familiar with. So actually pretty banal, which is probably part of the reason Steve decided to be zany.
Ah, gotcha, I am still working through the article, but its detailed focus on all the moving parts under the covers is making it hard to grok the high-level workflow.
Each failed prediction should lower our confidence in the next "it's finally useful!" claim. But this inductive reasoning breaks down at genuine inflection points.
I agree with your framing that measuring should NOT be separated from political issues, but each can be made clear separately (framing it as "training the tools of the oppressor" seems to conflate measuring tool usefulness with politics).
> How is it useful to you that these companies are so valuation hungry that they are moving money into this technology in such a way that people are fearful it could cripple the entire global economy?
The creation of entire new classes of profession has always been the result of technological breakthroughs. The automobile did not cripple the economy, even as it ended the buggy-whip barons.
> How is it useful to you that this tech is so power hungry that environmental externalities are being further accelerated while regular people's utility costs are rising to cover the increased demand (whether they use the tech to "code" or "manifest art")?
There will be advantages to lower-power computing, and lower-cost electricity. Implement carbon taxes and AI companies will follow the market incentive to install their datacentres in places where sustainable power is available for cheap. We'll see China soaring to new heights with their massive solar investment, and America will eventually figure out they have to catch up and cannot do so with coal and gas.
> How is it useful to you that this tech is so compute hungry that they are seemingly ending the industry of personal compute to feed this tech's demand?
Temporary problem, the demand for personal computing is not going to die in five years, and meanwhile the lucrative markets for producing this equipment will result in many new factories, increasing capacity and eventually lowering prices again. In the meantime, many pundits are suggesting that this may thankfully begin the end of the Electron App Era where a fuckin' chat client thinks it deserves 1GB of RAM.
Consider this: why are we using Electron and needing 32GB of RAM on a desktop? Because web developers only knew how to use Javascript and couldn't write a proper desktop app. With AI, desktop frameworks can have a resurgence; why shouldn't I use Go or Rust and write a native app on all platforms now that the cost of doing so is decreasing and the number of people empowered to work with it is increasing? I wrote a nice multithreaded fractal renderer in Rust the other day; I don't know how to multithread, write Rust, and probably can't iterate complex numbers correctly on paper anymore....
> How is it useful to you that this tech is so water hungry that it is emptying drinking water aquifers?
This is only a problem in places that have poor water policy, e.g. California (who can all thank the gods that their reservoirs are all now very full from the recent rain). This problem predates datacenters and needs to be solved - for instance, by federalizing and closing down the so-called Wonderful Company and anyone else who uses underhanded tactics to buy up water rights to grow crops that shouldn't be grown there.
Come and run your datacenters up in the cold North, you won't even need evaporative cooling for them, just blow a ton of fresh air in....
> How is it useful to you that this tech is being used to manufacture consent?
Now you've actually got an argument, and I am on your side on this one.
> If at any point any of these releases were "genuine inflection points" it would be unnecessary to proselytize such. It would be self evident. Much like rain.
Agreed.
Now, I suggest reading through all of this to note that I am not a fan of tech bros, that I do want this to be a bubble. Then also note what else I'm saying despite all that.
To me, it is self-evident. The various projects I have created by simply asking for them, are so. I have looked at the source code they produce, and how this has changed over time: Last year I was describing them as "junior" coders, by which I meant "fresh hire"; now, even with the same title, I would say "someone who is just about to stop being a junior".
> "The oppressed need to acknowledge that their oppression is useful to their oppressors."
The capacity for AI to oppress you is in direct relation to its economic value.
> How is it useful to you that this tech is so power hungry that environmental externalities are being further accelerated while regular people's utility costs are rising to cover the increased demand (whether they use the tech to "code" or "manifest art")?
The power hunger is in direct proportion to the demand. Someone burning USD 20 to get Claude Code tokens has consumed approximately USD 10 of electricity in that period, with the other USD 10 having been spread between repaying the model training cost and the server construction cost.
The reason they're willing to spend USD 20 is to save at least USD 20 worth of dev time. This was already the case with the initial version of ChatGPT Pro back in the day, when it could justify that by saving 23 dev minutes per month. There are around a million developers in the USA; just that group increasing electricity spending by USD 10/month will put a massive dent in the USA's power grid.
Gets worse though. Based on my experience, using Claude Code optimally, when you spend USD 20 you get at least 10 junior sprints' worth of output. Hiring a junior for 10 sprints is, what, USD 30,000? The bound here is "are you able to get value from having hired 1,500 juniors for the price of one?"
One can of course also waste those tokens. Both because nobody needs slop, and because most people can't manage one junior never mind 1500 of them.
However, if the economy collectively answers "yes", then the environmental externalities expand until you can't afford to keep your fridge cold or your lights on.
This is one of the failure modes of the technological singularity that people like me have been forewarning about for years, even when there's no alignment issues within the models themselves. Which there are, because Musk's one went and called itself Mecha Hitler, while being so sycophantic about Musk himself that it called him the best at everything even when the thing was "drinking piss", which would be extremely funny if he wasn't selling this to the US military.
> How is it useful to you that this tech is so compute hungry that they are seemingly ending the industry of personal compute to feed this tech's demand?
This will pass. Either this is a bubble, it pops, the manufacturers return to their roots; or it isn't because it works as advertised, which means it leads to much higher growth rates, and we (us, personally, you and me) get personal McKendree cylinders each with more compute than currently exists… or we get turned into the raw materials for those cylinders.
I assume the former. But I say that as one who wants it to be the former.
> How is it useful to you that this tech is so water hungry that it is emptying drinking water aquifers?
Is it what's emptying drinking water aquifers?
The combined water usage of all data centers in Arizona. All of them. Together. Which is over 100 DCs. All of them combined use about double what Tesla was expecting just the Brandenburg Gigafactory to use, before Musk decided to burn his reputation with EV consumers and Europeans for political point scoring.
> How is it useful to you that this tech is being used to manufacture consent?
This is one of the objectively bad things, though it's hard to say if this is more or less competent at this than all the other stuff we had three years ago, given the observed issues with the algorithmic feeds.
I appreciate you taking the time to write up your thoughts on something other than exclusively these tools' 'usefulness' at writing code.
> The capacity for AI to oppress you is in direct relation to its economic value.
I think this assumes a level of rationality in these systems, corporate interests and global markets, that I would push back on as being largely absent.
> The power hunger is in direct proportion to the demand.
Do you think this is entirely the case? I mean, I understand what you are saying, but I would draw stark lines between "company" demand versus "user" demand. I have found many times the 'AI' tools are being thrust into nearly everything regardless of user demand. Spinning its wheels to only ultimately cause frustration. [0]
> Is it what's emptying drinking water aquifers?
It appears this is a problem, and will only continue to be such. [1]
> The combined water usage of all data centers in Arizona. All of them. Together. Which is over 100 DCs. All of them combined use about double what Tesla was expecting just the Brandenburg Gigafactory to use, before Musk decided to burn his reputation with EV consumers and Europeans for political point scoring.
I am unsure if I am getting what your statements here are trying to say. Would you be able to restate this to be more explicit in what you are trying to communicate.
> I think this assumes a level of rationality in these systems, corporate interests and global markets, that I would push back on as being largely absent.
Could be. What I hope and suspect is happening is that these companies are taking a real observation (the economic value that I also observe in software) and falsely expanding this to other domains.
Even to the extent that these work, AI has clearly been over-sold in humanoid robotics and self-driving systems, for example.
> Do you think this is entirely the case? I mean, I understand what you are saying, but I would draw stark lines between "company" demand versus "user" demand. I have found many times the 'AI' tools are being thrust into nearly everything regardless of user demand. Spinning its wheels to only ultimately cause frustration. [0]
I think it is. Companies setting silly goals like everyone must use LLMs once a day or whatever, that won't burn a lot of tokens. Claude Code is available in both subscription mode and PAYG mode, and the cost of subscriptions suggests it is burning millions of tokens a month for the basic subscription.
Other heavy users who we would both agree are bad, are slop content farms. I cannot even guesstimate those, so would be willing to accept the possibility they're huge.
> It appears this is a problem, and will only continue to be such. [1]
I find no reference to "aquifers" in that.
Where it says e.g. "up to 9 liters of water to evaporate per kWh of energy used", the average is 1.9 l/kWh. Also, evaporated water tends to fall nearby (on this scale) as rain, so unless there's now too much water on the surface, this isn't a net change even if it all comes from an aquifer (and I have yet to see any evidence of DCs going for that water source).
It says "The U.S. relies on water-intensive thermoelectric plants for electricity, indirectly increasing data centers' water footprint, with an average of 43.8L/kWh withdrawn for power generation." - most water withdrawn is returned, not consumed.
It says "Already AI's projected water usage could hit 6.6 billion m³ by 2027, signaling a need to tackle its water footprint.", this is less than the famously-a-desert that is Arizona.
> I am unsure if I am getting what your statements here are trying to say. Would you be able to restate this to be more explicit in what you are trying to communicate.
That the water consumption of data centres is much much smaller than the media would have you believe. It's more of a convenient scare story than a reality. If water is your principal concern, give up beef, dairy, cotton, rice, almonds, soy, biofuels, mining, paper, steel, cement, residential lawns, soft drinks, car washing, and hospitals, in approximately that order (assuming the lists I'm reading those from are not invented whole cloth), before you get to data centres.
And again, I don't disagree that they're a problem, it's just that the "water" part of the problem is so low down the list of things to worry about as to be a rounding error.
Ahh, I see your objection now. That is my bad. I was using my language too loosely. Here I was using 'aquifer' to mean 'any source of drinking water', but that is certainly different from the intended meaning.
> And again, I don't disagree that they're a problem, it's just that the "water" part of the problem is so low down the list of things to worry about as to be a rounding error.
I'm skeptical of the rounding error argument, and wary of relying on the logical framework of 'low down the list' when list items' effects stack interdependently.
> give up beef, dairy, cotton, rice, almonds, soy, biofuels, mining, paper, steel, cement, residential lawns, soft drinks, car washing, and hospitals
In part due to this reason, as well as others, I have stopped directly supporting the industries for: beef, dairy, rice, almonds, soy, biofuels, residential lawns, soft drinks, car washing
The hype curve is a problem, but it's difficult to prevent. I myself have never made such a prediction. Though it now seems that the money and effort to create working coding tools is near an inflection point.
"It would be self evident." History shows the opposite at inflection points. The "self evident" stage typically comes much later.
It's a little weird how defensive people are about these tools. Did everyone really think being able to import a few npm packages, string together a few APIs, and run npx create-react-app was something a large number of people could do forever?
The vast majority of coders in employment barely write anything more complex than basic CRUD apps. These jobs were always going to be automated or abstracted away sooner or later.
Every profession changes. Saying that these new tools are useless or won't impact you/xyz devs is just ignoring a repeated historical pattern
It's employing so many people who specialize in Salesforce configuration that every year San Francisco collapses under the weight of 50,000+ of them attending Dreamforce.
And it's actually kind of amazing, because a lot of people who earn six figures programming Salesforce came to it from a non-traditional software engineering background.
I think perhaps for some folks we're looking at their first professional paradigm shift. If you're a bit older, you've seen (smaller versions of) the same thing happening before as e.g. the Internet gained traction, Web2.0, ecommerce, crypto, etc. and have seen your past skillset become useless as now it can be accomplished for only $10/mo/user.... either you pivot and move on somehow, or you become a curmudgeon. Truly, the latter is optional, and at any point when you find yourself doing that you wish to stop and just embrace the new thing, you're still more than welcome to do so. AI is only going to get EASIER to get involved with, not harder.
And by the same token (ha) for some folks we're looking at their first hype wave. If you're a bit older, you've seen similar things like 4GLs and visual programming languages and blockchain and expert systems. They each left their mark on our profession but most of their promises were unfounded and ultimately unrealized.
I like a lot of 4GL ideas. Closest I've come was working on ServiceNow which is sort of a really powerful system with ugly, ugly roots but the idea of your code being the database being the code really resonated with me, as a self-taught programmer.
Similarly, Lisp's homoiconicity makes sense to me as a wonderfully aesthetic idea. I remember generating strings-of-text that were code, but still just text, and wishing that I could trivially step into the structure there like it was a map/dict... without realizing that that's what an AST is and what the language compiler / runtime is already always doing.
Lol. In a few years when the world is awash in AI-generated slop [1] my "past skills" will not only be relevant, they will be actively sought after.
[1] Like the recent "Gas Town" and "Beads" that people keep mentioning in the comments that require extensive scripts/human intervention to purge from the system: https://news.ycombinator.com/item?id=46510121
Agreed, it always seemed a little crazy that you could make wild amounts of money to just write software. I think the music is finally stopping and we'll all have to go back to actually knowing how to do something useful.
> The vast majority of coders in employment barely write anything more complex than basic CRUD apps. These jobs were always going to be automated or abstracted away sooner or later.
My experience has been negative progress in this field. On iOS, UIKit in Interface Builder is an order of magnitude faster to write and to debug, with less weird edge cases, than SwiftUI was last summer. I say last summer because I've been less and less interested in iOS the more I learn about liquid glass, even ignoring the whole "aaaaaaa" factor of "has AI made front end irrelevant anyway?" and "can someone please suggest something the AI really can't do so I can get a job in that?"
The 80s TUI frameworks are still not beaten in developer productivity by GUI or web frameworks. They have been beaten by GUIs in usability, but then the GUIs reverted into a worse option.
Too bad they were mostly proprietary and won't even run on modern hardware.
Democratizing coding so regular people can get the most out of computers is the opposite of oppression. You are mistaking your interests for society's interests.
It's the same with artists who are now pissed that regular people can manifest their artistic ideas without needing to go through an artist or spend years studying the craft. The artists are calling the AI companies oppressors because they are breaking the artist's stranglehold on the market.
It's incredibly ironic how socializing what was a privatized ability has otherwise "socialist" people completely losing their shit. Just the mask of pure virtue slipping...
On what planet is concentrating an increasingly high amount of the output of this whole industry on a small handful of megacorps “democratising” anything?
Software development was already one of the most democratised professions on earth. With any old dirt cheap used computer, an internet connection, and enough drive and curiosity you could self-train yourself into a role that could quickly become a high paying job. While they certainly helped, you never needed any formal education or expensive qualifications to excel in this field. How is this better?
The open models don't have access to all the proprietary code that the closed ones have trained on.
That's primarily why I finally had to suck it up and sign up for Claude. Claude clearly can cough up proprietary codebase examples that I otherwise have no access to.
Given that very few of the "open models" disclose their training data there's no reason at all to assume that the proprietary models have an advantage in terms of training on proprietary data.
As far as I can tell the reason OpenAI and Anthropic are ahead in code is that they've invested extremely heavily in figuring out the right reinforcement learning training mix needed to get great coding results.
Some of the Chinese open models are already showing signs of catching up.
> deergomoo: On what planet is concentrating an increasingly high amount of the output of this whole industry on a small handful of megacorps “democratising” anything?
> simonw: It's better because now you can automate something tedious in your life with a computer without having to first climb a six month learning curve.
Completely ignores, or enthusiastically accepts and endorses, the consolidation of production, power, and wealth into a stark few (friends), and claims superiority and increased productivity without evidence?
This may be the most simonw comment I have ever seen.
At the tail end of 2023 I was deeply worried about consolidation of power, because OpenAI were the only lab with a GPT-4 class model and none of their competitors had produced anything that matched it in the ~8 months since it had launched.
I'm not worried about that at all any more. There are dozens of organizations who have achieved that milestone now, and OpenAI aren't even definitively in the lead.
A lot of those top-class models are open weight (mainly thanks to the Chinese labs) and available for people to run on their own hardware.
I used claude code to set up a bunch of basic tools my wife was using in her daily work. Things like custom pomodoro timers, task managers, todo notes.
She used to log into 3 different websites. Now she just opens localhost:3000 and has all of them on the same page. No emails shared with anyone. All data stored locally.
I could have done this earlier, but with Claude Code the time commitment was writing a spec in 5 minutes and pressing approve a few times, versus half a day before.
I count this as an absolute win. No privacy breaches, no data sharing.
> The artists are calling the AI companies oppressors because they are breaking the artist's stranglehold on the market.
It's because these companies profit from all the existing art without compensating the artists. Even worse, they are now putting the very people out of a job who (unwittingly) helped to create these tools in the first place. Not to mention how hurtful it must be for artists seeing their personal style imitated by a machine without their consent.
I totally see how it can empower regular people, but it also empowers the megacorps and bad actors. The jury is still out on whether AI is providing a net positive to society. Until then, let's not ignore the injustice and harm that went into creating these tools and the potential and real dangers that come with it.
When you imagine my position as "I hate these companies for democratizing code/art" and then debate that, it is called a strawman logical fallacy.
Ascribing the goals of "democratize code/art" onto these companies and their products is called delusion.
I am sure the 3 letter agency directors on these company boards are thrilled you think they left their lifelong careers solely to finally realize their dream to allow you to code and "manifest your artistic ideas".
Yes, but the quality of output from open/local models isn't anywhere close to what you get from Claude or Gemini. You need serious hardware to get anything approaching decent processing speeds or even middling quality.
It's more economical for the average person to spend $20/month on a subscription than it is for them to drop multiple thousands $ and untold hours of time experimenting. Local AI is a fun hobby though.
> If I am unable to convince you to stop meticulously training the tools of the oppressor (for a fee!) then I just ask you do so quietly.
I'm kind of fascinated by how AI has become such a culture war topic with hyperbole like "tools of the oppressor"
It's equally fascinating how little these comments understand about how LLMs work. Using an LLM for inference (what you do when you use Claude Code) does not train the LLM. It does not learn from your code and integrate it into the model while you use it for inference. I know that breaks the "training the tools of the oppressor" narrative which is probably why it's always ignored. If not ignored, the next step is to decry that the LLM companies are lying and are stealing everyone's code despite saying they don't.
The prompts and responses are used as training data. Even if your provider allows you to opt out they are still tracking your usage telemetry and using that to gauge performance. If you don’t own the storage and compute then you are training the tools which will be used to oppress you.
> The prompts and responses are used as training data.
They show a clear pop-up where you choose your setting about whether or not to allow data to be used for training. If you don't choose to share it, it's not used.
I mean I guess if someone blindly clicks through everything and clicks "Accept" without clicking the very obvious slider to turn it off, they could be caught off guard.
Assuming everyone who uses Claude is training their LLMs is just wrong, though.
Telemetry data isn't going to extract your codebase.
I am curious where your confidence that this is true, is coming from?
Besides lots of GPU's, training data seems the most valuable asset AI companies have. Sounds like strong incentive to me to secretly use it anyway. Who would really know, if the pipelines are set up in a way, if only very few people are aware of this?
And if it comes out: "oh gosh, one of our employees made a mistake".
And they already admitted to training on pirated content. So maybe they learned their lesson .. maybe not, as they are still making money and want to continue to lead the field.
1. There are good, ethical people working at these companies. If you were going to train on customer data that you had promised not to train on there would be plenty of potential whistleblowers.
2. The risk involved in training on customer data that you are contractually obliged not to train on is higher than the value you can get from that training data.
3. Every AI lab knows that the second it comes out that they trained on paying customer data after saying they wouldn't, those paying customers will leave for their competitors (and sue them into the bargain.)
4. Customer data isn't actually that valuable for training! Great models come from carefully curated training data, not from just pasting in anything you can get your hands on.
Fundamentally I don't think AI labs are stupid, and training on paid customer data that they've agreed not to train on is a stupid thing to do.
1. The people working for these companies are already demonstrably ethically flexible enough to pirate any publicly accessible training data they can get their hands on, including but not limited to ignoring the license information in every repo on GitHub. I'm not impressed with any of these clowns and I wouldn't trust them to take care of a potted cactus.
2. The risk of using "illegal" training data is irrelevant, because no GenAI vendors have been meaningfully punished for violating copyright yet, and in the current political climate they don't expect to be anytime soon. Even so,
3. Presuming they get caught red-handed using personal data without permission (which, given the nature of LLMs, would be extremely challenging for any individual customer to prove definitively), they may lose customers, and customers may try to sue, but you can expect those lawsuits to take years to work their way through the courts; long after these companies IPO, employees get their bag, and it all becomes someone else's problem.
4. The idea of using carefully curated datasets is popular rhetoric, but absolutely does not reflect how the biggest GenAI vendors do business. See (1).
AI labs are extremely shortsighted, sloppy, and demonstrably do not care a single iota about the long term when there's money to be made in the short term. Employees have gigantic financial incentives to ignore internal malfeasance or simple ineptitude. The end result is, if anything, far worse than stupidity.
There is an important difference between openly training on scraped web data and license-ignored data from GitHub and training on data from your paying customers that you promised you wouldn't train on.
Anthropic had to pay $1.5bn after being caught downloading pirated ebooks.
So Anthropic had to pay less than 1% of their valuation despite approximately their entire business being dependent on this and similar piracy. I somehow doubt their takeaway from that is "let's avoid doing that again".
First: Valuations are based on expected future profits.
For a lot of companies, 1% of valuation is ~20% of annual profit (that's a P/E ratio of 20: the valuation is 20x profit, so 1% of it is 20% of profit); for fast growing companies, or companies where the market is anticipating growth, it can be a lot higher. Weird outlier example here, but consider that if Tesla was fined 1% of its valuation (1% of 1.5 trillion = 15 billion), that would be most of the last four quarters' profit on https://www.macrotrends.net/stocks/charts/TSLA/tesla/gross-p...
Second: Part of the Anthropic case was that many of the books they trained on were ones they'd purchased and destructively scanned, not just pirated. The courts found this use was fine, and Anthropic had already done this before being ordered to: https://storage.courtlistener.com/recap/gov.uscourts.cand.43...
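A quick back-of-envelope check of the first point, with illustrative numbers only (not figures from either case):

    # A fine worth 1% of the valuation equals 20% of annual profit exactly when
    # the valuation is 20x profit; at much higher multiples, the same 1% fine
    # approaches a whole year's profit.
    def fine_as_share_of_profit(pe_ratio: float, fine_share_of_valuation: float = 0.01) -> float:
        return fine_share_of_valuation * pe_ratio  # valuation = P/E * profit

    print(fine_as_share_of_profit(20))   # 0.2 -> 20% of a year's profit
    print(fine_as_share_of_profit(100))  # 1.0 -> about a full year's profit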
Every single point you made is contradicted by the observed behavior of the AI labs. If any of those factors were going to stop them from training on data they legally can't, they would have done so already.
> I am curious where your confidence that this is true, is coming from?
My confidence comes from working in big startups and big companies with legal teams. There's no way the entire company is going to gather all of the engineers and everyone around, have them code up a secret system to consume customer data into a secret part of the training set, and then have everyone involved keep quiet about it forever.
The whistleblowing and leaking would happen immediately. We've already seen LLM teams leak and have people try to whistleblow over things that aren't even real, like the Google engineer who thought they had invented AGI a few years ago (lol). OpenAI had a public meltdown when the employees disagreed with Sam Altman's management style.
So my question to you is: What makes you think they would do this? How do you think they'd coordinate the teams to keep it all a secret and only hire people who would take this secret to their grave?
"There's no way the entire company is going to gather all of the engineers and everyone around, have them code up a secret system "
No, that is why I wrote
"Who would really know, if the pipelines are set up in a way, that only very few people are aware of this?" (Typo fixed)
There is no need for everyone to know. I don't know their processes, but I can think of ways to only include very few people who need to know.
The rest is just working on everything else. Some work with data, where they don't need to know where it came from, some with UI, some with scaling up, some .. they all don't need to know, that the source of DB XYZ comes from a dark source.
> I am curious where your confidence that this is true, is coming from?
We have a legally binding contract with Anthropic. Checked and vetted by our lawyers, who are annoying because they actually READ the contracts and won't let us use services with suspicious clauses in them - unless we can make amendments.
If they're found to be in breach of said contract (which is what every paid user of Claude signs), Anthropic is going to be the target of SO FUCKING MANY lawsuits even the infinite money hack of AI won't save them.
> Besides lots of GPU's, training data seems the most valuable asset AI companies have. Sounds like strong incentive to me to secretly use it anyway. Who would really know, if the pipelines are set up in a way, if only very few people are aware of this?
Could be, but it's a huge risk the moment any lawsuit happens and the "discovery" process starts. Or whistleblowers.
They may well take that risk, they're clearly risk-takers. But it is a risk.
Eh they’re all using copyrighted training data from torrent sites anyway. If the government was gonna hold them accountable for this it would have happened already.
I find it hard to believe that there are people who know these companies stole the entire creative output of humanity and egregiously, continually scrape the internet, yet think they are, for some reason, ignoring the data you voluntarily give them.
> I know that breaks the "training the tools of the oppressor" narrative
"Narrative"? This is just reality. In their own words:
> The awards to Anthropic, Google, OpenAI, and xAI – each with a $200M ceiling – will enable the Department to leverage the technology and talent of U.S. frontier AI companies to develop agentic AI workflows across a variety of mission areas. Establishing these partnerships will broaden DoD use of and experience in frontier AI capabilities and increase the ability of these companies to understand and address critical national security needs with the most advanced AI capabilities U.S. industry has to offer. The adoption of AI is transforming the Department’s ability to support our warfighters and maintain strategic advantage over our adversaries [0]
Is 'warfighting adversaries' some convoluted code for allowing Aurornis to 'see a 1337x in productivity'?
Or perhaps you are a wealthy westerner of a racial and sexual majority and as such have felt little by way of oppression by this tech?
In such a case I would encourage you to develop empathy, or at least sympathy.
> Using an LLM for inference .. does not train the LLM.
In their own words:
> One of the most useful and promising features of AI models is that they can improve over time. We continuously improve our models through research breakthroughs as well as exposure to real-world problems and data. When you share your content with us, it helps our models become more accurate and better at solving your specific problems and it also helps improve their general capabilities and safety. We do not use your content to market our services or create advertising profiles of you—we use it to make our models more helpful. ChatGPT, for instance, improves by further training on the conversations people have with it, unless you opt out.
> Is 'warfighting adversaries' some convoluted code for allowing Aurornis to 'see a 1337x in productivity'?
Much as I despair at the current developments in the USA, and I say this as a sexual minority and a European, this is not "tools of the oppressor" in their own words.
Trump is extremely blunt about who he wants to oppress. So is Musk.
"Support our warfighters and maintain strategic advantage over our adversaries" is not blunt, it is the minimum baseline for any nation with assets anyone else might want to annex, which is basically anywhere except Nauru, North Sentinel Island, and Bir Tawil.
> "Support our warfighters and maintain strategic advantage over our adversaries" is not blunt, it is the minimum baseline for any nation with assets anyone else might want to annex
I think it's gross to distill military violence as defending 'assets [others] might want to annex'.
What US assets were being annexed when US AI was used to target Gazans?
> I think its gross to distill military violence as defending 'assets [others] might want to annex'.
Yes, but that's how the world works:
Another country wants a bit of your country for some reason, they can take it by force unless you can make at the very least a credible threat against them, sometimes a lot more than that.
Note that this does not exclude that there has to be an aggressor somewhere. I'm not excluding the existence of aggressors, nor the capacity for the USA to be an aggressor. All I'm saying is your quotation is so vague as to also encompass those who are not.
> What US assets were being annexed when US AI was used to target Gazans?
First, I'm saying the statement is so broad as to encompass other things besides being a warmonger. Consider the opposite statement: "don't support our warfighters and don't maintain strategic advantage over our adversaries" would be absolutely insane, therefore "support our warfighters and maintain strategic advantage over our adversaries" says nothing.
Second, in this case the country doing the targeting is… Israel. To the extent that the USA cares at all, it's to get votes from the large number of Jewish people living in the USA. Similar deal with how it treats Cuba since the fall of the USSR: it's about votes (from Cuban exiles in that case, but still, votes).
Much as I agree that the conduct of Israel with regard to Gaza was disproportionate, exceeded the necessity, and likely was so bad as to even damage Israel's long-term strategic security, if you were to correctly imagine the people of Israel deciding "don't support our warfighters and don't maintain strategic advantage over our adversaries", they would quickly get victimised much harder than those they were victimising. That's the point there: the quote you cite as evidence, is so broad that everyone has approximately that, because not having it means facing ones' own destruction.
There's a mis-attributed quote, "People sleep peaceably in their beds at night because rough men stand ready to do violence on their behalf", that's where this is at.
> These two thoughts seem at conflict.
Musk is openly and directly saying "Canada is not a real country.", says "cis" is hate speech, responded to the pandemic by tweeting "My pronouns are Prosecute/Fauci.", and his self-justification for his trillion dollar bonus for hitting future targets is wanting to be in control of what he describes as a "robot army"; Trump openly and explicitly wants the USA to annex Canada, Greenland, and the Panama canal, is throwing around the national guard, openly calls critics traitors and calls for the death penalty. They're as subtle as exploding volcanoes; nobody needs to take the worst case interpretations of what they're saying to notice this.
Saying "support our warfighters" is something done by basically every nation everywhere all the time, because those places that don't do this quickly get taken over by nearby nations who sense weakness. Which is kinda how the USA got Texas, because again, I'm not saying the USA is harmless, I'm saying the quote doesn't show that.
> What 'assets' were being protected from annexation here by this oppressive use of the tool? The chips?
This would have been a much better example to lead with than the military stuff.
I'm absolutely all on board with the general consensus that the US police are bastards in this specific way, have been since that kid got shot for having a toy gun in an open-carry state. (I am originally from a country where even the police are not routinely armed, I do not value the 2nd amendment, but if you're going to say "we allow open carry of firearms" you absolutely do not get to use "we saw someone carrying a firearm" as an excuse to shoot them).
However: using LLMs to code doesn't seem to be likely to make a difference either way for this. If I was writing a gun-detection AI, perhaps I'm out of date, but I'd use a simpler model that runs locally on-device and doesn't do anything else besides the sales pitch.
Who is the parent oppressing? Making a comment and companies looking to automate labor are a little bit different. One might disagree that automation is oppressive or whatever goals the major tech CEOs have in developing AIs (surveillance, influencing politics, increasing wealth gap), but certainly commenting that they are oppressive is not the same thing.
Careful with being blindly led by your own assumptions.
I actually disagree with your thesis here. I think if every comment was posted under a new account this site would improve its average veracity.
As it stands certain 'celebrity', or high karma, accounts are artificially bolstered by the network effect indifferent to the defensibility of their claims.
I know someone who is using a vibe coded or at least heavily assisted text editor, praising it daily, while also saying llms will never be productive. There is a lot of dissonance right now.
So this awful website hijacked my back button, directed me to an ad, and instead of telling me the name of the fish immediately, made me search for it deep in the article
I was not trying to assign blame, more to point out that there is a solution. Yes, it is too bad those websites exist. I agree with that completely. But it is not something that will stop until the root evil is destroyed: ads.
Hearst papers (sfgate.com and chron.com) are really bad about this on mobile - the advertising providers just go all out to take over your screen and it takes so long to load that the place you click is not the thing you want if you click too soon. The only plus is all the articles are free to read still.
This really isn’t an “article”. It’s not content. It’s a clickbait summary of another piece of content shared by someone on an entirely different platform
That this content is “free” isn’t something we should be thankful for
I had someone argue on Twitter recently that they had made an “agent” when all they had really done was use n8n to make a loop that used LLMs and ran on a schedule
People are calling if-then cron tasks “agents” now
Can’t find the link now but a very comprehensive analysis of surgery vs physiotherapy for lower back issues found that physiotherapy was as effective as invasive, often dangerous spinal surgery. The only difference was time - surgery with recovery + recovery physio fixed the pain in about 4-6 months, while physiotherapy took 18-24 months
But on the plus side, physiotherapy is “free”, has no real risk, and most people who opted for the physiotherapy path found that they were happier and also fixed a lot of other pains simply because of regular stretching and exercise
The body is very weird and finds ways to compensate
I had a football injury when I was 13 that badly damaged my knee meniscus (though I didn’t know it at that time). At 16, I had a complete menisectomy - total removal of the lateral meniscus in my right knee
I was told that I would need to get a transplant and/or new knees in 10-15 years. I was also told that I shouldn’t put too much strain on the knee
I’m now 38 and my knee is mostly…fine. I can run, squat a reasonable amount of weight, walk for miles. Only thing I can’t do is fast directional changes (like in football) or bending down on the lateral side of my right knee
My plan is to extend this as long as possible and hopefully in 10 years, they’ll have tech to fix this for good