I'm glad to see this. I'm happy to plan to pay for Zed - it's not there yet, but it's well on its way - but I don't want essentially _any_ of the AI and telemetry features.
The fact of the matter is, I am not even using AI features much in my editor anymore. I've tried Copilot and friends over and over and it's just not _there_. It needs to sit at a different point in the software development pipeline (probably code reviews and RAG'ing up documentation).
- I can kick out some money for a settings sync service.
- I can kick out some money to essentially "subscribe" for maintenance.
I don't personally think that an editor is going to return the kinds of ROI VCs look for. So.... yeah. I might be back to Emacs in a year with IntelliJ for powerful IDE needs....
I'm happy to finally see this take. I've been feeling pretty left out with everyone singing the praises of AI-assisted editors while I struggle to understand the hype. I've tried a few and it's never felt like an improvement to my workflow. At least for my team, the actual writing of code has never been the problem or bottleneck. Getting code reviewed by someone else in a timely manner has been a problem though, so we're considering AI code reviews to at least take some burden out of the process.
AI code reviews are the worst place to introduce AI, in my experience. They can find a few things quickly, but they can also send people down unnecessary paths or be easily persuaded by comments or even the slightest pushback from someone. They're quick to cave and agree with any input.
It can also encourage laziness: If the AI reviewer didn't spot anything, it's easier to justify skimming the commit. Everyone says they won't do it, but it happens.
For anything AI related, having manual human review as the final step is key.
LLMs are fundamentally text generators, not verifiers.
They might spot some typos and stylistic discrepancies based on their corpus, but they do not reason. It’s just not what the basic building blocks of the architecture do.
In my experience you need to do a lot of coaxing and setting up guardrails to keep them even roughly on track. (And maybe the LLM companies will build this into the products they sell, but it's demonstrably not there today.)
> LLMs are fundamentally text generators, not verifiers.
In reality they work quite well for text analysis, and for numeric analysis too (via tools). I've found them to be powerful tools for "linting" a codebase against adequately documented standards and architectural guidance, especially when given the use of type checkers, static analysis tools, etc.
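To make that concrete, the plumbing is roughly this: run whatever analysis tools you already trust, bundle their output with the written standards, and have the model cross-check the two. A minimal sketch (the standards file name and the `go vet` step are placeholders for whatever your project actually uses):

```go
// Rough sketch of the "LLM as linter" plumbing described above. The standards
// file name and the `go vet` step are illustrative; swap in whatever analysis
// tools and docs your project actually has.
package main

import (
	"fmt"
	"os"
	"os/exec"
)

func main() {
	// Run an existing analysis tool; its findings ground the model's review.
	// The error is ignored deliberately: go vet exits nonzero when it finds issues.
	vetOut, _ := exec.Command("go", "vet", "./...").CombinedOutput()

	// Load the documented standards / architectural guidance.
	standards, err := os.ReadFile("docs/CONVENTIONS.md")
	if err != nil {
		fmt.Fprintln(os.Stderr, "no standards doc found:", err)
		os.Exit(1)
	}

	// Assemble the prompt; actually sending it to a model is left to your tooling.
	prompt := fmt.Sprintf(
		"Coding standards:\n%s\n\nStatic-analysis output:\n%s\n\n"+
			"List places where the codebase violates the standards, citing files and lines.",
		standards, vetOut,
	)
	fmt.Println(prompt)
}
```

The point is that the model isn't asked to verify anything on its own; the deterministic tools supply the facts, and the LLM does the cross-referencing against the prose standards.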
The value of an analysis is the decision that will be taken after getting the result. So will you actually fix the codebase, or is it just a nice report to frame and put on the wall?
Code quality improvement is the reason to do it, so *yes*. Of course, anyone using AI for analysis is probably leveraging AI for the "fix" part too (or at least I am).
I find the summary that Copilot generates is more useful than the review comments most of the time. That said, I have seen it make some good catches. It's a matter of expectations: the AI is not going to have hurt feelings if you reject all its suggestions, so I feel even more free to reject its feedback with the briefest of dismissals.
Link to the ticket. Hopefully your team cares enough to write good tickets.
So if the problem is defined well in the ticket, do the code changes actually address it?
Take a bug fix, for example. It can check the tests and see if the PR is testing the conditions that caused the bug. It can check the code changes to see if they fit the requirements.
I think the goal with AI for creative stuff should be to make things more efficient, not necessarily to replace people. Whoever does the code review can get up to speed fast. I've been on teams where people would review a section of the code they aren't too familiar with.
In this case if it saves them 30 minutes then great!
I agree and disagree. I think it's important to make it very visually clear that it is not really a PR review, but rather an advanced style checker. I think they can be very useful for assessing more rote/repetitive standards that are a bit beyond what standard linters/analysis can provide. Things like institutional standards, lessons learned, etc. But if it uses the normal PR pipeline rather than the checker pipeline, it gives the false impression that it is a PR review, which it is not.
IMO, the AI bits are the least interesting parts of Zed. I hardly use them. For me, Zed is a blazing fast, lightweight editor with a large community supporting plugins and themes and all that. It's not exactly Sublime Text, but to me it's the nearest spiritual successor while being fully GPL'ed Free Software.
I don't mind the AI stuff. It's been nice when I used it, but I have a different workflow for those things right now. But all the stuff besides AI? It's freaking great.
I wouldn't sing their praises for being FOSS. All contributions are signed away under their CLA, which will allow them to pull the plug when their VCs come knocking and the FOSS angle is no longer convenient.
The CLA assigns ownership of your contributions to the Zed team[^0]. When you own software, you can release it under whatever license you want. If I hold a GPL license to a copy, I have that license to that copy forever, and it permits me to do all the GPL things with it, but new copies and new versions you distribute are whatever you want them to be. For example Redis relicensed, prompting the community to fork the last open-source version as Valkey.
The way it otherwise works, without a CLA, is that you own the code you contributed to your repo, and I own the code I contributed to your repo. Since your code is open-source licensed to me, I can modify it and send you my changes; and since my code is open-source licensed to you, you can incorporate it into your repo. The list of copyright owners of an open-source repo without a CLA is the list of committers. You couldn't relicense it, because it includes my code and I didn't give you permission to. But a CLA makes my contribution your code, not my code.
[^0]: In this case, not literally. You instead grant them a proprietary free license, satisfying the 'because I didn't give you permission' part more directly.
Because when you sign away copyright, the software can be relicensed and taken closed source for all future improvements. Sure, people can still use the last open version, maybe fork it to try to keep going, but that simply doesn’t work out most times. I refuse to contribute to any project that requires me to give them copyright instead of contributing under copyleft; it’s just free contractors until the VCs come along and want to get their returns.
> I refuse to contribute to any project that requires me to give them copyright instead of contributing under copyleft
Please note that even GNU themselves require you to do this; see e.g. GNU Emacs, which requires copyright assignment to the FSF when you submit patches. So there are legitimate reasons to do this other than being able to close the source later.
FSF and GNU are stewards of copyleft, and the FSF is a 501(c)(3) non-profit. Assigning copyright to the FSF, whose significant purpose is to defend and encourage copyleft, is contributing under copyleft in my mind. They would face massive backlash (and GNU would likely face lawsuits from the FSF) were they to attempt such a thing. Could they? Possibly. Would they? Exceptionally unlikely.
So yes, I trust a non-profit, and a collective with nearly 50 years of history supporting copyleft, implicitly more than I will ever trust a company or project offering software while requiring that THEY be assigned the copyright rather than a license. Even your statement holds a difference; they require assignment to the FSF, not the project or its maintainers.
That’s just listening to history, not really a gotcha to me.
It has been decades since I've seen an FSF CLA packet, but if I recall correctly, the FSF also made legally-binding promises back to the original copyright holder, promising to distribute the code under some kind of "free" (libre, not gratuit) license in the future. This would have allowed them to switch from GPL 2 to GPL 3, or even to an MIT license. But it wouldn't have allowed them to make the software proprietary.
But like I said, it has been decades since I've seen any of their paperwork, and memory is fallible.
In my opinion, it's not. They could start licensing all new code under a non-FOSS license tomorrow and we'd still have the GPL'ed Zed as it is today. The same is true for any project, CLA or not.
I found the OP comment amusing because Emacs with a Jetbrains IDE when I need it is exactly my setup. The only thing I've found AI to be consistently good for is spitting out boring boilerplate so I can do the fun parts myself.
I always hear this "writing code isn't the bottleneck" line used when talking about AI, as if there are a chosen few engineers who only work on completely new and abstract domains that require a PhD and 20 years of experience that an LLM cannot fathom.
Yes, you're right, AI cannot be a senior engineer with you. It can take a lot of the grunt work away though, which is still part of the job for many devs at all skill levels. Or it's useful for technologies you're not as well versed in. Or simply an inertia breaker if you're not feeling very motivated to get to work.
Find what it's good for in your workflows and try it for that.
I feel like everyone praising AI is a webdev with extremely predictable problems that are almost entirely boilerplate.
I've tried throwing LLMs at every part of the work I do, and it's been useless at everything beyond explaining new libraries or acting as a search engine. Any time it tries to write any code at all, the result is entirely useless.
But then I see so many praising all it can do and how much work they get done with their agents and I'm just left confused.
Yeah, the more boilerplate your code needs, the better AI works, and the more time it saves you by wasting less of it on boilerplate.
AI tooling, in my experience:
- React/similar webdev where I "need" 1000 lines of boilerplate to do what jquery did in half a line 10 years ago: Perfect
- AbstractEnterpriseJavaFactorySingletonFactoryClassBuilder: Very helpful
- Powershell monstrosities where I "need" 1000 lines of Verb-Nouning to do what bash does in three lines: If you feed it a template that makes it stop hallucinating nonexistent Verb-Nouners, perfect
- Abstract algorithmic problems in any language: Eh, okay
- All the `foo,err=…;if err…` boilerplate in Golang: Decent
- Actually writing well-optimized business logic in any of those contexts: Forget about it
Since I spend 95% of my time writing tight business logic, it's mostly useless.
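For context, the error-plumbing shape I mean looks like this made-up config-loading example; the names are invented, but it's exactly the kind of code the autocomplete fills in decently:

```go
// Illustrative only: the repetitive `x, err := …; if err != nil` shape.
package config

import (
	"encoding/json"
	"fmt"
	"os"
)

type Config struct {
	Addr    string `json:"addr"`
	Timeout int    `json:"timeout"`
}

// Load reads and validates a JSON config file.
func Load(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config: %w", err)
	}

	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}

	if cfg.Addr == "" {
		return nil, fmt.Errorf("config missing addr")
	}
	return &cfg, nil
}
```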
Highlighting code and having Cursor show the recommended changes and make them for me with one click is just a time saver over copying and pasting back and forth to an external chat window. I don't find the autocomplete particularly useful, but the inbuilt chat is honestly a useful feature.
I'm the opposite. I held out this view for a long, long time. About two months ago, I gave Zed's agentic sidebar a try.
I'm blown away.
I'm a very senior engineer. I have extremely high standards. I know a lot of technologies top to bottom. And I have immediately found it insanely helpful.
There are a few hugely valuable use-cases for me. The first is writing tests. Agentic AI right now is shockingly good at figuring out what your code should be doing and writing tests that test the behavior, all the verbose and annoying edge cases, and even finding bugs in your implementation. It's goddamn near magic. That's not to say they're perfect; sometimes they do get confused and assume your implementation is correct when the test doesn't pass. Sometimes they do misunderstand. But the overall improvement for me has been enormous. They also generally write good tests. Refactoring never breaks the tests they've written unless an actually-visible behavior change has happened.
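To give a sense of the verbose edge cases: the tests I get back look roughly like this hand-written, hypothetical sketch (a table-driven test for an invented ParsePort helper), just with far more rows than I'd have the patience to write myself:

```go
// Hypothetical example of the edge-case-heavy, table-driven style I get back.
package netutil

import (
	"fmt"
	"strconv"
	"testing"
)

// ParsePort is an invented helper, included so the example is self-contained.
func ParsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil || n < 1 || n > 65535 {
		return 0, fmt.Errorf("invalid port %q", s)
	}
	return n, nil
}

func TestParsePort(t *testing.T) {
	cases := []struct {
		in      string
		want    int
		wantErr bool
	}{
		{"80", 80, false},
		{"65535", 65535, false},
		{"0", 0, true},     // below valid range
		{"65536", 0, true}, // above valid range
		{"-1", 0, true},    // negative
		{"", 0, true},      // empty string
		{"8080x", 0, true}, // trailing garbage
		{" 80", 0, true},   // leading whitespace
	}
	for _, c := range cases {
		got, err := ParsePort(c.in)
		if (err != nil) != c.wantErr || got != c.want {
			t.Errorf("ParsePort(%q) = %d, %v; want %d, wantErr=%v",
				c.in, got, err, c.want, c.wantErr)
		}
	}
}
```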
Second is trying to figure out the answer to really thorny problems. I'm extremely good at doing this, but agentic AI has made me faster. It can prototype approaches that I want to try faster than I can, and we can see very quickly whether an approach works. I might not use the code it wrote, but the ability to rapidly give four or five alternatives a go versus the one or two I would personally have time for is massively helpful. I've even had them find approaches I never would have considered that ended up being my clear favorite. They're not always better than me at choosing which one to go with (I often ask for their summarized recommendations), but the sheer speed with which they get them done is a godsend.
Finding the source of tricky bugs is one more case that they excel in. I can do this work too, but again, they're faster. They'll write multiple tests with debugging output that leads to the answer in barely more time than it takes to just run the tests. A bug that might take me an hour to track down can take them five minutes. Even for a really hard one, I can set them on the task while I go make coffee or take the dog for a walk. They'll figure it out while I'm gone.
Lastly, when I have some spare time, I love asking them what areas of a code base could use some love and what are the biggest reward-to-effort ratio wins. They are great at finding those places and helping me constantly make things just a little bit better, one place at a time.
Overall, it's like having an extremely eager and prolific junior assistant with an encyclopedic brain. You have to give them guidance, you have to take some of their work with a grain of salt, but used correctly they're insanely productive. And as a bonus, unlike a real human, you don't ever have to feel guilty about throwing away their work if it doesn't make the grade.
> Agentic AI right now is shockingly good at figuring out what your code should be doing and writing tests that test the behavior, all the verbose and annoying edge cases,
That's a red flag for me. Having a lot of tests usually means that your domain is fully known, so you can specify it fully with tests. But in a lot of settings, the domain is a bunch of business rules that product decides on the fly. So you need to be pragmatic and only write tests against valuable workflows, or you'll find yourself changing a line and having 100+ tests break.
If you can write tests fast enough, you can specify those business rules on the fly. The ideal case is that tests always reflect current business rules. Usually that may be infeasible because of the speed at which those rules change, but I’ve had a similar experience of AI just getting tests right, and even better, getting tests verifiably right because the tests are so easy to read through myself. That makes it way easier to change tests rapidly.
This is also ignoring that, ideally, business logic is implemented as a combination of smaller, stabler components that can be independently unit tested.
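By smaller, stabler components I mean something like this invented example: a business rule kept as a pure function with no I/O, so one tiny test pins it down even while the workflows around it keep changing:

```go
// Invented example: a business rule kept as a pure function so it can be
// unit tested in isolation, independent of the workflow that calls it.
package pricing

// BulkDiscount returns the discount rate for an order based on quantity.
// The thresholds are illustrative, not real business rules.
func BulkDiscount(quantity int) float64 {
	switch {
	case quantity >= 100:
		return 0.15
	case quantity >= 20:
		return 0.05
	default:
		return 0.0
	}
}
```

When product changes the thresholds, the one small test for this function changes with it; the workflow tests don't have to care.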
Unit tests' value shows mostly when integration and more general tests are failing, so you can filter out some sections of the culprit list (you don't want to spend days specifying the headlights if the electrical design is wrong or the car can't start).
Having a lot of tests is great until you need to refactor them. I would rather have a few e2e tests for smoke testing and valuable workflows, integration tests for business rules, and unit tests when it actually matters. As long as I can change implementation details without touching the tests that much.
Code is a liability. Unless it's code you don't have to deal with (like the assembly that compilers emit), reducing the amount of code is a good strategy.
This is a red flag for me. Any given user-facing software project with changing requirements is still built on top of relatively stable, consistent lower layers. You might change the business rules on top of those layers, but you need generally reasonable and stable internal APIs.
Not having this is very indicative of a spaghetti soup architecture. Hard pass.
You can over-specify. When the rules are stringent, it's best to have extensive test suites (like Formula 1). But when it's just a general app, you need to be pragmatic. Otherwise it's like having an overly sensitive sensor in a system.
AI is solid for kicking off learning a language or framework you've never touched before.
But in my day to day I'm just writing pure Go, highly concurrent and performance-sensitive distributed systems, and AI is just so wrong on everything that actually matters that I have stopped using it.
But so is a good book. And it costs way less. Even though searching may be quicker, having a good digest of a feature is worth the half hour I can spend browsing a chapter. It's directly picking an expert's brain. Then you take notes, compare with what you found online and the updated documentation, and soon you develop a real understanding of the language/tool abstractions.
I’m using Go to build a high performance data migration pipeline for a big migration we’re about to do. I haven’t touched Go in about 10 years, so AI was helpful getting started.
But now that I've been using it for a while, it's absolutely terrible with anything that deals with concurrency. It's so bad that I've stopped using it for any code generation and am going to completely disable autocomplete.
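For a sense of what it keeps tripping over: even a basic bounded worker pool like the hand-rolled sketch below (illustrative, not my actual pipeline) is where it starts closing channels from the wrong side or losing track of the WaitGroup, and in a migration pipeline those races matter:

```go
// Minimal bounded worker pool, hand-written for illustration. The details the
// AI kept getting wrong live here: who closes which channel, and when.
package main

import (
	"fmt"
	"sync"
)

func main() {
	jobs := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- j * j // stand-in for real per-record work
			}
		}()
	}

	// Producer: only the sender closes the jobs channel.
	go func() {
		for i := 1; i <= 10; i++ {
			jobs <- i
		}
		close(jobs)
	}()

	// Close results only after all workers are done sending.
	go func() {
		wg.Wait()
		close(results)
	}()

	for r := range results {
		fmt.Println(r)
	}
}
```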
A good example would be Prometheus, particularly PromQL, for which the docs are ridiculously bare, but there is a ton of material and Stack Overflow answers scattered all over the internet.
Zed was just a fast and simple replacement for Atom (R.I.P.) or VS Code. Then they put AI on top when that showed up. I don't care for it, and I appreciate a project like this that returns the program to its core.
You can leave LLM Q&A on the table if you like, but tab auto complete is a godlike power.
I'm auto-completing crazy complex Rust match branches for record transformation. 30 lines of code, hitting dozens of fields and mutations, all with a single keystroke. And then it knows where my next edit will be.
I've been programming for decades and I love this. It's easily a 30-50% efficiency gain when plumbing fields or refactoring.
Just to echo the sentiment, I've had struggles trying to figure out how to use LLMs in my daily work.
I've landed on using it as part of my code review process before asking someone to review my PR. I get a lot of the nice things that LLMs can give me (a second set of eyes, a somewhat consistent reviewer) but without the downsides (no waiting on the agent to finish writing code that may not work, costs me personally nothing in time and effort as my Org pays for the LLM, when it hallucinates I can easily ignore it).
If it's formulaic enough, I will use the editor templates/snippets generator. Or write a code generator (if it involves a bunch of files). If it's not, I probably have another class I can copy and strip out (especially in UI and CRUD).
> integrate module A into module B
If it cannot be done easily, that's a sign of a less-than-optimal API.
> entire codebase A into codebase B
Is that a real need?
> get someones github project up and running on your machine, do you manually fiddle with cmakes and npms
If the person can't be bothered to provide proper documentation, why should I run the project? But actually, I will look into the AUR (Arch Linux) and Homebrew formulas to see if someone has already done the first job of figuring out dependency versions. If there's a Dockerfile, I will use that instead.
> convert an idea or plan.md or a paper into working code?
Iteratively. First have a hello world or something working, then mow down the task list.
> Fix flakes, fix test<->code discrepancies or increase coverage etc
Either the test is wrong or the code is wrong. Figure out which and rework it. The figuring part always takes longer, as you will need to ask around.
> If you do all this manually, why?
Because when something happens in prod, you really don't want that feeling of being the last one who interacted with that part but has no idea what changed.
To me, using AI to convert an idea or paper into working code is outsourcing the only enjoyable part of programming to a machine. Do we not appreciate problem solving anymore? Wild times.
i'm an undergrad, so when i need to implement a paper, the idea is that i'm supposed to learn something from implementing it. i feel fortunate in that ai is not yet effective enough to let me be lazy and skip that process, lol
When I was younger, we all had to memorize phone numbers. I still remember those numbers (even the defunct ones) but I haven't learned a single new number since getting a cellphone.
When I was younger, I had to memorize how to drive to work/the grocery store/New Jersey. I still remember those routes but I haven't learned a single new route since getting a smartphone.
Are we ready to stop learning as programmers? I certainly am not and it sounds like you aren't either. I'll let myself plateau when I retire or move into management. Until then, every night debugging and experimenting has been building upon every previous night debugging and experimenting, ceaselessly progressing towards mastery.
I can largely relate... that said, I rarely rely on my phone for remembering routes to places I've been before. It does help that I've lived in different areas of my city and suburbs (Phoenix) so I'm generally familiar with most of the main streets, even if I haven't lived on a given side of town in decades.
The worst is when I get inclined to go to a specific restaurant I haven't been to in years and it's completely gone. I've started to look online to confirm before driving half an hour or more.
*Outsourcing to a parrot on steroids which will make mistakes, produce stale, ugly UI with 100px border radius, 50px padding, and rainbow hipster shadows, write code biased towards low-quality training data, and so on. It's the perfect recipe for disaster.
Disastrous? Quite possibly, but mine are based on different concerns.
Almost everything changes, so isn’t it better to rephrase these statements as metrics to avoid fixating on one snapshot in an evolving world?
As the metrics get better, what happens? Do you still have objections? What objections remain as AI capabilities get better and better without limit? The growth might be slow or irregular, but there are many scenarios where AIs reach the bar where they are better at almost all knowledge work.
Stepping back, do you really think of AI systems as stochastic parrots? What does this metaphor buy you? Is it mostly a card you automatically deal out when you pattern match on something? Or does it serve as a reusable engine for better understanding the world?
We’ve been down this road; there is already much HN commentary on the SP metaphor. (Not that I recommend HN for this kind of thing. This is where I come to see how a subset of tech people are making sense of it, often imperfectly with correspondingly inappropriate overconfidence.)
TLDR: smart AI folks don't anchor on the stochastic parrots metaphor. It is a catchy phrase and helped people's papers get some attention, but it doesn't mean what a lot of people think it means. Easily misunderstood, it serves as a convenient semantic stop sign so people don't have to dig into the more interesting aspects of modern AI systems. For example: (1) transformers build conceptual models of language that transcend any particular language. (2) They also build world models with spatial reasoning. (3) Many models are quite resilient to low-quality training data. And more.
To make this very concrete: under the assumption of universal laws of physics, people are just following the laws of physics, and to a first approximation, our brains are just statistical pattern matchers. By this definition, humans would also be "stochastic parrots". I go to all this trouble to show that this metaphor doesn't cut to the heart of the matter. There are clearer questions to ask; they require getting a lot more specific about various forms and applications of intelligent behavior. For example:
- under what circumstances does self play lead to superhuman capability in a particular domain?
- what limits exist (if any) in the self-supervised training paradigm used for sequential data? If the transformer trained in this way can write valid programs, then it can create almost any Turing machine, limited only by time, space, and energy. What more could you want? (Lots, but I'm genuinely curious as to people's responses after reflecting on these.)
Until the thing can learn on its own and advance its capabilities to the same degree that a junior developer can, it is not intelligent enough to do that work. It doesn't learn our APIs, it doesn't learn our business domain, it doesn't learn from the countless mistakes I correct it on. What we have now is interesting; it is helpful sometimes and wasteful at others. It is not intelligent.
2. Intelligence is better measured on a scale than with 1 bit (yes/no).
3. Intelligence is better considered as having many components instead of just one. When people talk about intelligence, they often mean different things across domains, such as emotional, social, conceptual, spatial, kinetic, sensory, etc.
4. Many researchers have looked for -- and found -- in humans, at least, some notions of generalized intellectual capability that tends to help across a wide variety of cognitive tasks.
If some of these make sense, I suggest it would be wise to conclude:
5. Reasonable people accentuate different aspects and even definitions of intelligence.
6. Expecting a yes/no answer for "is X intelligent?" without considerable explanation is approximately useless. (Unless it is a genuinely curious opener for an in-depth conversation.)
7. Asking "is X intelligent?" tends to be a poorly framed question.
> Until the thing can learn on its own and advance its capabilities to the same degree that a junior developer can, it is not intelligent enough to do that work.
This confuses intelligence with memory (or state), which tends to enable continuous learning.
Update: it might have been clearer and more helpful if I wrote this instead…
This idea of intelligence stated above seems to combine computation, memory, and self-improvement. These three concepts (as I understand them) are distinct and logically decoupled.
For example, in the context of general agents, computational ability can change without affecting memory capability. Also, high computational ability does not necessarily confer self-improvement abilities. Having more memory does not necessarily benefit self-improvement.
In the case of biology, it is possible that self improvement demands energy savings and therefore sensory processing degradation. This conceptually relates to a low power CPU mode or a gasoline engine that can turn off some cylinders.
A time traveler from the future has recommended we both read or reread “Disputing Definitions” by Yudkowsky (2008).
Some favorite quotes of mine from it:
> Dictionary editors are historians of usage, not legislators of language. Dictionary editors find words in current usage, then write down the words next to (a small part of) what people seem to mean by them.
> Arguing about definitions is a garden path; people wouldn't go down the path if they saw at the outset where it led.
>> Eliezer: "Personally I'd say that if the issue arises, both sides should switch to describing the event in unambiguous lower-level constituents, like acoustic vibrations or auditory experiences. Or each side could designate a new word, like 'alberzle' and 'bargulum', to use for what they respectively used to call 'sound'; and then both sides could use the new words consistently. That way neither side has to back down or lose face, but they can still communicate. And of course you should try to keep track, at all times, of some testable proposition that the argument is actually about. Does that sound right to you?"
Another thing that jumps out to me is just how fluidly people redefine "intelligence" to mean "just beyond what machines today can do". I can't help but wonder how much your definition has changed. What would happen if we reviewed your previous opinions, commentary, thoughts, etc... would your time-varying definitions of "intelligence" be durable and consistent? Would this sequence show movement towards a clearer and more testable definition over time?
My guess? The tail is wagging the dog here -- you are redefining the term in service of other goals. Many people naturally want humanity to remain at the top of the intellectual ladder and will distort reality as needed to stay there.
My point is not to drag anyone through the mud for doing the above. We all do it to various degrees.
Now, for my sermon. More people need to wake up and realize machine intelligence has no physics-based constraints preventing it from surpassing us.
A. Businesses will boom and bust. Hype will come and go. Humanity has an intrinsic drive to advance thinking tools. So AI is backed by huge incentives to continue to grow, no matter how many missteps economic or otherwise.
B. The mammalian brain is an existence proof that intelligence can be grown / evolved. Homo sapiens could have bigger brains if not for birth-canal size constraints and energy limitations.
C. There are good reasons to suggest that designing an intelligent machine will be more promising than evolving one.
D. There are good reasons to suggest silicon-based intelligence will go much further than carbon-based brains.
E. We need to stop deluding ourselves by moving the goalposts. We need to acknowledge reality, for this is reality we are living in, and this is reality we can manipulate.
Let me know if you disagree with any of the sentences above. I'm not here to preach to the void.
> A. Businesses will boom and bust. Hype will come and go. Humanity has an intrinsic drive to advance thinking tools. So AI is backed by huge incentives to continue to grow, no matter how many missteps economic or otherwise.
Corrected to:
A. Businesses will boom and bust. Hype will come and go. Nevertheless, humanity seems to have an intrinsic drive to innovate, which means pushing the limits of technology. People will seek more intelligent machines, because we perceive them as useful tools. So AI is pressurized by long-running, powerful incentives, no matter how many missteps economic or otherwise. It would take a massive and sustained counter-force to prevent a generally upwards AI progression.
This also reveals a failure mode in conversations that might go as follows. You point to some version of Webster’s dictionary, but I point to Stuart Russell (an expert in AI). If this is all we do, it is nothing more than an appeal to authority and we don’t get far.
This misunderstands the stated purpose of a dictionary: to catalog word usage, not to define an ontology that others must follow. Usage precedes cataloging.
Regarding the phrase statistical parrot, I would claim that statistical parrotism is an ideology. As with any ideology, what we see is a speciation event. The overpopulation of SEO parrots has driven out a minority of parrots who now respecialize in information dissemination rather than information pollution, leaving their former search-engine ecological niche and settling in a new one that allows them to operate at a higher level of density, compression and complexity. Thus it's a major step in evolution, but it would be a misunderstanding to claim that evolution is the emergence of intelligence.
“The final goal of any engineering activity is some type of documentation. When a design effort is complete, the design documentation is turned over to the manufacturing team. This is a completely different group with completely different skills from the design team. If the design documents truly represent a complete design, the manufacturing team can proceed to build the product. In fact, they can proceed to build lots of the product, all without any further intervention of the designers. After reviewing the software development life cycle as I understood it, I concluded that the only software documentation that actually seems to satisfy the criteria of an engineering design is the source code listings.” - Jack Reeves
I'm pretty fast at coding and know what I'm doing. My ideas are too complex for Claude to just crap out. If I'm really tired I'll use Claude to write tests. Mostly they aren't really good though.
AI doesn't really help me code vs me doing it myself.
this is something i've found LLMs almost useless at. consider https://arxiv.org/abs/2506.11908 --- the paper explains its proposed methodology pretty well, so i figured this would be a good LLM use case. i tried to get a prototype to run with gemini 2.5 pro, but got nowhere even after a couple of hours, so i wrote it by hand; and i write a fair bit of code with LLMs, but it's primarily questions about best practices or simple errors, and i copy/paste from the web interface, which i guess is no longer in vogue. that being said, would cursor excel here at a one-shot (or even a few hours of back-and-forth), elegant prototype?
I have found that whenever it fails for me, it's likely that I was trying to one-shot the solution, and I retry by breaking the problem into smaller chunks or doing planning work with gemini cli first.
smaller chunks works better, but ime, it takes as long as writing it manually that way, unless the chunk is very simple, e.g. essentially api examples. i tend not to use LLMs for planning because thats the most fun part for me :)
For stuff like generating and integrating new modules, the helpfulness of AI varies wildly.
If you’re using nest.js, which is great but also comically bloated with boilerplate, AI is fantastic. When my code is like 1 line of business logic per 6 lines of boilerplate, yes please AI do it all for me.
Projects with less cruft benefit less. I’m working on a form generator mini library, and I struggle to think of any piece I would actually let AI write for me.
Similar situation with tests. If your tests are mostly “mock x y and z, and make sure that this spied function is called with this mocked payload result”, AI is great. It’ll write all that garbage out in no time.
If your tests are doing larger chunks of biz logic like running against a database, or if you’re doing some kinda generative property based testing, LLMs are probably more trouble than they’re worth
To do those things, I do the same thing I've been doing for the thirty years that I've been programming professionally: I spend the (typically modest) time it takes to learn to understand the code that I am integrating into my project well enough to know how to use it, and I use my brain to convert my ideas into code. Sometimes this requires me to learn new things (a new tool, a new library, etc.). There is usually typing involved, and sometimes a whiteboard or notebook.
Usually it's not all that much effort to glance over some other project's documentation to figure out how to integrate it, and as to creating working code from an idea or plan... isn't that a big part of what "programming" is all about? I'm confused by the idea that suddenly we need machines to do that for us: at a practical level, that is literally what we do. And at a conceptual level, the process of trying to reify an idea into an actual working program is usually very valuable for iterating on one's plans, and identifying problems with one's mental model of whatever you're trying to write a program about (c.f. Naur's notions about theory building).
As to why one should do this manually (as opposed to letting the magic surprise box take a stab at it for you), a few answers come to mind:
1. I'm professionally and personally accountable for the code I write and what it does, and so I want to make sure I actually understand what it's doing. I would hate to have to tell a colleague or customer "no, I don't know why it did $HORRIBLE_THING, and it's because I didn't actually write the program that I gave you, the AI did!"
2. At a practical level, #1 means that I need to be able to be confident that I know what's going on in my code and that I can fix it when it breaks. Fiddling with cmakes and npms is part of how I become confident that I understand what I'm building well enough to deal with the inevitable problems that will occur down the road.
3. Along similar lines, I need to be able to say that what I'm producing isn't violating somebody's IP, and to know where everything came from.
4. I'd rather spend my time making things work right the first time, than endlessly mess around trying to find the right incantation to explain to the magic box what I want it to do in sufficient detail. That seems like more work than just writing it myself.
Now, I will certainly agree that there is a role for LLMs in coding: fancier auto-complete and refactoring tools are great, and I have also found Zed's inline LLM assistant mode helpful for very limited things (basically as a souped-up find and replace feature, though I should note that I've also seen it introduce spectacular and complicated-to-fix errors). But those are all about making me more efficient at interacting with code I've already written, not doing the main body of the work for me.
What do you mean by this? If you just mean moving things around then code refactoring tools to move functions/classes/modules have existed in IDEs for millennia before LLMs came around.
> get someones github project up and running on your machine
docker
> convert an idea or plan.md or a paper into working code
I sit in front of a keyboard and start typing.
> Fix flakes, fix test<->code discrepancies or increase coverage etc
I sit in front of a keyboard, read, think, and then start typing.
> If you do all this manually, why?
Because I care about the quality of my code. If these activities don't interest you, why are you in this field?
Ah well then, this is the cultural divide that has been forming since long before LLMs happened. Once software engineering became lucrative, people started entering the field not because they're passionate about computers or because they love the logic/problem solving, but because it is a high-paying, comfortable job.
There was once a time when only passionate people became programmers, before y'all ruined it.
i think you are mis-categorizing me. i have been programming for fun since i was a kid. But that doesn't mean i want to solve the mundane boring stuff when i know i can get someone else or ai to figure those parts out so i can do the fun stuff.
Ah perhaps. Then I think we had different understandings of my "why are you in this field?" question. I would say that my day job is to "deliver shareholder value"[0] but I'd never say that is why I am in this field, and it sounds like it isn't why you're in this field either since I doubt you were thinking about shareholders when you were programming as a kid.
[0] Actually, I'd say it is "to make my immediate manager's job easier", but if you follow that up the org chart eventually it ends up with shareholders and their money.
Defining one's worth by shareholder value is pretty dystopian, so yeah, even "make the world a better place" is preferable, at least if whoever said it really means it…
> I can kick out some money to essentially "subscribe" for maintenance.
People on HN and other geeky forums keep saying this, but the fact of the matter is that you're a minority and not enough people would do it to actually sustain a product/company like Zed.
It's a code editor so I think the geeky forums are relevant here.
Also, this post is higher on HN than the post about raising capital from Sequoia where many of the comments are about how negatively they view the raising of capital from VC.
The fact of the matter is that people want this, and the inability of companies to monetize that desire says nothing about whether the desire is large enough to "actually sustain" a product/company like Zed.
"Happy to see this". The folks over at Zed did all of the hard work of making the thing, try to make some money, and then someone just forks it to get rid of all of the things they need to put in to make it worth their time developing. I understand if you don't want to pay for Zed - but to celebrate someone making it harder for Zed to make money when you weren't paying them to begin with -"Happy to PLAN to pay for Zed"- is beyond.
The only path forward I see for a classic VC investment is the AI drive.
But I don't think the AI bit is valuable. A powerful plugin system would be sufficient to achieve LLM integration.
So I don't think this is a worthwhile investment unless the product gets a LOT worse and becomes actively awful for users who aren't paying beaucoup bucks for AI tooling - the ROI will have to center on the AI drive.
It's not a move that will generate a good outcome for the average user.