
I feel like I'm taking crazy pills, because every time I try to use AI for any of my projects - including the "better" Anthropic and OpenAI models - I get terribly written, buggy code that has nothing to do with the domain or question. Unless I'm feeding it actual softball, single-function questions, it adds almost no value to my dev cycle. Mind you, all of this terrible code also costs a ton in tokens.

It's sad because I have seen some truly remarkable progress in LLMs, but I feel like we aren't allowed to be honest anymore that LLMs aren't going to replace programmers or moderate our expectations.



It takes some practice to get good at using them. I use it as a sophisticated autocomplete. For example, if I'm building an API client layer in a web frontend, I might have `fooApi.ts`, `mockFooApi.ts`, and `barApi.ts`. Instead of writing `mockBarApi.ts` myself, something that might take 5-15 minutes, I can feed those 3 existing files into context, and a few seconds and pennies later I have a nearly perfect file ready for review.
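
For a concrete sense of what that generated file can look like, here's a rough sketch of the kind of `mockBarApi.ts` I'd expect back from that context. The `Bar` shape and function names below are invented for illustration (assuming `barApi.ts` exports a `Bar` type roughly like `{ id: string; name: string }`), not taken from any real project:

    // mockBarApi.ts -- hypothetical sketch of typical output, mirroring mockFooApi.ts.
    // Assumes barApi.ts exports a Bar type roughly like { id: string; name: string }.
    import type { Bar } from "./barApi";

    // In-memory stand-in for the real backend.
    const bars: Bar[] = [
      { id: "1", name: "First bar" },
      { id: "2", name: "Second bar" },
    ];

    export async function listBars(): Promise<Bar[]> {
      // Small artificial delay so loading states still get exercised in the UI.
      await new Promise((resolve) => setTimeout(resolve, 100));
      return [...bars];
    }

    export async function getBar(id: string): Promise<Bar | undefined> {
      await new Promise((resolve) => setTimeout(resolve, 100));
      return bars.find((bar) => bar.id === id);
    }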

For any semi-novel development I want to write it myself to establish the patterns I like and get an understanding of what I'm building. It's the boring stuff I hand over to Claude.


That's just it, I try not to write the boring parts at all. That's what libraries and design patterns are for - avoiding writing boilerplate.

I certainly can see it being useful for writing additional tests, although I am very worried based on what I've seen that it will introduce more bugs than it's worth, which is why it's still a tool I use for hobby projects and not for real work.


> I certainly can see it being useful for writing additional tests

If your metric is number of tests, sure. If your metric is number of useful tests… eh


> but I feel like we aren't allowed to be honest anymore that LLMs aren't going to replace programmers or moderate our expectations.

But literally every single programmer I know has this opinion.


Agree, who is not allowing us to be honest? I have found programmers are freely talking about the limitations of AI.


It looks like the stuff you read online, written by people who need to optimize for various metrics (objectivity not being among the top ones), shapes the perception of reality.


It’s a matter of incentives and who benefits from a generalized belief that AI will wipe the floor with the whole programmer class.


This opinion used to be a lot less popular, especially on Hacker News.


I know developers that love it.

Of course they are terrible at their job, so AI doesn't make their code much worse.


Copilot Edits works incredibly well. It writes 80% of my code with the majority of my work being small fixes.

https://code.visualstudio.com/docs/copilot/copilot-edits


What kind of code/projects do you work on?


Python, TypeScript, Java, Go.

I've done some stuff for a compiler in Java, backend web services in Python/Go/TS, frontend w/ React/TS.

I was a huge skeptic of this stuff at first but started using it about a year ago. I could definitely live without it, but it also saves me a significant amount of time.

I've been working on a feature in Python. With Copilot Edits I just needed to find files that implemented the pattern, add them to the chat context, and write something like "implement feature x following the same pattern". It never gets it right the first time, but you can just keep the conversation going and have it iterate.

Afterwards I can just write /tests and Copilot generates reasonable tests. If it missed cases, I can ask it to cover those as well. Oftentimes I can also just write literally "cover edge cases" and it handles all reasonable scenarios.


You're already above average if you can see that AI-generated code is bad. Those who are below average will think it's amazing and improving their abilities.


I feel like it's so hot or cold. People have been raving about Lovable "one-shotting" the creation of an app given one prompt. I tried to have it recreate a basic landing page from a screenshot and it wasn't even close. It invented some sections, reproduced others terribly, and completely ignored the rest. I went through numerous prompts until I ran out for the day.

With what I was left with at that point, it probably would have been a wash to fix that code or just start from scratch manually.

I don't blame Lovable; I think it's a wonderful product bordering on magic, like v0 and Bolt.new. I just think these are amazing products that have been overhyped to god-like status.


You don't use LLMs to write code for you. This is like asking it what 2+2 is; it may be able to answer that, but that's not how it works.

What it is, is just a summary of the internet.


I've got a colleague at work, and when he's wrong, I've got to get Claude to agree with me before he believes me.

I used to think he was competent, but now I think he's a moron.


If you formulate the questions right, AI will agree to anything.


I see this here often, but really, can someone please make a video or blog post with a complete coding session so we can see exactly what happens? I am curious.


Can you provide a specific example of where an LLM failed? If you show us your prompt, your "want/need", and the result, we can better judge your situation. My guess: Your domain is weird/rare, so LLMs are terrible because their training data is very limited.


> Your domain is weird/rare, so LLMs are terrible because their training data is very limited.

And this is how knowledge collapse [1] rears its head.

[1] https://arxiv.org/abs/2404.03502

I am not a big fan of LLMs, so I only try them once in a while, asking them to "implement blocked clause decomposition in Haskell." They can recite a paper (and several other papers, with references) almost verbatim, but they do not possess enough comprehension of what is going on in these papers. As time passes, with each new LLM, the level of comprehension drops. Last time, the LLM tried to persuade me to write the code myself instead of providing me with code.


I gave Claude 3.5 Sonnet your prompt and it generated this: https://claude.site/artifacts/7aa41881-937e-4863-a407-c999ea...

With this example usage:

    -- Example usage:
    let clause1 = Set.fromList [1, 2]  -- represents (x1 ∨ x2)
    let clause2 = Set.fromList [-1, 3] -- represents (¬x1 ∨ x3)
    let formula = Set.fromList [clause1, clause2]
    
    -- Decompose the formula
    let (nonBlocked, blocked) = decompose formula

How did it do?


This is very good.

Did it do that just from the prompt, or did you have to nudge it? Can you share the full chat history?


No nudging. This was the whole interaction: https://gist.github.com/jcheng5/c6f15f4c3dc31bf15ab44683ad6a...


This is a gist, not an interaction with Claude AI.

Can you share actual interaction on their site?


No, ChatGPT has that feature but Claude doesn't (that I could find), so I pasted my input and its response into a gist. It's verbatim, I promise.



> My guess: Your domain is weird/rare, so LLMs are terrible because their training data is very limited.

The training data is always limited, because if there were existing software that already did what I'm doing, I'd be using it instead of writing it :)

So by definition, unless I'm learning how to program and doing exercises that thousands have done before me, I'm doing something for which there is no training data.


It's great that you work on truly novel, unseen problems. The rest of us peasants tend to recombine previously solved problems into novel solutions to solve business needs, and that works rather well with LLMs.


> The rest of us peasants tend to recombine previously solved problems into novel solutions to solve business needs, and that works rather well with LLMs.

You took the words right out of my mouth! I wrote a draft reply to OP (but discarded it) with similar thoughts -- roughly: I work on CRUD apps, and so do most other devs; LLMs are a terrific fit for this subject matter.


It’s a tool with limited functionality. I think over time you vaguely learn the shape of its limits and work with what it can do, and it becomes genuinely useful.

Doing basic stuff with a very limited scope that’s likely well documented all over the internet, great.

Boilerplate, tedious yet simple things, works awesome as enhanced autocomplete.

More complex stuff, give it a shot, maybe you’ll get lucky; otherwise, if it starts fucking up, I’ve had the most success just taking a step back and doing it myself, maybe chatting with it as a live docs substitute, or as a realtime Stack Overflow/Discord programming channel, which are also sometimes of dubious quality but frequently useful.

The misery really lies in getting stuck in that cycle of it just messing up over and over as you try to get it to make this thing work that is clearly beyond its scope and it’s just turning everything into a greater and greater mess of hallucinated bullshit.


> It’s a tool with limited functionality. I think over time you vaguely learn the shape of its limits and work with what it can do, and it becomes genuinely useful.

Yeah, I think this is exactly right. I use it for adding new files to an existing codebase, but without giving it access to that codebase, by passing in the definitions I want it to work with. It gets stuff wrong a lot, and I need to keep anything I'm asking for _well_ within a scope that I can be eagle-eyed about. But if I want to do something pretty simple that would be slightly annoying to write by hand, and can be done in a single file, it makes a pleasant alternative to typing out the code by hand.

I don't think I would be able to do this as a junior developer; I think this is only working because I can tell when it's full of shit. I lasted about 90 minutes of being willing to let IDE integration happen, because it's too easy to be lazy and not scrutinize every little change, and that way lies total madness. This makes me beyond skeptical of non-developers writing anything significant with it: they'd be much, much better off learning Bubble or similar.


I think it's partly the art of prompting, and partly problem scope -- they're not gonna get you a super polished final product without a LOT of support (yet), but they excel at rapid I-don't-care-what-the-code-looks-like-just-get-me-an-interactive-prototype work, and at solving specific problems (e.g. the function should do THIS, write it and write me a test suite). However, there are some agentic IDEs like Windsurf that are working to have LLMs orchestrate their own work. Well supported by a rock-solid test suite, I've seen some people do pretty incredible things with them. The more a codebase provides tools for LLMs to live off the land in terms of tests and comments, the more aggressive you can be with what you ask them to do.

I've built a couple of pretty simple apps where I wrote basically no code beyond tweaks.

I do think a big knowledge gap is how much knowing how to talk about code accelerates prompting. LLMs show a lot of difference in response between "make the top bit stay on the screen on small screens" and "make the <th> element sticky for viewports <600px".
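
To make that contrast concrete, here's roughly the kind of thing the more specific prompt might come back with. It's only a sketch of plausible output; the component name, markup, and breakpoint handling are assumptions for illustration, not actual LLM output:

    // Hypothetical sketch: a plain React table whose header row stays pinned
    // only on small viewports, as the more precise prompt requests.
    import * as React from "react";

    export function DataTable({ rows }: { rows: Array<{ name: string; value: string }> }) {
      return (
        <>
          {/* The media query limits the sticky behavior to viewports under 600px. */}
          <style>{`
            @media (max-width: 600px) {
              th { position: sticky; top: 0; background: white; }
            }
          `}</style>
          <table>
            <thead>
              <tr><th>Name</th><th>Value</th></tr>
            </thead>
            <tbody>
              {rows.map((r) => (
                <tr key={r.name}><td>{r.name}</td><td>{r.value}</td></tr>
              ))}
            </tbody>
          </table>
        </>
      );
    }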


Same here. I find it useful for:

- autocompletion (about half the time)

- questions on the syntax for certain commands (faster than looking up in the docs, usually pretty accurate)

- questions about how to solve a particular problem or gotcha (faster than looking up on SO, but also less accurate)

It's a useful tool for certain cases, for sure. But it's not a "game changer" in terms of productivity.


I'm pretty new to this, but a couple of months back I tried some direct AI code execution in Unity. It worked surprisingly well. Video: https://www.youtube.com/watch?v=0xCR4fiyugA

So I definitely think the value proposition of AI coding is higher when you have programmatic control over the workflow.

On a related note, the recent Cline Plan+Act feature has also been a game changer.


As with all other tools, you should expect there to be some learning curve. When desktop publishing became a thing, professionals had to adjust their ways of working, but if you stuck with glue and knives you were out of a job.

LLMs are a tool. They’re also weird and counterintuitive, so it takes some time to get really good with them.


If it always did everything in one shot, you wouldn't need an IDE for it.


Just like googling was a skill back in the day, coding with AI is a skill. You should find a way to watch someone use it properly so you can learn.


> I feel like we aren't allowed to be honest anymore that LLMs aren't going to replace programmers or moderate our expectations.

I haven't heard anyone in tech say programmers will be replaced by this tech.

I have heard people outside of tech say it, but I think they lack sufficient context.

I mean, if we get AGI, yes, we might be fully out of a job. But, so far I think many programmers agree this seems to be a ways off.

But, I think you underestimate how many programmers used to copy paste code from StackOverflow and articles, etc.

We even had/have a phrase for that: "copy pasta".

Those programmers are getting more applicable templated code than they used to get via copy paste.

When I'm coding in my preferred languages, I am faster without AI.

But, when I'm writing yaml or something else for an unfamiliar tool or platform, I do get a productivity boost, even if I have to debug the code / configuration.


Heresy!

Also don't believe your lying eyes.


lol, I always find these sorts of posts amusing. They really show who can use AI effectively vs. those who don't understand it at all.


It's called over-hyped.



