I don't know if I'm doing something wrong, but every time I ask Gemini 2.5 for code it outputs SO MANY comments. An exaggerated amount of comments. Section comments, step comments, block comments, inline comments, the whole gang.
I usually remove the comments by hand. It's actually pretty helpful, it ensures I've reviewed every piece of code carefully, especially since most of the comments are literally just restating the next line, and "does this comment add any information?" is a really helpful question to make sure I understand the code.
I've found that heavily commented code can be better for the LLM to read later, since it pulls explanatory comments into context at the same time as the code, similar to pulling in @docs, so maybe it's doing that on purpose?
No, it's just bad. I've been writing a lot of Python code the past two days with Gemini 2.5 Pro Preview, and all of its code was like:
```python
import logging

def whatever():
    # --- SECTION ONE OF THE CODE ---
    ...
    # --- SECTION TWO OF THE CODE ---
    try:
        ...  # [some "dangerous" code]
    except Exception as e:
        logging.error(f"Failed to save files to {output_path}: {e}")
        # Decide whether to raise the error or just warn
        # raise IOError(f"Failed to save files to {output_path}: {e}")
```
(it adds commented out code like that all the time, "just in case")
The training loop asked the model to one-shot working code for the given problems without being able to iterate. If you had to write code that had to work on the first try, and where a partially correct answer was better than complete failure, I bet your code would look like that too.
In any case, it knows what good code looks like. You can say "take this code and remove spurious comments and prefer narrow exception handling over catch-all", and it'll do just fine (in a way it wouldn't do just fine if your prompt told it to write it that way the first time, writing new code and editing existing code are different tasks).
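For instance, after that kind of edit pass, the catch-all from the snippet above would come back looking something like this (a sketch; `write_files` is a hypothetical helper, and `output_path` comes from the surrounding code):

```python
import logging

def save_files(output_path):
    try:
        write_files(output_path)  # hypothetical helper doing the actual I/O
    except OSError as e:
        # Catch only the failure we actually expect from file I/O,
        # log it, and re-raise so the caller decides what to do.
        logging.error(f"Failed to save files to {output_path}: {e}")
        raise
```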
It's only an example; there's plenty of irrelevant stuff that LLMs default to which is pretty bad Python. I'm not saying it's always bad, but there's a ton of not-so-nice or subtly wrong code generated (file and path manipulation, for example).
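A classic instance of the subtly-wrong path handling (my own illustration, not from any model output in particular): `os.path.join` silently discards everything before an absolute component, which generated code rarely guards against:

```python
import os

os.path.join("/srv/uploads", "report.txt")   # '/srv/uploads/report.txt'
os.path.join("/srv/uploads", "/etc/passwd")  # '/etc/passwd' -- base dir silently dropped
```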
There are a bunch of stupid behaviors of LLM coding that will be fixed by more awareness pretty soon. Imagine putting the docs and code for all of your libraries into the context window so it can understand what exceptions might be thrown!
Copilot and the like have been around for 4 years, and we’ve been hearing this all along. I’m bullish on LLM assistants (not vibe coding) but I’d love to see some of these things actually start to happen.
I feel like it has gotten better over time, but I don't have any metrics to confirm this. And it may also depend on what language/libraries you use.
It just feels to me like trying to derive correct behavior without a proper spec so I don't see how it'll get that much better. Maybe we'll collectively remove the pathological code but otherwise I'm not seeing it.
It's certainly annoying, but you can try following up with "can you please remove superfluous comments? In particular, if a comment doesn't add anything to the understanding of the code, it doesn't deserve to be there".
I'm having the same issue, and no matter what I prompt (even stuff like "Don't add any comments at all to anything, at any time") it still tries to add these typical junior-dev comments where it's just reiterating what the code on the next line does.
I prefer not to do that as comments are helpful to guide the LLM, and esp. show past decisions so it doesn't redo things, at least in the scope of a feature. For me this tends to be more of a final refactoring step to tidy them up.
I always thought these were there to ground the LLM on the task and produce better code, an artifact of the fact that it will autocomplete better based on past tokens. Similarly, I always thought this is why ChatGPT starts every reply by repeating exactly what you asked.
Comments describing the organization and intent, perhaps. Comments just saying what a "require ..." line requires, not so much. (I find it will frequently put notes on the change it is making in comments, contrasting it with the previous state of the code; these aren't helpful at all to anyone doing further work on the result, and I wound up trimming a lot of them off by hand.)
I have the same issue, plus unnecessary refactorings (that break functionality). It doesn't matter if I write a whole paragraph in the chat or the prompt explaining that I don't want it to change anything else apart from what is required to fulfill my very specific request. It will just go rogue and massacre the entire file.
This has also been my biggest gripe with Gemini 2.5 Pro. While it is fantastic at one-shotting major new features, when wanting to make smaller iterative changes, it always does big refactors at the same time. I haven't found a way to change that behavior through changes in my prompts.
Claude 3.7 Sonnet is much more restrained and does smaller changes.
This exact problem is something I’m hoping to fix with a tool that parses the source to an AST and then has the LLM write code to modify the AST (which you then run to get your changes) rather than output code directly.
I’ve started in a narrow niche of Python/Flask webapps and am constrained to that stack for now, but if you’re interested, I’ve just opened it for signups: https://codeplusequalsai.com
Would love feedback! Especially if you see promising results in not getting huge refactors out of small change requests!
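The general shape, with just the stdlib, is roughly this (a sketch, not my actual implementation; the rename is a made-up example change): the LLM writes an `ast.NodeTransformer` instead of raw source, and you run it over the parsed tree:

```python
import ast

source = open("app.py").read()
tree = ast.parse(source)

# The kind of code the LLM writes: a transformer that renames one
# function and leaves everything else untouched.
class RenameHandler(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        if node.name == "old_handler":
            node.name = "new_handler"
        return node

new_tree = ast.fix_missing_locations(RenameHandler().visit(tree))
print(ast.unparse(new_tree))  # ast.unparse requires Python 3.9+
```

One caveat with plain `ast` is that unparsing drops comments and formatting, which is part of why concrete-syntax-tree libraries like libcst exist.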
Interesting idea. But LLMs are trained on vast amounts of "code as text" and a tiny fraction of "code as AST"; wouldn't that significantly hurt the result quality?
Thanks, and yeah, that is a concern; however, I have been getting quite good results from this AST approach, at least for building medium-complexity webapps. On the other hand, this wasn't always true... the only OpenAI models that really work well are the o3 series. Older models do write AST code but fail to do a good job because of the exact issue you mention, I suspect!
Having the LLM modify the AST seems like a great idea. Constraining an LLM to only generate valid code would be super interesting too. Hope this works out!
Asking it explicitly once (not necessarily every new prompt in context) to keep output minimal and strive to do nothing more than it is told works for me.
Really? I haven't tried Gemini 2.5 yet, but my main complaint with Claude 3.7 is this exact behavior - creating 200+ line diffs when I asked it to fix one function.
This is generally controllable with prompting. I usually include something like, “be excessively cautious and conservative in refactoring, only implementing the desired changes” to avoid it.
I've used it via Google's own AI Studio, via my own library/program using the API, and finally via Aider. All of them lead to the same outcome: large chunks of changes to a lot of unrelated things ("helpful" refactors that I didn't ask for) and tons of unnecessary comments everywhere (like those comments you ask junior devs to stop making). No amount of prompting seems to address either problem.
Tell it not to write so many comments then. You have a great deal of flexibility in dictating the coding style and can even include that style in your system prompt or upload a coding style document and have Gemini use it.
Ok, so saying "Implement feature X" leads to a ton of comments. How do you rewrite that prompt so it doesn't include "don't write comments" but still gets output without comments? "Write only source code, no plain text with special characters at the beginning of the line", or what are you suggesting in practical terms?
I also include something about "Target the comments towards a staff engineer that favors concise comments that focus on the why, and only for code that might cause confusion."
I also try to get it to channel that energy into the docstrings, so it isn't buried in the source.
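The why-versus-what distinction, for the record (made-up snippet):

```python
retries = 3  # set retries to 3  <- the "what" comment LLMs love; pure noise

# The upstream API intermittently 502s during deploys; three attempts
# with backoff has been enough in practice.  <- the "why" a reviewer wants
retries = 3
```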
This is sort of LLM specific. For some tasks you can include the word "comment" as long as you give the instruction at both the beginning and the end of the prompt. This is very model dependent. Like:
Refactor this. Do not write any comments.
<code to refactor>
As a reminder, your task is to refactor the above code, and do not write any comments.
Yes, my suggestion is that negations can work just fine, depending on the model and task, and instead of avoiding negations you can try other prompting strategies, like emphasizing what you want at the beginning and at the end of the prompt.
If you think negations never work tell Gemini 2.5 to "write 10 sentences that do not include the word the" and see what happens.
"Implement feature X, and as you do, insert only minimal and absolutely necessary comments that explain why something is being done, not what is being done."
I usually ask ChatGPT to "comment the shit out of this" for everything it writes. I find it vastly helps future LLM conversations pick up all of the context and why various pieces of code are there.
If it is ingesting data, there should also be a sample of the data in a comment.
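Something like this at the top of a loader, for example (the file name and fields are made up):

```python
import json

# Sample line from events.jsonl, so future LLM passes (and humans) see the shape:
# {"user_id": 42, "event": "click", "ts": "2024-05-01T12:00:00Z"}
def load_events(path):
    with open(path) as f:
        return [json.loads(line) for line in f]
```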
Same experience. Especially the "step" comments about the performed changes are super annoying. Here is my prompt-rule to prevent them:
"5. You must never output any comments about the progress or type of changes of your refactoring or generation.
Example: you must NOT add comments like: 'Added dependency' or 'Changed to new style' or worst of all 'Keeping existing implementation'."
Depends on what you mean by "defensive". Anticipating error and non-happy-path cases and handling them is definitely good. Also fault tolerance, i.e. allowing parts of the application to fail without bringing down everything.
But I've heard "defensive code" used for the kind of code where almost every method validates its input parameters, wraps everything in a try-catch, returns nonsensical default values in failure scenarios, etc. This is a complete waste because the caller won't know what to do with the failed validations or thrown errors, and it's just unnecessary bloat that obfuscates the business logic. Validation, error handling and so on should be done in specific parts of the codebase (bonus points if you can encode the successful validation or the presence/absence of errors in the type system).
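Sketched out, the contrast looks like this (hypothetical names; `TIERS` and the `user` objects are stand-ins):

```python
TIERS = {"free": 0.0, "pro": 0.1}

# Defensive bloat: swallow everything, return a made-up default.
def get_discount_defensive(user):
    try:
        return TIERS[user.tier]
    except Exception:
        return 0.0  # caller can't distinguish "no discount" from "bad input"

# Validate at the boundary, then trust the data: an unknown tier is a
# bug, so let the KeyError surface where it happens.
def get_discount(user):
    return TIERS[user.tier]
```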
Lots of hasattr(...) rubbish. I've increased the amount of prompting, but it still does this; basically it defers its lack of compile-time knowledge to runtime: "let's hope for the best and see what happens!"
Trying to teach it FAIL FAST is an uphill struggle.
Oh and yes, returning mock objects if something goes wrong is a favourite.
It truly is an Idiot Savant - but still amazingly productive.
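For the record, the pattern being complained about, next to the fail-fast version (illustrative only):

```python
# What it writes: probe with hasattr and fall back to a mock value.
def get_name_hopeful(obj):
    if hasattr(obj, "name"):
        return obj.name
    return "Unknown"  # hides the real bug behind a plausible-looking value

# Fail fast: a missing attribute is a programming error; crash here,
# at the point of the mistake, not three layers downstream.
def get_name(obj):
    return obj.name
```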
Does the code consist of many large try/except blocks that catch the generic "Exception"? Gemini seems to like doing that. (I thought it was bad practice to catch the generic Exception in Python.)
Catching the generic exception is a nice middleground between not catching exceptions at all (and letting your script crash), and catching every conceivable exception individually and deciding exactly how to handle each one. Depends on how reliable you need your code to be.
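The defensible version of that middle ground is catching broadly only at a boundary, e.g. so one bad item doesn't kill a whole batch (a sketch; `process` and `jobs` are placeholders):

```python
import logging

def process(job):
    ...  # placeholder for real work that may raise anything

jobs = ["a", "b", "c"]
for job in jobs:
    try:
        process(job)
    except Exception:
        # Log the full traceback but keep going; one bad item
        # shouldn't abort the whole batch.
        logging.exception(f"job {job!r} failed; continuing")
```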
Maybe these comments actually originate from annotated training data? If I were adding code annotations for training data, I'd expect comments like these: not much value for me, but they give the model more contextual understanding…
2.5 is the most impressive model I've used, but I agree about the comments. And when refactoring code it wrote before, it just adds more comments; it becomes like archaeological history. (Disclaimer: I don’t use it for work, but to see what it can do, so I try to intervene as little as possible and get it to refactor what it thinks it should.)
My custom default Claude prompt asks it to never explain code unless specifically asked to. Also to produce modern and compact code. It's a beauty to see. You ask for code and you get code, nothing else.
I really liked the Gemini 2.5 Pro model when it was first released - the upload code folder was very nice (but they removed it). The annoying thing I find with the model is that it does a really bad job of formatting the code it generates... I know I can use a code formatting tool, and I do when I use Gemini output, but otherwise I find Grok much easier to work with, and it yields better results.
> I really liked the Gemini 2.5 Pro model when it was first released - the upload code folder was very nice (but they removed it).
Removed from where? I use the attach code folder feature every day from the Gemini web app (with a script that clones a local repo, deleting .git and anything matching a gitignore pattern).
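A minimal version of that kind of prep script, for anyone curious (the paths are made up; copying only tracked files via `git ls-files` means .git and anything gitignored never gets copied):

```python
import shutil
import subprocess
from pathlib import Path

def export_repo(repo: Path, dest: Path) -> None:
    # Tracked files only: gitignored files are untracked, so they're skipped.
    tracked = subprocess.run(
        ["git", "-C", str(repo), "ls-files"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    for rel in tracked:
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(repo / rel, target)

export_repo(Path("~/code/myproject").expanduser(), Path("/tmp/myproject-upload"))
```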
It just got removed from the Add menu for me too. Now I have to click "Import Code" and then the "Upload Folder" button in the dialog. Maybe you got this rollout much earlier than I did?
It’s annoying, but I’ve done extensive work with this model and leaving the comments in for the first few iterations produced better outcomes. I expect this is baked into the RL they’re doing, but because of the context size, it’s not really an issue. You can just ask it to strip out in the final pass.
So many comments, more verbose code and will refactor stuff on its own. Still better than chatgpt, but I just want a small amount of code that does what I asked for so I can read through it quickly.
That’s been my experience as well. It’s especially jarring when asking for a refactor as it will leave a bunch of WIP-style comments highlighting the difference with the previous approach.
It's trained on the Google style I guess. Google code always feels excessively commented, to the point where I delete comments from Google samples so I can read the code.
I have a feeling this may be a Cursor issue; perhaps Cursor's system prompt asks for comments? Asking in the AI Studio UI for code and ending the prompt with "no code comments" has always worked for me.
Another great tell of code reviewers YOLO-ing it: LLMs usually put the full file name/path at the top of the output, so if you see a file with the file name/path on the first line, that's probably LLM output.
And comments are bad? I mean, you could tell it not to comment the code, or to self-document with naming instead of inline comments. It's an LLM; it does what you tell it to.