Hacker News | retsibsi's comments

> The only people who use LLMs "as a tool" are those who are incapable of doing it without using it at all.

Do you mean that? It's clearly false, but I don't want to waste time gathering famous-person counterexamples if you already know it's a huge exaggeration at best.


I do believe it, but for whatever it's worth (maybe not much!):

If the author is willing and able to write understandable English, I'd rather read their version (even if it's very imperfect) than the LLM-polished version.

Alternatively, I'll happily read an article that was written in the author's native language and then translated directly to English.

This one bothered me because it's pretty clearly neither of those things, and so it reads just like any other LLM-written/LLM-polished piece.

[edit: just realised 'willing and able' might sound snarky in some way! All I meant was to acknowledge that even if you can write in a second (or third, etc.) language, you might not want to]


> detractors-for-no-good-reason

It's partly just a matter of taste; we can disagree on whether that's a good reason, but I'd be surprised if there were no writing styles that you personally find offputting.

The LLM smell is also a signal of low effort, and a signal that we as readers can't rely on our usual heuristics for judging credibility. The whole thing with LLMs is that they're great at producing polished, plausible-looking outputs, but they're still prone to bullshitting and making errors that don't match the usual human patterns. (And of course they're a great tool for churning out human-initiated disinformation.) If you don't have any kind of immune response against the LLM smell, I reckon you're probably absorbing more bs than you realise.


> signal of low effort

Is low effort really a valuable signal though? Or is it what's actually in the content that's valuable? Like here readers are literally saying that they found the content valuable "but AI smell". Why is there a "but"? Would there be a similar issue if the author had contracted a human assistant to do X? Definitely not, and I see no reason why the treatment should be different for AI.


A personal opinion: I would much rather read the rough, human version of this article than this AI-polished version. I'm interested in the content and the author clearly put thought and effort into it, but I'm constantly thrown out of it by the LLM smell. (I'm also a bit mad that `--` is now on the em dash treadmill and will soon be unusable.)

I'm not just saying this to vent. I honestly wonder if we could eventually move to a norm where people publish two versions of their writing and allow the reader to choose between them. Even when the original is just a set of notes, I would personally choose to make my own way through them.


I'm not going to pretend I'm great at reading social situations, but I think your approach in this story would have annoyed 99% of interviewers, even if they genuinely valued directness. If they'd asked for feedback on the interview process, then sure, they'd be a hypocrite if they claimed to value directness but got mad when you told them honestly that you were bothered by their lateness. But when they ask for questions, they're not inviting criticism, and framing the criticism as a question is always going to come across as passive aggressive. (edit: Or maybe 'snarky' is a better word here, as you did follow it up with a direct criticism, so 'passive aggressive' might not be quite right.)


Being annoyed is fine but I would argue that they should deal with it if they’re going to make me waste half the interview time sitting around.

I’ll admit to a bit of douchiness on my end but I think they should have understood the snarkiness in this situation if they value directness.


Fair enough. I wasn't there and this probably depends a lot on your tone and general vibe, the dynamic between you and the interviewer up until then, and so on. I do think it's almost always a risky move, but I think I assumed too much and I apologise for that.


> If we accept that any one person can take responsibility for their feelings then it follows that everyone is responsible for their own mind.

I don't think this follows! People are very different, so something can be genuinely true of a subset without generalising to everyone.

Crocker's Rules definitely wouldn't work for me, but it's explicit in them that they can only be self-invoked. Some people seem genuinely to be very thick-skinned (but easily annoyed by indirection and politeness) and able to 'take responsibility for their own feelings' in this sense. I doubt (m)any of them are truly unoffendable... and one could argue that they should be taking responsibility for their own feelings of frustration triggered by normal politeness... but I assume they know themselves well enough to know that they are better off when people try to be as direct as possible when interacting with them.

Where it breaks down is if/when they treat this as an objectively superior state of being and mode of interaction, and use it as an excuse to be rude to others.


I don't think it's intended as that kind of binary. It's more like "yeah, it's flawed in that way, and here's how you can get around that". If someone's claiming the tool is perfect, they're wrong; but if someone's repeatedly using it in the way that doesn't work and claiming the tool is useless, they're also wrong.


> completely meaningless

This is way too strong, isn't it? If the user naively assumes Claude is introspecting and will surely be right, then yeah, they're making a mistake. But Claude could get this right, for the same reasons it gets lots of (non-introspective) things right.


It's not too strong. If it answered from its weights, it's pretty meaningless. If it did a web search and found reports of other people saying this, you'd want to know that this is how it answered - and then you'd probably just say that here on HN rather than appealing to Claude as an authority on Claude.

They also said it "admitted" this as a major problem, as if it has been compelled to tell an uncomfortable truth.


GP here, this is indeed exactly what I was getting at, thanks for wording it for me; you put it better than I would've.

In this specific case I'd go one step further and say that even if it did a web search, it's still almost certainly useless because of the low quality of the results and their outdatedness, two things LLMs are bad at discerning. From weights it doesn't know how quickly this kind of thing becomes outdated, and out of the box it doesn't know how to account for reliability.


Maybe I'm just being too literal, but I don't know if you're really disagreeing with me. I was disputing "the response they give to this kind of question is completely meaningless". An answer from its weights is out of date, but only completely meaningless if this is a completely new issue with nothing relevant in the training data. And, as you say, the answer could be search-based and up to date.


In the original studies, most people made an error that can't be explained by that misunderstanding: they failed to select the card showing 'not y'.


From my armchair this feels relevant:

> Decoding analyses of neural activity further reveal significant above chance decoding accuracy for negated adjectives within 600 ms from adjective onset, suggesting that negation does not invert the representation of adjectives (i.e., “not bad” represented as “good”)[...]

From: Negation mitigates rather than inverts the neural representations of adjectives

At: https://journals.plos.org/plosbiology/article?id=10.1371/jou...


Quoting the Wikipedia article's formulation of the task for clarity:

> You are shown a set of four cards placed on a table, each of which has a number on one side and a color on the other. The visible faces of the cards show 3, 8, blue and red. Which card(s) must you turn over in order to test that if a card shows an even number on one face, then its opposite face is blue?

Confusion over the meaning of 'if' can only explain why people select the Blue card; it can't explain why people fail to select the Red card. If 'if' meant 'if and only if', then it would still be necessary to check that the Red card didn't have an even number. But according to Wason[0], "only a minority" of participants select (the study's equivalent of) the Red card.

[0] https://web.mit.edu/curhan/www/docs/Articles/biases/20_Quart...
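
For anyone who wants to check the argument mechanically, here is a rough sketch that brute-forces the four cards under both readings of 'if'. The function names are made up for illustration, and it assumes the only possible colours are blue and red (the task statement doesn't say so explicitly). A card needs turning exactly when some possible hidden face would break the rule.

```python
# Each card has a number on one side and a colour on the other.
VISIBLE = [3, 8, "blue", "red"]
COLOURS = ["blue", "red"]  # assumption: only these two colours exist
NUMBERS = [3, 8]           # one odd and one even number cover both parities


def conditional(number, colour):
    """'If the number is even, then the colour is blue.'"""
    return number % 2 != 0 or colour == "blue"


def biconditional(number, colour):
    """'The number is even if and only if the colour is blue.'"""
    return (number % 2 == 0) == (colour == "blue")


def must_turn(rule):
    """Return the visible faces whose hidden side could falsify the rule."""
    needed = []
    for face in VISIBLE:
        if isinstance(face, int):
            # Visible number: the hidden side is some colour.
            possible = [(face, colour) for colour in COLOURS]
        else:
            # Visible colour: the hidden side is some number.
            possible = [(number, face) for number in NUMBERS]
        if any(not rule(number, colour) for number, colour in possible):
            needed.append(face)
    return needed


print(must_turn(conditional))    # [8, 'red']
print(must_turn(biconditional))  # [3, 8, 'blue', 'red']
```

Under either reading the Red card is on the list, so misreading 'if' as 'if and only if' adds cards to check but never removes Red.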


People in everyday life are not evaluating rules. They evaluate cases, for whether a case fits a rule.

So, when being told:

"Which card(s) must you turn over in order to test that if a card shows an even number on one face, then its opposite face is blue?"

they translate it to:

"Check the cards that show an even number on one face to see whether their opposite face is blue and vice versa"

Based on this, many would naturally pick the blue card (to test the direct case), and the 8 card (to test the "vice versa" case).

They won't check the red card to see if there's an even number there that invalidates the formulation as a general rule, because they're not in the mindset of testing a general rule.

Would they do the same if they had more familiarity with rule validation in everyday life or if they had a more verbose and explicit explanation of the goal?


Yeah maybe if you phrased it as "Which card(s) must you turn over in order to ensure that all odd-numbered cards are blue?" you'd get a better response?


Even*


Exactly. We invented rule-based machines so that we could have a thing that follows rules, and adheres strictly to them, all day long.

I'm not sure why people keep comparing machine behaviour to humans'. It's like economic models that assume perfect rationality... yeah that's not reality mate.


I confidently picked 8+blue and am now trying to understand why I personally did that. I think that maybe the text of the puzzle is not quite unambiguous. The question states "test a card" followed by "which cards", so this is what my brain immediately starts to check - every card one by one. Do I need to test "3"? No, not even. Do I need to test "8"? Yes. Do I need to test "blue"? Yes, because I need to test "a card" to fit the criteria. And lastly the "red" card also immediately fails verification of "a card" fitting that criteria.

I think a corrected question should clarify in some obvious way that we are verifying not "a card" but "a rule" applicable to all cards. So "a" needs to be replaced with "all" or "any", and a mention of "rule" or "pattern" needs to be added.


It also doesn't explain why people don't think it necessary to check the 3 to make sure it's not blue (which would be necessary if "if" meant "if and only if").

