
pablobaz · 2026-03-19T11:32:15 1773919935

which bits of this do you think llm based agents can't do?

interstice · 2026-03-19T11:44:58 1773920698

Not get stuck on an incorrect train of thought, not ignore core instructions in favour of training data like breaking naming conventions across sessions or long contexts, not confidently state "I completely understand the problem and this will definitely work this time" for the 5th time without actually checking. I could go on.

mbesto · 2026-03-19T12:46:01 1773924361

LLMs by their nature are not goal orientated (this is a fundamental difference between reinforcement learning and neural networks, for example). So a human will have, let's say, the ultimate goal of creating value with a web application they create ("save me time!"). The LLM has no concept of that. It's trying to complete a spec as best it can with no knowledge of the goal. Even if you tell it the goal, it has no concept of the process to achieve it or to confirm the goal was attained - you have to tell it that.

ModernMech · 2026-03-19T11:38:29 1773920309

The main thing they cannot do is be held accountable for any decisions, which makes them not trustworthy.

vbezhenar · 2026-03-19T11:42:20 1773920540

This is not correct. They can say "sorry", which makes them as accountable as an ordinary developer.

bluefirebrand · 2026-03-19T12:15:40 1773922540

That's not what accountability is

pixl97 · 2026-03-19T15:41:43 1773934903

Accountability: "Something that SWE's run screaming from".

Example: "We should have professional accountability in software"

SWE: "This would bring about the end of the world!!!1!"

bigfishrunning · 2026-03-19T16:09:56 1773936596

The economics of software development have lowered the bar for software engineers: there simply aren't enough people who are good at it (or even want to be), and the salaries are very high, so plenty of people who shouldn't be SWE's are.

I am a software engineer, and I would absolutely love to see more professional accountability in this field. Unfortunately, it would make the cost of software go up significantly (because many many people writing software would be ejected from the industry)

interstice · 2026-03-19T11:48:02 1773920882

I've found recent versions of Claude and codex to be reluctant in this regard. They will recognise the problem they created a few minutes ago but often behave as if someone else did it. In many ways that's true though, I suppose.

bee_rider · 2026-03-19T13:21:57 1773926517

Does it do this for really cut and dry problems? I’ve noticed that ChatGPT will put a lot of effort into (retroactively) “discovering” a basically-valid alternative interpretation of something it said previously, if you object on good grounds. Like it’s trying to evade admitting that it made a mistake, but also find some way to satisfy your objection. Fair enough, if slightly annoying.

But I have also caught it on straightforward matters of fact and it’ll apologize. Sometimes in an over the top fashion…

bigfishrunning · 2026-03-19T16:07:45 1773936465

Ordinary developers get fired for poor performance *all the time*.

datsci_est_2015 · 2026-03-19T15:17:18 1773933438

LLM based solutions don’t need to stay dry and warm at night, with a full belly, possibly with their sexual partner with whom they have a drive to procreate.

MoreQARespect · 2026-03-19T14:08:04 1773929284

any of them.

pablobaz · 2025-09-09T15:28:22 1757431702

Or:

https://en.wikipedia.org/wiki/Gareth_Anscombe

:-)

pablobaz · 2025-08-01T19:52:56 1754077976

That could work. 15 managers doing 10 1:1 meetings each isn't so hard. It can get tricky with people being on vacation etc. But very possible and normal.

cyberpunk · 2025-08-01T20:07:43 1754078863

Have you ever had to do these? 10 back to back layoffs is a rough day. I had to do 5 in one day once and had to seek out a very expensive hangover.

Sucks for everyone. I’ve been laid off by email, it’s fine.

pluto_modadic · 2025-08-02T06:09:05 1754114945

maybe doing something wrong should be emotionally painful for managers and HR staff.

cyberpunk · 2025-08-02T19:18:43 1754162323

Not all layoffs are “wrong”; we maybe kept 200 people employed for a year or two longer than we would have by restructuring and laying off 20-30.

Net positive in my book. Of course on an individual level it sucks, these are people with dependants and so on, but so are the 200 who were saved.

quietbritishjim · 2025-08-01T20:23:37 1754079817

That's not so good for the people remaining, or even those laid off but later in the queue. Once the first person gets laid off, everyone will know it's happening and be wondering whether they're included. You're just dragging out the suspense over the hours or (more likely) days those meetings take place, rather than getting it out of the way in a few minutes. That's probably worse than the dubious joy of a personalised message about your termination.

(Though, here in the UK, redundancy procedures can take weeks, so a few days is not much compared to that.)

Someone1234 · 2025-08-01T20:08:14 1754078894

What if their direct manager was also terminated? It could result in a manager's manager having such a large cohort that it takes several days while employees wait to see if they're fired or not (word would get out immediately).

kimos · 2025-08-01T20:35:04 1754080504

Or some other unrelated manager doing the firing.

kimos · 2025-08-01T20:34:31 1754080471

This is how I have seen it done. You end up with managers firing people they do not know, and employees getting 15 min meeting invites and knowing what it means. But it’s much more compassionate and human.

pablobaz · on March 12, 2025

https://www.youtube.com/watch?v=pit0OkNp7s8 This Irish Sheep farmer is my favorite example of a hard to understand Irish accent. I've lived nearby to this location and can attest that it is quite common.

pablobaz · on March 12, 2025

Some of the west of Ireland accent also reflects pronunciation prior to the Great Vowel Shift. For example pronouncing "tea" as "tay" and "meat" as "mate"

https://en.wikipedia.org/wiki/Great_Vowel_Shift

pablobaz · on Jan 15, 2025

If the child had practiced on a balance bike for balance and a tricycle for pedalling it all comes together quite easily.

pablobaz · on Jan 7, 2025

In my experience with very large codebases, a common problem is devs trying to improve random things.

This is well intentioned. But in a large old codebase finding things to improve is trivial - there are thousands of them. Finding and judging which things to improve that will actually have a real positive impact is the real skill.

The terminal case of this is developers who, in the midst of another task, try to improve one little bit, but pulling on that thread leads to them attempting bigger and bigger fixes that are never completed.

Knowing what to fix and when to stop is invaluable.

jimbokun · on Jan 8, 2025

Which can lead to trying to rewrite Netscape Navigator from scratch and killing the company:

https://www.joelonsoftware.com/2000/04/06/things-you-should-...

ninalanyon · on Jan 8, 2025

> common problem is devs trying to improve random things.

Been there, been guilty of that at the tail end of my working life. In my case, looking back, I think it was a sign of burnout and frustration at not being able to persuade people to make the larger changes that I felt were necessary.

Kinrany · on Jan 8, 2025

Do you think boyscouting, "leave it better than you found it" is misguided as well?

bogdan · on Jan 8, 2025

I always took it as "leave it better than you found it" across the files that I've been working on (with some freedom as long as I'm on schedule). My focus is to address the ticket I'm working on. Larger improvements and refactorings get ticketed separately (and yes, we do allocate time for them). In other words, I don't think it's misguided.

bricestacey · on Jan 8, 2025

I do not believe in "boyscouting". I think if you want to leave it better, make a ticket and do it later. Tacking it on to your already planned work is outside the scope of your original intent. This will impact your team's ability to understand and review your changes. Your codebase is unlikely to be optimized for your whimsy. Worse though is when a reviewer suggests boyscouting.

I've seen too many needless errors after someone happened to "fix a tiny little thing" and then fail to deliver their original task and further distract others trying to resolve the mistake. I believe clear intention and communication are paramount. If I want to make something better, I prefer to file a ticket and do it with intention.

bitcraft · on Jan 9, 2025

Boyscouting works because you don’t need to get permission to fix tech debt when it is bundled with something else. 98% of those tickets you file to fix warts will never be addressed because the business demands that time is spent on features that make money.

_Tev · on Jan 11, 2025

Isn't the point of OP of this thread that most of those wart-fixes are pointless?

I tend to agree, if you can't sell it as a ticket you probably shouldn't work on it. And "boyscout" PRs are a pain to review.

pablobaz · on Nov 15, 2024

What you are seeing is probably cold-induced vasodilation:

https://pmc.ncbi.nlm.nih.gov/articles/PMC4843861/

Incidentally, there are some studies that show you get better at it with more frequent exposure. I have kayaked for many years and have found this to be the case - if my hands get cold now, dipping them into the water to cool them further, hence opening the veins, is a very effective if counterintuitive way of warming my hands up.

pablobaz · on Nov 11, 2024

As the article discusses you don't need to ban alcohol you can just make it more awkward:
- tax it
- restrict the sales by age, location and time (see Nordic countries for a really strict version of this)
- minimum unit pricing
- warning labels

Etc. You can argue if this is the right thing to do or not but it is enforceable and there's good evidence that these measures reduce consumption and harms.

pablobaz · on Nov 11, 2024

It goes deeper than that. Fundamental forces have chirality. This was a little controversial when first discovered.