
The important thing to remember is that for a large number of people (in the US), "work" is a place where they do things that they hate for eight hours a day, for people they hate (surveys routinely show between 40% and 60% of people are "satisfied" with their jobs). Those of us who are in the tech industry because we like actually programming computers (the "craft-lovers", in the parlance of this blog post) have been lucky enough to have jobs where we get to actually do something we enjoy (even if it's intermingled with meetings and JIRA). If AI slop really is the future and programming becomes as rare a job as hand-building wood furniture, then most of us are going to be living the normal experience of capitalism in a way that we are probably not well-prepared for.

Personally, I have noticed that I still produce substantially more and better code than the people at my company spending all day writing prompts, so I'm not too worried yet, but it seems plausible at some point that a machine that stole every piece of software ever written will be able to reliably turn a few hundred watt-hours of electricity into a hallucination-free PR.


I agree some people go to work to work, and Claude is fine / good for them, but I feel that characterization of those of us who are loving Claude is disingenuous. I’m a creative; while I loved coding and honed my craft, it was creating that always had me hooked. Claude is creating on steroids. Not to mention, it can help you massively improve your code cleanliness. All of the little nice-to-have features, the cleanups, the high unit test coverage, nagging bug fixes, etc., they’re all trivial to do now.

It’s not the same as writing code, but it’s fun.

If your coworkers can’t outpace your code output they’re either not using opus4.6 or they aren’t really trying.

It’s pretty easy to slam 20 PRs a day with some of them being complex changes. The hardest part is testing the diffs, but people are figuring that out too.


I have a suite of Claude skills all about craftsmanship. Refactoring, renaming, deconstructing god classes, detecting deleted code, etc. I've never written better, more readable, more maintainable code in my life than I have with Claude. All of that is trivial and hardly takes any time at all to accomplish.

Before moving to agentic AI, I thought I'd miss the craftsmanship aspect. Not at all. I get great satisfaction out of having AI write readable, maintainable code with me in the driver's seat.


But, would you feel that same satisfaction out on the street?


> it can help you massively improve your code cleanliness. All of the little nice-to-have features, the cleanups, the high unit test coverage, nagging bug fixes, etc., they’re all trivial to do now.

It can help if you write poor code without it, probably

High unit test coverage only means something if you carefully check those tests, and if everything was tested


The only way Claude can help improve your code cleanliness is if you write poor code?

Code coverage means nothing if you didn't carefully check every test? "and if everything was tested" do you know what code coverage is?

not gonna engage the trolling


> The only way Claude can help improve your code cleanliness is if you write poor code?

No? You assert that it writes better code than the average software developer?

> Code coverage means nothing if you didn't carefully check every test? "and if everything was tested" do you know what code coverage is?

Do you know?

Code coverage only tells what amount of the code gets *touched* by the tests.

To achieve code coverage it's enough to CALL the code, it doesn't tell you anything about the correctness of the tests: they could all end with a return true, and a code coverage tool would be perfectly happy.
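To make that point concrete, here's a minimal sketch (the function names are hypothetical) of a test that achieves full line coverage while asserting nothing:

```python
def apply_discount(price, rate):
    # Buggy on purpose: should be price * (1 - rate),
    # but no coverage tool will ever notice.
    return price * rate

def test_apply_discount():
    # Executes every line of apply_discount, so a coverage tool
    # reports 100% -- yet nothing about the result is checked.
    apply_discount(100.0, 0.2)
    return True

test_apply_discount()
```

A coverage report shows `apply_discount` fully covered, yet the bug survives because the "test" never inspects the return value.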

So, yes, if you don't carefully check the test suite that the agent writes, it might well be worthless (or simply much less useful than you assume it to be, more realistically).

With "if everything was tested" I meant that you also need to check if the agent wrote all the tests that are needed, besides verifying that the ones it wrote are correct.


> You assert that it writes better code than the average software developer?

Absolutely. It contains a lot, if not the majority, of all the code available at our hands right now and can reason, whatever it means for LLMs to reason anyway, about it. It absolutely demolishes the average software developer and it’s not even close.

> To achieve code coverage it's enough to CALL the code, it doesn't tell you anything about the correctness of the tests: they could all end with a return true, and a code coverage tool would be perfectly happy.

> So, yes, if you don't carefully check the test suite that the agent writes, it might well be worthless (or simply much less useful than you assume it to be, more realistically).

That’s like saying that if you don’t check every line your coworker writes it becomes worthless.


You seem to have a pretty big misunderstanding of LLMs.

> > So, yes, if you don't carefully check the test suite that the agent writes, it might well be worthless (or simply much less useful than you assume it to be, more realistically).

> That’s like saying that if you don’t check every line your coworker writes it becomes worthless

A coworker is (supposedly) a *competent person*, placing some trust on that is not stupid.

You'd usually want to have a review of everything everyone does, but even if you don't do it, a reasonably competent and honest developer is never going to be as misleading as LLMs can be.

Furthermore, in traditional development, even without code reviews you will look at the test suite every once in a while (or rather, always, when you touch the code that it tests).


[flagged]


"Stealing" has been used for copyright infringement since forever; it is the correct word


That's absolutely not true. The places that have embraced "agentic engineering" are mostly garbage factories, and lots of places, including plenty of startups and fast-moving companies are staying off of this trend. I recognize that most of the people on this site are just trying to self-promote for their own gig, but the level of misinformation is sometimes just staggering.


Want something to be terrified of?

I work at a massive health care company. They're 100% on the AI bandwagon and are putting AI everywhere they can. Billing, Software, DevOps, everywhere. If you think you can give an agent some information and have it go to work for some user, it's 100% on the table for the company to do that and either a) outsource the rest offshore or b) lay the person off or shrink the department to increase the bottom line.

Your healthcare, right now, is being offloaded to AI agents and bots and this is only the beginning.


I literally just sat through the annual “choose your healthcare” plan bullshit and the “meeting” was literally one of the HR people pulling up a PowerPoint narrated by “AI”. You could tell in the first ten seconds.

You’d think our plans would be cheaper given they’re offloading all this work to agents they don’t have to pay a salary to…right?


>lots of places, including plenty of startups and fast-moving companies are staying off of this trend.

Provide some examples then? Everyone who is all in on agentic code are pretty vocal about it. Who is declaring the opposite stance? Anyone?


Both claims are hyperbole.

Reality remains in the middle, but there are plenty of examples of either extreme right now.


Indeed, I feel this place has gone insane. There's no balance here.

You've got boosters and then you've got people who are panicking/fighting against anything pro-AI.


It is not just startups or small companies embracing agentic engineering… Stripe published blog posts about their autonomous coding agents. Amazon is blowing up production because they gave their agents access to prod. Google and Microsoft develop their own agentic engineering tools. It’s not just tech companies either, massive companies are frequently announcing their partnerships with OpenAI or Anthropic.

You can’t just pretend it’s startups doing all the agentic engineering. They’re just the ones pushing the boundaries on best practices the most aggressively.


So what's going to happen in 3 years after these startup bros have left government, none of the frameworks they're using are supported any more, and nobody in the office that they parachuted into is trained to maintain whatever spaghetti they crapped out over three months of all-nighters? There's a reason that we don't build critical infrastructure by giving it to some guy whose entire accomplishments are "working at Airbnb for 10 years".


You paint with too broad a brush! Having worked at Airbnb for 10 years would have been fine. The problem is that DOGE was staffed by twenty-year-olds. How would that have worked? They started at Airbnb when they were 10?


What's the relationship to doge?


They came into this project from DOGE. They just didn't mention it.


I was trying to find evidence for that but nothing popped up, where did you find that?


DCR is cool, but I haven't seen anyone roll it out. I know it has to be enabled per-tenant in Okta and Azure (which nobody does), and I don't think Google Workspace supports it at all yet. It's a shame that OIDC spent so long and got so much market-share tied to OAuth client secrets, especially since classic OpenID had no such configuration step.


DCR is now being pushed by AI companies, using the MCP protocol that basically requires DCR.

So it might get some traction, and finally break the monopoly of "Login With Google" buttons.
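For context, Dynamic Client Registration (RFC 7591) lets an OAuth client register itself at runtime instead of relying on a pre-provisioned client secret. A minimal sketch of the registration request body (the endpoint URL and client details below are made up; a real client discovers the endpoint from the server's OAuth metadata):

```python
import json

# Hypothetical registration endpoint; real clients read this from
# the authorization server's metadata document.
registration_endpoint = "https://auth.example.com/oauth/register"

# RFC 7591-style client metadata for a public client with no secret,
# which is roughly what MCP clients need.
payload = {
    "client_name": "Example MCP Client",
    "redirect_uris": ["https://client.example.com/callback"],
    "grant_types": ["authorization_code"],
    "token_endpoint_auth_method": "none",  # public client, no client secret
}

body = json.dumps(payload)
```

The server's response would include a freshly minted `client_id`, which is exactly the step that removes the manual "register your app and copy the secret" dance.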


This is because the MCP folks focus almost entirely on the client developer experience at the expense of implementability. CIMD is a bit better and will supplant DCR, but overall there's yet to be a good flow here that supports management in enterprise use cases.


Some API questions/observations

- I don't see an idempotency key in the request to authorize a charge; that might be something nice for people looking to build reliable systems on this.

- How long are accessTokens valid? Forever? Do they become invalid if the subject metadata (firstName, lastName, email) changes?

I think this is a super-cool idea, but I think the idea of extending net30 terms to every customer of some B2C product seems pretty iffy; since you're deferring charging until the end of the month, you won't get most of the fraud signals from Stripe until then and anything popular that used this system seems like it'd be pretty inundated with fraud. I would at least consider doing the charges more frequently (i.e., charge at the end of the month or every $50, whichever comes first) to put a better bound on how long you can go before finding out that someone gave you a stolen card.
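On the idempotency key suggested above: it's a client-supplied unique token the server uses to deduplicate retries, so a network timeout plus retry can't double-charge. A minimal sketch of the server-side logic (names and shapes are illustrative, not this API's actual design):

```python
import uuid

# Hypothetical store mapping idempotency key -> previously computed response.
_processed = {}

def authorize_charge(idempotency_key, amount):
    """Return the stored result on retry instead of charging twice."""
    if idempotency_key in _processed:
        return _processed[idempotency_key]
    result = {"charge_id": str(uuid.uuid4()), "amount": amount}
    _processed[idempotency_key] = result
    return result
```

A client that retries with the same key gets back the original charge rather than creating a second one; a new key creates a new charge.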


We run Stripe Radar and 3-D Secure when adding a card (before first use), which filters out a lot of obvious fraud (and 3DS often shifts liability to card networks in many regions).

The balances are not settled just at the end of the month. Each customer has a "maximum owed limit", which starts low (currently 10 USD) and grows with successful payments (up to 30 USD currently). The customer is charged as soon as they hit that limit (with some grace to allow for continued use).

Idempotency keys are on the near-term roadmap. Access tokens do not currently expire; however, they can be revoked by the customer at any time.
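The growing max-owed limit described above might look something like this as logic (all numbers and function names are assumptions based on the comment, not the actual implementation):

```python
def max_owed_limit(successful_payments, start=10.0, cap=30.0, step=5.0):
    """Per-customer risk limit: starts low and grows with payment history."""
    return min(cap, start + step * successful_payments)

def should_charge(balance_owed, successful_payments):
    """Charge as soon as the customer's balance hits their current limit."""
    return balance_owed >= max_owed_limit(successful_payments)
```

Under these assumed numbers, a brand-new customer is charged at 10 USD owed, and after a few successful payments the limit saturates at 30 USD.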


I really like the growing maximum owed limit idea and it would be really interesting to also make it possible to set an auto payment threshold lower than it.

The idea being something like "charge at X € owed but let it go up to N*X € if a payment fails before suspending service", where the N scales with something like the number of paid invoices or even total past spend.


The primary objective of the max-owed limit is to cap per-customer risk.

If you are suggesting that the max-owed limit is actually N×X, then that would multiply worst-case exposure by N, which is undesirable.

If you are suggesting that we charge the customer when they owe X while their max-owed limit is N×X, this would be worse for the customer, since they would pay `N × (X × variable_rate_fee + fixed_rate_fee)` instead of `N×X×variable_rate_fee + fixed_rate_fee` in payment processing fees.
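A worked example of that fee difference (assuming Stripe-like rates of 2.9% + $0.30 per transaction and X = $10, N = 3; the actual rates aren't stated in the thread):

```python
variable_rate = 0.029   # assumed 2.9% per transaction
fixed_fee = 0.30        # assumed $0.30 per transaction
X, N = 10.0, 3

# Charging N times at X each: the fixed fee is paid N times.
fees_per_chunk = N * (X * variable_rate + fixed_fee)

# One charge of N*X: the fixed fee is paid only once.
fees_one_charge = N * X * variable_rate + fixed_fee

# The customer pays (N - 1) extra fixed fees under per-chunk charging.
extra = fees_per_chunk - fees_one_charge
```

With these numbers, per-chunk charging costs $1.77 in fees versus $1.17 for a single charge: $0.60 more, i.e. exactly (N − 1) extra fixed fees.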


If your payment processor takes a per-transaction fee (not all do) then yes, this is a slightly worse deal for the customer. However, I think this would still be a good choice to give them, even if many would probably not chose it in order to save a few cents.

If I burn 1 € per day and my max owed based on whatever risk assessment is 20 €, I can set my payment threshold to 15 €, meaning if a payment fails, I have 5 days to fix it and settle the debt before you suspend my access. If the trigger amount is the same as the max owed, I have zero time (well, presumably there is already some wiggle room for the time it takes to process the transaction).


I see what you mean — yes, this could be useful to some customers. We already implement a small grace amount above the max-owed limit that allows for continued service; your idea would essentially allow the customer to increase the default grace amount.


A better model I had in mind works like this: customers purchase tokens in any amount they choose. Companies then charge for their services using these tokens through the platform's APIs. At the end of each month, settlements are made based on the total token value. The smallest token unit could be as small as one-millionth of a dollar.

It’s similar to a digital wallet, but without currency conversion: customers cannot exchange tokens back into money.


That approach generally doesn't work from a legal perspective: prepaid tokens are often treated as e-money (especially if they're not for the company's own products or services), and in many jurisdictions, holding value for users requires an e-money/money transmitter license.


In the EU, this depends mainly on the question of exchange/interchangeability: if you sell them as vouchers and do not allow redeeming/payout in the original cash, it's not a problem.


The key legal issue is interchangeability. Single-merchant vouchers are generally acceptable. If a voucher can be used across multiple merchants, it's often treated as e-money in the EU. Not being able to use funds across multiple merchants would significantly reduce the value for customers, as they would no longer be able to share payment processing fees across merchants.


I kind of expected this, though did not want it this way :( ... it seems governments will go to any extent to prevent the creation of alternative sources of value other than the ones they can fully control... for good mostly, bad at other times.


Surely you want any company that offers a prepaid credit card to be regulated, so that you can be extremely sure they won't just take your money and run.

And what really is the difference between a prepaid credit card and prepaid credits you can use at a large selection of tech companies. (Legally there is no distinction)


The problem is the constant per-transaction charge that comes along with the percentage part of the commission that payment processor companies take.

Spending $100 in $1 transactions means I end up spending an extra $30, on top of the percentage charges.

A system based on tokens only takes the percentage part (as expected), but the constant part is added just once.

It opens up per-request charging model, across service providers.

This benefits both: the consumers, for obvious reasons, and the sellers, since now customers don't have to "commit" to a subscription or a large charge for a service they may or may not use or continue.


I don't really see the connection between my comment and your reply. Constant charges aren't necessary for regulatory reasons?


Misread your question. Apologies.


> anything popular that used this system seems like it'd be pretty inundated with fraud

I coined "micropayments means microfraud"; I would expect this to have similar situations to the AWS mystery bill problem, but on a tiny scale. If you can charge customers without their confirmation it's easy to run up bills. And of course the amounts are so tiny you can't afford dispute resolution.


Yes, merchant abuse is a risk. What we do and plan to do:

  - Each merchant requires an OAuth grant, and customers can revoke it at any time.
  - A customer ledger shows what, when, and how much each merchant charged. This can be shown in the customer's dashboard and monthly statement emails.
  - Customers have account-level spending caps to limit exposure. We will add per-merchant caps.
  - If patterns look off or we get complaints, we can pause new charges and review.


I think this is ignoring a lot of prior art. Our deploys at Yelp in roughly 2010 worked this way -- you flagged a branch as ready to land, a system (`pushmaster` aka `pushhamster`) verified that it passed tests and then did an octopus merge of a bunch of branches, verified that that passed tests, deployed it, and then landed the whole thing to master after it was happy on staging. And this wasn't novel at Yelp; we inherited the practice from PayPal, so my guess is that most companies that care at all about release engineering have been doing it this way for decades and it was just a big regression when people stopped having professional release management teams and started just cowboy pushing to `master` / `main` on github some time in the mid 2010's.
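The batching flow described above (collect ready branches, octopus-merge them, test the combined result, land only if green) is the same idea behind today's merge queues. A generic sketch of that logic, with the merge/test/land hooks as placeholders rather than Yelp's actual `pushmaster` tooling:

```python
def process_merge_queue(ready_branches, merge, run_tests, land):
    """Batch all ready branches into one candidate merge and land it
    only if the combined result passes tests, mirroring the flow
    where nothing reaches master until the batch is green."""
    candidate = merge(ready_branches)   # e.g. an octopus merge of all branches
    if run_tests(candidate):
        land(candidate)                 # fast-forward the mainline to the batch
        return True
    return False                        # batch rejected; bisect or retry later
```

The key property is that `master` only ever moves to a commit that has already passed the full test suite, rather than being tested after the fact.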


That's super interesting, thanks for sharing the Yelp/PayPal lineage. You're right: there's probably a lot of prior art in internal release engineering systems that never got much written up publicly.

The angle we took in the blog post focused on what was widely documented and accessible to the community (open-source tools like Bors, Homu, Bulldozer, Zuul, etc.), because those left a public footprint that other teams could adopt or build on.

It's a great reminder that many companies were solving the "keep main green" problem in parallel (some with pretty sophisticated tooling), even if it didn't make it into OSS or blog posts at the time.


Gotta be honest: the AI-ness of both the images and the text in this blog post (as well as your response) leaves a bad taste.


> we inherited the practice from PayPal

PayPal got it from eBay, which in the 2000s was rolling out 20M LOC worldwide every week or two on "release trains". There, a small team of kernel engineers rotated doing the merging -- two weeks of ClearCase hell when it was your turn.

And, since eBay wrote their own developer tools, you'd have to deploy different tooling depending on the branch you were on. But because of their custom tooling, if there was a problem in the UI, in debug mode you could select an element in the browser UI and navigate to the java class in a particular component and branch that produced that element.


For most users, that'll just result in them going to Google, searching for the name of your business, and then clicking the first link blindly. At that point you're trusting that there's no malicious actors squatting on your business name's keyword -- and if you're at all an interesting target, there's definitely malvertising targeting you.

The only real solution is to have domain-bound identities like passkeys.


It's really "cool" when you get vendors like 6sense that combine browser fingerprinting with semi-licit data brokers to do full deanonymization of visitor traffic. Why bother doing marketing when you can just get a report of the name, email address, mailing address, and creditworthiness of every person who's visited your website?

I've seen people argue with a straight face that these tools and their reports don't run afoul of GDPR/CCPA because they don't involve information that a user gave you on purpose, so it's not protected. Ghouls, all of them.


It does seem like there's something wrong with that data; I find it somewhat implausible that the average parent was only caring for their child for 1.7 hours a day in 1985; even if you assume that all of the tweens and teens were free-range and only got an hour or two of parenting a day, little kids have always required nonstop attention to make sure that they're not actively dying.

Although... the infant mortality rate in the US has dropped by more than 50% since 1985, so who knows...


Yeah, I've wondered if there is some sort of change in how people think about and label their activities. Would a 1950s parent even think of themselves as doing a defined activity called "childcare"? Or rather, the children are just around, as the parent is doing things. If I am cooking dinner while a toddler putters around the floor and a baby is in a high-chair eating scraps I give him, am I doing "childcare"? Would a 1950s parent think of that as doing "childcare"?


Toddlers don’t just putter around. They want to be wherever you’re at doing whatever you’re doing and opening all the cabinets and boxes and pulling everything out to look at it. I think people were more apt to put them to work around the house in the past whereas now people infantilize them more. My son doesn’t speak very well as a 19 month old but he understands a lot and pays attention, and right now we’re trying to figure out how to put him to work in the kitchen and around the house so he feels involved and we get what little help he is able to contribute.


I was born in '83 and I'd say this mostly describes my upbringing. We were left to our own devices the vast majority of the time. By the time I hit my teens, most days I'd barely see my parents at all. At some point you've got kids raising other kids as the parents are absent.


I mean, ~90M people live in one of the top 10 metro areas, which is about ¼ of the country. Not sure that I'd necessarily call that an "exception".


So 75% live outside of it. Yeah, I'd say the majority lives this way, and to live otherwise is an exception for the remaining 25%. And even within those top 10, some are more like what I describe. There are definitely parts of those metros where the "mile a minute" travel estimate from uncongested highways applies. Certainly true for Philadelphia outside the ~50 sq mi of the gridded central city. In places like Houston, the average home is only like $250k, pretty much at parity with Midwest prices.

