I've been programming for literally my entire life. I love it, it's part of me, and there hasn't been more than a week in 30 years that I haven't written some code.
This is the first time that I feel a level of anxiety when I am not actively doing it. What a crazy shift that I am still so excited and enamored by the process after all of this time.
But there's also a double-edged sword. I'm having an even harder time than usual moderating my working hours, which I naturally struggle with anyway. Partly because I'm having so much fun and being so productive, but also because it's just so tempting to add one more feature, fix one more bug.
None in my area. Time to disperse. Get out of major cities like the pandemic promised. Fill in this great country we live in. Proliferate the government's surveillance for them.
AI agents like Claude Code are in an arms race to the bottom. Just like frontier model quality, they all converge on the same feature set over time (plan mode, skills, remote execution, sandboxing, etc.), and opencode is holding its own, preferred even, in a lot of cases.
The real differentiated value comes from the environment the AI Agent operates in, the runtime.
The runtime is agent-agnostic but provides a stable interface to your domain. People tried this with MCP, but MCP is a dead end. Local tool calling is so much better. Being able to extend integrations autonomously is the way, instead of being forced into a bloated bag of tools.
This is why we built swamp - https://swamp.club. We can integrate with any technology with an API, CLI, or code base and build repeatable, typed, validated automation workflows. There are no providers to wait for. No weird tool call paths trying to get the right concoction of MCP. The agent builds the integration itself, on the spot, in minutes.
It serves UK, EU, and various US States' regulations to "protect the kids".
Discord is only the next biggest canary in the coal mine. These regulations are going to force a lot more websites and apps to do this, too.
I wish these sorts of regulations had been written hand-in-hand with a more directly technically-minded approach. The world needs a better technical way to verify a person's estimated age cohort without a full ID check and/or an AI-analyzed video face scan before "every" website that may post "adult content" (however you choose to define that) is required to perform such checks.
Math education has always been a failure, or a "crisis." The number of people who come out of school with any functional math ability has been fairly constant over the decades, and depends a lot on family background and economic class. I'm not even sure that differences across countries are all that significant when people reach adulthood.
Don't get me wrong. I was one of the successful ones, but I think math education is in need of reform. In fact I would reform it quite radically.
You write a generic architecture document on how you want your code base to be organized, when to use pattern x vs pattern y, examples of what that looks like in your code base, and you encode this as a skill.
Then, in your prompt, you give it the task you want and tell it to supervise the implementation with a sub-agent that follows the architecture skill, evaluating any proposed changes.
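As a concrete sketch of the skill described above, assuming the SKILL.md-with-frontmatter layout Claude Code uses for skills (the path, rules, and example here are entirely hypothetical placeholders for whatever your own architecture document says):

```markdown
<!-- .claude/skills/architecture/SKILL.md (hypothetical example) -->
---
name: architecture
description: House rules for organizing this code base. Consult before
  proposing or reviewing any structural change.
---

# Architecture rules

- Use the repository pattern for persistence; never query the DB from handlers.
- Prefer composition over inheritance, except for framework base classes.

## Example (from this code base)

Handlers call `UserRepository.find(id)`; they never call `db.query(...)` directly.
```

The key point is that the skill encodes the "when to use pattern x vs pattern y" decisions once, so both the implementing agent and the reviewing sub-agent judge changes against the same written standard.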
There are people who maximize this, and this is how you get things like teams. You make agents for planning, design, qa, product, engineering, review, release management, etc. and you get them to operate and coordinate to produce an outcome.
That's what this is supposed to be, encoded as a feature instead of a best practice.
Aren't you just moving the problem a little bit further? If you can't trust it will implement carefully specified features, why would you believe it would properly review those?
It's hard to explain, but I've found LLMs to be significantly better in the "review" stage than the implementation stage.
So the LLM will do something and not catch at all that it did it badly. But the same LLM, asked to review against the same starting requirement, will catch the problem almost every time.
The missing thing in these tools is that automatic feedback loop between the two LLMs: one in review mode, one in implementation mode.
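The loop described above can be sketched as follows. The `llm()` function is a stand-in for a real model API call (stubbed here with canned responses so the control flow is runnable); the essential idea is that the reviewer gets a fresh prompt judging the output against the original requirement, and its feedback is fed back into the next implementation attempt:

```python
def llm(role: str, prompt: str) -> str:
    """Stand-in for a real model call; returns canned responses for the demo."""
    if role == "reviewer":
        # Pretend the first attempt has a defect and later ones are fine.
        return "FAIL: missing input validation" if "attempt 1" in prompt else "PASS"
    return f"implementation ({prompt})"


def implement_with_review(requirement: str, max_rounds: int = 3) -> str:
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        # Implementation pass: original requirement plus reviewer feedback.
        code = llm("implementer", f"{requirement} attempt {round_no}. {feedback}")
        # Review pass: judge the output against the ORIGINAL requirement,
        # not against the implementer's interpretation of it.
        verdict = llm("reviewer", f"{requirement} attempt {round_no}: {code}")
        if verdict.startswith("PASS"):
            return code
        feedback = f"Previous attempt rejected. {verdict}"
    raise RuntimeError("no implementation passed review")


result = implement_with_review("parse ISO dates")
print(result)
```

With real model calls in place of the stub, the reviewer's fresh context is what catches the problems the implementer glossed over.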
Anecdotally, I think this is already in Claude Code. It's pretty frequent to see it implement something, then declare it "forgot" a requirement and go back and alter or add to the implementation.
AFAICT this is already baked into the GitHub Copilot agent. I read its sessions pretty often and reviewing/testing after writing code is a standard part of its workflow almost every time. It's kind of wild seeing how diligent it is even with the most trivial of changes.
It _does_ use up tokens incredibly fast, which is probably why Anthropic is developing this feature. This is mostly for corporations using the API, not individuals on a plan.
I'd love to see a breakdown of the token consumption of inaccurate/errored/unused task branches for claude code and codex. It seems like a great revenue source for the model providers.
Yeah, that's what I was thinking. They do have an incentive not to get everything right on the first try, as long as they don't overdo it... I also feel like they try to drive more token usage by asking unnecessary follow-up questions that the user may say yes to, etc.
At work tho we use Claude Code thru a proxy that uses the model hosted on AWS bedrock. It’s slower than consumer direct-to-Anthropic and you have to wait a bit for the latest models (Opus 4.5 took a while to get), but if our stats are to be believed it’s much much cheaper.
I don't know; all I can say is that with API-based billing, a multi-thousand-line refactor that would take days to do by hand costs something like $4. In terms of value for effort, it's incredible.
In my example, the Figma MCP takes ~300k tokens per medium-sized section of the page, and it would be cool to let it read a design whole and implement it directly. Currently I have to split it up, which is annoying.
I mean the systems I work on have enough weird custom APIs and internal interfaces just getting them working seems to take a good chunk of the context. I've spent a long time trying to minimize every input document where I can, compact and terse references, and still keep hitting similar issues.
At this point I just think the "success" of many AI coding agents is extremely sector dependent.
Going forward I'd love to experiment with seeing if that's actually the problem, or just an easy explanation of failure. I'd like to play with more controls on context management than "slightly better models" - like being able to select/minimize/compact sections of context I feel would be relevant for the immediate task, to what "depth" of needed details, and those that aren't likely to be relevant so can be removed from consideration. Perhaps each chunk can be cached to save processing power. Who knows.
Or you just fully embrace the thin-client life and offload everything to the server. PXE boot with remotely mounted filesystems. Local hard drives? Who needs those?
And the server is handled how? We're always there: complexity can be managed or hidden.
Why do you think some people asked Sun to un-free ZFS back in the day? Because unlike most, they understood its potential. Why do you think PC components today, graphics cards first, then RAM, and NVMe drives after that, cost so much? Because those who understand realize that today, a GNU/Linux home server and desktop are ready for the masses, and it's only a matter of time before an umbrel.com, start9.com, or even frigghome.ai succeeds and sweeps away increasingly ban-happy, and therefore unreliable and expensive, cloud providers. Most still haven't grasped this, but those who live above the masses have.
Why are snaps, flatpaks, Docker, etc. pushed so hard even though they have insane attack surfaces, give you minimal control over your own infrastructure, and are a huge waste of resources? Because they allow selling support to people who don't know. With NixOS or Guix, you only sell a text config. It's not the same business model, and after a while, with an LLM, people learn to do it themselves.