My understanding is that they trained it to explicitly use a self-prune/self-edit tool that trims or summarizes portions of its message history (e.g. tool results from file explorations, messages that are no longer relevant, etc.) during the session, rather than "panic-compact" at the end. In any case, it would be good if it does something like this.
Photo-realism is great, but the real step-jump in image-gen I’m looking for is the ability to draw high-quality technical diagrams with a mix of text and images, so I can stop having LLMs generate crappy diagrams with Mermaid, SVG, HTML/CSS, or draw.io.
These aren’t really indicative of real world performance. Retrieving a single fact is pretty much the simplest possible task for a long context model. Real world use cases require considering many facts at the same time while ignoring others, all the while avoiding the overall performance degradation that current models seem susceptible to when the context is sufficiently full.
I built a similar tool called “lmsh” (LM shell) that uses Claude Code’s non-interactive mode (hence no API keys needed, since it uses your CC subscription): it presents the shell command on a REPL-like line that you can edit first and hit enter to run it. I used Rust to make it a bit snappier:
It’s pretty basic and could be improved a lot, e.g. make it use Haiku, or codex-CLI with low thinking, etc. Another improvement would be to have it bypass reading CLAUDE.md or AGENTS.md. (PRs anyone? ;)
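Not the actual lmsh implementation (that’s in Rust), but the flow can be sketched in Python under a couple of assumptions: `claude -p` is Claude Code’s non-interactive print mode, and `extract_command` / `edit_and_run` are hypothetical helper names. Python’s readline startup hook is what prefills the editable line:

```python
import re
import subprocess

def extract_command(output: str) -> str:
    """Strip an optional Markdown code fence from the model's reply,
    leaving just the bare shell command."""
    m = re.search(r"```(?:\w+)?\n(.*?)```", output, re.DOTALL)
    text = m.group(1) if m else output
    return text.strip()

def suggest(task: str) -> str:
    """Ask Claude Code (non-interactive '-p' mode, billed to the
    subscription rather than an API key) for a single shell command."""
    result = subprocess.run(
        ["claude", "-p", f"Reply with one shell command only: {task}"],
        capture_output=True, text=True,
    )
    return extract_command(result.stdout)

def edit_and_run(cmd: str) -> None:
    """Present the command on an editable REPL-like line; Enter runs it."""
    import readline
    # Prefill the input buffer so the user edits rather than retypes.
    readline.set_startup_hook(lambda: readline.insert_text(cmd))
    try:
        final = input("$ ")
    finally:
        readline.set_startup_hook(None)
    subprocess.run(final, shell=True)
```

The readline prefill is the key UX trick: the suggestion lands in the line buffer already typed, so blind-faith execution is replaced by a quick review-and-edit step.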
>it presents the shell command on a REPL-like line that you can edit first and hit enter to run it.
Oh genius, that's the best UX idea for the situation of asking an LLM to flesh out the CLI command without relying entirely on blind faith.
Even better if we can have that kind of behavior in the shell itself. For example if we started typing "cat list | grep foo | " and then suddenly realized we want help with the awk command so that it drops the first column.
This is a pretty neat approach, indeed. Having to use the API might be an inconvenience for some people, though. I guess having the Claude or ChatGPT subscription and using it with the CLI tools is what makes developers stick with these tools, instead of using what else is out there.
Right, when we’re already paying $100 or $200 per month, leveraging that “almost-all-you-can-eat buffet” is always going to be more attractive than spending more on per-token API billing.
1. Roadside Picnic, by the Strugatsky brothers, the loose basis of Tarkovsky's film Stalker.
2. XX by Rian Hughes -- a hugely underrated book. It starts with a signal from outer space and goes quite far, and also has a book-within-a-book. Nearly 1000 pages, but I found it very engaging.
I'm currently trying to read Stanislaw Lem's His Master's Voice, which has a similar theme of a possible signal from an alien intelligence.
I like each at different times in different ways. Now I have both running in separate tmux panes and have one talk to the other to ask/delegate/verify/validate, using my Tmux-cli tool (now a Claude skill, of course):
Now my work on a project often spans multiple sessions of these agents, so I use a session-finder and resume/dump tool (also in that repo). I often ask Claude or Codex to extract all useful details from a .jsonl session log file so I can continue the work.
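The session logs are JSONL, one event per line. The exact schema varies by tool, so the `role`/`content` field names below are assumptions for illustration; the pattern of parsing line-by-line and skipping malformed lines carries over regardless:

```python
import json

def extract_texts(path: str, role: str = "assistant") -> list[str]:
    """Pull message texts for one role out of a JSONL session log.
    The 'role'/'content' keys are an assumed schema -- adjust to
    whatever your CLI actually writes."""
    texts = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed or truncated lines
            if event.get("role") == role and isinstance(event.get("content"), str):
                texts.append(event["content"])
    return texts
```

In practice, handing the raw .jsonl to the agent and asking it to summarize works too; a script like this is just cheaper when the log is huge.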
I assume by “it” you mean Claude code or codex-cli — that depends on how you launched them or how you modified the permissions within the CLI chat; that’s orthogonal to my CLI tools.