Hacker News | threecheese's comments

Edit: this is a ridiculous question, I know. Trying to eat my own dogfood, so to speak.

Does Tailscale maintain a Q&A agent, MCP server, or llms.txt that anyone is aware of?

I’m trying to use Tailscale across my personal networks - without investing a lot of time - and so I’m throwing agents at it. It’s not going well, primarily because their tools/interfaces have changed a lot, so tool calls fail (e.g. ‘tailscale serve --xyz’ is now ‘tailscale funnel ABC’ and needs manual approval, and that’s not in the training set).


For one, qmd uses SQLite (FTS5 and sqlite-vec, at least at some point) and then builds reranked hybrid search on top of that. It uses some cool techniques like resilient chunking and embedding, all packaged up into a TypeScript CLI. I’d say it sits at a layer above Wax.

Tell us more. I had Codex port this to Python so I could wrap my head around it; it’s quite interesting. Why would I use this WAL-checkpointing thingamajig when I have access to sqlite-vec, Qdrant, and other embedded friends?

WAL/checkpointing is about control over durability and crash behavior, not “better vectors.”

sqlite-vec and Qdrant are storage engines first; their durability is mostly “under the hood.” If your goal is a clean local RAG system, owning that layer can be better when you want:

1. deterministic ingest semantics (an append-only event log of chunks, then materialize state),
2. fast recovery from partial writes (replay only the WAL since the last checkpoint),
3. precise checkpoint boundaries tuned to your app (e.g., after every batch/conversation/session ingest),
4. a single-file, dependency-light artifact you can own end-to-end.

That’s why it can be better than sqlite-vec/Qdrant in this specific case: not for raw ANN quality, but for operational predictability plus composability of ingestion, retrieval, and memory lifecycle in one library. If you don’t care about that control and are fine with a managed server/extension model, the built-ins are usually the simpler and smarter choice.

We aren’t users here though; we’re visitors.

Great point, but why call it a directory then rather than someone’s personal recommendations?

There's no TLD for the latter.

I’m sure there’s plenty you could find that would work. Or bobslist.com, don’t let your dreams be dreams.

Claude could access anything on your device, including system or third-party commands for network or signal processing; it may even have their manuals/sites/man pages in its training set. It’s remarkably good at figuring things out, and you can watch the reasoning output. There are MCP tools for reverse engineering that can give it even higher-level abilities (Ghidra is a popular one).

Yesterday I watched it try to work around some filesystem permission restrictions; it tried a lot of things I would never have thought of, and it was eventually successful. I was kinda goading it, though.


We are missing some building blocks, IMO. We need a good abstraction for defining the invariants in a project’s structure and communicating them to an agent. Even if we had this, if a project doesn’t already apply those patterns consistently, the agent can get confused or misapply something (or maybe it’s mad about “do as I say, not as I do”).

I expend a lot of effort preparing instructions to steer agents this way, and it’s honestly annoying. Think DeepWiki-style enumeration of how things work, like C4 diagrams for agents.


For the first, I think maintaining package-add instructions is table stakes; we need to be opinionated here. Agents are typically good at following them, and if not, you can fall back to a Makefile that does everything.

For the second, I totally agree. I keep hoping agents will get better at refactoring, and I think using LSPs effectively would make that happen. Claude took dozens of minutes to perform a rename that JetBrains would have executed perfectly in about five seconds. Its approach was to make a change, run the tests, and do it again. Nuts.
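For context, a rename like that maps to a single JSON-RPC request in the LSP protocol (textDocument/rename), and the server hands back one WorkspaceEdit covering every reference. A sketch in Python; the URI, position, and new name are made up for illustration:

```python
import json

# JSON-RPC payload for a rename, per the LSP spec's textDocument/rename method.
# The file URI, position, and symbol name below are hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/rename",
    "params": {
        "textDocument": {"uri": "file:///repo/internal/store/store.go"},
        "position": {"line": 41, "character": 10},  # zero-based, on the symbol
        "newName": "ChunkStore",
    },
}

# The server responds with a WorkspaceEdit listing edits across all files,
# which the client applies atomically: no make-a-change/run-tests loop needed.
payload = json.dumps(request)
message = f"Content-Length: {len(payload)}\r\n\r\n{payload}"  # LSP wire framing
```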


Does the agent have a way to interact with the LSP?


I don’t know about other LSPs, but gopls has an -mcp flag that makes it run an MCP server. There’s also a JetBrains plugin for Claude that gives Claude the ability to use a subset of your JetBrains IDE’s features.

I usually have both of those configured when using Claude on Go repos, and I still have the same frustrations as the comments above. Gopls has symbol search, but Claude almost always uses grep to find usages instead.


Didn’t know about gopls’s built-in MCP server. That’s neat!

Does preventing the agent from using a shell help at all with the grep issue?


They are idiots, but they’re getting better. Example: I wrote an agent skill to do some read-only stuff on a container filesystem. Stupid, I know; it’s like a maintainer script that can make recommendations, whatever.

Another skill, called skill-improver, tries to reduce a skill’s token usage by finding deterministic patterns in it that can be scripted, then writes and packages the script.

Putting them together, the container-maintenance thingy improves itself every iteration, validated with automatic testing. It works perfectly about 3/4 of the time, another half of the time it kinda works, and fails spectacularly the rest.

It’s only going to get better, and this fit within my Max plan usage while coding other stuff.


LLMs are idiots, and they will never get better, because they have quadratic attention and a limited context window.

If the tokens that need to attend to each other are on opposite ends of the code base, the only way to do that is by reading in the whole code base and hoping for the best.

If you're very lucky you can chunk the code base in such a way that the chunks pairwise fit in your context window and you can extract the relevant tokens hierarchically.

If you're not? Well, get reading, monkey.

Agents, md files, etc. are band-aids that hide this fact. They work great until they don't.
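The hierarchical extraction the lucky case describes can be sketched like this, with a trivial keyword filter standing in for the LLM call (purely an assumption for illustration):

```python
def hierarchical_extract(chunks, relevant):
    """Reduce a code base chunk by chunk: filter each chunk down to its
    relevant tokens, then repeatedly merge surviving tokens pairwise so
    each merge step still fits a fixed context window.

    `relevant` is a stand-in for an LLM call that keeps only what matters.
    """
    level = [relevant(c) for c in chunks]
    while len(level) > 1:
        merged = []
        for i in range(0, len(level), 2):
            pair = " ".join(level[i:i + 2])
            merged.append(relevant(pair))  # each pair must fit the window
        level = merged
    return level[0] if level else ""
```

The whole scheme rests on the filter never throwing away a token it will need later, which is exactly the “hoping for the best” part.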


So much of this resonated with me, and I realize I’ve arrived at a few of the techniques myself (and with my team) over the last several months.

THIS FRIGHTENS ME. Many of us swengs are either going to be FIRE millionaires or living under a bridge, in two years.

I’ve spent this week performing SemPort: I found a TS app that does a needed thing, and was able to use a long chain of prompts to get it completely reimplemented in our stack, using Gene Transfer to ensure it uses some existing libraries and concrete techniques present in our existing apps.

Now not only do I have an idiomatic Python port, which I can drop right into our stack, but I have an extremely detailed features/requirements statement for the original TypeScript app along with the prompts for generating it. I can use this to continuously track that other product as it improves. I also have the “instructions infrastructure” to direct an agent to align new code to our stack. Two reusable skills, a new product, and it took a week.


Sorry if rude, but I truly feel like I am missing the joke. This is just LinkedIn copypasta or something, right?


My post? Shiiiii, if that’s how it comes across I may delete it. I haven’t logged into LI since our last corp reorg; it was a cesspool even then. Self-promotion just ain’t my bag.

I was just trying to share the same patterns from OP’s documentation that I found valuable in the context of agentic development; seeing them take this so far is what scares me, because they are right that I could wire an agent to do this autonomously and probably get the same outcomes, scaled.


Please let’s not call ourselves “swengs”

Is it really that hard to write “developer” or “engineer”?


Amusingly, I use that term to avoid the “not an engineer” and “I don’t make websites” comments. But noted, Tu.


Vercel also released a similar tool with a unique interface/DSL - https://github.com/vercel-labs/agent-browser

> agent-browser click "#submit"
> agent-browser fill "#email" "test@example.com"
> agent-browser find role button click --name "Submit"

I appreciate that there’s innovation in the space; we will get closer to the interface that’s most appropriate for models to tool-call. I’m going to check your link out, sounds interesting.


it's interesting to see how things will play out, but I really believe that doing Claude Code (maybe with Opus 4.6) + click tool + move_mouse tool + snapshot page tool + another 114 tools is definitely not the best approach

the main issue with this interface is that the commands are too low-level and that there is no way of controlling the context over time

once a snapshot is added to the context, those tokens will take up very precious context window space, leading to context rot, higher cost, and higher latency

that's why agents need to use very large models for these kinds of systems to work and, unfortunately, even then they're very slow, expensive, and less reliable than a purpose-made system
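one hedged sketch of the missing context control: prune older snapshots out of the message history so only the latest one stays in the window (the message shape below is invented for illustration, not any real agent API):

```python
def prune_snapshots(messages, keep_last=1):
    """Replace all but the most recent page-snapshot messages with a stub,
    so the context stops growing by a full snapshot on every action.

    Assumes each message is a dict with a 'kind' and 'content' field;
    that shape is hypothetical, not a real agent framework's schema.
    """
    snap_idx = [i for i, m in enumerate(messages) if m.get("kind") == "snapshot"]
    for i in snap_idx[:-keep_last] if keep_last else snap_idx:
        messages[i] = {"kind": "snapshot", "content": "[snapshot elided]"}
    return messages
```

the point isn't this particular policy; it's that the interface should let you express *some* policy instead of accumulating every snapshot forever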

I wonder if a standardized interface will organically emerge over time. At the moment SKILL.md + CLI seems to be the most broadly adopted interface - even more than MCP, maybe

