Hacker News | generallyjosh's comments

Did it make any mistakes on your taxes?

Personally, I know coding pretty well. So when I'm using it for coding, I can spot most of its mistakes / misunderstandings

I would not trust using it on a complex domain I'm not super familiar with, like doing taxes

A mistake here is pretty high cost (getting audited, and/or having to pay a bunch in penalties)


Larger models need more hardware resources to run

And, depending on effort settings, they do more 'thinking', i.e., use more rounds of inference to generate longer internal chains of thought

Both very good reasons to prefer a smaller model, if the small model is good enough for the task


The problem isn't giving MORE context to an agent, it's giving the right context

These things are built for pattern matching, and if you keep their context focused on one pattern, they'll perform much better

You want to avoid dumping in a bunch of data (like a year's worth of git logs) and telling it to sort out what's relevant itself

Better to have pre-processing steps, that find (and maybe summarize) what's relevant, then only bring that into context

You can do that by running your git history through a cheap model, and asking it to extract the relevant bits for the current change. But, that can be overkill and error prone, compared to just maintaining markdown files as you make changes
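As a rough sketch of what that kind of pre-processing step looks like (the commit structure and file names here are made up, and in practice you'd get the history from `git log --name-only` rather than a hardcoded list), the idea is to filter history down to just the commits that touched the files for the current change, before anything reaches the model's context:

```python
# Toy commit history standing in for real `git log --name-only` output.
commits = [
    {"sha": "a1b2c3", "msg": "Fix auth token refresh", "files": ["auth/session.py"]},
    {"sha": "d4e5f6", "msg": "Update README badges", "files": ["README.md"]},
    {"sha": "0718ab", "msg": "Handle expired sessions", "files": ["auth/session.py", "auth/errors.py"]},
]

def relevant_commits(commits, files_being_changed):
    """Keep only commits that touched the files involved in the current change."""
    target = set(files_being_changed)
    return [c for c in commits if target & set(c["files"])]

# Only these filtered entries would go into the agent's context.
context = relevant_commits(commits, ["auth/session.py"])
for c in context:
    print(c["sha"], c["msg"])
```

The filtered list (or a cheap-model summary of it) is what goes into context, not the raw log.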


"You want to avoid dumping in a bunch of data (like a year's worth of git logs) and telling it to sort out what's relevant itself"

So instead you give it a year's worth of changelog.md?

"Better to have pre-processing steps, that find (and maybe summarize) what's relevant, then only bring that into context"

So, not a list of commits that touched the relevant files or are associated with relevant issues? That kind of "preprocessing" doesn't count?

"You can do that by running your git history through a cheap model, and asking it to extract the relevant bits for the current change. But, that can be overkill and error prone, compared to just maintaining markdown files as you make changes"

And somehow extracting the same data out of a [relatively] unstructured and context-free markdown file (the changelog only has dates and descriptions, which will need to be correlated to actual changes with git anyway...) is magically less error-prone?


Hey you can try it if you like. That's one of the beauties of the current moment, nobody REALLY knows what works best, just a whole lot of people trying stuff

And no, I wouldn't ever give it a year of changelog.md. I give it a short description of the current functionality, and a well-trimmed list of 'lessons-learned' (specific pitfalls/traps from previous work, so the AI doesn't have to repeat them)

If you think git logs are a good way to give context, try it and see how it works! My instinct's that it won't work as well as a short readme, but I could be wrong. It's so easy to prototype these days, no reason not to give it a shot


"a short description of the current functionality, and a well-trimmed list of 'lessons-learned'"

Where does that come from?

"And no, I wouldn't ever give it a year of changelog.md."

No, instead you'll "[run] your git history through a cheap model". Except it's "overkill and error prone". So you're writing it up yourself? You didn't do the work, how do you know what the pitfalls and traps are?


I'd assume it probably depends how large and varied your logs are?

But, my guess, I could see an algorithm like that being very fast. It's basically just doing a form of compression, so I'm thinking, ballpark, a similar cost to just zipping the log

Can't be anything CLOSE to the compute cost of running any part of the file through an LLM haha
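For a rough sense of scale, here's a sketch (using a synthetic, repetitive ~1 MB string as a stand-in for a real log file) timing how cheap plain zlib compression is. The log content is invented for illustration:

```python
import time
import zlib

# Hypothetical log: ~1 MB of repetitive lines, standing in for a real log file.
log = ("2024-01-01 INFO request handled in 12ms\n" * 25000).encode()

start = time.perf_counter()
compressed = zlib.compress(log)
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"{len(log)} bytes -> {len(compressed)} bytes in {elapsed_ms:.1f} ms")
```

On typical hardware this finishes in a few milliseconds; running the same megabyte through an LLM means billions of FLOPs per token, which is orders of magnitude more compute.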


I think what most people are worried about is that, as you say, AGI won't necessarily have our biases/biological drives

That might also mean it has no drive for self-determination. It might just be perfectly happy to do whatever humans tell it to, even if it's far smarter than us (and, this is exactly the sort of AI people are trying to make)

So, superintelligence winds up doing whatever a very small group of controlling humans say. And, like you say, humans want to win


You say this as though it's a pithy point.

Might as well say humans are just a better search tool - it's true in the exact same sense you're using.

All humans do is absorb information, then search through our memories and apply that information in relevant contexts to affect the world


> pithy point.

Not really, because I do think all knowledge can be obtained by searching true randomness.


The whole point of an economy is to generate value. Very, very different than caring for people

Feudalism was the dominant economic system for millennia. The point is to extract value for the upper class. Peasants only matter as a source of labor, and they only get 'cared for' to the extent of keeping them alive and working.

Now think about what feudalism might look like if the peasants' labor could be automated


Well, yeah, "keeping alive" sounds like caring to me. Not to a great standard, that's how we got numerous revolutions, and feudalism did end eventually. People stopped believing it, and some kings lost their heads.


There are lots of intelligent people looking at AI and imagining its potential

Are you just saying that you're more intelligent than them? You can see clearly, where all the steam engine technicians can't?


What are they saying that contradicts something I said?


Well, you said:

The potential of the current crop of LLM/AIs will stop at being a very powerful tool to search large volumes of text using free-form questions.

I do think that pretty clearly contradicts what a lot of people who make/use LLM models are saying haha


Openclaw isn't new (and the actual project never made itself out to be new)

It's a nice packaging, of a whole bunch of preexisting things. Agentic AI inside a nice sandbox container, running the model on a cron schedule, and with an ecosystem of ready made skills

Nothing new, but it made the tech easy for people to download and start using immediately. That's why you see so many people treating it as new - it's their first time hearing about such a setup


I do strongly agree on the framing, but I'd argue with the conclusion

Yeah, it really doesn't matter if AGI has happened, is going to happen, will never happen, whatever. No matter what sort of definition we make for it, someone's always going to disagree anyway. For a looong time, we thought the Turing test was the standard, and that only a truly intelligent computer could beat it. It's been blown out of the water for years now, and now we're all arguing about new definitions for AGI

At the end of the day, like you say, it doesn't matter a bit how we define terms. We can label it whatever we want, but the label doesn't change what it can DO

What it can DO is the important part. I think a lot of software devs are coming to terms with the idea that AI will be able to replace vast chunks of our jobs in the very near future.

If you use these things heavily, you can see the trajectory.

6 months ago I'd only trust them for boilerplate code generation and writing/reviewing short in-line documentation.

Today, with the latest models and tools, I'm trusting them with short/low impact tasks (go implement this UI fix, then redeploy the app locally, navigate to it, and verify the fix looks correct).

6 months from now, my best guess is that they'll continue to become more capable of handling longer + more complex tasks on their own.

5 years from now, I'm seeing a real possibility that they'll be handling all the code, end to end.

Doesn't matter if we call that AGI or not. It very much will matter whose jobs get cut, because one person with AI can do the work of 20 developers

