The API is never the bottleneck; the bottleneck is how fast the CLI can provide context. So just by using ripgrep instead of grep it will already be faster, and on top of that it can run code searches concurrently rather than one after another.
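Roughly, the difference looks like this. A toy sketch in Python, not any particular agent's actual harness; the search patterns are made up, and the point is only that fanning searches out concurrently makes total latency closer to the slowest single search instead of the sum of all of them:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def search(pattern: str, path: str = ".") -> str:
    # rg exits with code 1 when nothing matches; treat that as "no results", not an error.
    result = subprocess.run(
        ["rg", "--line-number", pattern, path],
        capture_output=True, text=True,
    )
    return result.stdout

patterns = ["TODO", "def handle_request", "class Session"]  # hypothetical queries

# Sequential: total wait is the sum of every search.
sequential_results = [search(p) for p in patterns]

# Concurrent: total wait is roughly the slowest single search.
with ThreadPoolExecutor(max_workers=len(patterns)) as pool:
    concurrent_results = list(pool.map(search, patterns))
```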
> I've consistently found Gemini to be better than ChatGPT [ because ] Google has crawled the internet so they have more data to work with.
This commonly expressed non-sequitur needs to die.
First of all, all of the big AI labs have crawled the internet. That's not a special advantage to Google.
Second, that's not even how modern LLMs are trained; training on whatever you could crawl more or less stopped with GPT-4. Now a lot more attention is paid to the quality of the training data. Intuitively, this makes sense: if you train the model on a lot of garbage examples, it will generate output of similar quality.
So, no, Google's crawling prowess has little to do with how good Gemini can be.
> Now a lot more attention is paid to the quality of the training data.
I wonder if Google's got some tricks up their sleeves after their decades of having to tease signal from the cacophony of noise that the internet has become.
Google's search is finely tuned to push you toward clicking the links of whoever pays them the most. The search results are of excellent quality for their customers. Your mistake is thinking you are the customer.
My biggest issue with iOS 26 is not the UI (it’s subpar compared to prior work), but the fact that it drains my battery 2x faster than before, and I’m on an iPhone 16 Pro. That’s unacceptable performance degradation on a one-year-old phone.
I mean, they need arguments to make their user base buy a new iPhone. Does a thinner, smaller battery make you buy it? No. Does Apple Intelligence make you buy it? Maybe, but it won't be released in the next 100 years. Does a very slightly better camera make you buy it? No. What if we make your current phone slower and use up more of its storage with every update, would that make you upgrade?
There used to be NO way I'd make the jump to Android, but the iOS 26 lewk, combined with a recent frustration of trying to add some podcasts to my iPhone from my hard drive (the show isn't online anymore), has me reconsidering.
It was such a headache to find an app that could half-decently play a local podcast mp3 and remember the episode & playback position when closed. And the apps I found that did that well were all loaded with the kind of tracking/data-logging that I am on iOS to avoid.
Makes me miss my Android devices that let me use them almost like a flash drive, without weird restrictions or USB 2 speeds.
While this is cool, can anything be done about the speed of inference?
At least for my use, 200K context is fine, but I’d like to see a lot faster task completion. I feel like more people would be OK with the smaller context if the agent acts quickly (vs waiting 2-3 mins per prompt).
There’s work being done in this field - I saw a demo using the same method Stable Diffusion uses, but for text. It was extremely fast (3 pages of text in like a second). It’ll come.
Sounds nice in theory, but in practice I want to iterate on one, perhaps two, tasks at a time, and keep a good understanding of what the agent is doing, so that I can prevent it from going off the rails, making bad decisions and then building on them even further.
Worktrees and parallel agents do nothing to help me with that. It's just additional cognitive load.
You don't say that - you instruct the LLM to read files about X, Y, and Z. Putting the context in helps the agent plan better (next step) and write correct code (final step).
If you're asking the agent to do chunks of work, this will get better results than asking it to blindly go forth and do work. Anthropic's best practices guide says as much.
If you're asking the agent to create one method that accomplishes X, this isn't useful.
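To make the "read files about X, Y, and Z" point concrete, here is a toy sketch; the file names and the task are hypothetical, and how you hand the prompt to the agent (paste it in, or use your tool's headless mode) doesn't really matter:

```python
# Illustrative only: the file names and the task are made up.
# The point is to name the context explicitly instead of telling the
# agent to "go figure out the billing code and fix it".
context_files = [
    "docs/billing-overview.md",   # how the domain works
    "src/billing/invoice.py",     # the code to change
    "tests/test_invoice.py",      # the behavior to preserve
]

prompt = (
    "Read the following files before doing anything else:\n"
    + "\n".join(f"- {path}" for path in context_files)
    + "\n\nThen propose a short plan for adding pro-rated refunds to "
    "invoices. Don't edit any code until I approve the plan."
)

print(prompt)  # paste into the agent, or pass it via your tool's headless mode
```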
You don't have to think about it, you can just go try it. It doesn't work as well (yet) for me. I'm still way better than Claude at finding an initial heading.
Many of these still exist today, though not in as large numbers. Maybe it's the same for software engineers. I still don't recall hearing of bank tellers accelerating the pace of ATMs, or travel agents encouraging their clients to use Expedia.
Sadly, I don't think this astroturfing is limited to announcement threads. It seems it is becoming increasingly hard to source real human opinions online, even on specialized forums like this or Reddit communities.
I hope that I am wrong, but, if I am not, then these companies are doing real and substantial damage to the internet. The loss of trust will be very hard to undo.
Yes, this is a paid comment, in the sense that it's probably a bot: a 22-day-old account with 1 post, praising Claude.
For more than a year, Anthropic has engaged in an extensive guerrilla marketing effort on Reddit and similar developer-oriented platforms, aiming to persuade users that Claude significantly outperforms competitors in programming tasks, even though nearly all benchmarks indicate otherwise.
> Please don't post insinuations about astroturfing, shilling, brigading, foreign agents, and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data. [0]
I believe that's talking about general insinuations, e.g. "you're just a russiabot" and the like.
The GP account above, with only one comment singing the praises of a particular product, is obviously fake. They even let the account age a bit so it wouldn't show up as a green account.
The most alarming thing to me is that it seems to be happening at scale. This is one of dozens of similar posts I've seen all over the programming communities with the same characteristics (high praise, new-ish accounts, little if any other activity).
Well, I’m not a paid comment, and I agree 100% with the op, and have the exact same experience. I haven’t touched Cursor since paying for Claude Code (max or whatever the $100/mo plan is). That said, I never found Cursor very useful. Claude Code was useful out of the gate, so my experience may not be typical.
I recently wrote a 5+ page internal guide on how I do vibe coding, and one of the first sections is cost.
Keep in mind much of the guide is about how to move from 30s chats to concurrent 20min+ runs.
----
Spending
Claude Code $$$ - Max Plan FTW
TL;DR: Start with Claude Max at $100/mo.
I was spending about $70/day by day 2 on the pay-as-you-go plan (I bought credits in $25 increments to help pace myself). At that rate you blow past $100 in well under two days, so the Max plan ($100/mo) became attractive around day 2-3, and in week 2 I shifted to $200/mo.
Annoyingly, you have to make the plan decision during your first login to Claude Code, which is confusing since I wanted to trial on pay-as-you-go. (That was a mistake: go straight to Max.) The upgrade flow is pretty broken from this perspective.
The Max plan at the $100/mo level has a usage cap of roughly 22 questions per 5-hour window. That goes by fast when your questions are small and get interrupted, or once you get good at multitasking. By the time you are serious, the $200/mo tier is worth it.
Other vibe IDEs & LLM providers $$$
I did anywhere from about 50K to 200K tokens a day on Claude 3.7 Sonnet during week 1 on pay-as-you-go, at roughly a 300:1 ratio of tokens in to tokens out. The Max plan does not report usage, but for the periods I am using it, I expect my token counts are now higher, as I have gotten much better at doing long runs.
The OpenAI equivalent, using gpt-4o and o3, would be $5-40/day on pay-as-you-go, which seems cheaper for using frontier models… until the Max plan gets factored in.
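If you want to sanity-check your own spend, the arithmetic is just tokens times price. A rough sketch, where the per-million-token prices and the daily volumes are placeholder assumptions rather than anyone's current list prices:

```python
# Back-of-the-envelope API cost. Prices and volumes below are assumptions
# for illustration -- plug in your provider's current rates and your own usage.
PRICE_PER_MTOK_IN = 3.00    # assumed $ per million input tokens
PRICE_PER_MTOK_OUT = 15.00  # assumed $ per million output tokens

def daily_cost(tokens_in: int, tokens_out: int) -> float:
    return (tokens_in * PRICE_PER_MTOK_IN + tokens_out * PRICE_PER_MTOK_OUT) / 1e6

# Hypothetical heavy day: agents re-read a lot of context per generated token,
# in the spirit of the ~300:1 in:out ratio mentioned above.
tokens_out = 60_000
tokens_in = 300 * tokens_out
print(f"~${daily_cost(tokens_in, tokens_out):.0f}/day at these assumed rates")
```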
Capping costs
Not worrying about overages is liberating, and the Max plan helps a lot here. One of my next experiments is self-hosting reasoning models for other AI IDEs. Max goes far, but for automation, autonomy, and bigger jobs, you need more power.