> If you personally work on developing LLMs et al, know this: I will never work with you again, and I will remember which side you picked when the bubble bursts.
After using Claude Code for an afternoon, I have to say I don't think this bubble is going to burst any time soon.
I think there is a good chance of the VC/Startup side of the bubble bursting.
However, I think we will never go back to a time without LLMs, given that you can run useful open-weights models locally on a laptop.
Share your settings and system specs please; I haven't seen anything come out of a local LLM that was useful.
If you don't, since you're using a throwaway handle, I'll just assume you're paid to post. It is a little odd that you'd use a throwaway just to post LLM hype.
Happy to post mine (which is also not behind a throwaway handle).
Machine: 2021 MacBook Pro with M1 Max, 32GB
LLMs I usually use: Qwen 2.5 Coder 7B for coding and the latest Mistral or Gemma in the 4B-7B range for most other stuff
For interactive work I still use mostly Cursor with Claude, but for scripted workflows with higher privacy requirements (and where I don't want to be hit with a huge bill due to a rogue script), I also regularly use those models.
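To make "scripted workflows" concrete, here is a minimal sketch of the kind of thing I mean, assuming the model is served through an Ollama-style API on localhost (the model tag, paths, and prompt are illustrative placeholders, not my actual setup):

```python
# Minimal sketch of a privacy-sensitive scripted workflow, assuming an
# Ollama-style server on localhost:11434. Model tag, paths, and prompt are
# illustrative placeholders.
import json
import pathlib
import urllib.request

def ask_local(prompt: str, model: str = "qwen2.5-coder:7b") -> str:
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Batch-summarize internal notes: nothing leaves the machine, and a rogue
# loop can't run up a surprise API bill.
for path in pathlib.Path("notes").glob("*.md"):
    summary = ask_local(f"Summarize in three bullet points:\n\n{path.read_text()}")
    print(f"## {path.name}\n{summary}\n")
```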
If you are interested in running stuff locally, take a look at /r/LocalLLaMA [0], which usually gives good insight into what's currently viable for which use cases. A good chunk of the power users there run dedicated machines for it, but plenty are in the same boat as me, trying to run whatever fits on their existing machine. I'd estimate the coding capabilities lag the SOTA big models by roughly 6-9 months (which is still pretty great).
Not Sam. I am running it with ollama on a server on my LAN with two 7900 XTs. I get about 50-60 tokens per second on phi4-mini at full precision; it only loads on a single card.
The few requests I tried came back correct, though I think phi4, the 14B-parameter model, produced better code. I don't recall exactly what it was; it was rather simple stuff.
QwQ seems to produce okay code as well, but with only 40GB of VRAM I can only use about 8k context at 8-bit quantization.
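For anyone wanting to reproduce that, the context cap is just a per-request option in Ollama's API; a rough sketch of the request body (the exact 8-bit QwQ tag is a guess on my part, check `ollama list`):

```python
# Sketch of an Ollama /api/generate request body pinning the context window.
# The 8-bit QwQ tag is a guess; use whatever `ollama list` reports locally.
payload = {
    "model": "qwq:32b-q8_0",
    "prompt": "Write a function that merges two sorted lists.",
    "stream": False,
    "options": {"num_ctx": 8192},  # ~8k context is about what fits in ~40GB of VRAM here
}
```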
> I haven't seen anything come out of a local LLM that was useful.
By far the most useful use case for me is when I want to do something in a REPL or the shell, only vaguely remember how the library or command works, and just ask it to write the command for me instead of reading through the manual or docs.
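As a rough illustration of that workflow, a throwaway helper along these lines works (this is only a sketch: the model tag is just an example, and it assumes the model has already been pulled with ollama):

```python
# Sketch of a "what was that command again?" helper, assuming the model has
# already been pulled with ollama; the model tag is just an example.
import subprocess
import sys

question = " ".join(sys.argv[1:])
prompt = f"Reply with a single shell command and nothing else.\n\nTask: {question}"

# `ollama run <model> <prompt>` prints the model's reply and exits.
result = subprocess.run(
    ["ollama", "run", "qwen2.5-coder:7b", prompt],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())
```

Run it as `python suggest.py "tar up this directory excluding .git"` and eyeball the suggestion before pasting it into the shell.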
That’s funny, because after using Cursor with Claude for a month at work at the CTO's request, I have found myself reverting to neovim and am more productive. I see a sliver of value, but not for complex coding requirements.
I had the same experience. Initially it was fun and exciting, but in the end the consistent small bugs and destructive agentic behaviour (deleting random code) made it less productive than simply writing the code yourself. And if you write it yourself, you already understand it, and it's of higher quality (unless you are a beginner).