Depends on who's making the call on who gets cut. A key part of decimation was that the doomed soldiers were beaten to death by their comrades, to give the remaining nine a bloody, lasting impression of their dishonor. If Meta makes everybody sit in a group with their ten closest coworkers and debate until they decide who gets cut, it's a lot closer to decimation than if management suddenly shuts off 10% of employee computers.
Yesterday I actually read the docs on Open WebUI. It’s pretty impressive how many features it supports. I will give thunderbolt a look, but I’m seriously considering doubling down on Open WebUI. The only thing I don’t like about their features is that tools and functions can’t be offloaded to external compute; they run on the same machine.
You got it right, I think. I’m sitting with two “AI Ready” Radeon AI Pro 9700 workstation cards, which are RDNA4, not CDNA. My experience is that my cards are not a priority. Individual engineers at AMD may care; the company doesn’t. I have been trying since February to get ahold of anyone responsible for shipping tuned Tensile gfx1201 kernels in rocm-libs, which is used by Ollama. It’s been three weeks since I raised enough hell on the Discord to get a response, but they still can’t find “who” is responsible for Tensile tuning, and “if” they are even going to do it for the gfx12* cards.
Yeah, I own an AMD Instinct MI50 and I need to patch all of my applications to work, like PyTorch, bitsandbytes, Blender, etc., while Nvidia cards from the same generation are still mostly supported. But the better value and hardware are worth it.
Yup, meanwhile Jensen is on the Lex Fridman podcast stating the reason CUDA is successful is that all their devices run it. The on-ramp is at the individual user.
I have an RDNA4 card, and they certainly are prioritizing CDNA over a CDNA + RDNA strategy or a unification strategy.
I have been trying since February to get someone at AMD to ship tuned Tensile kernels in rocm-libs for the gfx1201. They are used by Ollama, but no one on the developer Discord knows who is responsible for that. It has been pretty frustrating, and it shows that AMD has an organizational problem to overcome in addition to all the technical things they want ROCm to do.
I love local-first. I am finding that a 120B MoE is hitting the sweet spot for local hosting. Right now that takes a $2K Strix Halo, a $4K GB10 machine, or a $5K Mac Pro. Two years from now I think hardware will take us back to the $2K-ish range with good performance.
I love my dual-GPU setup (2× AMD Radeon R9700, 64GB VRAM), but it uses 5× the electricity of my GX10 (GB10 chip inside), and since layers are landing in system memory, my TPS is half that of the GX10.
Now, a dense model like Devstral2 24B slaps on the dual-GPU setup. I just haven’t gotten as much out of that as I have the 120B MoEs.
Alternative headline: household spyware cash machine forced to pay $20 for being bad.
If you want to punish Meta, then you have to punish the wonder boy who runs it. Not even shareholders can fight off the guy spending $80B on the metaverse.
Sadly, I don't think it's enough for Meta to change, because they have no business model if they are forced to be serious about online safety. That's probably also why they are pushing so hard for age verification: make safety a problem for someone else.