There are a number of scaled AMD deployments, including Lamini (https://www.lamini.ai/blog/lamini-amd-paving-the-road-to-gpu...), which runs specifically on AMD for LLMs. There are also a number of HPC configurations, including the world's largest publicly disclosed supercomputer (Frontier) and Europe's largest supercomputer (LUMI), both running on MI250x. Multiple teams have trained models on those HPC setups as well.
Do you have any more evidence as to why these categorically don't work?