
There's also less special sauce in the text models themselves these days; the proprietary value sits more in the pre-training data and the training stack (e.g. how to get 10k GPUs/TPUs running together smoothly). Multi-modal models (or adjacent ones like Sora) are less likely to be open sourced in the immediate term.
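To make the "running together smoothly" part concrete: the core of a data-parallel training step is a cross-device all-reduce of gradients, and making that collective fast and reliable at 10k-chip scale is the hard part. A toy JAX sketch, purely illustrative - the loss, learning rate, and shapes are made up, not how any particular lab does it:

    import functools
    import jax
    import jax.numpy as jnp

    def loss_fn(w, x, y):
        # Made-up linear-regression loss, just to have a gradient to average.
        return jnp.mean((x @ w - y) ** 2)

    @functools.partial(jax.pmap, axis_name="devices")
    def train_step(w, x, y):
        grads = jax.grad(loss_fn)(w, x, y)
        # The all-reduce: every device averages gradients with every other.
        # At 10k chips, this collective is what has to run smoothly.
        grads = jax.lax.pmean(grads, axis_name="devices")
        return w - 0.01 * grads

    n = jax.local_device_count()
    # Replicate weights and shard a toy batch across all local devices.
    ws = jax.device_put_replicated(jnp.zeros((8, 1)), jax.local_devices())
    xs, ys = jnp.ones((n, 32, 8)), jnp.ones((n, 32, 1))
    ws = train_step(ws, xs, ys)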



There is a lot of work to make the actual infrastructure and lower-level management of lots and lots of GPUs/TPUs open as well - my team focuses on making that infrastructure at least somewhat more approachable on GKE and Kubernetes (a small sketch of what requesting a GPU there looks like follows the links below).

https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main

and

https://github.com/google/xpk (a bit more focused on HPC, but includes AI)

and

https://github.com/stas00/ml-engineering (not associated with GKE, but describes training with SLURM)
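For a flavor of the Kubernetes side: scheduling onto GPUs on GKE boils down to requesting the nvidia.com/gpu resource and, optionally, pinning an accelerator type via a node selector. A minimal sketch using the official Python client - the node pool, accelerator type, and image are my assumptions, not anything those repos prescribe:

    # Assumes a GKE cluster with a T4 node pool already exists and your
    # kubeconfig points at it; `pip install kubernetes` for the client.
    from kubernetes import client, config

    config.load_kube_config()

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="gpu-smoke-test"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            # GKE's well-known label for picking an accelerator type.
            node_selector={"cloud.google.com/gke-accelerator": "nvidia-tesla-t4"},
            containers=[
                client.V1Container(
                    name="cuda",
                    image="nvidia/cuda:12.2.0-base-ubuntu22.04",
                    command=["nvidia-smi"],  # just prove the GPU is visible
                    resources=client.V1ResourceRequirements(
                        limits={"nvidia.com/gpu": "1"}
                    ),
                )
            ],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)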

The actual training still draws on a small pool of very experienced people, but it's getting better. And serving models gets faster every day - you can often simply draft on Triton with TensorRT-LLM, or on vLLM, and pick up significant wins month to month just by upgrading.
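Getting started with vLLM, for instance, is a handful of lines (the model name and sampling settings here are arbitrary examples):

    # pip install vllm
    from vllm import LLM, SamplingParams

    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # any HF checkpoint
    params = SamplingParams(temperature=0.7, max_tokens=128)
    outputs = llm.generate(["Explain paged attention in one paragraph."], params)
    print(outputs[0].outputs[0].text)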



