
I feel like it makes much more sense to just run it on the CPU instead. CPUs typically have access to far more memory (system RAM) than a GPU has VRAM, so you could fit the entire model at its original size.

Instead of messing around with inefficient nonsense like this, figure out a way to prune and modify the models so that they run efficiently on a CPU.
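To make the memory argument concrete, here is a rough back-of-the-envelope sketch. The 7-billion-parameter count is a hypothetical example, not a figure from this thread, and the estimate covers the weights only (no activations, optimizer state, or framework overhead).

```python
def weights_size_gib(n_params: int, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

# Hypothetical model size, chosen only for illustration.
n_params = 7_000_000_000

# Bytes per parameter for a few common numeric formats.
for name, width in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    print(f"{name}: {weights_size_gib(n_params, width):.1f} GiB")
```

Comparing these numbers against the RAM and VRAM you actually have shows whether the full-precision weights fit in either one.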




Right now most CPUs are orders of magnitude slower than GPUs at forward/backward passes, so you're unlikely to get similar speed. Some kind of pruning may help, though.
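For anyone unsure what "pruning" could look like, below is a minimal sketch of one common variant, unstructured magnitude pruning, in plain Python. The function name `magnitude_prune` and the toy weights are made up for illustration; nobody in this thread proposed this specific method.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest |w|."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # number of weights to drop
    # Indices of the k smallest-magnitude weights.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

# Toy weights, invented for this example.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, 0.5)
print(pruned)
```

Note that zeroing weights like this does not make a dense matrix multiply any faster by itself; a speedup generally needs sparse kernels or structured pruning that removes whole rows, channels, or heads.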



