
How possible is it to run these models on a gaming GPU?


If you're patient, https://github.com/FMInference/FlexGen lets you trade off GPU RAM for system RAM or even disk space.
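To see why "patient" is the operative word, here's a back-of-the-envelope sketch of the bandwidth bound when weights live in system RAM and must be streamed to the GPU for each generated token (illustrative numbers, not FlexGen's actual performance):

```python
# Rough lower bound on per-token latency when model weights are offloaded
# to system RAM and streamed over PCIe for every token generated.
def offload_token_seconds(params_billion, bytes_per_param=2, pcie_gb_s=16):
    model_gb = params_billion * bytes_per_param  # fp16 => 2 bytes/param
    return model_gb / pcie_gb_s                  # seconds/token, bandwidth-bound

# e.g. a 30B fp16 model over PCIe 4.0 x16 (~16 GB/s usable):
# 60 GB / 16 GB/s = 3.75 s per token, before any compute at all
print(round(offload_token_seconds(30), 2))
```

Offloading to disk makes the same arithmetic worse by an order of magnitude or more, which is why FlexGen focuses on high-throughput batch workloads rather than interactive chat.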


Yes, if your gaming rig is an NVIDIA DGX-1 workstation.

https://youtu.be/5TRr2oWeSw0


A very rough approximation is 2 GB of VRAM for every billion fp16 model parameters, so the lower-end models may be just about achievable on high-end cards.
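The rule of thumb above is just parameter count times bytes per parameter; a quick sketch (weights only — KV cache and activations need extra headroom on top):

```python
# ~2 GB per billion fp16 parameters (2 bytes each), weights only.
def vram_gb(params_billion, bytes_per_param=2):
    return params_billion * bytes_per_param

for b in (7, 13, 30):
    print(b, vram_gb(b))  # 7B -> 14 GB fits a 24 GB card; 30B -> 60 GB does not
```

Int8 or 4-bit quantization halves or quarters `bytes_per_param`, which is how larger models get squeezed onto single gaming cards.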



