
How possible is it to run these models on a gaming GPU?


If you're patient, https://github.com/FMInference/FlexGen lets you trade off GPU RAM for system RAM or even disk space.
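To see why "patient" is the operative word, here's a back-of-the-envelope sketch of the bandwidth bound when weights live in system RAM and must be streamed to the GPU for each generated token (illustrative numbers, not FlexGen's actual performance):

```python
# Rough lower bound on per-token latency when model weights are offloaded
# to system RAM and streamed over PCIe for every token generated.
def offload_token_seconds(params_billion, bytes_per_param=2, pcie_gb_s=16):
    model_gb = params_billion * bytes_per_param  # fp16 => 2 bytes/param
    return model_gb / pcie_gb_s                  # seconds/token, bandwidth-bound

# e.g. a 30B fp16 model over PCIe 4.0 x16 (~16 GB/s usable):
# 60 GB / 16 GB/s = 3.75 s per token, before any compute at all
print(round(offload_token_seconds(30), 2))
```

Offloading to disk makes the same arithmetic worse by an order of magnitude or more, which is why FlexGen focuses on high-throughput batch workloads rather than interactive chat.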


Yes, if your gaming rig is an NVIDIA DGX-1 workstation.

https://youtu.be/5TRr2oWeSw0


A very rough approximation is 2 GB of VRAM for every billion fp16 model parameters, so the lower-end models may be just about achievable on high-end cards.
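The rule of thumb above is just parameter count times bytes per parameter; a quick sketch (weights only — KV cache and activations need extra headroom on top):

```python
# ~2 GB per billion fp16 parameters (2 bytes each), weights only.
def vram_gb(params_billion, bytes_per_param=2):
    return params_billion * bytes_per_param

for b in (7, 13, 30):
    print(b, vram_gb(b))  # 7B -> 14 GB fits a 24 GB card; 30B -> 60 GB does not
```

Int8 or 4-bit quantization halves or quarters `bytes_per_param`, which is how larger models get squeezed onto single gaming cards.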



