Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I don’t really see how this works though — Isn’t it the case that longer “compute” times are more expensive? Hogging a gpu overnight is going to be more expensive than hogging it for an hour.


Nah, it’d take all night because it would be using the GPU for a fraction of the time, splitting the time with other customer’s tokens, and letting higher priority workloads preempt it.

If you buy enough GPUs to do 1000 customers’ requests in a minute, you could run 60 requests for each of these customers in an hour, or you could run a single request each for 60,000 customers in that same hour. The latter can be much cheaper per customer if people are willing to wait. (In reality it’s a big N x M scheduling problem, and there’s tons of ways to offer tiered pricing where cost and time are the main trafeoffs.)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: