Good job digging into all of this, Paul! At my company (onspecta.com) we solve similar problems (and more!), accelerating AI/deep learning/computer vision workloads across CPUs, GPUs, and other types of chips.
This is a fascinating space, and there are tons of speed-up opportunities. Depending on the type of workload you're running, you might be able to ditch the GPU entirely and run everything on the CPU, greatly reducing cost and deployment complexity. Or, at the very least, you can improve SLAs and cut GPU (or CPU) cost by 10x.
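To make that concrete, here's a rough sketch of how you might sanity-check whether CPU-only serving fits your latency budget. It uses PyTorch with a ResNet-18 purely as a stand-in model (both are my assumptions, not anything Paul used); swap in whatever you actually serve. If the per-request latency comfortably meets your SLA, the GPU may be optional:

    import time
    import torch
    import torchvision.models as models

    # Stand-in model; replace with the model you actually deploy.
    model = models.resnet18(weights=None).eval()
    x = torch.randn(1, 3, 224, 224)  # one image, batch size 1

    with torch.no_grad():
        # Warm-up runs so one-time initialization doesn't skew the timing.
        for _ in range(10):
            model(x)

        runs = 100
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        elapsed = time.perf_counter() - start

    print(f"CPU latency: {elapsed / runs * 1000:.1f} ms/request "
          f"({runs / elapsed:.1f} requests/s)")

Numbers will vary a lot with core count, batch size, and the runtime you use, so treat this as a first-pass check, not a verdict.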
I've seen this over and over again. Glad someone's documenting it publicly :-) If any of you readers have more questions about this, I'm happy to discuss in the comments here. Or you can reach out to me at victor at onspecta dot com.