Anyway, here's a must: a different key for uploading a model than for running inference with it. Better yet, a separate set of keys per model, with each access logged separately.
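A minimal sketch of what that separation could look like, assuming a hypothetical in-memory key store with one key per model per scope and an append-only access log (all names here are illustrative, not any provider's actual API):

```python
import secrets
import time

# Hypothetical key store: one key per (model, scope) pair,
# where scope is either "upload" or "infer".
KEYS = {}        # api_key -> {"model": ..., "scope": ...}
ACCESS_LOG = []  # append-only record of every key use

def issue_keys(model_id):
    """Mint a separate upload key and inference key for one model."""
    pair = {}
    for scope in ("upload", "infer"):
        key = secrets.token_hex(16)
        KEYS[key] = {"model": model_id, "scope": scope}
        pair[scope] = key
    return pair

def authorize(api_key, model_id, action):
    """Allow the action only if the key matches both the model and the
    scope, and log the attempt either way."""
    entry = KEYS.get(api_key)
    allowed = (entry is not None
               and entry["model"] == model_id
               and entry["scope"] == action)
    ACCESS_LOG.append({"time": time.time(), "model": model_id,
                       "action": action, "allowed": allowed})
    return allowed

keys = issue_keys("sentiment-v2")
assert authorize(keys["upload"], "sentiment-v2", "upload")
assert not authorize(keys["upload"], "sentiment-v2", "infer")  # wrong scope
```

The point of the per-model split is blast-radius containment: a leaked inference key can't overwrite the model, and the log makes every access attributable to a single key.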
Oh, that's very interesting. How ready for production is it? It only works for TF, right?
> If you need a few dozen inferences per second per server, this is the cheapest way. And you're not depending on a proprietary solution whose parent company could go out of business in a year.
Definitely the cheapest way.
We've been in business for more than a year already actually :)
NN-512 has no connection to TensorFlow. It is an open source Go program (with no dependencies) that generates C code (with no dependencies). And it's fully ready for production. Similarly, LibNC is stand-alone, and Fabrice Bellard (author of FFmpeg, QEMU, etc.) will release the source to anyone who asks for it.
I'm giving performance comparisons versus TensorFlow, which I consider to be a standard tool.
People who use your proprietary, closed, black-box service are dependent on the well-being of your business. You could vanish tomorrow.
> The model size is the zipped size of your model that is uploaded to Inferrd (either through the SDK or the website).
Nice to hear!
> We only have servers in the United States at the moment but are looking to have servers all around NA and EU very soon.
Sorry, my question was not quite clear. What I actually wanted to know is whether your service can be used legally in Europe. For example, I cannot find a privacy policy or a way to get a GDPR data processing agreement.
We don't have any cold start delay! In our custom environment, you can do exactly what you are describing (running both CPU and GPU code). We provide you with access to the GPU and the CUDA libraries installed. It's basically Lambda (minus the cold start) with GPU access.
We can scale a lot very quickly depending on how much you need.
Are you willing to talk a bit about how this all works? I assume you host the hardware yourself somewhere, which in the days of AWS et al must be pretty tough to pull off, especially with these specs. Where do you get the hardware from these days with the crypto craze?
Yes! A more in-depth blog post is coming soon. We do host the hardware ourselves, for complete control over the GPUs. We found a great infrastructure provider that is also experiencing shortages.
It helps us figure out what got done and where we are on our roadmap.