
Is this a drop-in solution that works with every existing TFLite model?


Yes, these optimizations work with existing TFLite models, as long as the quantized operators they use are supported in XNNPACK.
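
For context, here is a minimal sketch of running an existing quantized .tflite model from Python. It assumes a recent TensorFlow build where the interpreter applies the XNNPACK delegate by default where supported; the model path, thread count, and input are placeholders:

    import numpy as np
    import tensorflow as tf

    # Load an existing quantized .tflite model. In recent TF builds the
    # interpreter applies the XNNPACK delegate by default where the
    # model's operators are supported.
    interpreter = tf.lite.Interpreter(model_path="model.tflite", num_threads=4)
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]

    # Dummy input matching the model's expected shape and (quantized) dtype.
    dummy = np.zeros(input_details["shape"], dtype=input_details["dtype"])
    interpreter.set_tensor(input_details["index"], dummy)
    interpreter.invoke()
    result = interpreter.get_tensor(output_details["index"])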


I see, so in order to benefit, the model has to be quantized. It's not entirely clear which kinds of quantization are supported: both FP16 and INT8?


In order to benefit from the optimizations in *this blog post*, the model needs to be quantized to 8-bit integers. However, XNNPACK supports floating-point inference as well (including with FP16 weights), see https://blog.tensorflow.org/2020/07/accelerating-tensorflow-...
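
For reference, the usual way to produce such an 8-bit model is post-training quantization with the TFLite converter. A minimal sketch, assuming a SavedModel at "saved_model_dir" and an input shape of [1, 224, 224, 3] (both placeholders; your calibration data should match your model's real inputs):

    import tensorflow as tf

    def representative_dataset():
        # Yield a few calibration samples matching the model's input
        # shape and dtype; random data here is only a placeholder.
        for _ in range(100):
            yield [tf.random.normal([1, 224, 224, 3])]

    converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict to int8 builtin operators so the whole graph is quantized.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    tflite_model = converter.convert()
    with open("model_int8.tflite", "wb") as f:
        f.write(tflite_model)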


Thanks!



