Training/retraining: Done ad hoc, mostly locally, sometimes distributed. Done with either TensorFlow or Torch (we have a custom backend for it), often paired with Keras. A rough sketch of the local path is below.
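For illustration only, a minimal sketch of what the local Keras-on-TensorFlow training path might look like; the model architecture, data shapes, and file name are placeholder assumptions, not the actual setup:

```python
# Hypothetical local training run with Keras on TensorFlow.
# Architecture, shapes, and hyperparameters are illustrative placeholders.
import numpy as np
from tensorflow import keras

def train(x_train: np.ndarray, y_train: np.ndarray) -> keras.Model:
    model = keras.Sequential([
        keras.Input(shape=(32,)),                      # placeholder feature size
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),  # placeholder class count
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # Local run; a distributed run would wrap this in a tf.distribute strategy.
    model.fit(x_train, y_train, epochs=5, batch_size=64)
    model.save("model.keras")  # hypothetical artifact handed off to inference
    return model
```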
Inference: For our custom platform, we have our own framework (in progress). For other products where our hardware is unavailable, we will use either MXNet mobile or a custom framework built on top of the existing mobile frameworks. For deployments where we have the luxury of a cloud link, we will use either TensorFlow Serving (with a custom backend once it's done) or Flask linked to TF/Caffe/Keras (also with a custom backend once it's done).
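For the Flask-linked cloud path, a hedged sketch of what such a service could look like; the endpoint name, model file, and JSON schema are assumptions made up for the example:

```python
# Hypothetical Flask front end serving a Keras/TF model over a cloud link.
# Model path and request format are illustrative, not a production spec.
import numpy as np
from flask import Flask, jsonify, request
from tensorflow import keras

app = Flask(__name__)
model = keras.models.load_model("model.keras")  # load once at startup

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"inputs": [[...feature vector...], ...]}
    batch = np.asarray(request.get_json()["inputs"], dtype=np.float32)
    preds = model.predict(batch)
    return jsonify({"predictions": preds.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

TensorFlow Serving would replace this whole service with its own gRPC/REST endpoints; the Flask route is the lighter-weight option when wiring the model directly into existing Python code.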