The integration part (connectors) is the key here. I can see how beneficial this would be for companies, since they can just plug and play.
Adding the vectorisation locally is superb. I've played around with SBERT models before, and the ability to run without a GPU is going to simplify the process a lot.
Ah yes, this reminds me! I forgot to mention it, but the local NLP models we run are in the range of 100 million parameters, so they can run on CPU (no GPU required!) with pretty low latency.
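For anyone curious, here's a minimal sketch of what embedding text on CPU with a small SBERT-style model can look like, assuming the `sentence-transformers` library; the model name is just an illustrative example, not necessarily the one the project ships.

```python
from sentence_transformers import SentenceTransformer

# Small SBERT-style model (illustrative choice, not necessarily theirs);
# models in this size class run fine on CPU with low latency.
model = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")

docs = [
    "Connectors let you plug data sources in with minimal setup.",
    "Local embedding models avoid the need for a GPU.",
]

# encode() returns one fixed-size vector per input string (a numpy array)
embeddings = model.encode(docs)
print(embeddings.shape)  # e.g. (2, 384) for this particular model
```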
Also a fun tidbit on the connectors: more than half of them are now built by open source contributors! We just have an interface that needs to be implemented, and people have generally been able to figure it out.
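To make the "implement an interface" idea concrete, here's a hedged sketch of what such a connector contract could look like; all the names here are hypothetical, since the thread doesn't spell out the project's actual API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator


class Connector(ABC):
    """Hypothetical connector interface; method names are illustrative,
    not the project's actual API."""

    @abstractmethod
    def connect(self, credentials: Dict[str, Any]) -> None:
        """Authenticate against the external data source."""

    @abstractmethod
    def fetch_documents(self) -> Iterator[Dict[str, Any]]:
        """Yield documents as plain dicts for downstream indexing."""


# A contributor adds a new source by implementing that interface:
class WikiConnector(Connector):
    def connect(self, credentials: Dict[str, Any]) -> None:
        self.token = credentials["api_token"]

    def fetch_documents(self) -> Iterator[Dict[str, Any]]:
        yield {"id": "page-1", "text": "Example wiki page contents"}
```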