Homomorphic encryption has such enormous overhead that it would never be faster than just running the model locally, or probably even on your wristwatch, for that matter.
The library they're using is literally called Hivemind [0]. I'm curious how their approach differs from what we use in federated learning or gossip learning.
> Hivemind is a PyTorch library for decentralized deep learning across the Internet.
A Hivemind/Petals dev here. As far as I understand, most federated learning methods can't efficiently train very large models (with billions of parameters) because they repeat some calculations on many peers and/or involve excess communication.
In contrast, the training methods implemented in Hivemind strive to minimize compute and communication but don't provide data privacy guarantees. This is mostly okay for LLMs, since they are trained on public data scraped from the Internet anyway.
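For context, here's roughly what the training side looks like: a minimal sketch adapted from the Hivemind quickstart, where the toy model, `run_id`, and `data_loader` are placeholders and parameter names may differ slightly between library versions.

```python
import torch
import torch.nn as nn
import hivemind

# Start a DHT node so peers can discover each other over the Internet.
# Additional peers would pass initial_peers=[...] with this node's addresses.
dht = hivemind.DHT(start=True)

model = nn.Linear(784, 10)  # toy model standing in for a real LLM
base_opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Wrap the local optimizer: each peer trains on its own local batches and
# periodically averages parameters/gradients with whoever is online,
# rather than re-running the same computation on every node.
opt = hivemind.Optimizer(
    dht=dht,
    run_id="toy_run",          # peers sharing a run_id train together (placeholder name)
    batch_size_per_step=32,    # local samples processed per optimizer step
    target_batch_size=10_000,  # global batch size accumulated before averaging
    optimizer=base_opt,
    use_local_updates=True,    # apply local steps in between averaging rounds
    matchmaking_time=3.0,
    averaging_timeout=10.0,
    verbose=True,
)

for inputs, targets in data_loader:  # data_loader assumed to exist
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(inputs), targets)
    loss.backward()
    opt.step()
```

The training data itself never leaves each peer in plaintext only by accident of the setup, not by design, which is why this gives no formal privacy guarantee, just reduced communication.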
("Sorry, I don't know how to answer that – but you can try getting closer to a bunch of other people running the app on their device and ask again!")