Omi has a pretty interesting architecture that has been completely open-sourced:
- The device itself is built around a Nordic Semiconductor SoC with GPIO, a mic, and BLE.
- Audio is streamed as Opus-encoded frames over BLE to the paired mobile device, where some initial processing happens (a receive-side sketch follows this list).
- The audio is then forwarded to a FastAPI-based backend service that handles integrations with Deepgram, etc. (a backend sketch follows the paragraph below).
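To make the BLE hop concrete, here's a minimal sketch of what receiving that audio stream could look like, written in Python with bleak on a desktop as a stand-in for the mobile app. The device name and characteristic UUID are placeholders I've made up, not the firmware's actual GATT layout:

```python
"""Minimal sketch of the phone-side BLE hop, using bleak as a stand-in
for the mobile app. The device name and characteristic UUID below are
placeholders, not Omi's actual GATT layout."""
import asyncio

from bleak import BleakClient, BleakScanner

AUDIO_CHAR_UUID = "00002a3d-0000-1000-8000-00805f9b34fb"  # placeholder UUID


async def stream_audio(device_name: str = "Omi") -> None:
    device = await BleakScanner.find_device_by_name(device_name)
    if device is None:
        raise RuntimeError(f"Could not find a BLE device named {device_name!r}")

    opus_frames: list[bytes] = []

    def on_notify(_sender, data: bytearray) -> None:
        # Each notification carries an Opus-encoded chunk; buffer it here
        # and hand it off for decoding/upload elsewhere.
        opus_frames.append(bytes(data))

    async with BleakClient(device) as client:
        await client.start_notify(AUDIO_CHAR_UUID, on_notify)
        await asyncio.sleep(10)  # collect ~10 s of audio
        await client.stop_notify(AUDIO_CHAR_UUID)

    print(f"Received {len(opus_frames)} notifications")


if __name__ == "__main__":
    asyncio.run(stream_audio())
```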
Overall I think it's a clever split: the wearable gets to use very cheap hardware that sips battery while piggybacking on the paired phone's connectivity, whether that's Wi-Fi or cellular.
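On the backend side, here's a rough sketch of what a FastAPI endpoint forwarding audio to Deepgram could look like, using Deepgram's prerecorded `/v1/listen` REST API. The `/audio` route, the content type, and the request shape are my own illustrative assumptions, not Omi's actual API surface; the real service does considerably more, but the shape of the hop is similar:

```python
"""Hedged sketch of the backend hop: a FastAPI endpoint that accepts an
audio chunk and forwards it to Deepgram's prerecorded REST API. The route
and parameters are illustrative, not Omi's actual API."""
import os

import httpx
from fastapi import FastAPI, Request

app = FastAPI()
DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


@app.post("/audio")  # illustrative route, not Omi's real endpoint
async def transcribe(request: Request) -> dict:
    audio = await request.body()  # raw audio bytes uploaded by the app
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            DEEPGRAM_URL,
            params={"punctuate": "true"},
            headers={
                "Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}",
                "Content-Type": "audio/ogg",  # assumes Opus-in-Ogg upload
            },
            content=audio,
            timeout=60.0,
        )
    resp.raise_for_status()
    data = resp.json()
    # Pull just the transcript out of Deepgram's response structure.
    transcript = data["results"]["channels"][0]["alternatives"][0]["transcript"]
    return {"transcript": transcript}
```

The design choice the sketch highlights is the same one the list above describes: the phone does the battery- and bandwidth-cheap relay work, and the heavy lifting (speech-to-text, integrations) lives server-side where it's easy to swap providers.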