The process is still complicated enough to be "enthusiast" (aka nerd) territory but it is getting better with every release. It will still be here in 10 years, nobody can take it away from us.
> Why did you pick these default wake words and not something like “computer” or “okay assist”?
> A wake word should be uncommon in everyday conversations at home or in media, such as music or TV, to minimize the risk of the device activating unintentionally. “Nabu”, “Jarvis”, and “Mycroft” ...
They hardcoded the wake words in hardware...
Why not just use LLM common sense to say "does this really sound like a purposeful activation?"
Or put a GPU in there, or export the call to your PC like they require for text to speech?
For being a DIY thing, they made it inexplicably hard to D
They've actually thought it out well. The short of it is that you can set up a device to constantly stream audio to Home Assistant so that you can use any wake word.
That comes with the drawback of more power use and, more importantly, higher CPU use for each microphone you add. It's still possible (https://github.com/dscripka/openWakeWord).
The alternative for a dedicated low-power device is to train a model small enough to run locally on a microcontroller (https://github.com/kahrendt/microWakeWord). This is what they have chosen for their dedicated devices.
This choice also comes with much higher default privacy, which is great since Home Assistant offers cloud integration. The fact that they put privacy first in this area makes it much easier to trust that they do the right thing in other areas.
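To make that trade-off concrete, here's a rough Python sketch of the gating idea: a cheap wake word check runs on every audio frame, and the expensive stage (speech-to-text, or an LLM) only runs after it fires. Everything in it is hypothetical. `GatedPipeline`, `fake_wake_score`, and `fake_transcribe` are stand-ins made up for illustration, not the openWakeWord, microWakeWord, or Home Assistant APIs.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Iterable, List

Frame = List[float]  # one chunk of audio samples


@dataclass
class GatedPipeline:
    """Cheap detector on every frame; expensive stage only after a trigger."""
    wake_score: Callable[[Frame], float]      # hypothetical cheap scorer, 0..1
    transcribe: Callable[[List[Frame]], str]  # hypothetical expensive stage
    threshold: float = 0.5
    capture_frames: int = 3                   # frames recorded after a trigger
    expensive_calls: int = 0

    def run(self, frames: Iterable[Frame]) -> List[str]:
        results: List[str] = []
        it = iter(frames)
        for frame in it:
            if self.wake_score(frame) < self.threshold:
                continue  # no trigger: the expensive stage never sees this frame
            # Trigger: pull the next few frames off the stream as the command.
            command = list(islice(it, self.capture_frames))
            self.expensive_calls += 1
            results.append(self.transcribe(command))
        return results


def fake_wake_score(frame: Frame) -> float:
    # Stand-in for a real wake word model: "loud first sample" means trigger.
    return 1.0 if frame and frame[0] > 0.9 else 0.0


def fake_transcribe(command: List[Frame]) -> str:
    # Stand-in for speech-to-text; just reports how much audio it received.
    return f"{len(command)} frames transcribed"


# 100 frames of silence, one trigger frame, 3 command frames, 50 more of silence.
frames = [[0.0]] * 100 + [[1.0]] + [[0.2]] * 3 + [[0.0]] * 50
pipeline = GatedPipeline(fake_wake_score, fake_transcribe)
results = pipeline.run(frames)
print(results, pipeline.expensive_calls)
```

In this toy run the expensive stage is called once for 154 frames of audio. Without the gate it would have to process every frame, which is roughly the cost people are signing up for when they ask for always-on speech-to-text plus an LLM.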
>> Because then you'd be running a full speech to text model all the time, and an LLM any time any speech is detected.
> Uh yea, I want that to happen.
Be careful what you wish for.
> Have you seen Star Trek? We should have computer by now. An agent which knows when to tell us things we need to know, not just respond to requests.
Have you read 1984[0]?
Star Trek is fiction. Good fiction IMHO, but fiction nonetheless. As such, the writers took "creative liberties" in order to present a storyline for viewers.
If you'd read 1984 recently enough, you'd know it is based on the notion of state-imposed and socially imposed ideology becoming enmeshed. A locally running hub of pure logic is the antithesis of that concept.
A better metaphor would be the Mephi from Zamyatin's We (the original inspiration for 1984), which represent the kernel of unmodulated truth that exists outside the calculus of the imposed social matrix.
Hilarious that you thought I wouldn't be familiar with the work, when you yourself appear to be that person.
For what it's worth, I was worried about that as well, but I found it to be _fun_, and I actually look forward to playing with it, even before bed. Weird, I know, but I'm enjoying it and didn't think I would.
https://www.home-assistant.io/voice-pe/