
I did the same thing, but I went the easy way and used OpenAI's API. Halfway through, I got fed up with all the boilerplate, so I wrote a really simple (but very Pythonic) wrapper around function calling with Python functions:

https://github.com/skorokithakis/ez-openai

Then my assistant is just a bunch of Python functions and a prompt. Very very simple.
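For context, here's roughly the round trip a wrapper like that hides, using the plain openai client directly. The function, model name, and schema are illustrative stand-ins, not ez-openai's API:

    # Sketch of the function-calling round trip such a wrapper abstracts
    # away, using the official openai client (>= 1.x). get_temperature
    # and the model choice are illustrative assumptions.
    import json
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def get_temperature(room: str) -> str:
        """Illustrative stand-in for a real smart-home lookup."""
        return json.dumps({"room": room, "celsius": 21.5})

    tools = [{
        "type": "function",
        "function": {
            "name": "get_temperature",
            "description": "Get the current temperature in a room.",
            "parameters": {
                "type": "object",
                "properties": {"room": {"type": "string"}},
                "required": ["room"],
            },
        },
    }]

    messages = [{"role": "user", "content": "How warm is the living room?"}]
    response = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=tools
    )
    call = response.choices[0].message.tool_calls[0]
    result = get_temperature(**json.loads(call.function.arguments))

    # Feed the result back so the model can phrase the final answer.
    messages.append(response.choices[0].message)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(final.choices[0].message.content)

With the wrapper, the schema and the dispatch back to the Python function are generated for you; the above is what you'd otherwise write per function.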

I used an ESP32-Box with the excellent Willow project for the local speech recognition and generation:

https://github.com/toverainc/willow



> > Building a fully local LLM voice assistant

> I did the same thing, but I went the easy way and used OpenAI's API.

This is a cool project, but it's not really the same thing. The #1 requirement that OP had was to not talk to any cloud services ("no exceptions"), and that's the primary reason why I clicked on this thread. I'd love to replace my Google Home, but not if OpenAI just gets to hoover up the data instead.


Sure, but the LLM is also the easy part. Mistral is plenty smart for the use case; all you need to do is use llama.cpp with a JSON grammar and instruct it to return JSON.
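A rough sketch of what that looks like with the llama-cpp-python bindings; the GBNF grammar, model path, and prompt here are illustrative assumptions:

    # Grammar-constrained decoding: the sampler can only emit strings
    # that match the grammar, so malformed JSON is unrepresentable.
    from llama_cpp import Llama, LlamaGrammar

    GBNF = r'''
    root   ::= "{" ws "\"device\":" ws string "," ws "\"state\":" ws ("\"on\"" | "\"off\"") ws "}"
    string ::= "\"" [a-zA-Z_ ]* "\""
    ws     ::= [ \t\n]*
    '''

    llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf")  # path is an assumption
    grammar = LlamaGrammar.from_string(GBNF)

    out = llm(
        "Turn off the kitchen lights. Respond with JSON only.",
        grammar=grammar,
        max_tokens=64,
    )
    print(out["choices"][0]["text"])  # e.g. {"device": "kitchen lights", "state": "off"}

Because the grammar does the enforcement, you don't need retry loops for unparseable output.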


I might get downvoted for this, but OpenAI's API terms pretty clearly say that the data isn't used for training.


I'd imagine their ToS (which they can update whenever they want) links to a privacy policy (which they can also update whenever they want), and that's where this restriction is actually codified. The ToS probably also has another part saying they'll use your data "for business reasons including [innocuous use-cases]", and yet another part elsewhere that defines "business reasons" as "whatever we want, including selling it".


See Magentic for something similar: https://github.com/jackmpcollins/magentic


That looks very interesting, thanks!


I assume the issue is about privacy in your case. I am not using Alexa, Siri, etc.


That is correct! I would much rather run everything in-house, where I know the quality won't be degraded over time (see the Google Assistant announcement from yesterday) and I am in full control of my data.

Using a cloud service is much easier and cheaper, but I was not comfortable with that trade-off.


Based on your experience and the existing code, is it easy to add continuous listening? I haven't tested it, but it's probably already there. For example, I'd like to have it always on so I can talk to it about ideas at random times.


I never tried it, but I think it would go very poorly without a wake word of sorts.

HomeAssistant seems to natively support wake words, but I haven't looked into it yet. I simply use my smartwatch (Wear OS supports replacing Google Assistant with HomeAssistant's Assist functionality) to interact with the LLM.
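For anyone rolling their own, here's a rough sketch of wake-word gating with the openWakeWord library (one of the engines HomeAssistant's Assist wake word support builds on). The model name, threshold, and audio settings are assumptions taken from the library's examples:

    # Always-on listening, gated by a wake word so speech recognition
    # only runs after a trigger. Requires openwakeword and pyaudio.
    import numpy as np
    import pyaudio
    from openwakeword.model import Model
    from openwakeword.utils import download_models

    download_models()  # fetch the bundled pretrained models on first run

    CHUNK = 1280  # 80 ms of 16 kHz, 16-bit mono audio per frame
    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
                     input=True, frames_per_buffer=CHUNK)

    oww = Model(wakeword_models=["hey_jarvis"])  # one of the bundled models

    while True:
        frame = np.frombuffer(stream.read(CHUNK), dtype=np.int16)
        for name, score in oww.predict(frame).items():
            if score > 0.5:  # threshold is a tunable assumption
                print(f"{name} detected; hand off to speech-to-text here.")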


The solution I've got (in alpha) is a basic webcam that detects when you're looking at it.

The cam is positioned higher than most things in the room to reduce triggering it unnecessarily.

When it triggers (currently just simple OpenCV facial landmark detection), it emits a beep and then listens for a verbal command.
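Something like this, sketched with OpenCV's bundled Haar cascade rather than landmark detection; the camera index, consecutive-frame threshold, and the terminal-bell beep are illustrative choices:

    # Trigger on a face looking at the camera: require a few consecutive
    # detections before beeping, to avoid spurious one-frame triggers.
    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cam = cv2.VideoCapture(0)

    hits = 0
    while True:
        ok, frame = cam.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        hits = hits + 1 if len(faces) else 0
        if hits >= 5:
            print("\a")  # beep, then start listening for a verbal command
            hits = 0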



