Because that's not going to be reliable, and doesn't solve any of the other issues with this technology. Please don't shoehorn us into these conversations about insane privacy-invading gadgets.
(I do think AI/LLMs have a lot of potential for making life easier for autistic people, but ... this isn't it.)
It makes you wonder what the legal implications are of owning an Amazon, Google or Apple device with a voice assistant turned on in one of these jurisdictions.
Voice assistants have a wake word. Low-power circuitry runs locally, listening for a specific series of syllables and keeping a buffer of a second or two of audio.
Once that wake word circuit detects the series of syllables, it activates the rest of the device and starts streaming the buffered and current audio into whatever system it has for transcription (on device or in the cloud).
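Roughly, the pattern looks like this sketch (a toy illustration in Python; the buffer size and the detect_wake_word / wake_and_stream stand-ins are my assumptions, not any vendor's actual implementation):

    from collections import deque

    SAMPLE_RATE = 16_000               # samples/sec; a common rate for speech
    BUFFER_SECONDS = 2                 # "a second or two" of audio kept locally

    ring_buffer = deque(maxlen=SAMPLE_RATE * BUFFER_SECONDS)

    def detect_wake_word(buffer) -> bool:
        """Stand-in for the low-power syllable matcher; always on, local only."""
        return False  # a real detector would pattern-match the audio here

    def wake_and_stream(buffered_samples):
        """Stand-in for waking the main system and handing off the audio."""
        print(f"wake word heard; forwarding {len(buffered_samples)} buffered samples")

    def on_audio_frame(samples):
        ring_buffer.extend(samples)    # cheap append; oldest audio falls off
        if detect_wake_word(ring_buffer):
            # Only now does the rest of the pipeline (local or cloud) see any
            # audio, including the buffered moment before the wake word.
            wake_and_stream(list(ring_buffer))

The deque with maxlen is what gives you the "second or two" window: audio older than that is simply discarded on the device, never sent anywhere.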
In many cases, this happens locally. If you have an iPhone, put it in airplane mode and say "Siri, what time is it?" and it will respond - all processing is local, no recording in the cloud for that request. Some other requests require additional processing: "Hey Siri, where am I?" -> "To do that, you will need to turn off airplane mode."
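A toy sketch of that split (the request strings and the is_online flag are my own stand-ins for illustration, not Apple's actual dispatch logic):

    import datetime

    def handle_request(request: str, is_online: bool) -> str:
        if request == "what time is it":
            # Answerable entirely on device: the clock is local.
            return datetime.datetime.now().strftime("It's %I:%M %p")
        if request == "where am i":
            # Needs network-backed services (location lookup, maps).
            if not is_online:
                return "To do that, you will need to turn off airplane mode."
            return "resolved via network services"
        return "Sorry, I didn't get that."

    print(handle_request("what time is it", is_online=False))  # works offline
    print(handle_request("where am i", is_online=False))       # refuses offline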
If you have an Amazon device, enable the "Start of request sound" ( https://www.amazon.com/b?ie=UTF8&node=21341310011 ). With this in place, you can then hear when the wake word has been triggered.
None of these devices are constantly recording or streaming to the cloud (aside: consider the network and compute requirements if every iPhone or Android were constantly streaming sound to Apple or Google to be recorded).
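For a rough sense of scale, here's the back-of-envelope version (device count and bitrate are my own round-number assumptions):

    DEVICES = 1_000_000_000      # order-of-magnitude count of active handsets
    BITRATE_BPS = 32_000         # compressed speech at roughly 32 kbps

    bytes_per_device_per_day = BITRATE_BPS / 8 * 86_400   # ~0.35 GB/day each
    aggregate_bps = DEVICES * BITRATE_BPS                 # ~32 Tbps, nonstop

    print(f"per device: ~{bytes_per_device_per_day / 1e9:.2f} GB/day")
    print(f"aggregate ingest: ~{aggregate_bps / 1e12:.0f} Tbps, continuously")

That's tens of terabits per second of raw ingest before you even store or process anything.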
I guess if a device is transcribing audio data from a buffer, it's not the same as a recording. Still, I remember Apple was using some humans to review recordings:
I was pointing out that walking around with a voice assistant is not walking around with an open microphone recording everything.
For many applications with the iPhone, the transcription is done locally.
If you have an iPhone, turn on airplane mode and switch on dictation mode. Then say:
I need to go to the store and get a two by four
I am five foot ten inches tall
You'll note that it does a fairly good job of transcription on device (you can still trick it with some ambiguous homophones).
I will also point out that Apple uses an opt-in process rather than opt-out. The article you linked is from October 29th, 2019; iOS 13.2 was released October 28th. It is possible that the release, with its opt-in "Privacy settings to control whether or not to help improve Siri and Dictation by allowing Apple to store audio of your Siri and Dictation interactions" ( https://support.apple.com/en-us/118392 ), is what triggered the article (the change prompted the article rather than the article prompting the change).
If you opt in to Improve Siri and Dictation, the audio of your interactions with Siri and Dictation may be stored on Siri servers and reviewed by Apple employees to develop and improve Siri, Dictation, and natural language processing functionality in Apple products and services. For general text Dictation performed on device (for example, composing messages and notes, but not dictating in a search box), transcripts and audio are not shared with Apple by default, but are shared if you opt in to Improve Siri and Dictation. In addition, other Siri Data, such as computer-generated transcriptions of your requests, names of your contacts, apps installed on your devices, and location, is also used to improve Siri.
This is fundamentally different than having an open microphone during a conversation that is transcribing the entire conversation and then using that to summarize it.
Fine in theory, but man are they terrible in practice: the wake word misfires, at which point the audio is sent to the cloud and then, as it turned out with Siri, evaluated by contracted humans.