> Unfortunately many, many people think that talking to your computer is actually already possible, they just haven't experienced it yet.
Given how often another person can't correctly infer meaning when people talk, I doubt it will ever be how people imagine it.
Initially, I imagine a lot of people will try to talk normally, and a lot of AI responses will ask how your poorly worded request should be carried out: choice A, or choice B? You'll be annoyed, because why can't the system just do the thing, when it would be right 90% of the time? The problem is that 10% of the time you're just not explaining yourself well at all, and 10% is really quite a lot of the time - probably anywhere from a few requests to dozens of requests a day. Being asked what you meant is annoying, but having the wrong thing done can be catastrophic.
As humans, we get around this to some degree by modeling the person we are conversing with, thinking "what would they want to do right now?" That is another class of problem for AIs, I assume, so I think we may not get human-level interaction until AIs can fairly closely model humans themselves.
I imagine we'll probably develop some AI pidgin language that provides more formality; as long as we follow it, AI voice interactions become less annoying (though you have to spend a bit of time up front formulating your request).
Then again, maybe I'm way off base. You sound like you would know much better than me. :)
In a previous job I looked into the viability of pen computing, specifically which companies had succeeded commercially with a pen-based interface.
A lot of folks are too young to know this, but there was a time when it was accepted wisdom that much of personal computing would eventually converge to a pen-based interface, with the primary enabling technology being handwriting recognition.
What I found is that computer handwriting recognition can never, ever be good enough to satisfy people's expectations. Why? Because no one can satisfy most people's expectations for handwriting recognition--not other humans, and often not even the writers themselves!
Frustration with reading handwriting is a given in human interaction. It will be a given in computer interaction too--only, people don't feel like they need to be polite to a computer. So they freely get mad about it.
The companies that had succeeded with pen computing were those that had figured out a way to work around interpreting natural handwriting.
One category was companies that specialized in pen computing as an art interface--Wacom being the primary example at the time. No attempt at handwriting recognition at all.
The other was companies who had tricked people into learning a new kind of handwriting. The primary example is the Palm Pilot's "Graffiti" system of characters. The neat thing about Graffiti is it reversed customer expectations. Because it was a new way of writing, when recognition failed, customers often blamed themselves for not doing it right!
We all know what happened instead of pens: touchscreen keyboards. The pen interface continues to be a niche UI, focused more on art than text.
It's interesting to see the parallels with spoken word interfaces. I don't know what the answer is. I doubt many people would have predicted in 2001 that the answer to pen computing was a touchscreen keyboard without tactile keys. Heck, people laughed about it in 2007 when the iPhone came out.
But I won't be surprised if the eventual product that "solves" talking to a computer wins because it hacks its way around customer expectations in a novel way--rather than delivering a "perfect" voice interface.
> The other was companies who had tricked people into learning a new kind of handwriting [...] It's interesting to see the parallels with spoken word interfaces. I don't know what the answer is.
Well, here's a possibility: we'll meet in the middle. Companies will train humans into learning a new kind of speech which is more easily machine-recognisable. These dialects will be very similar to English but more constrained in syntax and vocabulary; they will verge on being spoken programming languages. They'll bear a resemblance to, and be constructed in similar ways to, jargon and phrasing used in aviation, optimised for clarity and unambiguity over low-quality connections. Throw in optional macros and optimisations which increase expressive power but make it sound like gibberish to the uninitiated. And then kids who grow up with these dialects will be able to voice-activate their devices with an unreal degree of fluency, almost like musical instruments.
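To make that concrete: such a dialect could start from something as small as a closed verb set with explicit slots. A purely hypothetical sketch (none of these names come from any real product):

    import re

    # Closed verb vocabulary, explicit slots, no free-form phrasing.
    COMMAND = re.compile(
        r"^(?P<verb>add|remove|set|query)\s+"    # small fixed verb set
        r"(?P<object>[a-z ]+?)"                  # object phrase
        r"(?:\s+to\s+(?P<target>[a-z ]+))?$"     # optional explicit target
    )

    def parse(utterance):
        m = COMMAND.match(utterance.lower().strip())
        if not m:
            return None   # outside the dialect: ask the speaker to rephrase
        return m.groupdict()

    print(parse("add oat milk to shopping list"))
    # {'verb': 'add', 'object': 'oat milk', 'target': 'shopping list'}
    print(parse("could you maybe put some milk on the list"))
    # None -- no guessing, aviation-style

Anything outside the grammar gets rejected rather than guessed at, which is exactly the aviation-style trade of flexibility for unambiguity.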
Of course, voice assistants are already doing this to us a bit. Once I learn the phrasing that works reliably, I stick with it, even though repeating "Add ___ to shopping list" for every item feels like the equivalent of GOTO statements.
The challenge with learning anything new from our voicebots is the same as it's always been: discoverability. I don't have the patience to listen to Siri's canned response when it's done what I've asked it to, and I'm probably not going to listen to it tell me "Next time, you can say xyz and I'll work better."
The easiest way to learn a new language is to be a kid and listen to native speakers converse. Unless Amazon produces a kids' show starring a family of Echos speaking their machine code, I don't see that happening.
I miss the times before voice assistants, when the user was expected to train the voice interface. That way I could, if needed, negotiate pronunciation with the machine.
Your comment reminded me of an interesting video I saw a while back. The video is a talk [0] by a programmer explaining and demonstrating how he developed and used a voice recognition system to program in Emacs after suffering a significant RSI.
The commands and syntax he uses to interact with the system might be a decent approximation of the custom English syntax you suggest.
I'd just like to see a voice interface that makes any attempt at natural error correction.
We do this in a hundred ways in English (emphasis, meta-words, and more), too many to fold completely into a system. But with a little user prodding, I think you could cover enough of this ground to still sound almost natural.
Asst.: Would you like me to read that back to you?
User: Yes
A: The click brown ox...
U: The _quick_ brown _fox_...
A: Ok. The the quick brown fox jumped over the lazy log. Is that correct?
U: No.
A: Which part [is incorrect]?
U: The the quick
A: The the quick. What should it be?
U: The quick.
A: The quick brown fox jumped over the lazy log. Is that correct?
U: No. Lazy dog.
A: Lazy log?
U: No. Lazy dog.
A: Spell "log." <Or "Spell" + recording of user's voice saying ambiguous word>
U: D-O-G.
A: Ah. The quick brown fox jumped over the lazy dog. Do I have it now?
U: Yes.
A: Say 'send' to send message.
Agreed that voice systems are an incredibly hard problem, but I suspect rule-based natural language correction systems would go a long way.
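For what it's worth, even a handful of fixed correction phrasings gets you surprisingly far. Here's a minimal sketch in Python (entirely hypothetical, not any real assistant's API) of the "no, lazy dog" and "spell it" rules from the dialogue above:

    import difflib
    import re

    def apply_correction(transcript, correction):
        """Apply one spoken correction to a transcript using a few fixed rules."""
        words = transcript.split()
        correction = correction.strip()

        # Rule 1: "spell: d-o-g" overrides the closest-sounding word.
        m = re.match(r"spell[:,]?\s+([a-z](?:[-\s][a-z])+)\.?$", correction, re.I)
        if m:
            fixed = re.sub(r"[-\s]", "", m.group(1)).lower()
            target = difflib.get_close_matches(fixed, words, n=1, cutoff=0.0)[0]
            return transcript.replace(target, fixed, 1)

        # Rule 2: "no, <phrase>" replaces the closest-sounding span of the
        # same length as <phrase>.
        m = re.match(r"no[.,]?\s+(.+)$", correction, re.I)
        if m:
            phrase = m.group(1).lower().rstrip(".").split()
            spans = [" ".join(words[i:i + len(phrase)])
                     for i in range(len(words) - len(phrase) + 1)]
            if spans:
                target = difflib.get_close_matches(" ".join(phrase), spans,
                                                   n=1, cutoff=0.0)[0]
                return transcript.replace(target, " ".join(phrase), 1)

        return transcript   # no rule matched; leave the transcript alone

    t = "the quick brown fox jumped over the lazy log"
    print(apply_correction(t, "No. Lazy dog."))
    # -> the quick brown fox jumped over the lazy dog

Emphasis and other prosodic cues are where it gets genuinely hard, but boring rules like these would cover a lot of everyday corrections.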
I'm not so sure. Maybe I'm naive, but I could see imitation and reinforcement learning actually becoming the little bit of magic that is missing to move the field forward.
From what we know, those techniques play important roles in how humans themselves go from baby gibberish to actual language fluency - and from what I understand there is already a lot of ongoing research into how reinforcement learning algorithms can be used to learn walking cycles and similar things.
So I'd say, if you find a way to read the non-/semiverbal cues humans use to communicate understanding or misunderstanding among themselves, and use those as penalties/rewards for computers, you might be on the way to learning some kind of language that humans and computers can use to communicate.
The same goes in the other direction. There is a sector of software where human users not only frequently master highly intricate interfaces, they even do it out of their own motivation and pride themselves on achieving fluency: computer games. The secret here is a blazing fast feedback loop - which gives penalty and reward cues in a similar way to reinforcement learning - and a carefully tuned learning curve which starts with simple, easy-to-learn concepts and gradually becomes more complex.
I would argue that if you combine those two techniques - using penalties and rewards both to train the computer and to communicate back to the user how well it understood you - you might be able to move to a middle-ground representation without it seeming like much effort on the human side.
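As a toy illustration: you could treat "which interpretation did the user mean?" as a bandit problem, with the user's confirmations and corrections standing in for those nonverbal reward cues. A hypothetical sketch, nothing more:

    import random
    from collections import defaultdict

    class InterpretationBandit:
        """Pick an interpretation of an utterance; learn from accept/correct."""

        def __init__(self, epsilon=0.1):
            self.epsilon = epsilon
            self.value = defaultdict(float)   # running reward estimate
            self.count = defaultdict(int)

        def choose(self, utterance, candidates):
            if random.random() < self.epsilon:
                return random.choice(candidates)        # explore occasionally
            return max(candidates, key=lambda c: self.value[(utterance, c)])

        def feedback(self, utterance, chosen, reward):
            # reward: +1 if the user just went along with it, -1 if they had
            # to correct us (standing in for sighs, frowns, rephrasing, ...)
            key = (utterance, chosen)
            self.count[key] += 1
            self.value[key] += (reward - self.value[key]) / self.count[key]

    bandit = InterpretationBandit()
    choice = bandit.choose("play some bach", ["artist: Bach", "playlist: Bach mix"])
    bandit.feedback("play some bach", choice, reward=+1)

The interesting part is exactly what you describe: getting that reward signal from cues the user gives off anyway, rather than from explicit thumbs up/down.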
The first generation Newton was insanely good at recognizing really terrible cursive handwriting and terrible at recognizing tidy or printed handwriting. The second generation Newton was very good at recognizing tidy handwriting (and you could switch back to the old system if you were one of the freaks whose handwriting the old system grokked). The Newton also had a really nice correction system, e.g. gestures for changing the case of a single character or word. In my opinion the later Newton generations recognized handwriting as well as you could expect; when it failed, you forgave it and could easily correct the errors; and it was decently fast on the 192 MHz ARM CPUs.
I got a two-in-one Windows notebook and I was really, really impressed with the handwriting recognition. It can even read my cursive with like 95% accuracy, which I never imagined. That said, it's still slower than typing.
Finger-on-nose! It seems to be a question of utility and efficiency rather than primarily an emotional response to effectiveness of a given UI.
I type emails because it's faster. I handwrite class notes because I often use freeform groupings and diagrams that would make using a word processor painful.
On the other hand, learning to type took some effort and I see many people who do not know how to properly type. The case for handwriting might have looked better if you assumed people would not all want to become typists.
Back in the early 2000s, the hybrid models (the ones with a keyboard) became the most popular Windows tablet products. A keyboard is a pretty convenient way to work around the shortcomings of handwriting recognition.
Most owners eventually used them like laptops unless illustrating, or operating in a sort of "personal presentation mode," where folding the screen flat was convenient. For most people, handwriting recognition was a cool trick that didn't get used much (as I suspect it was for you, eventually).
The utility of a handwriting recognition system is ultimately constrained by the method of correcting mistakes. Even 99% word accuracy means that, at roughly a hundred words per paragraph, you'll be correcting about one word per paragraph on average, which is enough to feel like a common task. If the only method of correction is via the pen, then the options are basically to select from a set of guesses the system gives you (if it does that), or to rewrite the word until it gets recognized.
But if a product has a keyboard, then you can just use the keyboard for corrections. And if you're going to use the keyboard for corrections, why not just use it for writing in the first place...
There is software that allows you to draw the characters on a touch screen, but it's still slower than typing, and most people either type the words phonetically (e.g. in Pinyin) or use something like Cangjie, which is based on the shape of character components. Shape-based methods are harder to learn but can result in faster input.
For Japanese there are keyboard-based romaji and kana input methods.
The correct Hanzi/Kanji is either inferred from context or, as a back-up, selected by the user.
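Roughly, the phonetic route works like this (a toy sketch with a tiny illustrative dictionary; real IMEs use large dictionaries plus a language model for the context step):

    # Toy syllable-to-character table; purely illustrative.
    CANDIDATES = {
        "ma": ["妈", "马", "吗", "码"],
        "shi": ["是", "十", "时", "事"],
    }

    def resolve(pinyin, context_scores=None):
        candidates = CANDIDATES.get(pinyin, [])
        if context_scores:
            # "Inferred from context": rank by whatever the context model says.
            candidates = sorted(candidates,
                                key=lambda c: -context_scores.get(c, 0.0))
        return candidates   # the UI shows this list; the user picks as a back-up

    print(resolve("ma", {"马": 0.9}))   # ['马', '妈', '吗', '码']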
Almost nobody uses kana input in Japanese. There are specialized systems for newspaper editors, stenographers, etc., that are just as fast as English-language equivalents, but learning them is too difficult for most people to bother.
Huh? The standard Japanese keyboard layout is Hiragana-based, and every touchscreen Japanese keyboard I've used defaults to 9key kana input. Do people really normally use the romanji IME instead?
Yes. Despite the fact that kana are printed on the keys, pretty much no Japanese person who is not very elderly uses kana input. They use the romaji (no N) input instead, and that's the default setting for keyboards on Japanese PCs.
You are right that the ten-key method is more common on cell phones (and older feature phones had kana assigned to number keys), although romaji input is also available.
Not sure about that. I guess it depends on how good the prediction is for your text, but in general the big problem is that the step where you select what character you actually meant takes time.
I've always thought it'd be nice to be able to code at the speed of thought - you can often envisage the code or classes that'll make up part of a solution, but it's just so slow filtering that through your fingers and the keyboard. If the neural interface 'thing' could be solved, that'd be a winner I think, both for coding and manipulating the 'UI'.
It seems within the bounds of possibility to me (from having no experience in the field ;), if patterns could be predictably matched in real time, or faster than it takes to whack the keyboard or move the mouse (can EEG or fMRI produce predictable patterns like that?).
>I've always thought it'd be nice to be able to code at the speed of thought - you can often envisage the code or classes that'll make up part of a solution,
In my experience that 'solution' isn't a solution but rather a poorly guessed attempt at one single part of it. Whenever I've thought I had the architecture of a class all figured out, there is one corner case that destroys all the lovely logic, and you either start from scratch on yet another brittle monolith or accept that you are essentially writing spaghetti code. At that point, documentation that explains what does what and why is the only way to 'solve' the issue of coding.
Unfortunately, commercial-grade non-invasive EEG headsets can only reliably respond to things like intense emotional changes and the difference between sleep and wake states, not much more than that. A while ago I had some moderate success classifying the difference between a subject watching an emotional short film vs. stand-up comedy. Mostly they are effective at picking up background noise and muscular movement and not much else.
This is based on my experience with the Muse [1] headset, which has 4 electrodes plus a reference electrode. I know there are similar devices with around a dozen or so electrodes, but I can't imagine they're a significant improvement.
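For anyone curious, the pipeline for that kind of classification is roughly the following. This is a sketch on synthetic data standing in for real Muse recordings (the band ranges and the 256 Hz sample rate are my assumptions about a typical setup):

    import numpy as np
    from scipy.signal import welch
    from sklearn.linear_model import LogisticRegression

    FS = 256                    # assumed EEG sample rate
    BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

    def band_powers(epoch):
        """epoch: (n_channels, n_samples) array -> flat band-power features."""
        feats = []
        for ch in epoch:
            f, psd = welch(ch, fs=FS, nperseg=FS)
            for lo, hi in BANDS.values():
                feats.append(psd[(f >= lo) & (f < hi)].mean())
        return feats

    # Synthetic stand-in: 40 ten-second, 4-channel "epochs" with binary labels.
    rng = np.random.default_rng(0)
    X = np.array([band_powers(rng.normal(size=(4, FS * 10))) for _ in range(40)])
    y = rng.integers(0, 2, size=40)

    clf = LogisticRegression(max_iter=1000).fit(X, y)
    print(clf.score(X, y))      # chance-ish on noise; the point is the pipeline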
So I think we're still a long way off from what you're describing :(
This comment is way too long to justify whatever point you are making. First of all, how long ago do you mean? Are you 100 years old, and did people in the 1940s think computers would be operated by pens?
But also, when my college had a demo Microsoft tablet in their store in 2003, I could sign my name as I usually do (typical poor male handwriting) and it recognized every letter. That seems pretty good to me.
I also had a Palm Pilot then, and Graffiti sucked, but that's just one implementation.
In addition to what novia said about your first sentence, your comment is so weirdly off-base. You have very strong opinions about pen computing, but where in reality are they coming from?
No, we are not talking about the 1940s, why would anyone be talking about the 1940s? How does that make any sense at all?
We aren't talking about a Microsoft demo that worked well for you once, either. We are talking about the fact that, since the 2000s when it was tried repeatedly, people do not like to hand-write on computers. It is a strange thing not to be aware of when you're speaking so emphatically on the topic.
I don't have strong opinions about pen computing, but the comment I replied to was just giving a bad impression of the tech in the early 2000s. It worked quite well actually.
Since he didn't specify the timeframe, I gave him the benefit of the doubt and thought he was talking about before my time.
I think you have a good point. Something that was fascinating for me was that one of the things the search engine company did after being acquired by IBM was to look at search queries for common questions people would ask a search engine: taking 5 years of search engine query strings (no PII[1]) and turning them into a 'question' that someone might ask.
What was interesting is that it was clear that search engines have trained a lot of people to speak 'search engine' rather than stick with English. Instead of "When is Pizza Hut open?" you would see the query "Pizza Hut hours". That works for a search engine because you have presented it with a series of non-stopword terms in priority order.
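The transformation is basically stopword stripping while keeping the remaining terms in order. Something like this toy sketch (the stopword list here is purely illustrative, not what we actually used):

    STOPWORDS = {"when", "is", "the", "a", "an", "of", "what", "are", "to"}

    def to_query(question):
        # Drop stopwords, keep the remaining terms in their original order.
        terms = [w.strip("?.,!") for w in question.split()]
        return " ".join(w for w in terms if w.lower() not in STOPWORDS)

    print(to_query("When is Pizza Hut open?"))   # -> "Pizza Hut open"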
It reminded me of the constraint-based naming papers that came out of MIT in the late '80s / early '90s. It also suggested to me ways to take dictated speech and reformat it into a search engine query. There are probably a couple of good academic papers in that data somewhere.
That experience suggests to me that what you are proposing (a pidgin type language for AI) is a very likely possibility.
One of the bits that Jimmy Fallon does on the Tonight Show is an interviewer who asks people a question but uses nonsense words that sound "right" in the question. The most recent example was changing the question
"Do you think you will see Avengers: Infinity War"
With things like "Do you think you will see Avengers something I swear." The sounds are similar, and people have already jumped ahead to the question with enough context that they always answer as if he had said "Infinity War." How conversational systems will deal with that, I have no idea.
> One of the bits that Jimmy Fallon does on the Tonight Show is an interviewer who asks people a question but uses nonsense words that sound "right" in the question. The most recent example was changing the question
Yeah, that's exactly what I'm talking about. I suspect this works because the interviewees have a model of what they think Jimmy Fallon is going to ask, and when hearing a question, if they can't make out the words or the words sound wrong, the simplest thing is to assume they misheard and think about what he was likely to have asked. To that question, the answer is fairly obvious.
We've all experienced those moments when this fails and we're caught off guard. For example, you're at work and someone you're working with says something that sounds wildly inappropriate, or out of character, or that just doesn't follow the conversation at all, and you stop and say "Excuse me? What was that?" because you think it's more likely that you heard them wrong than that your fuzzy match of what they said ("Hairy Ass?") was actually correct.
I've been using search engines since Lycos and before, so I reflexively use what you call "search engine" language. However, due to the evolution of Google, I've started experimenting with more English-like queries, because Google is getting increasingly reluctant to search for what I tell it in a mechanistic way, often just throwing away words, and from what I've read it's increasingly trying to derive information from what used to be stopwords.
One thing that is very frustrating now is if there are two or three things out there related to a query, and Google prefers one of them, it's more difficult than ever to make small tweaks to the query to get the one you want. Of course, sometimes I don't know exactly what I want, but I do know I don't want the topic that Google is finding. The more "intelligent" it becomes, the harder it is to find things that diverge from its prejudices.
Then I stop myself. Isn’t it possible that he expects Alexa to recognize a prompt that’s close enough? A person certainly would. Perhaps Dad isn’t being obstreperous. Maybe he doesn’t know how to interact with a machine pretending to be human—especially after he missed the evolution of personal computing because of his disability.
...
Another problem: While voice-activated devices do understand natural language pretty well, the way most of us speak has been shaped by the syntax of digital searches. Dad’s speech hasn’t. He talks in an old-fashioned manner—one now dotted with the staccato march of time. “Alexa, tell us the origin and, uh, well, the significance, I suppose, of Christmas,” for example.
>One of the bits that Jimmy Fallon does on the Tonight Show is an interviewer who asks people a question but uses nonsense words that sound "right" in the question.
There was a similar thing going around the Japanese Web a few years ago, where you'd try to sneak soundalikes into conversation without anybody noticing, but the example I remember vividly was replacing "irasshaimase!" ("welcome!") with "ikakusaimase!" ("it stinks of squid!").
Humans are really good at handling 'fuzzy' input. Something that is supposed to have meaning, and should if everything were heard perfectly, but which has some of the context garbled or wrong.
At least, assuming they have the proper context. Humans mostly fill in the blanks based on what they /expect/ to hear.
Maybe someday we'll get to a point where a device we classify as a 'computer' hosts something we would classify as an 'intelligent life form'; irrespective of origin. Something that isn't an analog organic processor inside of a 'bag of mostly salty water'. At that point persistent awareness and context(s) might provide that life form with a human-like meaning to exchanges with humans.
People usually judge AI systems based on superhuman performance criteria with almost no human baseline.
For example, both Google Translate and Facebook's translation system could reasonably be considered superhuman in performance, because a single system can translate between dozens of languages immediately, and does so better than any single human could across that range. Unfortunately people compare these to a collection of the best translators in the world.
So you're exactly on track: humans are heavily prejudiced against even simple mistakes that computers make, yet let consistent, continual mistakes slide when humans make them.
>Unfortunately people compare these to a collection of the best translators in the world.
I don't think that's really true. They're comparing them to the performance of a person who is a native speaker of both of the languages in question. That seems like a pretty reasonable comparison point, since it's basically the ceiling for performance on a translation task (leaving aside literary aspects of translation). If you know what the sentence in the source language means, and if you know the target language, then you can successfully translate. Translation isn't some kind of super difficult task that only specially trained translators can do well. Any human who speaks two languages fluently (which is not rare, looking at the world as a whole) can translate between them well.
> Any human who speaks two languages fluently (which is not rare, looking at the world as a whole) can translate between them well.
No, sadly, that's not the case. A bilingual person may create a passable translation, in that they can keep the main gist and all the important words in the original text will appear in the translated text in some form, but that does not automatically make it a "well-translated" text. They are frequently contorted, using syntax that hardly appears in any natural sentence, and riddled with useless or wrong pronouns.
A good translation requires more skill than just knowing the two languages.
Right, but I don't think many people expect machine translation to do better than that. And I don't think the ability to do good translations is quite as rare among bilinguals as you're making out. Even kids tend to be pretty good at it according to some research: https://web.stanford.edu/~hakuta/www/research/publications/(...
> They are frequently contorted, using syntax that hardly appears in any natural sentence, and riddled with useless or wrong pronouns.
I find it hard to believe that this is the case if the translator is a native speaker of the target language. I mean, I might do a bad job of translating a Spanish text to English (because my Spanish sucks), but my translation isn't going to be bad English, it's just going to be inaccurate.
Consider yourself lucky, then. You are speaking English, with its thousands (if not millions) of competent translators translating everything remotely interesting from other languages. The bar for "good translation" is kept reasonably high for English.
As a Korean speaker, if I walk into a bookstore and pick up any book translated from English, and read a page or two, chances are that I will find at least one sentence where I can see what the original English expression must have been, because the translator chose a wrong Korean word which sticks out like a sore thumb. Like, using a word for "(economic/technological) development" to describe "advanced cancer". Or translating "it may seem excessive [to the reader] ..." into "we can consider it as excessive ..."
And yes, these translators don't think twice about making sentences that no native speaker would be caught speaking. Some even defend the practice by saying they are faithful to the European syntax of the original text! Gah.
> Like, using a word for "(economic/technological) development" to describe "advanced cancer"
That sounds like a mistake caused by the translator having a (relatively) poor knowledge of English. A bilingual English/Korean speaker wouldn't make that mistake. I mean, I don't know your linguistic background, but you clearly know enough English and Korean to know that that's a bad translation, and you presumably wouldn't have made the same mistake if you'd been translating the book.
>Some even defend the practice by saying they are faithful to the European syntax of the original text!
I think there's always a tension between making the translation faithful to the original text and making it idiomatic. That's partly a matter of taste, especially in literature.
> A bilingual English/Korean speaker wouldn't make that mistake.
Well, "bilingual" is not black and white. I think you have a point here, but considering that people who are paid to translate can't get these stuff right, the argument veers into the territory of "no true bilingual person".
Anyway, my pet theory is that it is surprisingly hard to translate from language A to B, even when you are reasonably good at both A and B. Our brain is wired to spontaneously generate sentences: given a situation, it effortlessly generates a sentence that perfectly matches it. Unfortunately, it is not trained at all for "Given this sentence in language A, re-create the same situation in your mind and generate a sentence in language B that conveys the same meaning." In a sense, it is like acting. Everybody can laugh on their own: to convincingly portray someone else laughing is quite another matter.
Not entirely, but it is definitely possible for someone to be a native speaker of two languages, and they wouldn't make those kinds of mistakes if they were.
>They're comparing them to the performance of a person who is a native speaker of both of the languages in question.
Which is synonymous with the best translators in the world. Those people are relatively few and far between, honestly - I've traveled a lot, and I'd argue that natively bilingual people are quite rare.
Depends on which part of the world you're in. Have you been to the USA? English/Spanish bilingualism is pretty common there. And there are lots of places where it's completely unremarkable for children to grow up speaking two languages.
This is well said, but one reason this double standard is rational is that current AI systems are far worse at recovery from error than humans are. A great example of this is Alexa: if a workflow requires more than one statement to complete, and Alexa fails to understand what you say, you are at the mercy of brittle conversation logic (not AI) to recover. More often than not you have to start over or, worse yet, an incorrect action gets executed and you have to do something new to cancel it. In contrast, even humans who can barely understand each other can build understanding gradually because of the context and knowledge they share.
Our best AIs are superhuman only at tightly scoped tasks, and our prejudice encodes the utility we get from the flexibility and resilience of general intelligence.
> our prejudice encodes the utility we get from the flexibility and resilience of general intelligence
I don't think any particular human is so general in intelligence. We can do stuff related to day-to-day survival (walk, talk, eat, etc.) and then we have one or a few skills to earn a living. Someone is good at programming, another at sales, and so on - nobody is best at all tasks.
We're lousy at general tasks. General intelligence includes tasks we can't even imagine yet.
For thousands of years the whole of humanity survived with a mythical/naive understanding of the world. We can't even understand the world in one lifetime. Similarly, we don't understand our own bodies well enough, even with today's technology. I think human intelligence has remained the same; what evolved was culture, which is a different beast. During the evolution of computers, the CPUs have remained basically the same (just faster; they are all Turing machines) - what evolved was the software, culminating in present-day AI/ML.
What you're talking about is better explained by prejudice against computers based on past experience, but we're bad at predicting the evolution of computing and our prejudices are lagging.
I might use this in future critical discussions of AI. “It’s not really intelligent.” Yeah, well, neither am I. On a more serious note, it seems obvious to me that technology is incremental, and we are where we are. Given 20 more years of peacetime, we’ll be further along. When VGA 320x200x256 arrived it was dubbed photorealistic. I wonder what HN would have had to say about that.
Being able to do many things at a level below what trained humans can isn't what any reasonable person would call superhuman performance. If machine translation could perform at the level of human translators in even one pair of languages (like English-Mandarin), that would be impressive. That would be the standard people apply. But they very clearly can't.
Generally people think superhuman = better than the best humans. I understand this and it's an obvious choice, but it assumes that humans are measured on an objective scale of quality for a task, which is rarely the case. Being on the front line of deploying ML systems, I think it's the wrong way to measure it.
I think "superhuman" should be considered relative to the competence level of an average person with an average amount of training on the task. This is because, at the "business decision" level, if I am choosing between hiring a human with a few months or a year of training and a TensorFlow docker container whose performance, good or bad, is reliable, then I am going to pick the container every time.
That's what is relevant today - and the container will get better.
Well, not explicitly or in any measurable terms [1]. The term "superhuman" lacks technical depth in the sense of measurement. So for the purpose of measuring the systems we build against human capability, it's a pretty terrible measure.
> I imagine we'll probably develop some AI pidgin language that provides more formality; as long as we follow it, AI voice interactions become less annoying (though you have to spend a bit of time up front formulating your request).
There's been some tentative discussion of using Lojban as a basis for just that!
>> I imagine we'll probably develop some AI pidgin language that provides more formality; as long as we follow it, AI voice interactions become less annoying (though you have to spend a bit of time up front formulating your request).
That's what Roger Schank is advocating against in his article: artificial intelligence that makes humans think like computers, rather than computers that think like humans.
Personally, I think it's a terrible idea and the path to becoming like the Borg: a species of people who "enhanced" themselves until they lost everything that made them people.
Also: people may get stuff wrong a lot of the time (way more than 10%). But, you can always make a human understand what you mean with enough effort. Not so with a "computer" (i.e. an NLP system). Currently, if a computer can't process an utterance, trying to explain it will confuse it even further.