>Understanding pauses/inflection changes doesn't have to be a "solved problem" to work for cases such as discerning common shopping list style items.
Okay... But Alexa isn't just shopping lists. You only know you are dealing with a shopping list after parsing the text.
Even if you did go back, is the narrower use case any more solved than the general one? Guessing from text alone turns out to be fairly accurate, so even if you could do this decently, it would have to be notably better to be worth the trouble.
>That's an argument against discerning "milk" from "silk" or "coke" from "cork", but that's still managed satisfactorily enough.
That's irrelevant here, though, since that problem has mostly been solved by now.