
What happens if I say "peanuts, butter"? I'd expect two items, but will Alexa give one?


As others have said, there's the pluralization of "peanut[s]" to distinguish between the two. This is a useful feature of English: the adjective-like role of a noun in a complex noun phrase is (almost?) always singular.

    - Computer engineer
    - NOT computers* engineer

    - Toothbrush
    - NOT teethbrush*

    - Foot doctor
    - NOT feet* doctor

    - Alarm clock
    - NOT alarms* clock, even when it supports multiple alarms!

Additionally, there's phrasal intonation. If the intonation and stress decrease throughout the phrase, it's a single item. If the intonation and stress reset for "butter," then it's a new item.

    - 'PEA ,nut but ter (one item: stress falls away across the phrase)
    - 'PEA nut 'BUT ter (two items: stress resets on "butter")


Proudfeet!


Alexa, please tell me about the...

Attorneys general

Senators elect

Ahhhhhhh!


The difference is that these are phrases with adjectives, not nouns being used adjectivally.


"Attorney generals" is a noun phrase (admittedly of questionable adjectivity). "Attorneys general" is a blind idiot translation of a phrase in a language with different grammatical rules (Latin, IIRC).


"Attorney" is a noun. "General" as used here is an adjective. It's unusual in that the adjective follows the noun without a hyphen, but it's common enough, and it's the same slot where prepositional phrases and relative clauses go, as in "big man on campus" and "powers that be".

Did ancient Romans have attorneys general?


No one ever gets a single "peanut". So unless you mush mouth the "S", the reasonable expectation for both your cohabitator and the robot is to bring peanuts and butter.

A better question is "coconut, milk" versus "coconut milk".


> A better question is "coconut, milk" versus "coconut milk".

Sure, but if you were dictating to a human that would still be an easy one for them to get wrong, depending on how long you paused.

I find this interesting with phone numbers. In some countries you hear people say "thirty three sixty two" and they mean 303602.
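To make the phone number ambiguity concrete, here is a minimal sketch that lists every digit string a spoken sequence could stand for. It assumes only two kinds of words (tens like "thirty" and single digits like "three"), and the names `TENS`, `UNITS`, and `readings` are made up for this example.

```python
# Hypothetical word tables for this sketch; teens, "hundred", "double", etc. are not handled.
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
UNITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
         "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}


def readings(words):
    """Return the set of digit strings the spoken words could mean."""
    if not words:
        return {""}
    head, rest = words[0], words[1:]
    results = set()
    if head in UNITS:
        # A lone digit word has only one reading.
        results |= {str(UNITS[head]) + tail for tail in readings(rest)}
    elif head in TENS:
        # Reading 1: the tens word stands alone ("thirty" -> "30").
        results |= {str(TENS[head]) + tail for tail in readings(rest)}
        # Reading 2: tens and the next digit fuse ("thirty three" -> "33").
        if rest and rest[0] in UNITS and UNITS[rest[0]] != 0:
            fused = str(TENS[head] + UNITS[rest[0]])
            results |= {fused + tail for tail in readings(rest[1:])}
    else:
        raise ValueError(f"unknown number word: {head!r}")
    return results


print(sorted(readings("thirty three sixty two".split())))
```

Four words already give four candidate numbers, 303602 and 3362 among them, and nothing in the words alone says which one the speaker meant.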


"Coconut" is still an anomalous grocery item. You'd want one of

- a coconut

- [number] coconuts

- shredded coconut

"Coconut" is best matched to that last option, but it's not a natural word choice. (Although it is a natural list entry... do people think of themselves as dictating to Alexa, or as writing the list themselves while happening to use their voice?)


If I'm making a list as a reminder to actually pick up items, "coconut" will suffice.


Yes, I agree. If you're writing a list for yourself, a bare "coconut" is a typical entry. But if you're dictating a shopping list to someone else, you're quite unlikely to say "coconut" because that isn't grammatical.

So it turns into a question of how people think about dictating to Alexa.


This is a good point - in reality Alexa doesn't really have to do a great job transcribing at all if it's just constructing a list as a reminder for you later.

If this is a precursor to being able to quickly voice-order stuff off Amazon to be delivered, though, it's a different story.


The one I thought of was "peanut butter M&Ms"


This is a very interesting observation. The point about speech-to-text models being biased towards the US in training data and innovation holds not only for the larger things (gender/race/religion) but also for small things like this, and these are likely to cause daily problems.


And that's what makes it interesting. "Peanut butter" versus "peanuts, butter" is easy. No one gets a single peanut.

As for the phone number, that's why anyone in a serious occupation (aviation, military, etc.) treats each digit as stand-alone.

Three-zero-three-six-zero-two.
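Reading digit by digit is easy to sketch. The helper below is hypothetical and uses plain English digit names, not any official radio alphabet; it just shows that each digit maps to exactly one word, so there is nothing to regroup.

```python
# Plain English digit names; real radiotelephony pronunciations differ and are not modeled here.
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digits(number):
    """Spell a number one digit at a time, skipping spaces, dashes, and other separators."""
    return "-".join(DIGIT_WORDS[int(ch)] for ch in number if ch in "0123456789")


print(spell_digits("303602"))
```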


>No one gets a single peanut

You'd be surprised: https://www.youtube.com/watch?v=HoPFQm9PQ_M


A relevant question is, if you asked your spouse for that, would you be angry if they came home with a jar of peanut butter?


If it correctly understands peanutS, it will classify the phrase as "more likely 2 items", since it would check everything against some sort of dictionary that contains "peanuts", "butter", and "peanut butter".

PS. I implemented something similar without machine learning, and that's how I did it. With text it's easier, though. I suppose in NLU it could have a parameter for "pause time between words", which could also contribute to a different conclusion.
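A rough sketch of that dictionary idea, not the actual implementation described above: greedy longest-match against a set of known items, with an optional pause list that blocks a merge across a long silence. The names `KNOWN_ITEMS`, `split_items`, `pauses`, and the 0.4 second threshold are all assumptions made for illustration.

```python
# Hypothetical item dictionary; a real one would be far larger.
KNOWN_ITEMS = {"peanuts", "butter", "peanut butter",
               "coconut", "milk", "coconut milk"}
MAX_WORDS = max(len(item.split()) for item in KNOWN_ITEMS)


def split_items(words, pauses=None, pause_threshold=0.4):
    """Segment lower-case words into list items by greedy longest dictionary match.

    pauses[i] is the assumed silence in seconds after words[i]; a pause at or
    above pause_threshold prevents merging the words on either side of it.
    """
    pauses = pauses or [0.0] * len(words)
    items = []
    i = 0
    while i < len(words):
        match_len = 1  # unknown words fall through as single-word items
        for n in range(min(MAX_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            # Only pauses strictly inside the candidate span count.
            crosses_pause = any(p >= pause_threshold for p in pauses[i:i + n - 1])
            if candidate in KNOWN_ITEMS and not crosses_pause:
                match_len = n
                break
        items.append(" ".join(words[i:i + match_len]))
        i += match_len
    return items


print(split_items(["peanuts", "butter"]))
print(split_items(["peanut", "butter"]))
print(split_items(["coconut", "milk"], pauses=[0.8, 0.0]))
```

With this setup the plural "peanuts" never merges because "peanuts butter" is not in the dictionary, while "coconut milk" depends entirely on the pause, which matches the point made elsewhere in the thread that it is the harder case.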



