This classification is very useful for discussing this issue.

The difference between 3 and 4, noble as it is, can come down to feasibility concerns that push people into 3, not just ignorance of the privacy impact. Human labelling of training data sets is a big thing in supervised learning. Methods that dispense with it would be valuable for purely economic reasons beyond privacy - the cost of human labelling of data samples. Yet we don't have them!

Techniques like federated learning or differential privacy can train models on opaque (encrypted or unavailable) data. This is nice, but they assume too much: that the data is already validated and analyzed. In real-life modelling problems, one starts with an exploratory data analysis, the first step of which is looking at data samples. Opaque encrypted datasets also stop ML engineers from doing error analysis (looking at your errors to better target model/dataset improvements), which is an even bigger issue, IMO, as error analysis is crucial when iterating on a model.
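For concreteness, a minimal sketch of what I mean by error analysis, assuming a scikit-learn-style classifier and made-up variable names - the point is that it only works if you can read the raw samples:

    # Pull the most confidently wrong predictions for manual review.
    # This requires access to the raw inputs, which an opaque or
    # encrypted dataset prevents.
    import numpy as np

    def worst_errors(model, X, y, raw_samples, k=20):
        probs = model.predict_proba(X)            # shape (n_samples, n_classes)
        preds = probs.argmax(axis=1)
        wrong = np.where(preds != y)[0]
        # Rank the mistakes by the model's confidence in the wrong answer.
        confidence = probs[wrong, preds[wrong]]
        order = wrong[np.argsort(-confidence)][:k]
        return [(raw_samples[i], int(y[i]), int(preds[i]), float(probs[i, preds[i]]))
                for i in order]

    # for sample, truth, pred, conf in worst_errors(clf, X_val, y_val, raw_val):
    #     print(f"{conf:.2f}  predicted {pred}, actually {truth}: {sample}")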

Even for a model that's already in production, one has to do maintenance work like checking for concept drift, which I can't see how to do on an opaque dataset.
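(To be concrete, a toy sketch of the kind of drift check I mean - a two-sample KS test comparing training-time and live distributions of a feature or of the model's scores; the threshold is arbitrary. Acting on the flag still means going back and looking at recent samples:)

    # Toy concept-drift check: flag when the live distribution of a feature
    # (or of the model's output scores) has shifted away from training time.
    from scipy.stats import ks_2samp

    def drifted(train_values, live_values, alpha=0.01):
        stat, p_value = ks_2samp(train_values, live_values)
        return p_value < alpha, stat

    # shifted, stat = drifted(train_scores, last_week_scores)
    # if shifted:
    #     print(f"distribution shift (KS statistic {stat:.3f}); inspect recent samples")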



It's not wrong for humans to label training data. It's wrong to let humans listen to voice recordings that users believed would be between them and a computer. The solutions are obvious: sell the things with a big sticker that says, "don't say anything private in earshot"; revert to old-fashioned research methods where you pay people to participate in your studies and get their permission; or ask people for permission to send in mis-heard commands, like how Ubuntu asks me if I want to send them my core dumps.


> ask people for permission to send in mis-heard commands

Note that you also want the "correctly" heard commands, because some of them will actually have been misheard. It's frustrating when an assistant gives the "I don't know how to do that" response, but it's even more frustrating to get "OK, doing (the wrong thing)".

Also, another alternative: provide an actual bug reporting channel. "Hey Google, report that as a bug" "Would you like to attach a transcript of the recent interaction? Here's what the transcript looks like." "Yes."


To be fair, the system already has something like that. If you complain to the Home, it'll ask if you want to provide feedback and give you a few seconds to verbally explain what went wrong.

I'm not sure if humans will then review that feedback or if it goes through a speech-to-text algorithm first, but the mechanism for feedback is there.


Yeah, I think I've experienced that. I was driving with Maps directions, and while I was driving, Google decided to show me new things Maps can do.

I tried to voice my way back to directions, unsuccessfully. I said "Fuck you Google."

"I see that you're upset," followed by some instructions on how to give feedback. While I was driving. It sounded almost exactly like "I'm sorry Dave, I can't help you."


iOS voicemail transcription has this.


> like how Ubuntu asks me if I want to send them my core dumps

While I like how Ubuntu does it, I actually like how Fedora does it even better. Not only do they ask to submit core dumps, but they also give you the ability to annotate and inspect what gets sent, as well as a bug report ID which you can use to follow up.


Agreed. I'd like to support Ubuntu development, and I often run it on bleeding-edge hardware I'd like to submit crash reports for, but the inability to sanitise the data means I don't, unless it's a "fresh" device.


Just give participants the choice to opt in for a chance to get early access to new products. Make it invite only to feel exclusive. They will have millions of willing test subjects.


Good point, there's precedent from hospitals w.r.t. IRBs and other infrastructure involved with data gathering. Hospitals/research institutions self-regulate in this regard; it doesn't appear tech does.


Handling the data in an ethical way doesn't need to mean handling the data in a completely anonymous fashion. That would be one solution, but you can also create a trust-based system for how the data being labeled is handled, similar to HIPAA. In addition, there are simple operational methods that could help ensure the data is processed as close to anonymously as possible. For example, with voice data you could filter the voices, work with the data in segments, and ensure that metadata for the samples is only accessible to trusted individuals certified under the above framework.
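A rough sketch of what that kind of pipeline could look like, assuming librosa/soundfile for the audio handling; the parameters and paths are illustrative, not a production design:

    # Rough sketch: pitch-shift a clip and split it into short segments so a
    # single reviewer never hears a full, identifiable voice recording.
    # Metadata (user id, device, timestamps) would live in a separate,
    # access-controlled store keyed only by the opaque segment id.
    import uuid
    import librosa
    import soundfile as sf

    def anonymize_clip(path, n_steps=3, segment_seconds=4):
        audio, sr = librosa.load(path, sr=None)
        shifted = librosa.effects.pitch_shift(audio, sr=sr, n_steps=n_steps)
        seg_len = int(segment_seconds * sr)
        segment_ids = []
        for start in range(0, len(shifted), seg_len):
            seg_id = uuid.uuid4().hex
            sf.write(f"segments/{seg_id}.wav", shifted[start:start + seg_len], sr)
            segment_ids.append(seg_id)
        return segment_ids  # the mapping back to the source clip stays with trusted staff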


In trust-based systems like HIPAA or clearances, there is a fundamental requirement of two conditions to access data: privilege, and the need to know. Taking data and mining it for valuable insights isn't a "need to know", it's a "need to discover something unknown". This is where the security breaks down. In a conventional HIPAA system, only your doctor needs to access your info. You don't have to worry about some other doctors accessing your information in bulk to try and conduct a study on cancer rates. They don't NEED to know your info, they just WANT to know. When you WANT to know how to accurately fingerprint people by their voice, then obfuscating it is counterproductive.


>You don't have to worry about some other doctors accessing your information in bulk to try and conduct a study on cancer rates.

This not only happens, it's my job (though I'm not a doctor). Of course, it's tightly controlled on my end. I work for the government, but health systems have their own analysts. As part of my job, I have access to sensitive and identifying information.

This isn't to be contrarian. There are existing systems using very personal data in bulk for analysis. The wheel doesn't need to be reinvented.


Is it feasibility, or just laziness?

My car has a little blurb that explains that they collect data to use for training and gives me the choice to participate or not. Opting out doesn’t affect any functionality. Why can’t Google do the same thing?


That should never be an opt-out. It is both ethically and in some regions legally required to be opt-in.


Or just an opt, where you have to make a choice during setup.


Because Google's first allegiance is to its shareholders, and data has value, so it's not in their best interest to make it easy not to share your data.


The shareholder value theory is rubbish, because it has no predictive or descriptive power to explain why one decision was made over another.

I can just as easily say that the best way to maximize shareholder value is to minimize public scandal, scrutiny, and the potential for legislation.

Nearly every single decision, including contradictory ones, made by every single company, everywhere, can be retroactively justified to have been done in the name of shareholder value.


> I can just as easily say that the best way to maximize shareholder value is to minimize public scandal, scrutiny, and the potential for legislation.

Scandals can get free marketing; see, for example, Nike and Colin Kaepernick. For a business, attention is always better than no attention at all. Every single decision is made to increase profit, but there might be many things that need to be accomplished first, so it's hard to see the big picture. For example, a developer might want to improve a feature because they want more people to use their product. A manager gets approval to pay that developer because the investment is deemed a profitable one. What does the person who gave them that money care about the number of users? It's not their invention and they don't even use the service. They give the money because they know that more users = more market share = more ads to sell = a return greater than the initial investment. Until a business can run with people working for free, the person paying for things always dictates what is bought, and thus the direction the company is headed.

Let's say that direction is contrary to the direction another prominent member of the business wants it to go. Whether you want to believe it or not, the same calculus goes on in every person's mind: is the potential payoff of Option A greater than the potential loss of Option B, given the risk?


This is a wonderfully condescending response, but it answers nothing. The question was: why can’t Google do it differently? This doesn’t answer the question. We can plainly see this from the fact that other companies, operating under the same conditions you describe, make different choices.

This is the business equivalent of saying “because physics.” It’s not wrong, it’s just not useful.


Sorry, I didn't mean to be condescending. To answer your question: the reason Google can't do things differently is that they have already established themselves first and foremost as an advertising company, and the best way to do that is to know their audience very intimately. Other businesses like Apple established themselves as hardware companies first, so they aren't as dependent on user data; Apple took advantage of that and positioned itself as the "secure" phone. Google is too large and makes too much money from its core business, which is ad-driven. As long as search and ads are their cash cow, they cannot change in the way you hope.


Right! All the companies doing it differently are also trying to satisfy their shareholders.


That's what is so great about capitalism. If one company starts to take advantage of its users for profit, it opens up a niche for another company to take a different approach.


No it’s not.

Google has many primary concerns it needs to manage. That’s how you get big - by managing lots of concerns successfully.

If they drop one too long, they start going backwards very quickly.


Then explain why they changed to Alphabet. Shareholders were sick of things like Project Loon siphoning cash from Google search. You are extremely naive if you think there are many concerns of higher importance than profit. Everything else is about maintaining and growing profit, even if that means doing an ad campaign convincing people you are fighting the good fight... for profit.


> Then explain why they changed to Alphabet. Shareholders were sick of things like Project Loon siphoning cash from Google search.

Alphabet is still spending billions from Google on "other bets" like Loon, so I don't see how this explains the change.


Because now they have to report to their shareholders where the money is going, so that if the board doesn't like it, they can replace the CEO. Before, since it was all Google, the money went where they said it went; there was no oversight. They had this massive R&D budget that was opaque to investors. Money that could have been paid to shareholders as a dividend or other return was instead spent on projects they had no idea about.



