Sure, but if there's a law forbidding you to exceed a particular speed on the road, you can't just break it and say "you can't be perfectly safe anyway".
The analogy here is: there are laws regarding confidentiality that were probably broken here.
I agree. But at the same time, insisting on 100% certainty/safety would mean never doing anything at all and sticking to the status quo forever. It boils down to cost-benefit calculations.
While I agree that it is unacceptable to use customer data without consent (as suggested by OP's post), I disagree with the implicit assumption behind the comment that I responded to:
Namely, the assumption that human/biological intelligence/agents are somehow superior to artificial intelligence/agents when it comes to secrecy and confidentiality.
It boils down to the question of whether it's possible to create algorithms that outperform humans at tasks involving secrecy and confidentiality.
While I can't think of any reason why that should not be possible in general, I agree that the current SOTA of generative LLMs is not sufficient for that.
Is throwing lots and lots of data and RLHF training at an LLM enough to make the probability of customer-data leaks acceptably small?
I don't know. But I don't trust MBAs who salivate with dollar signs in their eyes to know either. And I fear that their lack of technical understanding will lead to bad decisions, and that those might lead to scandals that make Gemini's weird image-generation biases pale in comparison.
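To make "RLHF training" concrete for other readers: at its core sits a reward model fitted to human preference pairs, roughly as in this toy sketch (the standard pairwise Bradley-Terry loss; the linear scorer and random embeddings are placeholders, not any vendor's actual pipeline):

```python
# Toy sketch of the reward-modelling step at the heart of RLHF: fit a
# scalar reward so preferred responses score above rejected ones.
# Real pipelines use an LLM backbone instead of this linear stand-in.
import torch
import torch.nn.functional as F

reward_model = torch.nn.Linear(768, 1)  # placeholder for an LLM-based scorer
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

def reward_step(emb_chosen, emb_rejected):
    """One update on a batch of preferred/rejected response embeddings."""
    margin = reward_model(emb_chosen) - reward_model(emb_rejected)
    # Pairwise loss: small when the chosen response out-scores the rejected one.
    loss = -F.logsigmoid(margin).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

print(reward_step(torch.randn(4, 768), torch.randn(4, 768)))  # toy data
```

Note that this loss only shifts scores statistically; nothing in it hard-guarantees that a specific customer record can never be emitted, which is why the question above has no obvious answer.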
Yes, the user bears the cost when their confidential data is leaked, while the company derives the economic benefit from mishandling it, which is why this keeps happening.
I used to work with extremely sensitive data. My employer made it a point to hire people with memory disorders and intellectual disabilities to deal with raw data.
There was a young lady I had to reintroduce myself to every week or so. I think of her every so often.
Parent links to Yann LeCun's "Text Understanding from Scratch" paper from 2015, where the authors use a conv-net, originally built for image recognition, to do text categorisation.
The NN technique falls squarely into the "counting words" bracket, although this one actually counts characters.
It is a great paper with great results, but none of the models therein have an opinion on ISIS, an ability to converse, or anything the author of TFA would call cognition.
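For anyone who hasn't read it: the core idea is just to one-hot encode characters and run 1-D convolutions over them. A minimal PyTorch sketch (the alphabet, sizes, and depth here are illustrative; the paper's models use a 70-character alphabet, longer inputs, and deeper stacks):

```python
# Minimal character-level conv-net text classifier in the spirit of
# "Text Understanding from Scratch" (Zhang & LeCun, 2015).
import torch
import torch.nn as nn

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,;:!?'\"-"
MAX_LEN = 256  # illustrative; the paper uses much longer inputs

def encode(text: str) -> torch.Tensor:
    """One-hot encode characters; unknown characters stay all-zero."""
    x = torch.zeros(len(ALPHABET), MAX_LEN)
    for i, ch in enumerate(text.lower()[:MAX_LEN]):
        j = ALPHABET.find(ch)
        if j >= 0:
            x[j, i] = 1.0
    return x

class CharCNN(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(len(ALPHABET), 64, kernel_size=7), nn.ReLU(),
            nn.MaxPool1d(3),
            nn.Conv1d(64, 64, kernel_size=3), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),  # global max-pool over positions
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):  # x: (batch, alphabet, length)
        return self.classifier(self.features(x).squeeze(-1))

model = CharCNN(num_classes=4)  # e.g. four topic labels
logits = model(encode("convnets can categorise raw text").unsqueeze(0))
```

Nothing in that pipeline converses or holds opinions; it maps character patterns to a class label, which is exactly the point.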
Once you have identified the most significant individuals in a field, you can use Google Scholar alerts to get notified when they publish something new (http://scholar.google.com/scholar_alerts?view_op=list_alerts). In the case of machine learning that would be Geoffrey Hinton, Yann LeCun, Yoshua Bengio, Andrew Ng (this list is not exhaustive, of course).
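If you'd rather poll programmatically than rely on e-mail alerts, the third-party scholarly package is a rough equivalent. A sketch, assuming its current search_author/fill API (an unofficial scraper, so expect rate limits and occasional breakage):

```python
# Sketch: poll Google Scholar profiles with the unofficial "scholarly"
# package (pip install scholarly) as an alternative to e-mail alerts.
# This scrapes Scholar, so the API and behaviour may change without notice.
from scholarly import scholarly

AUTHORS = ["Geoffrey Hinton", "Yann LeCun", "Yoshua Bengio", "Andrew Ng"]

for name in AUTHORS:
    # Take the first profile match; verify manually for ambiguous names.
    author = scholarly.fill(next(scholarly.search_author(name)))
    titles = [pub["bib"]["title"] for pub in author["publications"]]
    # Diff against titles saved from the previous run to spot new papers.
    print(f"{name}: {len(titles)} publications on record")
```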
Sometimes the underlying processes an application is designed for are too complex to be self-explanatory. Look at Photoshop, for example: it would not work without tutorials.
I'm not arguing against any help; I'm simply saying that if something is easy enough to explain with an overlay, then it's probably possible to make the interface intuitive enough to not need one.