> Telemetry is perfectly acceptable as long as it is opt-in and does not contain personal data
Telemetry contains personal data by definition; what varies is how sensitive it is and how it's used. It's also been shown repeatedly that 'anonymized' is shaky ground.
In that popcon example, I'd expect some Debian-run server to collect a minimum of data, aggregate it, and Debian maintainers to use it to decide where to focus effort w/ respect to integrating packages, keeping up with security updates, etc. Usually ok.
For commercial software, I'd expect telemetry to slurp whatever is legally allowed / stays under users' radar (take your pick ;), the vendor keeping datapoints tied to unique IDs and selling data on "groups of interest" to the highest bidder. Not ok.
Personal preference: e.g. for a crash report, offer "report" or "skip" (default = skip), with a checkbox for "don't ask again". That way it's no effort to provide the vendor with helpful info, and just as easy to have it get out of users' way.
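A minimal sketch of that flow, CLI flavor (the prefs path and prompt wording are made up; a GUI app would use a dialog):

```python
# Sketch: opt-in crash reporting, default "skip", with a persisted
# "don't ask again" preference. Path and wording are hypothetical.
import json
import os

PREFS = os.path.expanduser("~/.example-app/crash-prefs.json")  # hypothetical path

def ask_to_report() -> bool:
    try:
        with open(PREFS) as f:
            if json.load(f).get("dont_ask_again"):
                return False                 # user opted out for good: stay out of the way
    except (OSError, ValueError):
        pass                                 # no prefs yet; ask this time
    answer = input("Send crash report? [r]eport / [S]kip: ").strip().lower()
    if input("Don't ask again? [y/N]: ").strip().lower() == "y":
        os.makedirs(os.path.dirname(PREFS), exist_ok=True)
        with open(PREFS, "w") as f:
            json.dump({"dont_ask_again": True}, f)
    return answer == "r"                     # anything but "r" defaults to skip
```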
It's annoying how consistently vendors ignore the above (even for paying customers), given how simple it is.
Why does it have to include PII by definition? I'd say DNF Counting (https://github.com/fedora-infra/mirrors-countme) should be considered "telemetry", yet it doesn't seem to collect any personal data, at least by what I understand telemetry and personal data to mean.
I'm guessing that you'd either have to be able to argue that DNF Counting isn't telemetry, or that it contains PII, but I don't see how you could do either.
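For reference, the mechanism (as I understand it, simplified; bucket boundaries here are illustrative) looks roughly like this: the client tags one metadata request it was making anyway with a coarse age bucket, and the server only tallies (week, bucket) pairs, never who sent them:

```python
# Rough sketch of a countme-style counter, loosely modeled on
# Fedora's mirrors-countme. Nothing here identifies a machine.
import time

WEEK = 7 * 24 * 3600

def countme_bucket(install_ts: float, now: float) -> int:
    """Coarse system-age bucket (1-4), not a unique ID."""
    age = now - install_ts
    if age < WEEK:
        return 1            # first week
    if age < 4 * WEEK:
        return 2            # first month
    if age < 26 * WEEK:
        return 3            # first six months
    return 4                # older than that

def tag_request(url: str, install_ts: float, last_counted_week: int) -> tuple[str, int]:
    """Append countme=N to a request we were making anyway, at most once per week."""
    this_week = int(time.time() // WEEK)
    if this_week == last_counted_week:
        return url, last_counted_week    # already counted this week: send nothing extra
    return f"{url}&countme={countme_bucket(install_ts, time.time())}", this_week
```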
Yes, so the vendor must not store it. Something along those lines is usually said in the privacy policy. If you don't trust the vendor to do that, then don't opt in to sending data, or even better, don't use the vendor's software at all.
Sometimes we have to, or simply want to, run software from developers we don't know or don't entirely trust. That just means the software developer needs to be treated as an attacker in your threat model, with mitigations to match.
I would argue that users can't inherently trust the average developer anymore. Ideas about telemetry, phoning home, conducting A/B tests and other experiments on users, and, fundamentally, making the software do what the developer wants instead of what the user wants have been thoroughly baked into many, many developers over the last 20 or so years. This is why actually taking privacy seriously has become a selling point: it stands out because most developers don't.
I can't argue that you are wrong, but I can argue that, for myself, if I don't trust a developer not to screw me over with telemetry, I can't trust them not to screw me over with their code. I can't think of a scenario where this trust isn't binary: either I can trust them (with telemetry AND code execution), or I can't trust them with either.
Could you describe what scenario I am missing?
You’re not missing anything. In general, I don’t think you can really trust the vast majority of software developers anymore. Incentives are so ridiculously aligned against the user.
If you take the next step, “do not use software from vendors you don’t trust,” you are severely limiting the amount of software you can use. Each user gets to decide for himself whether this is a feasible trade-off.
Yeah, isn't that a shame? Wouldn't it be nice if, instead of catastrophizing that telemetry data is only ever there to spy on us, we assumed that there are actually trustworthy projects out there? Especially for FOSS projects, which usually can't afford extensive in-house user testing, telemetry provides extremely valuable data on how their software is used and where it can be improved, especially in the UX department, where a lot of FOSS is severely lacking.

This thread is a perfect example of the kind of black/white thinking that says telemetry must be ripped out of software no matter what, usually based on some fundamental viewpoint that anonymity is impossible anyway, so why bother even trying. This is not helping. I usually turn on telemetry for FOSS that offers it, because I hope they will use it to actually improve the software.
Many corporate privacy policies, per their customer contracts, agree with this. Even a single packet, regardless of contents, sends the IP address, and that is considered by many companies to be PII. Not my opinion; it's in thousands of contracts. Many companies want to know every third party involved in tracking their employees. Deviating from this is a compliance violation and can lead to an audit failure and monetary credits. These policies are strictly followed on servers and less so on workstations, but I suspect with time that will change.
I can only repeat myself from above: it's about what data you store and analyze. By your definition, all internet traffic would fall under PII regulations simply because it contains IP addresses, which would be ludicrous; at least in the EU, there are very strict regulations on how this data must be handled.
If you have an nginx log and store IP addresses, then yes: that contains PII. So the solution is: don't store the IP addresses, and the problem is solved. Same goes for telemetry data: write a privacy policy saying you won't store any metadata regarding the transmission, and say what data you will transmit (even better: show exactly what you will transmit). Telemetry can be done in a secure, anonymous way. I wonder how people who dispute this even get any work done at all. By your definitions regarding PII, I don't see how you could transmit any data at all.
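To make "don't store it" concrete, a minimal sketch (event names and port are placeholders) of a telemetry receiver that aggregates in memory and never writes the peer address anywhere:

```python
# Sketch: the endpoint keeps only aggregate tallies; no per-request
# rows, and the default access log (which prints the IP) is suppressed.
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

counts: Counter = Counter()   # aggregate event counts only

class TelemetryHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        event = json.loads(body).get("event", "unknown")
        counts[event] += 1          # count the event...
        # ...and deliberately never touch self.client_address
        self.send_response(204)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass                        # suppress the built-in access log (it includes the IP)

if __name__ == "__main__":
    HTTPServer(("", 8080), TelemetryHandler).serve_forever()
```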
> By your definitions regarding PII, I don't see how you could transmit any data at all.
On the server side you would not. Your application would just do the work it was intended to do and would not dial out for anything. All resources would be hosted within the data center.
On the workstation it is up to corporate policy, and if there is a known data leak it would be blocked by the VPN/firewalls, and on corporate-managed workstations by IT setting application policies. Provided that telemetry is not coded as a blocking dependency, this should not be a problem.
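For illustration, "not a blocking dependency" might look like this sketch (URL and payload are placeholders): a fire-and-forget send with a short timeout that swallows failures, so a firewall blackholing the host doesn't break the app.

```python
# Sketch: telemetry that fails open. If the endpoint is blocked or
# down, the application carries on unaffected.
import json
import threading
import urllib.request

def send_telemetry(event: dict, url: str = "https://telemetry.example.com/v1") -> None:
    def _post():
        try:
            req = urllib.request.Request(
                url,
                data=json.dumps(event).encode(),
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req, timeout=2)   # short timeout, never retried
        except Exception:
            pass   # blocked/unreachable telemetry must not break the app

    threading.Thread(target=_post, daemon=True).start()   # fire and forget
```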
Oh, and this is not my definition. This is the definition within literally thousands of B2B contracts in the financial sector. Things are still loosely enforced on workstations, meaning that it is up to IT departments to lock things down. Some companies take this very seriously and some do not care.