> Telemetry is perfectly acceptable as long as it is opt-in and does not contain personal data
Telemetry contains personal data by definition; what varies is how sensitive it is and how it's used. It's also been shown repeatedly that 'anonymized' is shaky ground.
In that popcon example, I'd expect some Debian-run server to collect a minimum of data, aggregate it, and Debian maintainers to use it to decide where to focus effort w/ respect to integrating packages, keeping up with security updates, etc. Usually ok.
For commercial software, I'd expect telemetry to slurp whatever is legally allowed / stays under users' radar (take your pick ;), the vendor keeping datapoints tied to unique IDs and selling data on "groups of interest" to the highest bidder. Not ok.
Personal preference: e.g. for a crash report, offer "report" or "skip" (default = skip), with a checkbox for "don't ask again". That way it's no effort to provide the vendor with helpful info, and just as easy to have it get out of users' way.
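A minimal sketch of that flow, CLI flavor (the prefs path and prompt wording are made up; a GUI app would use a dialog):

```python
# Sketch: opt-in crash reporting, default "skip", with a persisted
# "don't ask again" preference. Path and wording are hypothetical.
import json
import os

PREFS = os.path.expanduser("~/.example-app/crash-prefs.json")  # hypothetical path

def ask_to_report() -> bool:
    try:
        with open(PREFS) as f:
            if json.load(f).get("dont_ask_again"):
                return False                 # user opted out for good: stay out of the way
    except (OSError, ValueError):
        pass                                 # no prefs yet; ask this time
    answer = input("Send crash report? [r]eport / [S]kip: ").strip().lower()
    if input("Don't ask again? [y/N]: ").strip().lower() == "y":
        os.makedirs(os.path.dirname(PREFS), exist_ok=True)
        with open(PREFS, "w") as f:
            json.dump({"dont_ask_again": True}, f)
    return answer == "r"                     # anything but "r" defaults to skip
```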
It's annoying how consistently vendors ignore the above (even for paying customers), given how simple it is.
Why does it have to include PII by definition? I'd say DNF Counting (https://github.com/fedora-infra/mirrors-countme) should be considered "telemetry", yet it doesn't seem to collect any personal data, at least by what I understand telemetry and personal data to mean.
I'm guessing that you'd either have to be able to argue that DNF Counting isn't telemetry, or that it contains PII, but I don't see how you could do either.
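For reference, the mechanism (as I understand it, simplified; bucket boundaries here are illustrative) looks roughly like this: the client tags one metadata request it was making anyway with a coarse age bucket, and the server only tallies (week, bucket) pairs, never who sent them:

```python
# Rough sketch of a countme-style counter, loosely modeled on
# Fedora's mirrors-countme. Nothing here identifies a machine.
import time

WEEK = 7 * 24 * 3600

def countme_bucket(install_ts: float, now: float) -> int:
    """Coarse system-age bucket (1-4), not a unique ID."""
    age = now - install_ts
    if age < WEEK:
        return 1            # first week
    if age < 4 * WEEK:
        return 2            # first month
    if age < 26 * WEEK:
        return 3            # first six months
    return 4                # older than that

def tag_request(url: str, install_ts: float, last_counted_week: int) -> tuple[str, int]:
    """Append countme=N to a request we were making anyway, at most once per week."""
    this_week = int(time.time() // WEEK)
    if this_week == last_counted_week:
        return url, last_counted_week    # already counted this week: send nothing extra
    return f"{url}&countme={countme_bucket(install_ts, time.time())}", this_week
```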
Yes, so the vendor must not store it. Something along those lines is usually said in the privacy policy. If you don't trust the vendor to do that, then don't opt in to sending data, or even better, don't use the vendor's software at all.
Sometimes we have to, or simply want to, run software from developers we don't know or don't entirely trust. That just means the software developer needs to be treated as an attacker in your threat model, with mitigations to match.
I would argue that users can't inherently trust the average developer anymore. Ideas about telemetry, phoning home, conducting A/B tests and other experiments on users, and, fundamentally, making the software do what the developer wants instead of what the user wants have been thoroughly baked into many, many developers over the last 20 or so years. This is why actually taking privacy seriously has become a selling point: it stands out because most developers don't.
I can't argue that you are wrong, but I can argue that, for myself, if I don't trust a developer not to screw me over with telemetry, I can't trust them not to screw me over with their code. I can't think of a scenario where this trust isn't binary: either I can trust them (with telemetry AND code execution), or I can't trust them with either.
Could you describe what scenario I am missing?
You’re not missing anything. In general, I don’t think you can really trust the vast majority of software developers anymore. Incentives are so ridiculously aligned against the user.
If you take the next step, “do not use software from vendors you don’t trust,” you are severely limiting the amount of software you can use. Each user gets to decide for himself whether this is a feasible trade-off.
Yeah, isn't that a shame? Wouldn't it be nice if, instead of catastrophizing that telemetry data is only ever there to spy on us, we assumed that there are actually trustworthy projects out there? Especially for FOSS projects, which usually can't afford extensive in-house user testing, telemetry provides extremely valuable data on how their software is used and where it can be improved, especially in the UX department, where a lot of FOSS is severely lacking.

This thread is a perfect example of the kind of black/white thinking that says telemetry must be ripped out of software no matter what, usually based on some fundamental viewpoint that anonymity is impossible anyway, so why bother even trying. This is not helping. I usually turn on telemetry for FOSS that offers it, because I hope they will use it to actually improve the software.
Many corporate privacy policies, per their customer contracts, agree with this. Even a single packet, regardless of contents, sends the IP address, and that is considered by many companies to be PII. Not my opinion; it's in thousands of contracts. Many companies want to know every third party involved in tracking their employees. Deviating from this is a compliance violation and can lead to an audit failure and monetary credits. These policies are strictly followed on servers and less so on workstations, but I suspect with time that will change.
I can only repeat myself from above: it's about what data you store and analyze. By your definition, all internet traffic would fall under PII regulations simply because it contains IP addresses, which would be ludicrous; at least in the EU, there are very strict regulations on how this data must be handled.
If you have an nginx log and store IP addresses, then yes: that contains PII. So the solution is: don't store the IP addresses, and the problem is solved. Same goes for telemetry data: write a privacy policy saying you won't store any metadata regarding the transmission, and say what data you will transmit (even better: show exactly what you will transmit). Telemetry can be done in a secure, anonymous way. I wonder how people who dispute this even get any work done at all. By your definitions regarding PII, I don't see how you could transmit any data at all.
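To make "don't store it" concrete, a minimal sketch (event names and port are placeholders) of a telemetry receiver that aggregates in memory and never writes the peer address anywhere:

```python
# Sketch: the endpoint keeps only aggregate tallies; no per-request
# rows, and the default access log (which prints the IP) is suppressed.
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

counts: Counter = Counter()   # aggregate event counts only

class TelemetryHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        event = json.loads(body).get("event", "unknown")
        counts[event] += 1          # count the event...
        # ...and deliberately never touch self.client_address
        self.send_response(204)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass                        # suppress the built-in access log (it includes the IP)

if __name__ == "__main__":
    HTTPServer(("", 8080), TelemetryHandler).serve_forever()
```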
> By your definitions regarding PII, I don't see how you could transmit any data at all.
On the server side you would not. Your application would just do the work it was intended to do and would not dial out for anything. All resources would be hosted within the data center.
On the workstation it is up to corporate policy, and if there is a known data leak it would be blocked by the VPN/firewalls, and on corporate-managed workstations by IT setting application policies. Provided that telemetry is not coded as a blocking dependency, this should not be a problem.
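For illustration, "not a blocking dependency" might look like this sketch (URL and payload are placeholders): a fire-and-forget send with a short timeout that swallows failures, so a firewall blackholing the host doesn't break the app.

```python
# Sketch: telemetry that fails open. If the endpoint is blocked or
# down, the application carries on unaffected.
import json
import threading
import urllib.request

def send_telemetry(event: dict, url: str = "https://telemetry.example.com/v1") -> None:
    def _post():
        try:
            req = urllib.request.Request(
                url,
                data=json.dumps(event).encode(),
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req, timeout=2)   # short timeout, never retried
        except Exception:
            pass   # blocked/unreachable telemetry must not break the app

    threading.Thread(target=_post, daemon=True).start()   # fire and forget
```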
Oh, and this is not my definition. This is the definition within literally thousands of B2B contracts in the financial sector. Things are still loosely enforced on workstations, meaning that it is up to IT departments to lock things down. Some companies take this very seriously and some do not care.