Hacker News

Instead of collecting more data, why not do something with the data we already have? A quick look at the Fedora Bugzilla or the GNOME GitLab issues tab suggests the bottleneck isn't data collection but processing.



Apples and oranges. Bug reports are filed by a specific type of user and don't give a comprehensive view of all bugs. Statistics can also cover a lot more than bugs: "is the number of MIPS users proportional to the extra effort we need to put in to support them?" is not a data point you'll find in Bugzilla or other tickets.


Adding to this, bug reporting in the Red Hat ecosystem is an extremely painful process. Multiple accounts, no feedback, mandatory detailed system-information collection if you want to use the built-in bug reporter (which also needs an account). No automatic aggregation, processing, or association with similar issues.

It's a black box with no incentive to participate unless you're one of those specific types of users that is dedicated enough to put up with all of that, or users that have never done it before and are trying hard to contribute back.


So fix that first?


Yup, I absolutely think that should be fixed before they try this telemetry option. Anonymous bug reporting (just crash details) is probably more palatable than this proposal, but that would mean more work on the bug-report side to aggregate and filter out garbage, and they want something that is minimal effort on their side (stated directly in the discussion).


Because management can impose new data-collection policies more easily than it can fix known issues. Telemetry also gives them new, easier work to assign to engineers, which makes it seem like they are being effective. Meanwhile, it can be unclear how these metrics relate to overall software quality.

Some metrics like startup time and crash counts lead to clear improvement, while others like pointer heatmaps and even more invasive focus tracking are highly dubious in my opinion.

On a related note, I’m coming to the opinion that A/B testing is harder to pull off than many think. And serving a single user both A and B at any point can confuse them and get in the way of their trusting the consistency of the software. Much like how when you search for something twice and get different results in Apple Maps. OK, now I’m just ranting…


They moved to the CADT model twenty years ago, so the bug reports will never be read.

Now, with telemetry, they can say quantifiable things like "we've driven catastrophic root filesystem loss and permanent loss of network connectivity to 0% of installs!", and prioritize any contrary bug reports away in a data-driven, quantifiable way.

(Because, of course, weak telemetry signals are more valuable than actual humans taking the time to give you feedback on your product.)


What is "CADT"?


The "Cascade of Attention-Deficit Teenagers" development model.

As coined here (copy and paste the link into your browser if you don't want a 'surprise' from jwz) ttps://www.jwz.org/doc/cadt.html


I’ve also heard CHIT, or cascade of historically ignorant twentysomethings, for HN and startup dev culture.

Now how long until someone reinvents peer to peer networking or document databases again…


Because they will claim that bugzilla is only used by "advanced users" that are not representative of the average user of Fedora.

I absolutely detest that Catch-22 argument, which some distro (not Fedora) actually tried to use on me in the past.


> the bottleneck doesn't lie in data collection, but in processing

I created a bug report [1] for tigervnc-server in Fedora because the Fedora documentation [2] for setting up a VNC server no longer matched what was coming from dnf.

In the bug report I provided the info that would need to be fixed in the documentation. Now after two months, seemingly nothing has been done to fix the situation.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2193384

[2] https://docs.fedoraproject.org/en-US/fedora/latest/system-ad...


If anything, in my experience there often appears to be a negative correlation between increased data collection and product quality.

I figure it must be due to an abdication of responsibility-- absent information, the product must at least appeal to someone working on it who is making decisions about what is good and what isn't, and so it will also appeal to people who share their preferences. But with the power of DATA we can design products for the 'average user' which can be a product that appeals to no single person at all!

Imagine that you were making shirts. To appeal to the greatest number of people, you make a shirt sized for the average person. But if the distribution of sizes is multimodal or skewed, the mean may be a size that fits few people, or even no one at all. You would have done better picking a random person from the factory and making shirts that fit them.
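The shirt example can be sketched numerically. This is a toy illustration with invented sizes: two clusters of people, neither anywhere near the overall mean, so a shirt cut to the average fits nobody while a shirt cut to one randomly chosen person fits their whole cluster.

```python
import statistics

# Hypothetical chest sizes in cm, drawn from two clusters (bimodal).
sizes = [88, 90, 92, 88, 90, 110, 112, 114, 110, 112]

mean_size = statistics.mean(sizes)

# Count how many people a shirt fits, allowing +/- 3 cm of slack.
def fits(target, population, slack=3):
    return sum(1 for s in population if abs(s - target) <= slack)

fits_mean = fits(mean_size, sizes)    # shirt cut to the average
fits_random = fits(sizes[0], sizes)   # shirt cut to one actual person

print(f"mean size: {mean_size}")              # 100.6
print(f"average shirt fits: {fits_mean}/10")  # 0/10
print(f"random-person shirt fits: {fits_random}/10")  # 4/10
```

With these made-up numbers the "average" shirt fits zero of ten people, while the shirt fitted to an arbitrary individual fits everyone in that person's cluster.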

When your problem has many dimensions like system functionality, the number of ways you can target an average but then fit no one as a result increases exponentially.

Pre-corporatized open source usually worked like fitting the random factory worker: developers made software that worked for them. It might not be great for everyone, but it was great for people with similar preferences. If it didn't fit you well you could use a different piece of software.

In corporatized open source, huge amounts of funding go into particular solutions, and they end up tightly integrated. Support for alternatives is defunded (or just eclipsed by better-funded but soulless rivals). You might not want to use GNOME, but if you use KDE, you may find Fedora's display subsystem crashes any time you let your monitor go to sleep, or find yourself unable to configure your network interfaces (to cite some real examples of problems my friends have experienced). You end up stuck spending your life essentially creating your own distribution, rather than saving the time you hoped to save by running one made by someone else.

Of course, people doing product design aren't idiots, and some places make an effort to capture multimodality through things like targeting "personas"-- which are inevitably stereotyped, patronizing, and oversimplified (like assuming a teacher can't learn to use a command prompt or a bug tracker). Or through user studies, but these are almost always done with very unrepresentative users-- people with nothing better to do than get paid $50 to try something out-- so you learn only about the experience of people with no experience and no real commitment or purpose to their usage (driving you to make an obscenely dumbed-down product). Or through things like telemetry, which even at its best will fail to capture things like "I might not use the feature often, but it's a huge deal in the rare events I need it," or get distorted by the preferences of day-0 users, a large percentage of whom will decide the whole thing isn't for them no matter what you do.

So why would non-idiots do things that don't have good results? As sibling posts note, people are responding to the incentives in their organizations which favor a lot of wheel spinning on stuff that produces interesting reports. People wisely apply their efforts towards their incentives-- their definition of a good result doesn't need to have much relation to any external definition of good.



