In my opinion, the whole idea of third-party analytics systems as they are currently implemented is incompatible with the idea of privacy. As soon as more than one site uses the same analytics system, the company running it has aggregate data that the user did not consent to.
As a site owner, it is disingenuous to show your users a banner that says "This site uses cookies to allow us to maintain state" while leaving out the part about "This site uses Google Analytics, which allows Google to track you across sites and build a nearly complete picture of every page you browse".
The smaller hosted analytics systems claim to be more privacy oriented but I don't trust them either. Even with the best of intentions, sooner-or-later they will be sold or pivot. The data is just too valuable.
I know that some sites live and die by close reading of analytics but most sites don't need anything as elaborate. I implemented a very simple page counter that I host myself but that is just for vanity. Most sites don't even need that.
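A self-hosted counter of that kind can be very small. The following is only a minimal sketch in Python, not the poster's actual implementation; the names `PageCounter`, `record`, and `top` are hypothetical. It assumes the counter keeps nothing but a per-path tally in memory, with no IP address, cookie, or user agent stored.

```python
from collections import Counter


class PageCounter:
    """In-memory hit counter keyed by path only (hypothetical example)."""

    def __init__(self) -> None:
        self.hits: Counter = Counter()

    def record(self, path: str) -> None:
        # Drop the query string so tracking parameters are not retained.
        self.hits[path.split("?", 1)[0]] += 1

    def top(self, n: int = 10) -> list:
        # Most-visited paths first, as (path, count) pairs.
        return self.hits.most_common(n)


counter = PageCounter()
for requested in ["/about?utm_source=x", "/about", "/contact"]:
    counter.record(requested)
```

Calling `counter.top(1)` after those three requests gives the single busiest path and its count. Persisting the tally to disk is left out of the sketch.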
I can only speak to Fathom (since that's my product), but we have no personal data to sell if we sold our product. We collect nothing about website visitors other than a tick in the aggregate data. So if we sold our business, there's zero personal information from website visitors on any of our customers' dashboards that could be used against them.
That's a great attitude but even assuming that I trust you (I mean, I am sure you are a great guy and all, but I don't know you), that is what you deliberately store and allow customers to access. Your log files contain IP addresses and agent strings, enough information to identify people with fairly high accuracy. What do you do with them?
Google Analytics uses first-party cookies and can be configured to not share data with other Google services. IIRC, that's even the default setting. That's nowhere near allowing Google to track you across sites. Google's advertising cookies track you across sites, though.
Writers have pretty well 0 (and that's a big zero) control over what sales and advertising practices are used by the company they are writing for, either as staff or freelance. But they can write what they want as long as it's approved by editorial.
Likewise, the sales teams have pretty well 0 (and that's a big zero) control over published editorial content. But they have a lot of pull in implementing any and all kinds of analytics, a/b testing, and so on.
Without commenting on Google Analytics itself, why on earth would we wait until after something bad happens? If real risks can be identified, then one should think seriously about whether they outweigh the gains.
If the gains are worth the risk, then that's fine, but there's nothing "scaremongering" about pointing out things that could go wrong. That's at least half of engineering as a discipline.
That's a good point and I would like to read a good risk analysis. The problem is that when large companies are involved, you get very superficial risk analysis. Something analogous to:
"The worst thing that could happen if I put my money in a bank is they take all my money. So, we shouldn't put money in banks."
If you point out that this generally doesn't happen, people will talk about what the company could do.
How do we get beyond fully general arguments that you can never trust any company to do anything (or alternatively that theoretical risks don't matter)? I guess to really understand the risk you'd have to understand a company's internal controls, and those generally aren't public.
Many define the harm as the loss of privacy itself.
In the case of analytics, this would be particularly egregious: the user gives up privacy to a third party, generally without knowledge and with no direct benefit.
See, this is treating "loss of privacy" as an abstraction meaning "any information somehow derived, no matter how indirectly, from my behavior." That's something I fundamentally disagree with. When people get upset about it then I have a hard time taking it seriously.
If I log all the hits to my website and generate aggregate reports based on that, there's no personal information disclosed in the reports. Nobody is harmed and they have no case, as far as I'm concerned. It's the same if I'm outsourcing to Google.
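To make the "aggregate reports" case concrete, here is a rough sketch in Python of reducing an access log to per-day, per-path hit counts. It is an illustration under stated assumptions, not anyone's production setup: it assumes the server writes Common Log Format lines, and the names `LOG_LINE` and `aggregate_hits` are made up for the example. The IP address is parsed past but never stored in the result.

```python
import re
from collections import Counter

# Assumed input: Common Log Format, e.g.
# 203.0.113.7 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def aggregate_hits(lines):
    """Count hits per (day, path); the client address is discarded."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        day = match.group("ts").split(":", 1)[0]  # keep the date, drop the time
        counts[(day, match.group("path"))] += 1
    return counts


sample = [
    '203.0.113.7 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '198.51.100.4 - - [10/Oct/2000:14:02:11 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'not a log line',
]
report = aggregate_hits(sample)
```

The report itself holds only dates, paths, and counts. Whether the raw log lines are retained afterwards is a separate decision that this sketch does not address.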
I've heard this argument and, sure, some people just don't care. Privacy. Shmivacy.
But, the thing is the example you gave is just a matter of degree. I mean, if you sold the logs with PII, location info, etc, then that would be near the other end of the privacy continuum. And, since you explicitly provided an example that included aggregate vs personal info, then I assume even you would care about the latter.
So, the question is, at what point along the continuum do you begin to care? Following that, surely you'd allow that others may choose a different point and that, given that it's their privacy, maybe their concerns shouldn't be summarily dismissed?
But, beyond that, the example you gave is invalid. There is a qualitative difference between you logging information about visitors who have chosen to visit your site, versus participating in a scheme to provide that information to a third party who your visitor did not choose to visit. This is especially so when that third party is then able to compile a dossier on that visitor's broader behavior, based on other site-owners who participate.
Add to this that Google is frequently able to personally identify that visitor via their Google login.
Rather than turn this into a one-bit argument (people who either care or don't care about privacy), I would rather talk about privacy risks for vulnerable populations, like activists or people who have crazy exes. A privacy risk is when data could be disclosed to someone who could use it against you.
I'm not sure any of them are threatened by Google Analytics? Google's terms of use prohibit uploading PII to Google Analytics, and it appears they upload data to a separate domain [1], so at least if you don't enable the DoubleClick cookie (which, if you're responsible you shouldn't do), they don't seem to have a way to correlate Google Analytics data with Google logins.
But I only did a shallow investigation. Show me the risk and threat model and I'll care.
An issue with other analytics is that it's hard to track Google Ads performance, as Google Analytics is tightly integrated with Google Ads, which lets us know what keywords got signups etc.
Google is embedded like a tick across the internet and getting rid of it is really really hard.
There's always something rich about these articles being posted online. I promise you the author of the article has no control over the business decisions Fast Company is making. While he ethically believes what he's writing, he also knows how completely impossible it is to actually change the business's mind.
The only way a business changes is when there's a detectable impact on their income.
"The only way a business changes is when there's a detectable impact on their income."
Or when they're legally required to.
Or when new upper management comes in with a different philosophy or approach (Microsoft's new CEO is often credited here on HN with changing Microsoft's attitude towards open source).
Or when the company's employees protest effectively enough (see the recent Google employee revolts).
I do this too. The only downside seems to be that I get caught in recaptchas constantly now. Google seems to know how to make tracker avoidance painful enough that I doubt most would follow through with it after a few weeks.
I read somewhere that selecting a bit of text before clicking the "I'm not a robot" checkbox helps to avoid the Recaptcha popup. I've started doing this and think I've been seeing them far less often, but of course it's possible I'm fooling myself.
And I guess it will only be a matter of time before robots will start doing this too, so please don't tell anyone ;-)
That makes sense. They are profiling your responses to the captcha, not just checking for the correct answer.
They always seem to make me "play" their stupid "find the cars" game a lot longer when I quickly choose them. 20 minutes straight one time. Yeah I felt violated. It was on my stupid state unemployment benefits application page. Talk about being kicked while you're down. Not only do I have to convince the state I'm a citizen and eligible, I have to convince Google I'm not a robot first.
> The only downside seems to be that I get caught in recaptchas constantly now.
That's probably a natural side-effect. I don't notice because, personally, when I'm presented with a captcha I usually just move on and make a mental note to not use that site anymore.
Does anyone know if Google uses GA data (in particular bounce rates, time-on-site, etc.) as signals in rankings? I've always wondered if the additional data gathered once a user clicks through from the SERP to the site was eventually factored in to the site's ranking. (For example, a faster site receiving better rankings.)
Google has a separate tracking mechanism with regards to bounce/back to SERP rates which does not require GA at all. Time on site could theoretically be implied.
It has also been (somewhat officially) stated that GA data is not used for ranking purposes, but make of that what you will.
Since they also render pages themselves, they don't really need client page load metrics to evaluate rendering performance.