It’s Time to Ditch Google Analytics (fastcompany.com)
80 points by pauljarvis on Feb 1, 2019 | 37 comments



In my opinion, third-party analytics systems, as they are currently implemented, are incompatible with the idea of privacy. As soon as more than one site uses the same analytics system, the company running it has aggregate data that the user did not consent to sharing.

As a site owner, it is disingenuous to show your users a banner that says "This site uses cookies to allow us to maintain state" while leaving out the part about "This site uses Google Analytics, which allows Google to track you across sites and build a nearly complete picture of every page you browse".

The smaller hosted analytics systems claim to be more privacy-oriented, but I don't trust them either. Even with the best of intentions, sooner or later they will be sold or pivot. The data is just too valuable.

I know that some sites live and die by close reading of analytics, but most sites don't need anything as elaborate. I implemented a very simple page counter that I host myself, but that is just for vanity. Most sites don't even need that.
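
For a sense of scale, here is a minimal sketch of what such a self-hosted page counter could look like. It is not the commenter's implementation: the `createCounter` helper and its methods are hypothetical names, and the sketch assumes counts are kept in memory and that nothing about the visitor (IP, user agent, cookies) is recorded.

```javascript
// Minimal page counter sketch: keeps only a per-path tally.
// createCounter, hit and totals are hypothetical names for illustration.
function createCounter() {
  const counts = new Map();
  return {
    // Record one view of a path. The query string is dropped so that
    // identifiers passed as URL parameters never end up in the tally.
    hit(path) {
      const clean = path.split("?")[0];
      counts.set(clean, (counts.get(clean) || 0) + 1);
      return counts.get(clean);
    },
    // Snapshot of the totals as a plain object, e.g. for a small report.
    totals() {
      return Object.fromEntries(counts);
    },
  };
}

const counter = createCounter();
counter.hit("/");
counter.hit("/about?ref=newsletter");
counter.hit("/");
console.log(counter.totals());
```

A real deployment would need persistence and some bot filtering, which this sketch leaves out on purpose.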


I can only speak to Fathom (since that's my product), but we have no personal data to sell if we sold our product. We collect nothing about website visitors other than a tick in the aggregate data. So if we sold our business, there's zero personal information from website visitors on any of our customers' dashboards that could be used against them.


That's a great attitude, but even assuming that I trust you (I mean, I am sure you are a great guy and all, but I don't know you), that is what you deliberately store and allow customers to access. Your log files contain IP addresses and agent strings, enough information to identify people with fairly high accuracy. What do you do with them?
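
To make the log-file concern concrete, here is a rough sketch of scrubbing a server access log line before it is stored. It says nothing about what Fathom actually does. It assumes the common "combined" access log layout, and `maskIp` and `scrubLogLine` are hypothetical helper names. Zeroing the last IPv4 octet only reduces identifiability; it does not remove it.

```javascript
// Sketch: strip identifying fields from a combined-format access log line.
// maskIp and scrubLogLine are hypothetical helpers, not part of any product.

// Zero the last octet of an IPv4 address; drop anything else entirely.
function maskIp(ip) {
  const parts = ip.split(".");
  if (parts.length !== 4) return "-"; // not IPv4 (e.g. IPv6): discard
  parts[3] = "0";
  return parts.join(".");
}

// Assumed field order:
// ip ident user [time] "request" status bytes "referer" "user-agent"
function scrubLogLine(line) {
  const pattern =
    /^(\S+) \S+ \S+ (\[[^\]]+\]) ("[^"]*") (\d{3}) (\S+) ("[^"]*") "[^"]*"$/;
  const m = line.match(pattern);
  if (!m) return null; // unrecognised layout: refuse to store it
  const [, ip, time, request, status, bytes, referer] = m;
  // The authenticated user and the user-agent string are replaced with "-".
  return `${maskIp(ip)} - - ${time} ${request} ${status} ${bytes} ${referer} "-"`;
}

const sample =
  '203.0.113.42 - - [01/Feb/2019:10:00:00 +0000] "GET /pricing HTTP/1.1" ' +
  '200 5120 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"';
console.log(scrubLogLine(sample));
```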


> other than a tick in the aggregate data

What sorts of data items do you collect, though?


Our code is open source; you can view it here: https://github.com/usefathom/fathom

That's another way we're transparent. Anyone who knows code can see exactly how we collect data and what data we collect.


It's great, but code in a GitHub repo doesn't mean that's what's running in production.

Note I'm not saying you are doing anything nefarious, but the source code being available isn't sufficient on its own.


> Anyone who knows code can see exactly how we collect data and what data we collect.

What about people who don't know code?


Google Analytics uses first-party cookies and can be configured to not share data with other Google services. IIRC, that's even the default setting. That's nowhere near allowing Google to track you across sites. Google's advertising cookies track you across sites, though.


Interesting... do a view source and you find this on the page:

  <script>
    (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
    (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
    m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
    })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

    ga('create', 'UA-4300461-2', 'auto');
    ga('create', 'UA-4300461-33', 'auto', {'name': 'rollup'});
    
      ga('rollup.set', 'dimension1', 'fastcompany');
    
      ga('rollup.set', 'dimension2', 'co-design');
    
      ga('rollup.set', 'dimension6', '2019-02-01');


Writers have pretty well 0 (and that's a big zero) control over what sales and advertising practices are used by the company they are writing for, either as staff or freelance. But they can write what they want as long as it's approved by editorial.

Likewise, the sales teams have pretty well 0 (and that's a big zero) control over published editorial content. But they have a lot of pull in implementing any and all kinds of analytics, a/b testing, and so on.

This doesn't surprise me at all. But it is funny.


Along with Facebook and Twitter links and also Google tag manager.

It's extremely hard to get Google out of your life.


Is it?

Just turn off javascript.


Changing your user agent to midori or links is surprisingly effective.


The time to ditch GA was a long time ago, but over the years there wasn't even a single decent competitor that suits me.

I stumbled upon Reinvigorate.net; it had potential, but it was acquired by Adobe, if I remember correctly.

There was Gaug.es; it got acquired by GitHub, then it sat there with no more active development.

There was GoSquared; I can't remember why it didn't fit (need to look again).

There was Clicky, one of the cheapest around, which does all the tracking metrics you will need. Except the UI... doesn't really make you want to use it.

The two shown in the article, Simple and Fathom, were lacking way too much information: browser, resolution, devices, location, etc.

The best thing that came out in the last few years was GoAccess, a self-hosted log analyser, except not everyone wants the hassle of setting it up.

The closest thing to something I want would be Ghostboard, unfortunately it only works for Ghost.



Tried self-hosted Countly?


Never heard of it. Will take a look once I have time.


For all the complaining, can you point to anyone, anywhere, who has been harmed by Google Analytics?

There are real issues with privacy, but the scaremongering sometimes seems to be independent of any actual harm to users.


Without commenting on Google Analytics itself, why on earth would we wait until after something bad happens? If real risks can be identified, then one should think seriously about whether they outweigh the gains.

If the gains are worth the risk, then that's fine, but there's nothing "scaremongering" about pointing out things that could go wrong. That's at least half of engineering as a discipline.


That's a good point and I would like to read a good risk analysis. The problem is that when large companies are involved, you get very superficial risk analysis. Something analogous to:

"The worst thing that could happen if I put my money in a bank is they take all my money. So, we shouldn't put money in banks."

If you point out that this generally doesn't happen people will talk about what the company could do.

How do we get beyond fully general arguments that you can never trust any company to do anything (or alternatively that theoretical risks don't matter)? I guess to really understand the risk you'd have to understand a company's internal controls, and those generally aren't public.


Many define the harm as the loss of privacy itself.

In the case of analytics, this would be particularly egregious: the user gives up privacy to a third party, generally without knowledge and with no direct benefit.


See, this is treating "loss of privacy" as an abstraction meaning "any information somehow derived, no matter how indirectly, from my behavior." That's something I fundamentally disagree with. When people get upset about it then I have a hard time taking it seriously.

If I log all the hits to my website and generate aggregate reports based on that, there's no personal information disclosed in the reports. Nobody is harmed and they have no case, as far as I'm concerned. It's the same if I'm outsourcing to Google.


I've heard this argument and, sure, some people just don't care. Privacy. Shmivacy.

But, the thing is the example you gave is just a matter of degree. I mean, if you sold the logs with PII, location info, etc, then that would be near the other end of the privacy continuum. And, since you explicitly provided an example that included aggregate vs personal info, then I assume even you would care about the latter.

So, the question is, at what point along the continuum do you begin to care? Following that, surely you'd allow that others may choose a different point and that, given that it's their privacy, maybe their concerns shouldn't be summarily dismissed?

But, beyond that, the example you gave is invalid. There is a qualitative difference between you logging information about visitors who have chosen to visit your site, versus participating in a scheme to provide that information to a third party who your visitor did not choose to visit. This is especially so when that third party is then able to compile a dossier on that visitor's broader behavior, based on other site-owners who participate.

Add to this that Google is frequently able to personally identify that visitor via their Google login.


Rather than turn this into a one-bit argument (people who either care or don't care about privacy), I would rather talk about privacy risks for vulnerable populations, like activists or people who have crazy exes. A privacy risk is when data could be disclosed to someone who could use it against you.

I'm not sure any of them are threatened by Google Analytics? Google's terms of use prohibit uploading PII to Google Analytics, and it appears they upload data to a separate domain [1], so at least if you don't enable the DoubleClick cookie (which, if you're responsible you shouldn't do), they don't seem to have a way to correlate Google Analytics data with Google logins.

But I only did a shallow investigation. Show me the risk and threat model and I'll care.

[1] https://developers.google.com/analytics/resources/concepts/g...


An issue with other analytics is that it's hard to track Google Ads performance, as Analytics is tightly integrated with Google Ads, which lets us know what keywords got signups etc.

Google is embedded like a tick across the internet and getting rid of it is really really hard.


Not that hard, really, as Google Ads exposes all data via their API.

Of course you'd need to glue the data together with whatever you have, but it's not really that hard to do.
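
A minimal sketch of that glue step, assuming you have already exported ad click rows and have your own signup records that share a click identifier. The field names (`clickId`, `keyword`) and the `signupsByKeyword` helper are made up for illustration and do not reflect the actual Google Ads API schema.

```javascript
// Sketch: join exported ad clicks with your own signup records by click ID
// and count signups per keyword. All field names here are hypothetical.
function signupsByKeyword(adClicks, signups) {
  // Index the ad export: click ID -> keyword that triggered the ad.
  const keywordByClick = new Map(adClicks.map((c) => [c.clickId, c.keyword]));
  const totals = new Map();
  for (const s of signups) {
    const keyword = keywordByClick.get(s.clickId);
    if (keyword === undefined) continue; // signup not tied to a known ad click
    totals.set(keyword, (totals.get(keyword) || 0) + 1);
  }
  return Object.fromEntries(totals);
}

const adClicks = [
  { clickId: "c1", keyword: "privacy analytics" },
  { clickId: "c2", keyword: "ga alternative" },
  { clickId: "c3", keyword: "privacy analytics" },
];
const signups = [{ clickId: "c1" }, { clickId: "c3" }, { clickId: "organic" }];
console.log(signupsByKeyword(adClicks, signups));
```

The join itself is simple; the work in practice is capturing the click identifier on your own side at signup time so there is something to join on.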


There's always something rich about these articles being posted online. I promise you the author of the article has no control over the business decisions Fast Company is making. While he ethically believes what he's writing he also knows how completely impossible it is to actually change the business' mind.

The only way a business changes is when there's a detectable impact on their income.


"The only way a business changes is when there's a detectable impact on their income."

Or when they're legally required to.

Or when new upper management comes in with a different philosophy or approach (Microsoft's new CEO is often credited here on HN with changing Microsoft's attitude towards open source).

Or when the company's employees protest effectively enough (see the recent Google employee revolts).


I ditched it a long time ago -- I block GA scripts along with all of the other trackers.


I do this too. The only downside seems to be that I get caught in recaptchas constantly now. Google seems to know how to make tracker avoidance painful enough that I doubt most would follow through with it after a few weeks.


I read somewhere that selecting a bit of text before clicking the "I'm not a robot" checkbox helps to avoid the Recaptcha popup. I've started doing this and think I've been seeing them far less often, but of course it's possible I'm fooling myself.

And I guess it will only be a matter of time before robots will start doing this too, so please don't tell anyone ;-)


That makes sense. They are profiling your responses to the captcha, not just checking for the correct answer.

They always seem to make me "play" their stupid "find the cars" game a lot longer when I quickly choose them. 20 minutes straight one time. Yeah I felt violated. It was on my stupid state unemployment benefits application page. Talk about being kicked while you're down. Not only do I have to convince the state I'm a citizen and eligible, I have to convince google I'm not a robot first.


> The only downside seems to be that I get caught in recaptchas constantly now.

That's probably a natural side-effect. I don't notice because, personally, when I'm presented with a captcha I usually just move on and make a mental note to not use that site anymore.


VPNs also make recaptchas pretty terrible. If you browse without a login and have a VPN enabled they slam you with them.


Using the uMatrix extension for Firefox, I've been able to specifically block GA while enabling captcha.

The problem is I'm sure they use captcha for fingerprinting too (and Google Fonts, but that's a whole other can of worms).


Does anyone know if Google uses GA data (in particular bounce rates, time-on-site, etc.) as signals in rankings? I've always wondered if the additional data gathered once a user clicks through from the SERP to the site was eventually factored in to the site's ranking. (For example, a faster site receiving better rankings.)


Google has a separate tracking mechanism with regard to bounce/back-to-SERP rates which does not require GA at all. Time on site could theoretically be inferred.

It has also been (somewhat officially) stated that GA data is not used for ranking purposes, but make of that what you will.

Since they also render pages themselves, they don't really need client page load metrics to evaluate rendering performance.



