In what sense is TCP tomorrow different from disallowing 3rd party cookies today?
On a side note: Isn't it odd how the composition of different technologies (here HTTP and DNS) makes it surprisingly complex to implement such an intuitively natural policy as "every website gets to read only its own cookies"?
You can see which browsers partition state (and which don't) in the State Partitioning section of https://privacytests.org. Firefox passes nearly all of those tests because Total Cookie Protection is enabled by default.
I have a problem with them including the "GPC" flag as a privacy feature. We know directly from "Do Not Track" that such a flag gets widely ignored, and DNT was frequently even used as a tracking signal itself, because it was meant to be opt-in and was therefore rare. When it got turned on by default, a bunch of companies said "DNT is not a user choice now, so it doesn't mean anything".
I'm not sure what document.referrer blocking is meant to accomplish - if the intent is for the referrer to pass information to the loaded site then they can just include that in the url. If the intent is "don't let people know where you saw the link", then sadly there are plenty of sites that gate access on the referrer and they get broken. The fact that no one is filtering that kind of indicates the cost/reward balance.
Calling "media queries" a privacy issue is absurd: literally the purpose of these is to allow sites to tailor to the view format. More over you can directly measure these from JS, it's just less efficient and more annoying.
The "known tracker" blocking is more curious as I'm unsure how that's expected to scale, nor what happens if a browser ever misclassifies one (be it a resource or a query parameter). Certainly query strings can be trivially changed to make anything a major browser does just break instantly, and similarly trivially changed to not be statically identifiable.
I also wonder how those are tested because browsers that do automatic "learned" identification of trackers do take time to identify what they consider trackers and start blocking them. e.g. that site says google analytics is not blocked by safari, yet I can look at Safari's tracker stats and see that it has identified and is blocking google analytics.
Hi! Author of PrivacyTests.org here. Thank you very much for the comment. (I only just saw your reply.) PrivacyTests is very much a work in progress, and all feedback is much appreciated.
Regarding document.referrer, you are absolutely right that there is a cost/reward balance and most browsers have chosen to allow cross-site passing of the referrer. However, there are browsers on Android that do block cross-site referrer altogether (see https://privacytests.org/android.html).
"Media queries" refers to the fingerprinting threat where, for example, screen width and height is divulged. You are right that JavaScript can also be easily used to get screen width and height: any fingerprinting resistance feature should protect against screen fingerprinting via both JS and media queries, in my view. Some browsers already do that, as the results show.
Your question about scale is a good one. Some browsers (such as Firefox and Brave) embed fairly large blocklists. You are right that query parameters can be changed, but in practice I haven't seen any cases of that happening (yet).
As far as I am aware, Safari is (by default) blocking cookies/storage from Google Analytics and similar trackers, but not blocking the scripts themselves. You can see that cookie blocking reflected in the "Tracking cookie protection tests".
> Calling "media queries" a privacy issue is absurd
The header at "media queries" says "Fingerprinting resistance". A single data point like screen width doesn't immediately disclose your identity, but having a few data points helps with fingerprinting.
On mobile so I can't provide a source, but look up why Privacy Badger no longer uses a dynamic "learned" list of trackers to block (the way you mention Safari works) and instead uses the same list as everyone else. It basically reduces privacy, because your list of blocked sites is likely to be unique among other users (along with other data points) and unintuitively makes it easier to track you across the web.
>By default, Privacy Badger receives periodic learning updates from Badger Sett, our Badger training project. This “remote learning” automatically discovers trackers present on thousands of the most popular sites on the Web. Privacy Badger no longer learns from your browsing by default, as “local learning” may make you more identifiable to websites. You may want to opt back in to local learning if you regularly browse less popular websites.
Huh, an interesting issue to have to deal with. I'm not sure it makes it easier to track you in practice, though I can see that in theory it could absolutely work, so given the choice, having a static block list would be better.
That said, my point stands that the bigger "reputable" trackers (e.g. the ones that exist on basically every website) are all blocked by the learned trackers techniques, and so should be listed as blocked on that site.
That's already the case, isn't it? It's just that embedded parts by a 3rd party can set their own cookies, which can be read by the 3rd party whenever it is embedded in any other page.
You aren’t disagreeing with the person you are responding to, you just have slightly different ideas what it means when someone says “website”
You are calling all of the assets loaded, no matter from where, as one “website”. The person you are replying to considers each domain for each asset on the single page as being different “websites”
My understanding is, it's a simplified explanation for a somewhat more complex policy.
The obvious thing - every site gets its own cookies - has always been the case: Those are first-party cookies.
The question is what happens when one site embeds content from a different site. Traditionally, there was no special behaviour for that case: If site A embeds an iframe from site B, the top-level document gets to see the cookies for A while the embedded document gets to see the cookies for site B. B's cookies here are 3rd-party cookies.
The problem is that the two documents can communicate, so if site B is embedded into many other sites, B and the other sites can cooperate to track the user across them using B's cookies as a sort of shared identifier. (Or more commonly, B will do the tracking and pay/otherwise incentivise the other sites to embed itself and pass the necessary information to the embedded document)
To avoid this, TCP introduces another level of partitioning: Cookies are not just partitioned by which site has access to it but also by the top-level site.
So site B now doesn't get a single cookie jar as before, it will get several jars "B, embedded by A", "B embedded by C", "B top-level" etc with separate sets of cookies. Hence it becomes impossible to pass a common identifier from one top-level site to another one and (this particular way of) cross-site tracking won't work anymore.
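To make the partitioning concrete, here is a rough sketch of the double-keyed cookie jar idea (just an illustration, not Firefox's actual implementation; the names are made up):

  // Before TCP, a cookie jar is keyed only by the site that set the cookie.
  // With TCP, it is additionally keyed by the top-level site it was set under.
  const jars = new Map();

  function jarKey(cookieSite, topLevelSite, partitioned = true) {
    return partitioned ? `${cookieSite}^${topLevelSite}` : cookieSite;
  }

  function setCookie(cookieSite, topLevelSite, name, value) {
    const key = jarKey(cookieSite, topLevelSite);
    if (!jars.has(key)) jars.set(key, new Map());
    jars.get(key).set(name, value);
  }

  function getCookie(cookieSite, topLevelSite, name) {
    return jars.get(jarKey(cookieSite, topLevelSite))?.get(name);
  }

  // tracker.example embedded on two different sites no longer sees a shared identifier:
  setCookie("tracker.example", "site-a.example", "id", "abc123");
  console.log(getCookie("tracker.example", "site-a.example", "id")); // "abc123"
  console.log(getCookie("tracker.example", "site-b.example", "id")); // undefined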
Of course... knowing that all of this relies on well-behaved browsers is a bit of a troubling point. I find it equally odd that we have placed so much power and general trust in software whose capabilities we keep expanding. It is a small step to requiring "managed devices" for access, because we don't even trust users to keep their devices up to date.
I tried enabling this recently and I immediately noticed that websites started appearing in light mode instead of copying my system settings and displaying in dark mode. It seems like in its efforts to make my fingerprint the same as everyone else's, Firefox stops telling websites about my display settings. The issue immediately went away when I changed the setting back to privacy.resistFingerprinting = false.
Why does the browser need to tell the website about local display settings? In my case, with privacy.resistFingerprinting = true, the Zoom level resets to 100% every time I navigate to another page on a site. Why can't the browser just remember my zoom level locally and re-apply it? Why does it have to tell the website?
Zooming basically changes the dimensions of the viewport as JS/CSS see it. Reapplying the zoom level would involve running the same CSS media queries and/or JS so that the website looks good at those dimensions.
It's not just an "optical" zoom.
But more directly, zooming can put you in very nonstandard width x height dimensions. Carrying those dims across different pages makes for an easy fingerprint which is probably why it's reset.
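You can actually watch this happen from the devtools console (rough illustration; exact numbers vary by browser and zoom implementation):

  // At 100% zoom, then again after Ctrl/Cmd+'+': the viewport reported in CSS pixels
  // shrinks even though the window hasn't changed, and media queries can flip.
  console.log(window.innerWidth, window.innerHeight);
  console.log(window.devicePixelRatio); // typically rises as you zoom in
  console.log(window.matchMedia("(max-width: 600px)").matches); // can become true purely from zooming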
> Zooming basically changes the dimensions of the viewport as JS/CSS see it.
I don't care what the website's JS/CSS says. At the end of the day the browser has a rendered canvas; I just want to zoom the canvas (and clip it at the window dimensions, providing scrollbars if necessary, if zooming makes it larger than the window dimensions). The browser shouldn't have to re-run anything to do that; zooming and clipping a canvas are graphics operations that have existed in computers for as long as there have been computers with graphics at all.
When people increase their font size/zoom they generally don't want that - what you're describing is the default zoom on phones, etc., which is different from increasing the page text size, and which you'll note is typically about enlarging undersized UI components rather than simply reading text.
When people are in a browser and expanding the content they want the content reflowed - having to scroll to read the width of a line is super obnoxious, and makes reading text much harder. This is made even more frustrating when you recall that a lot of time the reason for zooming is to make things easier to read.
There are very few times where the correct response to "increase the zoom" is simply an affine transform of the rendered content, from both a usability standpoint or from user intent.
> When people are in a browser and expanding the content they want the content reflowed
Even if this is the case, I don't see why the browser has to re-run anything from the website or tell the website anything. It can just do the reflow operation locally. Yes, the central data structure then is the DOM rather than a rendered canvas, but the DOM is still held locally.
You can always zoom the website using built-in OS zooming.
However, browser zooming incurs layout logic. It's no different than resizing the browser viewport. Code is run on the site (whether CSS or JS) to determine how the site should render at that size.
CSS/JS is run even when loading the site in the first place. There is nothing special about zooming, so it's like asking why layout code has to be run when you visit a site.
Well, you can turn off JS and CSS styling, but that's too hamfisted for most people.
Here's how a site can load different stylesheets depending on viewport width:
<link rel="stylesheet" media="screen and (min-width: 601px)" href="desktop.css" />
<link rel="stylesheet" media="screen and (max-width: 600px)" href="mobile.css" />
It's unclear to me what you think should happen on first website load vs. zooming.
> Reflow and relayout is entirely local just as it is if you resize your window.
Then why does the zoom level reset itself to 100% every time I reload the page if I set privacy.resistFingerprinting = true?
> What are you concerned is happening?
I'm concerned that setting privacy.resistFingerprinting = true breaks a feature (that my browser remembers the zoom level for a given site so I don't have to reset it every time I reload that site) that should, as you say, be "entirely local".
The issue is not related to page loads, and layout behaviour is not impacting or causing differing load behavior.
First we need to consider what the goal of fingerprinting a browser is, and subsequently how that is done. The goal is not just "track a user", it is "track a user without using any explicit storage", so no cookies, client storage, etc. So instead, all a fingerprinting service can do is read implicit data from the browser and, using a collection of that data, construct a unique ID. Most data that you read will be the same across large numbers of browsers: user agents, installed fonts, etc., so what you do is build up a signature from those properties that vary from the mean. If you query enough different properties, the hope is that you can accumulate enough variation to create a unique (-enough?) identifier that persists for that user.
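To make that concrete, here is a toy sketch of the kind of collection a fingerprinting script does (hypothetical and far less thorough than real fingerprinting libraries, which also use canvas, audio, fonts, WebGL, and so on):

  // Read a handful of implicit properties and hash them into a single identifier.
  async function toyFingerprint() {
    const signals = [
      navigator.userAgent,
      navigator.language,
      screen.width + "x" + screen.height,
      screen.colorDepth,
      window.devicePixelRatio,
      navigator.hardwareConcurrency,
      Intl.DateTimeFormat().resolvedOptions().timeZone,
    ].join("|");
    const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(signals));
    return [...new Uint8Array(digest)].map(b => b.toString(16).padStart(2, "0")).join("");
  }

  toyFingerprint().then(id => console.log("fingerprint:", id));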
Which gets us to your feature. The enormous majority of users have default zoom. So if your browser presents a different zoom level that provides a large amount of information to uniquely fingerprint you.
Hence `privacy.resistFingerprinting = true` disables non-default zoom on load, because it's directly finger-printable.
No. I already understand why non-default zoom gives websites a way to fingerprint you, if your browser insists on telling the websites that you have a non-default zoom level.
What I don't understand, and what nobody in this discussion has been able to explain, is why a browser with privacy.resistFingerprinting = true can't just lie to the website about what the zoom level is. You have said that zoom should be a local operation; that means the browser shouldn't have to tell the website anything about the actual zoom level if the user doesn't want it to. It should just load the page, telling the website whatever default things it tells the website when privacy.resistFingerprinting = true, including, presumably, a default zoom level, and then do the local zoom operation afterwards.
It doesn't have to but if you want websites to follow your system settings for light/dark mode then the browser has to tell the website which one you want at this moment.
It shouldn't; the CSS should contain both modes. You need some checks to ensure JavaScript doesn't leak, but you can place limits on how much you check and avoid having to solve the halting problem.
It does contain both modes. But only one of those declarations will be used, and that declaration can do things like background images, which is behaviour the server can observe.
You could potentially try and "execute" all possible declarations at the same time, in effect just loading every URL or image declared in the CSS file at once, so the server can't tell which path was actually used. But (a) this would itself be identifiable as an anti-tracking measure (which can contribute to a fingerprint), and (b) this loads a lot more data in the general case, which is exactly what browsers want to avoid.
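As a concrete (hypothetical) example of such a declaration: the server only ever receives a request for one of these two images, which by itself reveals the visitor's colour-scheme preference.

  /* Hypothetical stylesheet: which image gets requested tells the server the user's theme. */
  .hero { background-image: url(/img/light-hero.png); }

  @media (prefers-color-scheme: dark) {
    .hero { background-image: url(/img/dark-hero.png); }
  }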
You can verify that in either path the same images are loaded, without loading them. (This is what I was getting at by invoking the halting problem - if you cannot easily determine that an image is loaded in both paths, they are trying to fool your anti-tracking, so you default to assuming it is tracking.)
The more people identified as having anti tracking on, which should be the default, the less useful that bit of tracking is.
I don't entirely understand your point, I'm sorry. Could you explain it again?
One would generally expect that both paths produce different outcomes, because this is the purpose of media queries, to produce different appearances for different screens. In the example about light mode vs dark mode, a well-designed, non-fingerprinting CSS file might well load different background images for an element to match the user's theme - a dark-background image for dark mode, and a light-background image for light mode. This is the sort of behaviour we are aiming for with this feature.
The problem is that this good behaviour is indistinguishable from more malicious behaviour where the images are only used to do fingerprinting. And FWIW, this is the simplest way of doing fingerprinting that I could think of. In the general case, it is not possible to detect whether a given media query would be fingerprintable by a server. For example, a given media query might increase the height of a particular element, pushing a lazy loaded image below the fold and causing it to not be requested immediately, but only after a few seconds when the user scrolls down to it. Or instead of having one "homepage" link on the page, you have multiple, but you only show one depending on which media query fits best. Then, as soon as the user clicks the "homepage" link, you know which link was visible to them and can fingerprint them accordingly.
Which is why the nuclear option here is just turning off every way of exposing a user's unique preferences, because it's the preferences themselves that are being used to fingerprint the user.
> if you want websites to follow your system settings
I don't want websites to follow my settings; I want my browser to follow my settings, overriding or ignoring what the website says if necessary. I don't see why the browser has to tell the website what it's overriding or ignoring.
The browser does follow your settings, and it doesn't necessarily directly tell the website what's going on. The problem is that the website can observe a lot of things indirectly.
For example, with the dark mode/light mode "attack", the browser will download the necessary HTML and CSS in as unidentifiable a way as possible, but then it needs to render that for your machine. But the CSS file might contain a media query line that says something like "if the user wants dark mode, load this dark image as a background for this element". And to correctly respond to the query, the browser then needs to send another request to the server to download that image, that effectively indicates whether the user is using dark mode or not.
This principle can be used to detect a lot of your user settings. For example, your zoom level will effectively change how wide the browser window appears to be from the perspective of a CSS file*, which means that it's possible to use more media queries to detect that. Likewise a lot of accessibility queries like prefers-reduced-motion, while really useful for many people, can be used alongside other information to create your unique browser fingerprint.
This is just with HTML and CSS. If you add Javascript to the mix, it's even easier to fingerprint you based on various settings.
* there are technically other ways of performing zooming that wouldn't necessarily be visible, but they have poor usability. For example, you could have the classic PDF-style zoom where the PDF is rendered in a fixed size, and the user simply views a small, viewport-sized portion of the file. But this is a pain if you want to read text that's wider than your screen, because now you need to scroll back and forth. The browser approach allows text to be reflowed to match the viewport width, but this reflow will always be observable, and therefore can always contribute to a fingerprint.
> The problem is that the website can observe a lot of things indirectly.
If the browser insists on doing those things, yes. But why does the browser have to do that?
For example, if I set privacy.resistFingerprinting = true, why can't the browser just locally have a "light mode" and a "dark mode" that does the best it can to render the site locally in those modes without making any additional requests that it didn't already make for the default version of the page? Yes, I'm sure the website designer has lots of wonderful stuff to customize the look and feel in those modes--and I might like that if I could be sure that the website wasn't also using that stuff to fingerprint me. But if I'm telling my browser to resist fingerprinting, clearly I don't trust that website, so why would I want all of its customizations for light mode/dark mode?
Sorry, I didn't see this earlier. The problem is that it's very difficult to determine what properties are observable for fingerprinting purposes. I used the background image as an example because it's very simple, but you can also trigger requests in more obscure ways. For example, you could have a lazy-loaded image in the rendered HTML - the image will only be loaded if the user's viewport contains the image. Then you create a rule where if the user is using dark mode, the element immediately before the image becomes really long, forcing the image off the screen. Now, if the user loads the website and doesn't immediately also load the image, you know that they were using dark mode.
Alternatively, everywhere where you have a link, you could have one link for each combination of bits that you want to send to the backend. Then using CSS, you can hide or display this links so that only one version of each link is displayed at a time, and then monitor what gets clicked. If the user clicks the link that says `/?dark-mode=true&orientation=vertical`, you now know two extra bits of information.
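As a sketch of that hidden-link trick (hypothetical markup; the query parameters are just illustrative), every visitor sees exactly one "Home" link, so the click itself leaks the preference:

  <a class="home if-dark" href="/?dark-mode=true">Home</a>
  <a class="home if-light" href="/?dark-mode=false">Home</a>

  <style>
    .home { display: none; }
    @media (prefers-color-scheme: dark)  { .if-dark  { display: inline; } }
    @media (prefers-color-scheme: light) { .if-light { display: inline; } }
  </style>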
This is obviously all excluding Javascript, which can just read this information straight out and use it.
The problem ends up being that there's so many different (and often valid) ways to customise a website that it's very difficult to limit these customisations to only the "safe" ones. Even if the only properties I was allowed to use were colour/background-color, I'm sure I could come up with some sort of way to use them to convey information. So the only safe option here is to turn off the customisations altogether. Yes, it's still possible to track if a user is using light mode or not, but now they're all using light mode, so that bit of information becomes useless.
Also, there are some major drawbacks, like your browser setting UTC as the default timezone. That one has caused some mix-ups when filling out Doodles.
This one really annoys me because Proton Mail uses the "spoofed" UTC timezone to show when I sent/received emails, as if I live in the UK or West Africa. They also refuse to add a timezone app setting to correct this issue.
Enabling this did weird things to web sites. But it didn't seem to help. Before I enabled it, according to https://amiunique.org I was unique among the 1646592 fingerprints in their entire dataset. After enabling it, I was unique among the 1646593 fingerprints in their entire dataset.
I recently noted that they display some of my browser settings as used by 200.3% of their users. And many other figures over 100%. And they offer positions with a deadline last summer.
I also tried setting privacy.resistFingerprinting = true in Firefox, but it's sad to see that most websites become unusable (most sites using canvas just render a green/purple mess), zooming in Google Maps is basically broken (skips several levels at a time), and like others have mentioned dark mode and time zones also stop working.
What a mess the (somewhat private) web is nowadays. The more I think about it, the more I am convinced legislating privacy is the only way out of this arms race we seem to be losing.
EDIT: Sorry, the parent post already included fingerprinting, missed that. Rant still stands though :)
PaleMoon - http://www.palemoon.org/ - a hard fork of Firefox, includes a setting called canvas.poisondata which really messes up browser fingerprinting without distorting any canvas use on a site. It is also a zero telemetry browser and doesn't make any automated connections on startup if the appropriate settings are enabled / disabled (unlike Firefox).
Frankly, Firefox should take the nuclear option, and just start emulating Chrome with regard to website identifiable information.
In particular, User-Agent strings are now a net-negative. I've never run into a website that doesn't work on Firefox. I do occasionally run into websites that claim to not work on Firefox. We software developers have shown that we can't be trusted with information pertaining to what browser someone is using, and as such should have the privilege taken away. If you're reading this, and you have access to a codebase that reads User-Agent strings for anything more than idle curiosity, just delete it and push to master.
I actually ran into a case for using UA strings yesterday, if you know of a better solution let me know!
I have a canvas element where the user needs to be able to both scroll and zoom with the mouse wheel. Generally, zooming involves scrolling while also holding the ctrl button down - except on MacOS, where it's cmd. I obviously want to use the operating system conventions that my users will be expecting, but the best way of determining that, that I've found, is searching for the string 'Mac' in the user agent string.
Like I say, if you know of a better way to detect whether I should be binding on ctrl or cmd, let me know! This is the first time I've used the user agent in a long time, and it feels like there should be a better system...
Especially considering it is vanishingly unlikely for a Windows user to inadvertently send you a command key down, or the Mac to send you a ctrl down. It's not like you have behaviors to swap.
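Building on that, one option (just a sketch, with hypothetical element and function names) is to skip platform detection entirely and treat either modifier as the zoom modifier in the wheel handler:

  // Treat Ctrl or Cmd + wheel as zoom, plain wheel as scroll; no UA sniffing needed.
  // Note: browsers generally also deliver trackpad pinch gestures as wheel events
  // with ctrlKey set, so this picks those up too.
  canvas.addEventListener("wheel", (e) => {
    if (e.ctrlKey || e.metaKey) {
      e.preventDefault();          // stop the browser from zooming the whole page
      zoomCanvasBy(e.deltaY);      // hypothetical canvas zoom function
    } else {
      scrollCanvasBy(e.deltaY);    // hypothetical canvas scroll function
    }
  }, { passive: false });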
UA string sniffing sucks, but if Firefox starts pretending to be Chrome, sites will start using even worse methods to figure out what browser they're dealing with, and it'll be much harder to work around those. Firefox is obviously not Chrome if you actually poke at it a bit… the saving grace of UA string checks is they're easy to fool.
Brave has had this "new feature" (i.e., blocking cross-site cookies) as a default setting. Glad Firefox has finally implemented a basic privacy feature such as this, though like you mention perhaps it's a little late.
As far as the fingerprinting issue goes, Brave again does a better job than most, again as a default:
"If you need to use Chromium, then Brave browser is a good choice. It also randomizes fingerprint for each session, making it harder to link your browsing sessions."[1]
It's basically a tool that lets you manage all of the JS tracking scripts (tags) that advertisers, affiliates, etc. have you paste into a website.
So instead of having a million js snippets on your page, you have one google tag manager snippet and then your marketing team can inject 400mb of javascript into the site without bothering dev.
1) Developer installs Google Tag Manager at the request of someone in marketing
2) Marketing department has free rein (via Google Tag Manager) to add arbitrary javascript on any page with Google Tag Manager installed. Namely tracking scripts, adwords conversions, retargeting pixels, or really any arbitrary javascript without limitation.
Edit: Google Tag Manager can also be used to inject scripts that aren't inherently bad, like a 3rd party support/chat widget (Zendesk, Intercom, etc)
It's worse than that. Attackers can add arbitrary Javascript to your pages. It takes another vulnerability to use this as a large scale exploit, but it's been done. [1][2]
Blocking "googletagmanager.com" does break a few sites, but not enough that it matters.
What I hate is that because it can be used for both shady and benign purposes, you can never tell what will break without it. I'd love to block it wholesale at the edge or host level, but sometimes you need it.
Will bitbucket.org sending me to oauth.atlassian.com fail with Total Cookie Protection enabled ?
Will someonlineshop.com redirecting me to payments.mybank.com fail with Total Cookie Protection enabled because the payments site uses cookies to store the return url for the online shop ?
With OAuth, probably not, since the state doesn't transfer with cookies. You send the user to the IdP with a postback in the URL, and when the user logs in, it sends them back to the original site with some data that depends on the flow you chose.
Your login cookie as far as the IdP is concerned lives on oauth.thirdparty.net and on a successful login the app issues you your own session token that lives on myapp.com.
I don't want to be a killjoy here. Don't get me wrong, this is certainly a good thing, but to me it seems like most major trackers are moving/have already moved to things like canvas fingerprinting, WebAssembly fingerprinting, etc.
This seems to be a typical cat-and-mouse game. And browsers will not get more privacy friendly as long as there is a trend of moving applications and basically every aspect of our lives into the browser, thus making it unbelievably complex.
I suspect that we will soon have to choose to either live as a "digital hermit" (links/neomutt and other lightweight apps that do only the one thing) or give up and join everyone with Youtube/Facebook/Twitter.
I'm blocking pretty much all website advertising. I'm sure I'm being tracked but I can't see their personalised ads so whatever. Out of sight out of mind.
There were plenty of legitimate use cases for 3rd party cookies. The negative ones just started to trump the legitimate ones over time and browsers lag behind these trends (or have different incentives).
> allowing those [third party] cookies to fulfill their less invasive use cases (e.g. to provide accurate analytics
Separate cookie jars don't prevent data brokers from correlating cookies from a common IP + user agent pair. The only solution is to block third parties altogether.
The long term workaround for data brokers is to become a first party... have the main website serve your 3rd party JS from their own server under their own domain. That makes it much harder to block.
For your IP+UA hypothetical, the easy fix is to make everyone look the same. IP should be the only personally identifiable factor. Tor and desktop Firefox w/ privacy.resistFingerprinting turned on both do this. Open window not-maximized at 1000x1000 size, use the same UA regardless of what platform & browser version you're actually on, and a bunch of other stuff. That said, I don't know how effective it actually is against modern comprehensive fingerprinting libraries.
User-Agent strings are now net-negative in the web world. Their uses are slim, and their abuses are plentiful. They should probably be removed altogether, but in the meantime non-Chrome browsers should universally adopt Chrome's UA.
User-Agent strings are still useful for identifying non-browser user agents but I agree that given the privacy issues they present all browsers should adopt a single standard UA string (one each for desktop and mobile). (Did you know that Chrome on Android puts your device model in the UA string? One of the many reasons to use Firefox for Android...)
Obviously doesn't help when using an app, but what about an extension that changes your user agent every single time you send a request to a website? Does such an extension even work, or exist?
Even if it did, it will be useless because fingerprinting (of various kinds - whether TCP or TLS stack, doesn't even have to be JS-based) will still tell which browser you're running on.
Worse, it may actually make you more identifiable because a Chrome on Linux user agent with the TCP fingerprint of Windows and TLS fingerprint of Firefox is going to stick out like a sore thumb, even more than a rare-but-at-least-consistent fingerprint with no trickery.
I'm not sure TCP fingerprints are very reliable because a lot of ISPs and Mobile Carriers are doing proxying at the application level for HTTP/HTTPS ports so even if you are on an iPhone you might end up having a linux TCP fingerprint. This site allows you test your TCP fingerprint: http://witch.valdikss.org.ru
Has Chrome already been doing this for a while? (I'm not sure, but I think I've heard that it has).
Disclaimer: I'm a FF user, not trying to say FF is bad, I just want to know how much of this is actually new cutting edge privacy protection vs. marketing fluff.
This is nice to see, but it worth pointing out that it is years overdue. Mozilla dragged its feet for years about Tracking Protection and cookies in general, coming up with halfway measures while continuing to serve the needs of advertisers that had a lot of influence on the company. The end state of Total Cookie Protection was clear from the beginning. It is of course not easy to change the advertising based model, but it's not like Firefox is really more than an influencer of Google and Apple, its market share has all but disappeared. No, the motivation to take seven years to get to TCP was all about money, including big salaries and bonuses for Mozilla executives.