So if you send a picture to a Signal user, it's retrieved via cloudflare, and cached in a data center near that user; now you can look up the cache status and find the data center used. I'd say "deanonymization" is stretching it, unless the user is in the middle of nowhere (no other users near the data center). But interesting writeup anyway.


"Near a user" is also a big assumption. I'm ~200 miles from ORD and ~500 from IAD, but my ISP's peering & upstream arrangements mean Cloudflare serves my traffic from DFW, 700 miles away.

But, at the same time: Cloudflare isn't going to serve me a cache from Seattle, Manchester, or Tokyo. Pinning down an unknown Signal user to even a rough geographic location is an important bit of metadata that could combine to unmask an individual. Neat attack!


It's also quite insidious as you don't need to control anything on any server to get this information; as long as you can get your target to load a unique URL never before loaded by anyone else, you can simply later poll it with an unauthenticated HTTP GET from different locations, and find which one reports a Cloudflare HIT (or, even if they hid that information, finding the one that returns with lower latency).
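
A rough sketch of the header check, in Python. It assumes the responses carry Cloudflare's usual `CF-Cache-Status` and `CF-RAY` headers, with the data center's airport code as the suffix of `CF-RAY`. The function names, vantage labels and header values are made up for illustration, and actually fetching the URL from each location is left out.

```python
from typing import List, Mapping, Optional, Tuple


def parse_cloudflare_cache_info(
    headers: Mapping[str, str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (cache_status, colo_code) from one response's headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    status = lowered.get("cf-cache-status")
    ray = lowered.get("cf-ray", "")
    # CF-RAY normally looks like "<hex id>-<airport code>".
    colo = ray.rsplit("-", 1)[1] if "-" in ray else None
    return (status.upper() if status else None, colo)


def colos_reporting_hit(responses: Mapping[str, Mapping[str, str]]) -> List[str]:
    """Given {vantage_label: headers}, list the colo codes that reported HIT."""
    hits = set()
    for headers in responses.values():
        status, colo = parse_cloudflare_cache_info(headers)
        if status == "HIT" and colo:
            hits.add(colo)
    return sorted(hits)


# Made-up header values for two probes (hypothetical vantage labels).
probes = {
    "vpn-dallas": {"CF-Cache-Status": "HIT", "CF-RAY": "0123456789abcdef-DFW"},
    "vpn-chicago": {"CF-Cache-Status": "MISS", "CF-RAY": "fedcba9876543210-ORD"},
}
print(colos_reporting_hit(probes))  # prints ['DFW']
```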

If you're allowing user uploaded content, and you use Cloudflare as a CDN, you could mitigate and provide your users with plausible deniability by prefetching each uploaded URL from random data centers. But, of course, that's going to make your Cloudflare bill that much more expensive.

Cloudflare could allow security-sensitive clients to hide the cache-hit header and add randomized latency upon a cache hit, but the latter protection would also be expensive in how many connections must be kept alive longer than they otherwise would. Don't do anything on a personal device or account if you want your datacenter to be hidden!


Pre-fetching also becomes an issue for apps that are meant to be e2e encrypted, since it requires the server to download (read) every attachment. But if the app is already caching the attachment then they’re effectively reading it anyway.

(EDIT: Apparently signal e2e encrypts images prior to upload, so pre-fetching the encrypted blob from one or multiple servers would in fact be a mitigation of this attack.)

I do wonder if Telegram is as invulnerable as the author assumes. They might not be using Cloudflare for caching, or even HTTP, but the basic elements of this attack might still work. You’d just need to modify the “teleport” aspect of it.


Telegram doesn't use local CDNs for caching. All users are associated with one of about five telegram DCs, and upload files to their local DC. If a file was uploaded by a user on another DC, users connect to it temporarily to download the file.

The DC that a user is associated with is exposed by the API - you don't need to get them to upload a file to discover it - but it's so broad that it's not much of a deanonymizing signal. (Knowing that your target is in DC1, for example, just means that they're probably somewhere in North or South America. Or that they registered using a phone number that said they were.)

https://core.telegram.org/cdn


Going forward, uploaded content should never go through Cloudflare, and it never really needed to.

Add unique urls.

Maybe just avoid it altogether.


> Going forward, uploaded content should never go through Cloudflare, and it never really needed to.

The problem in this case isn't cloudflare. The problem is that these images load without the user's interaction and the person sending it gets to choose if it's cloudflare or not. So your statement within this context doesn't really work.


The person receiving it chooses to download images or whatever automatically though.

I dunno, I'd still say the problem is at least 50% cloudflare. Why should they make which datacenters have a resource cached be obvious public knowledge? I do agree though, one could still end up inferring this information noisily by sending an attachment, waiting a while, and then somehow querying a lot of DCs and trying to infer times to see if it's cached or not.

Personally, I've never been a fan of so many things like URLs being so public. I get the benefits of things like CDNs and whatnot, and the odds of guessing a snowflake value and whatnot, but still... all attachments in Discord are public. If you have a URL, you have the attachment. And they're not the only ones with this kind of access model.


Isn’t that because the URL parameters are so long that by design they effectively _are_ the password protection for the resource? They shouldn’t be able to ‘leak’ to unintended recipients.

Personally, like you I’m also not a huge fan of this, but URLs like that basically should be treated as the passwords. Don’t post them publicly / don’t give them out to people you don’t trust.


There's a part of me that's fine with it for a short-lived URL which contains a temporary access key but for a forever URL with a forever access key I'm not entirely happy with it.

I use it to share memes and shitpost but definitely not something to share sensitive content IMO.


For signal then the issue becomes saving who owns what image (so that you can re-issue “passwords”) and THAT is much more dangerous to the users than simply allowing users to grab semi-anonymous links into their cdn with enough of a url to be nearly impossible to iterate through every combination without hitting tons of rate limits. (Ignoring this location cache timing issue.)

Edit: Actually... (in signal's case) it might be possible to provide the user's device 2 tokens, 1 to access the url and 1 to issue new access links. Then the user can request a new access link with their second token when their url access token expires. Signatures would help prevent it from needing to be stored in the database. It would be interesting to try.
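
A minimal sketch of the signature part, assuming a server-side secret and an HMAC over the path plus expiry. This is not how Signal does it; `sign_attachment_url`, `verify_attachment_url` and the URL layout are made up for illustration. The point is that the server only has to recompute the signature, so nothing about who owns which link needs to be stored.

```python
import hashlib
import hmac
import time
from typing import Optional


def _signature(path: str, expires_at: int, key: bytes) -> str:
    # Bind the signature to both the path and the expiry time.
    message = f"{path}\n{expires_at}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def sign_attachment_url(path: str, expires_at: int, key: bytes) -> str:
    """Build an expiring link; the server keeps only `key`, no per-link state."""
    return f"{path}?expires={expires_at}&sig={_signature(path, expires_at, key)}"


def verify_attachment_url(
    path: str, expires_at: int, sig: str, key: bytes, now: Optional[int] = None
) -> bool:
    """Accept the link only if the signature matches and it has not expired."""
    now = int(time.time()) if now is None else now
    expected = _signature(path, expires_at, key)
    return hmac.compare_digest(expected, sig) and now < expires_at
```

The second "refresh" token from the idea above could be signed the same way with a longer lifetime, and traded for a fresh access link when the old one expires.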

Edit2: Also I am now curious... does this mean only text messages are e2ee? yikes.


Discord doesn't do forever URLs for attachments any more, they changed that a while back.[0]

The problem here is avatar URLs.

[0] https://www.bleepingcomputer.com/news/security/discord-will-...


This is good to know, thanks for sharing this knowledge.


My main gripe is that if someone finds a vulnerability that gives you a list of urls the model falls apart. I’ve seen this happen in organisations :/

But agree with your statement here and others about the lifetime of the data - if something is sensitive or secret you want proper access controls applied, not just openssl rand -hex 8
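
For scale: `openssl rand -hex 8` is 8 random bytes, i.e. 64 bits, printed as 16 hex characters. A quick Python comparison with a longer token (the 32-byte length is just an example I picked). Either way, length only protects against guessing, not against the list of URLs leaking.

```python
import secrets

short_token = secrets.token_hex(8)       # same shape as `openssl rand -hex 8`
long_token = secrets.token_urlsafe(32)   # 32 random bytes, URL-safe base64


def token_bits(num_bytes: int) -> int:
    """Bits of randomness in a token built from num_bytes random bytes."""
    return num_bytes * 8


print(len(short_token), token_bits(8))    # 16 characters, 64 bits
print(len(long_token), token_bits(32))    # 43 characters, 256 bits
```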


> Why should they make which datacenters have a resource cached be obvious public knowledge?

I agree that having it in the header for everyone is maybe too obvious. But you could otherwise infer that from timing.
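
A toy version of the timing inference, under some assumptions: you can time requests from a given vantage point, and you have timings for fresh, never-requested URLs on the same host as a "definitely a miss" baseline. The function name and the 0.5 threshold are arbitrary picks of mine. Only the first request for the target counts, since that request itself may warm the cache.

```python
from statistics import median
from typing import Sequence


def likely_cached(
    first_target_ms: float, control_ms: Sequence[float], ratio: float = 0.5
) -> bool:
    """Guess whether the target was already cached at this vantage point.

    first_target_ms: time of the FIRST request for the target URL.
    control_ms: times for fresh URLs that must have been cache misses.
    """
    return first_target_ms < ratio * median(control_ms)


print(likely_cached(22.0, [180.0, 200.0, 190.0]))   # True: much faster than a miss
print(likely_cached(185.0, [180.0, 200.0, 190.0]))  # False: looks like a miss
```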


Would removing cloudflare fix the issue? Then the problem is cloudflare related.

Your defense doesn't really work. Sure many entities could share blame but the one fix is getting rid of cloudflare.


Note that CF will also route relative to the site's plan. Enterprise sites are almost always routed to the closest DC, while if that DC is overloaded then lower tier websites, typically just Free sites, will get routed elsewhere (I suppose this is achieved via different anycast ranges where a specific DC is excluded). Although Discord, Signal, etc are almost certainly Enterprise sites.

I have this old site to test this (the list of sites is a bit old): https://cloudflare-test.judge.sh/


WTF? the trace endpoint allows CORS from any origin?!? Why?!


I doubt how useful it would be as an attack. As a single point of info it tells you next to nothing. As part of a composition of other indicators it would be the weak link in the chain, probably just causing noise in the not unlikely scenario where the person you're targeting is using a VPN.

If it was any less specific we'd be talking about a deanonymization attack that outs whether or not a target is still on Earth.


Oh, this attack would be a useful tool for e.g., identifying whistleblowers that travel a lot (e.g., in academia, military). If you know their Signal ID, you could send them images from time to time and then compare their coarse locations with travel information for a number of suspects.


I believe they'd have to accept the chat request before any images would be loaded?

Looking at the app options it seems to be possible to disable media auto-download entirely; there's tickboxes for Images/Audio/Video/Documents via Mobile Data/Wi-Fi/Roaming.


Yes, I agree. This attack won't work on competent / paranoid people. What I had in mind when writing the comment: a whistleblower who wants to inform the press about illegal practices in their company and installed Signal to communicate anonymously with journalists. Somehow, a detective working for the company got their Signal ID and contacted them, impersonating a journalist.


> not unlikely scenario where the person you're targeting is using a VPN

Do you think a large proportion of Signal users also use VPNs? I'd expect it would be a higher proportion than the general population but still only a small minority.


> Do you think a large proportion of Signal users also use VPNs?

It is reasonable to assume that interesting Signal users mostly use a VPN as an extra protection layer.


Being 'interesting' doesn't make you more likely to understand VPNs and opsec. I expect it makes you more likely to try, but there's a good chance of doing it ineffectively.


I disagree, it does significantly increase the likelihood. Like having cancer makes you significantly more likely to know a lot of medical facts about cancer.

If you fear for your life you are much more likely to have spent time researching how to protect yourself digitally.


Fair point. But there are a lot of educational resources for whistleblowers and others. OPSEC is crucial nowadays.


There's a lot of nonsense too. In another HN thread, someone was explaining to me that email is more secure than Signal, and desktops more secure than phones - and they had a link to someone's blog to prove it.

That's a HN reader. For the non-technical, it is a minefield.


for "normal people", that's a pain, but with enough resources,...

Although, it has edge use cases even for "normal people":

Eg. you suspect your coworker to be catfishing you on eg. discord, you know that he's in your city now, verify, then wait for him to leave for a vacation to somewhere abroad, check again.


This is actually pretty smart, and shows that this exploit could be chained with other information to identify a specific individual. This could also be used to e.g. check which world-travelling reporter is communicating with you.


It's not an edge case. Using multiple sources of information to paint a more complete picture is the norm. That's how marketing profiles work, for example.


Cloudflare does serve me from France. When I'm in Australia. (My ISP bought some IP addresses that were originally regional to France, back in the early 90s.)

So though this does have implications, the assumptions they utilise, like always, are not universal.


> My ISP bought some IP addresses that were originally regional to France

Cloudflare uses anycast, and IP geolocation is not how anycast works.


That may be true. But you still need to explain why Cloudflare serves me from France, and not Sydney, in that case.


Wow doesn't that make things really slow due to the RTT of the acknowledgements?


Australia. Our fastest networks are pathetically slow.

The L2 FTTN parts of the NBN have been known to have an RTT in the range of minutes, for some locations.

My own varies from 5ms, for those who don't assume my geography, out to 890ms for those that do.


It gets more interesting when you think about the impact on groups. Sending an image to a group is enough for all devices associated with that group to be identifiable from Cloudflare's side, who additionally see a giant chunk of unencrypted traffic from the same client addresses going to other web sites. Given Cloudflare's less-than-straight approach to sales, it is astonishing the words "secure" and "Signal" ever appear in the same sentence.

Cloudflare gets to see a fuckton of metadata from private and group chats, enough to trace who originally sends a piece of media (identifiable from its file size), who reads it, when it is read, who forwards it and to whom. It really doesn't matter that they can't see an image or video, knowing its size upfront or later (for example in response to a law enforcement request) is enough.


> Given Cloudflare's less-than-straight approach to sales, it is astonishing the words "secure" and "Signal" ever appear in the same sentence.

This is an overly binary take. Security is all about threat models, and for most of us the threat model that Signal is solving is "mainstream for-profit apps snoop on the contents of my messages and use them to build an advertising profile". Most of us using it are not using Signal to skirt law enforcement, so our threat model does not include court orders and warrants.

Signal can and should append some noise to the images when encrypted (or better yet, pad them to a set file size as suggested by paulryanrogers in a sibling comment) to mitigate the risks of this attack for those who do have threat models that require it, but for the vast majority of us Signal is just as fit for purpose as we thought it was.


Hello, I'm an organizer for a system to coordinate multiple mutual aid networks, many of which are only organizing by Signal & Protonmail exclusively because they think they're secure and private.

People who are doing work to help people in ways the state tries to prevent (like giving people food) rely on this tech. These are the same groups who were able to mobilize so quickly to respond to the LA fires, but which the Red Cross & police worked to shut down.

This impacts the people who are there for you when the state refuses to show up. This impacts the future version of you who needs it.

Most people aren't disabled, yet. Doesn't mean they don't need us building infrastructure for if/when they become disabled.


What groups did the police and Red Cross shut down? Any links?


In any geopolitical crisis, you tend to have victims on both sides be prevented from getting relief, except when the one side is imperial.

The powerful entities tend to prohibit relief to the oppressed side, even making it illegal.


I’m also thinking of more “mundane” things, like red states with “charitable feeding” laws that in effect make it illegal to feed the homeless without large amounts of red tape.

But, truly, I think you’re right to highlight wars.

https://www.salon.com/2023/08/07/criminalizing-the-samaritan...


Someone should tell anyone who seeks confidentiality that no email is secure. Use Signal and enable the data retention (i.e., automatic message deletion) feature. By itself that is not perfectly secure, but it's a start.


The people involved are likely all using Protonmail. So that would mean TLS for the connection to Protonmail with E2EE for messages passing through Protonmail.

Not sure that encrypted email in general would be less secure than, say, Signal. Since Signal is an instant messenger on a phone it might actually be less secure[1].

[1] https://articles.59.ca/doku.php?id=em:emailvsim


This is why I say that it's overly binary, not incorrect. Some people do have such needs, and Signal can and should fix this for those people.


for people who think protonmail is secure: it's secure to the same level as mail.yahoo.com :)


Maybe not individual warrants (at least not warrants to do non-scalable collections like hardware bugs in one's phone, i.e. warrants that most users, with high probability, are not subject to). But mass surveillance, e.g. NSA, even with 'mass warrants' (e.g. Verizon-FISA warrant) that everyone is subject to, is probably in most people's attacker model. I don't have a study handy, but it seems reasonable that most users use Signal to protect against mass surveillance, and Signal advertises itself as being good for this.

Also Marlinspike and Whittaker are quite outspoken about mass surveillance.

If cloudflare can compile a big part of the "who chats with whom" graph, that is a system design defect.


I highly doubt that signal does anything to help with mass surveillance. Signal started keeping people's name, photo, phone number, and contacts in the cloud protected by a "secure" enclave the NSA almost certainly has access to and hackers already got into (https://community.signalusers.org/t/sgx-cacheout-sgaxe-attac...) and even leaving all that aside, all anyone needs is a PIN that can be trivially brute forced. (https://www.vice.com/en/article/signal-new-pin-feature-worri...)


I thought it was digits only but see there's always been the option to use an alphanumeric passphrase as the "PIN". That prevents brute-forcing for anyone that bothered to use one, right?


It was only digits initially (https://old.reddit.com/r/signal/comments/oc6ow4/so_a_four_di...), with nothing preventing very easy ones like "1234", but even after they fixed it they continued to call it a PIN, which many people would just assume is a number ("number" is right in the acronym), and often a very short one. Most people didn't want to set a PIN at all; they'd been nagged about setting one and then got nagged again and again to reenter it.

It was not clear to most people that their highly sensitive info was being uploaded to the cloud at all let alone that it was only protected by the PIN. I wouldn't be surprised if a lot of people picked something as simple as possible.

https://old.reddit.com/r/signal/comments/gqc2hu/the_new_pin_...


Their announcement post says "at least 4 digits, but they can also be longer or alphanumeric", though maybe the feature had launched before that was written? https://signal.org/blog/signal-pins/

Far from ideal I agree.


> Signal can and should append some noise to the images when encrypted (or better yet, pad them to a set file size as suggested by paulryanrogers in a sibling comment) to mitigate the risks of this attack for those who do have threat models that require it

Adding padding to the image wouldn't do anything to stop this "attack". This is just watching which CF datacenters cache the attachment after it gets sent.


Right, my bad on the ambiguity—I was replying to the OP's concern about image sizes, not the attack in TFA:

> It really doesn't matter that they can't see an image or video, knowing its size upfront or later (for example in response to a law enforcement request) is enough


That makes sense. Thanks for the clarification, my bad!


I think the threat model of enough signal users to matter is nation-state actors, and signal should be secure against those actors by default so that they may hide among the entire signal user population


>It gets more interesting when you think about the impact on groups. Sending an image to a group is enough for all devices associated with that group to be identifiable from Cloudflare's side,

Doesn't this open up the possibility to identify groups that have been infiltrated by spies or similar posers? If you use this method to kinda-sorta locate or identify all the users in your group and one or more of those users ends up being located in a region where you should have no active group members then you may have identified a mole in your network.

Just thinking out loud here since there's no one else home.


>If you use this method to kinda-sorta locate or identify all the users in your group and one or more of those users ends up being located in a region where you should have no active group members then you may have identified a mole in your network.

...unless they happen to be using a VPN for geo-unblocking reasons or whatever.


If you're in a group like this where people are seriously concerned about their location being discovered by governments or by their own contacts, anyone in that group who is not already on a VPN all the time is either ignorant or nuts.


Communication of any sort over any channel risks sharing location information. Silence is secrecy.


I wonder if we'll see assets being padded to some common byte sizes to combat this.


Hi there, Signal dev here. We do, in fact, pad attachments to a limited set of bucket sizes.
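
For anyone wondering what bucket padding looks like, here's a minimal sketch. The power-of-two buckets and the 1 KiB floor are assumptions picked for illustration, not Signal's actual bucket set, and `bucket_size` / `pad_to_bucket` are made-up names.

```python
def bucket_size(length: int, smallest: int = 1024) -> int:
    """Smallest bucket (smallest, 2*smallest, 4*smallest, ...) that fits length."""
    size = smallest
    while size < length:
        size *= 2
    return size


def pad_to_bucket(blob: bytes, smallest: int = 1024) -> bytes:
    """Zero-pad blob up to its bucket size.

    A real scheme would pad before encrypting and carry the true length
    inside the encrypted payload so the receiver can strip the padding.
    """
    return blob + b"\x00" * (bucket_size(len(blob), smallest) - len(blob))


print(len(pad_to_bucket(b"x" * 3000)))  # 4096
```

With buckets like these, an observer only learns which bucket an attachment falls into, not its exact size.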


Nothing stops Cloudflare from inspecting the file contents, or using a hash to distinguish between identically-sized files.

The only reason we assume they don't do this is because it's a waste of resources for no good reason. But what if somebody gave them a good reason?


Aren’t the files end-to-end encrypted? How would they inspect the files?


yeah, the person you're referring to is confused because the Cloudflare HTTP service terminates TLS and presents a Cloudflare certificate, but that doesn't have anything to do at all with Signal's E2EE which is not based on HTTPS PKI


Last time I used Cloudflare I think their settings default to only "Origin SSL/TLS" (or whatever they call it), which wouldn't encrypt anything between Cloudflare and the origin, it would only encrypt data between Cloudflare and the end-user/browser.


But the Signal client encrypts images before sending them to the Signal server. If it padded out the images at that point, the images would all be indistinguishable from each other unless Cloudflare were actually able to break the encryption (which would completely undermine the entire security model).


Ah yes, I'm sorry, I mistook the context. If Signal encrypts the images E2E, you're right that it wouldn't matter what Cloudflare does, especially if padded.


So the image is uploaded for each recipient with an individual key?


TLS doesn’t matter for End-to-end encrypted stuff though, you could exchange the data over Telnet and it would still be secure. The content itself is already encrypted before being transmitted and can only be decrypted by the receiver.


AFAIK the attack described by OP only works if the attacker knows the (randomly generated) URL of the image, which probably means they have a Signal client that can decrypt the image already. So the secrecy of the content is not at issue. The question is whether some specific person has received the same image, and from where.


Part of his attack requires disabling the cache on his (sender) side so that he doesn’t pollute the cache. That implies that both sides of the conversation share the same URL, which means Cloudflare could assume two IP addresses requesting the same URL on the Signal attachment domain are participating in a shared conversation.


Yeah, that's a problem. It is leaking metadata, not content.

Ideally, the image should be padded, encrypted with a different key, and given a different URL for each user who is authorized to view it. But this would increase the client's burden significantly, especially in conversations that include more than two people.


> , it is astonishing the words "secure" and "Signal" ever appear in the same sentence.

You misspelled "I do not understand what end to end encryption means"


It could be useful for correlation.

Say for example that you're an investigating agent in regular contact with someone.

A single data-point wouldn't mean anything. However, a sequence of daily image retrievals might tell you that they spend 90% of their time in WA and 10% of their time elsewhere.
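
As a toy illustration of that tally (the observations below are invented, and `location_profile` is just a name I made up), assuming each observation is one coarse region or data center code:

```python
from collections import Counter
from typing import Dict, Iterable


def location_profile(observations: Iterable[str]) -> Dict[str, float]:
    """Turn a sequence of coarse location observations into fractions."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {location: n / total for location, n in counts.items()}


# Ten invented daily observations.
print(location_profile(["SEA"] * 9 + ["DFW"]))  # {'SEA': 0.9, 'DFW': 0.1}
```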

That information alone still might not mean anything, but if you also have a specific suspect in mind, it may help confirm it. Or if you have access to the suspected person directly, if you're able to also befriend their "clean" profile, you might be able to pull the same trick and correlate the two location profiles.

De-anonymisation isn't about single pieces of information, but all information helps feed into a profile to narrow suspects or confirm suspicions.

( By "agent" I just mean a person, not an AI agent nor Law enforcement, who could presumably just get the information more directly from cloudflare. )


There's probably at least a few instances where you send someone you think is American a picture but it gets cached in Moscow, or vice versa. Or you post a meme to a Californian left-wing group and it gets cached in DC. Not hard to imagine situations where getting an unexpected rough location could be a valuable signal.


>Or you post a meme to a Californian left-wing group and it gets cached in DC. Not hard to imagine situations where getting an unexpected rough location could be a valuable signal.

Not really. Any public meme group is inevitably going to be monitored by intelligence agencies, and you should assume as such. Even if it isn't, I can imagine agitators from the other side joining the group with a Russian VPN to poison the well. If there's a private group of people that you supposedly trust, any competent mole is going to be using device/network level VPN to cover their tracks. Otherwise they're 1 click away (eg. if someone shared a link) from an opsec fail.


I would bet money almost no public meme groups are monitored by any intelligence agencies. And the few that are, are mostly so only in the sense of being casually co-opted by state-sponsored trolls with almost no attention from actual intelligence agency staff (in the way this thread implies, with investigations and deanonymization and such).


I'm sure they're "monitored by intelligence agencies" in the sense of having a line in a database/report somewhere (that probably no-one reads). If the technique mentioned in TFA can be used automatically (and I see no reason it shouldn't) then it will probably be incorporated in due course (if it hasn't been already) - it doesn't have to be 100% accurate, it's just one more datapoint to add to the mix.


you don't have to "befriend" them. you send a friend request because that defaults to a push notification for users with the discord app on their phone. Now, with signal, i don't use it so i don't know how initial chats start, or whatever. The discord one is 0-click because the PFP in the friend request is the payload delivered via PUSH.

And to someone else's point - they had to block the request on their end with a MITM to do the 1-click version on signal. No such MITM is needed with the friend request.

As an aside, one time i got doxxed hard in an IRC channel with several hundred active users. I had a suspicion of who it was, and i knew they lived in chicago. So i "accidentally" sent a link to "screenshot proof" that was hosted on one of my domains. there was 1 immediate click. instant. Chicago. "accidentally" because it looked like i pasted an email body.

Packed the real screenshot and a complaint to the ircadmin. they said "and so you dox them back?"

can't win for trying.


You can also ping the same person multiple times, like once a day at different times of the day. That provides a more complete range.


It's not stretching it. The expectation is that Signal does not reveal any observable aspect of your IP address or location when receiving messages on it.

Whether this specific level/type of deanonymization is a problem for your particular use case is an entirely different question. Personally, I wouldn't even care if mutual contacts were to see my IP address outright (and they do for calls), but I'm not every user.


I don't care if users see "my" ipv4 because cgnat. I think i don't care if they can see my ipv6 because each machine gets a /64 to itself, that's the logic, right?

But my PBX and my matrix server both use coturn. Our 10 user "private" PBX we have to VPN into a fortigate in a DC to use, but to my understanding, there's literally no way to eavesdrop on those calls without already compromising the server it's running on, and if that's the case, no extra VPN steps or whatever will help.

anyhow even with a real, publicly routable IP, stock windows 11, stock macos (used to be true), and most linuxes won't get compromised by stuff like backorifice or whatever else l0pht put out as "remote administration tools". that is, there usually aren't any listening ports on a public IP these days. Shield's Up!


> to my understanding, there's literally no way to eavesdrop on those calls without already compromising the server it's running on

That's probably correct (with the caveat that I suspect NSA/FSB/MSS/Mossad/whoever can reasonably be assumed to have backdoored Fortinet)

There is still the problem that an attacker with "global passive observer" capabilities (which almost certainly includes most non 3rd world nation states, and probably a few of the more problematic 3rd world ones too) can still do traffic analysis to uncover your social network (or criminal/terrorist/whistleblower/journalistic network) by identifying the call traffic endpoints.


>whoever can reasonably be assumed to have backdoored Fortinet)

Considering the almost weekly discovery of fortinet vulnerabilities that seems like a rather low bar


> I think i don't care if they can see my ipv6 because each machine gets a /64 to itself, that's the logic, right?

I suspect you're looking at that wrong.

It's each internet connection that gets a /64, not each machine. Your ISP hands you a /64 and you can do whatever you like with it on your home(/corporate) network.

So you can choose from about 18 quintillion (2^64) IPv6 addresses for any machine behind your ISP/internet connection, but the top half of your IPv6 address uniquely identifies that connection, and your ISP can tie that to your account/payment details, with 4 billion times as much precision as an IPv4 address.
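
The arithmetic, checked with Python's `ipaddress` module. The prefix is from the 2001:db8::/32 documentation range, just as an example.

```python
import ipaddress

# Example /64 from the IPv6 documentation range.
net64 = ipaddress.ip_network("2001:db8:0:1::/64")

# Addresses available inside one /64.
addresses_in_64 = net64.num_addresses

# The identifying "top half" is 64 bits; a whole IPv4 address is 32 bits.
precision_ratio = 2**64 // 2**32

print(addresses_in_64)   # 18446744073709551616
print(precision_ratio)   # 4294967296
```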


> It's each internet connection that gets a /64,

i get a /48, which i can delegate the prefix to 255 subnets of size /64, so each machine on my LAN gets a /64. this is Prefix Delegation, part of DHCPv6, aka DHCPv6-PD

edit: this is still "new" in that half the consumer routers only partially support it. but afaik it was in the spec for ipv6 that each node should be a /64, so realistically my LAN having each node with /64 is per spec, and machines that are NAT behind a single /64 at the gateway are out of spec and part of the reason that no one uses ipv6, IMO...


That still means your /48 identifies you with much higher precision than a cgnat-ed ipv4 address ever could.


this isn't some gotcha directed at you; but isn't that true if i have a public ipv4 as well? also an adversary would have to know that i am actually using the entire /48, that the ISP does PD, etc which means a skid won't. a government will, but a government isn't gunna fiddle with ipv6 when they can just subpoena the DCs my data traverses and get the same info.

If i visit some site via v6 on my desktop today and in a month from my phone, at home via v6 over wifi, what percentage of companies will pool those two devices (assuming no pooling from merely being my device, etc)? Either ipv6 is a nightmare or it's the utopia we were promised; i will accept no compromises.


Exactly. Especially when considering that Signal was often advertised as that *one* privacy friendly open-source messaging solution in a world dominated by data-collecting demons like WhatsApp, etc. I don't think even WhatsApp lets such status details leak; notwithstanding whatever they might be doing with the user data on the backend.


I can send a link in Whatsapp to a domain I control and track if clicked. How is that different?


The difference is that your target needs to actually click it. For this, they don't.


"Deanonymization" doesn't have to refer to a full exact address. There are people who wish to conceal which country or region they live in, which this cripples.

There was a real example of that amount of information being relevant in the Silk Road investigation. Ulbricht accidentally revealed his timezone early on, which was useful to US authorities since it narrowed him down to being in the US, whereas without that information he could have been from anywhere in the world.


Not really.

Anyone who wants to conceal what continent they're on will also be using a VPN 24/7, or will have the proxy setup in Signal (AKA running 24/7), which defeats this.


Yep: If your threat model includes an attack like this and you're not always on a VPN already, you're likely already compromised.

This is a neat demo, but it should not fundamentally alter the way that anyone is using Signal. Either it doesn't matter to you or you already have mitigations in place.


> If your threat model includes an attack like this

The problem is, nobody's threat model includes state level attackers, until one day it does.

Back when Ulbricht was publicly asking questions using an easily uncovered identity, he wasn't thinking that in a few years he'd have the full force of every relevant TLA in the US (and Five Eyes/14 Eyes) trying to track him down.


But he also chose to go on and found a darknet narcotics service. Most people don't do something like that.

Yes, it's vogue right now to speculate that what you're doing right now could suddenly become illegal in a new administration, but if that happens tomorrow, most of us would be one of hundreds of thousands who are all in the same boat. For that reason, most of us won't get targeted retroactively for behaviors that were legal at the time, and we have the option to reevaluate our security posture when the political landscape changes.

But yeah, if you're actively speculating about starting an illegal service today, you should definitely have a better security posture than Ulbricht did.


> Yes, it's vogue right now to speculate that what you're doing right now could suddenly become illegal in a new administration, but if that happens tomorrow, most of us would be one of hundreds of thousands who are all in the same boat

I'm probably more paranoid than needed, but I'm way less sure than you seem to be about being able to hide as one of a few hundred thousand needles in the US public haystack.

I, for one, would be terrified right now if I were the child of illegal immigrants. The hateful portion of the hard right are gleefully looking forward to ICE rounding up hundreds of thousands of people.

You should probably be concerned if you were publicly pro-choice a few years back. Or if you came out as trans. Or got gay married. Or any of probably hundreds of other things that most people would have thought perfectly safe and socially reasonable in the recent past, which are looking much less so today.


> but if that happens tomorrow, most of us would be one of hundreds of thousands who are all in the same boat. For that reason, most of us won't get targeted retroactively for behaviors that were legal at the time

I'm sure that would be part of any oppressive government's plan. They wouldn't go after people for their past "transgressions" as long as they keep their heads down, do as they're told, and don't cause any trouble. At that point you're morally compromised.


When I was ~15 and this was ~2004, some friends and I ran a forum with a lot of users and did some bad things where we would track down repeat banned users and screw with them. (In our defense, they were screwing with us.)

We used everything, from browser fingerprinting (and EFF only made the world aware of it 6 years later), looking them up in databases, tracing every digital evidence they left, etc.

Every little thing counted. What I learned is that people leave a lot of traces and you can collect these traces to dox them. The way you write is even sometimes fairly identifiable.


If I know someone on Signal I can now check if they’ve left the country.

Or send this to a bunch of Signal users, one of whom you suspect is a particular person. If you know that the person you are looking for is going to travel, you can send it once before the trip and once after, then see which of these users was in the home city and subsequently in the destination city.
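
A rough sketch of that before/after filtering, assuming you have already recorded which data center cached the attachment for each account on each send. The account names and airport-style codes below are made up for illustration:

```python
def matching_candidates(before, after, home_pop, destination_pop):
    """Return accounts seen at home_pop before the trip and at destination_pop after it."""
    return sorted(
        account
        for account, pop in before.items()
        if pop == home_pop and after.get(account) == destination_pop
    )

# Hypothetical observations: account -> data center that cached the attachment.
before_trip = {"acct_a": "ORD", "acct_b": "ORD", "acct_c": "LHR"}
after_trip = {"acct_a": "ORD", "acct_b": "NRT", "acct_c": "LHR"}

suspects = matching_candidates(before_trip, after_trip, "ORD", "NRT")
print(suspects)
```

Only accounts that match both observations survive, so each extra trip you can observe shrinks the list further.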


A VPN obfuscates this. Assuming a target is even remotely aware, you might think they are in Australia, while they're actually in Nova Scotia


Say I send a message to someone who has a phone with push notifications enabled, showing message previews. Will the phone still be connected to the VPN when it wakes up to display the message? Because my iPhone doesn't seem to stay connected to my VPN when it sleeps, at least not reliably.

There really should be a "never use the internet without VPN" mode on devices.


That exists on Android. VPN on ios is known to be rather leaky.


Valid point. Afaict, the VPNs I've used route all network activity regardless of phone state, but that's likely dependent on the service.


I don't see how that can work for the push packet itself, cause I thought that's specially handled by some low-power hardware on the phone while the main parts are shut off. Unless that hardware is also managing the VPN connection, which I doubt.

So if there's no always-on hardware maintaining that VPN connection, probably the phone is going to wake up without it. And even if it auto-reconnects, it'll probably load stuff before it's connected to the VPN.


Yeah, the VPN could probably only hide location if mobile data is turned off, so the packet doesn't hit the mobile network, and only wifi calling/messaging is used.


The real attack is that a law enforcement agency can trivially subpoena CloudFlare with the attachment URL, and they will hand over the IP address of the recipient of the image along with whatever other requests they made through the CDN, which can pretty precisely and rapidly de-anonymize you.


Indeed, "incredibly precise estimate of the user's location" feels like an exaggeration. But still, very interesting!


I'd say it'd be useful for very specific use cases. Such as finding out what country Jia Tan, the XZ Utils backdoor attacker, is in.


I wonder if it'd be a good idea for Signal to implement a "simple" mode that would deactivate most features in order to reduce the attack surface for people who really think they are being targeted.


Caching attachments at a single nice, big, juicy honeypot like CloudFlare is one of the reasons Signal's privacy guarantees don't feel totally solid to me. I get that it's pragmatic, but feel there must be a better way.

Does the caching occur even if both users are online when the attachment is sent?


Even time zone leaks are privacy issues, and the leak we're discussing is more fine grained than time zone.


It only takes 33 bits to identify someone. This reveals a couple of bits.
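
For anyone wondering where 33 comes from: it's log2 of the world population. A quick sketch, assuming roughly 8 billion people and a made-up region of 20 million to show how a location leak converts to bits (the real figure depends on how many people share the data center):

```python
import math

WORLD_POPULATION = 8_000_000_000  # rough assumption

def bits_to_identify(population):
    """Bits needed to single out one person among `population` equally likely people."""
    return math.log2(population)

def bits_revealed(candidates_before, candidates_after):
    """Bits learned when the candidate pool shrinks from one size to another."""
    return math.log2(candidates_before / candidates_after)

total_bits = bits_to_identify(WORLD_POPULATION)
# Hypothetical: the cache location narrows the pool to a region of 20 million people.
region_bits = bits_revealed(WORLD_POPULATION, 20_000_000)

print(f"{total_bits:.1f} bits to identify anyone, {region_bits:.1f} revealed by the region")
```

This treats every person as equally likely, which is the simplifying assumption behind the 33-bit figure.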


Not really. It's only true if the bits are uncorrelated, and you can acquire additional bits of information. I don't see how you can go from "this guy on the internet lives near Albuquerque, New Mexico" to "this guy is Walter Hartwell White, and lives at 308 Negra Arroyo Lane, Albuquerque, New Mexico, 87104" without massive opsec failures.


If you want to extend the analogy, Gus Fring's threat model for RFP contractors at the superlab required flying people into the United States and driving them for days before reaching the final destination. i.e. If you aren't selected for the final proposal, the most you should know is the lab is "somewhere reachable by driving from the United States".

Locating the superlab to within 800 miles would break Gus' threat model.

Combined with the information the police have, which is that a new form of "blue meth" is spreading across the American southwest, a reasonable conclusion would be that the "underground superlab" is where the meth is being manufactured. It's independent corroboration of a major manufacturing operation occurring in the United States in the exact region where a new drug is taking off.

This is useful, since it helps rule out the meth being smuggled in from Mexico. It also makes the lab a high priority target, because a DEA agent investigating doesn't need to liaise with a foreign government, and you can secure a domestic prosecution + American prison time instead of attempting to extradite the cooks.

It also allows me to send a detailed memo about the superlab to ASAC Schrader's office in Albuquerque telling him about a threat in his jurisdiction, rather than circulating a brief summary about this superlab in the weekly intelligence briefing sent to all high-ranking DEA officials they probably don't read.


Brilliant. Please consider writing a book about things like this.


Every little bit helps.

You can plot the timestamps of every message, read receipt and emoji reaction, which gives you the timezone and hints at work schedule, commute duration and vacations.
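
A minimal sketch of the timezone part, assuming you have collected UTC timestamps of the target's activity. The heuristic here (find the quietest 8-hour window and assume it starts at 23:00 local time) is my own simplification, not an established method, and the sample data is synthetic:

```python
from collections import Counter
from datetime import datetime, timezone

def estimate_utc_offset(timestamps, sleep_hours=8, assumed_sleep_start=23):
    """Guess a UTC offset (in hours) from the quietest `sleep_hours` window of activity."""
    per_hour = Counter(ts.astimezone(timezone.utc).hour for ts in timestamps)

    def activity(start):
        # Total events in the window of `sleep_hours` beginning at UTC hour `start`.
        return sum(per_hour[(start + i) % 24] for i in range(sleep_hours))

    quiet_start_utc = min(range(24), key=activity)
    offset = (assumed_sleep_start - quiet_start_utc) % 24
    return offset - 24 if offset > 12 else offset

# Synthetic target at UTC-7 who is active from 07:00 to 22:59 local time.
awake_utc_hours = [(local_hour + 7) % 24 for local_hour in range(7, 23)]
sample = [datetime(2025, 1, 6, h, 15, tzinfo=timezone.utc) for h in awake_utc_hours]

print(estimate_utc_offset(sample))
```

Real activity is much noisier than this, so you would want weeks of data before trusting the estimate.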

Often people will post photos or have profile pictures.

Say you have a photo taken at a random mcdonalds. That'd be 36'000 locations. Imagine cloudflare location and timezone help you narrow it down to new mexico. That's 80 locations. Small enough that you can look at every single one using street view and check where the photo actually was taken.

Now you can subpoena the McDonald's cctv footage and figure out who sent that picture.


You can almost certainly narrow down the McDonalds with a wide variety of things - this example is fairly contrived.

If you can see outside of the McDonalds for street view to be usable, you're almost certainly able to determine what country it is in, and potentially the exact location, depending on what is visible outside.

If it's a picture that shows the menu, well, street view isn't likely to be super useful, but you'd have a trivial time figuring out what country it is in at that point - menus vary from country to country, even when they are still in English.

New Mexico has relatively few McDonald's restaurants because New Mexico has a fairly low population - only 2.1m for the whole state. With that in mind, it seems unlikely that Cloudflare has a close enough POP for you to be able to specifically decide it's NM.

If I can see enough for Street View to be able to confirm location, it seems like I can just search via the data there and get far more narrowed down results. If I can see a Burger King and a Best Buy outside from the picture, I can just use one of the many mapping services with APIs to get a list of all McDonalds locations within a tenth of a mile of a Burger King and Best Buy and look through a smaller list. If I'm confident of the time zone, like you suggest we should be able to be, then that's an even smaller list.
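
Something like this, as a sketch. The coordinates and store names are invented, and in practice the lists would come from whichever mapping API you use rather than being hard-coded:

```python
import math

EARTH_RADIUS_M = 6_371_000
TENTH_OF_A_MILE_M = 161  # roughly 0.1 mile

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def near_all(candidates, landmark_sets, radius_m=TENTH_OF_A_MILE_M):
    """Keep candidates that have at least one landmark from every set within radius_m."""
    return [
        name
        for name, pos in candidates.items()
        if all(any(haversine_m(pos, lm) <= radius_m for lm in landmarks)
               for landmarks in landmark_sets)
    ]

# Hypothetical data for illustration only.
mcdonalds = {"store_1": (35.1000, -106.6000), "store_2": (35.2000, -106.7000)}
burger_kings = [(35.1005, -106.6005)]
best_buys = [(35.1008, -106.5995), (35.2500, -106.7000)]

shortlist = near_all(mcdonalds, [burger_kings, best_buys])
print(shortlist)
```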

I'm not saying this attack is useless by any means, but I don't see a world where the sharing of the pictures to begin with isn't the most significant opsec failure and doesn't open you up to being de-anonymized in a myriad of other ways.


>Often people will post photos or have profile pictures.

>Say you have a photo taken at a random mcdonalds. That'd be 36'000 locations. Imagine cloudflare location and timezone help you narrow it down to new mexico. That's 80 locations. Small enough that you can look at every single one using street view and check where the photo actually was taken.

Sounds like the bigger opsec failure is posting the pictures, and leaking the Cloudflare POP only makes the search slightly easier.


> Sounds like the bigger opsec failure is posting the pictures, and leaking the Cloudflare POP only makes the search slightly easier.

I would not define 3 orders of magnitude as "slightly easier".
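
For reference, the arithmetic on the figures quoted upthread (36,000 locations narrowed to 80), as a quick sketch:

```python
import math

all_locations = 36_000  # figure quoted upthread
remaining_locations = 80  # figure quoted upthread

reduction_factor = all_locations / remaining_locations
orders_of_magnitude = math.log10(reduction_factor)

print(f"{reduction_factor:.0f}x smaller, about {orders_of_magnitude:.2f} orders of magnitude")
```

So strictly it's a bit under three orders of magnitude, but the point stands.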


There is a fun post that explores this idea via an anime called Death Note.

https://gwern.net/death-note-anonymity


Repeat the attack daily for a few weeks and you might get a pattern of movement. Of course if the target hasn’t left their general area then this won’t help. But if you’re a nation state watching a target move between multiple international locations, you could match this up with passport travel data to significantly reduce the anonymity set.


Seems contrived. What type of a person cares about deanonymization attacks and nation-states trying to find him, but doesn't have an always-on VPN? Even without this attack, not using a VPN means you're 1 wrong click/tap away (if you accidentally clicked on a link) from leaking your IP.


Right, agreed that VPN is the primary mitigation against this from a user perspective. But opsec is hard, especially when the attack can be triggered by a notification when the victim might not be expecting it and might not have VPN enabled (e.g. maybe they only enable VPN when using Discord).

(But notifications are already a bad idea for opsec anyway.)


>But opsec is hard [...]

That's why the attack is contrived. If you have poor opsec you don't need this attack at all. You can probably get the victim's exact IP by getting him to click on a link, or sending him an email. If he has good opsec he's going to be using a VPN that renders this attack useless. For this attack to be valuable you need a guy who has such good opsec that you can't get his location any other way, but for whatever reason isn't using an always-on VPN.


Two or three very small opsec failures equals one massive opsec failure.


Combined with other information, it may identify someone reliably, just like you can with zip code, age and gender. For example, if you know this person is part of a group with members in several locations, or if you can corroborate someone's movements, etc.

For example, imagine someone suspected of sharing sensitive information with a journalist. They might have a short list of suspects, and use this technique to confirm which one it is. They might identify which journalist it is - maybe only a limited number cover this beat.


Or you want to find a specific journalist, and you find out that they just arrived in a certain city, and there are only three hotels in that city...


That doesn't tell you whether that journalist is investigating you. Identifying them as the recipient of a Signal message from a suspect is valuable information.


I mean to assassinate


Why are we talking about assassination?


It's leaking so many bits, idk what else you would call it. Deanonymization isn't a one-shot thing, and it's a spectrum, not a binary outcome.


CloudFlare has the actual IP address that viewed the image. Which means some powerful (or rich enough) actors can get it.

This is very very bad.


This was... always, the case though? For any CDN service? How do you serve traffic to people without knowing where to send it?


Agreed. Though a valid concern might be that a victim uses Signal because of E2EE, thinking no third party is involved in delivery, and not knowing or thinking about the CDN being used.


Onion protocol.


> cached in a data center near that user

Not necessarily. Cloudflare is very upfront that they do not cache everything, and the time things are cached can vary greatly.

The kid keeps talking about "deanonymization" and he has no idea what the term actually means.


> attacker can use the cache geolocation method to pinpoint the recipient’s location

Agree, good writeup, but also a stretch to say they are "pinpointing" anyone's location.


Send picture to multiple accounts, perhaps on different services, the links that are cached at the same data center can be more confidently believed to be related.


This is not unique to Signal. URL strings can contain identifying information regardless of where they are shared or posted. For example, if you send a link that ends with a string of characters, these may correspond to a geographic location or browser settings. Blogger URLs used to be geolocated, such as .ca for Canadian viewers. It is always safe to strip out unnecessary characters if you're paranoid.


WhatsApp has an option to disable link previews.

Surprised signal doesn't have this option.

I only message people I know on Signal anyway.

Edit: it seems signal does have the option


I had this same thought before reading the article - this isn't about link previews, it's about attachment caching


But previewing can involve automatically loading resources. This "attack" is very similar to CSRF in that your exploit involves making the victim load a specific resource. That's why in secure mail clients, nothing but plaintext should be rendered, and an optional "Load all resources" button is shown for when you trust the sender, and want to load any media elements that require HTTP onto your client.

Signal could mitigate this with something similar, where it didn't load the image file AT ALL, and instead showed a message:

<User> wants you to load an image from https://example.com/foo.png. Load image? > Yes > No


The difference is that it's not a resource controlled by the attacker; it's an attachment hosted by Signal. But yes, removing previews for everything would mitigate the issue.


Why would cloudflare ever operate a data center that only one user at a time is ever near?


Looks like it's possible to hit 2 datacenters due to load-balancing, which would narrow it down a bit more. Suppose you do this repeatedly as the target is moving around, hitting even more datacenters.


You underestimate the value of this piece of information taken at different times. It can be enough to know in which country a person was yesterday or is today.


Why does it need to be cached though?

The only case where it might be downloaded more than once is if the user has multiple clients. Not that common and still very little traffic.


That's why federated setups such as Matrix are better. It is much harder to deanonymize a set of users on different servers in a group chat.


Did you see the GIF? It's able to triangulate.


Mmmm "qualified deanonymization" perhaps?


Imagine sending a friend request to bin Laden's videographer and getting a reply from Pakistan while your entire military is looking for him in Afghanistan?

There's definitely cases where this is going to be immediately used. Shit, just using it to scrape Cloudflare for additional metadata on everyone from other user table leaks is probably valuable data. Even triangulation over time as they move around is going to get a more precise result. Maybe you find a vulnerability that takes that cloudflare node offline and run it again, repeat until you've got a fairly small radius they could be in.


Headline feels like clickbait :)


timing and location can usually prune things down to enough data about a person.


> (no other users near the data center).

Yeah and in that case there won't be a data center because who puts one in places without clients nearby? :)



