Faking Twitter unfurling to phish you (harrydenley.com)
189 points by wongmjane on Nov 8, 2021 | 67 comments



Judging from the comments, some are really confused about what's happening here.

The real trick is that TwitterBot and you see different pages. TwitterBot always clearly identifies itself (and there are other signals, like whether the request comes from Twitter's network infrastructure), so for it the flow is t.co -> attacker.site -> legitimate.site, and the card (technically, the unfurl) therefore shows the details of the legitimate site, including the coveted legitimate domain name. For you, attacker.site detects that you're not TwitterBot and does whatever phishing it needs to do. Of course, if you do check the domain name in your browser, it won't work... but let's be honest, that's just a fraction of people here, not even counting the general public.
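
This isn't the code from the article, just a minimal sketch of that user-agent check, assuming a small Flask app on the attacker-controlled domain; the domains and route are placeholders:

    from flask import Flask, redirect, request

    app = Flask(__name__)

    @app.route("/")
    def cloaked_redirect():
        ua = request.headers.get("User-Agent", "")
        if "Twitterbot" in ua:
            # Twitter's unfurler gets sent to the legitimate site, so the
            # card shows the legitimate domain and metadata.
            return redirect("https://legitimate.example/", code=302)
        # Everyone else lands on the attacker's phishing page.
        return redirect("https://phishing.example/login", code=302)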

Others ask why TwitterBot does redirections, and it seems that everyone here forgot that marketers love their Bit.ly and Sprinklr links so much that Twitter needs to make a concession here (and no, you can't just whitelist them, because some companies use their own different shortlinks like t.co, fb.me, g.co, msft.it, redd.it, and youtu.be).

Why not just directly serve the redirection as seen by TwitterBot? Because a) marketers and analytics, and b) services like Branch (app.link) and Adjust redirect users differently depending on their specific device (Windows vs macOS vs Linux (or even a specific distro!) vs iOS vs Android).


So could Twitter make two requests, one as TwitterBot and one anonymously, and then add a warning if they don't go to the same place?
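
Something like that check is easy to sketch client-side (Python with the requests library; the user-agent strings below are rough approximations, not Twitter's actual ones):

    import requests

    TWITTERBOT_UA = "Twitterbot/1.0"
    BROWSER_UA = "Mozilla/5.0"

    def final_url(url, user_agent):
        # Follow the redirect chain and return where it ends up.
        resp = requests.get(url, headers={"User-Agent": user_agent},
                            allow_redirects=True, timeout=10)
        return resp.url

    def looks_cloaked(url):
        # Warn if the bot and an "anonymous" browser land on different pages.
        return final_url(url, TWITTERBOT_UA) != final_url(url, BROWSER_UA)

Though, as noted elsewhere in the thread, an attacker could cloak on source IP rather than user agent, so this would only ever be a heuristic.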


The attacker doesn't need to detect whether TwitterBot is making the request. They can redirect every request to the spoofed (legitimate) site after posting the link, until the preview has been generated, and only then switch to the phishing destination.
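
A rough sketch of that time-based variant (Flask again; the timestamp handling and the one-minute window are illustrative assumptions, not measured behaviour):

    import time
    from flask import Flask, redirect

    app = Flask(__name__)
    POSTED_AT = time.time()   # roughly when the tweet with the link went out
    PREVIEW_WINDOW = 60       # assume the card gets generated within a minute

    @app.route("/")
    def switch_after_preview():
        if time.time() - POSTED_AT < PREVIEW_WINDOW:
            # While the preview is being generated, everyone is sent to the
            # spoofed (legitimate) destination.
            return redirect("https://legitimate.example/", code=302)
        # Afterwards, every visitor gets the phishing page.
        return redirect("https://phishing.example/login", code=302)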


That's what Google sometimes does, but it's considered rude by some. Plus, anti-bot software may accidentally thwart Twitter's checking bot.


All true but the really bogus part IMO is that by default it would unfurl to the actual, bad URL but if you remove the &amp=1 param it unfurls to the good domain. Why is that?


I'm not really sure; you'd have to ask Twitter, since the amp=1 thing is just generated by their mobile website and application. This is definitely a guess, but maybe some websites implement AMP by checking the referrer and redirecting to it, and Twitter interprets that as "let's backtrack to the last page, that's the canonical version" and uses that?


Security > marketers.

Whoever made this decision at Twitter should have a think about themselves.


> and no, you can't just whitelist them, because some companies use their own different shortlinks like t.co, fb.me, g.co, msft.it, redd.it, and youtu.be

It wouldn't be terribly hard to build a top-50 list of URL shorteners etc. that covers the vast majority of the traffic.


I think some URL shorteners allow editing the URL after creating a short link. So you are back to square one


If it stores the short link and destination url in a database that can be modified, yes.


> Others ask why Twitter does redirections, and it seems that everyone here forgot that marketers love their Bit.ly and Sprinklr links so much that Twitter needs to make a concession here.

As far as I know, users cannot view the metrics for t.co links, or am I mistaken about that?


You misunderstand. "Why twitter does redirections" is "why does twitter follow Location headers to get unfurl info / metadata", not "why does twitter have t.co", and the reason is that marketers use bit.ly etc., so twitter has to follow those redirects.

Marketers/users cannot view t.co metrics, but even if they could, they'd want to use their own URL shorteners anyway, I'm sure... so twitter has to have the t.co previewer follow arbitrarily many redirects.
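
"Follow arbitrarily many redirects" boils down to chasing Location headers with some hop limit; a minimal sketch in Python (the function name and the cap of 10 are arbitrary choices, not Twitter's):

    import requests
    from urllib.parse import urljoin

    def resolve(url, max_hops=10):
        # Chase Location headers manually instead of letting the library do
        # it, so every intermediate hop stays visible.
        for _ in range(max_hops):
            resp = requests.get(url, allow_redirects=False, timeout=10)
            location = resp.headers.get("Location")
            if resp.status_code in (301, 302, 303, 307, 308) and location:
                url = urljoin(url, location)  # Location may be relative
                continue
            return url
        raise RuntimeError("too many redirects")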


> they'd want to use their own URL shorteners anyway, I'm sure

Yes, a single dashboard to view their marketing campaigns (which Bitly and Sprinklr, among others, provide) is a very attractive option for marketers, to the point that I actually see shortlinks on companies' own websites. I personally disagree with the practice, but the simple fact is that these companies provide what marketers want.


This reminds me of a little joke link shortener I built[0] that allows you to set the various OpenGraph tags on the shortened URL. This lets you completely fake the link preview generated by most platforms that show you one. Even though I originally built it as a joke, I find myself using it pretty often to make links 'self-explanatory'. (A rough sketch of the idea is below the footnote.)

[0] https://github.com/radiantly/the-redirector
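
Not the linked project's actual code, but the general idea fits in a few lines: serve whatever OpenGraph tags the creator chose, and let human visitors bounce on to the real target (Flask; the query parameters are made up for illustration):

    from html import escape
    from flask import Flask, request

    app = Flask(__name__)

    PAGE = """<html><head>
    <meta property="og:title" content="{title}">
    <meta property="og:description" content="{desc}">
    <meta http-equiv="refresh" content="0; url={target}">
    </head><body><a href="{target}">Continue</a></body></html>"""

    @app.route("/r")
    def shortlink():
        # Preview bots only read the og: tags; browsers follow the refresh.
        return PAGE.format(
            title=escape(request.args.get("title", "")),
            desc=escape(request.args.get("desc", "")),
            target=escape(request.args.get("to", "/")),
        )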


Not the blog author, but practically speaking, faking OpenGraph tags would result in the same phishing capability (considering that most people don't check the cards carefully), but it'll still show that the link was not from that site. But again, most people would still click paypal-not-really.com or coinbase-is-not-controlling-this-site.com


Why does this require the extra step of using a burner account? Why not tweet https://twitter-unfurl-faker.herokuapp.com/ from your main account and that's it?

Does Twitter only unfurl t.co URLs? If so, why would they write separate code for unfurling t.co with ?amp=1 vs without ?amp=1 ? And why would Twitter unfurl a t.co link past the first non-t.co URL? I guess that's the vuln, right, that they don't stop after the first non-t.co URL?


Sometimes, t.co links redirect 5+ times until the target domain is reached, so I guess fixing this would break a lot of twitter's content.


You mean they redirect through various different domains not affiliated with Twitter before reaching the end? Who creates these links? Is it people creating short links before posting to twitter? What's the purpose? Just tracking?


Yes, my bet is on tracking.


Wait, my original question is still open. "Why not tweet https://twitter-unfurl-faker.herokuapp.com/"? If Twitter won't unfurl non-t.co URLs, then that "would break a lot of twitter's content."


Is there any sort of within-a-page HTTPS "secured" function?

Like how banks have that "only type your password if we show your correct profile picture". Almost as if embedded tweets could be "signed" by twitter in a way that registers graphically in my browser, in a way not known to the server itself (i.e. putting the twitter logo next to a tweet is easily faked). But if content appearing to be from twitter were verified instead, and had my custom-chosen avatar next to each piece, showing that both my key and twitter's key had signed the text.

Maybe not making sense, if anyone wants to play this back clearer go for it :)


I think it's difficult-to-impossible to do anything secure within a page, because a malicious page can emulate virtually any kind of behaviour within the page. https://textslashplain.com/2017/01/14/the-line-of-death/

(For example, if you had some sort of "signed iframe", the page would probably find a way to show the part from twitter that says "verified" but cover up the part that it's supposed to be actually verifying with something else).


> (For example, if you had some sort of "signed iframe", the page would probably find a way to show the part from twitter that says "verified" but cover up the part that it's supposed to be actually verifying with something else).

This is the part where I imagined having a custom client side image. That way the server doesn't know what the "verified" image actually looks like. Could be a picture of my face, for example.


> That way the server doesn't know what the "verified" image actually looks like.

Right, but it doesn't need to - it just has to construct a page that has the "verified" image on the left and the malicious URL on the right. Which is very difficult to rule out.


How would it construct a page that has the verified image if it doesn't know what the image looks like?


It would construct a page that includes a part that's genuinely verified (so the browser displays the verified image) and a part that's malicious, but arrange it so that it looks like the verification goes with the malicious part.


Wow, this is a really great article. Thank you for sharing!

It discusses a fascinating point about browser UI: when the browser displays something outside its chrome, in an area where a malicious page could render arbitrary pixels, it must establish a visual bridge back to the "trusted zone" (the chrome), providing proof that it is in fact trusted content.

(The author points out that new APIs that allow writing to the entire screen mean it's hopeless and we're all doomed, though.)


signed web bundles [1] let you package up a page as a fixed resource, distributed with a signature to verify that the content of that page is what the author intended it to be, so a site like twitter that is embedding it can be sure that what they're embedding is always the original resource.

however, this is part of AMP, and so web developers who prefer the "do whatever the fuck you want" aspect of the internet push back against it. the ability to ensure that a web link contains the same content at a future date as it did when it was initially crawled was one of the things we destroyed along with AMP.

[1] https://github.com/WICG/webpackage


> Like how banks have that "only type your password if we show your correct profile picture".

Do they? My banks did that years ago, and they also stopped doing it years ago.


They are ineffective.

>Of the 63 participants whose responses to prior tasks had been verified, we were able to corroborate 60 participants' responses to the removal of their site-authentication images. 58 of the 60 participants (97%) entered their passwords, despite the removal of the site-authentication image

See https://security.stackexchange.com/a/19801 which summarises https://sites.google.com/site/ianfischercv/emperor.pdf


Some still do. I don't see the point though - a determined attacker could just make HTTP requests to your bank and substitute the parts they want to. It would still be on the attacker's domain, so still technically phishing... But if the image is an anti-phishing measure, it's not a great one. I suppose it could raise the bar to a successful attack a bit, but it certainly doesn't make one impossible.


Isn't that what CORS/same-origin policies prevent? The attacker domain can be prevented from loading the bank resources within the same context by the browser. If the request is made by the attacker domain instead and proxied to you, then it doesn't have your cookies to display the private identification.

In either case, the "correct profile picture" would not load.


> If the request is made by the attacker domain instead and proxied to you, then it doesn't have your cookies to display the private identification.

Why is that a concern? You try to log in on a phishing site. The phishing site tries to log in as you at your bank's actual website. Your bank sends the phishing site your picture. The phishing site displays your picture to you.


The bank can and will use quite sophisticated request flow analysis to prevent one party from making too many attempts, so this means an attacker must grab a botnet or similar and be careful to avoid detection.


Most people would not question having to type in their username for a fresh login - banks sign you out so quickly, and their "remember me" is often intentionally gimped. So users are trained to type their username into the field, and the bad site can proxy that to the bank and send back the image just fine.

Okta still includes this "feature" by default, and is among the reasons I will never trust Okta or any client of theirs.


You can keep iterating on this if you like, and some banks did, but ultimately the bad guy has the exact same information you've presented to the bank to get this "correct profile picture". Cookies. CORS headers. None of that matters. If you get the "correct profile picture" so does the bad guy and then they just forward it to you.

We already know how to actually solve this problem. WebAuthn.


They can add "redirected from xyz" to the card perhaps.


And also don't trust this link: https://t.co/MPesRJdK5y


Ok, but what's unfurling? As far as I can tell, this is just tricking the thing that tells you the target domain of a shortened link? But if you clicked the link, you could just see the link, right? How is this fooling anyone?


Unfurling is the process of fetching additional information (title, description, image) and showing that on the platform itself. Twitter does it, Facebook, Slack too.

On Slack you can implement custom unfurling that does more than just show the title/description/image. See docs here: https://api.slack.com/reference/messaging/link-unfurling. I'm currently building one such custom integration.
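
For a feel of that flow, here is a bare-bones sketch based on those docs, using the slack_sdk Python client; request signature verification and error handling are omitted, and the endpoint path and unfurl content are arbitrary:

    import os
    from flask import Flask, request, jsonify
    from slack_sdk import WebClient

    app = Flask(__name__)
    slack = WebClient(token=os.environ["SLACK_BOT_TOKEN"])

    @app.route("/slack/events", methods=["POST"])
    def slack_events():
        payload = request.get_json()
        if payload.get("type") == "url_verification":
            # Slack's one-time handshake when you register the endpoint.
            return jsonify({"challenge": payload["challenge"]})
        event = payload.get("event", {})
        if event.get("type") == "link_shared":
            # Build a custom preview for each link Slack told us about.
            unfurls = {
                link["url"]: {"title": "Custom preview", "text": link["url"]}
                for link in event.get("links", [])
            }
            slack.chat_unfurl(channel=event["channel"],
                              ts=event["message_ts"],
                              unfurls=unfurls)
        return "", 200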


> Ok, but what's unfurling?

I've never heard the term before now, but I interpreted it to mean following redirects to get the end page.

> But if you clicked the link, you could just see the link, right? How is this fooling anyone?

It fools you before you click the link. After you do, you're no longer fooled, as long as you pay attention to the URL bar. The obvious problem is people who don't pay attention to the URL bar.

Another problem might be that you're forbidden from viewing certain sites at work: you see a link that goes to news.ycombinator.com, knowing that's safe, but then end up on a forbidden site instead.

Another problem would be browser 0-days. A link to news.ycombinator.com would be safe assuming it hasn't been compromised itself, but a different website might spring a browser 0-day on you.


Unfurling is showing the preview of the link within the tweet.


Why would twitter trust the Location header and not just parse the URL given to them? This seems like a strange choice to rely on their backend lookup just to display the URL...


Twitter follows the Location header since the purpose of that header is to redirect to another page. It's considered a good user experience to display the final page as a link preview rather than the intermediary redirect.


Because that's how HTTP redirects work. A `t.co` link is simply a 301/302 redirect to another site with a `Location` header.
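
You can watch that mechanism yourself by fetching a shortlink without following redirects (the slug below is a placeholder, and the exact status code can vary by client):

    import requests

    resp = requests.get("https://t.co/XXXXXXXX", allow_redirects=False)
    print(resp.status_code)              # typically 301 for non-browser clients
    print(resp.headers.get("Location"))  # the next hop in the chain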


Trick’s on me, someone finally got me to look up what the heck some cryptocurrency thing is because this article made no sense otherwise.


Also this is much less sophisticated than fooling curl | sh, but people continue to insist that’s perfectly fine.


I don’t think I’ve seen anybody insist curl | sh is fine from untrusted sources.

In many contexts, curl | sh is an alternative to adding some kind of additional repository to install a third party package — and in most package managers this is done as root anyway, with arbitrary pre-install and post-install scripts.

I’m not really sold on how curl | sh (with https) is any less secure than blindly following steps to add a repo.

I used to strongly dislike curl | sh, and if there’s some looming security risk beyond accidentally trusting bad actors who couldn’t be bothered to go to all the effort of setting up a repo then I’d genuinely like to know.


I have seen plenty of curl | sh invocations that pass the "-k" flag to curl, meaning that curl will allow insecure connections even if there are invalid SSL/TLS certificates.


You can detect curl | sh server-side and respond with different content than the inspectable source. The link I typically cite isn’t loading for me but you should be able to find more info if you’re curious.


You can do the same with VirusTotal. Check the UA: if it's VirusTotal, serve a legit page; if it's not, serve the malicious one. Simple 301 redirects will be detected and presented to the analyst; URL rewrites won't.


This doesn’t make sense. Why doesn’t the Twitter bot serve the final site location after following all the redirects.

Then since the bot sees the correct site it redirects to the correct site


Because you can make a bot do what you want and a non-bot do something else.


Why not do it the other way around? Serve a page without a redirect to the Twitter bot, and serve a redirect only when the User-Agent header does NOT contain Twitter?


Because my domain evil.com cannot serve a page from uniswap.org (because it's a different domain).


Firefox is doing a horrendous job of rendering the text on this site for me, on Arch Linux. Is it just me?

https://i.imgur.com/ZP9wC85.png


Looks like some poor subpixel rendering. You might want to try looking for some settings to tune it.


Given that you're using a version of Arch Linux, Firefox, and focusing on one particular site, it's probably just you.


I happen to have the same setup as you, and I don't have such a problem.


This is pretty dumb.

It's like writing `[google.com](notgoogle.com)` and making out like it's a significant security flaw or new idea.


This is actually quite a bit more complicated. Your web browser (Firefox, at least) will show you the destination link if you hover over any link element. In the case in the article, the destination link is exactly how it is written. So how can we now trust that twitter's shortening links only go to twitter?


Isn't the destination link `t.co/XYZ` ?

That's what it has been for me on Facebook, Twitter, Instagram, etc.: their own mangled URL. And no, I don't trust those either.

>So how can we now trust that twitter's shortening links only go to twitter?

They never did.

Edit: Just checked his example, and yes. It looks like the hover link is still a random garbled `t.co/XYZ` and not `uniswap.org`. I'm still right and it's still pretty dumb.


Not really a logical phishing strategy: if the first domain looks safe and the attacker controls it, why wouldn't they just use it to serve a phishing page instead of needlessly redirecting?

A better example would be to show "google.com" and somehow redirect to "phishing.com"... but that's not really possible without control of "google.com"


I don't agree with your analysis. There are three domains at play: twitter-unfurl-faker.herokuapp.com, uniswap.org, and harrydenley.com. The first is the real link, the second is what Twitter's link previewer gets redirected to, and the third is where the user gets redirected to.

It seems to me that the author does not need control over the second domain, just the first and third. But the user will never see the first URL, only the second.


As I understand it, the webserver at twitter-unfurl-faker.herokuapp.com just dynamically redirects based on the user-agent.

The attacker doesn't need control over uniswap.org or harrydenley.com to make this work.

They only need control of harrydenley.com if they want to serve a phishing page. But as I said above, this is redundant and they could just use this domain to also serve the redirection. Example below:

* (Twitter bot) phishing.com -> redirects -> fishtanks.com

Twitter bot makes shortened link t.co/aaa (but the preview shows fishtanks.com)

* (User) t.co/aaa -> phishing.com


>but that's not really possible without control of "google.com"

An open redirect bug in the phished site should allow this scenario (a sketch of such a vulnerable endpoint follows the steps):

A) Set up the offending link, redirecting to the phishing.com site.

B) When the twitter bot visits, redirect back to a safe page on the original site for the summary. I understand twitter shows either the original URL or the final URL, but doesn't care about phishing.com in the middle.

C) Don't redirect back for non twitter traffic, so they end up on phishing.com.

A complex scenario, but perhaps enough to show that redirect bugs also matter.
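
For reference, the kind of open-redirect endpoint on the legitimate site that step A relies on looks roughly like this (a deliberately vulnerable Flask sketch; the route and parameter names are made up):

    from flask import Flask, request, redirect

    app = Flask(__name__)

    @app.route("/go")
    def open_redirect():
        # Vulnerable: the target comes straight from the query string with no
        # allow-list or same-origin check, so ?next=https://phishing.example works.
        return redirect(request.args.get("next", "/"))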


Recent memory tells me Google has one such issue now and had another one before. You can have a google.com page like google.com/awesomesite/

And earlier there was a Google redirector that had been forgotten about and was being used to redirect to phishing sites.





