Not much you can do other than fight a war of escalation.
If you want to give the phisher lots to do, create, say, 10 versions of your site where the HTML/CSS/JS are tightly intertwined, such that html version "a" only works with css/versiona.css and js/versiona.js. Mismatches create a site you can't view, interact with, etc.
Then, for each version, put your countermeasures in your main js and css files, and tweak your site so that it doesn't work at all without those main js and css files. Add subresource integrity to each version of the main html that pulls them in. Obfuscate each version of the js and css.
And vary up the countermeasures per version. Like displaying the bitcoin address with css (selector::after { content: 'abc' }), with an image, dynamically with obfuscated javascript, etc.
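For the "dynamically with obfuscated javascript" flavour, something along these lines (just a sketch; the chunked placeholder address and the #btc-addr id are made up, and a real version would be run through an obfuscator on top):

```javascript
// Sketch of the "render the address dynamically" variant. The address is a
// placeholder and #btc-addr is a hypothetical element id.
(function () {
  // Keep the address out of the markup and out of any single string literal,
  // so a naive find-and-replace over the HTML/JS never sees it whole.
  var chunks = ['ss', 'Addre', 'AReal', 'lyNot', 'pleOn', '1Exam'];
  var addr = chunks.reverse().join('');
  document.getElementById('btc-addr').textContent = addr;
})();
```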
Round robin which version of the page is served. This will make caching anything hard for the phisher, and the variations of countermeasures might get them to find a softer target.
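Server-side, the round-robin can be as dumb as this (plain Node sketch; the versions/N/index.html layout is hypothetical):

```javascript
// Rotate through the pre-built, intertwined HTML/CSS/JS bundles per request,
// so a cached copy of one version is useless against the next request.
const http = require('http');
const fs = require('fs');
const path = require('path');

const NUM_VERSIONS = 10;
let next = 0;

http.createServer((req, res) => {
  if (req.url === '/' || req.url === '/index.html') {
    const version = next++ % NUM_VERSIONS;
    const page = fs.readFileSync(path.join(__dirname, 'versions', String(version), 'index.html'));
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end(page);
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```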
Shitload of work, though, just because some ahole wants to skim money.
If they just reverse proxy the site (and strip or change subresource integrity tags), how would any of this actually help?
I don't think there's really a way to prevent someone from creating a phishing version of your website. You can get alerted to it by checking your web server access logs, but beyond that you can probably only take legal measures.
The proxy is altering the JavaScript, css, and HTML to get rid of various countermeasures.
Maybe one version dynamically pulls in crucial JS with subresource integrity, obfuscated so it isn't as simple to strip out. Another version where the main obfuscated JS bombs if the subresource integrity has been stripped from the HTML, etc.
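The "bombs if SRI was stripped" version could be as simple as the main script checking its own script tag (sketch only; 'main.js' is a placeholder filename, and a real version would fail less obviously):

```javascript
(function () {
  // Find the tag that pulled this script in and make sure the integrity
  // attribute survived whatever the proxy did to the HTML.
  var tag = document.querySelector('script[src*="main.js"]');
  if (!tag || !tag.getAttribute('integrity')) {
    document.documentElement.innerHTML = '';
    throw new Error('integrity check failed');
  }
})();
```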
The approach I described means they need to solve for X different variations of different types of countermeasures. Think obfuscated JS, buried in the main js, that checks location.host or the Bitcoin address and responds in different ways, with each variation of the site using slightly different countermeasures, div ids, ways to encode the text of the bitcoin address, etc.
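E.g. something like this buried in the main js (sketch; the element id is hypothetical, the hostnames are the ones from the article, and "responds in different ways" could mean anything from a redirect to a corrupted QR code):

```javascript
(function () {
  var legit = ['smsprivacy.org', 'smspriv6fynj23u6.onion'];
  if (legit.indexOf(location.host) === -1) {
    // Served from somewhere we don't recognise: quietly hand the phisher's
    // visitors a useless address instead of the real one.
    var el = document.getElementById('btc-addr'); // hypothetical id
    if (el) el.textContent = '1PhishersGetNothingUsefulHere';
  }
})();
```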
Like copy protection. Yes, you can break it. But time and effort are expended.
The article already suggests some laziness from the scraper. He only substituted the hostname if it was lower case. The general idea here is lots of countermeasures, and ones that change with every request. Wearing them down.
>You can get alerted to it by checking your web server access logs
Maybe. That's hard on Tor; there may be nothing unique about the requests.
The guy who runs the phishing servers, "crimewave" (everything is automated; all hidden services have these BTC phisher clones), actually posts on r/onions sometimes. I had some good discussions about possible countermeasures with him.
So clearly the random garble in smspriv6fynj23u6.onion isn't recognizable enough for a user to spot a bogus one. And so as long as an adversary is willing to spend the same amount of money mining their bogus key as you are in generating the original, you're never going to win.
My suggestion would be to, during the mining process, search the "random" portion of the domain for "recognizable strings" (probably just any dictionary words) and keep mining until a domain is found with a suitable level of "recognizability". This way you get the extra "strength" of having mined for a significantly longer string at only a fraction of the cost whilst an adversary would have to search for a much more specific string to mimic you convincingly.
This is somewhat similar to facebook having "corewwwi" following their domain - not words they were specifically looking for I'm sure, but notable and so would be necessary to bruteforce.
Consider it to be a bit like having "correcthorsebatterystaple" on the end of your domain (yeah I know it's too long).
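The mining loop would look roughly like this (sketch only; nextCandidate() is a hypothetical stand-in for whatever Shallot/Scallion-style keygen step you already have, assumed to return a candidate .onion name plus its keypair):

```javascript
// Keep generating keys until the "random" tail contains enough dictionary
// material to be recognizable, not just the prefix you were mining for.
const WORDS = ['core', 'www', 'priv', 'safe', 'mail', 'shop']; // any wordlist

function recognizability(onion, prefixLen) {
  const tail = onion.slice(prefixLen); // only score the part after the mined prefix
  let score = 0;
  for (const w of WORDS) if (tail.includes(w)) score += w.length;
  return score;
}

function mine(prefix, minScore) {
  for (;;) {
    const cand = nextCandidate(); // hypothetical keygen step
    if (cand.onion.startsWith(prefix) &&
        recognizability(cand.onion, prefix.length) >= minScore) {
      return cand; // a longer "recognizable" string at a fraction of the mining cost
    }
  }
}
```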
Whenever talking about security, it's all about probability. The probability of a user distinguishing "io" from "com" is higher than that of distinguishing "frh2dj3" from "frh2di3".
This is why we need SSL certs for .onions. DigiCert was doing this, but they told me it is on hold. Maybe when V3 hidden services are out?
With an SSL cert, you get a cert for your .onion PLUS your clearnet domain. Then users can access the .onion and see your real enterprise's name.
This should be doable even for extrajurisdictional companies like mine. The key is that your clearnet site should be a TCP-level proxy to the hidden service. That way the SSL keys are never on an easily-discoverable system so only IP addresses can be logged, not any contents. Slight plug: Here's the tech design for our system: https://medium.com/@PinkApp/pink-app-trading-latency-for-ano... (ignore the clickbait title). This should work just fine for .onions too, even with the clearnet entry point. For a DarkNet Market, most small buyers are probably safe visiting over clearnet.
The point is to remain user-friendly while hiding the actual webservers, databases, and anything else that LE might want to take. An onion-routing proxy is easily set up, torn down, and moved around quickly, making it a harder target for persistent surveillance.
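The TCP-level pass-through itself is tiny (rough sketch; the backend host/port are placeholders for however the front box reaches the hidden backend, e.g. through a local Tor client):

```javascript
// Pipe bytes through without terminating TLS, so the certificate's private
// key never sits on the easily-discoverable front box.
const net = require('net');

const BACKEND_HOST = '127.0.0.1'; // placeholder
const BACKEND_PORT = 8443;        // placeholder

net.createServer((client) => {
  const upstream = net.connect(BACKEND_PORT, BACKEND_HOST);
  client.pipe(upstream);
  upstream.pipe(client);
  client.on('error', () => upstream.destroy());
  upstream.on('error', () => client.destroy());
}).listen(443);
```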
How are .onion service providers supposed to communicate out to their users whether or not their site uses a certificate?
And, presuming they are able to do this, why not just use that communication channel to communicate the correct .onion URL to the user in the first place (thus removing the need for a certificate authority)?
EDIT: Perhaps it would make sense to create a separate URL type for Tor services whose keys are signed by a certificate authority? So the URL would become e.g. secure.smspriv6fynj23u6.onion, and the Tor browser would reject sites prefixed with “secure.“ that don’t have their key signed by a certificate authority. This way, an attacker must register with a certificate authority in order to phish a “secure.“ Tor site.
> Perhaps it would make sense to create a separate URL type for Tor services whose keys are signed by a certificate authority?
The onion IS a proof of key. If you use the whole onion address (which is a hash of the public key) then Tor requires that the hidden service be able to prove they own the private key. It's like a builtin CA.
The problem to me is that knowing someone has a key isn't as interesting as knowing that the person is a trusted source. And being anonymous takes some of the responsibility away.
It makes more sense to me that someone just use an HTTPS clearnet site and users who want to protect their own IP address can access it from Tor (it works just fine).
Protecting the site owner's identity and then wanting to prove their identity to stop phishing attempts seems at odds to me.
But DigiCert says they are no longer issuing them. Probably because the .onion is an 80-bit truncation of a SHA1 hash so it doesn't meet baseline reqs?
I don't know what DigiCert's motivation was for suspending issuance of .onion certificates, but the CA/Browser Forum had given a special dispensation (by ballot) for issuance of EV certs despite the criticism about cryptographic strength mismatch. This dispensation didn't extend to DV, but it's never been revoked, so DigiCert would still apparently be able to continue its EV issuance if it wanted to! (But it looks like they've even let the facebookcorewwwi.onion certificate expire.)
I've recently become an observer at the CA/Browser Forum and plan to bring up the subject of DV certificates for next-generation onion services imminently, just as soon as I'm done with my relaxing vacation!
Are you talking only about EV certs? Because DV certs don't really display your enterprise name, at least not in an easily accessible location for users to see.
I'm not a tor expert, but could this be solved by using something like namecoin for DNS within tor? So there would be a "proper" domain system for the .onion routes?
The whole confusion comes from the fact that tor domains have this random string added to the name you actually want to take, right?
The readme includes a table of estimated computing time required. A 15 char prefix like Facebook's is not even on the table, and a 14 char prefix is estimated to take 2.6 million years. There is also a GPU version which should be an order of magnitude faster: https://github.com/lachesis/scallion/blob/gpg/README.md
Also, technically, the onion addresses are not public keys, but derived from a public key. It's actually a hash of the public key.
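Roughly, for v2 addresses (sketch, assuming derPublicKey is a Buffer holding the hidden service's RSA-1024 public key in the DER form that Tor hashes):

```javascript
const crypto = require('crypto');

const B32 = 'abcdefghijklmnopqrstuvwxyz234567';
function base32(buf) {
  let bits = '';
  for (const byte of buf) bits += byte.toString(2).padStart(8, '0');
  let out = '';
  for (let i = 0; i + 5 <= bits.length; i += 5) out += B32[parseInt(bits.slice(i, i + 5), 2)];
  return out;
}

function v2OnionAddress(derPublicKey) {
  const digest = crypto.createHash('sha1').update(derPublicKey).digest();
  // First 80 bits (10 bytes) of the SHA-1 digest, base32-encoded: 16 characters.
  return base32(digest.slice(0, 10)) + '.onion';
}
```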
Facebook probably looked for anything matching "facebook(web-related words)", and were lucky to find one with only one errant character. There were probably enough acceptable variants that the effective computing time was less than 14 characters, maybe less than 13.
FWIW I've struggled to get keys generated by Shallot to persist very long but haven't found the cause. We've had to fallback to a non-vanity address. If anybody knows what I'm doing wrong please let me know!
I don't think the way you've generated the key is likely to be the source of the problem, although I don't have any good ideas about what the problem might be, beyond the obvious (is the server still online? is tor still running? is it still using the correct config?)
> It seems like everything except the 'i' is a prefix, a lot of computing must have went in to generating it.
If Facebook can generate an address with only a single character being random (the trailing ‘i’), couldn’t an attacker generate anyone’s address by just applying 26 times more computing power?
Either Facebook didn’t target the trailing “corewww” or the .onion URL scheme is broken (since Facebook would be able to take over any .onion URL by just spending 26 times as much compute power as they did with https://facebookcorewwwi.onion/).
This is properly solved by the switch to stronger cryptography: the new v3 onion address protocol (available with Tor 0.3.2.x-alpha) replaces SHA1/DH/RSA1024 with SHA3/ed25519/curve25519. Onion addresses are now 56 characters long, example: http://ffqggapqevcmylx6vtk5357i7bfjwbb6qchds3hlohangshxrwvdd...
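As I read the v3 rend spec, the 56 characters come from base32-encoding the 32-byte ed25519 public key plus a 2-byte checksum and a version byte. Something like this (sketch only; needs a Node build whose OpenSSL exposes sha3-256):

```javascript
const crypto = require('crypto');

const B32 = 'abcdefghijklmnopqrstuvwxyz234567';
function base32(buf) {
  let bits = '';
  for (const byte of buf) bits += byte.toString(2).padStart(8, '0');
  let out = '';
  for (let i = 0; i + 5 <= bits.length; i += 5) out += B32[parseInt(bits.slice(i, i + 5), 2)];
  return out;
}

function v3OnionAddress(ed25519PubKey /* 32-byte Buffer */) {
  const version = Buffer.from([0x03]);
  const checksum = crypto.createHash('sha3-256')
    .update(Buffer.concat([Buffer.from('.onion checksum'), ed25519PubKey, version]))
    .digest()
    .slice(0, 2);
  // 32 + 2 + 1 = 35 bytes -> 280 bits -> exactly 56 base32 characters.
  return base32(Buffer.concat([ed25519PubKey, checksum, version])) + '.onion';
}
```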
To my understanding, there is no way to handle conflicts. If another person gets your private key, then they can collide their domain name with your domain name (and, speculatively, probably split traffic?)
IMO the risk here is that onion link lists and search engines can't be trusted. How else are people getting these bogus onions? There's no authority on these "hidden services" and hiding identity is kinda the point of Tor so phishing is easy.
Onion addresses are basically hashes of the public key of the hidden service. Trusting just the first few characters to match (as with these "vanity onions") when identifying the service is just a bad idea...
We are moving towards a more centralized web, not a decentralized one. The technology industry, and the Internet, is consolidating. Look at the news for all the companies being purchased by Facebook, Google, Microsoft, Amazon, Apple, HPE and Cisco. Also look at the ISPs: Time Warner and Spectrum, Verizon, AT&T, etc. are all consolidating. It will continue to become harder to hide, or be anonymous, as we go down this path, because not only will data be held by just a few players, but the transmission lines will be too.
The more the "mainstream" web becomes centralised, the more technical people are being pushed towards working on tools for decentralisation and anonymity.
It's happening, just not in the places that most people tend to look.
It's not happening in a meaningful way. The technology for a decentralized web is already here -- it's just the normal web. Our artificial legal barriers are what keep it centralized, and those will be an issue regardless of technical innovation.
Legalize scraping and fix copyright law so that users can truly assert ownership over the content they generate, and this will quickly become a non-issue.
P2P technology is cool and it has its uses; I'm even developing a decentralized distribution thing as a side project. But it is more work, which means in the typical web browsing use case, it is slower and less convenient than a conventional 1:1 conversation with a stable endpoint.
Suggesting that everyone introduce four hops of latency or that we all participate in a multi-billion device DHT (which all just translate to "much slower" to the typical user) is just not a practical solution to these problems.
The good and correct solution is to look to the root cause and fix it. That root cause is the incentive structure, including legal and financial arrangements, that heavily encourages the "AOLization" of the web and allows the AOLizers to use the courts to clobber the hackers who try to re-liberate it.
> Suggesting that everyone introduce four hops of latency or that we all participate in a multi-billion device DHT (which all just translate to "much slower" to the typical user) is just not a practical solution to these problems.
DHTs can actually scale really well.
But systems don't have to be fully distributed to be less centralized. For example DNS stakes out a strong middle ground.
The problem with e.g. Facebook is that it's a closed system. You can't "apt-get install" a Facebook daemon and run your own Facebook server whose users can still talk to the people using facebook.com.
Part of the reason for that is the law but most of it is just network effects. Most people don't use GNU social because most people don't use GNU social.
Sure, DHTs are efficient, but they're not cost-free. You are still introducing a lot of unreliable, inconsistent, and slow hosts into the mix, and potentially having to traverse many of them to get access to the entirety of the content you're seeking. This is not a pleasant user experience, without even getting into the significant privacy and security tradeoffs, which can only be mitigated by making the system do more hops, more crypto, more obfuscation (which means, slower still). Octopus DHT is cool, but it is by no means a speed demon.
>But systems don't have to be fully distributed to be less centralized. For example DNS stakes out a strong middle ground.
There isn't an obviously-better "less-centralized" solution for DNS as far as I know. See Zooko's Triangle. [0]
>The problem with e.g. Facebook is that it's a closed system. You can't "apt-get install" a Facebook daemon and run your own Facebook server whose users can still talk to the people using facebook.com.
Saying "it's a closed system" is forfeiting the point. I can send packets to facebook.com and then turn around and send packets to any other destination. Why can I not send packets in such a sequence that the packets obtained from Facebook are then transmitted to some other place that makes it more convenient to use them? Because if I do that, Facebook will sue the crap out of me, as they've done to others. There is no real technical barrier preventing this, it's purely legal.
Facebook, Google, et al are not in their position just due to network effects. They've both sued small companies because they both know that if people can get the same data through competing interfaces or clients, if it's simple and easy to multiplex the streams and move the content around, the consumer won't need their company specifically anymore. They'll be relegated to replaceable backend widgets. That's their nightmare!
Facebook and Google are middlemen, brokers between what the user really wants and the people who are providing it. They are terrified of a world where their brokerage is unneeded, and they work hard to make sure that you don't realize it.
Twitter had the same realization about multiplexed streams, leading to their infamous crippling of third-party clients. Craigslist had this realization in their brutal about-face with Padmapper, after coming to their senses and noting that it posed a serious threat to their business. The entity that controls the user's attention controls the game.
It is at this point practically illegal to use a third-party exporter to read out and easily transfer the content from your Facebook page to another site. Even if it's 100% original content that you own completely from a copyright perspective, you can't run a program to read it out because the Copyright Act has been interpreted to mean that loading someone's HTML into your computer's memory could be an act of copyright infringement (this is called "the RAM Copy doctrine").
It's also usually illegal to download that page with an unapproved browsing device such as a crawler or a scraper; this is exceeding authorized access under the Computer Fraud and Abuse Act. You agree to all of this when you agree to the site's Terms of Service, but your agreement is not necessarily needed for these provisions to be effective.
Why are there no easy "Try NewFace.com Services, We'll Copy Your Friend List, Post History, and Photo Albums right over!" offerings? Because you'll get sued and left owing Facebook $3 million if you try to do that. [1] :)
Once you throw something into the Google or Facebook black hole, they make it very difficult to pull it back out again. That's not an accident, and it's naive to just attribute it all to organic "network effects". The competition is dead not because no one else wants to compete for these users, but because they'll be sued to death if they do it in a way that's accessible to the mass market.
[Note: I know that both Google and Facebook have buried deep in the innards of the user configuration a mechanism that allows you to request the generation of a crudely-formatted, multi-volume zip archive representing some or all of your account data, and that you can receive some email some hours later delivering this data in chunks. This is not a practical way to move data for most people, because even _if_ you get someone to go through all this pain, the amount of time it takes to build, process, collect, and upload these archives ensures it is essentially a one-way thing. It can and should be a free-flowing exchange of information, which the internet can already easily facilitate. The only barriers are artificial, legal barriers.]
> fix copyright law so that users can truly assert ownership over the content they generate
I agree copyright needs to be fixed, but the issue isn't user-generated content. Essentially, all users do is collaborative filtering.
The real work needs to be done to break up content 'owning' middlemen that form monopolies and use those to create walled gardens. Maybe a solution would be to ban the walled garden as 'anti-trust'.
Anti-trust is a kludge that proves something went awry. We shouldn't have to go through a manual process of breaking up the power brokers, especially since government and big corporations frequently get into bed together. We should design systems and processes that are self-healing -- systems that allow sufficient natural competition that the eventuality of a monopoly virtually never occurs.
It's like a computer system. Yes, if you write corrupted data back to the database, you can manually go in there and do your best to rectify it, restoring pieces from backups, etc. But the strong emphasis should be on designing the system such that corruption is a near-impossibility.
Copyright and network protections are so excessive and far-reaching, it's no wonder we're seeing this re-centralization.
Anti-trust is needed. Power and influence breed more of themselves.
You need regulation to keep free markets free.
Laissez-faire will lead to power concentrating into monopolies. Decent anti-trust regulation, combined with laws preventing anti-competitive behavior, is needed.
Granted, the anti-competitive part is also important.
> SMS Privacy customers should make sure they're browsing either smsprivacy.org using HTTPS, or, if using Tor, smspriv6fynj23u6.onion is the only legitimate hidden service. Anything else is almost certainly harmful in one way or another.
Posting this very important information, which must not be modified by a middleman, on a non-TLS web site.
It really made me wonder about the author's ability to operate a privacy-centric SNS on the Tor network.
Or am I reading the web page through malicious proxy?
It passes through the client User-Agent unchanged. I've not yet looked at anything else although I doubt it will do anything but pass through the client headers.
It might be interesting to send confusing or contradictory request headers and see how it reacts.
Yeah, I'd definitely try sending it all sorts of weird things. Can you set up a path that will respond with a ton of data? Maybe a bunch of things that it will try to parse? etc.
"GET /headers" will dump the request headers, but note this has gone through one layer of munging by nginx and another by Mojolicious, so don't draw any conclusions about what you see in the response without bearing that in mind!
I'd be interested to hear if you spot anything surprising.
EDIT: And in case you haven't seen it, "torsocks" is a tool that lets any (?) other program speak Tor, e.g. it works with both curl and nc.
It's probably just doing an RPC call to an actual bitcoin node, which (in my experience) can actually take a few seconds depending on how fast of a computer the node is running on.
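i.e. something like this on the backend (sketch of the kind of JSON-RPC round-trip being guessed at; the port is bitcoind's default and the credentials are placeholders):

```javascript
const http = require('http');

// Ask a local bitcoind for a fresh address over JSON-RPC; the wait for this
// response is where the seconds of latency would come from.
function getNewAddress(cb) {
  const body = JSON.stringify({ jsonrpc: '1.0', id: 'demo', method: 'getnewaddress', params: [] });
  const req = http.request({
    host: '127.0.0.1',
    port: 8332,                  // default bitcoind RPC port
    method: 'POST',
    auth: 'rpcuser:rpcpassword', // placeholder credentials
    headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(body) },
  }, (res) => {
    let data = '';
    res.on('data', (chunk) => { data += chunk; });
    res.on('end', () => cb(null, JSON.parse(data).result));
  });
  req.on('error', cb);
  req.end(body);
}
```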
The seemingly easy way to DoS it would be to hit pages with Bitcoin addresses.
You could even set up a hidden static page with hundreds or thousands of addresses and call it through that proxy. No load on the legit server but heavy load on the rogue proxy.
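Something like this would do it (rough sketch; the hidden path is made up and the "addresses" are junk strings shaped like base58 addresses, generated once so serving the page costs the legit server almost nothing while the proxy has to scan and rewrite every one of them):

```javascript
const http = require('http');

const CHARS = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';
function junkAddress() {
  let s = '1';
  for (let i = 0; i < 33; i++) s += CHARS[Math.floor(Math.random() * CHARS.length)];
  return s;
}

// Build the page once at startup.
const page = '<html><body>' +
  Array.from({ length: 5000 }, () => '<p>' + junkAddress() + '</p>').join('') +
  '</body></html>';

http.createServer((req, res) => {
  if (req.url === '/not-linked-anywhere') { // hypothetical hidden path
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end(page);
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```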
Maybe with a layer 7 attack, but not with a layer 3 attack. The author also described the difference between the proxy request and a legit request, so he can just block the proxy requests. Try using slowhttptest. You don't even need a botnet.
I think the article described a different response, but not a different request? Anyway they mentioned that some requests were cached, so you could request a cached object to avoid hitting the original host, if you wanted a level-7 attack.