The problem with recaptcha alternatives is that they either are insecure or require time and money to continue to be ahead of bots.
All of the "interactive stand-alone approaches" from that page can be beaten with run-of-the-mill OCR (other than perhaps the 3d challenge) and with almost any mobile phone speech recognition engine (and, if the attacker has the money, can send it off to Google's cloud speech-to-text).
All of the non-interactive approaches from the page require constant tuning and upkeep to make sure bots aren't able to sign up or abuse systems. They're also not *that* secure if your website is targeted and scripts are written specifically to avoid your anti-abuse methods.
> The problem with recaptcha alternatives is that they either are insecure or require time and money to continue to be ahead of bots.
Sure, great, but when I see behavior like the above, I just hit back and add the site to my router's firewall blacklist. If it's this much of a PITA to "solve" a captcha correctly but I keep getting the middle finger, I don't give a crap anymore. Your site isn't worth going to if I have to spend literally minutes "solving" captchas for Google's stupid AI, which keeps treating me like a bot even when I prove I'm not.
Just realize that by using reCAPTCHA, this is what you're forcing some users to deal with. And I deal with it by making sure I never come back to your site again once you've wasted minutes of my time just trying to get to your page. Even if it's Google's fault for being jerks, I don't care. You chose to implement it.
Ok rant mode off and stepping off my personal soap box.
> Your site isn't worth going to if I have to spend literally minutes "solving" captchas for Google's stupid AI, which keeps treating me like a bot even when I prove I'm not.
I've run into state and local tax agencies, utility companies, and large healthcare companies that require Google's reCAPTCHA. So, unless you don't want healthcare, to have water service at your home, or you're in the mood to just shut down your business, you have to suck it up.
They can still use them if they meet certain criteria and show that they 'need' them. The overuse probably comes from the incentive: Google is incentivized to encourage the use of captchas because it is curating a data collection for AI training. I imagine some of the 'gaslighting' that people experience happens when they are given images that don't yet have a high enough confidence rating. I wonder if answering incorrectly often enough would result in being asked fewer questions?
‘Need’ here means exhausted all other opportunities, and have built alternative accessible ways of accessing the same service. I’d certainly have expected a service to have investigated a self-hosted solution, and I doubt a reliance on 3rd party JS from a Google service would fly, regardless of the service, as it breaks a whole bunch of separate resilience guidelines.
The few times I couldn't avoid Recaptcha, I spent 5 minutes randomly clicking on image tiles. Sometimes I got through by this strategy. If it didn't work, I tried a less random approach.
I've even seen state and government sites using Google's reCAPTCHA. People shouldn't be required to hand over their browsing history and other information to Google for essential services, especially to use government websites.
Thankfully, Indian government websites still use their own captchas, which, though not as 'secure', work for most cases and don't take minutes to solve.
In this case they get to deal with me offline. Like, I'm using a credit card right now without internet banking. They send me letters, on paper, with how much I owe them, and then I pay. All because registering for their internet banking was such a crazy shitty experience that I abandoned it.
Of course, if it's an essential service like healthcare, formal education, paying bills, etc., people will be forced to use it (if there's no option to change that service itself). But for that fancy startup showing some content to consume, when it's not necessary, I will just close that website.
uh, captchas don't just appear on Google products. Third parties use them -- government services, online shopping, all kinds of things you take for granted because clearly you aren't one of the people affected by it (i.e. you're fingerprinted). Many things we used to do in physical space now occur virtually. There is a serious philosophical and moral case to be made for the relevance of privacy and anonymity, which captcha is specifically and nefariously working to erode. And in that sense it's worse than bad building codes.
I suspect the Google product that the GP was referring to was Chrome, given that this is a comment thread about Firefox vs Chrome, and the behaviour of another Google product (reCAPTCHA) between the aforementioned products.
Yeah, but then again, so many times that I run into Captcha issues, it's on a site that really doesn't need Captcha to begin with.
Why make me solve a Captcha to see static content?
Why make me solve a Captcha to log in when I've already completed one to register?
Why make me solve a Captcha to pay utility bills? Is there some underground group of deviants going around surreptitiously paying other people's utility bills? The monsters.
> Why make me solve a Captcha to see static content?
Fair point. I usually run into this when using Tor, or a VPN, when accessing content behind Cloudflare or similar services. This is anti-abuse stuff, but it is often overly aggressive with giving you captchas.
> Why make me solve a Captcha to log in when I've already completed one to register?
So attackers cannot password spray. This typically happens after attackers have gotten access to the latest database breach and are blindly trying username/password combinations.
> Why make me solve a Captcha to pay utility bills? Is there some underground group of deviants going around surreptitiously paying other people's utility bills?
Sounds like a strange place to have a captcha indeed. What information is needed in the form to submit it? Does it validate stuff that an attacker might want to scrape? I guess they added it for a reason.
This is not necessarily a reasonable assumption. People often do things because they heard it was a good practice, or because it solves a problem they don't actually have, but think they might, or arbitrarily without giving it much thought.
> So attackers cannot password spray. This typically happens after attackers have gotten access to the latest database breach and are blindly trying username/password combinations.
A simple ratelimit takes care of that. Plus, it's not like attackers would be easily defeated by a CAPTCHA anyway --- there are services selling batches of valid tokens, likely generated by actual humans or very close emulations thereof, for ReCAPTCHA.
CAPTCHA is not foolproof; it is just the first layer of defence in the signup/login form. CAPTCHAs increase the cost of password spraying: attackers can't simply fire up Hydra. They'll need additional tools and services, which cost money.
A captcha-solving service also has costs beyond the money it charges. It adds time and extra resource usage on the machines the attack runs on. A quick look at one service[1] shows that the average response to a challenge was 40 seconds (this value changed a lot when refreshing the page). The attacker has now gone from the 200ms range per attempt to several seconds, slowing them down a lot. This gives defenders additional time to respond, and it is also a useful metric for detecting malicious logins.
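A back-of-envelope version of that slowdown, using the illustrative figures above (~200 ms per raw login attempt vs. a ~40 s average solver turnaround):

```python
# Back-of-envelope: how much a CAPTCHA-solving service slows a spraying bot.
# The numbers are illustrative, taken from the figures quoted above.
direct_attempt_s = 0.2   # ~200 ms per raw login attempt
solver_attempt_s = 40.0  # ~40 s average solver turnaround

attempts_per_hour_direct = 3600 / direct_attempt_s
attempts_per_hour_solver = 3600 / solver_attempt_s
slowdown = attempts_per_hour_direct / attempts_per_hour_solver

print(f"direct: {attempts_per_hour_direct:.0f}/h, "
      f"with solver: {attempts_per_hour_solver:.0f}/h, "
      f"slowdown: {slowdown:.0f}x")
```

So with those (rough) numbers, a sprayer drops from 18,000 attempts per hour to 90, a 200x reduction, before even counting the per-solve fee.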
Per account: 3 failed login attempts in a row, and you disallow further logins for 30 seconds.
This should waste less time than reCAPTCHAs. I know it's not 1:1 in terms of pros/cons, but it gets a good subset of the advantages without the key disadvantages mentioned above.
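A minimal sketch of that per-account lockout, assuming an in-memory store (a real deployment would persist state somewhere like Redis and likely add per-IP limits too):

```python
import time
from collections import defaultdict

# 3 consecutive failed attempts => reject logins for 30 seconds.
FAIL_LIMIT = 3
LOCKOUT_SECONDS = 30

_failures = defaultdict(int)  # account -> consecutive failed attempts
_locked_until = {}            # account -> unix timestamp when lock expires

def is_locked(account, now=None):
    """True while the account is still inside its lockout window."""
    now = time.time() if now is None else now
    return _locked_until.get(account, 0) > now

def record_attempt(account, success, now=None):
    """Update counters after a login attempt; lock on the 3rd failure."""
    now = time.time() if now is None else now
    if success:
        _failures[account] = 0
        return
    _failures[account] += 1
    if _failures[account] >= FAIL_LIMIT:
        _locked_until[account] = now + LOCKOUT_SECONDS
        _failures[account] = 0
```

The login handler would call `is_locked()` before checking credentials and `record_attempt()` after.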
First, that's a bit user-hostile (and suddenly a DoS-vector; I can prevent a site's users from logging in by continuously firing bad password attempts).
Secondly, botnets can, and presumably do, randomize which accounts they try, too.
So rate-limiting is "user-hostile", but permanently hell-banning someone because their network is considered "seedy" is user-friendly?
Incidentally, you still need rate-limiting if you use Google's CAPTCHA. If you don't rate-limit the CAPTCHA endpoint, an attacker can DDoS you (especially if your server-side captcha component uses a low-performance single-threaded HTTP client). Furthermore, an attacker within the same AS as their target can purposefully screw over their target by performing attacks on Google's services until the reputation of the network hits rock bottom.
reCAPTCHA is a rate-limiting measure. Google handles all the heavy-lifting and attacker protection for you, and the slow fade you see in the video is that rate-limiting in action. But if you get a clean CAPTCHA result back from them, then that client is very unlikely to be an automated attacker. It's super easy and scales really well.
Conveniently, normal users with typical browser configurations get nothing but the animated checkbox. For nearly everyone, the whole experience is simple and easy. The only people who get inconvenienced are the low-grade privacy enthusiasts who think that preventing tracking is the path to Internet safety. Ironically, "tracking" is literally the mechanism by which legitimate users can be distinguished from attackers, so down that road lies a sort of self-inflicted hell for which the only sensible solution is to stop hitting yourself.
This is obviously a bad idea. It costs an attacker nothing to send three HTTP requests every minute, every hour, all day. They could lock your account basically forever. IP filtering and account locking are terrible ways of preventing password spraying.
From that messed-up email from support that leaked them. Or, I assumed, you'll have a big cross-section with some other site that leaked.
This is not theory, this is hard-earned experience. Locking out people is bad; the most that's acceptable is rate limiting to once every few seconds.
> > Why make me solve a Captcha to pay utility bills? Is there some underground group of deviants going around surreptitiously paying other people's utility bills?
> Sounds like a strange place to have a captcha indeed. What information is needed in the form to submit it? Does it validate stuff that an attacker might want to scrape? I guess they added it for a reason.
I've seen captchas on payment forms to prevent credit card checking. You can take a dump of CC details, try them all out on a site, and get back the valid ones. I'd assume they charge $1 to the CC to test it before allowing you to continue, and then you could cancel your order before they charge the full amount. However, assuming you have to be logged in to pay your bill, that seems less reasonable.
I've even seen people beat captchas in bulk to get to a payment form. My best guess is something along the lines of Mechanical Turk or a room full of low-wage workers doing it manually. I think the payoff of verifying stolen cards is worth enough to justify some kind of workaround.
If you host a payment form that informs the user about whether payment was accepted, you're a target.
> Sounds like a strange place to have a captcha indeed. What information is needed in the form to submit it? Does it validate stuff that an attacker might want to scrape? I guess they added it for a reason.
In the past, I used curl to get some billing info, added the money to a dedicated virtual prepaid card, paid the bill, then sent an email to a Gmail (+paidinvoice) label. These days, at least for my bills, they have pre-approved withdrawals directly from the bank. However, I guess this is not widely deployed.
If other people did this, but ended up doing it from an insecure machine and lost the credentials / got hacked, I can see why at least some orgs might want to prevent people from doing this. This is a classic overreaction, but a plausible scenario.
> If other people did this, but ended up doing it from an insecure machine and lost the credentials / got hacked, I can see why at least some orgs might want to prevent people from doing this.
The measure is not really about protecting the user filling out the payment form; it is meant to "protect" the system that is validating the payment data. The payment form may be a target for an attacker who has gotten a large batch of credit cards from somewhere else and wants to validate the data. Attackers regularly exploit such forms, or other naive payment systems, to check whether the credit card data is valid.
The owner of CandyJapan wrote some blog posts about the subject.
I imagine what you are proposing, then, is to record the entropy of the password when the user first registers, and for accounts with sufficient password entropy, not to ask for a captcha after a few failed attempts.
With that, the site gives away whether the account has a low entropy password or not.
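For illustration, a naive entropy estimate of the kind being proposed (character-class pool size times length; real deployments tend to use smarter estimators such as zxcvbn that penalize dictionary words):

```python
import math
import string

def estimate_entropy_bits(password):
    """Crude entropy estimate: log2 of the character pool, times length.

    Overestimates badly for dictionary words ("password" scores ~37 bits
    here despite being trivially guessable), which is exactly why real
    estimators do more work.
    """
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)
    return len(password) * math.log2(pool) if pool else 0.0
```

The site would store this number at registration and consult it when deciding whether to show a captcha, which is precisely what leaks the information the comment above points out.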
> I imagine what you are proposing then is to record the entropy on the password
Or just generate secure high-entropy passwords and force users to use them.
Making users look up SMS codes before each login is acceptable. Making them solve obnoxious, long, privacy-hostile riddles is acceptable. But forcing them to use pre-generated secure passwords?! That can't possibly work. They will revolt!
The weirdest one I have ever seen is on frikking walmart.com - here is my cynical paraphrasing of their 'thought process': "We don't want your money! Go back to Amazon! No captchas there cause they are not stupid!" I persist because I don't want to go back to being a 2nd-class non-Prime Amazon citizen but the darned unnecessary captchas really ruin my walmart.com shopping experience to no end.
If anyone from Walmart.com is reading, please please get rid of these useless captchas - it is an incredibly stupid thing that you do, and unfortunately you do it all too well.
The problem with CAPTCHA and the like is that they seek to stop the programmatic browsing of websites that both Firefox and Chrome support out of the box. If companies are concerned about non-human access, they should make an official API instead of letting their website be a de-facto unofficial API. If they are concerned about fraud, they will be woefully defended by CAPTCHA: it makes no judgement on the validity of transactions and doesn't prevent fraudsters signing in manually.
Ironically, Google has committed at least $75 million of fraud, and likely hundreds of millions more, via stolen refunds and stolen banned-account balances!
> If companies are concerned about non-human access they should make an official API instead of their website being a de-facto unofficial API
This is often impractical for several important use cases, like image rendering and PDF generation. Just hand-waving away the cost of developing dedicated, pure APIs won't make companies more likely to build them.
> If they are concerned about fraud they will be woefully defended by CAPTCHA, it makes no judgement on the validity of transactions at all and doesn't prevent frauds signing in manually.
There are many different vectors of attack and fraud, and CAPTCHA tackles one of them. It's silly to say it's unnecessary just because it doesn't cover all fraudulent activity.
I implemented simple question/answer antibot filters on registration forms for a few sites. Nobody ever made the effort to customize their bot to answer those very few questions. I guess it doesn't make sense economically. However, if a big site went that way, it would be filled with bots within a day.
I once implemented a "poor man's captcha" that presented a simple randomized question that anyone would be able to answer (ranging from "what year is it" to "what's 2 + 2"). I guessed that nobody would make the effort to write a custom script for this, because the website in question was so niche and the stakes so low -- a very quiet corner of the Internet; I don't even remember what it was, possibly some feedback form that went to a support email. I actually felt some irrational measure of pride when, probably a year later, I was looking at some logs and discovered that some script kid had cracked the questionnaire and was using the form to post nonsense text with Viagra links. Someone had actually sat down and written code to crack my terrible solution, and probably spent more time on it than I had (which is to say, more than five minutes). Made my day.
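A sketch of such a "poor man's captcha"; the question pool here is made up for illustration, and a real form would carry the question index in a signed hidden field or server-side session:

```python
import datetime
import random

# Hand-written questions any human can answer. Answers are computed at
# check time so time-dependent ones ("what year is it") stay correct.
QUESTIONS = [
    ("What is 2 + 2?", lambda: "4"),
    ("What year is it?", lambda: str(datetime.date.today().year)),
    ("Type the word 'human'.", lambda: "human"),
]

def pick_question():
    """Return (index, question text) to render into the form."""
    idx = random.randrange(len(QUESTIONS))
    return idx, QUESTIONS[idx][0]

def check_answer(idx, answer):
    """Case- and whitespace-insensitive comparison against the answer."""
    return answer.strip().lower() == QUESTIONS[idx][1]().lower()
```

As the comments above note, the security comes entirely from obscurity: the scheme falls the moment anyone bothers to hard-code the handful of answers.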
For small scale sites you don't even need to do much that requires human intervention. Most bots (or at least most bot-actions) seem to invest very little in sophisticated techniques and rely instead on finding vulnerable servers by casting a very wide net. As long as that is true, you can filter out 99+% of the noise by applying very simple but slightly bespoke techniques.
As long as there continue to be enough cookie-cutter blog/forum/ecommerce sites out there for the bots to exploit, very simple techniques (JS-populated form fields or request parameters, very basic validation of the HTTP headers, taking into account the rate or frequency at which requests are made, etc.) will quickly and cheaply identify almost all of the bot activity.
Of course sophisticated or dedicated bots will still pose a problem, but assuming you're not just standing up a popular off-the-shelf platform without any hardening or customization, you'll need to get pretty big (or otherwise valuable) before attracting that kind of attention.
A reasonable analogy here is the observation that simply running sensitive services on non-standard ports (e.g., not running SSH on port 22) will eliminate a ridiculous volume of malware probes against your system. To be clear, that's no substitute for actual robust security practices -- you almost certainly shouldn't have something like SSH world-visible to begin with -- but given how trivially easy it is to do something like changing the default port for services you're not expecting the public at large to reach, it's absurd that servers are compromised every day by dumb scripts blindly probing the Internet for well-known and long-ago-patched exploits.
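A sketch of the cheap filters described above (hidden honeypot field, minimum form-fill time, basic header sanity check); the field and header names are illustrative, not any particular framework's:

```python
# Cheap bot filters for a small site. A CSS-hidden "website" field is the
# honeypot: humans never see it, naive form-filling bots populate it.
MIN_FILL_SECONDS = 3  # humans take at least a few seconds to fill a form

def looks_like_bot(form, headers, render_time, submit_time):
    """Return True if the submission trips any of the cheap heuristics."""
    if form.get("website", ""):
        return True  # honeypot field was filled in
    if submit_time - render_time < MIN_FILL_SECONDS:
        return True  # submitted implausibly fast
    ua = headers.get("User-Agent", "")
    if not ua or "curl" in ua.lower():
        return True  # missing or obviously non-browser client
    return False
```

None of these checks survives a targeted attacker, but, as argued above, they cheaply filter out the wide-net scanners that make up most bot traffic.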
I did that on an old forum that had been dead for years; I thought spammers would not care enough.
But one of them did! Whenever I changed the questions, bots would stop for a few days, and then start again. Someone cared enough to manually enter the correct responses (no, blind dictionary attacks were not possible)!
This is probably good enough for 90% of websites that accept user content. Then, in the small chance it isn't, because of growth or because some random spammer decided to spend time on your site, you can switch to something like reCAPTCHA.
Hobby sites may be in a more difficult position, but businesses get to decide between developer convenience and low cost on the one hand, and excluding or tormenting some of their users on the other.
There are also ways to reduce the damage reCAPTCHA causes, such as keeping it out of the default UX path. Discord for example will show a reCAPTCHA challenge on the login page only if you are signing in from a new location.
reCAPTCHA cannot effectively defend sites against targeted attacks either.
OK, Discord specifically is terrible. I log in in incognito mode from the same location/browser every time, and have to deal with a captcha most of the time.
I use Discord from an incognito Chrome window. I avoid it most of the time, by doing:
1. Email is manually typed, password is copy pasted
2. I move the mouse around in the window in a fairly non-mechanical manner.
I don't know if you use Chrome proper for it, so that could still be a point of difference.
I don't understand this. You're logging in from a fresh browser. Do you want sites to fingerprint you in other ways so you can clear your cookies and not have to deal with captchas?
Not saying I like the precedent of Google being inescapable, but you're not "signing up" for anything. A web server is 100% within its rights to refuse to send you a page, on its own terms.
That is true. However, if I sign up for a service, for example TransferWise, then later, signing into the account, I get a Google Captcha, now I am engaged in a relationship/data share with Google and if I don’t agree, I lose access to my account. When I signed up, I didn’t have “you must help train Google AI” as a condition of use.
Not sure why you're downvoted, it's a valid point. It feels icky to use a service that you pay for, and incidentally provide free labor to Google's AI which they resell in Google Cloud as a walled garden. The result of reCaptcha isn't public as far as I can tell, and humanity probably doesn't get a net benefit from Google's monopoly on AI anymore.
People talk about "free labor" and forget all the times they were able to do Google searches or use Google Maps for free. It seems rather ungrateful? This isn't a one-sided relationship, both sides benefit.
The difference lies in whether you willingly subjected yourself to this transaction (give eyeballs, get Maps service) or whether it was imposed on you without anyone bothering to mention or question it beforehand.
Also the gratefulness part is strange. The corporation has no gratefulness for me, why should we show it any kind of loyalty. It's not a living entity with a consistent mind or consciousness. It will change its will based on Wall Street's demands. It will ban you silently with no recourse.
Perhaps "ungrateful" is the wrong word. But in a purely transactional society where we charge each other for every little thing we do on the Internet to avoid any "free labor", I suspect that we would be considerably worse off.
You seem to be a bot. Write a poem describing the outage and email it at larry@google.com . We will look at it and unblock you if we believe you are a human.
I believe we agree with you there. OP was just referencing the methodologies people use, often choosing tools like Google Analytics and reCAPTCHA that are "free" by virtue of offloading compromises onto the site's users rather than the site itself.
I endorse a site's right to forbid me its content if I can't prove I'm human. I won't endorse a site that accomplishes it by asking me to pay the cost.
Not entirely accurate. The GDPR restricts the terms they can use, for example. And anti-discrimination law probably also applies. These don't really apply to captcha, of course, under current interpretations.
My reCaptcha strategy is to fire off an email to the site owners every time I am subjected to a reCaptcha, asking for all my data under GDPR. Most websites only need a few such requests to quickly start looking for an alternative. Fuck Google and their constant attacks on my rights.
> The problem with recaptcha alternatives is that they either are insecure or require time and money to continue to be ahead of bots.
You're posting this in response to an automated recaptcha solver. Clearly recaptcha also has trouble staying ahead of bots.
It seems to me that any simple automated test at the entrance is inevitably going to be easy to solve by bots, especially when it's a one-size-fits-all test like recaptcha, so bots have only a single target to aim at. A small-scale unique test will be more successful simply for that reason.
But it seems to me that a better approach than banning bots together with the humans who fail your Turing test is to check for the behaviour you want. If you don't want spam, have a system that recognises spamming behaviour, rather than traffic lights.
wrong. captcha blocks bots and humans alike. so why bother with the fake puzzle at all? just replace whatever triggers your captcha with a straight up block. or else please consider a responsible alternative.
of course it does. so does an automatic ban. that's precisely not the issue.
i think you probably meant to say recaptcha allows an extraordinarily large number of humans compared to false positives? because that would be the relevant metric. you sure about that one?
> and with almost any mobile phone speech recognition engine
My only problem with reCAPTCHA is when audio doesn't work (Google decides I'm spamming their network… sure…), because their audio validation seems to use only one rule, which says "letters were typed". So I'm not sure how being able to beat it with voice recognition makes it worse.
How hard would it be to create an alternative using GPT-2 or the like?
Create a dozen models based on different things. Street signs, cats, houses, cars, etc. Then show the user a random selection of images generated from different models and say "select all the cats" and they get it right if they choose the images generated from the cat model.
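A sketch of the challenge/verification logic for that proposed scheme, with the image generation itself stubbed out (class labels stand in for generated images; all names here are hypothetical):

```python
import random

# Hypothetical generative-captcha scheme: each image would be sampled
# from a per-class generative model; the user must select exactly the
# images produced by the target class's model.
CLASSES = ["cat", "house", "car", "street sign"]

def make_challenge(n_images=9, rng=random):
    """Build a grid of n_images class labels with at least one target."""
    target = rng.choice(CLASSES)
    labels = [rng.choice(CLASSES) for _ in range(n_images)]
    labels[rng.randrange(n_images)] = target  # guarantee >= 1 match
    correct = {i for i, lbl in enumerate(labels) if lbl == target}
    return target, labels, correct

def verify(selected, correct):
    """Pass only if the user selected exactly the target-class images."""
    return set(selected) == correct
```

Since the server generated the images, it knows the ground truth for every tile, unlike reCAPTCHA, which partly uses user answers to label unknown images.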
So the short version is that they try to fingerprint the user and then distinguish fingerprints that seem like humans from fingerprints that don't.
The interesting question then becomes how this is going to interact with future browser anti-fingerprinting measures whose purpose is to prevent just that.
I don't doubt that it's far easier to abuse traditional captcha systems, but I wonder how widespread that is. A while ago I did a test with Securimage and TensorFlow/Python/OpenCV/Keras after I read a Medium post. While it could solve captchas with a little distortion, when I added squiggles, dots, and more distortion it was unable to solve them. I'm sure you could spend more time and create a system that can solve these captchas; I wonder how much effort some random spammer will put into attacking your blog. Yandex uses traditional captchas, and they don't seem to have any issues.