I Hate Captcha- Come beta test my alternative
19 points by kapauldo on Sept 3, 2010 | 45 comments
I am getting ready to launch BaffleBot, a captcha alternative. In a nutshell, it's a picture/question combo challenge, and the challenges are submitted by humans. It's monetized with a small ad, with revenue share for bloggers, and challenge creators are rewarded with a link shown in every challenge. So, if you want to contribute or check it out, please go to www.bafflebot.com. If you are a WordPress blogger, I'd especially appreciate it if you would install the plugin and give me feedback on how well it works. The beta code is "baffleboy" (all lower case). Send me an email if you have any questions or comments (kapauldo AT gmail.com).

Thanks, Kevin



Hi, I just visited your website and the first "bafflebot" presented to me was "what is this character's name" (a yellow bird). I'm from France and I don't know all the US kids' cartoons, so I can't answer that captcha, whereas 1+2 or "enter the word" is pretty universal.

Second try, I have in front of me a red apple with legs and arms and the question is "what is it?" The question is wayyyyy too broad, what should I answer? Is it another cartoon that I'm not aware of? Is it an apple? A red apple?...

A captcha must be VERY simple and universal.

A few months ago I read about a good captcha idea: show a picture of a human, and the challenge is to say whether it's a man or a woman. A bot can't read the picture well enough, but the human eye/brain is trained to identify a man or a woman in less than a second!

My 2 cents, good luck with your project. I agree captcha generally sucks.


Male vs female is a poor test because you can guess correctly about 50% of the time. That's on par with some current CAPTCHA bots.


You could show several faces and ask "how many are male/female?"
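
A rough way to size this up, as a sketch only: assume each face is independently male or female with probability 1/2 (real photo sets may well be skewed). A bot that always submits the single most likely count then does noticeably better than one that has to label every face correctly. The function names are made up for illustration.

```python
from math import comb


def best_count_guess_success(n_faces: int) -> float:
    """Chance that a bot submitting the single most likely male count is right.

    Assumes each face is independently male with probability 1/2, so the
    count of male faces follows a Binomial(n_faces, 1/2) distribution.
    """
    most_likely_ways = max(comb(n_faces, k) for k in range(n_faces + 1))
    return most_likely_ways / 2 ** n_faces


def per_face_guess_success(n_faces: int) -> float:
    """Chance of labeling every face correctly by independent coin flips."""
    return 0.5 ** n_faces


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, best_count_guess_success(n), per_face_guess_success(n))
```

Under that assumption, asking for a count over 4 faces still lets a blind guess succeed 37.5% of the time, while asking the user to label each of the 4 faces drops a blind guess to 6.25%.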


Another idea along these lines would be to present a scene and ask the user how many objects are in the scene. It's a very difficult problem for computers to discern individual objects in a scene, especially if they overlap.


It's also not so good for the visually impaired because you can't really provide an audio version of the challenge, whereas identifying the letters in a random string is easier to present as audio.


That's true. Maybe you could do something with profiles from FB. Show a small collection of faces, one of which is your friend on FB. Then you have to select your friend. For the visually impaired you could read people's names.

A real person would instantly recognize one of their friends and even the guy getting paid to solve captchas would have a hard time.

Of course, you then have the problem of getting someone to connect FB to the CAPTCHA service for the first time instead of abandoning the form.


Do captcha bots ever use the audio string? Could you not hook up a voice recognizer to the audio portion?


I hate to say it, but I think this is a huge step in the wrong direction. First, there appear to be multiple correct answers for almost all the challenges. One showed a picture of a woman watering plants that appeared to have dollar bills as leaves. It asked me what she was doing. I had no clue what to put; is she "watering plants," "growing money," or just "watering?". I think you should at least provide multiple choice answers, even if the user has to enter in the full answer word for word. Forget about spam bots being able to figure these out, most humans would have a hard time!

Second, and this is a big one, I don't think there are many bloggers and web developers out there who are so frustrated with captchas that they'd be willing to use an ad-supported system, no matter how easy it makes things for the end user. I know I wouldn't. The ads also add a lot of confusion to the captcha. Again, on the money plant watering example, I saw an ad about making cash online. Combined with the strange money tree it made for a very confusing experience at best. I think having the same crap that spambots are likely to post on a site as advertisements on the very captcha that is trying to prevent it sends some pretty mixed signals.

Third, why does this need to be crowd-sourced? This seems like it's adding myriad quality control issues. I think if you sat down and came up with 10 or so internationally universal puzzles that are easily solved by all humans and not by robots, you would have enough variety to keep the bots guessing.

What I want to see is a captcha that determines I'm human without me having to do anything. I don't even know if this can be done, but to me all captcha solutions should be aiming toward that goal. In the meantime, I'll deal with the captchas that are out there right now.


yeah, the ads right now are low quality, but remember, the ads are revenue shared with the blogger. so it's a way to monetize your comments. if it gets traction, i'll be experimenting with deals, coupons, localized offers, etc. the current ads are really just there as a placeholder.


My advice is to be selective with your advertising partners. It still seems like an odd place to have ads, but if they were more tasteful or I had a say in what they were, I might be more inclined to use it. That said, it strikes me as a very odd revenue stream. If it were me, I would concentrate on making this the captcha of choice and then focus on monetization. The ads seem like they might kill off any chance of this happening.


How many different questions do you have? I ran it ~20 times and saw the same question 4-5 of those times.

One problem you might have is that classical captcha has 25^6 combinations, and you can generate a new set every x days by changing the algorithm used to generate the images.

Your set is human generated, and it seems that it wouldn't take long to build an answer set to use to break it, especially since it gives you a second and third chance at an answer.

A spammer could build a 'quiz' site with your questions and answers, and store the image hashes along with the answers in a db.
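
To make the catalog attack concrete, here is a minimal sketch. It assumes the challenge image is served as identical bytes every time; the choice of SHA-256 and the `AnswerCatalog` class are illustrative assumptions, not anything taken from BaffleBot.

```python
import hashlib
from typing import Dict, Optional

# Six characters drawn from a 25-symbol alphabet, as quoted above.
CLASSIC_KEYSPACE = 25 ** 6  # 244,140,625 combinations


def image_key(image_bytes: bytes) -> str:
    """Exact-match fingerprint of an image (hypothetical helper)."""
    return hashlib.sha256(image_bytes).hexdigest()


class AnswerCatalog:
    """Maps image fingerprints to answers harvested from a 'quiz' site."""

    def __init__(self) -> None:
        self._answers: Dict[str, str] = {}

    def record(self, image_bytes: bytes, answer: str) -> None:
        self._answers[image_key(image_bytes)] = answer

    def lookup(self, image_bytes: bytes) -> Optional[str]:
        return self._answers.get(image_key(image_bytes))


if __name__ == "__main__":
    catalog = AnswerCatalog()
    catalog.record(b"fake-image-bytes", "car")
    print(catalog.lookup(b"fake-image-bytes"))   # seen before
    print(catalog.lookup(b"fake-image-bytes!"))  # any byte change misses
```

An exact hash like this stops matching as soon as a single byte of the image changes, so an attacker would probably need some similarity-based fingerprint instead; the point is only that a small, fixed set of challenges is cheap to catalog.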


right now there is a set of seed data on there. there will hopefully be new challenges being entered every day, then i plan to use a quarantine, IP logging, etc. strategies to prevent catalog attacks. this is crowd sourced, so i hope to have a lot of people contributing challenges and grow the catalog base. also, i've got image mogrification working, so after a few days of use, each challenge can be modified to prevent automated catalog attacks.


Overall I thought it was great. I went through about 20 questions and got them all right. I also went through them about twice as fast as I would have with Captchas, and with Captchas I probably would have missed a few too.

I agree with the other posters though that it needs to be a bit more forgiving with spelling and adjectives. "Yellow Car" and "Red Apple" should both be accepted. I never saw Big Bird but if you decided to keep it, "bird" and "yellow bird" should be correct for unfamiliar users.

I tried to use tineye.com to see if a spammer could automate searches to determine what the picture is of (obviously wouldn't work on all pictures though) and it failed for 'car' and 'pencil'. Google Goggles on my phone identified the bird as a Lark: http://www.tropicalbirding.com/tripReports/TR_SouthIndia_Nov... which was interesting but not a correct answer.


The human submitters get to see what answers people have submitted. So, they can see that you typed "yellow car" and then decide if that is a legitimate answer. It's all crowd sourced. If you submit a challenge, you OWN that challenge, and you can see your hit and fail rates.
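
As a sketch of how that owner-review loop might be modeled (the `Challenge` class and its fields are hypothetical, not BaffleBot's actual data model): wrong answers are tallied so the owner can see them, and approving one makes it count as correct from then on.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Challenge:
    """One owner-maintained picture/question challenge (illustrative only)."""

    question: str
    accepted: Set[str] = field(default_factory=set)
    rejected_answers: Counter = field(default_factory=Counter)
    hits: int = 0
    fails: int = 0

    def submit(self, answer: str) -> bool:
        """Check an answer and update the hit/fail counters."""
        key = answer.strip().lower()
        if key in self.accepted:
            self.hits += 1
            return True
        self.fails += 1
        self.rejected_answers[key] += 1  # visible to the owner for review
        return False

    def approve(self, answer: str) -> None:
        """Owner decides a previously rejected answer is legitimate."""
        key = answer.strip().lower()
        self.accepted.add(key)
        self.rejected_answers.pop(key, None)

    def hit_rate(self) -> float:
        total = self.hits + self.fails
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    challenge = Challenge("What is it?", accepted={"car"})
    print(challenge.submit("Yellow Car"))  # rejected and logged
    challenge.approve("yellow car")
    print(challenge.submit("yellow car"))  # accepted after approval
    print(challenge.hit_rate())
```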


I definitely agree that the need is there. This was me when setting up google apps recently http://twitter.com/jackowayed/status/22408310028

Since people really have no way to know what kind of challenges you offer, it would be good if you had some screenshots/demos on your homepage instead of a generic "private beta" page.

You also might want to think about doing things like Google's image rotation captcha experiments. I was really excited when they announced that they were playing with those because I was hoping they'd kill traditional captchas. Here's a PDF of a paper by some Googlers: http://www.richgossweiler.com/projects/rotcaptcha/rotcaptcha...


Thoughts: ugly and mostly useless.

The problem is not beating bots. There are all kinds of ways of achieving that (some fairly effective ones without even requiring user interaction).

The hard one is stopping an outsourced third-world cubicle farm of human spam submitters. Solve that one and you're on to something.


like with email (where spam fighting has been quite successful) submitted posts should be run through spam-filters.


Allowing the person that writes the answer to submit a link is problematic, IMHO. Of the 3 I tried, 2 said "I love you" and linked to Glenn Beck. I personally dislike Mr. Beck very much, and that alone would keep me from using this on any project.

You are going to run afoul of people just as easily if someone links to Keith Olbermann, or a site about Gay Rights, or kicking out immigrants, or atheism, or the KKK, or [insert inflammatory thing here].

The whole reason for captchas is so someone can't put arbitrary links on YOUR site; embedding your anti-captcha solution just makes sure they are always in the same place. No thanks, I'll just use ReCaptcha.


that link was submitted by an alpha tester. he submitted one of the picture/question challenges, and in exchange, he gets rewarded impressions of his link.


I think you missed his/her point though. Captchas should prevent spam, allowing a user to include an arbitrary link that will appear on any random site is not much of an improvement to the site owner.


Yeah, but this problem already exists on WordPress comments for example. People are free to add whatever link they want with their comments. I don't think a Glenn Beck link is spam (though I do personally think he's a gimmick).


What are you doing with the data you collect?

I assume you're trying to follow in the footsteps of Luis von Ahn... http://www.youtube.com/watch?v=OvVAViDtKeA


i only collect answer data for the purposes of improving responses, nothing else. (not a huge fan of luis von ahn, for reasons i'll keep to myself).


Hi Kevin,

I like this, some feedback:

* Aside from the cultural challenges (I don't know who Big Bird is either), these are mostly better experiences to solve than captcha.

* The submit button is outside the rectangle. It's not visually part of the bafflebot area.

* Have a pay option with no ads. Let people pay per 1000 provided challenges, perhaps upfront. The ads are distracting and I run my own ads on my site, but I'd be happy to pay something reasonable for a captcha alternative

Overall, I think it works. Drop the cultural stuff, let me pay, and make it better looking and you'll have yourself a customer.


http://www.bafflebot.com - clickable

beta code: baffleboy


Publishers likely won't want the sort of user leakage that comes from an offsite link.

The most interesting CAPTCHA innovation I've seen recently is from AdCopy, who replace it with an advertisement that contains an answer the user must then provide, guaranteeing that the user pays attention to the ad. A portion of the revenue gets paid to the publisher, turning the CAPTCHA into a revenue stream.

Sample here: http://www.petitionspot.com/petitions/savetsl/captcha


Hum... yellow car. I entered Camaro and was told it was wrong. Was it a Chevelle? Are you using a semantic dictionary to decide if the answer was right or wrong, or a predefined word list?


Hum... yellow car. I entered Camaro and was told it was wrong. Was it a Chevelle?

It is clearly a Mustang Mach 1...'69 or '70 (http://www.bafflebot.com/challenges/main_image/39a43dd6d6e34... vs http://www.classiccarstudio.com/images/auction/1166/1.jpg)

I got one that could have been a rat or a mouse...hard to tell, but both were accepted. Same with the bird picture, bird and sparrow were accepted. But the picture of the corvette with the question "how many wheels does this object likely have?" is ambiguous. It could be 4, 5 (counting the spare tire), or 6 (if you're cheeky and count the steering wheel).


The answer probably was just "car".


Less annoying than most traditional captchas, I think it may cause problems for international users. You are asking for relatively specific words (I noted pencil, saw and puzzle) that not everybody will know. Also, the pencil sharpener on one challenge is not a common type in Europe AFAIK.

Idea: why aren't you using photos instead of vector graphics? They are also much easier to create (and more difficult for a machine to recognize, especially if you add some random noise).


i definitely have to add country codes for localization. as for the pictures, some of the challenges are from photos, but i grabbed a bunch of royalty-free clipart for the seed data. a few of my early testers did upload photos, and they work fine too.


Not a bad idea, but some of the questions + pictures I've got have just been plain stupid. One was of a character which I had never seen, let alone know the name of.

"What are these beds called?"

There are about 4 separate names for bunk/cabin beds.

Typos are another issue. "Where do you where these things?" Also, how the hell can you expect people to answer such a vague question?

I have to say that I think this is both harder, and less accessible than a traditional CAPTCHA.


Hmmm, I think a major problem is going to be culturally specific questions. E.g. I got a picture of 'Big Bird'. I'd imagine that this would be easy for most beta testers who are likely to be of a certain demographic and from a country where Sesame Street was aired, but it's going to be baffling for everyone else, as well as giving a feeling of being excluded.


Yes, I got several which were impossible for me to answer.


I'm au fait with US culture, but I got it wrong about a quarter of the time, which is no better than reCaptcha.


Your stuff is not culture free. It requires knowledge which may or may not be known by people. This knowledge may be obvious to you, but people are quite different all around the world. Usually captcha is about copying numbers. Arabic numerals are far more widespread than the English language.


I was checking out a few of your "baffles" and ran into a bug. After clicking refresh a few times I got stuck on a puzzle piece; the caption and description said to refresh it, but it just returned to the "please refresh".

I've repeated this 3x now; it appears to occur after the first reload.

(In Safari 5.0.1)


When I read the title, I was hoping to see a real alternative to captchas. What I see is not that different (pictures vs. letters/numbers).

Something revolutionarily different is needed.


when I went to go sign up this is what i saw: http://imgur.com/QGEhB.png

I would suggest not using that banner on that page. While one could just click hide, it is just one more step for a user to say "nah, why bother".

BTW, that was in Firefox 3.6.8 on Windows 7 Professional


thanks for the catch, i really appreciate the screen cap!


There's going to be problems with spelling and phrasing -

e.g. 'sewing machine', 'sowing machine', 'sewing'

or 'a jigsaw', 'jigsaw'

etc
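
One way to soften this, sketched here under assumptions: normalize case, punctuation and a leading article, then allow near-misses with Python's standard `difflib`. The 0.85 threshold and the helper names are arbitrary choices for illustration, and looser matching also makes blind guessing a little easier for bots.

```python
import difflib
import re
from typing import Iterable

ARTICLES = {"a", "an", "the"}


def normalize(answer: str) -> str:
    """Lowercase, drop punctuation, and strip one leading article."""
    words = re.sub(r"[^a-z0-9\s]", "", answer.lower()).split()
    if words and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)


def is_accepted(answer: str, accepted: Iterable[str], threshold: float = 0.85) -> bool:
    """True if the answer matches an accepted one exactly or closely enough."""
    candidate = normalize(answer)
    if not candidate:
        return False
    for target in accepted:
        expected = normalize(target)
        if candidate == expected:
            return True
        similarity = difflib.SequenceMatcher(None, candidate, expected).ratio()
        if similarity >= threshold:
            return True
    return False


if __name__ == "__main__":
    print(is_accepted("A Jigsaw", ["jigsaw"]))
    print(is_accepted("sowing machine", ["sewing machine"]))
    print(is_accepted("sewing", ["sewing machine"]))
```

With these settings 'a jigsaw' and 'sowing machine' pass, but a shortened answer like 'sewing' still fails against 'sewing machine', so someone would still have to list such variants by hand.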


the human in charge of maintaining that challenge will see answers come in and decide to accept them. If you sign up for an account, you'll get a dashboard, which shows you stats on all of your challenges, how many impressions you have, etc.


Where can I see this actually working?


You can see it on Pikk.com and on my blog kapauldo.com.


This is no good for those of us who aren't native English speakers - I keep getting the words wrong (writing English is not that big of a problem, as small errors don't destroy the reader's understanding, and if you don't know a word, hey, use another), and my English is better than 70% of the rest of my country.

This will go where all the other captcha replacements have gone - nowhere, sadly.



