This is a great example of how Google could out-do the iPhone. Most non-trivial mobile applications have some processing that happens in the cloud. If Android phones could translate voice in real time during a phone call and the iPhone could not, which phone would you consider buying?
Realistically, though, Google wants their services and applications on all platforms; now if only the App Store would approve these...
Yeah, but to me the quality of any manufacturer other than Apple falls short. I've owned Windows Mobile, Android (HTC Hero) and now an iPhone. The iPhone for me is superior.
Google, I think, needs to make their own device to match the quality/stability of the iPhone!
One step closer to photographing a sign in a foreign country and doing OCR followed by Google Translate, maybe followed by overlaying the translated text back on the sign.
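A minimal sketch of what that pipeline might look like, assuming pytesseract/Pillow for the OCR and drawing steps; the translate() helper here is a hypothetical stand-in for whatever translation service you'd actually call:

```python
# Hypothetical sketch of the sign-translation pipeline: OCR the photo,
# translate each word, and draw the translation back over the sign.
from PIL import Image, ImageDraw
import pytesseract

def translate(text, target="en"):
    # Stand-in: call a real MT service (e.g. Google Translate) here.
    return text

def translate_sign(photo_path):
    img = Image.open(photo_path)
    # Word-level bounding boxes along with the recognized text.
    data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
    draw = ImageDraw.Draw(img)
    for i, word in enumerate(data["text"]):
        if not word.strip():
            continue
        x, y, w, h = (data[k][i] for k in ("left", "top", "width", "height"))
        # Blank out the original word and overlay the translation.
        draw.rectangle([x, y, x + w, y + h], fill="white")
        draw.text((x, y), translate(word), fill="black")
    return img
```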
I think this would be a killer feature. I've never had much of a positive response when posting it as an idea previously - is it that if it really would be a killer feature, people would be all over the idea as well as the implementation, or is it possible for a little-supported idea to turn into a killer feature?
(Edit: Or maybe it's just more a European thing where several foreign languages are a few hours drive in any direction?)
One thought I had is that there really aren't enough different signs.
In most foreign countries, I've been able to recognize signs for bathrooms and such, and most place names I can memorize (even if I can't read the language I can still think "okay, I want to go to the one with the squiggly second character").
I actually think this has high "cool factor", but have difficulty really coming up with uses. I think restaurant menus (as mentioned by my sibling comment) is probably the only real use, and even then knowing the name of the dish doesn't guarantee an item I'm able to eat.
Also, when I go abroad, I am definitely not enabling data roaming. In some countries, buying a prepaid SIM requires lots of documentation which is difficult to do if wandering around doing tourist-y things, etc.
(Sorry, I do think the idea is cool, I'm just trying to come up with reasons why it's not the neatest thing since sliced bread.)
Uses would be more about being a tourist and getting a feel for where you are than men's/women's toilets - as you say, you can memorise those fairly quickly.
More like, standing in a subway, there's a poster, it says something about the train something ... wonder what? Walking past a big lake, there's a sign about some project that involves a pipe low down and a pipe further up, but what is it doing - taking thermal energy or using the water as coolant? Walking through the city streets, there's a plaque on the side of a building - something something 1872 something national? people? something something Joan 1st. Eh? What's on these fliers that are being handed out? What are the menu descriptions?
The easier it is to do, the more likely you are to do it - it's only for important things that you would fish out a dictionary and start deciphering; this would be for anything and everything which attracts your interest.
> Also, when I go abroad, I am definitely not enabling data roaming.
I think this is an important point, which is frequently forgotten. So many apps that would be useful in a foreign country are useless because they require internet access. When I went to Bulgaria last year I would have loved to have a translator app for my iPhone, but every single one available required internet access, which there's no way I'm paying for at £3/MB!
If you are with a provider that is everywhere anyway, like Vodafone or T-Mobile, the idea of roaming is very obviously just about gouging money. It's not as if T-Mobile has to pay a third party by the MB to ship data from Germany to the UK; they own all the infrastructure anyway!
True, but without a data connection you're limited to the client's processing power. With analysis on the server, you can use algorithms that require more code, more CPU, or more memory, as well as much larger datasets.
I saw a working demo of this in an iPhone app a few months ago: it's like some kind of dark magic to see the translated words show up in the still-moving picture. I don't know if the guy has released his stuff yet, though.
Confirmed on my MyTouch (worst phone name evar - though very happy with the beast).
The app works fantastically in practice in my tests, though it choked on one case tonight. Had the kids in the car, drove by a movie theater -- no showtimes/reader board visible outside (the theater is in the mall).
Tried to get showtimes by 'reading' the AMC theater logo on the side of the building ("Daddy, why are you taking a picture of that building?"). No go. Google Maps and GPS to the rescue.
Augmented reality starts becoming more than just a pipe dream. Expect a lot more of this.
P.S. Just went to ARdevcamp at Hacker Dojo last Saturday - it was a really exciting event. There were a lot more people than I expected, and lots of interesting discussion about this emerging space.
Fully integrated augmented reality. Once we get mind-machine interfaces working, we could have an application that looks at every object around you, checks which one you're focusing on, and somehow puts the entire Wikipedia article on it straight into your mind.
Can anyone with an Android phone who has tried this out for a few hours/days report on how well it works in practice? This looks pretty fantastic, especially if it can also read stuff like QR codes, etc.
Just gave it a test and it works pretty well. The image needs to be clear, but it was able to read the American Express logo off of my bill, a website address off of a keychain, and the Time Warner logo off of the remote.
Edit: I tried an image of the Centrino and Vista logos side by side on my laptop and its top result was for a German book. It did get the Centrino logo in the Other results, though.
Tried it out: searched a book cover... worked. Searched some text on a page... worked. I wasn't going outside in 20-degree weather to check the locations, but from inside it kinda worked... not bad.
Worked pretty well on logos around the office. Also great for business cards: it recognised all the data, but the add-to-contact step was not selecting the most relevant data.
It did great with QR codes too, but those were already easy to read, and the Barcode Scanner app is faster for them.
This works really well. Picked out the KU logo and Swingline staplers perfectly. Choked on a big box o' Jolly Ranchers. I'll prolly keep futzing with this for the rest of the day now.
Damn you for making me want an Android-based phone, Google! First free turn-by-turn navigation, now this. Argh. Once AT&T gets a good Android phone, I'm switching.
cell phone cams + face recognition + facebook ==> super creepy
EDIT: Since people post so many pics of themselves on Facebook under all sorts of lighting conditions and angles, that might actually make face recognition somewhat feasible from a cell phone cam. Of course, efficiently searching thru a corpus of millions of faces (each taken at several different angles) is an enormous technical challenge. I envision some app like Shazam being developed to recognize faces rather than songs...
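For what it's worth, here's a toy sketch of what a Shazam-for-faces lookup reduces to, assuming some hypothetical embed() model that maps a face crop to a fixed-length vector - the brute-force scan below is exactly the part that wouldn't scale to millions of faces without an approximate index:

```python
# Toy nearest-neighbour face lookup. embed() is a hypothetical
# face-embedding model; at millions of faces the O(n) scan would
# need to be replaced with an approximate nearest-neighbour index.
import numpy as np

def embed(face_image) -> np.ndarray:
    raise NotImplementedError  # stand-in for a real embedding model

class FaceIndex:
    def __init__(self):
        self.names, self.vecs = [], []

    def add(self, name, face_image):
        self.names.append(name)
        self.vecs.append(embed(face_image))

    def lookup(self, face_image, k=5):
        q = embed(face_image)
        m = np.stack(self.vecs)
        # Cosine similarity against every enrolled face -- O(n) per query.
        sims = m @ q / (np.linalg.norm(m, axis=1) * np.linalg.norm(q))
        best = np.argsort(-sims)[:k]
        return [(self.names[i], float(sims[i])) for i in best]
```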
Yep, FB has an enormous corpus of accurate training data due to its millions of users performing tagging on hundreds or thousands of photos each. I would venture to say that they're the only ones that could do this/enable this to be done.
I did this kind of research 4 years ago, based on general Internet images. It was a failure, but with the rise of Facebook it might work now. Someone has already been working on this for some time: face.com.
I agree that for some specific tasks OpenCV is not particularly useful (I had to write the detector and trainer from scratch even though there is opencv haartraining; the OpenCV crowd is mostly interested in applying it to real-time video processing and puts little effort into Internet-scale problems).
But for assembly optimization it is really not as useful as it used to be. Scalability is more about getting sub-linear time complexity and an efficient communication pattern. Nowadays C compilers generate fairly good assembly, and for low-level optimization humans cannot compete with the machine (how many people know the particular cache-line alignment trick on the old Core i7?). Multimedia instructions (SSE/MMX/3DNow!) are useful, but most of them can be reached through function calls (intrinsics) instead of hand-crafted assembly.
> of course, efficiently searching thru a corpus of millions of faces
There was research cited here a while ago on the high accuracy obtained by constraining the search space to within one's social graph (hundreds vs millions). While it might not identify the stranger at the bar, it might identify their companions and so on. Six degrees of separation in real-time.
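As a sketch of that constraint (friends_of() and match_face() below are hypothetical stand-ins for a social-graph API and a face-similarity score; the point is just that the candidate pool shrinks from millions to the photographer's extended network):

```python
# Sketch: restrict face matching to within two degrees of the viewer.
from collections import deque

def friends_of(person):
    raise NotImplementedError  # stand-in for a social-graph API

def match_face(face_image, person):
    raise NotImplementedError  # stand-in for a face-similarity score

def candidates_within(user, degrees=2):
    # Breadth-first walk of the social graph out to `degrees` hops.
    seen, frontier = {user}, deque([(user, 0)])
    while frontier:
        person, d = frontier.popleft()
        if d == degrees:
            continue
        for friend in friends_of(person):
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, d + 1))
    return seen

def identify(face_image, viewer):
    # Match against hundreds of candidates instead of millions.
    pool = candidates_within(viewer, degrees=2)
    return max(pool, key=lambda p: match_face(face_image, p))
```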
If I understand it correctly, it just uses your GPS and magnetometer to work out where you are and which way you're pointing; it doesn't actually use the video data like it does for the still pictures you feed it.
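If that's right, the landmark lookup could be as simple as picking the known landmark nearest the direction you're facing - a guess at the approach, not Google's actual code, with a made-up landmark table and tolerance:

```python
# Guess at GPS+compass landmark picking: choose the known landmark
# whose bearing best matches the phone's heading. LANDMARKS and the
# tolerance are made up for illustration.
import math

LANDMARKS = {"Golden Gate Bridge": (37.8199, -122.4783),
             "Coit Tower": (37.8024, -122.4058)}

def bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360

def landmark_in_view(lat, lon, heading, tolerance=15):
    best, best_off = None, tolerance
    for name, (llat, llon) in LANDMARKS.items():
        # Signed angular difference, wrapped into [-180, 180).
        off = abs((bearing(lat, lon, llat, llon) - heading + 180) % 360 - 180)
        if off <= best_off:
            best, best_off = name, off
    return best
```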
Reminds me of a much more practical version of some of the cool bits from the "Sixth Sense" interface that was presented at TED / all over the news a while ago - http://www.pranavmistry.com/projects/sixthsense/ . There, their prototype was also able to "look at" an object and return information about it immediately (e.g. you hold up a book in front of the camera, a processor recognizes it and pulls some relevant information, and the mini projector next to the camera projects the information back onto the book cover).
The latest iPod Touch has a video camera but no GPS. The same is true of various other non-phone handheld tablets. (Though landmark recognition might not be useful on those, since apparently it actually uses the GPS and compass as input...)
Speculation: because those aren't as mobile as a phone, and because Android needs something to differentiate itself from the iPhone. First seamless turn-by-turn directions, now this.