Emoji and Deep Learning (getdango.com)
168 points by wxs on June 9, 2016 | 50 comments



I have to say this was one of the more "fun" ML articles I have read lately. Excellent visualizations as well. Great job!


Thanks! I was tempted to go way more technical but we decided it would be fun to let more people get some intuitive grasp of these kinds of algorithms.


It was really effective! It was great to be able to visually "walk" through the state space. If you do another, more technical post digging into the details then I would definitely be interested.


I shared it internally with my team as well. :)


This is cool, even if it seems kind of frivolous. Word embeddings work for emoji just like they do for actual words, and it's neat to see an idea for how to commercialize that directly.

I wish they had explained details, such as what two-dimensional non-linear projection they're using for their map.

I also don't see it fully explained how they're getting representations of sequences of emoji. They explain how their RNN handles sequences of input words, but the result of that is a vector that they're comparing to their emoji-embedding space. Does the emoji-embedding space contain embeddings of specific sequences as well?


We use t-SNE for the projection! We do mention it somewhere in there, but we could maybe be clearer.

We ended up glossing over the sequences of emoji here (it's a difficult balance, keeping these concepts as accessible as possible). In the app we can beam search to predict combos, just as you would in sequence-to-sequence learning. That's not demo'd on the live website, though.
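
If you're curious what the projection step looks like in practice, here's a rough sketch using scikit-learn's t-SNE (the file name and embedding matrix are illustrative placeholders, not our actual pipeline):

    # Project learned emoji embeddings down to 2-D for the map.
    # Assumes an (n_emoji x d) matrix of embeddings saved to disk; names are hypothetical.
    import numpy as np
    from sklearn.manifold import TSNE

    emoji_vectors = np.load("emoji_embeddings.npy")   # one row per emoji
    coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(emoji_vectors)
    # coords has shape (n_emoji, 2); plotting each emoji at its (x, y) gives the semantic map.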


It's also a little... racist. If you feed it emojis it spits out other emojis (I was testing whether it could spit out text from emoji input). But what happens if you change the skin tone of the emojis?

White arm: http://i.imgur.com/KTNky0O.png Obvious connection to sports, sunglasses (like saying "cool" in this context)

Black arm: http://i.imgur.com/uXtSRfc.png Policeman searching something, a location marker (search location?)


(I am also a dev for Dango)

This is definitely a concern and something we've thought about but not yet fully solved. The neural net is trained on real-world data, which unfortunately includes various types of questionable, racist, sexist, etc. content. We've already blacklisted emoji combinations that are too often triggered in racist ways. However, such a system is very difficult to audit completely.

Your example comparing different skin tone modifiers is a good one that we hadn't thought of. I've made a note of it so we can try and improve.


Racism is in the eye of the beholder. You've injected a lot of non-standard meaning into those emoji.

http://emojipedia.org/customs/


Yeah, this is a real challenge with AI systems: they reflect both our worst and our best back at us, and we're still struggling with the best way to deal with this.

Although in this particular case it's actually just a bug: Dango gets confused by any skin tone modifier character, since they're not supported on Android (our target platform). Try putting in a "white" arm and you'll see the same results. They're actually just our "Dango is confused by this input" results.

We should fix the bug, of course!


I'm seriously struggling to make the 'racist' connection with your second photo here.


Reading this was somewhat bothersome in that my browser (Chrome on Linux) doesn't render emoji. Is there a standard font that supports emoji that could be installed?


You could try this: http://emojione.com/chrome/


Extremely neat, but I really don't understand the point of the app (Dango) that all this engineering is for. If I'm using an emoji, it's either instead of words, or to clarify words that could be taken multiple ways (e.g. sarcasm).

Who are these people that type a sentence (with a single meaning, clear-cut enough for Dango to detect), and then want to add a redundant pictorial representation of the same words they just typed?


Yeah so this is a legitimate concern. Of course sometimes it's fun to say "let's eat pizza :pizza_emoji:", but that's not hugely valuable.

However, Dango's training data includes people using emoji to augment rather than repeat their sentence. So if there are two different interpretations and an emoji could disambiguate, the ideal is that Dango has seen people use that phrase both ways, and that it suggests both possibilities so you can pick the one you meant. In many cases this works now; in many cases we still have work to do.

It also suggests based on messages sent to you, so if there are a couple different replies it can show you them all (although this feature still needs work).


This is really cool. But half the fun for me is to pick the emojis at the end of the message. And they "add" to the mood of my message, they don't "amplify" it. Hence this wouldn't work for me most of the time o_0 ;(


Well you can search in Dango, too ;)

But yeah, our main focus is suggestions. You can use Dango concurrently with the normal emoji keyboard, of course! It can just sit there showing you emoji you might not know about, "ambiently".


It seems like you're so close to having a full predictive virtual keyboard (with nothing but dynamically-generated keys).

Have you given any thought on integrating this with some sort of bluetooth thimble-like button (makey makey?) on each finger for untethered typing?

I've written more about this line of reasoning here[1] if you're interested. Feel free to ping me on twitter if there's any way I can help. Congrats on this awesome project!

[1]: https://news.ycombinator.com/item?id=11223697


The possibilities for how language will work in the future are really exciting and interesting! We made the Minuum keyboard, too, which also explores how machine learning assistance can open up new ways of communicating.

One reason we're interested in visual communication with Dango, though, is that regular text input is pretty good already. Chorded keyboards exist and are way faster, but people mostly can't be bothered to use them. QWERTY is just good enough. But the field is wide open for rich communication with images; nothing out there is particularly good yet.


I might add that this is pretty much exactly what Elon Musk is asking for with his "Neural Lace" idea to merge humans and machines in "symbiosis".

Replace thimble-keys with OpenBCI and you already have it.

Urbit.org sounds like a good fit for the immutable append-only content-addressable private keylog (now would be the time for a portmanteau generator). I would love to help in any way to make this happen.


This is pretty cool. Emoji seem trivial, but they're becoming more and more important in communications (whether that is good or bad is a separate discussion), and this is a pretty impressive bit of ML.


>IBM uses them for operationalizing business unit synergies

What does this mean?


I'm inclined towards thinking that it might be a joke.


;)


Nothing, and I presume they're making fun of IBM.


Why not train the RNN to directly predict emojis, instead of projecting everything to semantic space and picking the closest emoji? Seems like that would help with the problem of emojis with multiple meanings in different contexts. With this model, they could only be in a single point in semantic space.


The RNN does in fact directly predict emoji. It outputs a vector of length 1624 (the number of emoji) containing the score associated with each emoji given the input text. This vector of probabilities is what can be thought of as the point in semantic space.

The issue with multiple meanings is that if you strongly predict an ambiguous emoji (say the prayer emoji), how do you then extrapolate what concept is contained in the sentence (e.g. was the person saying "thanks" or "high five" or "please")?

[I'm also a Dango dev]


So yeah: we can focus on vectors at different levels of the net and these are in some sense different semantic spaces. In the article I talk about a level immediately before it projects onto the emoji vectors. If you look at the output after the projection (and do a softmax) you get a probability distribution across all emoji. This would be a different space in which each axis is an emoji, rather than the emoji being points distributed around the space.
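
To make that concrete, the projection-plus-softmax step looks roughly like this (dimensions and variable names are made up for illustration, not our actual model code):

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    hidden = np.random.randn(300)              # pre-projection "semantic" vector from the RNN
    emoji_matrix = np.random.randn(1624, 300)  # one learned vector per emoji

    scores = emoji_matrix @ hidden             # one score per emoji
    probs = softmax(scores)                    # probability distribution across all 1624 emoji
    top5 = np.argsort(probs)[-5:][::-1]        # indices of the five most likely emoji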


Awesome, thanks for clarifying. So does the training optimize some property of the "semantic" layer immediately before the final emoji prediction layer? Or does it just optimize accuracy of emoji prediction directly?

And then the t-SNE projection shown in the article is based on this same layer (one before prediction)?


Well those are sort of equivalent. But yeah, we use cross-entropy between the projected output and the target emoji distribution as our objective to minimize.

And yes, we do the t-SNE on that pre-projection space. That's why we can visualize the targets (emoji) in it. We can also t-SNE the word embeddings themselves — the input to the RNN — which is also kind of interesting. It automatically learns all kinds of structures there. Chris Olah has a good post on word embeddings if you're interested: http://colah.github.io/posts/2014-07-NLP-RNNs-Representation...
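
If it helps, the cross-entropy objective mentioned above boils down to something like this toy sketch (not our training code, which of course runs inside a deep learning framework):

    import numpy as np

    def cross_entropy(predicted_probs, target_distribution):
        # target_distribution is typically one-hot: the emoji actually used with the text
        eps = 1e-12
        return -np.sum(target_distribution * np.log(predicted_probs + eps))

    probs = np.full(1624, 1 / 1624)      # a uniform, maximally unsure prediction
    target = np.zeros(1624)
    target[42] = 1.0                     # pretend emoji #42 appeared in the training example
    loss = cross_entropy(probs, target)  # log(1624), about 7.39; training pushes this down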


Pretty cool. Tried about 10 sentences and the suggested emojis were spot on. Nice write up.


The real question is, where do you get this training data from?


All over the web! For instance, over 20% of all tweets have emoji in them. Emoji are very popular in Instagram comments, etc.


Another example is Venmo.

In fact I made a website that tracks the live count of emojis used on Venmo:

http://venmoji.com/

The source is here for anyone interested:

https://github.com/milesokeefe/venmoji


Can it generate sequences of emoji that it has not seen before?


Yes it can; language models can generate phrases that have not been seen before. Karpathy's RNN post is a pretty good intro to how that works: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
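
Concretely, "generating unseen sequences" just means stepping the model forward one emoji at a time. A generic sketch (the model object and its next_distribution method are hypothetical stand-ins, not Dango's API):

    import numpy as np

    def sample_sequence(model, max_len=5, end_token=0):
        sequence = []
        for _ in range(max_len):
            probs = model.next_distribution(sequence)  # hypothetical: P(next emoji | sequence so far)
            nxt = np.random.choice(len(probs), p=probs)
            if nxt == end_token:
                break
            sequence.append(nxt)
        return sequence  # can be a combination never seen in training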


This is cool. I hesitate to install it on my phone... does it send everything I type to a web service in order to suggest emoji?


As a lover of emoji and deep learning, this is awesome. Are you planning to support Unicode 9.0 sometime soon (I know it isn't even technically out)?


We start supporting them as soon as they're available! Obviously we have a lot less training data early on, but we can lean on some heuristics until it builds up.

Unfortunately in the app we can't give you emoji that your phone doesn't support so we don't always show all the results.


If you have (or build) a Slack integration, would you be able to include custom emoji? Or is that not enough input?


Yeah we'd need to do some more work. But this is similar to stickers and GIFs: there are many fewer examples for any given sticker or GIF, which is why we do transfer-learning approaches as discussed in the article.

So there's a good chance we could get it to work! We've not focused on that possibility… yet.
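
The rough idea of the transfer-learning approach, sketched out (layer sizes and names here are hypothetical, and only the new projection would be trained):

    import numpy as np

    rng = np.random.default_rng(0)
    num_custom = 50      # e.g. a team's custom Slack emoji
    hidden_size = 300    # size of the frozen sentence vector from the existing RNN

    W = rng.normal(scale=0.01, size=(num_custom, hidden_size))  # only these weights get trained

    def predict_custom(sentence_vector):
        scores = W @ sentence_vector
        e = np.exp(scores - scores.max())
        return e / e.sum()  # probability over the custom emoji set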


I figured. Even a Slack integration as-is would be pretty cool, though.


Great article, fun service. Still a part of me feels sad that we spend brain power on things like this.


How come I can't download the app from outside US?


> This app is incompatible with all of your devices.

Well that's disappointing. Is it country-restricted?


Quick! Somebody make a slackbot!


Do you guys have an API?


Not officially, although people have started reverse-engineering the website ;).

We could get an officially supported one up if there's interest. Email me at xavier@whirlscape.com!


Yeah, I saw that you guys take query text and spit out JSON with emojis :). It would be nice to have an official one so you guys don't cut IPs off :)


Nice post



