How I Used Deep Learning to Train a Chatbot to Talk Like Me (adeshpande3.github.io)
220 points by adeshpandedsfd on Aug 10, 2017 | hide | past | favorite | 33 comments



It's always nice to read the details of people building real applications, even if it's not a home run.

BTW: just blurring people's names with a Gaussian blur operator is not enough if your radius is too small. From the screenshots of chats, you can tell names like "Arvind Sankar", "Manu Saravanan", etc.


Also, although difficult, Gaussian blurs can in certain cases be reversed (even though this one is so obvious there was no point in blurring the names). I prefer a fat black box when censoring.
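As a sketch of the "fat black box" approach using Pillow (the function name and box format are mine, just for illustration):

```python
from PIL import Image, ImageDraw

def redact(in_path, boxes, out_path):
    """Cover each (left, top, right, bottom) region with a solid black box.

    Unlike a Gaussian blur, a flat fill keeps no trace of the original
    pixels, so there is nothing to deconvolve or reverse.
    """
    img = Image.open(in_path)
    draw = ImageDraw.Draw(img)
    for box in boxes:
        draw.rectangle(box, fill="black")
    img.save(out_path)
```

A blur, by contrast, is a linear filter: with a small radius, enough high-frequency information survives that short names stay legible.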


If you want a more robust parser, I made scripts to export Messenger and Hangouts chat logs to dataframes for a previous project: https://github.com/MasterScrat/ChatShape

It will give you rows with: [timestamp, interlocutorName, messageSenderName, text]
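For illustration, a minimal sketch of that kind of flattening (not ChatShape's actual code), assuming the Messenger JSON export format with a `participants` list and `messages` entries carrying `sender_name`, `timestamp_ms`, and `content`:

```python
import json
import pandas as pd

def messenger_to_df(path, me="Me"):
    """Flatten one Messenger conversation export into one row per message.

    Assumes the JSON dump format from Facebook's "Download Your
    Information" feature; older or newer exports may differ.
    """
    with open(path) as f:
        data = json.load(f)
    # The interlocutor is whichever participant isn't us.
    interlocutor = next(
        (p["name"] for p in data.get("participants", []) if p["name"] != me),
        "unknown",
    )
    rows = [
        {
            "timestamp": msg["timestamp_ms"],
            "interlocutorName": interlocutor,
            "messageSenderName": msg["sender_name"],
            "text": msg.get("content", ""),
        }
        for msg in data["messages"]
    ]
    return pd.DataFrame(rows)
```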


Thanks! This is definitely useful.


This reminds me of this year's xyzzy award-winning interactive fiction game, The Mary Jane of Tomorrow. https://emshort.blog/2016/06/05/the-mary-jane-of-tomorrow/ (It won Best Single Puzzle and Best NPC.)



Just watched the episode today LOL Insane stuff


Great explanation! I also worked on a similar project (https://medium.com/towards-data-science/personality-for-your...).

Also feel free to check my Conversation Analyzer (https://github.com/5agado/conversation-analyzer), which includes a scraper and parser for Facebook Messenger conversations.


Nice writeup! Definitely seems to be a lot more realistic than my bot LOL. The parser is really helpful too.


Is it possible to make it spew out random personal information like addresses and phone numbers if you give it the right input? That's usually what happens when you train an RNN and it overfits.


Yeah, you're right, it is a possibility. Overfitting was definitely a problem in this project. I think a larger dataset and maybe adding some regularizers would have helped.

Luckily I don't think I have too much compromising information in my dataset LOL
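As a toy illustration of what one such regularizer does (not the author's actual training code), L2 weight decay adds a `l2 * w` term to the gradient, pulling weights toward zero and discouraging the model from memorising (and later regurgitating) individual training examples:

```python
import numpy as np

def sgd_step(w, grad, lr=0.1, l2=0.0):
    """One SGD update with optional L2 weight decay.

    With l2 > 0, the weights shrink toward zero even when the data
    gradient is zero -- the penalty trades a little training-set fit
    for less memorisation.
    """
    return w - lr * (grad + l2 * w)
```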


Doesn't seem like it's really any better than ELIZA or the Emacs doctor or other stuff that didn't use the buzzword tech of the year.


Ha! Just yesterday I had the narcissistic idea of gifting someone I've been chatting with regularly for years a chatbot that answers his messages like I would. This is just what I needed, thanks!


Awesome! Let me know if you get good results or find a way to improve my Seq2Seq model.


I have this idea pretty regularly. Human interactions are quite often annoying :).


Really cool project! Curious to see what type of changes you could make to get more realistic results.


while (0) { cout << "hurr. durr.\n"; sleep(1000); }


So exit immediately?


Hi. Welcome to chatbot/n.


[flagged]


I think we don't need this off-topic flamebait here. Please post civilly and substantively or not at all.

https://news.ycombinator.com/newsguidelines.html


Funny how someone without anything remotely technical in their comment history feels like they can speak for "the majority of tech people".


ad hominem super sleuth


I don't see anything bad in what he said. Also, I've met a decent number of people who come across as very "bro" but are actually really nice, considerate people, so being kinda bro isn't inherently bad (unless you define "bro" as essentially just being an ahole, which doesn't seem to apply here).


So... I went and read the article, expecting something really ridiculously bro. The article was well written, with good references, but yes, it uses a few silly conversation samples (none of which contain sexist, racist, or homophobic references).


Is it possible that training a bot to speak like that (silly/casual) is more challenging than emulating a very grammatically correct, stiff style? If so, maybe the choice of conversations was intentional?


My comment was directed towards the OP, who was harsh and didn't provide a proper argument. As I said, it was a nice article and I wanted to encourage others to read it, despite that (flagged) comment.


Yes, I was writing in support of your comment.


Being bro is synonymous with being male to some people.

It is, of course, still an insult to them.


I see it more often used to mock a male mono-culture and the pitfalls that derive from it (e.g. brogrammers), a bit like how the term douchebag has become a synonym for "jerk".


The author is probably a 20-year-old (give or take) undergrad student.

What is wrong with you people? Always having to put others in their "stereotypical silos" and telling them they are the racists and homophobes?!


Author seems to be into fantasy sports and must have a lot of group threads on that topic.


Yeah LOL, I think limiting the dataset to just 1:1 conversations would be better than including group chats as well.


Ideally most of them won't care what other people think about them. That's not something to be spending much thinking power on anyway.

There's nothing wrong with making tech that can communicate to a larger audience than "people in tech".



