DarkForest: Deep learning engine for playing Go from Facebook research (github.com/facebookresearch)
135 points by adamnemecek on June 17, 2016 | 59 comments



The name alludes to a book by Cixin Liu, the second in The Three Body Problem trilogy: https://www.amazon.com/dp/B00IQO403K/

This is perhaps the best of the hundred or so sci-fi novels I've read in the past few years. These novels will make you question the wisdom of sending traceable radio signals into space. Stay quiet; it's a dangerous Dark Forest out there.


I am a bit baffled that people think of it so highly. I found Three-Body deeply flawed, but at least it moved at a good pace; I could barely get through the first few chapters of Dark Forest. The characters lacked any depth, and the writing and structure were poor as well.

Unfortunately, just having a fascinating premise and a good story isn't enough for me. I'd almost rather it were an essay, a thought experiment, than a novel.


I'm in the same boat: I didn't enjoy the writing at all and could not progress very far.


To be fair, the version you read was likely translated from Chinese.


Definitely a great book. Eagerly awaiting the final book later this year.


Thank you, I'm downloading the audiobook from Audible right now. Just in time for the evening commute!


You should start with the first one in the trilogy, "The Three-Body Problem". Both are great reads; I'm excited for the third book to be translated in September.


I was sad when I finally finished The Dark Forest after listening to The Three-Body Problem on my walk to/from work.


I wasn't aware they'd gotten around to translating the second one! Thanks for the reminder.


Coincidentally I just finished reading this. Highly recommended for anyone interested in sci-fi!


And as usual, Kindle edition is more expensive than paperback :(


My Kindle is literally covered in dust because of this.

Not to mention that with the paperback I own something, while with the Kindle version I have a revocable license to download a DRM-protected file...


As usual? I rarely see this...

My problem is how many Kindle books I see are $9 in the US store and then $30 in the Canadian store. For the same digital content :/


Why don't Facebook, Google, and potentially MS, Apple, etc. set up a yearly tournament for the world title of 'Champion of Go'? It would be similar to the F1 races run by the car manufacturers.


Taking your question at face value (why don't they?): Because right now, so far as I can tell, Google is so far ahead that there would be little interest in the contest.

Perhaps Facebook will start to catch up, or Microsoft will suddenly appear with something startling, but for the moment the only real competition for AlphaGo comes from top human players. (Perhaps not even them? I don't think anyone really knows.)


> Google is so far ahead that there would be little interest in the contest.

Sounds like EXACTLY the motivation a contest would help to foster.


I think he's saying that, right now, the other companies would expect to be crushed in such a competition, so why wouldn't they spare themselves the embarrassment?

And because we're doing silly memes, to accept the challenge, Microsoft and Facebook would probably have to go into the competition with this kind of attitude: https://www.youtube.com/watch?v=9ZYg4ZbcOPQ


Go has a few interesting properties that make it an appealing target for AI researchers (perfect information, no randomness, an abundance of highly-trained human players to benchmark against), but if you were trying to devise an AI-vs-AI tournament you would probably concoct a more difficult challenge (or use one of the human games that AIs still have problems with, like no-limit Texas Hold'em, or the board game Diplomacy).


There is already a computer NL hold'em tournament, which can be found here: http://www.computerpokercompetition.org


I thought about that too. But then I realized that Google is better than anyone else at this for now.


I think the expectation is that they move on to the next challenging problem eventually.


It would still be interesting if they only did it in the short term, say 2-3 years.


Would those companies have any interest in publishing all their research once winning this tournament actually means something?


Sure, why not? There isn't any competitive advantage in the AI model for beating this game, though there's probably tangential knowledge and experience gained.

Maybe someone can clarify but it doesn't seem like hoarding this model (specific to playing Go) would help Google have an advantage in their search algorithms over Facebook, for example.


If they did, they should allow humans and limit the amount of energy used during the matches, like calories in a tuna sandwich.


That's such a cool idea! OMFG


I've read a lot about AlphaGo, so I'll make a go at offering a quick explanation of (as far as I understand) how DarkForest compares to AlphaGo in terms of implementation.

Both do the cool thing of combining Monte Carlo Tree Search (which used to be the state of the art, and is essentially very smart brute-force tree search) with deep learning, and there is not much more to DarkForest than that, so we can describe it first. The deep learning portion involves training a deep policy convolutional neural net, meaning a neural net that takes in a Go position (actually, both take in Go-specific features about the position) and outputs the best moves. Datasets of human moves are used to train this, so it's pretty straightforward. This policy net is used to guide the Monte Carlo Tree Search, which just means using the net's move predictions to play out a bunch of games into the future to evaluate the best move. In classic MCTS you play out to the end of the game and estimate the 'value' of a move by the ratio of wins it gets at the bottom of the tree, but the estimation of value is more complicated in both systems. In DarkForest they 'Use PUCT and virtual loss. Remove win rate noise'. The move with the highest estimated value based on the tree search (guided by the neural net) is chosen as the next move.
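To make the PUCT-style selection concrete, here's a minimal Python sketch of picking a move by combining the playout value Q with a policy-prior exploration bonus. The names and data layout are my own invention for illustration, not DarkForest's actual code:

```python
import math

def puct_select(children, c_puct=5.0):
    """Pick the child maximizing Q + U, where U is the PUCT exploration
    bonus weighted by the policy network's prior P.

    children: list of dicts with keys
      'Q' - mean value from playouts through this move
      'P' - prior probability assigned by the policy net
      'N' - visit count so far
    """
    total_visits = sum(ch['N'] for ch in children)
    sqrt_total = math.sqrt(total_visits + 1)

    def score(ch):
        # Bonus shrinks as a move gets visited, but stays large for
        # moves the policy net likes - that's how the net guides search.
        u = c_puct * ch['P'] * sqrt_total / (1 + ch['N'])
        return ch['Q'] + u

    return max(children, key=score)
```

Note how a lightly-visited move with a high prior can outrank a well-explored move with a slightly better Q: that is the policy net steering the tree search toward plausible moves.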

AlphaGo does a bunch in addition to this which is hard to sum up, but basically:

1. They train a fast policy net and a slow (but better) one, using the same dataset. Both are used to guide the tree search; the slow-but-better net is used less often than the fast net, so many games can be played out during search.

2. The slow-but-better policy net is improved through reinforcement learning: making the system play itself and learn from that, rather than only from the human-moves dataset.

3. Since there is a policy network that can choose moves, one can also derive a 'value' network to evaluate how good a position is. This is done, and the value network is used as part of an equation to compute the value of positions in the tree search (there is also a fast-rollout evaluation, and the final value used in the tree search is a weighted combination of the two).

4. All this runs in an absurdly distributed manner with an enormous amount of compute: the 'non-distributed' version of AlphaGo uses 40 search threads running on 48 CPUs, with 8 GPUs doing neural net computations in parallel, and the 'distributed' version uses more than a thousand CPUs and close to 200 GPUs. I don't even know if that's up to date; those numbers are at least a few months old.
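Per the AlphaGo paper, that weighted combination is a simple linear mix of the value network's estimate and a rollout outcome, with a mixing weight of 0.5. The function below is my own sketch of it, not code from either project:

```python
def leaf_value(value_net_estimate, rollout_outcome, lam=0.5):
    """Blend the value network's estimate of a leaf position with the
    outcome of a fast rollout played from it, as AlphaGo's search does
    (lambda = 0.5 in the Nature paper).

    Both inputs are scores in [-1, 1] from the current player's
    perspective; lam = 0 trusts only the value net, lam = 1 only
    the rollouts.
    """
    return (1 - lam) * value_net_estimate + lam * rollout_outcome
```

The interesting design choice is that neither signal alone was as strong as the 50/50 mix: the value net is smooth but approximate, while rollouts are noisy but grounded in actual play-outs.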

To sum up: DarkForest combines a policy network trained by supervised learning with Monte Carlo Tree Search for move selection. AlphaGo uses a small, fast policy network trained by supervised learning AND a slow-but-better policy network trained by both supervised and reinforcement learning to guide Monte Carlo Tree Search, and computes the value of leaf positions with a combination of fast-rollout outcomes and a 'value' network derived from the slow policy network. AlphaGo is also massively distributed (which helps a lot). Oh yeah, it should be noted that the engineering for AlphaGo involved dozens of people (two dozen are listed on the paper), whereas DarkForest seems to have been developed mainly by two people.


Thanks for the summary.

I suppose the question one could then ask is "will AlphaGo's approach wind up being emulated over time, or is it going to be something like a cul-de-sac?"

How many single algorithmic challenges are worth expending this much effort on? Could AlphaGo's approach be applied to other such problems? Will increasing processor speed just make all this effort moot? Is AlphaGo something like Deep Blue (the custom computer that beat Kasparov and then was dismantled rather than being developed further)?


These are all precisely the right questions to ask about this, I think.

My take is that AlphaGo's approach is more applicable to other problems than Deep Blue's, but not by much. Rigid rules make tree search and reinforcement learning easily applicable to Go, but not so much to many real-life problems. I made a small diagram to illustrate this point (http://www.andreykurenkov.com/writing/images/2016-4-15-a-bri...) as part of a series of posts about Game AI (http://www.andreykurenkov.com/writing/a-brief-history-of-gam...).

Still, the general ideas of supervised learning followed by reinforcement learning, training multiple models of varying complexity from the same dataset, and combining tree search with learned models the way they did are useful in general. Hybrid methods as a whole will become increasingly common, I think (no doubt self-driving cars are already very complicated hybrid systems).


> I don't even know if that's up to date, those number are at least a few months old.

Yeah, probably not given this recent announcement:

https://cloudplatform.googleblog.com/2016/05/Google-supercha...


Curious whether the name DarkForest is based on the Three-Body Problem sequel, The Dark Forest, and whether the dark forest theory outlined in the book somehow relates to any aspect of their learning/decision algorithm.

I would outline the theory here, but it's kind of a book spoiler. So be wary if you have interest in reading it eventually and want to know what I'm referring to now.


This sort of strategy game AI in general works on a Dark Forest-ish theory, recursively evaluating what it would do if it were in the opponent's shoes to predict the reaction it will get to each possible move.

Since Go is zero-sum, the only reasonable strategy is to be maximally malicious and assume the other player is too.
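That recursive "assume the worst" evaluation is just minimax. A tiny negamax sketch, where all the game-specific functions are hypothetical placeholders rather than anything from DarkForest:

```python
def negamax(state, depth, moves_fn, apply_fn, eval_fn):
    """Evaluate a zero-sum position by recursively assuming the
    opponent picks whatever is worst for us.

    moves_fn(state) -> legal moves; apply_fn(state, move) -> next state;
    eval_fn(state) -> score from the side-to-move's perspective.
    In a zero-sum game, our value is the negation of the opponent's,
    which is why a single max with a sign flip suffices.
    """
    moves = moves_fn(state)
    if depth == 0 or not moves:
        return eval_fn(state)
    return max(-negamax(apply_fn(state, m), depth - 1,
                        moves_fn, apply_fn, eval_fn)
               for m in moves)
```

MCTS engines like DarkForest replace the exhaustive recursion with sampled play-outs, but the sign-flipping "what would I do in their shoes" logic is the same.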


An interesting code name for this project - slightly ominous given the novel of the same name?


Can anyone comment on why the author used Lua?

I'm curious if it's for Torch, or perhaps there's something more fundamental about why it's a good language for AI that I don't know yet.


Mostly because that's what Torch uses. Lua itself is extremely fast for a scripting language, but while that characteristic is vital in video games (where Lua is used quite a bit), it's pretty irrelevant for machine learning, since the core of these engines is usually written in C/C++ for speed and the scripting language is just an entry point that makes the API and data entry easy.

Still, I think Lua is ultimately a liability for Torch, and it's not helped by the fact that not only is TensorFlow a superior competitor (in my opinion), but TensorFlow is also based on Python (which is vastly more popular than Lua).

Despite all their investment in Torch, it wouldn't surprise me if Facebook eventually transitioned to TensorFlow, because that's probably what it will take for them to compete effectively against Google on the machine learning front.


Though note that AlphaGo used Torch too, so it can't be the source of Facebook's weakness at a Go engine.

https://m.facebook.com/story.php?story_fbid=1015344288418214...


It's a high level language that's super fast.


Facebook is heavily invested in Torch, a learning framework built in Lua.


Can anyone comment on the CPU requirements for such a system (either this one or AlphaGo)? How many cumulative CPU/GPU hours are required to get to this point?


I would be very interested in finding out what the key factors are that make Alphago stronger than Dark Forest.


Purely from a hardware standpoint, AlphaGo runs on TPUs, and Google asserts these ASICs offer "an order of magnitude better-optimized performance per watt for machine learning."[0] Facebook doesn't seem to have an equivalent, AFAIK.

[0] https://cloudplatform.googleblog.com/2016/05/Google-supercha...


That doesn't make TPUs better at achieving performance, just cheaper to run. My bet is on the size of the team: FB only put in two people, or so it seems, while Google invested 10x more man-time.


It most certainly does mean they are better at achieving performance. The real headline of the optimization claim is that they are capable of more ops per second, not just that they use less energy and are therefore cheaper. For a more open/mainstream analogue with freely available metrics, look at the performance difference in Bitcoin mining between GPUs and ASICs.


I'm curious when their engine will be good enough to start playing against leading human Go players?


It's plenty good to beat everyone up to very strong amateurs. It's not at professional level yet.


DarkForest vs. AlphaGo - is it a possible scenario in which DarkForest learns from AlphaGo?


If AlphaGo ever gets released, sure. DarkForest can learn from corpuses of AG self-play games.


Does anyone have an idea how strong this in terms of stones?


AlphaGo could probably give it 9 stones and still win, 9p->5d. (Stones go non-linear around the pro ranks, so we can't just count the rank difference in the way we do as amateurs.)


OK, I see the reference now.

It's not even the strongest bot on KGS.

It seems like an indication that either they haven't incorporated the advances of AlphaGo, or that AlphaGo succeeded through a huge investment of tuning time and processing power rather than through specific advances.


That's too much. Let's say AlphaGo is 9 dan pro or stronger. It would be 4 or 5 stones to a 6 dan amateur, one more stone to a 5 dan. Let's say AG is a couple of stones stronger than the strongest humans. Still not a nine-stone handicap.


There is about a 3-stone difference between 9p and 1p, 1p is probably about 8d, and there is about a 3-stone difference between 8d and 5d. So in total 6+ stones (AlphaGo may be stronger than 9p, of course).
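For illustration, that arithmetic can be put into a toy rank-to-stones converter. The rank spacings below are just the rough assumptions stated here (amateur dan ranks ~1 stone apart, the 9 pro ranks spanning ~3 stones, 1p around amateur 8d), not any official conversion:

```python
def stones_between(rank_a, rank_b):
    """Estimated handicap stones between two Go ranks like '5d' or '9p'.

    Purely illustrative; real rank gaps vary by server and era.
    """
    def to_scale(rank):
        n, kind = int(rank[:-1]), rank[-1]
        if kind == 'd':
            # Amateur dan: roughly one handicap stone per rank.
            return n
        # Pro ranks compress: 1p sits near amateur 8d, and the
        # eight steps from 1p to 9p span only about 3 stones.
        return 8 + (n - 1) * (3 / 8)

    return to_scale(rank_b) - to_scale(rank_a)
```

Under these assumptions, `stones_between('5d', '9p')` comes out to about 6 stones, matching the 6+ estimate above.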

The version of AlphaGo that beat Fan Hui won 77% of games against the strong engine Crazy Stone giving a 4-stone handicap, even when running on a single machine. Crazy Stone has improved since then, though.


So when are we gonna pit it against Google Brain?


>We hope that releasing the source code and pre-trained models are beneficial to the community.

Translation: Google beat us and there's no point keeping this private anymore. ;)


Playing them against each other would be a pretty cool way for the two companies' AI teams to compete though.


5d KGS is a world away from professional play, Google would win every game.


That would be equivalent to Lebron James going head to head with a disabled 5 year old.


No, it would be equivalent to LeBron James going head to head with a talented amateur basketball player. Of course he would still win every time, but it would at least be a game of basketball.


Is someone working to port it to TensorFlow?



