
I don't think it's accurate to call AlphaGo 'improved tree search' the way that Deep Blue was improved tree search. You could with equal justice call it an improved neural net.


The correct characterization is that it's a hybrid of deep learning and tree search. It's also a hybrid in the feature representation for the algorithm. There are raw features (board state), but there are also handcrafted features for the tree search.


I'd say both characterizations are accurate. Neither the tree search nor the neural net could have accomplished this on its own. But the essential interface to the algorithm is the tree search: it's picking the best move from a set of legal moves determined by formal game rules. The real world doesn't follow formal game rules. I find it difficult to see the progression from this victory to some kind of real-world AI takeover.


Sure, but when you say "tree search" people think of the traditional, expert-system kind of tree search. Traditional methods prune the tree by approximating subtrees with an evaluation function built from very specific rules decided on by human experts. In such a system the computer cannot actually evaluate a position any better than the humans who designed it; it performs better only by running very many of these evaluations deep in the tree.
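To make that concrete, here's a minimal sketch of classical depth-limited minimax over a hypothetical toy game (the game, its rules, and the heuristic are all made up for illustration). The key point is that evaluate() encodes a fixed, human-designed heuristic, so the engine's positional judgment is capped at its designers'; strength comes from applying it very many times, deep in the tree.

```python
# Toy game: a state is a list of piece values (positive = ours,
# negative = opponent's); a move captures (removes) any one piece.

def evaluate(state):
    # Handcrafted heuristic: plain material count. The engine can
    # never judge a position more subtly than this rule allows.
    return sum(state)

def legal_moves(state):
    return [state[:i] + state[i + 1:] for i in range(len(state))]

def minimax(state, depth, maximizing):
    # Depth-limited tree search: recurse until the depth budget runs
    # out, then fall back on the handcrafted evaluation.
    if depth == 0 or not state:
        return evaluate(state)
    values = [minimax(m, depth - 1, not maximizing) for m in legal_moves(state)]
    return max(values) if maximizing else min(values)

print(minimax([1, -1, 2, -2], depth=3, maximizing=True))
```

Alpha-beta pruning and other refinements speed this up, but they don't change the ceiling: the leaf evaluation is still the hand-written heuristic.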

When the evaluation function is of some other kind, such as Monte Carlo rollouts, you usually prefix that to the name (for Monte Carlo rollouts, "Monte Carlo tree search" or MCTS) to indicate that, beyond the basic task common to almost all game AIs (finding the optimal branches in a decision tree), it functions completely differently from an expert system.
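The rollout idea can be sketched in a few lines, here for a toy game of Nim (take 1-3 stones, whoever takes the last stone wins; the game choice is mine, just for illustration). Instead of a handcrafted heuristic, a position is scored by playing many random games to the end and averaging the results:

```python
import random

def rollout(stones, to_move):
    # Play uniformly random moves to the end of the game;
    # return 1 if player 0 wins, else 0.
    player = to_move
    while stones > 0:
        stones -= random.randint(1, min(3, stones))
        if stones == 0:
            return 1 if player == 0 else 0
        player = 1 - player
    return 0

def mc_value(stones, to_move, n=5000):
    # Monte Carlo evaluation: estimated win probability for player 0.
    # No domain knowledge beyond the rules is baked in; the evaluation
    # emerges statistically from random play.
    return sum(rollout(stones, to_move) for _ in range(n)) / n

print(mc_value(4, to_move=0))
```

Full MCTS adds a tree policy (e.g. UCT) to decide which branches get more rollouts, but the leaf evaluation is exactly this kind of averaged random playout.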

The same is true of this program, which (in a first pass) approximates subtrees with a trained neural net rather than Monte Carlo rollouts or an expert system. So using terminology that suggests classical expert-system tree search is bound to cause confusion (as you noticed).
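A minimal sketch of that substitution, with stand-in weights (AlphaGo really did mix the value network's output with a rollout outcome via a parameter λ, but the tiny "network" and feature vector below are illustrative, not the real architecture):

```python
import math

W = [0.3, -0.1, 0.7]  # stand-in weights; in AlphaGo these came from training

def value_net(features):
    # Toy one-layer "value network": state features in, win probability out.
    z = sum(w * f for w, f in zip(W, features))
    return 1 / (1 + math.exp(-z))  # sigmoid

def evaluate_leaf(features, rollout_value, lam=0.5):
    # AlphaGo blended both evaluation signals at each leaf:
    # (1 - lam) * value-net estimate + lam * rollout outcome.
    return (1 - lam) * value_net(features) + lam * rollout_value

print(evaluate_leaf([1.0, 0.0, 0.5], rollout_value=0.6))
```

The tree search machinery around it is unchanged; what's new is that the leaf evaluation was learned from data rather than written by hand.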


> The real world doesn't follow formal game rules.

Really?

Why not?


Infinite state/belief/world space. Infinite action space. It's not so much that there aren't rules (there are - physics), it's that the complexity of the full set of rules is exponential or super-exponential.


You can do math with continuous and infinite dimensional spaces.


And? This does not address my argument that the complexity is beyond-combinatorially explosive (infinite spaces). I'm not talking about the space of possible board states. I'm talking about merely the set of all possible actions.

EDIT: clarified my language to address below reply.


...and it's possible to train learning agents to sense and interact with a world described by high-dimensional continuous vector spaces, for instance using conv nets (for sensing audio/video signals) and actor-critic methods to learn a continuous policy:

http://arxiv.org/abs/1509.02971

Whether the (reinforcement) learning problem is hard is not directly related to whether the observation and action spaces are discrete or continuous.
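The "continuous action" part is less exotic than it sounds. Here's a sketch of a deterministic actor in the style of the linked paper: observation vector in, bounded real-valued action vector out. The weights below are random stand-ins (in practice they're learned from the critic's gradient), and the dimensions are made up:

```python
import math
import random

OBS_DIM, ACT_DIM = 4, 2  # illustrative sizes

random.seed(0)
# Stand-in weight matrix; a trained actor would learn these.
W = [[random.uniform(-0.5, 0.5) for _ in range(OBS_DIM)]
     for _ in range(ACT_DIM)]

def actor(obs):
    # Linear layer + tanh squashing: each output lands in [-1, 1],
    # i.e. a bounded continuous action (e.g. steering, throttle).
    return [math.tanh(sum(w * o for w, o in zip(row, obs))) for row in W]

action = actor([0.1, -0.3, 0.8, 0.0])
print(action)  # two real numbers, each in [-1, 1]
```

The action space is continuous, but the policy is still just a function you can evaluate and differentiate; nothing about continuity makes the problem unlearnable.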


see also by the same team:

http://arxiv.org/abs/1602.01783


There is a near-infinite number of such spaces.


I don't understand why the "number of spaces" would matter. What matters is whether you can design a learning algorithm that performs well in interesting spaces, such as:

- discrete spaces, such as Atari games and Go;
- continuous spaces, such as driving a car, controlling a robot, or bidding on an ad exchange.


A really, really large number of distinct decisions that need to be made. A car only needs to control a small set of actuators (wheels, engine, a couple of others I'm missing). A game player only needs to choose from a small set of actions (e.g. place a piece at position X, or move left/right/up/down/jump).


A human brain also has a limited number of muscles to control to interact with the world.


And a much larger set of decisions.


Go is combinatorially explosive, too.
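The scale is easy to check with integer arithmetic. Each of the 361 intersections is empty, black, or white, so 3^361 bounds the number of board configurations (most are illegal, but the legal count is of the same order), and a branching factor of roughly 250 over a roughly 150-move game gives the commonly cited game-tree estimate of 250^150:

```python
# Back-of-the-envelope scale of Go's search space.
positions_bound = 3 ** 361        # upper bound on board configurations
game_tree_estimate = 250 ** 150   # ~250 moves per turn, ~150 turns

print(len(str(positions_bound)))     # 173 decimal digits, i.e. ~10^172
print(len(str(game_tree_estimate)))  # 360 decimal digits, i.e. ~10^359
```

Stupidly large, but finite; which is rather the point of the comment below.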


Your assertion that the universe has infinite anything is a common mistake. A stupidly large number of states is not infinite.




