And? This does not address my argument that the complexity is beyond-combinatorially explosive (infinite spaces). I'm not talking about the space of possible board states; I'm talking merely about the set of all possible actions.
EDIT: clarified my language to address the reply below.
...and it's possible to train learning agents to sense and interact with a world described by high-dimensional continuous vector spaces, for instance using conv nets (for sensing audio/video signals) and actor-critic methods to learn a continuous policy:
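For concreteness, here is a minimal sketch of what "actor-critic with a continuous (Gaussian) policy" looks like in PyTorch. The toy one-step environment, network sizes, and hyperparameters are all made up for illustration, not taken from any particular paper:

```python
# Minimal actor-critic sketch with a continuous Gaussian policy (illustrative only).
import torch
import torch.nn as nn

obs_dim, act_dim = 4, 2

# Actor: maps observation -> mean of a Gaussian over continuous actions.
actor = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(), nn.Linear(32, act_dim))
log_std = torch.zeros(act_dim, requires_grad=True)  # learned exploration noise

# Critic: maps observation -> scalar value estimate V(s).
critic = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(), nn.Linear(32, 1))

opt = torch.optim.Adam(
    list(actor.parameters()) + list(critic.parameters()) + [log_std], lr=3e-3
)

def toy_env_step(obs, action):
    # Made-up reward: higher when the action matches the first two obs components.
    return -((action - obs[..., :2]) ** 2).sum(-1)

for step in range(200):
    obs = torch.randn(64, obs_dim)                  # batch of fake observations
    dist = torch.distributions.Normal(actor(obs), log_std.exp())
    action = dist.sample()                          # continuous action vector
    reward = toy_env_step(obs, action)

    value = critic(obs).squeeze(-1)
    advantage = (reward - value).detach()           # simple one-step advantage

    policy_loss = -(dist.log_prob(action).sum(-1) * advantage).mean()
    value_loss = ((reward - value) ** 2).mean()

    opt.zero_grad()
    (policy_loss + value_loss).backward()
    opt.step()
```

The point of the sketch is just that the policy outputs the parameters of a distribution over an infinite (continuous) action set, and learning proceeds exactly as in the discrete case.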
Whether the (reinforcement) learning problem is hard is not directly related to whether the observation and action spaces are discrete or continuous.
I don't understand why the "number of spaces" would matter. What matters is whether you can design a learning algorithm that performs well in interesting spaces such as:
- discrete spaces, such as Atari games and Go,
- continuous spaces, such as driving a car, controlling a robot, or bidding on an ad exchange (see the sketch below).
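To make that distinction concrete, here is how the two kinds of action space look using gym's space types; the specific sizes (18 discrete joystick actions, a 3-dimensional control vector) are illustrative assumptions:

```python
import numpy as np
import gym

# Discrete: a finite set of choices, e.g. the joystick actions of an Atari game.
atari_actions = gym.spaces.Discrete(18)

# Continuous: a real-valued vector, e.g. (steering, gas, brake) for a car.
car_controls = gym.spaces.Box(low=-1.0, high=1.0, shape=(3,), dtype=np.float32)

print(atari_actions.sample())   # an integer in [0, 18)
print(car_controls.sample())    # a float vector with infinitely many possible values
```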
A really, really large number of distinct decisions need to be made. A car only needs to control a small set of actuators (wheels, engine, a couple of others I'm missing). A game player only needs to choose from a small set of actions (e.g. place a piece at position X, or move left/right/up/down/jump).