The paper should definitely be clearer on this point, but there's a sentence in section 5.2.3 that makes me think the model was playable, and was in fact played: "When playing with the model manually, we observe that some areas are very easy for both, some areas are very hard for both, and in some the agent performs much better." It may be a failure of imagination on my part, but I can't think of another reasonable way to interpret "playing with the model manually".