they don't, and they talk about the difficulties in their paper. I found it refreshing to see how frank and open they are in addressing this. But it's all pretty compelling, and it will surely prompt and sustain a lot more research, both investigating these results and data and generating more in the future.
How do they know if their AI did it correctly or not?