I am not an expert in this article's particular application domain, but I can see why their pre-training argument might be valid. It is not especially controversial to say that pre-training neural nets improves few-shot learning performance. And I suspect that for every problem there is an inflection point beyond which pre-trained neural nets yield better few-shot performance than less data-hungry approaches, such as hand-crafted features or strong priors.
That said, the question here seems to be whether that inflection point has been reached in this case.