I think the example is simplified to make its point efficiently, but also: the moon is something whose size would very likely be precisely explained in texts about it. While some hunting journals might brag about the weight of a lion that was killed, or whatever, most texts that I can recall reading about lions basically assumed you already know roughly how big a lion is, which is indeed something I learned from pictures as a pre-literate child.
A good, precise spec is better than a few pictures, sure; the random text content of whatever training set you can scrape together, perhaps not (?)