I think young children's drawing ability is more indicative of the type of tool we are giving them. They only have the ability to draw a fixed-width line, so how else would you represent a limb?
Well, it's not at all obvious that a line is the naive representation of a limb rather than a particularly intelligent encoding of it.
For example: CNNs, though pretty good at detecting limbs (and miscellaneous other things), have only a very limited ability to encode structural information in this way. An interesting open question in the field is what the "right way" to encode this sparse, graph-like structural data is (hence capsule networks).
Absolutely, but that requires advanced fine motor control, an understanding of how the instrument lays down color and what multiple layers of color look like on top of each other, and so on.
The naive way to use the instrument is to run it over the area one or a few times. The simplest way to do that in terms of motor control (e.g. fewest turns) is to run it up and down the longest axis one or more times. That's exactly what a child does.
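To put rough numbers on the "fewest turns" point, here is a toy sketch. It assumes the limb is a plain rectangle filled with back-and-forth strokes of a fixed-width pen. The function name `turns_to_fill` and this whole model are my own illustration, not a claim about how children actually plan strokes.

```python
import math

def turns_to_fill(width, height, pen_width, along_long_axis=True):
    """Count direction reversals needed to cover a width x height rectangle
    with back-and-forth strokes of a fixed-width pen (hypothetical toy model)."""
    long_side, short_side = max(width, height), min(width, height)
    # Strokes run parallel to one side; the other side sets how many passes are needed.
    across = short_side if along_long_axis else long_side
    passes = math.ceil(across / pen_width)
    # Each pass after the first costs one turn.
    return passes - 1

# A limb-like region: 10 units long, 2 units wide, drawn with a 1-unit pen.
print(turns_to_fill(10, 2, 1, along_long_axis=True))   # 1
print(turns_to_fill(10, 2, 1, along_long_axis=False))  # 9
```

Under these assumptions, stroking along the long axis takes 1 turn versus 9 across it, which is the motor-control saving I mean.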