
Almost. The KL divergence tells you how much additional information (in bits or nats) it costs, on average, to encode a random sample from your distribution using a code optimized for the target distribution.
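As a minimal sketch of that definition: for two one-dimensional Gaussians the KL has a closed form, and it is zero exactly when the two distributions coincide (no extra information needed). The function name and parameters here are illustrative, not from the thread.

```python
import math

def gaussian_kl(m_p, s_p, m_q, s_q):
    """KL(P || Q) in nats for 1-D Gaussians P = N(m_p, s_p^2), Q = N(m_q, s_q^2)."""
    return math.log(s_q / s_p) + (s_p**2 + (m_p - m_q)**2) / (2 * s_q**2) - 0.5

# A distribution against itself carries zero extra information:
assert abs(gaussian_kl(0.0, 1.0, 0.0, 1.0)) < 1e-12

# The further P drifts from Q, the more nats a Q-optimal code wastes:
print(gaussian_kl(1.0, 1.0, 0.0, 1.0))  # 0.5 nats
```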

Here, the target distribution is the unit Gaussian, which serves as the prior and is defined as the point of zero information. The KL between the encoder's output and the prior tells us how much information can flow from the encoder to the decoder. You don't want the KL to be zero, but it should usually stay fairly close to zero.

You can think of the KL as the number of bits you are willing to spend compressing your image into the latent code.
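To make the "number of bits" reading concrete: for the usual diagonal-Gaussian encoder q(z|x) = N(mu, diag(sigma^2)) against the unit-Gaussian prior N(0, I), the KL has a closed form in nats, and dividing by ln(2) converts it to bits. The mu/sigma values below are made-up encoder outputs for illustration.

```python
import math

# Hypothetical per-dimension encoder outputs for one image:
mu    = [0.5, -1.2, 0.0, 2.0]
sigma = [0.8,  1.0, 1.0, 0.5]

# Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over
# latent dimensions, in nats:
kl_nats = sum(0.5 * (m * m + s * s - 2.0 * math.log(s) - 1.0)
              for m, s in zip(mu, sigma))

# Divide by ln(2) to read the KL as bits of information the latent
# code carries from encoder to decoder about this image:
kl_bits = kl_nats / math.log(2)
print(kl_nats, kl_bits)
```

Note that a dimension with mu = 0 and sigma = 1 exactly matches the prior and contributes zero bits, which is why a KL driven all the way to zero means no information flows.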


