Anyone who has tried editing latents before decoding them with the VAE, first in SD1.5 and then in SDXL, will have seen the difference: in SD1.5, local changes had somewhat unpredictable, global effects on the image, while in SDXL the changes have a more predictable impact on the output image, and some of the latent channels end up corresponding more directly to the resulting image channels.
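For anyone who wants to put a number on "local vs. global", here is a rough sketch of one way to measure it: nudge a small patch of one latent channel, decode before and after, and count how many output pixels moved. Everything in it is an assumption for illustration. `changed_fraction` and the two toy decoders are hypothetical names, and the toy decoders are simple stand-ins, not the SD1.5 or SDXL VAE. To test a real model you would pass your own wrapper around its decode call as `decode`.

```python
import numpy as np

def changed_fraction(decode, latents, channel, ys, xs, delta=1.0, tol=1e-6):
    """Fraction of output pixels that change when one latent patch is nudged.

    decode  : callable mapping latents (C, h, w) to an image (3, H, W)
    ys, xs  : slices selecting the latent patch to edit
    """
    edited = latents.copy()
    edited[channel, ys, xs] += delta
    diff = np.abs(decode(edited) - decode(latents)).max(axis=0)  # max over RGB
    return float((diff > tol).mean())

def toy_local_decoder(latents):
    # Stand-in (NOT a real VAE): 8x nearest-neighbour upsampling of 3 channels,
    # so each latent cell only touches its own 8x8 block of pixels.
    img = np.repeat(latents[:3], 8, axis=1)
    return np.repeat(img, 8, axis=2)

def toy_global_decoder(latents):
    # Stand-in (NOT a real VAE): same upsampling, then per-channel mean removal,
    # so a change anywhere shifts every pixel of that channel.
    img = toy_local_decoder(latents)
    return img - img.mean(axis=(1, 2), keepdims=True)

rng = np.random.default_rng(0)
latents = rng.normal(size=(4, 8, 8))
patch = (slice(2, 4), slice(2, 4))  # 2x2 latent cells

local = changed_fraction(toy_local_decoder, latents, 0, *patch)
spread = changed_fraction(toy_global_decoder, latents, 0, *patch)
print(local, spread)
```

With the local stand-in only the pixels under the edited patch move, while with the global stand-in every pixel does. Where a real decoder falls between those two extremes is the thing to measure.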
Definitely a fascinating write-up. I have been curious about these differences for a while, though I had never considered this a "problem" per se.