
If you have ever tried editing the latents before decoding them with the VAE, first in SD1.5 and then in SDXL, you will have seen that local changes have somewhat unpredictable, global effects on the image in SD1.5. In SDXL the same kind of change has a more predictable impact on the output image, and some of the latent channels end up corresponding more directly to the resulting image channels.
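For anyone who wants to try this, here is a rough sketch of the kind of local latent edit described above. It uses NumPy with random stand-in latents only; the helper names (`edit_latent_patch`, `latent_box_to_pixel_box`, `changed_fraction`) are made up for illustration, and the actual VAE decode step is left out because it needs model weights. The one thing it relies on is that both the SD1.5 and SDXL VAEs use 4 latent channels at 1/8 of the image resolution.

```python
import numpy as np

VAE_SCALE = 8  # both VAEs downsample each spatial side by a factor of 8


def edit_latent_patch(latents, channel, box, delta):
    """Return a copy of `latents` (C, H, W) with `delta` added to one channel inside `box`.

    `box` is (top, left, bottom, right) in latent coordinates, end-exclusive.
    """
    top, left, bottom, right = box
    edited = latents.copy()
    edited[channel, top:bottom, left:right] += delta
    return edited


def latent_box_to_pixel_box(box, scale=VAE_SCALE):
    """Map a latent-space box to the pixel region it nominally covers."""
    return tuple(v * scale for v in box)


def changed_fraction(before, after, tol=1e-6):
    """Fraction of elements that differ between two arrays (latents or decoded images)."""
    return float(np.mean(np.abs(after - before) > tol))


# Stand-in latents for a 512x512 image: 4 channels at 64x64 (random, not from a real model).
rng = np.random.default_rng(0)
latents = rng.standard_normal((4, 64, 64)).astype(np.float32)

box = (16, 16, 24, 24)  # an 8x8 patch in latent space
edited = edit_latent_patch(latents, channel=0, box=box, delta=1.0)
pixel_box = latent_box_to_pixel_box(box)  # the 64x64 pixel region the patch nominally covers
```

To check the claim, you would decode `latents` and `edited` with each VAE and apply `changed_fraction` to the decoded images outside `pixel_box`. If the observation holds, that number should come out noticeably higher for SD1.5 than for SDXL.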

Definitely a fascinating write-up. I have been curious about these differences for a while, though I had never considered this a "problem" per se.



