
Important note from the paper: the resolution is currently limited to 384x384.



Seems like a massive buried lede in an “outperforms the previous SoTA” paper.


Great for generating favicons!


Don't most architectures resolve this with super-resolution or some upscaling pipeline afterwards that adds the details?

IIRC, Stable Diffusion XL uses a "refiner" after the initial generation.


The SDXL refiner is not an upscaler. It's a separate model with the same architecture, used at the same resolution as the base model, that focuses more on detail and less on large-scale generation. (You can actually use any SDXL-derived model as a refiner, or none; most community SDXL derivatives use a single model with no refiner and beat the Stability SDXL base/SDXL refiner combination in quality.)


Ouch, that's even smaller than the now-ancient SD 1.5 which is mostly 512x512.
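
For scale, here is a rough sketch of the pixel arithmetic behind that comparison. The helper name `pixel_count` is made up for illustration; the only inputs are the 384x384 and 512x512 figures mentioned in this thread.

```python
def pixel_count(side: int) -> int:
    """Total pixels in a square image with the given side length."""
    return side * side


paper_pixels = pixel_count(384)  # resolution limit quoted from the paper
sd15_pixels = pixel_count(512)   # typical SD 1.5 resolution per the comment above

# Fraction of SD 1.5's pixel budget that a 384x384 image covers.
ratio = paper_pixels / sd15_pixels

print(paper_pixels, sd15_pixels, ratio)  # 147456 262144 0.5625
```

So a 384x384 image has a bit over half the pixels of a 512x512 one.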


The obvious point of a model that works like this is to see if you can get better prompt understanding. Increasing the resolution in a small model would decrease the capacity for prompt adherence.



