Hacker News

I actually agree with you; I was being a bit sarcastic. If I understand correctly, there is no fundamental difference between text output and pixel-data output in this context. If so, it suddenly sounds like much more of a stretch (intuitively) to claim that Stable Diffusion somehow understands the real world, as people claim language models do.


