Models never "see" anything to begin with; it's all matrices. And since we ditched convnets, locality barely even matters anymore. RGB is just convenient for humans; nothing says it's optimal for deep learning.
Yeah, real humans see with a Fourier transform in a highly optimized basis for projecting 3D down to the 2D retina. Not cold, soulless math like those machines!
Humans see more than 8-bit RGB. We can perceive light polarization and stereo disparity, but more importantly, we can interact with the things we look at.
CIE XYZ is arguably the "most optimal" 3-channel colorspace, but it's a simple linear transformation from RGB, so it doesn't matter: the model can learn it if it wants to.
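For the record, that transformation really is trivial for a network to learn: linear sRGB to CIE XYZ is a single 3x3 matrix multiply. A minimal sketch using the published sRGB/D65 coefficients (gamma decoding of encoded sRGB is omitted for brevity):

```python
import numpy as np

# Published linear-sRGB -> CIE XYZ matrix (D65 white point, IEC 61966-2-1).
RGB_TO_XYZ = np.array([
    [0.4124, 0.3576, 0.1805],  # X
    [0.2126, 0.7152, 0.0722],  # Y (luminance)
    [0.0193, 0.1192, 0.9505],  # Z
])

def rgb_to_xyz(rgb: np.ndarray) -> np.ndarray:
    """Convert linear-light sRGB values (shape [..., 3]) to CIE XYZ."""
    return rgb @ RGB_TO_XYZ.T

# A pure-white pixel lands on the D65 white point, roughly (0.9505, 1.0, 1.089).
white = np.array([1.0, 1.0, 1.0])
print(rgb_to_xyz(white))
```

Since this is just one bias-free linear layer, the first projection of any vision model can absorb it, which is why the choice between RGB and XYZ inputs shouldn't matter much in practice.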