Isn't that 3 dimensions (amplitude, time, and frequency)? The plot of course fills 2 spatial dimensions and uses color to represent the 3rd dimension. But I don't know very much about this.
I don't have a mathematically rigorous understanding of it, but the number of dimensions is basically the number of freely varying inputs to the corresponding functional representation.
In a 2d image, x position and y position are mapped to a color, e.g. I(x,y) = C.
In a spectrogram, frequency and time are mapped to an amplitude (drawn as a color), e.g. S(f,t) = A.
In neither case can you pick an arbitrary color or amplitude and, in general, recover a unique x/y or f/t from it. The color or amplitude is the output, not a free input, so it doesn't count as a dimension.
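To make that concrete, here's a rough sketch of S(f,t) = A in plain Python. It's not how a real library does it: the `spectrogram` helper, the non-overlapping frames, and the test tone are all just assumptions I picked for illustration. The point is that you index the result with two numbers (a time frame and a frequency bin) and get one amplitude back.

```python
import cmath
import math

def spectrogram(signal, frame_len):
    """Return S[t][f]: magnitude of DFT bin f in frame t.

    Illustrative only: non-overlapping frames, no windowing, naive DFT.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    spec = []
    for frame in frames:
        row = []
        for f in range(frame_len // 2 + 1):
            # Naive DFT sum for a single frequency bin
            acc = sum(x * cmath.exp(-2j * math.pi * f * n / frame_len)
                      for n, x in enumerate(frame))
            row.append(abs(acc))
        spec.append(row)
    return spec

# Hypothetical test signal: a sine with exactly 2 cycles per 16-sample frame
frame_len = 16
signal = [math.sin(2 * math.pi * 2 * n / frame_len) for n in range(64)]
S = spectrogram(signal, frame_len)

# Two free inputs (t, f) give one output A
A = S[0][2]

# Going the other way is ambiguous: many (t, f) cells share the same amplitude
same_amplitude = [(t, f) for t, row in enumerate(S)
                  for f, a in enumerate(row) if abs(a - A) < 1e-9]
```

Here `same_amplitude` ends up with one entry per frame, so a given amplitude doesn't point back to a single (f,t), which is the same thing as not being able to go from a color back to one pixel.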