I still don't understand this. If a blob of C code passes a large structure by value, doesn't it go on the stack? Why would the compiler be required to pass such an object only via the registers?
It's generally expected that functions compiled with different compilers (for the same target architecture) can call each other. This only works if the compilers agree on where function arguments go, which is what the platform's calling convention (part of its ABI) pins down. Since every compiler has to follow the same rule anyway, it's better for that rule to require the fast way (registers where possible) than the slow way (always through memory).
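To make that concrete, here's a rough sketch in C. The struct and function names are made up for illustration, and the register/stack comments assume the x86-64 System V calling convention (Linux, macOS); other ABIs draw the line differently. Under that convention, as far as I know, a struct of up to 16 bytes made of integers can travel in two general-purpose registers, while a larger one is passed in memory on the stack.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical 16-byte struct: small enough that, assuming the
 * x86-64 System V ABI, it can be passed in two integer registers. */
struct Small {
    int64_t a;
    int64_t b;
};

/* Hypothetical 64-byte struct: too big for registers under the same
 * assumption, so the caller copies it into the stack argument area. */
struct Large {
    int64_t v[8];
};

static_assert(sizeof(struct Small) <= 16, "Small should fit in two registers");
static_assert(sizeof(struct Large) > 16, "Large should exceed the register limit");

/* Takes its argument by value; the caller and callee must agree on
 * where the two fields live at the moment of the call. */
int64_t sum_small(struct Small s)
{
    return s.a + s.b;
}

/* Also by value, but the ABI (not the individual compiler) decides
 * that this one is read from memory rather than from registers. */
int64_t sum_large(struct Large l)
{
    int64_t total = 0;
    for (int i = 0; i < 8; i++) {
        total += l.v[i];
    }
    return total;
}
```

If you compile this with something like `gcc -O2 -S` and read the assembly, you can check where each function actually looks for its argument on your platform. The point is that every compiler targeting the same ABI has to make the same choice, otherwise `sum_small` built by one compiler could not be called from code built by another.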