I was thinking more of a machine-vision context, with e.g. different parts coming in at arbitrary rotation angles.
I know that some translation invariance comes from e.g. the usual conv+maxpool layer structure, but mustn't there still be several representations in the first hidden layer of the network stack, one for each possible translation shift?
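To make that point concrete, here is a minimal NumPy sketch (a toy 8×8 image and hypothetical `conv2d`/`maxpool` helpers, not any particular framework's API): convolution is translation-*equivariant*, and max-pooling only absorbs shifts smaller than the pool stride, so a larger shift still produces a shifted (i.e. different) representation.

```python
import numpy as np

def conv2d(img, k):
    # "valid" cross-correlation with a single kernel (toy implementation)
    H, W = img.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * k)
    return out

def maxpool(x, s=2):
    # non-overlapping s×s max-pooling
    H, W = x.shape
    H2, W2 = H // s, W // s
    return x[:H2*s, :W2*s].reshape(H2, s, W2, s).max(axis=(1, 3))

img = np.zeros((8, 8))
img[2, 2] = 1.0                      # a single bright "part" in the image
k = np.ones((3, 3))                  # blur-like detector kernel

pooled      = maxpool(conv2d(img, k))
pooled_s1   = maxpool(conv2d(np.roll(img, 1, axis=1), k))  # shift by 1 px
pooled_s2   = maxpool(conv2d(np.roll(img, 2, axis=1), k))  # shift by 2 px

# 1-pixel shift is absorbed by the 2×2 pool; 2-pixel shift is not.
print(np.array_equal(pooled, pooled_s1))  # small shift: same representation
print(np.array_equal(pooled, pooled_s2))  # large shift: shifted representation
```

So the first hidden layer really does hold a different activation pattern per coarse shift; the pooling only merges shifts within one pooling window.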
Rotation in particular looks like it should produce a lot of symmetry and shared parameters, but it also looks difficult enough that I would rather hear from someone with serious math/group-theory skills who has looked at it.
But thank you for the detailed reply anyway!