You don't transpose it before the matmul; it's always stored transposed. I.e., when you print the weights of a linear layer in PyTorch, you're actually seeing (A^T)^T, and what's stored is A^T.
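
For reference, a minimal sketch using only the standard PyTorch API: nn.Linear stores its weight with shape (out_features, in_features), so forward() computes x @ weight.T + bias, and the .T there is just a strided view, not a copy done per matmul:

    import torch
    import torch.nn as nn

    # nn.Linear(in_features=3, out_features=5) stores its weight with
    # shape (out_features, in_features) = (5, 3), i.e. already A^T.
    layer = nn.Linear(3, 5)
    print(layer.weight.shape)  # torch.Size([5, 3])

    # forward() is equivalent to x @ weight.T + bias; weight.T is a
    # view with swapped strides, so no transposed copy is materialized.
    x = torch.randn(2, 3)
    y_manual = x @ layer.weight.T + layer.bias
    assert torch.allclose(layer(x), y_manual)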

