We know how they work in the sense that we built the framework. We don't know how they work in the sense that we cannot decode what is "grown" on that framework during training.

If we completely knew how they worked, we could go inside and explain exactly why each generated token was generated. Right now that is not possible, as the paths the tokens take through the layers tend to look outright nonsensical when observed.
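
To make the distinction concrete, here is a toy sketch (not a language model, just a tiny 2-4-1 network trained on XOR with NumPy). The layer sizes, seed, learning rate, and step count are arbitrary choices of mine for illustration. Every line of the "framework" is readable, but what training leaves behind is just a grid of numbers:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, assumed for repeatability

# The "framework": we wrote every operation below and understand all of it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4))
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))
b2 = np.zeros(1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(inputs):
    """Return hidden activations and the network output."""
    h = np.tanh(inputs @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)


def mse():
    return float(np.mean((forward(X)[1] - y) ** 2))


initial_loss = mse()
lr = 0.5  # arbitrary learning rate

# Training: plain full-batch gradient descent on mean squared error.
for _ in range(5000):
    h, out = forward(X)
    d_out = 2 * (out - y) / len(X) * out * (1 - out)  # gradient at output pre-activation
    d_h = (d_out @ W2.T) * (1 - h ** 2)               # gradient at hidden pre-activation
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

final_loss = mse()

# What was "grown": nothing here is labeled with what it means.
print("loss before/after:", initial_loss, final_loss)
print("learned W1:\n", W1)
```

Even with only 17 parameters, reading the printed weights does not tell you "why" the network gives a particular answer. You have to work it out by probing. Scale that up to billions of parameters and you get the situation described above.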