"different concepts have inherent geometrical forms"
Absolutely; in fact, you can build the foundation of mathematics on this concept. You can build proofs and reasoning on it (for some value of "reasoning").
That's how dependent type systems work; search for homotopy type theory (HoTT) and modal homotopy type theory. That's how Lean 4, Coq, and other theorem provers work.
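To make that concrete, here's a minimal Lean 4 sketch (names are mine, purely for illustration): a dependent type whose length is part of the type itself, and a proof written as a term inhabiting a proposition-as-type.

    -- Vec α n: lists whose length n is part of the type itself (a dependent type).
    inductive Vec (α : Type) : Nat → Type where
      | nil  : Vec α 0
      | cons {n : Nat} (x : α) (xs : Vec α n) : Vec α (n + 1)

    -- A proof is just a term inhabiting a proposition-as-type (Curry–Howard).
    theorem and_swap (p q : Prop) : p ∧ q → q ∧ p :=
      fun ⟨hp, hq⟩ => ⟨hq, hp⟩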
If you remember, at their foundation lambda calculus and Boolean algebra proceed through a series of transformations of mathematical objects that are organized as lattices or semi-lattices, i.e. partially ordered sets (e.g. in Boolean algebra, the partial order is given by implication).
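As a rough sketch of what "implication as the partial order" means, in Lean 4 you can check the order laws directly (antisymmetry uses propext, i.e. propositional extensionality); the meet and join of the lattice are then ∧ and ∨:

    -- Implication as a partial order on Prop: reflexive, transitive,
    -- and antisymmetric (antisymmetry via propositional extensionality).
    theorem impl_refl (p : Prop) : p → p := fun h => h

    theorem impl_trans (p q r : Prop) (hpq : p → q) (hqr : q → r) : p → r :=
      fun hp => hqr (hpq hp)

    theorem impl_antisymm (p q : Prop) (hpq : p → q) (hqp : q → p) : p = q :=
      propext ⟨hpq, hqp⟩

    -- In this order, ∧ acts as the meet (greatest lower bound) and ∨ as the join.
    example (p q : Prop) : p ∧ q → p := fun ⟨hp, _⟩ => hp
    example (p q : Prop) : p → p ∨ q := fun hp => Or.inl hp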
It would be interesting to understand whether the density of attention mechanisms follows a progression similar to dependent type systems, and whether we can find a link between the dependent types involved in a proof and the corresponding spaces in an LLM, via some continuous relaxation analogous to a proximal operator plus some transformation from high-level concepts into output tokens.
We have already found that geometry in embeddings has meaning: specific simple concepts correspond to vector directions. I wouldn't be surprised at all if we find that reasoning over dependent concepts corresponds to more complex subspaces in the paths an LLM takes, and that with enough training these connections get closer and closer to the logical structure of the corresponding proofs (for a self-consistent input corpus, like math proofs, and given enough training data).
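As a toy illustration of "concepts as directions": the classic word-vector arithmetic. The vectors below are made up for the example, not real embeddings; in practice you'd take them from word2vec, GloVe, or an LLM's embedding layer.

    import numpy as np

    # Made-up toy embeddings, just to show the idea of a concept as a direction.
    emb = {
        "king":  np.array([0.8, 0.6, 0.1]),
        "queen": np.array([0.8, 0.1, 0.6]),
        "man":   np.array([0.2, 0.7, 0.0]),
        "woman": np.array([0.2, 0.2, 0.5]),
    }

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    # The "gender" concept is roughly the direction (woman - man); adding it
    # to "king" should land near "queen" if the concept really is a direction.
    candidate = emb["king"] - emb["man"] + emb["woman"]
    best = max(emb, key=lambda w: cosine(emb[w], candidate))
    print(best)  # with these toy numbers: "queen"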