If you're interested in this sort of thing, you might want to take a look at a very simple neural network with two attention heads that runs right in the browser in pure JavaScript. You can view source on this implementation:
https://taonexus.com/mini-transformer-in-js.html
Even after training for a hundred epochs it doesn't work very well (you can test it in the Inference tab after training), but it doesn't use any libraries, so you can see the math itself in action in the source code.
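To give a flavor of what "the math itself" looks like without libraries, here is a rough sketch of a single attention head's forward pass in plain JavaScript. This is not code from the linked page: the function names (softmax, matmul, transpose, attentionHead) and the toy weights are made up for illustration, and it leaves out things a real model would likely have, such as causal masking and training.

```javascript
// Illustrative sketch only; not taken from the linked page.
// All names here (softmax, matmul, transpose, attentionHead) are hypothetical.

// Numerically stable softmax over one row of scores.
function softmax(row) {
  const max = Math.max(...row);
  const exps = row.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Plain matrix product of a (n x k) and b (k x m), both arrays of rows.
function matmul(a, b) {
  return a.map((row) =>
    b[0].map((_, j) => row.reduce((acc, v, i) => acc + v * b[i][j], 0))
  );
}

// Swap rows and columns.
function transpose(m) {
  return m[0].map((_, j) => m.map((row) => row[j]));
}

// One attention head: softmax(Q K^T / sqrt(dHead)) V.
// x is (tokens x dModel); wq, wk, wv are (dModel x dHead) weight matrices.
function attentionHead(x, wq, wk, wv) {
  const q = matmul(x, wq);
  const k = matmul(x, wk);
  const v = matmul(x, wv);
  const scale = Math.sqrt(wq[0].length);
  const scores = matmul(q, transpose(k)).map((row) =>
    row.map((s) => s / scale)
  );
  const weights = scores.map(softmax); // each row sums to 1
  return { weights, output: matmul(weights, v) };
}

// Toy usage: three tokens with 2-dimensional embeddings and identity weights.
const x = [[1, 0], [0, 1], [1, 1]];
const identity = [[1, 0], [0, 1]];
const { weights, output } = attentionHead(x, identity, identity, identity);
console.log(weights); // 3 x 3 attention weights
console.log(output);  // 3 x 2 mixed value vectors
```

A model with two heads would run this twice with separate weight matrices and combine the two outputs, typically by concatenating them.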