I think you might be missing some appropriate context. I agree that it is ridicu...

I think you might be missing some appropriate context. I agree that it is ridiculous to expect a language model to be good at symbolic manipulation; that is best done with tool use. However, there is a significant line of work dedicated to algorithm discovery for mathematical problems using neural networks. Transformers are used here due to their popularity, but also some theoretical analysis to suggest that they are the among the most efficient architecture for learning automata. It's still unclear whether this is truly sound though, which is where this kind of research matters.