Okay, great, but what's the point? At the end of the day, we're not running anything on Turing machines; we're running programs on computers, which are almost but not quite the same thing. Remember that the Turing machine is intended as a model of algorithms, not a model of the actual execution process/environment of those algorithms. (Again, a distinction that's irrelevant in most contexts where TMs come up, so it's generally glossed over.)
And at the end of the day, my original point stands: we're talking about writing purely functional code, but if that code is compiled by a compiler that applies non-functional optimizations (and as far as I know, nearly all general-purpose compilers do to some degree), then comparing functional code against non-functional code doesn't make sense. Even if the data structures appear immutable in the source, if the compiler is mutating them behind the scenes, the outcome depends on how well-optimized the compiler is, and that's a very different discussion than the one implied by the title.
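To make that concrete, here's a small Haskell sketch (assuming GHC with -O; the details depend on the compiler and version). Read literally, the source allocates two immutable lists; under fusion, neither typically exists at runtime:

    -- A purely functional pipeline over immutable lists. Read
    -- literally, this builds [1 .. n], builds a second list via
    -- map, and then folds that second list.
    doubleSum :: Int -> Int
    doubleSum n = sum (map (* 2) [1 .. n])

    main :: IO ()
    main = print (doubleSum 1000000)

With optimization on, GHC's foldr/build fusion will usually rewrite this into a single accumulator loop with no lists allocated at all, so the "immutable data structures" visible in the source never exist in the compiled program. Which is exactly the point: the cost model you read off the source and the code that actually runs can be very different things.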
Do you ever try to understand the asymptotic performance of a program in terms of the source code, or only read compiled binaries?
You can describe the performance of a functional program in something like the asymptotic number of reductions. You can write compilers that will run those programs on real hardware in time related to the bound you get by working in the source model. Also, perfect optimization is undecidable. So the question is whether there are problems where you necessarily lose asymptotic performance by working in a purely functional programming language.
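As a sketch of what counting reductions buys you, take the textbook reverse example, reasoning only from the defining equations (standard Haskell, nothing compiler-specific assumed):

    -- Cost model: count reduction steps of the defining equations.
    -- xs ++ ys takes O(length xs) steps, because (++) walks its
    -- first argument; naiveReverse performs n appends of growing
    -- length, for O(n^2) reductions total.
    naiveReverse :: [a] -> [a]
    naiveReverse []       = []
    naiveReverse (x : xs) = naiveReverse xs ++ [x]

    -- The accumulator version conses each element exactly once:
    -- O(n) reductions, again read straight off the equations.
    fastReverse :: [a] -> [a]
    fastReverse = go []
      where
        go acc []       = acc
        go acc (x : xs) = go (x : acc) xs

Both compute the same list, and any reasonable compiler preserves the asymptotic gap you derived at the source level, which is what makes the source model useful in the first place.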
In the general case, not necessarily for any given subset, and this would apply to non-functional algorithms as well. Just because I write a 'simple' for/while loop, that doesn't necessarily imply anything about the actual instructions the compiler emits or how the machine executes them.
At the end of the day, an algorithm is simply an unambiguous, executable, and terminating specification of a way to solve a problem. We may implement an algorithm by creating a physical or virtual machine that conforms to that specification, but even abstractions like Turing machines and lambda calculi are just attempts at creating a deterministic model that we can manipulate in order to understand the algorithm, which is an abstract specification.
The problem is that, in general, algorithms are not specified in perfect detail, or they are implemented in ways that are logically equivalent to the specification but not exactly equivalent. (Or oftentimes both.) When dealing with any specific algorithm, we can usually infer that an analysis of one will translate to the other, but in the general case that assumption is murky.
The difference between functional and non-functional for an algorithm happens at the level of the specification, which is different from the level of the implementation. The translation from one to the other involves optimization. And if perfect optimization is undecidable, as you point out, so is the question of whether a perfectly optimized implementation of one class of algorithms is 'better' than a perfectly optimized implementation of another class, since we're no longer dealing with individual instances.