
> Consider the problem of naming a piece of music based on a short sample of the piece. That can be thought of as computing a function. Or consider the problem of translating a Chinese text into English. Again, that can be thought of as computing a function. Or consider the problem of taking an mp4 movie file and generating a description of the plot of the movie, and a discussion of the quality of the acting. Again, that can be thought of as a kind of function computation. Universality means that, in principle, neural networks can do all these things and many more.

Eesh, that seems like poor technical writing, and potentially misleading. It's far from clear that these are computable functions in the first place; Penrose argues (notably in "Shadows of the Mind") that the human brain can compute functions no Turing machine can, i.e., functions outside the scope of the Church-Turing thesis.

"Just because we know a neural network exists that can (say) translate Chinese text into English, that doesn't mean we have good techniques for constructing or even recognizing such a network." seems very hand wavy - we don't know that such a neural network exists without first going through a big set of still controversial assumptions.

The rest of the article is good, but I'm not a big fan of that section. It's not a huge part, but I think it could give some students the wrong idea about what a computable function is, and what neural networks can compute.




By function I think he meant a mapping from input points to output points in an abstract space.

So in that sense a piece of music, or a sentence in one language, is an input point, while the name of the piece, or the sentence in another language, is an output point.

Everything is a function as long as there is a way to turn that thing into inputs that correspond to outputs.
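Concretely, that just means fixing an encoding. Here is a minimal sketch of the idea; the 16 kHz clip, the tiny title vocabulary, and the untrained weights are all made-up illustrations, not anything from the article:

    import numpy as np

    # Hypothetical encoding: a 1-second clip sampled at 16 kHz is a point
    # in R^16000, and a song title is an index into a fixed vocabulary.
    titles = ["Song A", "Song B", "Song C"]
    rng = np.random.default_rng(0)
    W = rng.normal(size=(len(titles), 16000))  # untrained weights, illustration only

    def name_that_tune(clip: np.ndarray) -> str:
        """Any map with this signature 'is' the naming function."""
        return titles[int(np.argmax(W @ clip))]

    print(name_that_tune(rng.normal(size=16000)))  # stand-in for a real sample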


Ok, so by your reasoning let's take a program as the input point, and whether it halts or not as the output point.

So now we have a function. I can't wait until "we have good techniques for constructing such a network" that maps those inputs to those outputs in an abstract plane :)
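For anyone who hasn't seen it, here is the standard diagonal argument for why that particular function can't be computed at all, sketched in Python; the halts() oracle is exactly the assumption being refuted:

    def halts(f, x):
        """Assumed total halting oracle -- provably cannot be implemented."""
        raise NotImplementedError("no such function exists")

    def paradox(f):
        if halts(f, f):
            while True:   # if the oracle says f(f) halts, loop forever
                pass
        else:
            return        # if it says f(f) loops, halt immediately

    # paradox(paradox) would halt if and only if it doesn't halt,
    # so no total halts() can exist -- for Turing machines or for
    # any network we could actually run.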


That is a good example. But you are forgetting that neural networks only approximate the actual functions, so the function you described could be approximated with some confidence level attached to the answer. Just as you can, from experience, have some confidence that certain code will not halt, a neural network could be built to make the same kind of guess. Not all possible functions, though, unless you have an infinitely large network with infinite computing power.
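A computable stand-in for that kind of "confidence from experience" is simply a time budget: run the code and guess "doesn't halt" once the budget is exceeded. A rough sketch; the one-second budget is an arbitrary choice for illustration:

    import multiprocessing as mp

    def quick():   # halts immediately
        return 0

    def loop():    # never halts
        while True:
            pass

    def appears_to_halt(fn, budget_s=1.0):
        """Computable proxy for halting: run fn with a time budget.
        A False answer is only a guess -- fn might halt just after the budget."""
        p = mp.Process(target=fn)
        p.start()
        p.join(budget_s)
        if p.is_alive():
            p.terminate()
            p.join()
            return False
        return True

    if __name__ == "__main__":
        print(appears_to_halt(quick))  # True
        print(appears_to_halt(loop))   # False: confident guess, not a proof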


Yeah, but it's still misleading, as we are talking about approximating continuous functions here, not arbitrary functions. Those examples are not clearly computable, or even continuous.


I would agree that the examples can be somewhat misleading, but not because of computability. The universal approximation theorem for neural networks doesn't care about computability, but it does require that the domain of the function is bounded (or, to be precise, compact).

For example, suppose that the function to be approximated is simply f(x) = x. For any real numbers a < b we can produce an approximation that works well for a ≤ x ≤ b, but it cannot be a good approximation for all real numbers x. (The sigmoid in the hidden layer means that every approximation has to be a bounded function, and a bounded function cannot be a good global approximation of f(x) = x.)
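You can see the saturation numerically. Near 0 a sigmoid is roughly linear, sigmoid(x) ≈ 1/2 + x/4, so even a single hidden unit can mimic f(x) = x locally, but its output stays trapped in a bounded interval (this toy construction is mine, not from the article):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # One hidden sigmoid unit: 4 * (sigmoid(x) - 1/2) ~ x near 0,
    # but |output| < 2 everywhere, so it cannot track f(x) = x globally.
    def net(x):
        return 4.0 * (sigmoid(x) - 0.5)

    for x in [0.1, 1.0, 5.0, 50.0]:
        print(f"x = {x:5.1f}   net(x) = {net(x):.4f}")
    # x = 0.1 gives ~0.0999 (good); x = 50.0 gives ~2.0000 (saturated)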

Therefore, the translation example works if we assume that the set of possible Chinese texts is finite (say, texts up to some maximum length), but otherwise nothing is guaranteed.



