If you join a bunch of perceptrons together that limitation goes away. Another path is to make the problem effectively linear again by transforming into higher dimensions, kernels do this with one clever trick that allows them to avoid the computational cost of doing so explicitly.