The main conceptual novelty that modern machine learning brings is the addition of computational complexity in the mix: from information theory's question
what is learnable?
to
what is learnable in
poly-time?
(or similar resource constraints). This was pioneered, as far as I am aware, in Valiant's A Theory of the Learnable (please correct me if I'm wrong, I'm not an ML/AI historian). Interestingly, we see a similar evolution of Shannon's thinking about cryptography (what is secure information theoretically, i.e. against computationally unbounded adversaries?) to: what is safe against a poly-time restricted adversary?