It may work, since it may just be a new "programming language" (somewhat literally), i.e. a new level of abstraction. We already know examples of such transitions to higher abstraction levels: binary code -> assembly languages -> c/lisp/fortran/etc -> c++/javascript/go/python/r -> np/torch/react/whatever frameworks/libraries. For an average programmer nowadays, knowledge of frameworks/libraries is as important as (if not more important than) actual knowledge of the programming language they use. The only disadvantage is that people will need to adapt to something generated and updated via machine learning. So far there aren't many examples of that, except maybe people adapting to Tesla Autopilot with every new release. Before, we adapted to a new c++/python/framework version; in the future there will be GitHubNext v1, v2 and v3, each with known features and bugs.
The only problem with this being the next abstraction level is that it actually leads to more "coding", because general spoken language is less informationally dense than any programming language.
Before, each switch (from binary to assembly to higher-level languages to frameworks/libraries) generally reduced the amount of "code" being written; with voice programming this seems to be the opposite.
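To make the density point concrete, here's a toy sketch of my own (assuming numpy; the data and variable names are made up), showing the same computation at the library level, at the plain-Python level, and spelled out in English:

    # Toy illustration (my own, not from the original post): the same
    # computation at different abstraction levels.
    import numpy as np

    prices = np.array([9.99, 4.50, 12.00, 3.25])

    # Framework/library level: one dense line.
    total = np.sum(prices * 1.2)

    # Plain-Python level: more tokens for the same result.
    total_loop = 0.0
    for p in [9.99, 4.50, 12.00, 3.25]:
        total_loop += p * 1.2

    # Spoken-English level, roughly: "take every price in the list, increase
    # it by twenty percent, and add them all up into a single total" -- more
    # words than either version above, and still ambiguous about types,
    # rounding, and which list is meant.

Each step down the list takes more symbols to say the same thing, which is the opposite direction from the earlier abstraction jumps.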