I can see why people hold out against this; voice programming sounds slower in general, and vastly slower during the ramp-up period.
But I wonder if we'd be better off enabling voice input sooner, before we're in serious trouble, and using it sporadically. Mixing voice and keyboard input at need looks like a way to substantially reduce typing while still using a keyboard whenever voice work is too inconvenient.
This is one of my huge goals with Talon. I want to convince people who have no symptoms that using voice + eye tracking + etc is cool and will help them in useful / exciting ways. I think RSI is a kind of silent epidemic, and we won't ever solve it by only treating people who already show major symptoms.
On the other hand, it's really not much slower with the state of the art, especially when you mix in things like eye tracking at a really core level (we're playing with stuff like autocompleting based on a symbol you just looked at!). I'm not far enough along with Talon to be pushing the benchmarking side of things heavily yet, but one early test came in at about 2/3 the code input performance of a 90wpm typist on the same code, with a lot of obvious places to improve. I think good application of continuous recognition, resumable grammars, and really context-specific helper code can push specific workloads way past what a keyboard/mouse can do (which calls back to the goal of impressing people who aren't injured yet).
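To make the gaze-driven autocomplete idea concrete, here's a minimal sketch of how a gaze point plus a short voice trigger could insert the symbol the user is looking at. This is not Talon's actual API; GazePoint, FakeEditor, and on_voice_command are illustrative stand-ins under assumed interfaces.

```python
"""Hypothetical sketch (not Talon's real API): a voice trigger like
"complete that" inserts the symbol currently under the user's gaze."""

from dataclasses import dataclass
from typing import Optional


@dataclass
class GazePoint:
    x: int  # screen coordinates reported by the eye tracker
    y: int


class FakeEditor:
    """Stand-in for whatever a real editor integration would provide."""

    def __init__(self) -> None:
        self.buffer = ""

    def token_at(self, gaze: GazePoint) -> Optional[str]:
        # A real integration would map screen coordinates to the token
        # under them; here we pretend the user is looking at one symbol.
        return "compute_checksum"

    def insert_at_cursor(self, text: str) -> None:
        self.buffer += text


def on_voice_command(command: str, gaze: GazePoint, editor: FakeEditor) -> None:
    # Saying "complete that" while looking at a symbol elsewhere on screen
    # completes it at the cursor, no typing or mousing required.
    if command == "complete that":
        symbol = editor.token_at(gaze)
        if symbol:
            editor.insert_at_cursor(symbol)


if __name__ == "__main__":
    editor = FakeEditor()
    on_voice_command("complete that", GazePoint(x=640, y=480), editor)
    print(editor.buffer)  # -> compute_checksum
```

The point of the sketch is that the gaze supplies the "which symbol" argument implicitly, so the spoken command can stay very short.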
Then there's the professional app space - e.g. Photoshop, CAD, and video/audio production tools could really benefit from voice workflows (imagine using a pen tablet augmented with voice + eye tracking instead of complex UI).