> When my girlfriend texts me when I'm out with my AirPods, I think we'd both like me to hear her message in her actual voice rather than Siri's
And then you can use voice to text to text her back, and she can hear it in your voice! It's just like a phone call from 30 years ago, but one that requires infinitely more processing power!
Genuinely funny reply. But (a) whether I prefer typing or speaking, and (b) whether I prefer reading or hearing, is very context-dependent and might not be the same for the person at the other end -- and so yeah I think having flexibility there is good! If I'm on AirPods and not on my phone, I'd like to hear the message. CarPlay, too. When actively on my phone, I prefer to type, even if on AirPods. On CarPlay, I probably shouldn't be typing at all. So yeah, generating text and speech simultaneously and having the end result be situational is in fact a good thing.
It's worth noting this is already how iPhones work, and people already love it. What I'm suggesting additionally is replacing Siri's voice with a DIFFERENT, customized synthetic voice in a very specific circumstance. I'm not advocating for synthetic voices anywhere they aren't already in use.
> It's just like a phone call from 30 years ago, but one that requires infinitely more processing power
There are folks who couldn’t, for a variety of reasons, do that thirty years ago. This feature is for them. The rest of us get to, e.g., more naturally text a response to a call we’re listening in to on a flight.
> Can't you already do that with iMessage's voice delivery feature?
Voice memos? No. I would have to, at some point, speak it. If you’re referring to text-to-speech, there is a difference between having your text read aloud in a different voice and in your own.
On the plus side it would use less bandwidth. That phone call from 30 years ago probably used (ballpark) 64 kilobits/second. This could use a lot less and have higher audio quality.
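For a rough sense of scale, here is a back-of-the-envelope sketch of that comparison. Only the 64 kbit/s ballpark comes from the comment above; the speaking rate and bytes-per-word figures are my own assumptions, and this ignores protocol overhead as well as the one-time cost of getting a voice model onto the receiving device.

```python
# Rough comparison: a 64 kbit/s voice call vs. sending the same words as text
# and synthesizing the voice on the receiving device.
# All constants are illustrative assumptions, not measurements.

CALL_BITRATE_BPS = 64_000   # ballpark figure for the old phone call
SPEAKING_RATE_WPM = 150     # assumed conversational speaking rate
AVG_BYTES_PER_WORD = 6      # assumed: about 5 letters plus a space


def audio_bytes(seconds: float, bitrate_bps: int = CALL_BITRATE_BPS) -> float:
    """Bytes needed to carry `seconds` of audio at a constant bitrate."""
    return seconds * bitrate_bps / 8


def text_bytes(seconds: float) -> float:
    """Bytes needed to carry the words spoken in `seconds` as plain text."""
    words = seconds * SPEAKING_RATE_WPM / 60
    return words * AVG_BYTES_PER_WORD


def text_bitrate_bps() -> float:
    """Equivalent bitrate of continuously 'speaking' via text."""
    return text_bytes(60) * 8 / 60


if __name__ == "__main__":
    seconds = 10
    print(f"audio: {audio_bytes(seconds):.0f} bytes")
    print(f"text:  {text_bytes(seconds):.0f} bytes")
    print(f"ratio: {audio_bytes(seconds) / text_bytes(seconds):.0f}x")
    print(f"text bitrate: {text_bitrate_bps():.0f} bit/s")
```

Under those assumptions, ten seconds of speech is about 80 KB as call audio versus about 150 bytes as text, so the text itself is a rounding error next to the audio stream.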