Firstly, compliments to Apple for all these incredible accessibility features.
I think there is an important little nod to the future in this announcement. "Personal Voice" is training (likely fine-tuning) and then running a local AI model to generate the user's voice. This is a sneak peek of the future of Apple.
They are in the unique position to enable local AI tools, such as assistants or text generation, without the difficulties and privacy concerns of the cloud. Apple silicon is primed for local AI models with its GPU, Neural Engine cores, and unified memory architecture.
I suspect that Apple is about to surprise everyone with what they do next. I'm very excited to see where they go with the M3, and what they release for both users and developers looking to harness the progress made in AI but locally.
Just yesterday I started using a new maxed-out Mac mini and everything about it is snappy. I have no doubt that it is ready for an enormous amount of background processing. Heavy background work is the only way to use all the processing power in that little computer.
Think Siri+ChatGPT trained on all your email, documents, browser history, messages, movements, everything. All local, no cloud, complete privacy.
"Hey Siri, I had a meeting last summer in New York about project X, could you bring up all relevant documents and give me a brief summary of what we discussed and decisions we made. Oh and while you're at it, we ate at an awesome restaurant that evening, can you book a table for me for our meeting next week."
All of my current experience with Siri tells me there is a 50-50 chance of the result coming back as “Sorry, I'm having trouble connecting to the network” or playing a random song from Apple Music.
Just last night, we were entertaining our toddler with animal sounds. It worked with “Hey Siri, what does a goat sound like?”, then we were able to do horse, cow, sheep, boar, and it somehow got tripped up on pig, for which it responded with the Wikipedia entry and told us to look at the phone for more info.
You’ve touched on what is probably the biggest reason I don't use Siri more: Apple does not limit it to what’s important to me as a user.
I have thousands of contacts, lots of photos, videos, and emails, all in Apple’s first-party apps and yet Siri is more likely to respond with a popular song or listing of news articles that’s only tangentially connected to my request.
This becomes more complicated when Siri is the interface on a HomePod in a shared area. Whose data and preferences should be used? Ideally it would recognise different voices and give that person's data priority, but how much can/should be shared between users? And where does that data live? It shouldn't be on the HomePod, so it would have to task the phone with finding the answer. I'm sure something good could be done here, but it wouldn't be easy.
>All of my current experience with Siri tells me there is a 50-50 chance of the result coming back as “Sorry, I'm having trouble connecting to the network” or playing a random song from Apple Music.
Well, this is about adding ChatGPT-level smartness to Siri, not just the semi-dumb assistant of yore.
> I’m feeling nostalgic. Make me a playlist with 25 mellow indie rock songs released between 2000 and 2010 and sort them by release year, from oldest to most recent.
This doesn't just return a list of songs, it will create the playlist for you in Music.
> Check the paragraphs of text in my clipboard for grammar mistakes. Provide a list of mistakes, annotate them, and offer suggestions for fixes.
> Summarize the text in my clipboard
> Go back to the original text and translate it into Italian
I haven't tried it myself, but it has other integrations like "live text" where your phone can pull text out of an image and then could send that to GPT to be summarized.
Version 1.0.2 makes improvements for using it via Siri, including on HomePod.
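On the Live Text piece: the "pull text out of an image" half is a public API (Apple's Vision framework). A minimal sketch of that half, with the LLM summarization call deliberately left out, since that part is whatever remote service or local model you choose and isn't an Apple API:

```swift
import Vision
import CoreGraphics

// Minimal sketch: recognize text in an image with Vision. Only this
// recognition half is real Apple API; the "send it to GPT" step is up to you.
func extractText(from cgImage: CGImage) throws -> String {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate        // favor quality over speed
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])

    // Keep the top candidate from each detected region of text.
    return (request.results ?? [])
        .compactMap { $0.topCandidates(1).first?.string }
        .joined(separator: "\n")
}
```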
Today I asked Siri for the weather this week. She said daytime ranges from 31C to 23C, so I then asked "on what day is the temperature 31 celsius?". And, of course, what I got back was "it's currently twenty seven degrees".
The weather ones are so annoying: "Is it going to rain today?". "It looks like it's going to rain today". "What time is it going to rain today?". "It looks like it's going to rain today".
It seems ironic, then, that this specific thing failed spectacularly for me today. Siri put the text "set a timer for 15 minutes" into the text field of a reminder. I have no clue why, and no timer was set.
But you know what? Still better than Alexa for managing my smart home stuff. By miles and miles, IMO.
And god help you if you give up halfway through a command with a prompt. “Cancel”, “stop” and “nevermind” don’t work for half of that for some reason, so you have to walk up and tap the HomePod to cancel.
> All of my current experience with Siri tells me there is a 50-50 chance of the result coming back as “Sorry, I'm having trouble connecting to the network” or playing a random song from Apple Music.
Meanwhile, Google and Amazon have decided that the data center costs of their approach just aren't worth it.
>Google Assistant has never made money. The hardware is sold at cost, it doesn't have ads, and nobody pays a monthly fee to use the Assistant. There's also the significant server cost to process all those voice commands, though some newer devices have moved to on-device processing in a stealthy cost-cutting move. The Assistant's biggest competitor, Amazon Alexa, is in the same boat and loses $10 billion a year.
Yes. I don't understand the criticism of the current Siri in this context; the point of a language model on the device would be to derive intent and convert a colloquial command into a computer instruction.
Siri was so good before iOS 13. I'm not sure what they did in that release, but it went from around 90-95% accuracy and 80-90% contextual understanding down to around 70% and 75% respectively.
As someone who dictates more than half of their messages and is an incredibly heavy user of Siri for performing basic tasks, I really noticed this sudden decline in quality, and it has never recovered; in fact, iOS 16 really struggles with many basic words. Before iOS 13, I would have been able to dictate these two paragraphs likely without any errors; as it is, I've just had to edit them in five places.
I thought the lack of ability to execute on current “easy” queries would indicate something about ability to execute something as complicated as figuring out the restaurant you ate at and making a reservation. At least anytime in the next few years.
I don’t think it does. This isn’t a hypothetical Siri v2 with some upgrades; it’s a hypothetical LLM chatbot speaking with Siri’s voice. I recall one of the first demonstrations of Bing’s ability was someone asking it to book him a concert where he wouldn’t need a jacket. It searched the web for concert locations, searched the web for weather information, picked a location that fit the constraint, and gave the booking link for that specific ticket. If you imagine an Apple LLM that has local rather than web search, it seems obvious that this exact ability LLMs have to follow complicated requests and “figure things out” would be perfectly suited to reading your emails and figuring out which restaurant you mean. With Apple Pay integration it could also go ahead and book for you.
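To make the "figure things out" part concrete, the loop is conceptually something like the sketch below. Everything in it is hypothetical: the tool names and `askModel` stand in for local search over mail/calendar and an on-device model deciding which tool to call next until it can answer.

```swift
import Foundation

// Hypothetical sketch, not an Apple API: a model repeatedly picks a local
// "tool" to call, sees the result, and stops when it has an answer.
enum Tool: String, Codable { case searchMail, searchCalendar, bookTable, done }

struct ModelStep: Codable {
    let tool: Tool
    let query: String      // what to search for, or what to book
    let answer: String?    // filled in when tool == .done
}

func handle(request: String,
            askModel: (String) -> ModelStep,
            runTool: (Tool, String) -> String) -> String {
    var transcript = "User request: \(request)\n"
    for _ in 0..<8 {                            // cap the number of steps
        let step = askModel(transcript)         // the model picks the next action
        if step.tool == .done { return step.answer ?? "" }
        let result = runTool(step.tool, step.query)
        transcript += "\(step.tool.rawValue)(\(step.query)) -> \(result)\n"
    }
    return "Sorry, I couldn't finish that."
}
```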
Certainly not the only place, but you’re very right that it does house a large population of commenters like me who enjoy the “sport” of “being correct on the internet”.
And yet the parent makes a very specific (and correct) point: that this won't be Siri with some upgrades, but Siri in name only, with a totally different architecture.
Whereas yours and your sibling comment are just irrelevant meta-comments.
Siri today is built on what’s essentially completely different concepts from something like ChatGPT.
There are demos of using ChatGPT to turn normal English into Alexa commands and it’s pretty flawless. If you assume Apple can pretty easily leverage LLM tech on Siri and do it locally via silicon in the M3 or M4, it’s only a matter of chip lead time before Siri has multiple orders of magnitude improvement.
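The trick in those demos is that the model's only job is to emit something structured that existing device code can already execute. A rough sketch of that shape; the JSON schema and prompt here are invented, and `callModel` stands in for whatever LLM you use, local or remote:

```swift
import Foundation

// A made-up command schema; the point is that the model emits structured
// JSON and the device code just decodes and executes it.
struct SmartHomeCommand: Codable {
    let action: String      // e.g. "setBrightness"
    let target: String      // e.g. "living room lamp"
    let value: Double?      // e.g. 0.4
}

func parseCommand(_ utterance: String,
                  callModel: (String) -> String) -> SmartHomeCommand? {
    let prompt = """
    Convert the user's request into JSON with keys "action", "target", "value".
    Respond with JSON only.
    Request: \(utterance)
    """
    let reply = callModel(prompt)
    return try? JSONDecoder().decode(SmartHomeCommand.self, from: Data(reply.utf8))
}
```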
That experience likely isn’t transferable to Siri, which has deeper problems. People, me included, are reporting their problems with Siri, e.g. setting it to transcribe what they and Siri say as text on the screen, and then being able to show that the input “Please add milk to the shopping list” results in Siri responding, in writing, “I do not understand what speaker you refer to.”
Likely problems like these could be overcome, but preparing better input would probably not address the root cause of the problems with Siri.
Microsoft's voice assistant was just as dumb as Siri, but ChatGPT is another thing entirely. Most likely it won't even be the same team at all.
So nothing about their prior ability, or lack thereof, to make Siri smart means anything about their ability to execute if they add a large LLM in there.
I love Steve Jobs' "bicycle for the mind" metaphor, and what you describe is the best possible example of this concept. A computer that does that would enable us to do so much more.
This is the sort of AI I want; a true personal assistant, not a bullshit generator.
It appears that we are tantalizingly close to having the perfect voice assistant. But for some inexplicable reason, it does not exist yet. Siri was introduced over a decade ago, and it seems that its development has not progressed as anticipated. Meanwhile, language models have made significant advancements. I am uncertain as to what is preventing Apple, a company with boundless resources, from enhancing Siri. Perhaps it is the absence of competition and the duopoly maintained by Apple and Google, both of whom seem reluctant to engage in a competitive battle within this domain.
It is probably a people problem. The people who really understood Siri have probably left, the managers left running it are scored primarily on not making any mistakes and staying off the headlines. Any engineers who understand what it would take to upgrade it aren't given the resources and spend their days on maintenance tasks that nobody really sees.
It's more likely a perverse incentive problem. Voice activated "assistants" weren't viewed as assistance for end users. They were universally viewed as one of two things: A way of treating the consumer as a product, or a feature check-box.
That Siri went from useful to far less useful had more to do with the aim to push products at you rather than actually accomplishing the task you set for Siri. If Apple actually delivers an assistant that works locally, doesn't make me the product, and generally makes it easier to accomplish my tasks, then that's a product worth paying for.
When anyone asks "who benefits from 'AI'?" the answer is almost invariably "the people running the AI." Microsoft and OpenAI get more user data, and subscriptions. Google gets another vehicle for attention-injection. But if I run Vicuna or Alpaca (or some eventual equivalent) on my hardware, I can ensure I get what I need, and that there's much less hijacking of my intentions.
So Microsoft, if you're listening: I don't want Bing Chat search, I want Cortana Local.
When was Siri ever useful? I have yet to encounter a voice "assistant" that can do more than search Google and set timers reliably, and Siri itself can't even do those very well.
I use it around 50 - 100 times per day. Mostly playing music, sending messages, controlling lights in the home, weather, timers, and turning on/off/opening apps on the TV
There are definite frustrations, mostly around playing music. Around 5% of the time, Siri will play the wrong album or artist because the artist name sounds like some other album name, or vice versa. I wish, here, that it used my Music playback history to figure out which one I meant
Doing what Siri is doing is not rocket science. It’s a simple intent-based system where you give it patterns to understand intents and trigger some API based on them.
Once you have the intent parsing, it should just be a matter of throwing manpower at it and giving it better intents.
Yes, I have experience with building on top of such a system.
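For the curious, the core of such a system is roughly pattern → intent → handler. A toy sketch of that shape below; real systems use trained classifiers and slot-filling rather than regexes, and none of this is Apple's actual SiriKit code.

```swift
import Foundation

// Toy version of the pattern -> intent -> handler pipeline described above.
struct Intent {
    let name: String
    let pattern: NSRegularExpression        // crude stand-in for an NLU model
    let handler: ([String]) -> String       // called with the captured slots
}

let intents: [Intent] = [
    Intent(name: "SetTimer",
           pattern: try! NSRegularExpression(pattern: #"set a timer for (\d+) minutes?"#,
                                             options: [.caseInsensitive]),
           handler: { slots in "Starting a \(slots[0])-minute timer." })
]

func respond(to utterance: String) -> String {
    for intent in intents {
        let range = NSRange(utterance.startIndex..., in: utterance)
        guard let match = intent.pattern.firstMatch(in: utterance, options: [], range: range)
        else { continue }
        // Pull out the captured groups ("slots") and hand them to the handler.
        let slots = (1..<match.numberOfRanges).compactMap { i -> String? in
            Range(match.range(at: i), in: utterance).map { String(utterance[$0]) }
        }
        return intent.handler(slots)
    }
    return "Sorry, I didn't get that."
}

print(respond(to: "Set a timer for 15 minutes"))   // Starting a 15-minute timer.
```

As the replies note, the hard part isn't this core; it's the thousands of intents, locales, and integrations layered on top of it.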
But the group managing Siri has probably been gutted in the past 10 years, and while the core is always simple the integrations and the QA testing to make sure it all keeps working is probably brittle and time consuming, and the core code is likely highly-patched spaghetti at this point.
It would be easy to write Siri again and make it a hundred times better, if you could start all over and only write the core features, and not have to validate against the whole product/feature matrix.
The problem with the rewrite of course would be that you won't be able to deliver that minimal viable product any more and you will have 10 years worth of product requirements and user expectations that you MUST hit for the 1.0 release (which must be a 1.0 and not an 0.1).
I've worked on lots of "simple" and "not rocket science" systems that were 10-years old, and it is always incredibly difficult due to the state of the code, the lack of resources, and the organizational inertia.
This is already being felt with Stable Diffusion, where an M2 is fully capable of running it offline.
Anything that can be done to reduce the need to “dial out” for processing protects the individual.
It erodes the ability of business and governmental organizations to use knowledge of otherwise private matters to target and influence.
The potential of moving a HQ LLM like GPT to the edge to answer everyday questions reminds me of my move from Google to DDG as my default search engine.
Except it’s even a bigger deal than that. It reduces private data exhaust from search to zero, making going to the net a backup plan instead of a necessity.
Apple delivering this on device is a major threat to OpenAI, which will have to provide some LLM model with training that Apple can’t or won’t.
Savvy users will begin to be leery of having to send queries over the wire, feeding someone else valuable data (as ShareGPT has proven).
Even then, Apple will likely choose to, or be forced to, open up on-device AI to allow user-contributed additions like LoRAs, which would raise the question: why does OpenAI need to exist?
Also fascinating is the potential to do this at the server level for enterprise. If Apple produced a stack for enterprise training, it could replace generalized data-compute needs, shifting IT back to local or intranet infrastructure.
Apparently, you are not an actual user of Siri, because I get jack shit out of her. Speech-to-text is infinitely worse than it was the first week Siri was released.
Yes and we should also have EU regulators at every design meeting for every company. They did such a good job with the GDPR making the user experience better on the web
Yes, alas they didn't leave room for a 'cookie preferences' cookie, so that whenever I choose the option 'reject all', it's of course going to ask me again, every time I visit the website.
That said, their intentions were good. I'm always horrifically amazed at the number of cookies used whenever I see the preferences popup. I honestly had no idea how many tracking cookies were used by the average website.
>Think Siri+ChatGPT trained on all your email, documents, browser history, messages, movements, everything. All local, no cloud, complete privacy.
That sounds absolutely horrifying if you remove the "all local" part. And that part's a pipe dream anyway. Plus, when using a model you'd basically become subservient to, and limited by, the type of data in the model, which would necessarily abide by Apple's TOS, so a couple of hundred million people would effectively be the Apple TOS in human form. I don't understand why Apple fanboys don't get this. Apple is pretty shoddy where privacy is concerned. Are these Apple employees making these posts?
Fat chance Apple will allow us to do this locally. More likely: upgrade to Apple Cloud Plus to get these features. But yeah, I've also dreamt of what my Apple hardware could do.
> Just yesterday I started using a new maxed out Mac mini and everything about it is snappy.
Really?! I didn't think anyone here would fall for that.
Mac Mini 12-core M2, 19-core GPU, 32GB, 10Gbit, 8TB storage? $4500
Mac Studio 20-core M1, 48-core GPU, 64GB, 10Gbit, 1TB storage is $4000. 128GB of RAM is $800 more
but either Studio RAM configuration obviously spanks the M2 mini. It's sacrificing Apple's expensive storage, but with Thunderbolt 3 it's pretty academic to find 8TB or more of NVMe storage, probably 32GB of NVMe RAID[1], for less than Apple's charge of $2200 above cost of 1TB.
I specced the smallest SSD. I use network homes. The mini is a stopgap while waiting for the Pro. I don't really consider drive size a performance item anymore.
I spent just over $2,000.
Mac mini
With the following configuration:
Apple M2 Pro with 12‑core CPU, 19-core GPU, 16‑core Neural Engine
32GB unified memory
512GB SSD storage
Four Thunderbolt 4 ports, HDMI port, two USB‑A ports, headphone jack
10 Gigabit Ethernet
Not awful, but for $2K you could have had a 16-core CPU, 20-core GPU, 32-core Neural Engine, 48GB unified memory, 512GB SSD storage, four Thunderbolt 4 ports, two HDMI ports, four USB-A ports, two headphone jacks, and two Gigabit Ethernet ports.
Yes. I wanted the 10Gb Ethernet. My purchasing question is when is the right time to buy a great monitor. In the CRT days the monitor lasted the longest, and buying the best I could afford worked for me.
I just went back to compare the Mini with the Studio again. Despite your advice I would buy the Mini again for these reasons:
I'm on a newer generation chip that has a lower power draw. Meets my network speed minimum. All for the price of the entry level Studio. This box is basically an experiment to see how much processing power I need. I have a very specific project that will require the benchmarking of Apple's machine learning frameworks. I want to see how much of a machine learning load this Mini can handle. Once I have benchmarks maybe the Pro will exist and I will be in good shape to shop and understand what I'm buying.
I think a Mini of any spec is a great value. The studio has a place but I'm hoping the Pro ends up being like an old Sun E450.
This Mini experiment is to help me frame the hardware power vs. the software loads.
My second suggestion for 16-core was also M2: $100 less with 1Gb Ethernet, and with 10Gb it would be $100 more than you paid. I.e., two of the 8-core M2 Minis with 24GB RAM each would do about twice as much work as the high-end M2 Pro Mini alone, sometimes less than twice the work, sometimes more. The same is true of two M1 Max Studios vs one M1 Ultra Studio for the same price. Two less powerful machines spank one more powerful machine every single time, and one M1 Ultra Studio is definitely NOT worth two M1 Max Studios, same as one 12-core M2 Pro Mini is definitely NOT worth two 8-core M2 Minis.
Everyone is drawn to "the best," and that's where Apple fleeces and makes its money. Pretty consistently forever, the best buys from Apple are never the high end configurations. We may feel secure in what our choices were, doubling down on affirming them, but we definitely pay for it.
I don't see a 16-core M2 or any Studios with an M2. I was drawn to the latest chip Apple has produced. They put that chip in a small headless form factor. I shopped for a Macintosh computer and judged whether I wanted the motherboard bandwidth of the Mac Studio or the latest chip in the Mac mini.
I'm sorry I disappointed you. I have retroactively looked over everything you have said and doubt I would do it differently. If this machine turns out to be such a dog, I can get another one to pair with it, as you have suggested doing with the 8-core. Finally, are you speaking from first-hand experience or from benchmarks?
I think the disconnect is that you are trying to get as much processing power as possible and I'm trying to understand how much processing power currently exists.
Sounds plausible. It also fits the news yesterday that Apple has taken 90% of TSMC's 3nm capacity for 2023 [1]. While everyone is talking about a recession, Apple seems to see opportunities. Or maybe they just had too much cash on hand. Also possible.
Density doesn't always matter. I'm reminded of Apple's 5nm M1 Ultra struggling to keep up with Nvidia's 10nm RTX 3080 in standard use. Having such a minor node advantage won't necessarily save them here, especially since Nvidia's currently occupying the TSMC 4nm supply.
You're comparing a pickup truck with a Main Battle Tank. An RTX 3080 is an electricity hog and produces heat like a monster. No wonder it performs better than an M1 Ultra with a worse node tech.
The RTX 3080 consumes ~300w at load, the M1 Ultra consumes ~200w. If you extrapolate the M1 Ultra's performance to match the 3080, it would also consume roughly the same amount of power.
Is this not a battle-tank-to-battle-tank comparison?
You can run an RTX 3080 off anything with enough PCI bandwidth to handle it. Presumably the same goes for Apple's GPU. We could adjust for CPU wattage, but at-load it amounts to +/-40w on either side and when we're only testing the GPU it's like +/-10w maximum.
The larger point is that Apple's lead doesn't extrapolate very far here, even with a generous comparison to a last-gen GPU. It will be great at inferencing, but so are most machines with AVX2 and 8 gigs of DRAM. If you're convinced Apple hardware is the apex of inferencing performance, you should Runpod a 40-series card and prove yourself wrong real quick. It's less than $1 and well worth the reality check.
My point was mostly that the 200W TDP you quote is for the whole package (CPU, GPU, RAM, plus the Neural network thingy and the whole IO stuff). A 120W figure for the GPU is more realistic.
I'm not pretending the Apple chips are the be-all-end-all of performance. They certainly have limitations and are not able to compete with proper high-end chips. However, I can confidently say that on mobile devices and laptops, the competition is largely behind. Sure, a $1000+ standalone GPU will be faster, but it doesn't fit in my jeans. It's the same as comparing a Hasselblad camera with the iPhone 14 Pro...
The competition is all fine, though. They have enough memory to run the models, they have hardware acceleration (ARMnn, SNPE, etc.) and both OSes can run it fine. Apple's difference is... their own set of APIs and hardware options?
How can you justify your claim that they're "largely behind"? It sounds to me like the competition is neck-and-neck in the consumer market, and blowing them out at-scale. It's simply hard to entertain that argument for a platform without CUDA, much less the performance crown or performance-per-watt crown.
Nvidia is somewhat encumbered by their need to optimize for raster performance. Ideally, all those transistors should be going toward tensor cores. Apple has never really taken the gaming market seriously. If they wanted to, they could ship their next M3 chip with identical GPU performance and use all that new 3nm die space for AI accelerators.
Is that a minor advantage? I would think that, the smaller the nodes get, the larger the impact of a 1nm difference. Because transistors have area, the math in approximate transistor count would be 3nm : 4nm = (1/3)² : (1/4)² = 16 : 9 ≈ 1.78, so a 3nm node could have roughly 78% more transistors on a given die area than a 4nm one.
4nm -> 3nm no longer means size goes down directly ratiometrically. You have to look at what TSMC is claiming for their improvements. They're claiming 5nm -> 3nm is a 70% density improvement (I can't find any 4nm -> 3nm claims)... so 4 -> 3 must be much less.
Also, most folks seem to have gone directly from 5nm to 3nm, and skipped 4nm altogether.
It will be quite the showdown, then. The M1 struggled to compete with current-gen Nvidia cards at release, we'll have to see if the same holds true for M3.
A lot of people buy Android. But very few people buy Pixel:
> In a world dominated by iPhones and Samsung phones, Google isn't a contender. Since the first Pixel launched in 2016, the entire series has sold 27.6 million units, according to data by analyst firm IDC -- a number that's one-tenth of the 272 million phones Samsung shipped in 2021 alone. Apple's no slouch, having shipped 235 million phones in the same period. [1]
I've wanted to buy a Pixel for years but Google doesn't distribute it here. It's not like I'm living in some remote area, I live in Mexico, right next door.
The first couple of years I assumed Google was just testing the waters, but after so many Pixel models I suspect it's really just more of a marketing thing for Android. They don't seem to have any interest in distributing the Pixel worldwide, ramping up production, etc.
Because jayd16 was responding to samwillis's comment about Apple being in a unique position.
Part of that unique position is already being a popular product. Google adding a bunch of local ML features isn't going to move the needle for Google if people aren't buying Pixels in the first place for reasons that have nothing to do with ML.
If Google's trying to roll out local ML features but 90% of Android phones can't support them, it's not benefiting Google that much. Hence, Apple's unique position to benefit in a way that Google won't.
> number of phones Google has sold is completely irrelevant to the fact that they too do local ai
How will they make money? For Apple, device purchases make local processing worth it. For Google, who distribute software to varied hardware, subscription is the only way. For reasons from updating to piracy, subscription software tends to be SaaS.
Does Google do on-device processing? Or do they have to pander to the lowest common denominator, which happens to be their biggest market share?
If the answer is no, then does it make sense for them to allocate those resources for such a small segment, and potentially alienate its users that choose non-Pixel devices?
Also, if the answer is no, this is where Apple would have the upper-hand, given that ALL iOS devices run on hardware created by Apple, giving some guarantees.
Pixel is just an example of Google owning the stack end to end but the Qualcomm chips in the Samsung phones have Tensor accelerator hardware and all mobile hardware is shared memory. I think samwillis was referring to the uniqueness of their PC hardware and my comment was that they're simply using the very common mobile architecture in their PCs instead of being in a completely unique place.
Google doesn’t want to run local AI. It channels everything through the Googleplex on purpose.
So while pixel phones may be possible, they don’t want to.
Take image processing for example. iPhones will tag faces and create theme sets all locally. Google could too, but they don’t. They send every picture to their cloud to tag and annotate.
If anyone were to write a chronological history of regulations imposed by different authorities throughout history, I think it is a fair assumption that regulations related to making bread would show up in the first chapters of the book.
Depends on who you ask. I wouldn't trust them too much. I think their security reputation is mostly hype and marketing, which some on this thread seem to have bought hook, line and sinker.
Google has the absolute worst ARM silicon money can buy (Tensor G2), go look at the benchmarks it's comical they would charge $1800 for a phone with it.
Even with something as simple as dictation, when iOS did it over the cloud, it was limited to 30 seconds at a time, and could have very noticeable lag.
Now that dictation is on-device, there's no time limit (you can dictate continuously) and it's very responsive. Instead of dictating short messages, you can dictate an entire journal entry.
Obviously it will vary on a feature-by-feature basis whether on-device is even possible or beneficial, but for anything you want to do in "real time" it's very much ideal to do locally.
Edit in response to your edit: nope, on privacy specifically I don't think most users care at all. I think it's all about speed and the existence of features in the first place.
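On the dictation point: the on-device mode is exposed directly in Apple's Speech framework, and a minimal sketch of opting into it looks roughly like the following. Whether on-device support is actually available depends on the device and locale.

```swift
import Foundation
import Speech

// Minimal sketch of requesting on-device speech recognition for an audio file.
func transcribe(fileURL: URL, completion: @escaping (String) -> Void) {
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized,
              let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
              recognizer.supportsOnDeviceRecognition else {
            completion("")
            return
        }

        let request = SFSpeechURLRecognitionRequest(url: fileURL)
        request.requiresOnDeviceRecognition = true   // audio never leaves the device

        // In real code, keep a reference to the task so it isn't deallocated early.
        _ = recognizer.recognitionTask(with: request) { result, _ in
            guard let result = result, result.isFinal else { return }
            completion(result.bestTranscription.formattedString)
        }
    }
}
```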
Apple has positioned itself as big on privacy, turning privacy into a premium product (because no other big tech company has taken that stance or seems willing to), further entrenching Apple as the premium option. In that respect I think users will "care" about privacy.
Yes. The number of times I ask Siri on my HomePod "What time is it?" and it replies "One moment..." [5 seconds] "Sorry, this is taking longer than expected..." [5 seconds] "Sorry, I didn't get that".
I have to assume this is due to connectivity issues; there is no other logical reason why it would take so long to figure out what I said, or why it wouldn't have the current time available locally.
A lot of end users do not and they have no interest in spending the time figuring it out. That's why it's very important that the companies behind the technology we use make ethical choices that are good for their users and when that doesn't happen, legislators need to step in.
Apple has been on both sides of that coin and what is ethical isn't always clear.
Local also solves any spotty-connection issues. Your super amazing knows-everything-about-you assistant that stops working when you’re on a plane or subway or driving through the mountains is a lot less amazing. If they can solve it, local will end up being a way, way smoother daily experience.
> Do users actually care whether something is local or not?
I think most don’t, but they do care about latency, and that’s lower for local hardware.
Of course, latency is also higher for slower hardware, and mobile local hardware has a speed disadvantage, but even on a modern phone, local can beat the cloud on latency.
Some workloads on M1 absolutely smash other ARM processors in part because of M1's special-purpose hardware. In particular, the undocumented AMX chip is really nice for distance matrix calculations, vector search, embeddings, etc.
Non-scientific example: for inference, whisper.cpp links with Accelerate.framework to do fast matrix multiplies. On M1, one configuration gets ~6x realtime speed, but on a very beefy AWS Graviton processor, the same configuration only achieves 0.5x realtime, even after choosing an optimal thread count and linking with a NEON-optimized BLAS. (Maybe I'm doing something wrong though.)
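If you want to poke at this on your own machine, a quick way to see what Accelerate does with a big matrix multiply (the layer whisper.cpp links against, and the path that is believed to exercise the AMX units on Apple silicon) is something like the toy probe below. It is not a proper benchmark, just a way to compare machines.

```swift
import Accelerate
import Foundation

// A big single-precision matrix multiply through Accelerate's vDSP.
let n = 1024
let a = [Float](repeating: 1.0, count: n * n)
let b = [Float](repeating: 2.0, count: n * n)
var c = [Float](repeating: 0.0, count: n * n)

let start = CFAbsoluteTimeGetCurrent()
vDSP_mmul(a, 1, b, 1, &c, 1, vDSP_Length(n), vDSP_Length(n), vDSP_Length(n))
let elapsed = CFAbsoluteTimeGetCurrent() - start

// An n x n multiply is roughly 2 * n^3 floating-point operations.
let gflops = 2.0 * pow(Double(n), 3) / elapsed / 1e9
print("A \(n)x\(n) multiply took \(elapsed) s, roughly \(Int(gflops)) GFLOP/s")
```

Numbers from a loop like this won't match whisper.cpp's end-to-end realtime factors, but the gap between machines shows up quickly.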
>They are in the unique position to enable local AI tools
The only unique Apple thing here is how bad their AI products are and how far behind they are in AI. This is the only thing that matters here; performance is adequate or better for the other processors out there, but you can't get anywhere without the appropriate software. Maybe they'll get smart enough to buy some AI startups/companies to get the missing talent.
Which AI products that they have actually implemented are bad? I think Siri is pretty poor, to be fair, and improves at a glacial pace. Pretty much everything else I'd say is state of the art: things like text selection from images, cutting out image subjects, their computational photography; even Maps directions have come a long way.
When people talk about AI they mean the new tech like LLMs or diffusion models, and the only relevant Apple offering (Siri) is way behind, and there's no evidence they have anything to replace it.
(Aside: their image manipulation and Maps are worse, though with Maps I don't know what the underlying issue is, and OCR was already mostly solved. I'm far from a photography expert so can't compare there.)
True that. But I won't give Apple credit for products we can't see or assume good performance without proof. As they say in the movies: "Show me the money".
I'll say though that no multibillion company is under existential threat. Not Apple, not Google and not even Intel. At worst they will lose a couple tens of billions and some marketshare. Even IBM still exists and took a long long time to fall to where it is still today.
What most people think of as AI can be better described as generative AI. Things like LLM and image making programs like Stable Diffusion. Apple has yet to implement anything like that.
They have done a ton with ML though. Some of these accessibility features, the Apple Pencil, FaceID, image cataloging, Live Text, etc. showcase how Apple can not only do ML well but also make good use of it. All of it is done on device. LLMs and image generation are other examples of ML processes that Apple could include in the OS and run locally. With all of the issues surrounding LLMs and the like, I am perfectly happy that Apple has been taking its time implementing them. It does feel like they could flip a switch when the time is right, and that is why people say they are in a great position.
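The "flip a switch" part is plausible partly because the plumbing for running arbitrary models on-device already ships: Core ML will schedule a compiled model across CPU, GPU, and the Neural Engine. A minimal sketch; the model file here is a placeholder, not anything Apple ships.

```swift
import Foundation
import CoreML

// Load an already-compiled Core ML model and let the framework pick the
// best mix of CPU, GPU, and Neural Engine for it.
func loadModel(at compiledModelURL: URL) throws -> MLModel {
    let config = MLModelConfiguration()
    config.computeUnits = .all          // CPU + GPU + Neural Engine
    return try MLModel(contentsOf: compiledModelURL, configuration: config)
}
```

Apple's own Core ML Stable Diffusion port works along these lines, which is part of why people keep pointing at it as evidence the hardware side is ready.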
The question is whether there will be models that can’t fit into an iPhone that apple will miss out on because they find cloud based personalization so abhorrent.
Agree these are tremendously good features and having them run locally will provide the best possible experience.
> I'm not even sure what "cloud based personalization" means to the user, other than "Hoover up all of your personal information."
It means having actually good ML.
I see so many posts around here saying Apple is absolutely well positioned to dominate in ML. It's just not true.
Nobody who is a top AI player wants to work at Apple where they have few if any AI products, no data, don't pay particularly well, not a big research culture, etc. etc.
The only thing they have going for them in this space is a good ARM architecture for low power matrix multiplication.
> I think there is an important little nod to the future in this announcement. "Personal Voice" is training (likely fine tuning) and then running a local AI model to generate the user's voice.
UMA may turn out to be visionary. I really wonder if they saw the AI/ML trend or just lucked out. Either way, the apple silicon arch is looking very strong for local AI. It’s a lot easier to beef up the NPU than to redo memory arch.
I think pretty much any multicore ARM CPU with a post ARMv8 ISA is looking pretty strong for local AI right now. Same goes for x86 chips with AVX2 support.
All of them are pretty weak for local training. But having reasonably powerful inferencing hardware isn't very hard at all, UMA doesn't seem very visionary to me in an era of MMAPed AI models.
> I think pretty much any multicore ARM CPU with a post ARMv8 ISA is looking pretty strong for local AI right now. Same goes for x86 chips with AVX2 support.
Apple Silicon AMX units provide the matrix multiplication performance of many core CPUs or faster at a fraction of the wattage. See eg.
Plus, the benchmark you've linked to is comparing hardware accelerated inferencing to the notoriously crippled MKL execution. A more appropriate comparison would test Apple's AMX units against the Ryzen's AVX-optimized inferencing.
The visionary part is having a computer with 64GB RAM that can be used either for ML or for traditional desktop purposes. It means fewer HW SKUs, which improve scale economy. And it means the same HW can be repurposed for different users, versus PCs where you have to replace CPU and/or GPU.
For raw ML performance in a hyper-optimized system, UMA is not a big deal. For a company that needs to ship millions of units and estimate demand quarters in advance, it seems like a pretty big deal.
Very different. Intel Macs had separate system RAM and video RAM, like PCs.
Apple Silicon doesn't just share address space with memory mapping, it's literally all the same RAM, and it can be allocated to CPU or GPU. If you get a 96GB M2 Mac, it can be an 8GB system with 88GB high speed GPU memory, or a 95.5GB CPU system with a tiny bit of GPU memory.
Apple's GPUs are slow today (compared to state of the art nvidia/etc), but if Apple upped the GPU horsepower, the system arch puts them far ahead of PC-based systems.
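What the unified memory actually means in code, as a rough sketch: a single Metal buffer in shared storage that the CPU writes and the GPU reads, with no copy in between.

```swift
import Metal

// One allocation, visible to both CPU and GPU. On Apple silicon a
// .storageModeShared buffer is literally the same DRAM; there is no
// staging copy like a discrete-GPU system would need.
guard let device = MTLCreateSystemDefaultDevice(),
      let buffer = device.makeBuffer(length: 256 * 1024 * 1024,
                                     options: .storageModeShared) else {
    fatalError("No Metal device available")
}

// The CPU writes straight into the allocation...
let floats = buffer.contents().bindMemory(to: Float.self,
                                          capacity: buffer.length / MemoryLayout<Float>.stride)
floats[0] = 42.0

// ...and the same MTLBuffer is then bound to a compute encoder for the GPU.
```

On a discrete-GPU PC the equivalent path involves copying over PCIe into VRAM, which is the cost UMA removes.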
That doesn't have any relevance to the efficiency and cost improvements of having the same very fast RAM connected to both CPU and GPU cores.
I can't believe anyone is arguing that bifurcated memory systems are no big deal. Are you like an x86 arch enthusiast? I'm sure Intel is frantically working on UMA for x86/x64, if that makes it more palatable. Though they'll need on-die GPU, which might get interesting.
I'm a computer enthusiast. I've got my M1 in a drawer in my kitchen, it's just not very useful for much unless I'm being paid to fix something on it. MacOS is a miserable mockery of itself nowadays and Apple Silicon is more trouble than it's worth, at least in my experience.
As I'm working on AI stuff right now, I have to be a realist. I'm not going to go dig up my Mac Mini so my AI inferencing can run slower and take longer to set up. Nothing I do feels that much faster on my M1 Mini. It feels faster than my 2018 Macbook Pro, but so did my 2014 MBP... and my 2009 x201. Being told to install Colima for Docker with reasonable system temps was the last straw. It's just not worth the hoop-jumping, at least from where I stand.
So... when a day comes where I need UMA for something, please let me know. As is, I'm not missing out on any performance uplift though.
> I'm sure Intel is frantically working on UMA for x86/x64
Everyone has been working on it. AMD was heavily considering it in the original Ryzen spec iirc. x86 does have an impetus to put more of the system on a chip - there's no good reason for UMA to be forced on it yet. Especially at scale, the idea of consolidating address space does not work out. It works for home users, but so does PCI (as it has for the past... 2 decades).
It's just marketing. It's a cool feature (they even gave it a Proper Apple Name) but I'm not hearing anybody clamor for unified memory to hit the datacenter or upend the gaming industry. It's another T2 Security Chip feature, a nicely-worded marketing blurb they can toss in a gradient bubble for their next WWDC keynote.
> They are in the unique position to enable local AI tools, such as assistance or text generation, without the difficulties and privacy concerns with the cloud.
I don't see why client-side processing mitigates the privacy concerns. That doesn't stop Apple from "locally" spying on you then later sending that data to their servers.
Ok, sure, but surely you see how it is that much harder to do?
Also since Apple is built around selling expensive devices and services you could also see why they’d have much less incentive to spy and collect data than, say, Google or Facebook?
The cynicism of “everything is equally bad so why care” is destructive.
For now. It was just two decades ago that Apple was on life support. That could happen again, and the temptation would be much stronger to start monetizing their users' data.
I'm not the target audience for these features, and don't want to speculate on behalf of others, so I'll just focus on my own needs...
Live Speech: I actually answer unknown phone numbers (usually) and would like text-to-speech on my calls because I've started to get concerned about what can be done, fraud-wise, with even small samples of my voice. So in this case, using another's voice is fine, even preferred. (Edit: I suppose I'd actually prefer a voice-changer here, which is less related to this accessibility feature. But I think Apple is unlikely to do that.)
Personal Voice: When my girlfriend texts me when I'm out with my AirPods, I think we'd both like me to hear her message in her actual voice rather than Siri's. This feature doesn't allow for that yet, but the pieces are all there.
Finally, Apple needs to detect and tell me when I'm listening to a synthetic voice, including when it's not being generated by Apple. There's fraud potential here. I'm clearly excited about this tech, but I want to know more on this front.
> When my girlfriend texts me when I'm out with my AirPods, I think we'd both like me to hear her message in her actual voice rather than Siri's
And then you can use voice to text to text her back, and she can hear it in your voice! It's just like a phone call from 30 years ago, but one that requires infinitely more processing power!
Genuinely funny reply. But (a) whether I prefer typing or speaking, and (b) whether I prefer reading or hearing, is very context-dependent and might not be the same for the person at the other end -- and so yeah I think having flexibility there is good! If I'm on AirPods and not on my phone, I'd like to hear the message. CarPlay, too. When actively on my phone, I prefer to type, even if on AirPods. CarPlay, I shouldn't probably be typing ever. So yeah, generating text and speech simultaneously and having the end result be situational is in fact a good thing.
It's worth noting this is already how iPhones work and people already love it. What I'm suggesting additionally is substituting Siri's voice for a DIFFERENT customized synthetic voice in a very specific circumstance. I'm not advocating for using synthetic voices where there currently aren't any here.
> It's just like a phone call from 30 years ago, but one that requires infinitely more processing power
There are folks who couldn’t, for a variety of reasons, do that thirty years ago. This feature is for them. The rest of us get to e.g. more naturally text a response to a call we’re listening into on a flight.
> Can't you already do that with iMessage's voice delivery feature?
Voice memos? No. I would have to, at some point, speak it. If you’re referring to text-to-speech, there is a difference between having your speech read in a different voice and your own.
On the plus side it would use less bandwidth. That phone call from 30 years ago probably used (ballpark) 64kilobits/second. This could use a lot less and have higher audio quality.
> Personal Voice: When my girlfriend texts me when I'm out with my AirPods, I think we'd both like me to hear her message in her actual voice rather than Siri's. (This feature doesn't allow for that yet, but the pieces are all there.)
Neat idea, same with CarPlay: having it reliably imitate the voice of the person who sent the message would make it a lot nicer.
Though they would need to get all the TTS and intonation right first, which IME is not the case, I think having the right voice but the wrong intonation entirely would be one hell of an uncanny valley.
Surely Shatner has sold his voice already like Bruce Willis did? I'd pay to hear my morning information read in the voice of the shat, in the format of a captain's log.
A voice changer for unknown or blocked caller IDs is actually a great idea. I just looked in F-Droid and the Play Store, and there seems to be no such app. Preventing callers from sampling my voice is a concern I did not know I had.
A couple weeks ago, I looked into building it for iPhone, but there was no way in iOS to integrate it. So now I'm just hoping for Apple to do it. This article was what prompted the thought: https://www.washingtonpost.com/technology/2023/03/05/ai-voic...
>because I've started to get concerned about what can be done, fraud-wise, with even small samples of my voice.
This vector was recently highlighted as a weakness of the Australian "Voiceprint" system(1)
It's also an excellent point to make that these tools can be useful for everyone. I find myself using a number of the accessibility tools simply to speed up some of my common interactions with the phone and watch.
I've also noticed that these technologies end up in other products. For example, Live Text is now a standard feature on macOS/iOS, yet it's an accessibility feature originating in the screen reader to deal with text flattened into images. This technology sharing also gives us a bit of a preview of what they're working on (e.g. AR).
I was thinking it would be fun just to have Siri use a Personal Voice, but your idea is better! And it could just offer to make your Personal Voice data available along with your Contact picture.
While that sounds like it would unlock some very cool experiences it also scares me to think about the potential abuses of making personalized voice models fairly easily available. It seems like the sort of thing that would need to stay secure on your own device. It would be great to see some kind of middle ground where a text to speech mechanism would generate audio output and send that, rather than make the model itself available.
> Personal Voice: When my girlfriend texts me when I'm out with my AirPods, I think we'd both like me to hear her message in her actual voice rather than Siri's.
It's also hard to do privacy-wise unless every text message she sends pre-generates the audio message using her on-device voice and then attaches it. That would make every message use 10x as much bandwidth, storage, and battery power. (10x is a random number but you get the point). Seems cute but really impractical.
I would think that your phone could request audio messages from the sender only when necessary. They already sync things like your DND status to show others so this would just be another flag. Messages could also then alert the sender that their message may be read aloud in their Personal Voice. Or maybe allow turning this on per conversation.
10x compared to what though? FaceTime (and similar) is already full-duplex video and audio, which I have to imagine is at least another 10x on top of what you’re describing. Are we really budgeting our computer resources so strictly that this would even show up as more than a rounding error?
I think I'd rather just eliminate the ringing portion of a phone call (when I have airpods in) and instead just let a trusted list of contacts talk directly into my ear (1-way) until I "answer" and open up a 2-way channel.
The "answer" button seems to serve to also say "I'm ready to listen" as much as it does to say "I'm ready to say". IMO this kind of goal is better covered by voice messages, or, at least, starting with a voice message. This allows the receiver to pick when they are ready to hear it (including immediately), replay it as needed, and choose when they respond (if at all). Many of these are benefits for the sender as much as the receiver.
Yeah I want “Apple Intercom” or something where me and my girlfriend can have linked airpods and it’s like a spy movie. Perfect for communicating in loud places or crowds without requiring data.
Not sure if you ever had a Nextel phone, but they had a cellular walkie talkie feature for awhile. It was pretty popular in my high school circa 2003, but I remember kinda hating it.
> Finally, Apple needs to detect and tell me when I'm listening to a synthetic voice.
There's already some version of this in "Hey Siri" detection -- if I record myself with a prompt and play it back, my HomePod briefly wakes up at the "Hey Siri" but turns off mid-prompt. I guessed it was some loudspeaker detection using the microphone array, but it could be a mix of both?
I was thinking this exact thought. I want Samantha reading my notifications. I desperately want an AI like her to be able to talk to. ChatGPT is getting there.
I wish they'd have a year of fixing accessibility bugs instead of making feature after feature. They had one release of iOS 16 where, if you opened the notification center with a Braille display connected (which is crucial for Deaf-Blind people), the display would disconnect. This was brought up during the betas, but it still made it into production. Now there's a bug where, if you're reading a book with your Braille display, after a page or two you can't scroll down a line to continue reading. Also, they've been working on a new speech platform, and it's pretty buggy.
I'm not saying Android is any better. We got a new Braille HID standard in 2018, and the Android OS still doesn't support it. So what does the TalkBack team have to do? Graft drivers for each Braille HID display into TalkBack (Android's screen reader, like VoiceOver), with another driver coming in TalkBack 14, because of course they can't update the Android Accessibility Suite the way they do Google Maps, Google Drive, Google Opinion Rewards, and even Google Voice, which gets an update every few weeks. I mean, the Accessibility Suite is not a system app. If it were, it could just grab Bluetooth access and do whatever. But it's not, so it should be able to be updated far more frequently than it is. It's sad that Microsoft, of all these big companies that'll talk and talk and talk (which doesn't always include Apple), is the one that has HID Braille support in the OS. Apple has HID Braille support too. Google doesn't, neither in Android nor ChromeOS; they just piggyback off of BRLTTY for all their stuff.
It’s not just accessibility, everything about their operating systems is crawling with bugs piling up on each other. They have bugs in Passwords, bugs in Shortcuts, bugs in permissions, bugs in Clock… I can no longer even trust that setting an alarm will be done for the correct time.
I'm guessing this is the "I want alarms to be pinned to a specific moment in time and not the time the alarm is set to in the current timezone" thing.
It would be nice if this was an option but I can't really fault them for not including it since it's kinda niche.
As I wrote, I can no longer trust the alarm. I’m talking about a supported use case which broke, not a niche and unsupported situation.
This seems to have been fixed in 16.4, but before that I would:
1. Pull down to show Spotlight search.
2. Type “alarm” and tap “Create Alarm”.
3. Set a time and tap “Done”.
The feedback message would tell me the alarm was set to a different time, with multiple hours of difference.
This is using only first-party features to do a basic task and even that didn’t work right. I could reproduce it reliably. Luckily the message was correctly showing the wrong time that was set and I double-checked it to notice.
All the other issues I mentioned still have open feedbacks about it. All but one are regressions, the other is a security flaw which has always been there.
"if you opened the notification center with a Braille display connected"
This sounds obvious, but imagine all the combinations of all the features that can interact, and then imagine having to test them all manually, because there's no automated model for specific Braille displays.
For decades, integration testing teams have outnumbered developers. It's just a hard problem, particularly at Apple's scale. It's not unlikely that there are only tens of users experiencing a given bug (though this one likely has thousands), and that it would take doubling or tripling the size of teams to find all these bugs before release.
"Assistive Access distills experiences across the Camera, Photos, Music, Calls, and Messages apps on iPhone to their essential features in order to lighten their cognitive load for users."
I might use this myself even though I'm not disabled.
Not all disabilities are permanent! Sometimes you're just situationally disabled, i.e. you're in the car, or carrying a child, or sleep-deprived, or just stressed out.
Agreed. Presenting this feature solely as a tool for users with cognitive disabilities might undersell its potential. There's a significant number of smartphone users who only utilize a small fraction of the available features and would prefer a simpler interface focusing on the 10% of features they actually use. Interestingly, this demographic includes both less tech-literate users, who might feel overwhelmed by the complexity, and extremely tech-literate users who know exactly what they need and prefer clean, distraction-free tools.
My 84 year old mother does remarkably well with her suite of Apple products. However, it would be nice to simplify some things. She only calls or chats with a limited number of people. Reworking the phone and messenger apps to use large pictures of the people would be a benefit. Something similar for the camera / photos app would be great as well.
You can pin messages in iMessage, which turns it into a big photo. Also when you hit the search button in Photos it’ll show you big images of people’s faces if you want to see photos of a particular person (if you’ve trained your phone on those faces).
I do wish the “favorites” view in the phone app made the headshots big like iMessage, though.
Finally! Grandma mode! With how hard it is to find uncomplicated dumbphones that work on AT&T, this will be an alternative if we need to get my grandma another phone.
I find it interesting that they are not waiting to talk about this at WWDC since I assume these features are tied to iOS 17. I wonder if that means that the simplified apps won't be available to developers (I don't see an App Store so I guess that makes sense).
All of these features do seem really awesome for those that need it. Particularly the voice synthesis. I honestly just want to play with that myself and see how good it is and I am curious if they would use that tech for other things as well.
The whole new simple UI I can really see being a major point. Especially if it includes an Apple Watch component and can still be synced with one for the safety features a watch has. Particularly fall detection.
Edit:
Maybe I missed it, but this brings up an interesting problem: is the synthesized voice only stored on the device and never backed up or synced to a new device? Can you imagine your phone breaking, or needing to upgrade, after you lost your voice (but had previously set it up), and you can no longer use it?!?
I fully understand the privacy concerns of something like this being lost. But this isn't like FaceID that could just be easily re-created in some situations. So I really hope they thought about that.
Apple has a way to encrypt sensitive data locally and store it in iCloud, then decrypt it locally later. They are kind of hit-and-miss about when they do this. But they certainly can do it, so it should be possible to securely back up a voice profile.
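To be clear about the shape of that: this is not Apple's actual mechanism (that would be CloudKit's encrypted fields and the iCloud Keychain), just a sketch of the general idea of sealing data on-device so the server only ever sees ciphertext.

```swift
import CryptoKit
import Foundation

// Seal the voice profile locally; only the sealed blob would ever be uploaded.
func sealForUpload(_ voiceProfile: Data, with key: SymmetricKey) throws -> Data {
    let box = try AES.GCM.seal(voiceProfile, using: key)
    return box.combined!   // nonce + ciphertext + tag; non-nil with the default nonce
}

// Pull the blob back down and decrypt it locally on the new device.
func openAfterDownload(_ blob: Data, with key: SymmetricKey) throws -> Data {
    try AES.GCM.open(try AES.GCM.SealedBox(combined: blob), using: key)
}

// In real use the key would live in the keychain and sync (or not) per policy;
// here it's just generated in memory for the sketch.
let key = SymmetricKey(size: .bits256)
```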
FaceID is not backed up to iCloud. That is in part because it is a local-only feature. And it is in part because people’s faces change over time, so requiring them to re-enroll their face with each new phone ensures accuracy over time.
It may also be sensor-dependent; the model produced and stored by an iPhone 14 might not “make sense” to an iPhone 16, if the hardware is different.
True, but they have made choices of where they will and will not do this in the past.
I can see the privacy reasons not to have this data sync, but given the reasons you would be creating it in the first place, losing those recordings or the voice data would be a huge blow.
I don't doubt that Apple could sync this data, I just hope that they are. I don't see anything about that happening on this document so I worry that they won't for privacy concerns.
If they didn't save Personal Voice for WWDC, just think what they may have ready to announce there. On its own Personal Voice would have been a headline grabbing announcement, but they dropped it now. That suggests to me exciting things.
I have been starting to wonder if this really is going to be a very packed WWDC. Between this announcement and the Final Cut Pro (and the other tool I don't remember now) announcement.
This is a pattern they have gotten into in the last few years: announce accessibility features a few weeks before WWDC. I think it's so these features, which matter a great deal to their target demographic, get their own news cycle instead of being swamped by all the major OS features announced at WWDC.
For ease of reading for other users, here is a quote from the article about the Personal Voice feature:
"For users at risk of losing their ability to speak — such as those with a recent diagnosis of ALS (amyotrophic lateral sclerosis) or other conditions that can progressively impact speaking ability — Personal Voice is a simple and secure way to create a voice that sounds like them.
"Users can create a Personal Voice by reading along with a randomized set of text prompts to record 15 minutes of audio on iPhone or iPad. This speech accessibility feature uses on-device machine learning to keep users’ information private and secure, and integrates seamlessly with Live Speech so users can speak with their Personal Voice when connecting with loved ones.
“At the end of the day, the most important thing is being able to communicate with friends and family,” said Philip Green, board member and ALS advocate at the Team Gleason nonprofit, who has experienced significant changes to his voice since receiving his ALS diagnosis in 2018. “If you can tell them you love them, in a voice that sounds like you, it makes all the difference in the world — and being able to create your synthetic voice on your iPhone in just 15 minutes is extraordinary.”
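For context on where this plugs in: the speech-synthesis API that something like Live Speech sits on top of is already public, and a Personal Voice would presumably just surface as another voice it can use. A minimal sketch of that existing API; the voice chosen here is a stock one, not a Personal Voice.

```swift
import AVFoundation

// Today's public text-to-speech plumbing, for reference.
let synthesizer = AVSpeechSynthesizer()

func speak(_ text: String) {
    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = AVSpeechSynthesisVoice(language: "en-US")   // stand-in voice
    utterance.rate = AVSpeechUtteranceDefaultSpeechRate
    synthesizer.speak(utterance)
}
```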
You could also just record some conversations with them? I think hearing them actually speak about their experiences and memories would be much more impactful than hearing a simulation of them, no matter how accurate.
Honestly, I wish this had been around 10 years ago to save my late mother's voice. I saved all the voicemails I have from her and wish I could hear more from her.
These features that will be built into the operating system are things that otherwise would have cost someone hundreds to thousands of dollars before... if they even existed at all.
I love that Apple chooses to invest in these areas of accessibility that are extremely difficult to make sustainable businesses out of without charging users exorbitant sums.
This is amazing. I have a non-speaking person in my life, and the idea of them being able to create their own unique voice is so powerful. Stephen Hawking kept his voice synth for years after better tech was available because he wanted to keep his voice.
If anyone here is interested in or working on open-source tech for the non-speaking, please reach out. I'm building tech that uses LLMs and Whisper.cpp to greatly reduce the amount of typing needed. What Apple has here is great, but it still requires typing in real time to communicate. Many of the diseases that take your voice also impact fine motor control, so tools for being expressive with minimal typing are super important. Details (and links to GitHub projects) here: https://scosman.net/blog/introducing_voicebox
Marketing matters. Apple's "narrative" breaks their "privacy is a human right" motto whenever they kowtow to China to keep their manufacturing margins. That's not very reassuring to people who want truly private AI hardware, but Apple can fix that with marketing.
If you trust them to provide a secure model, I'd wager you haven't fully explored the world of iCloud exploitation.
I trust them more than I should but less than I could.
In theory, researchers and hackers can and do watch network traffic while training an on-device model. Apple doesn't want their position of "on-device AI is better for the consumer" to be tarnished, so I trust them not to screw around with this.
I, for one, don't want a voice cloner in every hand. These things are very, very destructive. Obviously, this is a post about a new Apple feature (even though great voice cloning has existed for months), so all the fanboys are unable to control their enthusiasm.
Apple's assistive and accessibility features (on mobile) have always been way ahead of Android's, but this takes it to the next level. Talk about a moat.
Android has on-device speech to text for what is happening around you and what is going on in phone calls built-in, which iOS still cannot do, and allows third party accessibility services. Apple remains far, far behind.
In general, just like with applications that aren't for accessibility, it is better to use a platform that lets you customize the system to fit your specific needs. We, as technologists, should understand this better than most. iOS simply fails here.
But "We, as technologists" need to understand that that customization is a hard thing for many people. Most people, especially the people that part of what is announced today is targeted at, need something that is baked into the operating system and easier to manage.
If an accessibility feature requires someone to get the help of someone else to setup it is already a failure right out the gate. They are still reliant on someone else to help manage their phone.
Now yes there are situations where this is impossible to avoid, particularly for vision impaired people since you need to first set that up (but even that there are attempts to address this by the phone setup having those systems turned on by default).
But those are the exceptions and should not be the rule for accessibility features.
Edit:
Just to be clear, there is obviously a market for highly customizable accessibility tools, similar to the Xbox Adaptive Controller, which do require someone else's assistance to set up.
But not everyone needs that level of support, and where possible we should be making tools that let someone be fully self-reliant rather than dependent on someone else for setup.
I’ve been digging into Apple’s (existing) accessibility features last week and it made me think about what (for lack of a better term) an “accessibility first” app architecture might look like.
Current app UIs are all about how to visually represent the objects you can interact with and the actions you can take on them, and then accessibility features are layered on top.
(Warning: half-baked idea that’s probably been tried countless times) what if you started with semantic information about the objects and the actions you can take on them and then the GUI is just one of several interaction modes (along with voice, keyboard, CLI, etc)?
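A rough Swift sketch of that half-baked idea (all names here are hypothetical, nothing to do with Apple's actual frameworks): capabilities are declared as semantic actions on the model, and each front end (GUI, voice, CLI, switch control) is just another consumer.

    // Hypothetical "semantics first" app model.
    struct SemanticAction {
        let identifier: String      // stable, machine-readable name
        let label: String           // human-readable, localizable
        let perform: () -> Void

        init(identifier: String, label: String, perform: @escaping () -> Void) {
            self.identifier = identifier
            self.label = label
            self.perform = perform
        }
    }

    protocol SemanticObject {
        var label: String { get }
        var actions: [SemanticAction] { get }
    }

    struct Message: SemanticObject {
        let sender: String
        let body: String
        var label: String { "Message from \(sender)" }
        var actions: [SemanticAction] {
            let name = sender
            return [
                SemanticAction(identifier: "reply", label: "Reply", perform: { print("Replying to \(name)") }),
                SemanticAction(identifier: "archive", label: "Archive", perform: { print("Archiving") }),
            ]
        }
    }

    // A voice or CLI front end can enumerate exactly what the GUI renders:
    func describe(_ object: SemanticObject) {
        print(object.label)
        for action in object.actions {
            print(" - \(action.label) (\(action.identifier))")
        }
    }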
This was the original premise of the Smalltalk Model-View-Controller architecture. (Not to be confused with web MVC, which reused the name for something else.) The “Controller” in Model-View-Controller refers to the input device, and a given Model can have multiple independent Views and Controllers. The “Model” is supposed to be an object that’s a semantic representation of the underlying thing being manipulated.
One of the visible impacts of MVC is that changes occur in real-time: you modify a value in a dialog, and everything affected by that value instantly updates. This is already common in Mac apps (in contrast to Windows apps, which typically want you to press “ok” or “apply”), so it wouldn’t surprise me if Apple was already using a modern MVC variant. It’s a well-known pattern.
In the Apple documentation for MVC, "controller" refers to a class that sits between the model and view. When data changes in the model, it updates the view; and when the user interacts with the view, it passes events to the model.
Like you said, this separation means you can "drive" the same model through different UIs. That's one of the things I always thought was cool about AppleScript support -- the app exposes a different interface to the same model.
Obviously, developers who design semantically for different UIs would be far, far ahead, and Apple's APIs can be used for that.
The harder problem is building accessibility APIs for oblivious developers, so they can retrofit at the last minute before release (as usually happens). Apple has done a pretty good job there, harvesting the semantics of existing visual UIs to adapt them for voice/hearing.
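To that retrofit point, even an "oblivious" custom UIKit view can be given enough semantics for VoiceOver to drive it with a handful of standard UIAccessibility properties. A minimal sketch (the control itself is made up):

    import UIKit

    // Hypothetical custom control retrofitted with accessibility semantics.
    final class WaveformScrubber: UIView {
        var progress: Double = 0.0 {
            didSet {
                // Keep the spoken value in sync with the visual state.
                accessibilityValue = "\(Int(progress * 100)) percent"
            }
        }

        override init(frame: CGRect) {
            super.init(frame: frame)
            isAccessibilityElement = true
            accessibilityLabel = "Playback position"
            accessibilityTraits = .adjustable   // VoiceOver: adjustable via swipe up/down
        }

        required init?(coder: NSCoder) {
            super.init(coder: coder)
            isAccessibilityElement = true
            accessibilityLabel = "Playback position"
            accessibilityTraits = .adjustable
        }

        // VoiceOver's increment/decrement gestures land here.
        override func accessibilityIncrement() { progress = min(1.0, progress + 0.05) }
        override func accessibilityDecrement() { progress = max(0.0, progress - 0.05) }
    }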
It would be totally within their MO to suddenly wake up, and turn this cutting edge AI stuff, which no one is quite sure what to do with, into a killer app with super high quality. Fingers crossed.
This announcement links to demo reels of actual users. More of this, please.
I wish Apple would produce Starfire style demos showing off their products. In context narratives showing how real people use stuff. Covering features old and new.
I so want the voice interface future. I became addicted to audiobooks and podcasts during the apocalypse. Hours and days outside, walking the dogs, hiking, gardening.
I made multiple attempts to adapt to Siri and Voice Activated input. Hot damn, that stuff pissed me off. Repeatedly. So I'd have to stop, take off gloves, fish out the phone, unfsck myself, reassemble, then resume my task. Again and again.
How very modal.
This mobile wearable stuff is supposed to be seamless, effortless. Not demand my full attention with every little hiccup.
So I just gave up. Now I suffer with the touch UI. And aggravations like recording a long video of my pocket lint and invoking emergency services.
Maybe I should just get a Walkman. Transfer content to cassette tapes.
One of the most useful accessibility features, as someone who doesn't require any, has been the live caption feature. I have no hearing or vision impairments, but I do struggle with ADHD. It launched a while ago, along with audio detection for things like doorbells or alarms. I was hoping multi-language support would be announced along with this; currently it's still limited to English (U.S.) or English (Canada) audio and text.
It's been so helpful to be able to read a short transcript of what was just said. The live caption feature works on video calls, and while multi-speaker captioning isn't perfect, it mostly works. I also really like how Android, iOS, and Windows all seem to keep feature parity among the operating systems. I wonder how Google and Microsoft will respond to this and I wonder how Personal Voice will work for communication outside of the Apple ecosystem.
Off-topic, but I'm getting tired of Apple creating a Proper Noun for every new feature they want to include in marketing. They're always so vague and/or obtuse that I keep forgetting them and what they mean. As an example, I thought I knew what ProMotion was but am realizing now that I was confusing it with True Tone.
Honestly, without needing to personally use the accessibility features, I think I might take a look at the new layouts, because I really yearn to make my device more minimalistic; there is just too much going on in modern devices.
But Bravo to Apple once again for doing excellent accessibility features and continuing to improve them.
I have a habit of checking the accessibility features from time to time, there's great stuff in there :)
My favorite is Spoken Content > Speak Screen: you can swipe down with two fingers from the top of the screen and it'll read the written content for you. I use it to read articles in Safari while I'm brushing my teeth or walking my dog
Accessibility makes life better for everybody. A lot of accessibility boils down to rethinking the obvious to enable _more_ use cases. "Things that might make your life better." Dig around in those settings and you might find your phone can do things you never thought of.
There are so many. One feature most people don't know about is macOS can speak the current time every 60, 30, or 15 minutes. (Settings > Control Center > Clock) It's a very old feature.
Most people can't understand why they would want a computer speaking the time, it would drive them nuts. I have ADHD and no sense of time passing. It helps me offload keeping track of time. (Which is otherwise continuously looking at a clock.)
I also turn off every auto-playing feature in any app that supports it. Some types of motion can be highly distracting. That can trigger a panic attack if I'm constantly needing to redirect my focus away from it. (If this sounds strange, it causes me to feel trapped in a tiny closet. Anxiety is a bitch.)
Google's latest video conferencing iteration lets people spam flying emoji. It's a "fun" feature that is absolute hell for me. Fortunately, there is a buried setting to remove it from my view.
I never thought about using that feature, but I should, for the same reasons. I have an LCD digital clock right under my monitor even though the OS has a clock, of course. But as soon as I'm "immersed" in something, or the menu bar is obscured because I keep some apps fullscreen, time does not exist.
I know there's a lot of concerns about generative voices, but it's a shame the only solution is 'insist on a live recording of random phrases'. For those whose voices have already degenerated, but who have hours of recordings available from historical sources (e.g. speech therapy sessions), it's too late to ask them to record something fresh.
Don't get me wrong, it's fantastic to see the tech being used this way. My Dad has lost most of his speech due to some kind of aphasia-causing condition and if this had been available just a year or two ago it could've been a big help for him; it's reassuring that others earlier in the journey will benefit from this.
It's not necessarily the only solution, just the one Apple has started with. My guess is that at least in part there are technical reasons, since even this structured approach will require overnight processing (source: WaPo) by Apple.
Additionally, in a situation like your dad's, I think Apple would need to verify that he is the same person there are hours of audio of. That strikes me as difficult at Apple's scale.
Not sure why everyone thinks local language processing, generation, and voice synthesis has anything to do with privacy. Apple's business model has supported their relatively pro-privacy approach (along with how bad they are at a lot of cloud services), but doing anything useful with a 30B LLM stored on your machine is still going to require the Internet, and your AI personal assistant will be just as tracked and cracked as you are today.
If you haven't looked at the accessibility features on your iPhone, you should check them out. There's some interesting stuff there already. Though it's a little awkward to use, the facial gesture functionality is really interesting.
I feel that many features currently focused on accessibility will soon be integrated into the main user interface as AI becomes a more important part of our computing experience. Talking to your phone without a wake word, hand gestures, object recognition/detection, face and head movements, etc. are all part of the future HCI. Live multimodal input will be the norm within the decade.
It's taken a while for people to get used to having a live mic waiting for wakewords, and it'll take them a while to get used to a live camera (though this is already happening with Face ID), but sooner than later, having a HAL 9000 style camera in our lives will be the norm.
a) These features are truly moving. I didn’t expect to get emotional from a press release.
b) It’s gonna be a bonkers keynote if they’re releasing this before WWDC.
I'm excited to try the Personal Voice feature. If it had existed years ago, I could have created the voices of my parents, grandparents, and other people close to me.
It would be wonderful to hear their voices again.
Hmm I wonder if we can extract a voice from a video and feed it to Personal Voice.
This is great, though they should really fix accessibility basics: keyboard control in the App Store, etc. I contacted Apple Support about keyboard control of the Look Around feature in Maps; I'm not sure they even understood the request. They directed me to an unsolicited-ideas disclaimer.
When my grandfather lost his ability to speak, it was frustrating for him because he was such an intelligent, articulate person. Having a disease that renders your voice useless and then having to communicate via a small whiteboard was not fun for him. He couldn't write fast enough and oftentimes felt like a burden because we would all be waiting to see what he wanted to say.
The Live Speech features will be a game changer for people with ALS. It saddens me this wasn't around before my grandfather passed, but I am optimistic that others with this horrendous disease can use it to keep communicating with their families even after ALS has taken their voice.
Finally! Such a simple idea that's taken so long. My parents and grandparents are going to LOVE this.
The only thing they need now is a simplified user interface for Apple Podcasts: a one-click button to listen to the latest episode of a given podcast.
The main issue I see older people have with iPhones is unrelated to ui complexity. iOS’ worst parts are services/updates where the user is frequently asked for Apple ID passwords and shown confusing ads for Apple services like iCloud Photo Library that end up making photos more complicated to use as some pictures are now unavailable without internet. More parts of the iOS experience are getting unnecessarily tangled with server side services that require unpredictable prompts to get acceptance to terms and conditions, logins, and traps that make you set up two factor auth or buy an unnecessary subscription.
These things are the bane of my existence with my mom's iPhone. She is blind but is not able to use all the swiping/tapping features of VoiceOver due to a tremor. She mostly uses the phone with Siri only, and that works well enough until a popup decides to appear.
I agree that Apple prompts for iCloud passwords way too frequently and doesn't offer the ability to authorize with another device (despite the fact that you can reset your password from another device).
“Type your password to buy this item on the App Store”. Why?
> “Type your password to buy this item on the App Store”. Why?
Because store items can be quite expensive (particularly the microtransactions), and it's using a stored credit card. I too get bugged by the plethora of prompts, but this is a good example of when to use one. Plus, after doing it once you can tie it back to Face ID or Touch ID and not have to enter it again for quite some time.
Financial transactions are held to a higher standard (i.e. there's a bunch of laws around them) than device access. That said, you can make your device passcode as complex as your Apple ID password if it's a tradeoff you're willing to make.
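(Tangent for developers: the same Face ID / Touch ID gate is available to third-party apps via LocalAuthentication; a minimal sketch, with the actual purchase flow left out:)

    import Foundation
    import LocalAuthentication

    // Gate a sensitive action behind Face ID / Touch ID, falling back to the
    // device passcode when biometrics aren't available or fail.
    func authorizePurchase(completion: @escaping (Bool) -> Void) {
        let context = LAContext()
        var error: NSError?

        guard context.canEvaluatePolicy(.deviceOwnerAuthentication, error: &error) else {
            completion(false)   // no passcode set on the device at all
            return
        }

        context.evaluatePolicy(.deviceOwnerAuthentication,
                               localizedReason: "Confirm your purchase") { success, _ in
            DispatchQueue.main.async { completion(success) }
        }
    }

    // Usage: authorizePurchase { ok in if ok { /* proceed with the transaction */ } }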
My device asks me to type in my kid's password. My kid is in my family group. After authorizing my kid, I still have to approve on my device. If I forget my kid's password, I can reset it from my device.
There is zero extra security in forcing me to type his iCloud password on his device, especially given that I have typed it in the past.
It's not a financial requirement, it's laziness.
I wonder if the “make the buttons larger” features will also help apps prepare for when we are using them on virtual reality and augmented reality, where they need to be bigger and with fewer options.
This is completely incomprehensible to me.
Okay, Russia's image is far from the best right now, and it's hard for many to even stand next to it. But Apple hasn't left and continues to collect money there.
And Russian is spoken not only in the Russian Federation; almost the entire former CIS could have used this, but here they cut it down to Ukrainian only.
Are disabled people from Kazakhstan, Belarus, and other countries somehow guilty?
Assistive Access looks great! This is perfect for my grandfather with Parkinson's, who often struggles with his phone due to the deterioration of his motor skills.
Same thought here! It really sucks that the 3G network shutdowns made most flip phones functionally useless. There really aren't any modern smartphones designed for people who want something easy to use, which makes this announcement so exciting.
Another underserved group of users is people with ADHD. I think a "minimize distractions" mode at the OS level would be very welcome by a lot of people.
As @dagmx said above, we've already had a great bunch of features added in the last 18 months: time- and geo-sensitive Focus modes that can lock a device to only certain apps and/or certain features, and Notification Center batching notifications up instead of drip-feeding them.
So I agree that easily available speech controls would be a nice benefit, but Books is fully compatible with VoiceOver already (and has been since VoiceOver’s launch).
This is problematic because it competes with audiobooks. Presumably Apple wants to sell audiobooks and also doesn't want to piss off publishers. Kindle provides a (mediocre) text-to-speech feature, but only for books where publishers have consented to it. Apple's strategy seems to be to have publishers (rather than readers) use their AI-powered digital narration service: https://authors.apple.com/support/4519-digital-narration-aud...
Always nice to see accessibility features. I would love to see a way to share settings. There are so many hidden features, and iOS Settings is getting crowded and very difficult to navigate. A "simplified" mode where only a few selected apps are available, like the one shown in the press release, would be nice even for people who aren't particularly challenged.
Off-topic: when you work with multiple monitors, wouldn't you like some sort of eye-tracking mechanism that could identify what window you're looking at, and immediately shift the focus to that?
I have several consoles open on different monitors, and sometimes I accidentally run commands in the wrong one because focus is somewhere else :/
I’d love to learn about Apple’s Accessibility design process, as their features are much more advanced than other platforms. As in what sort of user research, audience research is done; how they decide on which communities to support, and which features to build.
That assistive access home screen is essentially what I had whittled my parents phones down to. But they would inadvertently swipe in some direction and get stuck somewhere and become confused. Big thanks to Apple for this.
Frankly, I think Assistive Access could be great for a lot of boomer parents/grandparents. My dad is unwilling to learn how to use a smartphone and this could be sufficiently approachable to let him overcome his fear of it.
They would enable 10x more people if they allowed their assistant to work with third-party speech recognition services that support the languages of the remaining 60% of the world's population...
I guess I'm happy about this... but I find it kind of sad too. Apple used to be the minimalist, clean champions of "it just works". Now so many people are getting lost in the gestures, repainting/caching, warring apps, and incompatible standards that this paring down is increasingly necessary.
It strikes me as something of a halfway point to a phone version of CarPlay: a safety feature that needs immediate investment. The Car Focus is overly prescriptive, and Focus in general doesn't make my phone really safe in the car. And DO NOT get me started on Siri.
I could rant forever, so don't let that take away from the meat of this announcement.
Can I read an ebook to myself using personal voice? That’s what I really want. Or have my dad train his voice and listen to him read me books whenever I like?
“Apple has announced new accessibility features for its devices, including Live Speech, Personal Voice, and Point and Speak in Magnifier. These updates will be available later this year and will help people with cognitive, vision, hearing, and mobility disabilities. Live Speech will read aloud text on the screen, while Personal Voice will allow users to create a custom voice for their device. Point and Speak in Magnifier will provide a spoken description of what is being pointed at. These new features are part of Apple's ongoing commitment to making its products accessible to everyone.”
It's not unusual for Apple to build the pieces it needs for "moonshot" efforts (e.g. a search engine good enough to replace Google) over the course of many years. Apple likely started thinking about this over a decade ago, and you can see precursors in things like Siri suggestions and search, App Store and Apple Music search, Apple Photos search, etc.
I 100% believe that Apple is putting real effort into co-design of these features in a way that other similarly positioned companies do not.
As someone with a disability, these features, even the ones that do not cater to my disability, speak to me in a much more direct way than the typical "let's guess what disabled people want" bucket of accessibility features.
I've just set a white wallpaper on my iPad and the clock and date are displayed as black on white. With my dark wallpaper the clock and date are displayed inverted, in white. This works in dark or light mode, and with background dimming on or off.
On the same page is a Zoom option that gives you a magnifying pane you can move around, and I know there's a way to increase the system font size as well.