Yes, fairly limited impact, but still. Data loss is fucking important. We aren't talking about an occasional stutter when moving windows or a sound driver that sometimes needs a reboot to resume working. This is a major bug. I would really really like it if Apple would get their shit together so that people who actually rely on their computers to work correctly can upgrade at some point.
As a counterpoint, APFS was deployed to almost a billion devices over the span of several months when it was released and, IIRC, this is the first major issue.
How do you know that? Filesystem corruption is frequently silent, and every time it happens customers don't get on the phone and send the disks to Apple so that they can root cause the problem. It's quite possible this bug has happened an untold number of times before it happened to someone who went through the effort to reproduce and isolate it.
Also, I'd guess the use pattern on iOS is rather different from, and more homogeneous than, the use pattern on macOS. I don't think these millions of devices really give Apple good code coverage.
My wife's laptop suddenly decided that the boot drive was corrupt a few weeks after she updated to APFS. None of the recovery tools were of any use. We had to reinstall the OS and pull the files from a backup. This story did not make it to Hacker News.
If you have more information about the problem you encountered and how it implicates/interacts with APFS, please do link to it. Otherwise, bug reports via circumstantial evidence are, while not inherently false, certainly suspect.
You missed the point. The anecdote was to illustrate how even power users might be working around filesystem bugs, so a lack of bug reports specifically mentioning APFS is certainly not proof that there aren't problems.
It's also quite hard to report fs issues. One day I ended up with a non-working APFS system. Boot was OK, but I couldn't mount the user partition. The APFS repair tool just failed and made the system hang. After a number of restarts, attempts at repair, and attempts to move the partition somewhere it could be decrypted, everything started working. And I actually had enough experience to try to debug/fix it; many people would end up wiping the system or having to go to an Apple store.
This is not reportable. I got only a generic error or a hanging system. I can't reproduce it. I don't know why it started or why it stopped. Yet it was almost certainly an APFS issue.
Even if I wanted to play, my priority was to get the work laptop usable again.
That's reasonable. I assumed that it was an assertion of evidence of an issue, not an example of how issues might theoretically go unnoticed. Upon rereading it does not appear that either is implied.
I'm not sure they're implying that it -was- an APFS issue, just that in the majority of circumstances users won't go through the same level of effort to diagnose an issue as in the article. Instead of pulling drives and trying to reproduce the error, they just wipe the drive and start fresh.
I could be wrong, but I believe the point is not that it did happen, but that this -could- have happened many times in the past and users just format/re-install without thinking about it.
The hardware is likely being blamed for a lot of failures that are software related. It's almost certain there is a software problem in cases where reinstalling the machine fixes the problem. A random machine which won't boot due to disk/filesystem failures could be a hardware issue, but that is pretty much ruled out if reinstalling/reformatting doesn't immediately result in further failures. Bit rot, stuck bits, and bad links are a thing, but they generally show up as massive soft error correction long before reaching the point where a sector simply can't be read, and when that happens the OS will almost always tell you that the sector can't be read rather than giving you garbage data.
That is because the likelihood of undetected hardware failures, given the layers and layers of ECC on the disks, links, etc., manifesting as filesystem metadata failures rather than garbage in the middle of video/image/document streams is really low. The more likely case is the machine's performance degrading due to read retries/ECC correction/retransmission, so the machine appears to have severe performance issues long before it gets to silent data corruption sufficient to eat the filesystem structure. (It's a fun exercise to intentionally flip a few random bits on a hard-drive image (or in RAM) and see if/when they are detected.)
So yes, the first thing I think when I hear filesystem corruption is BUG! That is what the experience of tracking down a number of incidents in a large data storage application a few years ago taught me.
to be fair, BGA solder joint issues are ridiculously common on that model. to the point that, as someone who's serviced phones for years, i warn people away from them even if they're dirt cheap
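For anyone who wants to try the bit-flipping exercise, here is a minimal sketch of one way to do it in Python. Everything in it is an assumption for illustration: the "disk image" is just random bytes in memory, the 512-byte block size is arbitrary, SHA-256 is a stand-in for whatever checksum a real filesystem or drive uses, and the helper names (`checksum_blocks`, `flip_random_bits`, `find_bad_blocks`) are made up. It says nothing about how APFS or any drive firmware actually detects errors.

```python
import hashlib
import random

BLOCK_SIZE = 512  # bytes per block; arbitrary choice for this sketch


def checksum_blocks(image: bytes) -> list:
    """Return one SHA-256 digest per BLOCK_SIZE-byte block of the image."""
    return [
        hashlib.sha256(image[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(image), BLOCK_SIZE)
    ]


def flip_random_bits(image: bytes, count: int, rng: random.Random):
    """Flip `count` distinct bits; return (damaged image, set of blocks hit)."""
    damaged = bytearray(image)
    hit_blocks = set()
    # sample() picks distinct bit positions, so no flip cancels another
    for bit in rng.sample(range(len(image) * 8), count):
        damaged[bit // 8] ^= 1 << (bit % 8)
        hit_blocks.add(bit // 8 // BLOCK_SIZE)
    return bytes(damaged), hit_blocks


def find_bad_blocks(image: bytes, expected: list) -> set:
    """Return indices of blocks whose digest no longer matches the stored one."""
    current = checksum_blocks(image)
    return {i for i, digest in enumerate(current) if digest != expected[i]}


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-in for a real disk image: 64 blocks of random bytes
    image = bytes(rng.getrandbits(8) for _ in range(64 * BLOCK_SIZE))
    stored = checksum_blocks(image)
    damaged, hit = flip_random_bits(image, 5, rng)
    print("blocks corrupted:", sorted(hit))
    print("blocks detected: ", sorted(find_bad_blocks(damaged, stored)))
```

With per-block checksums every flipped block gets caught. The more interesting experiment is to drop the checksum layer and see how long a tool that only reads the data takes to notice.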
The rest of the post after the question mark you stopped reading at:
> Filesystem corruption is frequently silent, and every time it happens customers don't get on the phone and send the disks to Apple so that they can root cause the problem. It's quite possible this bug has happened an untold number of times before it happened to someone who went through the effort to reproduce and isolate it.
We know that because it was deployed to almost a billion devices without a problem. If there was a problem that affected even a fraction of a percent of people, it would have been all over the news given how many devices that is.
Edit: Thanks for the downvotes. If you disagree, please tell me why. Apple's deployment of APFS to iPhones was so flawless, most people probably still don't even know they did it.
Idk, most of the people in my life could probably lose data on their phone and never realize it. Especially if we’re talking about people with lots of duplicates of the same selfie, for instance.
You’re wrong. I’m explaining why deploying to a billion devices and seeing no public outcry means there wasn’t a problem, because the parent comment is trying to claim that maybe there was a problem and people just didn’t report it. My point is that with a billion devices, even rare edge cases end up reported in the media because rare edge cases still hit so many people that it makes it look like a widespread issue.
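To put rough numbers on the scale argument, here is a back-of-the-envelope sketch. The device count is the "almost a billion" figure from upthread, rounded up; the failure rate and the share of affected users who notice and report are purely hypothetical inputs, not measurements, and the function name is made up.

```python
def expected_counts(devices: int, fails_one_in: int, reports_one_in: int):
    """Return (affected devices, reported cases) using integer arithmetic.

    fails_one_in:   hypothetical rate, one device in N hits the bug
    reports_one_in: hypothetical rate, one affected user in N notices and reports
    """
    affected = devices // fails_one_in
    reported = affected // reports_one_in
    return affected, reported


if __name__ == "__main__":
    devices = 1_000_000_000  # "almost a billion", rounded up
    # Try a few made-up scenarios to see how the two arguments trade off
    for fails, reports in [(10_000, 1), (10_000, 1_000), (1_000_000, 1_000)]:
        affected, reported = expected_counts(devices, fails, reports)
        print(f"1 in {fails} fail, 1 in {reports} report: "
              f"{affected} affected, {reported} reported")
```

Both sides of the argument above fit in these two knobs: a large fleet multiplies rare failures into big absolute numbers, while a low notice-and-report rate divides them right back down.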
For a sufficiently narrow definition of "major issue" that excludes:
- the encryption password hint leak ("that was Disk Utility, not APFS")
- APFS volume erasure issues ("also Disk Utility")
- Adobe, Unity (editor and games built with it), Steam, Source Engine, and FCPX crash, performance, and asset loss issues on APFS volumes, all of which went away when moved to HFS+ volumes ("those teams should have adapted their software during beta")
- performance and incompatibility issues with spinning-disk drives ("platters are bad, APFS is designed for SSDs")
- RAID kernel panics, even on supported RAID 0 configurations ("that's corecrypto, not APFS")
File systems fall into the category of software where a bug can have disastrous consequences. Even if the probability of a bug is small, the magnitude of the consequence means that the overall risk is still high. And the current quality of software coming from Apple is so bad that the probability is not low.
For myself, I'm not letting APFS near my systems for at least a couple more years.
I'd point out it was deployed to iPhones - devices where the underlying physical disk won't get smaller so this bug would not have appeared.
iOS devices are extremely constrained in a number of ways that macOS isn't - who knows how many other bugs have failed to surface because Apple thought their iOS test was a 'job done' moment.
Some of the upgrades to High Sierra in my company failed and left the laptop unbootable. Unfortunately, I wasn't involved in the repair, so I have only been told it was due to the APFS conversion and that the solution was to wipe and install.
For people doing enterprise work and backups it's been a nightmare - here's one backup vendor that's been tracking issues with high RAM+CPU usage for almost 2 years now [1]. Early on, if data reached over 2.0 TB, it would silently corrupt on certain cluster sizes and when deduplication was enabled. [2] Per the Veeam thread, the "fix" for [1] is only preventative, meaning that currently affected volumes will need to be reformatted entirely.
This doesn't excuse the APFS goofs, but silent data corruption and grinding servers to a halt just writing data to the system are pretty major show stoppers, never mind that ReFS can't be used for a host of every-day operations (i.e., it's a storage level solution, not really an every-day-driver style File System).
ReFS was only ever made the default on new installs of Windows Server 2012. It never actually made it to production builds of Windows 8 or above, so it's actually only installed on a fraction of the systems that are out there. That's not really a sufficient sample size to say that this deployment went along much better. The APFS update was a much, much larger endeavor and, based only on public response, was nearly seamless. This is the first major issue I've heard about with regard to APFS.
1. I had a FileVault-related corruption issue: the disk was eating itself up thinking it was encrypting... I don't have the Apple discussion link at hand.
2. Time Machine hidden snapshots and "disk full" issues. It's a major PITA for me that it is not possible to turn off local snapshots.
I updated on day one and had the usual Apple problems (Finder broke so I had to use my terminal, sometimes it wouldn't wake up from sleep so I had to force it to reboot, my fans would spin up for seemingly no reason). Most of the issues were fixed a week or so later.
Anyway, I recently switched to Arch on a 2018 LG Gram and I'm not really missing anything. Battery life is great (8-12 hours of Firefox) and it has a quad core x64 processor for non-browser things.
Yeah, same. I couldn't stomach the port situation on the new MacBooks (and the High Sierra issues), so I ended up with a 4th gen Lenovo Carbon X1. Wiped Windows 10 and installed Mint. The whole process was dead simple and I couldn't be happier with it.
Thanks for mentioning the LG Gram. I actually didn't know LG sold laptops (they're not available in Canada) before your comment. I just ordered one online now! I insta-purchased it when I noticed the 2018 model still has USB 3 Type-A and HDMI ports. Ctrl-key in the corner instead of Fn-key and equal sized arrow keys are a bonus.
Fits my needs as a MacBook Pro replacement. Will be running a Linux desktop on it too. Probably elementary OS.
Or just stop using macOS. I wish I could give you my Debian ThinkPad for a week. Granted, I've put some time into tweaking it because the stock GNOME 3 rounded matte shit is shit, but I bet you'd convert. The only issue is the screen aspect ratio; Apple still wins there.
The first thing I do with a new PC laptop, after making sure Bluetooth and the other hardware all works in Windows, is to wipe it and start installing Linux. Lately I've just cloned my Gentoo install that I've maintained since 2012.
I keep around one Win10 laptop for gaming, but I prefer Linux for any real development work.