That justification is bogus, however. There's already separate logic for the case...

Dylan16807 · on Feb 10, 2019

Also USB bus resets are not unheard of. Or moving devices from one port to another. If the device comes back within a minute or two you probably shouldn't throw out those writes.

0x0 · on Feb 10, 2019

If someone quickly pulls a USB drive, plugs it into another system, and then plugs it back into the original system, then flushing writes could cause massive data corruption if those writes are relative to an outdated idea of what's on the block device. Sounds like a misfeature to me.

zozbot123 · on Feb 10, 2019

> If someone quickly pulls a USB drive, plugs it into another system, and then plugs it back into the original system, then flushing writes could cause massive data corruption

That's user error, though. The kernel should react to removable media being pulled by sending a wall message to the appropriate session/console, stating something similar to "Please IMMEDIATELY place media [USB_LABEL] back into drive!!", with [Retry] and [Cancel] options. That way, the user knows what to expect -- OSes used to do this as a matter of course when removable media was in common use. In fact, you could even generalize this, by asking the user to introduce some specific media (identified by label) when some mountpoint is accessed, even if no operations were actually in progress.

adrianmsmith · on Feb 10, 2019

RISC OS works like you propose. If you access the path "ADFS::MyDisk.$.Foo" (that is the Advanced Disk FileSystem, disk called "MyDisk", $ is the root directory and within that the file "Foo"), the user will get a pop-up asking them to insert the disk "MyDisk" into any available disk drive, then press OK to continue the I/O operation successfully. (The user can also click Cancel in which case the I/O operation will return an error.)

You don't ever have to have interacted with the "MyDisk" disk before. Simply access it, and (by the time the disk I/O system call returns) the disk will be there (by virtue of asking the user to insert it).
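
A minimal sketch of that label-addressed behaviour, not RISC OS code: the function name, the `drives` mapping, and the `prompt` callback are all hypothetical stand-ins for the real filing system and its pop-up.

```python
from typing import Callable, Dict, Optional, Tuple

Disk = Dict[str, bytes]  # hypothetical disk model: path -> file contents


class MediaNotInserted(Exception):
    """Raised when the user cancels instead of inserting the requested disk."""


def read_from_labelled_disk(
    label: str,
    path: str,
    drives: Dict[str, Disk],
    prompt: Callable[[str], Optional[Tuple[str, Disk]]],
) -> bytes:
    """Read `path` from the disk named `label`, asking for the disk if needed.

    `drives` maps the labels of currently inserted disks to their contents.
    `prompt(label)` stands in for the pop-up: it returns the (label, disk)
    the user inserted before pressing OK, or None if they pressed Cancel.
    """
    while label not in drives:
        inserted = prompt(label)
        if inserted is None:
            # Cancel: the I/O operation returns an error.
            raise MediaNotInserted(label)
        inserted_label, disk = inserted
        # If the wrong disk was inserted, the loop simply asks again.
        drives[inserted_label] = disk
    return drives[label][path]
```

The caller never checks whether the disk is present; the lookup by label either succeeds after the user cooperates or fails with an error, which is the property described above.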

vidarh · on Feb 10, 2019

AmigaOS worked similarly. I don't remember how it'd behave if you yanked a floppy after having started a write, but certainly attempting to open a file, change working directory, or list directory contents on media that was not currently in a drive would get you a dialog box.

hyc_symas · on Feb 11, 2019

MSDOS used to do this too. And Atari GEMDOS.

adrianmsmith · on Feb 11, 2019

I don't think MSDOS did the following aspect of zozbot123's post:

> In fact, you could even generalize this, by asking the user to introduce some specific media (identified by label) when some mountpoint is accessed, even if no operations were actually in progress.

In MSDOS the paths were like "A:FOO.TXT" as far as I remember. That means you had no facility, as a program, to request the user insert a particular disk identified by a particular label.

For example, on RISC OS (but not on MSDOS) you could implement a program to copy files between two disks by simply reading from the disk with the source label and writing to the disk with the destination label. Even on a machine with a single disk drive. The OS would request the user insert the source and destination disks as appropriate.

hyc_symas · on Feb 13, 2019

The FAT partition header has always had an ID field, so the OS could request you to reinsert a removed disk and would know if the wrong one was inserted.

Also, on single-floppy systems there was a virtual B: so you could do disk to disk copies just by saying "copy A:foo B:" etc.
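
For illustration, a sketch of reading that ID field from a raw FAT boot sector. The offsets follow the commonly documented FAT boot sector layout; the function names are hypothetical, and the extended boot signature is checked first because a sector without it carries no volume ID at all.

```python
import struct
from typing import Optional


def fat_volume_id(boot_sector: bytes) -> Optional[int]:
    """Return the 32-bit volume ID from a FAT boot sector, or None if absent."""
    if len(boot_sector) < 512 or boot_sector[510:512] != b"\x55\xaa":
        raise ValueError("not a boot sector (missing 0x55AA signature)")

    # A zero 16-bit FAT size at offset 22 indicates the FAT32 layout, where
    # the extended fields sit 28 bytes further into the sector.
    (fat_size_16,) = struct.unpack_from("<H", boot_sector, 22)
    signature_offset = 66 if fat_size_16 == 0 else 38

    # 0x29 (or the older 0x28) marks the presence of the volume ID field.
    if boot_sector[signature_offset] not in (0x28, 0x29):
        return None

    (volume_id,) = struct.unpack_from("<I", boot_sector, signature_offset + 1)
    return volume_id


def format_volume_id(volume_id: int) -> str:
    """Format the ID in the XXXX-XXXX style used by DOS-style tools."""
    return f"{volume_id >> 16:04X}-{volume_id & 0xFFFF:04X}"
```

An OS could remember this value when the disk is pulled and compare it against whatever disk is inserted next before resuming any I/O.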

trasz · on Feb 10, 2019

In FreeBSD there’s an (optional) mechanism for just that: gmountver.

0x0 · on Feb 10, 2019

Disagree, it is a kernel error if it cannot gracefully deal with removable media being, you know, removed.

Dylan16807 · on Feb 10, 2019

The drive was in a corrupt state the first time it got unplugged. And it's nothing to shrug off: it might have been in the middle of rewriting a directory and lost all the contents.

So what are the odds that A) you get it back into a non-corrupt state, B) the sectors affected by finishing the write will re-corrupt it, and C) you do this in one minute?

vardump · on Feb 10, 2019

> Also USB bus resets are not unheard of.

They're initiated by the host, not by a USB device.

> Or moving devices from one port to another. If the device comes back within a minute or two you probably shouldn't throw out those writes.

This would be a nice feature. Although these writes would need to be buffered. Probably also throttled. There'd also be some risk with devices that have identical serial numbers. Some manufacturers give all of their USB disks / memory sticks the same serial number...

Dylan16807 · on Feb 10, 2019

> They're initiated by the host, not by a USB device.

Or by a power flicker. Which can be caused by plugging in other devices too.

> Although these writes would need to be buffered. Probably also throttled.

You don't necessarily have to allow new writes; the more important part is preserving writes the application thinks already happened. But that could be useful too.

> Some manufacturers give all of their USB disks / memory sticks the same serial number...

You have the partition serial number too, usually.
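
A rough sketch of the idea being debated, assuming a hypothetical `ReconnectBuffer` that keeps already-acknowledged writes for a grace period and only hands them back if a device with the same USB serial and the same partition serial reappears. None of these names correspond to a real kernel interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Write = Tuple[int, bytes]  # (sector number, data), a hypothetical write record


@dataclass(frozen=True)
class DeviceIdentity:
    usb_serial: str        # may be shared by every stick from one vendor
    partition_serial: str  # narrows the match when the USB serial is not unique


@dataclass
class PendingWrites:
    disconnected_at: float
    writes: List[Write]


class ReconnectBuffer:
    """Holds writes for vanished devices and releases them on a timely return."""

    def __init__(self, grace_seconds: float = 120.0) -> None:
        self.grace_seconds = grace_seconds
        self._pending: Dict[DeviceIdentity, PendingWrites] = {}

    def on_disconnect(self, identity: DeviceIdentity, writes: List[Write], now: float) -> None:
        # Preserve only what the application believes was already written.
        self._pending[identity] = PendingWrites(now, list(writes))

    def on_connect(self, identity: DeviceIdentity, now: float) -> List[Write]:
        """Return the writes to replay, or an empty list if none or expired."""
        entry = self._pending.pop(identity, None)
        if entry is None or now - entry.disconnected_at > self.grace_seconds:
            return []
        return entry.writes
```

Matching identifiers does not prove the contents are unchanged, so the objection raised earlier in the thread (the drive was modified on another system in the meantime) still applies to any scheme like this.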

vardump · on Feb 10, 2019

> Or by a power flicker. Which can be caused by plugging in other devices too.

That sounds pretty unlikely. Any references about this? The pulse needs to be pretty particular and the USB device needs to be powered.

Dylan16807 · on Feb 10, 2019

Personal experience, so not really.

And it doesn't have to be very particular. Bad grounding can cause lots of ports to reset.

vardump · on Feb 10, 2019

Sounds intentional, probably something sent by the USB host controller driver. I guess some hub chips might send it independently as well in some scenarios.

USB bus resets (both D+ and D- down for 10ms) are a signal for the USB device software (well, firmware) to initialize device configuration. Basically to set configuration, data toggles and stalls to their defaults.
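
As a toy model of what the device side does on reset (real firmware would be C against a vendor-specific controller, and the class and field names here are hypothetical), the three defaults mentioned above look roughly like this:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UsbDeviceState:
    """Per-device state that a bus reset returns to its defaults."""

    configuration: int = 0                                     # 0 means unconfigured
    data_toggle: Dict[int, int] = field(default_factory=dict)  # endpoint -> 0 or 1
    stalled: Dict[int, bool] = field(default_factory=dict)     # endpoint -> halted?

    def on_bus_reset(self) -> None:
        # Called once the reset condition on D+/D- has been detected.
        self.configuration = 0
        for endpoint in self.data_toggle:
            self.data_toggle[endpoint] = 0
        for endpoint in self.stalled:
            self.stalled[endpoint] = False
```

From the host's point of view the device then has to be configured again, which is why an unexpected reset in the middle of I/O looks much like a brief unplug.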

(I've written USB device firmware.)

Dylan16807 · on Feb 11, 2019

Well yes it's an intentional recovery from an error state, but the point is this can happen unexpectedly.

derefr · on Feb 10, 2019

> There'd also be some risk with devices that have identical serial numbers. Some manufacturers give all of their USB disks / memory sticks the same serial number...

On most OSes the HW serial number of the disk is now usually supplemented in the disk management logic with the GPT “Disk GUID”, if available. Most modern disks (including removable ones like USB sticks) are GPT-formatted, since they rely on filesystems like ExFAT that assume GPT formatting. And those that aren’t are effectively already on a “legacy mode” code-path (because they’re using file systems like FAT, which also doesn’t support xattrs, or many types of filenames, or disk labels containing lower-case letters...) so users already expect an incomplete feature-set from them.

Plus: SD cards, the main MBR devices still in existence, don’t even get write-buffered by any sensible OS to begin with, precisely because you’re likely to unplug them at random. So, in practice, everything that needs write-buffering (and will ever be plugged into a computer running a modern OS) does indeed have a unique disk label at some level.
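
For reference, a small sketch of pulling the GPT Disk GUID out of a raw disk image. It assumes the primary GPT header sits at LBA 1 with the documented layout (signature `EFI PART` at offset 0, Disk GUID at offset 56); the function name and the 512-byte default sector size are assumptions.

```python
import uuid
from typing import Optional

GPT_SIGNATURE = b"EFI PART"
GPT_HEADER_MIN_SIZE = 92   # size of the defined header fields
DISK_GUID_OFFSET = 56      # byte offset of the Disk GUID within the header


def gpt_disk_guid(disk_image: bytes, sector_size: int = 512) -> Optional[uuid.UUID]:
    """Return the Disk GUID from the primary GPT header, or None if not GPT."""
    header = disk_image[sector_size:sector_size + GPT_HEADER_MIN_SIZE]
    if len(header) < GPT_HEADER_MIN_SIZE or header[:8] != GPT_SIGNATURE:
        return None
    # GPT stores GUIDs in mixed-endian form, which is what bytes_le expects.
    return uuid.UUID(bytes_le=header[DISK_GUID_OFFSET:DISK_GUID_OFFSET + 16])
```

This sketch skips the header CRC check, so it is only good enough to show where the identifier lives, not to validate the disk.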

ajross · on Feb 10, 2019

> entire underlying device vanishes

And that would then fail because of the hardware layer's bugs with reporting a device disconnect correctly. I mean, if the user follows the rules and pulls the stick out of a host port or a powered hub, sure, it's likely going to work per spec. But if it's on a daisy-chained 2003-era USB2 hub connected to a cheap USB3 hub? Yeah, good luck.

anarazel · on Feb 10, 2019

Is that really justification for incurring unsignalled data loss? If that's actually common enough, count the number of uncleared errors on a per-mount basis, and shut down the filesystem if the memory pressure gets too high while significant portions of memory are taken by dirty buffers that can't be cleaned due to IO errors.
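
A sketch of that policy as a pure decision function. The `MountState` fields and both threshold values are hypothetical; nothing here reflects how any real kernel accounts for dirty pages.

```python
from dataclasses import dataclass


@dataclass
class MountState:
    uncleared_write_errors: int = 0  # write errors not yet reported or retried
    stuck_dirty_bytes: int = 0       # dirty buffers that cannot be cleaned


def should_shut_down(
    mount: MountState,
    total_memory_bytes: int,
    free_memory_bytes: int,
    low_free_fraction: float = 0.05,  # hypothetical "memory pressure" threshold
    stuck_fraction: float = 0.25,     # hypothetical "significant portion" threshold
) -> bool:
    """Shut the filesystem down only when failed writes are starving the system."""
    if mount.uncleared_write_errors == 0:
        return False
    under_pressure = free_memory_bytes < low_free_fraction * total_memory_bytes
    significant = mount.stuck_dirty_bytes >= stuck_fraction * total_memory_bytes
    return under_pressure and significant
```

Until both conditions hold, the dirty buffers stay dirty and the error stays visible, so nothing is silently dropped.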

ajross · on Feb 10, 2019

Honestly it would be simpler just to make the "mark clean on write error" behavior a tunable flag rather than try to finesse this. Having the block layer not starve the system on bad hardware as a default behavior seems correct to me.
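
The tunable could be as small as the following sketch, where the policy names and the `Page` type are made up for illustration:

```python
from dataclasses import dataclass
from enum import Enum


class WriteErrorPolicy(Enum):
    MARK_CLEAN = "mark_clean"  # drop the data so memory can be reclaimed
    KEEP_DIRTY = "keep_dirty"  # keep the data around for a later retry


@dataclass
class Page:
    dirty: bool = True
    error: bool = False


def on_write_error(page: Page, policy: WriteErrorPolicy) -> None:
    """Record the failure, then apply whichever policy the admin selected."""
    page.error = True
    if policy is WriteErrorPolicy.MARK_CLEAN:
        page.dirty = False
    # Under KEEP_DIRTY the page stays dirty and continues to occupy memory.
```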

josefx · on Feb 10, 2019

Does it always work? I had issues with encrypted filesystems that would stay mounted after the device itself disconnected and required a forced unmount before I could use them again.

josefx · on Feb 10, 2019

Did I just get downvoted for giving an example where the OS doesn't seem to handle the disconnect of the physical device at all?