> Under RAID 0, the odds are 50% that two independent reads are on the same drive.

If you have a single sequential stream, then no. You'll either have parallel reads across the two drives, or you'll have alternating reads that the aforementioned semi-smart storage system can turn into parallel reads with buffering. If you have multiple sequential streams, then it's practically going to be like random access, which you already put out of scope. So there's no relevant case where RAID-0 is worse than RAID-1 for reads.
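To put numbers on it, here's a rough Python sketch (the two-drive array, 128KB stripe size, and per-drive capacity are all assumptions): two independent random reads land on the same RAID-0 member about half the time, but a single sequential stream walks the stripes in strict alternation, which is exactly what lets read-ahead keep both drives busy.

    import random

    DRIVES = 2
    STRIPE = 128 * 1024          # assumed 128KB stripe size
    DISK_SIZE = 4 * 2**40        # assumed 4TB per member drive

    def drive_for(byte_offset):
        # RAID-0: consecutive stripes alternate across the member drives
        return (byte_offset // STRIPE) % DRIVES

    # Two independent random reads: same drive roughly 50% of the time
    trials = 100_000
    same = sum(
        drive_for(random.randrange(DISK_SIZE * DRIVES))
        == drive_for(random.randrange(DISK_SIZE * DRIVES))
        for _ in range(trials)
    )
    print(f"independent reads on same drive: {same / trials:.2%}")

    # One sequential stream: consecutive stripes alternate drives, so
    # read-ahead of a few stripes keeps both drives busy in parallel
    stripes = [drive_for(i * STRIPE) for i in range(8)]
    print("sequential stream hits drives:", stripes)   # 0,1,0,1,...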

But you know what will be worse? Dual actuator drives. Why? Because of what dragontamer (who was right) mentioned, which you overlooked: the two actuators serve disjoint sets of blocks. They even present as separate SAS LUNs[1] just like separate disks would, so you would literally still need RAID on top to make them look like one device to most of the OS and above. But here's the kicker: they still share some resources that are subject to contention - most notably the external interface. Truly separate drives duplicate those resources, enabling both better performance and better fault isolation. Doubled performance is an absolute best case which is never achieved in practice, and I say that because I've seen it. If Seagate could cite something more realistic than IOMeter they would have, but they can't because the results weren't that good.
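To make the disjoint-split point concrete, here's a toy model (the 50/50 capacity split and the sizes are assumptions, not Seagate's actual layout): a plain sequential read only ever exercises one actuator, which is why you need striping on top of the two LUNs before the second actuator helps at all.

    # Toy model of a dual-actuator drive that exposes its two actuators
    # as disjoint halves of the capacity. Numbers are illustrative only.

    CAPACITY = 16 * 2**40                # assumed 16TB drive
    HALF = CAPACITY // 2

    def actuator_for(byte_offset):
        # Each actuator owns a disjoint half of the blocks
        return 0 if byte_offset < HALF else 1

    # A sequential read near the start of the drive never exercises
    # actuator 1, so without RAID-0 on top there's no parallelism.
    touched = {actuator_for(off) for off in range(0, 2**30, 128 * 1024)}
    print("actuators used by a 1GB sequential read:", touched)   # {0}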

The only way dual actuators can really compete with separate drives is to duplicate all of the resources that change behavior based on the request stream - interfaces, controllers, etc. Basically everything but the spindle motor and some environmentals, as I already suggested now two days ago. You'd give up fault isolation, but at least you'd get the same performance. That's not what Seagate is offering, though.

[1] https://www.seagate.com/files/www-content/solutions/mach-2-m...




Since you added a huge amount since I replied, I'll make a separate reply.

> But you know what will be worse? Dual actuator drives. Why? Because of what dragontamer (who was right) mentioned, which you overlooked: the two actuators serve disjoint sets of blocks.

They don't have to do that.

I was talking about what you can do with dual actuators, not product lines that already exist.

I didn't realize how Mach.2 was designed, though. That's a shame.

> But here's the kicker: they still share some resources that are subject to contention - most notably the external interface.

Each head, even at peak transfer rate, uses less than half the bandwidth of the external interface.

So even if both of them are hitting peak rates at the same time, and the drive alternates transfers between them, things are fine. For example, let's say 128KB chunks, alternating back and forth. Those take about 0.2 milliseconds to transfer. That makes basically no difference on a hard drive.
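Rough arithmetic behind that figure, assuming a 6Gb/s SAS link with ~600MB/s usable (an optimistic assumption; real links deliver less):

    # Back-of-the-envelope for the ~0.2ms figure (link rate is an assumption)
    CHUNK = 128 * 1024            # bytes per transfer
    LINK = 600e6                  # ~6Gb/s SAS, assumed usable bytes/sec

    transfer_ms = CHUNK / LINK * 1000
    print(f"128KB transfer: {transfer_ms:.2f} ms")        # ~0.2 ms

    # Compare with mechanical latency of several milliseconds, so
    # interleaving two streams on one link adds little delay per request.
    seek_plus_rotation_ms = 8.0   # assumed average for a 7200rpm drive
    print(f"as a fraction of one seek: {transfer_ms / seek_plus_rotation_ms:.1%}")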

> Doubled performance is an absolute best case which is never achieved in practice, and I say that because I've seen it.

I completely believe you, about drives where each arm can only access half the data.

> The only way dual actuators can really compete with separate drives is to duplicate all of the resources that change behavior based on the request stream - interfaces, controllers, etc.

Or upgrade the interface to 1200MB/s (12Gb/s SAS), which isn't a very hard thing to do.


> I was talking about what you can do with dual actuators, not product lines that already exist.

Since you didn't know they're different until a moment ago, you were talking about both. Don't gaslight.

> Each head, even at peak transfer rate, uses less than half the bandwidth of the external interface.

So two will come damn close ... today. With an expectation that internal transfer rates will increase faster than standards-bound external rates. And the fact that no interface ever meets its nominal bps for a million reasons. Requests have overhead, interface chips have their own limits, signal-quality issues cause losses and retries (or step down to lower rates), etc. Lastly, request streams are never perfectly balanced except for trivial (mostly synthetic-benchmark) cases, and the drive can't do better than the request stream allows. There are so many potential bottlenecks here that any given use case is sure to hit one ... as actually seems to be the case empirically. Your theory remains theory, but facts remain facts.
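To illustrate just the balance part of that (the per-actuator rate and the skew values are made-up numbers): whichever actuator gets the larger share of the work sets the finish time, so anything short of a perfect 50/50 split lands well under 2x.

    # Illustration of why imbalance caps the speedup (numbers are assumptions)
    SINGLE = 250.0                      # assumed MB/s for one actuator

    def dual_actuator_throughput(skew):
        # skew = fraction of the work that lands on the busier actuator.
        # That actuator finishes last, so it sets the elapsed time.
        elapsed_per_mb = skew / SINGLE
        return 1.0 / elapsed_per_mb     # MB/s for the whole workload

    for skew in (0.5, 0.6, 0.7, 0.8, 1.0):
        print(f"{skew:.0%} on the busier actuator: "
              f"{dual_actuator_throughput(skew):.0f} MB/s "
              f"({dual_actuator_throughput(skew) / SINGLE:.2f}x)")
    # Only a perfect 50/50 split reaches 2x; real request streams rarely are.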


That sentence is specifically not about sequential reads.



