
Not so sure that's true. This is single-threaded direct I/O doing a fio randwrite workload on a WD 850X Gen4 SSD:

    write: IOPS=18.8k, BW=73.5MiB/s (77.1MB/s)(4412MiB/60001msec); 0 zone resets
    slat (usec): min=2, max=335, avg= 3.42, stdev= 1.65
    clat (nsec): min=932, max=24868k, avg=49188.32, stdev=65291.21
     lat (usec): min=29, max=24880, avg=52.67, stdev=65.73
    clat percentiles (usec):
     |  1.00th=[   33],  5.00th=[   34], 10.00th=[   34], 20.00th=[   35],
     | 30.00th=[   37], 40.00th=[   38], 50.00th=[   40], 60.00th=[   43],
     | 70.00th=[   53], 80.00th=[   60], 90.00th=[   70], 95.00th=[   84],
     | 99.00th=[  137], 99.50th=[  174], 99.90th=[  404], 99.95th=[  652],
     | 99.99th=[ 2311]


I checked again with O_DIRECT and now I stand corrected. I didn't know that O_DIRECT could make such a huge difference. Thanks!


Oops, O_DIRECT does not actually make that big of a difference. I had updated my ad-hoc test to use O_DIRECT, but didn't check that write() now returned errors because of wrong alignment ;-)
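
For anyone who wants to avoid the same trap, here is a rough Python sketch of an O_DIRECT write that validates alignment up front. The `direct_write` helper and the 4096-byte `BLOCK` value are my own assumptions for illustration; the real alignment requirement depends on the device and filesystem. Note that in C a misaligned write() returns -1 with errno set to EINVAL, which is easy to miss, whereas Python's os.pwrite raises OSError for the same condition.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device and filesystem


def direct_write(path, data, offset=0):
    """Write data with O_DIRECT (Linux only) and return the number of bytes written."""
    # Reject misaligned requests before touching the file at all.
    if offset % BLOCK or len(data) % BLOCK:
        raise ValueError("offset and length must be multiples of BLOCK")
    buf = mmap.mmap(-1, len(data))  # anonymous mmap memory is page-aligned
    buf.write(data)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o600)
    try:
        # Raises OSError (e.g. EINVAL) if the kernel rejects the request.
        return os.pwrite(fd, buf, offset)
    finally:
        os.close(fd)
        buf.close()
```

Some filesystems (tmpfs on older kernels, for example) reject O_DIRECT at open time, so point this at a file on the drive you actually want to test.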

As mentioned in the sibling comment, syncs are still slow. My initial 1-2ms number came from a desktop I bought in 2018, to which I added an NVMe drive in an M.2 slot in 2022. On my current test system I'm seeing average latencies of around 250us, sometimes a lot more (there are fluctuations).

   # put the following in a file "fio.job" and run "fio fio.job"
   # enable either direct=1 (O_DIRECT) or fsync=1 (fsync() after each write())
   [Job1]
   #direct=1
   fsync=1
   readwrite=randwrite
   bs=64k  # size of each write()
   size=256m  # total size written
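
If you'd rather not install fio, the fsync=1 case can be approximated with a few lines of Python. This is only a sketch under my own assumptions: the function name `fsync_latencies_us` and its defaults are made up for illustration, and unlike fio it does not lay out the file first, so writes into a fresh sparse file also pay for block allocation.

```python
import os
import random
import time


def fsync_latencies_us(path, writes=200, bs=64 * 1024, size=256 * 1024 * 1024):
    """Return the latency in microseconds of each pwrite() + fsync() pair."""
    payload = os.urandom(bs)
    blocks = size // bs
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for _ in range(writes):
            offset = random.randrange(blocks) * bs  # random bs-aligned offset
            start = time.perf_counter_ns()
            os.pwrite(fd, payload, offset)
            os.fsync(fd)  # ask the kernel to flush this file to stable storage
            latencies.append((time.perf_counter_ns() - start) / 1000)
    finally:
        os.close(fd)
    return latencies
```

Call it with a path on the drive under test and pass the result to statistics.mean or statistics.median to compare against fio's clat numbers.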


Add sync=1 to your fio O_DIRECT write tests (not fsync, but sync=1) and you'll see a big difference on consumer SSDs without power loss protection for their controller buffers. It adds the FUA (force unit access) flag to the write requests to ensure your writes are persistent; O_DIRECT alone won't do that.


Random writes and fsync aren't the same thing. A single unflushed random write on a consumer SSD is extremely fast because it's not durable.


You're right. Sync writes are about seven times as slow: 331µs on average versus 49µs.

  write: IOPS=3007, BW=11.7MiB/s (12.3MB/s)(118MiB/10001msec); 0 zone resets
    clat (usec): min=196, max=23274, avg=331.13, stdev=220.25
     lat (usec): min=196, max=23275, avg=331.25, stdev=220.27
    clat percentiles (usec):
     |  1.00th=[  210],  5.00th=[  223], 10.00th=[  235], 20.00th=[  262],
     | 30.00th=[  297], 40.00th=[  318], 50.00th=[  330], 60.00th=[  343],
     | 70.00th=[  355], 80.00th=[  371], 90.00th=[  400], 95.00th=[  429],
     | 99.00th=[  523], 99.50th=[  603], 99.90th=[ 1631], 99.95th=[ 2966],
     | 99.99th=[ 8225]
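
As a sanity check on the numbers in this thread: assuming a single thread at queue depth 1 (my assumption, though the figures fit it), IOPS should be roughly the inverse of the average total latency. A quick sketch, with `qd1_iops` being a name I made up:

```python
def qd1_iops(avg_lat_us):
    """Expected IOPS when exactly one I/O is in flight at a time."""
    return 1_000_000 / avg_lat_us


print(round(qd1_iops(52.67)))     # 18986, fio reported 18.8k for the plain O_DIRECT run
print(round(qd1_iops(331.25)))    # 3019, fio reported 3007 for the sync run
print(round(331.13 / 49.19, 1))   # 6.7, ratio of the two average clat values
```

Both runs land within about 1% of what fio reported, so the throughput gap is explained entirely by per-write latency.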



