
Out of curiosity, I took a look at how the MinimumBytes (actually MinimumCount) field is used by the Windows SMB server. Interestingly, it fails with STATUS_END_OF_FILE if the actual bytes read is less than MinimumCount, which suggests to me that this is supposed to be a minimum on the (remaining) file length, not on the number of bytes that the server is able to return at the moment.
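To make that reading concrete, here is a minimal sketch of the check as I understand it from the description above. It is not the Windows implementation: `complete_read` and its in-memory `file_data` argument are hypothetical, and only the two NTSTATUS values are taken from the public status code definitions.

```python
from typing import Tuple

STATUS_SUCCESS = 0x00000000
STATUS_END_OF_FILE = 0xC0000011


def complete_read(file_data: bytes, offset: int, length: int,
                  minimum_count: int) -> Tuple[int, bytes]:
    """Hypothetical server-side read completion (illustration only).

    Returns (status, data). The read fails with STATUS_END_OF_FILE when
    fewer than `minimum_count` bytes are available at `offset`.
    """
    data = file_data[offset:offset + length]
    if len(data) < minimum_count:
        return STATUS_END_OF_FILE, b""
    return STATUS_SUCCESS, data
```

With `minimum_count` set to 0 a short read near EOF succeeds. With a larger value the same read fails outright, which is why the field looks like a bound on the remaining file length.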

I can't find any history of MinimumCount being used in the RTM version of any Windows SMB client, so without deeper archeology the reason this field was introduced remains a mystery to me.

Regardless, I agree that the client should validate the returned byte count. But (only having thought about this briefly), I do not think a client should retry in this case--it seems to me if the client sees a short read, it can assume that the read was short because the read reached EOF (which may have changed since the file's length was queried).



Sorry to keep laboring the point :-) but the other reason I'm pretty sure this is a client bug is that the client doesn't truncate the returned file at the end of the short read, which you'd expect if it actually was treating a short read as EOF.

If you copy a 100MB file and the server returns a short read somewhere in the middle of the read stream, the file size on the client is still reported as 100MB. That means file corruption, as the data in the client copy isn't the same as what was on the server.

That's how this ended up getting reported to us in the first place.


Yes, that's a good point. I agree that there appears to be a client bug here. From a quick glance, it appears that nothing is checking that the non-final blocks in a pipelined read are returned from the server in full.

I don't necessarily agree that retry is the right behavior though. Wouldn't that result in an extra round trip in the actual EOF case? Again, not having thought about this much, it seems a more efficient interpretation of the spec is that truncated reads indicate EOF. In that case, a truncated read in the middle of a pipelined operation indicates either that the file's EOF is moving concurrently with the operation (in which case stopping at the initial truncation would be valid) or that the lease has been violated.

Regardless, I work on SMB-related things only peripherally, so I do not represent the SMB team's point of view on this. Please do follow up with them.


It's only an extra round trip in the case of an unexpected EOF. File size is returned from SMB2_CREATE, so given the default of an RWH lease, then (a) the lease can't be violated (if it is, then all bets are off, as the server let someone modify your leased file outside the terms of the lease), or (b) you know the file size, so a short read where you overlap the actual EOF is expected and you can plan for it.

A short read in the middle of what you expect to be a continuous stream of bytes should be treated as some sort of server IO exception (which it is). An extra round trip to fetch the missing bytes isn't so onerous: it returns either 0, meaning EOF and something truncated the file, or an error such as EIO, meaning you got a hardware error.

After all, this is a very exceptional case. Both Steve's Linux cifsfs client and libsmbclient have been coded up around these semantics (re-fetching missing bytes to detect unexpected EOF or server error) and I'd argue this is correct client behaviour.

As I said, given the number of clients out there that have this bug we're going to have to fix it server-side anyway, but I'm surprised that this expected behavior wasn't specified and tested as part of a regression suite. It certainly is getting added to smbtorture.


Whenever a client gets a short read it needs to issue a request at the missing offset if the caller wanted more bytes. Only if the server returns zero on that read can it assume EOF and concurrent truncation.
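A minimal sketch of that loop, assuming a hypothetical `read_at(offset, length)` callable that stands in for one SMB2 READ round trip and returns whatever bytes the server sent. Neither helper here is a real client API; `make_short_reader` is only a fake server for trying it out.

```python
from typing import Callable


def read_fully(read_at: Callable[[int, int], bytes],
               offset: int, length: int) -> bytes:
    """Read `length` bytes at `offset`, re-requesting after each short read."""
    buf = bytearray()
    while len(buf) < length:
        # Ask again starting at the first missing byte.
        chunk = read_at(offset + len(buf), length - len(buf))
        if not chunk:
            break  # only a zero-byte read is treated as EOF
        buf += chunk
    return bytes(buf)


def make_short_reader(data: bytes, cap: int) -> Callable[[int, int], bytes]:
    """Hypothetical fake server that never returns more than `cap` bytes."""
    def read_at(offset: int, length: int) -> bytes:
        return data[offset:offset + min(length, cap)]
    return read_at
```

The cost is one extra request, and only when a read actually comes back short. A read that is expected to overlap EOF needs no retry if the caller already trimmed the request to the known file size.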

We're going to have to fix the Samba server to never return short reads when using io_uring because the clients with this bug are already out there. But if what you're saying is how Microsoft expects the protocol to operate then it needs to be documented in MS-SMB2 because I don't think it's specified this way at the moment.


No, the client can't assume that. Consider pipelining reads. You can asynchronously send ten 1MB reads. The server can return the data in any order. So the read sent at offset 0 could return last, after the server has already returned 9MB starting at offset 1MB onwards in the file, and this first read then returns a short read of 800k instead of 1MB.

You can't then assume that the read at offset 0 returning short means the file is now truncated to 800k and the other 9MB is no longer of use.
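A rough sketch of what the client would do instead, under the assumption that it tracks each outstanding request by offset. `find_gaps` is a hypothetical helper, and the replies are simulated in memory rather than coming from a server.

```python
from typing import Dict, List, Tuple


def find_gaps(requests: List[Tuple[int, int]],
              replies: Dict[int, bytes]) -> List[Tuple[int, int]]:
    """Return the (offset, length) ranges that were requested but not returned."""
    gaps = []
    for offset, length in requests:
        got = len(replies.get(offset, b""))
        if got < length:
            gaps.append((offset + got, length - got))
    return gaps


MB = 1 << 20
# Ten pipelined 1MB reads; arrival order does not matter here.
requests = [(i * MB, MB) for i in range(10)]
replies = {off: b"x" * n for off, n in requests}
replies[0] = b"x" * (800 * 1024)  # the read at offset 0 comes back short

gaps = find_gaps(requests, replies)
```

Here `gaps` holds a single range starting at byte 819200. That range is what gets re-requested, and the 9MB already received for the later offsets stays valid.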

Also remember you might have a complete RWH lease on the file, so you are guaranteed that there was no other writer truncating the file whilst the read is ongoing.



