If bufsize is a power of 2, doing the modulo would be exactly the same as just using a smaller integer type (like 8 bits on a microcontroller). What we want is for the integer wraparound to never coincide with the buffer position wrapping. For that we would need a size that has a factor other than 2, and thus a "real" modulo operation in the indexing (which is slow).
Or is there an error in my thinking?
edit:
Duh, I understand now. If the counter's range is at least twice the buffer size, the (wrapping, unsigned) subtraction will always give the correct number of elements used. Power of 2 or not doesn't matter.
Not having to do the subtraction could still give a performance benefit, and if the buffer elements aren't very large, one wasted slot is worth the tradeoff.
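For anyone who wants to see the idea from the edit above in code, here is a minimal sketch in C, not anyone's reference implementation. The names `ring_t`, `rb_count`, `rb_push`, `rb_pop` and the capacity `RB_SIZE` are made up for illustration. It assumes 8-bit free-running counters and a power-of-2 capacity, so indexing is a cheap mask and no slot is wasted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 8u  /* hypothetical capacity, chosen only for this sketch */

/* Assumption of this sketch: a power-of-2 size no larger than half the
   8-bit counter range, so masking and the wrapping subtraction both work. */
static_assert((RB_SIZE & (RB_SIZE - 1u)) == 0u, "RB_SIZE must be a power of 2");
static_assert(RB_SIZE <= 128u, "counter range must be at least 2 * RB_SIZE");

typedef struct {
    uint8_t buf[RB_SIZE];
    uint8_t rd;  /* free-running read counter, wraps at 256 */
    uint8_t wr;  /* free-running write counter, wraps at 256 */
} ring_t;

/* Number of elements in use. The cast back to uint8_t reduces the
   difference modulo 256, so it stays correct when wr has wrapped past rd. */
static uint8_t rb_count(const ring_t *r) {
    return (uint8_t)(r->wr - r->rd);
}

static bool rb_push(ring_t *r, uint8_t v) {
    if (rb_count(r) == RB_SIZE) {
        return false;  /* full: every slot is in use */
    }
    r->buf[r->wr & (RB_SIZE - 1u)] = v;  /* mask instead of a real modulo */
    r->wr++;
    return true;
}

static bool rb_pop(ring_t *r, uint8_t *out) {
    if (rb_count(r) == 0u) {
        return false;  /* empty */
    }
    *out = r->buf[r->rd & (RB_SIZE - 1u)];
    r->rd++;
    return true;
}
```

Starting both counters near 255 is an easy way to convince yourself that full/empty detection survives the counter wraparound.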
As with any algorithm choice, the optimal choice depends on the context.
On a desktop/laptop computer, wasting a queue slot matters very little, while achieving maximum speed may matter more, so this method does not seem preferable there.
On the other hand, in many microcontroller applications there is no read/write memory other than a few kilobytes of internal MCU RAM. So the amount of RAM used may be the most important resource, and there may be many FIFO queues for various peripheral interfaces.
For such MCU applications, it is valuable to be aware of this alternative FIFO queue implementation, as it may be the best choice.
It took me a bit to understand how TCP seqnums work in case of wraparound, but the trick is exactly that: everything works fine as long as you have less than 2^31 unack'd bytes.
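A rough sketch of that comparison in C, for illustration only. The helper name `seq_before` is made up, and the code assumes the usual two's-complement conversion from `uint32_t` to `int32_t` (implementation-defined in C, but what mainstream compilers do).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a comes before b. Only meaningful while the two
   values are less than 2^31 apart. The unsigned subtraction wraps modulo
   2^32, and reinterpreting the result as signed yields the ordering. */
static bool seq_before(uint32_t a, uint32_t b) {
    return (int32_t)(a - b) < 0;
}
```

With this, `seq_before(0xFFFFFFF0u, 0x10u)` is true even though the first value is numerically larger, because the counter has wrapped in between.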