If I were asked this question, the first thing I'd say is that this is a poorly designed architecture. The client is a poor place to do throttling on its own. It has no information about the aggregate load on the system. It has to make assumptions, and those lead to the complicated code you see in the sample. There are better, more robust ways to do flow control and throttling.
The intro isn't throttling, it's request serialization. There's no rate limit to keep your requests under, just that it's one at a time. It can go as fast or as slow as the individual requests finish.
It's still not a great architecture, but it's different from throttling.
> But that server is faulty!! If it has to handle multiple requests at once, it starts to break down. So, we decide to make our server's life easier by trying to ensure, from the client, that it doesn't ever have to handle more than one request at once
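
To make the distinction concrete, here's a rough sketch of what client-side serialization could look like. This is Python with `asyncio`, not the article's code, and `SerializedClient` and `send` are made-up names for illustration. The only guarantee is that at most one request is in flight. There is no rate cap anywhere.

```python
import asyncio
from typing import Any, Awaitable, Callable


class SerializedClient:
    """Hypothetical wrapper: lets only one request be in flight at a time."""

    def __init__(self, send: Callable[[Any], Awaitable[Any]]) -> None:
        # `send` is an assumed coroutine function that performs the real request.
        self._send = send
        # A lock with a single holder is what serializes the calls.
        self._lock = asyncio.Lock()

    async def request(self, payload: Any) -> Any:
        # Callers wait here until the previous request has finished.
        # Nothing limits requests per second: fast responses mean
        # high throughput, slow responses mean low throughput.
        async with self._lock:
            return await self._send(payload)
```

Create the client inside a running event loop and call `await client.request(x)` from as many tasks as you like; they still reach the server one at a time. A throttle would be a different thing: it would cap requests per time window regardless of how quickly each one completes.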