
I don't see how you need to wait. It's still an eventually consistent system. Any client can build up their own replica of operations and synchronize with the central server at any time.

I'm not sure I see the additional complexity. With a central server, you get to enforce a total ordering automatically across all operations.

In any case, what I'm looking for is actual data. I understand intuitively that a central server won't scale as well, but that's not particularly interesting to me. For example, what if my throughput requirements are very low? I want to know the breaking point. So I guess I'm looking for an experience report from someone who has actually gone through this process.




Operations need some sort of version in order to synchronize, and those versions are also where the ordering comes from. This matters because networks are asynchronous: messages can arrive at a central server in a different order than they were sent. So the way to get ordering from a central server while still being able to synchronize operations is to ask that server and wait for its decision, which is neither simple nor really usable. The simplest and most naive alternative is to use timestamps as versions, with no central server at all. But physical time can't really be synchronized, so logical clocks should be used instead: Lamport timestamps, vector clocks.

(And yes, I've been through this)
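For concreteness, here's a minimal Lamport-clock sketch (my illustration, not something from the thread; the (counter, client_id) versioning scheme is an assumption):

    class LamportClock:
        # A logical clock: versions order operations without relying on
        # synchronized wall-clock time.
        def __init__(self, client_id):
            self.client_id = client_id
            self.counter = 0

        def tick(self):
            # Call before generating a local operation.
            self.counter += 1
            return (self.counter, self.client_id)

        def observe(self, remote_counter):
            # Call on receiving a remote operation's timestamp.
            self.counter = max(self.counter, remote_counter) + 1

(counter, client_id) pairs compare lexicographically, so concurrent operations get a deterministic total order with ties broken by client id.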


If you have a single logical server, then it chooses the global ordering, but a client doesn't need to ask the server how to order operations it generated locally: it already knows their relative order, so it can pick one itself.

A client could be disconnected from the network for hours and re-synchronize at some later point in time and it all would work just fine.

The only times the client needs to connect to the server are when it wants to publish updates or when it wants to receive them. If you only need to do that once every 10 seconds (or even once a second), does that make a client/server model workable?


Ok, let's think: if a single client only orders operations locally and the server assigns the global ordering, then a retransmit (in the case where the client never received an acknowledgement from the server) will be assigned a global ordering again, as if it were a new batch of data. That breaks consistency. If we want to keep consistency and still be able to synchronize, the client needs to hold this global ordering too.

To make it work that way, a client has to ask the server for a starting point before it can generate operations, and then increment it with each new operation. But then it doesn't sound like the server is doing the ordering, and it starts to look a lot like Lamport timestamps.
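One standard way to reconcile the two (my sketch, not something the thread specifies): the client tags each operation with a stable (client_id, seq) pair and the server deduplicates on it, so a retransmitted batch is recognized instead of being ordered again:

    class Server:
        def __init__(self):
            self.log = []    # the single global ordering: (client_id, seq, payload)
            self.seen = {}   # client_id -> highest seq already appended

        def append(self, client_id, ops):
            # ops is a list of (seq, payload); seq is assigned by the
            # client and increases without gaps.
            for seq, payload in ops:
                if seq <= self.seen.get(client_id, 0):
                    continue  # duplicate from a retransmit: already ordered
                self.seen[client_id] = seq
                self.log.append((client_id, seq, payload))

The server still decides the global order; the per-client seq only serves to make retransmits idempotent.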


Maybe think about it this way: a client has two sets of operations. One set reflects what it has received from the server; the other holds the operations the client itself has generated but for which it has not yet received confirmation from the server that they have been assimilated into the global ordering (or perhaps the client simply hasn't sent them yet). We also have a guarantee that if a client sends a batch of operations, the server will not reorder those operations relative to one another.

For the client, the only real choice they have (until they hear otherwise) is to assume that all local operations come after all server operations. At some point, the server will send the client updates with new operations from other clients, or perhaps even confirmation that some of their previously local operations have now been recorded by the server and assigned a total ordering. The client is then responsible for updating its local state: moving the local operations into the server operation list and updating the server operation list to contain any additional operations reported by the server. The client must then re-materialize any data structures dependent on these operations, but depending on the nature of your operations, this might be possible to do efficiently.

Given enough time in which every client is idle, each client's set of local operations becomes empty as they all move into the server's set. All clients then have the same set of operations in the same order, which gives you consistency.
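A rough sketch of that two-set scheme (the names and the (client_id, seq, payload) log representation are my assumptions, matching the server sketch above):

    class Client:
        def __init__(self, client_id):
            self.client_id = client_id
            self.seq = 0          # per-client sequence number, never reused
            self.confirmed = []   # ops in the server's total order
            self.pending = []     # local (seq, payload) not yet acknowledged

        def generate(self, payload):
            # Local ops are provisionally ordered after everything confirmed.
            self.seq += 1
            self.pending.append((self.seq, payload))

        def apply_server_update(self, server_log):
            # Adopt the server's authoritative order and drop any pending
            # ops the server has now assimilated into it.
            self.confirmed = list(server_log)
            acked = {seq for cid, seq, _ in server_log if cid == self.client_id}
            self.pending = [(s, p) for (s, p) in self.pending if s not in acked]
            self.rematerialize()

        def view(self):
            # Effective history: confirmed ops first, then provisional local ops.
            return self.confirmed + [(self.client_id, s, p) for (s, p) in self.pending]

        def rematerialize(self):
            pass  # rebuild derived data structures from self.view()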


Well, communication is still unreliable, and you really do have to use proper distributed algorithms to handle that if you want to keep consistency; there is no way around it.


It seems like a pretty simple solution works in this particular design space: the client keeps asking for updates and keeps pushing updates to the central server.
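Sketched against the hypothetical Client/Server above (the interval and error handling are my assumptions):

    import time

    def sync_loop(client, server, interval=1.0):
        # Keep pushing local ops and pulling the global log. Retries are
        # safe because the server deduplicates by (client_id, seq).
        while True:
            try:
                server.append(client.client_id, client.pending)
                client.apply_server_update(server.log)
            except ConnectionError:
                pass  # offline: keep accumulating local ops, retry later
            time.sleep(interval)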


I think we are talking past each other. It is simple to synchronize things, very simple, but only without centralized coordination. In distributed systems, any coordination complicates things a lot, and I don't think it's even possible to build a simpler solution with coordination when one exists without it.


Google Wave is an example of what works well with a central server: you have many documents that can be sharded across many instances, because each document only needs to be consistent with itself. Your throughput limit is therefore per-document, and that is more likely to be bounded by the number of participants who can practically interact in a single document.

There is still the issue of a node becoming hot because an unusually active document lives on it, which would usually not happen in a completely decentralized model.
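A sketch of that per-document routing (the function name is mine; any stable hash works):

    import hashlib

    def shard_for(doc_id, n_shards):
        # Each document lives on exactly one instance, so ordering is only
        # enforced per document and throughput scales with the shard count.
        digest = hashlib.sha256(doc_id.encode()).digest()
        return int.from_bytes(digest[:8], "big") % n_shards

The hot-node caveat follows directly: this pins a busy document to one instance no matter how active it gets.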


This is a really good point, thanks!



