Practically, the network jitter is averaged out in the clock synchronization calculations, and even output latency is remarkably well-behaved. Have you tried it on different devices? It is only noticeable when there's an external device connected to the computer.
Yeah the threshold is pretty brutal, but it is enough. Experimentally, I'd say you need under 2-3ms but even at 1ms you can start to hear some phase differences.
Most of the time, I think my synchronization algorithm is actually sub-1ms, but it can get worse under unstable network conditions.
I was wondering that too. It’s an impressive demo when used on devices with low-latency audio drivers, but I’m not convinced there’s any ability to detect drift beyond this. Might be interesting to have an option to use microphones to detect and calibrate this… but then you have the same issue of an unknown delay on the microphone input too.
First, I do clock synchronization with a central server so that all clients can agree on a time reference.
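For anyone curious what that step can look like, here is a minimal sketch of an NTP-style offset estimate. The thread doesn't say which algorithm is actually used, so the four-timestamp exchange, the `SyncSample` shape, and the "keep the lowest round-trip samples, then average" filter are all assumptions for illustration.

```typescript
// One request/response exchange with the time server (hypothetical shape).
interface SyncSample {
  t0: number; // client clock when the request was sent (ms)
  t1: number; // server clock when the request arrived (ms)
  t2: number; // server clock when the reply was sent (ms)
  t3: number; // client clock when the reply arrived (ms)
}

// Time spent on the network, excluding server processing time.
function roundTrip(s: SyncSample): number {
  return (s.t3 - s.t0) - (s.t2 - s.t1);
}

// Estimated (server clock - client clock), assuming a symmetric path.
function offsetOf(s: SyncSample): number {
  return ((s.t1 - s.t0) + (s.t2 - s.t3)) / 2;
}

// Average the offsets of the `keep` samples with the smallest round trip.
// Low round-trip samples are the least affected by jitter and asymmetry.
function estimateOffset(samples: SyncSample[], keep = 5): number {
  if (samples.length === 0) throw new Error("need at least one sample");
  const best = [...samples]
    .sort((a, b) => roundTrip(a) - roundTrip(b))
    .slice(0, keep);
  return best.reduce((sum, s) => sum + offsetOf(s), 0) / best.length;
}
```

A client would then treat `clientNow + estimateOffset(samples)` as the shared server time. The symmetric-path assumption is where the residual error comes from: an asymmetric route biases the estimate by half the difference between the two one-way delays.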
Then, instead of directly manipulating the hardware audio ring buffers (which browsers don't allow), I use the Web Audio API's scheduling system to play audio in the future at a specific start time, on all devices.
So a central server relays messages from clients, telling them when to start and which sample position in the buffer to start from.
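A rough sketch of how a client might turn such a message into a Web Audio start call. The `PlayMessage` fields and the `planPlayback` helper are hypothetical names, not the actual protocol; `clockOffsetMs` is assumed to be (server clock minus client clock) from the sync step.

```typescript
// Hypothetical message relayed by the server to every client.
interface PlayMessage {
  startAtServerMs: number;  // server time at which playback should begin
  trackPositionSec: number; // position in the buffer at that moment
}

// Arguments for AudioBufferSourceNode.start(when, offset).
interface Schedule {
  when: number;   // start time on the AudioContext clock (seconds)
  offset: number; // position in the buffer to start from (seconds)
}

function planPlayback(
  msg: PlayMessage,
  clockOffsetMs: number, // server clock minus client clock
  nowClientMs: number,   // current client wall-clock time
  audioNowSec: number    // current AudioContext.currentTime
): Schedule {
  // Translate the shared start time into this client's clock.
  const startAtClientMs = msg.startAtServerMs - clockOffsetMs;
  const delaySec = (startAtClientMs - nowClientMs) / 1000;
  if (delaySec >= 0) {
    // Start is still in the future: schedule it on the audio clock.
    return { when: audioNowSec + delaySec, offset: msg.trackPositionSec };
  }
  // Message arrived late: start now and skip ahead by the time missed.
  return { when: audioNowSec, offset: msg.trackPositionSec - delaySec };
}

// In a browser (not run here), with `ctx` an AudioContext and `source`
// an AudioBufferSourceNode that already has its buffer set:
//   const plan = planPlayback(msg, clockOffsetMs, Date.now(), ctx.currentTime);
//   source.start(plan.when, plan.offset);
```

Scheduling against `currentTime` rather than firing from a timer means the start is placed on the audio clock, so timer jitter on the JavaScript thread doesn't shift when the sound begins.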
Interesting. Feels like this might still have noticeable tens-of-milliseconds latency on Windows, where the default audio drivers still have high latency. The browser may intend to play the sound at time t, but when it calls Windows's API to play the sound, I'm guessing it doesn't apply a negative time offset?
So it doesn't need to use the microphone? I guess not, going by the "works across the ocean" comment and this description. I would have thought you would listen to the mic and sync based on surrounding audio somehow, but it's good to know that it's not needed.