Thanks for your insights! Yes, real-life behavior is indeed interesting to look at, and for this purpose, two testnets are running right now (https://www.gmonads.com).
RaptorCast uses erasure coding to break a block proposal into smaller chunks with enough redundancy to tolerate losses. This means that if you receive sufficiently many chunks, you can decode the block proposal, no matter which particular chunks you received. The redundancy factor can be tweaked, but it’ll likely be >2x, to allow for networking issues and faulty/malicious nodes. Furthermore, the blockchain can make progress as long as >2/3 of the validators receive the block proposal and are honest. This means that, at least in theory, you should be able to tolerate a lot of packet loss.
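To make the any-k-of-n property concrete, here’s a toy sketch in Python. It is not Monad’s actual Raptor code (Raptor is a fountain code and needs slightly more than k chunks); this uses Reed-Solomon-style polynomial evaluation over a prime field, with made-up sizes and parameters:

```python
import random

P = 2**31 - 1  # a prime; real codes work over GF(256) or use fountain codes

def encode(data, n):
    """Treat the k data symbols as coefficients of a degree-(k-1) polynomial
    over GF(P) and evaluate it at x = 1..n; the redundancy factor is n/k."""
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(data)) % P)
            for x in range(1, n + 1)]

def decode(chunks, k):
    """Recover the k original symbols from ANY k received (x, y) chunks,
    via Lagrange interpolation of the polynomial's coefficients."""
    xs, ys = zip(*chunks[:k])
    coeffs = [0] * k
    for j in range(k):
        basis, denom = [1], 1  # build prod_{m != j} (x - x_m), low degree first
        for m in range(k):
            if m == j:
                continue
            basis = [(lo - xs[m] * hi) % P
                     for lo, hi in zip([0] + basis, basis + [0])]
            denom = denom * (xs[j] - xs[m]) % P
        scale = ys[j] * pow(denom, P - 2, P) % P  # division via Fermat inverse
        for i in range(k):
            coeffs[i] = (coeffs[i] + scale * basis[i]) % P
    return coeffs

data = [42, 7, 1337, 99]              # k = 4 source symbols (the "proposal")
chunks = encode(data, 12)             # n = 12 coded chunks -> 3x redundancy
survivors = random.sample(chunks, 4)  # only 4 arbitrary chunks get through
assert decode(survivors, 4) == data   # decodes no matter WHICH 4 arrived
```

The point is just that decoding succeeds from any sufficiently large subset of chunks, which is what makes the scheme robust to arbitrary loss patterns.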
Re throughput: Monad produces 2 blocks per second, each 2 MB in size. So even with a redundancy factor of 3x, each validator only has to send 12 MB per second.
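Spelled out as a back-of-the-envelope check (the figures are the ones above; the bit-rate conversion is mine):

```python
blocks_per_sec = 2   # Monad's block rate
block_size_mb = 2    # MB per block proposal
redundancy = 3       # 3x erasure-coding overhead

upload_mb_per_sec = blocks_per_sec * block_size_mb * redundancy
print(upload_mb_per_sec)  # 12 MB/s, i.e. roughly 96 Mbit/s of sustained upload
```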
Re backpressure: Not really an option for blockchains. If you have 100 peers and one of them is too slow, what are you going to do? If you apply backpressure to slow down consensus, you slow down the entire blockchain even though most peers are fast. There’s a recent paper about this problem: https://arxiv.org/abs/2410.22080.
What’s important is that the amount of bandwidth required per validator remains constant in RaptorCast, no matter how many validators are part of the network. And you only ever need one round trip to broadcast a block proposal, as opposed to gossip protocols, which may involve more hops and have higher latency.
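Here’s a rough model of why the per-validator upload stays flat as the validator set grows. The assumptions are mine (equal chunk shares, every first-hop validator forwards its 1/n share of the coded chunks to the other n-1 peers, no retransmissions); the real protocol may distribute chunks differently, e.g. weighted by stake:

```python
def per_validator_upload_mb(n, block_mb=2.0, redundancy=3.0):
    """Two-hop broadcast model: the leader splits the redundancy-coded chunks
    evenly across n validators; each validator forwards its 1/n share to the
    other n - 1 peers, so its upload per block is ~block_mb * redundancy,
    independent of n."""
    share_mb = block_mb * redundancy / n  # the chunks this validator relays
    return share_mb * (n - 1)             # sent once to each other peer

for n in (50, 100, 1_000, 10_000):
    print(n, round(per_validator_upload_mb(n), 2))
# 50 -> 5.88, 100 -> 5.94, 1000 -> 5.99, 10000 -> 6.0 MB per block:
# flat in n, and at 2 blocks/s that's the ~12 MB/s figure from above.
```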
> If you have 100 peers and one of them is too slow, what are you going to do?
Detect them and remove them from the set, so that if the bottleneck is upstream of them, you relax pressure on switches etc. that may be shared with nearby nodes, reducing the chance that two of them suddenly become too slow...
> The redundancy factor can be tweaked, but it’ll likely be >2x
If your packet loss is due to your traffic overwhelming a queue at some intermediate hop, sending more redundant packets would aggravate the problem instead of solving it.
Are you running this on top of something providing congestion control?