In general TCP just isn't great for high performance. In the film industry we used to use a commercial product, Aspera (now owned by IBM), which emulated FTP or SCP but used UDP with forward error correction (instead of TCP retransmission). You could configure it to use a specific amount of bandwidth and it would just push everything else off the network to achieve it.
I get 40 Gbit/s over a single localhost TCP stream on my 10-year-old laptop with iperf3.
So TCP does not seem to be the bottleneck if 40 Gbit/s is "high" enough, which it probably is for most people currently.
I have also seen plenty of situations in which TCP is faster than UDP in datacenters.
For example, on Hetzner Cloud VMs, iperf3 gets me 7 Gbit/s over TCP but only 1.5 Gbit/s over UDP. On Hetzner dedicated servers with 10 Gbit links, I get 10 Gbit/s over TCP but only 4.5 Gbit/s over UDP. But this could also be due to my use of iperf3 or its implementation.
I also suspect that TCP, being a protocol whose state is inspectable by the network equipment between the endpoints, allows that equipment to implement higher-performance handling, but I have not verified whether that is actually done.
Aspera was/is designed for high-latency links, i.e. sending multiple terabytes from London to New Zealand, or LA.
For that use case, Aspera was the best tool for the job. It's designed to be fast over links that a single TCP stream couldn't saturate.
You could, if you were so bold, stack up multiple TCP links and send data down those. You got the same speed, but possibly not the same efficiency. It was a fucktonne cheaper to do though.
> I get 40 Gbit/s over a single localhost TCP stream on my 10-year-old laptop with iperf3.
Do you mean literally just streaming data from one process to another on the same machine, without that data ever actually transiting a real network link? There are so many caveats to that test that it's basically worthless for evaluating what could happen on a real network.
To measure the overhead of what was claimed (TCP, the protocol, being slow), one should exclude as much as possible the other things that necessarily affect alternative protocols as well (e.g. latency), which is what this test does.
It sounds like you're starting from the assumption that any claimed slowness of TCP would be something like a fixed per-packet overhead or delay that could be isolated and added back onto the result of your local testing to get a useful prediction. And it sounds like you think alternative protocols must be equally affected by latency.
But it's much more complicated than that; TCP interacts with latency and congestion and packet loss as both cause and effect. If you're testing TCP without sending traffic over real networks that have their own buffering and congestion control and packet reordering and loss, you're going to miss all of the most important dynamics affecting real-world performance. For example, you're not going to measure how multiplexing multiple data streams onto one TCP connection allows head of line blocking to drastically inflate the impact of a lost or reordered packet, because none of that happens when all you're testing is the speed at which your kernel can context-switch packets between local processes.
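To make the head-of-line-blocking point concrete, here is a toy sketch (not a network simulation; the packet labels and timings are made up): two logical streams A and B share one TCP connection, one of A's segments is lost and only arrives via retransmission, and everything queued behind it waits even though it arrived long before.

```python
# Toy head-of-line-blocking illustration: packets are listed in TCP sequence
# (send) order with hypothetical arrival times in ms. A2 is "lost" and only
# shows up via retransmission at t=50. TCP hands data to the application
# strictly in sequence order, so B2/A3/B3 are held back behind A2 even
# though they arrived almost immediately.
packets = [("A1", 0), ("B1", 1), ("A2", 50), ("B2", 2), ("A3", 3), ("B3", 4)]

ready = 0
for label, arrived in packets:
    ready = max(ready, arrived)  # cannot deliver past a missing segment
    print(f"{label}: arrived {arrived:>2d} ms, delivered {ready:>2d} ms")
```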
And all of that is without even beginning to touch on what happens to wireless networks.
Somebody claimed that TCP isn't high performance without specifying what that means; I gave a counterexample of just how high-performance TCP is, picking some arbitrary notion of "high performance".
Almost like it makes the point that arguing about "high performance" is useless without saying what that means.
That said:
> you're not going to measure how multiplexing multiple data streams onto one TCP connection
Of course not: When I want to argue against "TCP is not a high performance protocol", why would I want to measure some other protocol that multiplexes connections over TCP? That is not measuring the performance of TCP.
I could conjure any protocol that requires acknowledgement from the other side for each emitted packet before sending the next, and then claim "UDP is not high performance" when running that over UDP - that doesn't make sense.
UDP by itself cannot be used to transfer files or any other kind of data with a size bigger than an IP packet.
So it is impossible to compare the performance of TCP and UDP.
UDP is used to implement various other protocols, whose performance can be compared with TCP. Any protocol implemented over UDP must have a performance better than TCP, at least in some specific scenarios, otherwise there would be no reason for its existence.
I do not know how UDP is used by iperf3, but perhaps it uses some protocol akin to TFTP, i.e. it sends a new UDP packet when the other side acknowledges the previous UDP packet. In that case the speed of iperf3 over UDP will always be inferior to that of TCP.
Sending UDP packets without acknowledgment will always be faster than any usable transfer protocol, but the speed in this case does not provide any information about the network, only about the speed of executing a loop on the sending computer and its network-interface card.
You can transfer data without using any transfer protocol, by just sending UDP packets at maximum rate, if you accept that a fraction of the data will be lost. The fraction that is lost can be minimized, but not eliminated, by using an error-correcting code.
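As a sketch of that last point (the destination address, port, and file name are all hypothetical), "no transfer protocol at all" really is just a send loop; anything the receiver misses is simply gone:

```python
# Blast a file as raw UDP datagrams: no acknowledgements, no retransmission.
# Each datagram carries a sequence number so the receiver can at least detect
# which pieces were lost.
import socket

DEST = ("198.51.100.7", 5001)   # hypothetical receiver
CHUNK = 1400                     # keep datagrams under a typical 1500-byte MTU

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
with open("payload.bin", "rb") as f:
    seq = 0
    while chunk := f.read(CHUNK):
        sock.sendto(seq.to_bytes(8, "big") + chunk, DEST)
        seq += 1
```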
> perhaps it [..] sends a new UDP packet when the other side acknowledges the previous UDP packet. In that case the speed of iperf3 over UDP will always be inferior to that of TCP
It does not; otherwise it would be impossible, by a factor of ~100x, to measure 4.5 Gbit/s, as per the bandwidth-delay calculation (the ping is around the usual 0.2 ms).
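A rough back-of-the-envelope check of that factor, assuming ~1500-byte packets and the 0.2 ms ping mentioned above:

```python
# Stop-and-wait (one packet in flight per round trip) over a 0.2 ms RTT link:
rtt_s = 0.2e-3
packet_bits = 1500 * 8                   # assume roughly MTU-sized packets
throughput = packet_bits / rtt_s         # bits per second
print(f"{throughput / 1e9:.2f} Gbit/s")  # ~0.06 Gbit/s, ~75x below 4.5 Gbit/s
```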
With iperf3, as with many other UDP measurement tools, you set a sending rate and the other side reports how many bytes arrived.
It has been a long time since I last used iperf3, but now that you have mentioned it, I remember this as well.
So the previous poster misinterpreted the iperf3 results by believing that UDP was slower. iperf3 cannot demonstrate a speed difference between TCP and UDP: for the former, the speed is determined by the network, while for the latter it is determined by the "--bandwidth" iperf3 command-line option. The poster probably just saw some default UDP speed.
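For reference, the two kinds of runs being compared look roughly like this (driven from Python only for convenience; the server address is hypothetical). The asymmetry is the point: TCP discovers its own rate, while the UDP test just targets whatever rate you pass with -b/--bandwidth.

```python
# TCP vs UDP iperf3 runs against a host that is running `iperf3 -s`.
import subprocess

SERVER = "192.0.2.10"  # hypothetical iperf3 server

# TCP: throughput is discovered by congestion control, no rate to choose.
subprocess.run(["iperf3", "-c", SERVER, "-t", "10"], check=True)

# UDP: you pick the sending rate yourself (here 5 Gbit/s); the report then
# tells you how much of that chosen rate actually arrived.
subprocess.run(["iperf3", "-c", SERVER, "-u", "-b", "5G", "-t", "10"], check=True)
```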
High performance means transferring files from NZ to a director's yacht in the Mediterranean with a 40Mbps satellite link and getting 40Mbps, to the point that the link is unusable for anyone else.
Can we throw a bunch of AI agents at it? This sounds like a pretty tightly defined problem, much better than wasting tokens on re-inventing web browsers.
If you strip out the swarm logic (i.e. downloading from multiple peers), you're just left with a protocol that transfers big files via chunks, so there's no reason it'd be faster than any other sort of download manager that supports multi-threaded downloads.
Aspera did the chunking and encryption for you, and it looked and acted like SFTP.
The cost of leaking data was/is catastrophic (as in company-ending), so paying a bit of money to guarantee that your data was being sent to the right place (point to point) and couldn't leak was a worthwhile tradeoff.
For point-to-point transfer, torrenting has a lot more overhead than you want. Plus, most clients have an anti-leeching setting, so you'd need not only a custom client but a custom protocol as well.
The idea is sound though: have an index file with a list of chunks to pull over multiple TCP connections.
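A minimal sketch of that idea, using HTTP range requests as the per-chunk transport (the URL, chunk size, and worker count are made up; a real tool would add retries, per-chunk checksums, and resume support):

```python
# Download one big file as independent chunks over several parallel connections.
import concurrent.futures
import urllib.request

URL = "https://example.com/big_file.bin"   # hypothetical source
CHUNK = 64 * 1024 * 1024                    # 64 MiB per request

def total_size(url):
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return int(resp.headers["Content-Length"])

def fetch(url, start, end):
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return start, resp.read()

size = total_size(URL)
ranges = [(s, min(s + CHUNK, size) - 1) for s in range(0, size, CHUNK)]

with open("big_file.bin", "wb") as out, \
     concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    for start, data in pool.map(lambda r: fetch(URL, *r), ranges):
        out.seek(start)
        out.write(data)
```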
I'm in a tiny part of the film industry. Bigger clients lend us licenses to Aspera and FileCatalyst when receiving files from them, but for our own trans-oceanic transfers I dug up an ancient program called Tsunami UDP and fixed it up just enough.
Aspera's FASP [0] is very neat. One drawback is that, since the TCP-style work isn't being done the traditional way, it has to be done on the CPU. Say one packet is missing or packets arrive out of order: the Aspera client fixes that itself, instead of all of that being handled as TCP.
As I understand it, this is also the approach of WEKA.io [1]. Another approach is RDMA [2] used by storage systems like Vast which pushes those order and resend tasks to NICs that support RDMA so that applications can read and write directly to the network instead of to system buffers.
FASP uses forward error correction instead of retransmission. So instead of waiting for something not to show up on the other end and sending it again, it calculates parity and transmits slightly more data up front, with enough redundancy that the receiving end is capable of reconstructing any missing bits.
This is basically how all storage systems work, not just Weka. You calculate enough parity bits to be able to reconstruct the missing data when a drive fails. The more disks you have, the smaller the parity overhead is. Object storage like S3 does this on a massive scale. With a network transfer you typically only need a few percent, unless it's really lossy like Wifi, in which case standards like 802.11n are doing FEC for you to reduce retransmissions at the TCP layer.
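The simplest possible version of that parity idea, as a sketch (real FEC schemes and storage erasure codes use stronger codes such as Reed-Solomon, but the principle is the same): send k data blocks plus one XOR parity block, and the receiver can rebuild any single missing block without a retransmission.

```python
import os

def xor_blocks(blocks):
    """XOR equal-length blocks together byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

k, block_size = 4, 1024
data = [os.urandom(block_size) for _ in range(k)]
parity = xor_blocks(data)              # overhead is 1/k, i.e. 25% here

# Pretend block 2 was lost in transit: rebuild it from the rest plus parity.
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == data[2]
print(f"reconstructed the missing block; parity overhead {100 / k:.0f}%")
```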
Even if they are available on your street, each building and individual flat has to be connected. For blocks of flats that's not always straightforward.
Even if the cable is Cat 5, telephone sockets are often daisy-chained from room to room, so it can still be a pain to get a point-to-point connection if it goes through several sockets.
I was in college when v6 was going through the RFC process. In my networking class we had to learn Netware (IPX) and v6, which have both turned out to be equally irrelevant, for different reasons. At this stage, I fully expect to retire having never deployed a single resource using v6.
Correct. I used both at work up until around 2005. The idiot large companies I worked at did not believe in Source Code Control. That is the one thing I liked about RCS/SCCS: once I checked out an item, no one could check in their changes unless they contacted me, forcing a coordinated manual merge between us.
I tried to get our org onto something for a while, but got massive pushback until 5 or 6 years ago, when they set up a corporate-wide paid GitHub repo.
Before that, I found a small group of developers around 2005 that used CVS and they allowed me to leverage that for my group. But of course I was the only one who used it.
Back then I guess people loved losing source code, which happened a lot until git.
I convinced a software company to use a version control system (RCS on a shared disk) back in 1993. To make it work we had to set up a network, Ethernet over (thin) coaxial cable at the time. This was so new to us that we didn't know we needed to use terminators on the two cable ends.
There's a really good emulator for the iPhone! Back when I bought it, it came from HP themselves, but a few years ago they sold it to another company which actually maintains it. They just released a major new version a few weeks ago.
Yes but they're using this fund to prop up their core business (and share price) by artificially creating demand for their own products. Most of the money that they invest comes back to them when these companies buy GPUs.
I wouldn't say they are artificially creating the demand. They artificially create capacity to make a purchase by enabling their customers to pay with ownership of their business rather than with money. It's just an alternative financing scheme.
> They artificially create capacity to make a purchase
If the company did not intend to purchase anything, but Nvidia used the investment to "incentivize" a purchase, then this is artificially creating demand where there used to be none. It's very different from Nvidia allowing a company to purchase Nvidia products that they already wanted to purchase but pay for with stock.
I think this will continue. They can't change 3GPP's vision with just Nokia. They need to bribe other companies. Ericsson is the other big vendor. I think there is a possibility of that. However, Huawei is impossible. Who is gonna provide a GPU to them? Therefore, they simply can't just put a GPU on every base station around the world.
>Therefore, they simply can't just put a GPU on every base station around the world.
I don't think that is what's happening here. Base stations have been power-limited for quite some time, and part of the whole 5G / Cloud RAN promise was moving a lot of the processing off the base station. Ignoring GPUs, a lot of the current stack fits Nvidia's portfolio, especially DPUs. Maybe Nvidia has figured out a way to use CUDA and have it perform better than Ericsson and Huawei.
Nokia is also the smallest of the three and has been in decline for quite some time. Part of me also wishes Nvidia would just buy Nokia and start competing against Ericsson and Huawei.
You are right on your points. I agree on the DPU, but that's on the network stack. I think Nvidia wants to get into the PHY and MAC layer (CuPHY, etc.). That's where I find it unlikely, due to the cost of latency. If Nvidia had wanted to buy Nokia, they could've already completed the deal. It's a possibility in the future, but this $1B investment kind of shows that they are more interested in creating artificial demand for their GPUs than in diversifying their product portfolio. I agree with you that Nvidia should just buy Nokia.
More importantly than gaining a client for Nvidia's AI chips, this investment gives the company a solid foothold in a competitor to Broadcom in the wireless, datacenter and networking solutions space. I wouldn't be surprised if Nvidia eventually scoops up all of Nokia.
Buy in, let them use your money to buy your product back from you; then if you think they'll actually succeed you make money, and if not you can slowly dump that stock back onto the market and end up ahead. You've basically manufactured a customer at that point.
Exactly, the risk is extremely small. If they fail, there will be others too, which means that WSBros can bundle and derivative their way into at least a neutral exit.