The A64FX has on-package HBM -> that means no external DRAM modules. If you look at the Fugaku motherboard, there are no DIMM slots. All the memory is on the same package as the CPU.
This delivers a huge boost in bandwidth.
HBM stands for High Bandwidth Memory. It offers up to 900 GB/s.
Now if you add the Tofu interconnect on top, you have a system finely tuned for maximising data movement.
Remember: compute is cheap, communication is expensive.
You can have loads of GPUs and processors, but if you can't feed them data fast enough they are useless.
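To put a rough number on "can't feed them fast enough", here is a minimal roofline-style sketch. The 900 GB/s figure is the one quoted above; the peak compute figure is a placeholder I picked for illustration, not a measured A64FX value, and the function name is made up for this example.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline bound: the lower of the compute limit and the memory-traffic limit."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

# STREAM-style triad a[i] = b[i] + s * c[i] in float64:
# 2 flops per element, 3 * 8 bytes moved (read b, read c, write a).
TRIAD_INTENSITY = 2 / 24

BANDWIDTH_GBS = 900.0  # figure quoted in this thread
PEAK_GFLOPS = 3000.0   # ASSUMPTION: illustrative placeholder, not a measurement

triad_bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, TRIAD_INTENSITY)
print(f"triad bound: {triad_bound:.0f} GFLOP/s out of {PEAK_GFLOPS:.0f} peak")
```

Under those assumptions a streaming kernel like this is capped by memory bandwidth at a small fraction of peak, no matter how many cores you add.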
That is a pretty fun architecture. I hope that opens the door to higher performance for more workloads than just the TOP500 benchmark.
At least with the TOP500 benchmark, bandwidth is not a problem, so long as you can run a large enough problem. It is a linear solve that spends nearly all its time doing matmul (n^3 operations on n^2 data), so once the problem is big enough, you can saturate the cores.
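A quick sketch of the "n^3 operations on n^2 data" point, assuming float64 and an idealized traffic model where each matrix crosses the memory bus exactly once. That model is an assumption; real blocked kernels move more data, so treat the result as a best case. The function name is hypothetical.

```python
def matmul_intensity(n, bytes_per_elem=8):
    """Flops per byte for C = A @ B with n x n matrices, ideal traffic model."""
    flops = 2 * n**3                          # one multiply + one add per inner step
    bytes_moved = 3 * n**2 * bytes_per_elem   # read A, read B, write C, once each
    return flops / bytes_moved

# Intensity grows linearly with n, so a big enough problem becomes compute-bound.
for n in (12, 120, 1200, 12000):
    print(f"n={n:>6}: {matmul_intensity(n):8.1f} flops/byte")
```

Compare that with a streaming kernel, where the flops-per-byte ratio stays constant regardless of problem size. That is why the linear solve can hide the memory system while most other workloads cannot.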
That's fascinating. I know that AMD has been touting HBM as a faster memory subsystem for their GPUs. Is that the same stacked memory as what's on the A64FX, or are they just calling it something similar?