I wonder if something like that is available for an ARM-based platform.
I guess the Zynq UltraScale+ MPSoC is pretty much that, but it would be cool if you could simply pair a cheap-ish FPGA device with some cheap, commodity ARM-based SoC, to get that sort of power without adding a few hundred $ to your BOM cost.
Wow, that was fast. I cannot fathom this. We went from zero to-tooooo fast. The pace of innovation cannot be internalized. A billion cycles in the place of period.
> * Unified discovery and reservation of AMD and Xilinx accelerators using a converged runtime in the AMD ROCm open software platform;
> * Dispatch of work to Alveo accelerators using the same user-space queues used for low-latency work dispatch to AMD Instinct accelerators;
> * Peer-to-peer synchronization between GPU and FPGA devices; and
> * Access to memory on GPU, CPU, and FPGA devices using a common, shared virtual address space
Basically, your single-source C++ code can now go to CPU-land, FPGA-land or GPU-land.
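To make the "single-source" idea concrete, here is a rough sketch in plain C++. Everything in it is an assumption for illustration: `Target`, `saxpy_element` and `run_saxpy` are hypothetical names, not part of ROCm, HIP, or any AMD/Xilinx API, and all three branches just run on the host. The only point is that the kernel is written once and the target is picked at dispatch time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical target tag; NOT a ROCm, HIP, or Xilinx type.
enum class Target { Cpu, Gpu, Fpga };

// The kernel, written once: one element of y = a * x + y.
inline float saxpy_element(float a, float x, float y) { return a * x + y; }

// Hypothetical dispatcher. A converged runtime would hand the same kernel
// to a device queue here; this sketch has no device runtime, so every
// target falls through to the same host loop.
inline std::vector<float> run_saxpy(Target target, float a,
                                    const std::vector<float>& x,
                                    std::vector<float> y) {
    assert(x.size() == y.size());  // inputs must have matching lengths
    switch (target) {
        case Target::Cpu:
        case Target::Gpu:   // stand-in: no real GPU dispatch
        case Target::Fpga:  // stand-in: no real FPGA dispatch
            for (std::size_t i = 0; i < x.size(); ++i) {
                y[i] = saxpy_element(a, x[i], y[i]);
            }
            break;
    }
    return y;  // y was taken by value, so the caller's vector is untouched
}
```

Calling `run_saxpy(Target::Fpga, 2.0f, x, y)` and `run_saxpy(Target::Cpu, 2.0f, x, y)` gives the same result here by construction. In a real converged stack the call site would look similar, but the branches would differ in where the loop body actually executes.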