The question we all want answered is: what is the most likely contender for a decent-performance, widely available implementation of this profile, and when is it due?
I'm not aware of any chips, even in development, that will genuinely have full RVA23 support. If you'll settle for "near enough" then my betting is we'll have SiFive P650/P670 hardware first (https://www.sifive.com/cores/performance-p650-670). There are some other development boards coming (sorry, under NDA!) that should have near-RVA23 + Xeon-like performance and we should get them in very end 2025 / early 2026. These are server-class parts so the development boards will not be cheap.
Some parts of RVA23 like a complete implementation of the vector sub-extensions, and hypervisor support, are pretty complex.[1]
If you just want RVA23 now (with poor performance) then qemu implements it. We found it's not very useful for software development, not just because it's slow, but also because it performs so differently from the real hardware, so you cannot, for example, optimize your vectorized code.
> There are some other development boards coming (sorry, under NDA!) that should have near-RVA23 + Xeon-like performance and we should get them in 2026. These are server-class parts so the development boards will not be cheap.
I can get a 3A6000 system for 400€ with decent (~Zen2) performance, are we talking more than that?
But RISC-V predates loongarch64 - why did Loongson create a new ISA instead of implementing a fast RISC-V core?
Sure you can move faster if you can make your own standards and don't have to coordinate with anyone, but it still makes me wonder whether there is some fundamental issue that makes it difficult to create a high-performance RISC-V implementation.
How does it compare to the current RISC-V leader, ESWIN SiFive P550?
Anyway, China seems to have decided to pursue a dual strategy of pouring money into RISC-V and LoongArch at the same time. I've no idea why that is. The company I work for talks to several RISC-V vendors who don't believe there is any issue with the RISC-V ISA for high performance server-class application cores.
The P550 is pretty slow. It doesn't have any SIMD, and its scalar engine is about the same as a Core 2 from ~2008. In general, it's fairly confusing to me why SiFive has been so cautious in their designs. It seems like they would be in a much better place if they'd released a few CPUs that, rather than trying to balance everything while creeping complexity up, just added all the power to theoretically match high-end CPUs, and then addressed the bottlenecks over iterations.
> The P550 is pretty slow. It doesn't have any SIMD, and its scalar engine is about the same as a Core 2 from ~2008.
Yes! And this is great progress in just a couple of years.
The previous SiFive core that made it to available hardware, the U74, is about like a Pentium or PowerPC 603 in µarch, though with higher clock speed so putting it more like late Pentium 3 or PowerPC G4 in delivered performance (and also quad core, not single core, which helps a lot).
SiFive already have 2 1/2 generations of core released since the P550 in mid 2021, the P670 and the P870(-D). Both of those implement RVV (Vector). If not for US sanctions we'd be seeing very nice 16x P670 machines this year, likely quite a bit better than the Pi 5 (and Rock 5 and Orange Pi 5). Or somewhere between Sandy Bridge and Skylake in Intel terms, I think.
The P870 is somewhere around Snapdragon 8 gen 2.
It just takes time to get CPU cores from the drawing board to shops, especially since a company that wants to make consumer products must license the core in the first place, because SiFive (like Arm) doesn't make chips.
RISC-V is behind Arm, but the gap in cores available to license is far smaller (about 2 years at present) than the gap in things anyone can buy in a shop (about 5 years at present) and getting smaller all the time.
> it's fairly confusing to me why SiFive has been so cautious in their designs. It seems like they would be in a much better place if they'd released a few CPUs where rather than trying to balance everything while creeping complexity up, they just added all the power to theoretically match high end CPUs
Because most of the market (by volume / revenue / profit) does not need the highest end CPUs.
Even in phones the Galaxy S25, with its Qualcomm Oryon cores, is a nice headline product, but there are still new SoCs and phones being introduced today (or at least there were some in February 2024 ... I haven't checked recently) with nothing faster than Arm A53 cores announced in 2012. I expect those low end phones are still selling in pretty big numbers.
> Yes! And this is great progress in just a couple of years.
Is it? In 5 years they went from the HiFive Unmatched on 5nm (Intel ~1999 except missing vector instructions so it's ~4x slower) to the HiFive Premier on 7nm (~Intel 2008 except it's still missing vector instructions so it's ~4x slower on a lot more things). Sure that means they're catching up, but it really seems like they should be further along since they don't need anything "new", they can just take the ratios of a 5 year old CPU and build that. The P650 to me seems like the first of their CPUs that will be of a moderately reasonable performance level, but due to sanctions it's unclear when/if it will ever launch in a dev board.
> Because most of the market (by volume / revenue / profit) does not need the highest end CPUs.
The problem is that any new ISA that wants to compete in the general purpose market needs some high power CPUs if only to give developers something to play with/run CI/compile the world on. They don't need to develop on a new node, they don't need to be ultra power or cost efficient, etc. They just need a chip with some moderately OK power.
If they had by now a dev board that was $2000, consumed 300w and crashed if you looked at it funny, that would be better than what we have now, because their current products don't include anything that runs faster than QEMU.
> In 5 years they went from the HiFive Unmatched on 5nm
It's 28nm.
> the HiFive Premier on 7nm
The EIC7700X is on TSMC 12 nm FFC.
You're not doing well on the easily googleable facts here.
Also, the HiFive Unmatched (May 2021) uses prototype shuttle-run chips, while the Premier (December 2024, 3 1/2 years not five) uses mass-production chips.
A more accurate comparison would be either from the Unmatched to Intel's "Horse Creek" prototype board demoed in September 2023 (2 1/4 years apart) or else from the VisionFive 2 (February 2023) to the Premier, which is 1 3/4 years.
So your five years is way out too.
> current [RISC-V] products don't include anything that runs faster than QEMU
Current RISC-V does fine compared with QEMU on a Mac/PC, at least on a per core basis.
Note that all the following use qemu-user effectively in a chroot (docker), which is much faster than full-system emulation using qemu-system.
Let's take compiling the Linux kernel, commit 7503345ac5f5 from early December, with defconfig, oldish now but I'm sticking with it for my benchmarks to keep them comparable:
19m13s: i9-13900HX, 8P + 16E cores, 32 threads
48m37s: i9-13900HX, -j4
69m16s: Mac Mini M1, 4P + 4E cores
143m20s: Ryzen 5 4500U, 6x Zen2
251m31s: i7-3720QM (4x Ivy Bridge)
The 24 core i9 (which can compile a native x86_64 version of the same kernel in 1m3s) takes three times longer using qemu than the fastest current RISC-V machine, which has a similar list price. The i9 under qemu is still twice as fast as a $199 4-core RISC-V board (the Megrez).
When restricted to -j4, the i9 with qemu takes 15% longer than the quad core $199 RISC-V SBC.
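Working backwards from those ratios gives an idea of the native RISC-V build times being compared against. A quick back-of-envelope in Python (the implied times are inferred from the stated ratios above, not measured):

```python
# Wall-clock times from the qemu benchmark table, converted to seconds.
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds

i9_all_cores = to_seconds(19, 13)   # i9-13900HX, 32 threads, qemu-user
i9_j4        = to_seconds(48, 37)   # same machine restricted to -j4

# "takes three times longer ... than the fastest current RISC-V machine"
# implies a native build time on that machine of roughly:
implied_pioneer = i9_all_cores / 3
print(f"implied fastest-RISC-V build: ~{implied_pioneer / 60:.0f} min")

# "takes 15% longer than the quad core $199 RISC-V SBC" implies:
implied_megrez = i9_j4 / 1.15
print(f"implied Megrez build: ~{implied_megrez / 60:.0f} min")
```

So the comparison is roughly a 6-and-a-half-minute native build against 19 minutes under qemu, and about 42 minutes on the quad-core board against 48 minutes under qemu at -j4.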
The i9 is good if you already have it and it's under-utilised, but if you are buying something for dedicated use then you'll get much more RISC-V build speed from the Pioneer, or the same from two $199 boards or four $50 boards (either Orange Pi RV or RV2, for example, which are the same SoCs as my older VisionFive 2 and Lichee Pi 3A respectively).
If you're using an M1 Mac then qemu on it is in the pack with the cheap RISC-V boards, and well behind the P550 boards.
I bought the Ryzen 4500U at the same time as the M1 Mac (both in late 2020). It's a pleasant machine, but really doesn't cut it for running qemu. Neither does the 2012 Ivy Bridge machine -- I remember thinking a 3770 was hot stuff, and the 3720 won't be far off it (3.6 GHz turbo vs 3.9 GHz turbo).
> If they had by now a dev board that was $2000, consumed 300w and crashed if you looked at it funny, that would be better than what we have now
But we've had such a machine -- the Milk-V Pioneer -- for 15 months already. It's $1500 for the board, or $2500 for a fully built machine in a case with power supply, 128 GB RAM, 1 TB SSD, AMD GPU card. It uses 90W when loaded down (not 300) and 68W when idle.
Perhaps you missed the story where Chimera Linux announced on March 12 that they were dropping RISC-V support for lack of a suitable build machine, someone gave them access to their Pioneer, and on March 20 they said they'd done a successful full build of their entire distro (there are 11,057 packages in their repo) and weren't dropping RISC-V after all. It took a few days to get access to the machine, a few days to get set up on it, and ... well we don't know how long the build took, but it's whatever is left from 8 days elapsed ... minus however long it took to write the 2nd blog post.
Loongarch is built on MIPS, so the core predates RISC-V by decades.
RISC-V didn't have the specs to build what they were targeting when they started designing a few years ago. Given the similarities in the ISAs, I suspect they may switch to RISC-V in the near future.
It is a sad day in history to see both a RISC-V-related acronym and NDA used in the same sentence. The term oxymoron barely cuts it IMO. I thought that RISC-V used to stand for open hardware and freedom, TIL.
The RISC-V specs are all free to download and implement. Some (by no means all) implementations are proprietary. Red Hat (the company I work for) talks to all sorts of server vendors, x86, Arm, POWER, RISC-V all the time under NDA so we can find out what new products those vendors are developing before they are released, and have software ready in time. The software we write is all open sourced. This is nothing particular to do with RISC-V, and has nothing to do with the RISC-V ISA, nor with open sourcing of software.
My bets are on Tenstorrent Ascalon. They are currently taping out and planning to release an 8-core @ 2.25 GHz devboard and laptop next year. The cores have 8-wide decode/issue/dispatch and dual-issue 256-bit RVV support.
The scalar part of their scheduling model is already upstream in LLVM as tt-ascalon-d8.
The only one I can think of that may deliver earlier is Ventana with the Veyron V2; they plan to have chiplets in the first half of this year. I'm not sure if they are planning on releasing devboards though, as they target big servers. The cores have a 16-issue backend, with 5 512-bit RVV execution units (arithmetic, mask, permute, load, store) and a decode of up to 10 instructions per cycle.
Yes, they got access to a $2500 RISC-V machine that's been out for a year but isn't currently available because the same manufacturer has much updated machines coming out soon.
$2500 is more than a Raspberry Pi but not expensive for any organisation that is paying even one salary to one employee. I also note they recently got an Ampere Altra arm64 machine, which I believe start at around $4000.
Anyway, they blogged saying they were dropping RISC-V on March 12, got access to a RISC-V machine a couple of days later, mucked around a bit with build systems (including building a new Linux kernel that didn't advertise the pre-ratification V extension, which is a checkbox in `make menuconfig`) and then on March 20 -- 8 days after the initial post -- blogged that they had successfully completed a build of their distro on the donated RISC-V machine.
I don't know how long that complete build of all their packages took (Chimera Linux repo contains 11057 packages), but it can't have been more than 3 or 4 days, maybe less, which seems very reasonable.
I'm not sure what is "grim" here.
It's a very new ISA -- first ratified only in July 2019, and with very significant additions in November 2021 and more very important things (for RVA23, which will be the baseline for Android and also for Ubuntu 26.04 LTS) right up to 2024.
Machines are only going to get faster and cheaper from this point on, but a $2500 64 core 128 GB RAM machine from January 2024 that can build a complete distro in a couple of days isn't bad at all.
This release adds the Cray-style variable-length vector instructions. Compiling with AVX-512 and running on a chip without it will crash, but RVV just uses whatever vector length the chip supports with the same instructions. It also means you don't have to wait on LLVM to update every time they grow the vector size.
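The vector-length-agnostic idea can be sketched abstractly. This is a toy Python model, not real RVV code; `hw_vlen` stands in for whatever vector length a given chip implements, and the `min()` step plays the role RVV's `vsetvli` instruction plays in real strip-mined loops:

```python
# Toy model of vector-length-agnostic (VLA) strip-mining, as in RVV:
# the same loop works unchanged whatever vector length the hardware has.
def vla_add(a, b, hw_vlen):
    """Add two lists element-wise in chunks of at most hw_vlen elements."""
    out = []
    i = 0
    while i < len(a):
        vl = min(hw_vlen, len(a) - i)   # like vsetvli: ask for remaining,
        chunk_a = a[i:i + vl]           # get back what the hardware grants
        chunk_b = b[i:i + vl]
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        i += vl
    return out

a = list(range(10))
b = [1] * 10
# Identical code, identical result on "chips" with different vector lengths.
assert vla_add(a, b, 4) == vla_add(a, b, 8) == [x + 1 for x in a]
```

With fixed-width SIMD like AVX-512, the chunk size is baked into the binary at compile time; here it is discovered at run time, which is why one binary can serve chips with different VLENs.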
I don't get why they decided to create all these very complicated extensions before even getting solid hardware for the base instruction set. RVV support is many, many years into the future and I bet many of its instructions will be microcoded since they aren't very useful for compilers anyway. Specifying standards for the future before you have working prototypes is very (ahaha) risky.
There is solid hardware, it's just not standard hardware for consumers. RVV support in software is already good and there are already many commercial chips.
> I bet many of its instructions will be micro coded since they aren't very useful for compilers anyway
I have listened to a lot of the talks about RISC-V, and many about the vector extension, and microcoding was barely mentioned.
What is more common is that people don't implement all the instructions in academic settings.
> Specifying standards for the future before you have working prototypes is very (ahaha) risky.
Absolutely nothing is standardized before hardware implementations exist. In fact, for the vector extension there was a first generation, RVV 0.7, that saw limited commercial use, and there has been far more for the RVV 1.0 version.
I looked at RISC-V SoCs last year and I couldn't find anything at all with support for RVV. Just a lot of "coming soon", "in progress", and "preorder now". Maybe the situation has improved since then.
The CanMV-K230 has RVV 1.0 and shipped in November 2023, just two years after RVV 1.0 was ratified.
So "nothing at all" was already false before January 1 last year.
The BPI-F3 (8x X60 cores with 256 bit VLEN RVV 1.0) was announced in February 2024 and shipped in May. The Lichee Pi 3A and Milk-V Jupiter with the same SoC were not far behind it, and the MuseBook and DC-Roma II laptops were not far behind them. We now also have the Orange Pi RV2 with the same SoC for $30 (2 GB RAM) to $50 (8 GB).
It's probably going to be a while before we see the next generation but there are half a dozen different machines available with the current best mass-production RVV 1.0 chip.
so completely the opposite complaint than https://news.ycombinator.com/item?id=43762141 which said RISC-V people should have skipped all the intermediate implementations and gone straight to x86-competitive high performance cores.
i.e. they're probably doing something right.
When was the last brand-new ISA, successful and still surviving today, released without backward compatibility with something that was already big in the market?
Not Aarch64 -- it coexisted with 32 bit Arm in all of Arm's cores from 2012 until 2023.
Not x86_64, which even to this day runs 32 bit (and 16 bit?) x86 code.
MIPS and SPARC are both gone, Alpha and Itanium were never huge and anyway are gone.
I guess IBM POWER, originally released as RS/6000 in 1990, 35 years ago. It had design lineage from 801/ROMP, but was not compatible with them.