But adding compression just costs 400 gates, how on earth is that an issue, even on a small controller?
The C extension saves a lot more memory than these more complex instructions. So even if you don't add macro fusion, you are still getting advantages from fewer cache misses or less money spent on cache.
You seem to talk theory, when in practice we know the BOOM RISC-V CPU outperforms an ARM-32 Cortex-A9 while requiring half the silicon area. The RISC-V core is 0.27 mm² while the ARM core is 0.53 mm², using the same technology.
And what you are missing from the overall picture is that a key requirement for RISC-V is that it is useable in academia and for teaching. It is supposed to be easy for students to learn as well as to implement simple RISC-V CPU cores. All of that is quickly out the window if you go down the ARM road.
That RISC-V pulls off all these things (higher performance, smaller die, simpler implementation and easier teaching) validates their choices, IMHO. I don't see how your argument has any legs to stand on.
> But adding compression just costs 400 gates, how on earth is that an issue, even on a small controller?
If it's so cheap and good, why is it an extension and not part of the base then?
Anyway, the problem isn't how few gates you can get away with for a low performance microcontroller, but rather how to design a wide and fast decoder for a higher end core. As the instruction stream isn't self-synchronizing, you need to decode previous instructions to know where the instruction boundary for the next instruction is. Sure, you could speculatively start to decode following instructions, but that gets hairy and consumes extra power.
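To make the boundary problem concrete, here is a minimal Python sketch (mine, not taken from any real decoder). It relies on the RISC-V length encoding: a 16-bit parcel whose low two bits are not 11 is a compressed instruction, otherwise it starts a 32-bit one. The function names are made up for illustration, and encodings longer than 32 bits are simply rejected.

```python
def instruction_length(first_parcel: int) -> int:
    """Length in bytes of the instruction starting at this 16-bit parcel.

    Hypothetical helper: handles only 16-bit (C) and 32-bit encodings.
    """
    if first_parcel & 0b11 != 0b11:
        return 2  # low bits != 11: compressed instruction
    if first_parcel & 0b11100 != 0b11100:
        return 4  # low bits == 11, bits [4:2] != 111: standard 32-bit
    raise ValueError("encoding longer than 32 bits, not handled in this sketch")


def find_boundaries(parcels: list[int]) -> list[int]:
    """Byte offsets of instruction starts, found by walking sequentially."""
    offsets = []
    i = 0
    while i < len(parcels):
        offsets.append(i * 2)
        # The next boundary is only known after inspecting this instruction.
        i += instruction_length(parcels[i]) // 2
    return offsets


# c.addi a0, 1 (0x0505), then addi a0, a0, 1 (0x00150513, stored as the
# little-endian parcels 0x0513 and 0x0015), then c.addi a0, 1 again.
stream = [0x0505, 0x0513, 0x0015, 0x0505]
print(find_boundaries(stream))
```

Note that the parcel 0x0015 is the upper half of the 32-bit addi, yet looked at in isolation its low bits say "compressed instruction". A wide decoder that starts at every parcel in parallel has to throw those false starts away, which is where the extra hardware and power go.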
> You seem to talk theory, when in practice we know the BOOM RISC-V CPU outperforms an ARM-32 Cortex-A9 while requiring half the silicon area. The RISC-V core is 0.27 mm² while the ARM core is 0.53 mm², using the same technology.
Yes, BOOM is a nice design, and the (original) author used to hang around here on HN. That being said, having read the paper where those area claims were made, I think it's quite hard to do cross-ISA comparisons like this. E.g. the A9 has to carry around all the 32-bit legacy baggage (in fact, it doesn't even support aarch64, which isn't that surprising since it's an old core dating back all the way to 2010), it has a vector floating point unit, it supports the ARM 32-bit compressed ISA, and whatnot.
> And what you are missing from the overall picture is that a key requirement for RISC-V is that it is useable in academia and for teaching. It is supposed to be easy for students to learn as well as to implement simple RISC-V CPU cores. All of that is quickly out the window if you go down the ARM road.
I'm not forgetting that, and that's certainly an argument in favor of RISC-V. That doesn't mean it's a particularly relevant argument for evaluating ISAs for production usage.
I'm not saying RISC-V is a bad idea. It certainly seems good enough that, combined with the lack of licensing costs and the geopolitical factors that are important for some prospective users, it has a good future ahead of it. I'm just saying that with some modest changes when the ISA was designed, it could have been even better.
> If it's so cheap and good, why is it an extension and not part of the base then?
I think what you mean is why it is not in the G extension, which encompasses IMAFD but not C. I agree that is a bit odd.
I think it would have been very wrong if it were part of the I base instruction set. That should be as minimal as possible.
But I guess a question like this easily becomes very philosophical. For me it makes sense that C is not in G, because C is really an optimization and not about capability. I think a software developer, tool maker, etc. would care more about which instructions are available than about particular optimizations.
> E.g. the A9 has to carry around all the 32-bit legacy baggage
But surely that counts in RISC-V's favor, as ARM has no alternative modern minimal 32-bit instruction set. With RISC-V you can move between the 64-bit and 32-bit instruction sets with minimal code change.
And I don't see how ARM-64 would have made any of this any better, as it has over 1000 instructions. I am highly skeptical that you can make tiny cores out of that. But I am not a CPU guy so I am okay with being told I am wrong ;-) As long as you can give me a proper reason.
> Doesn't mean that it's a particularly relevant argument for evaluating ISAs for production usage.
True, but I think there is value in the whole package. You see people evaluating RISC-V and finding that, sure, there are commercial offerings performing slightly better, or they could make a custom ISA that does better. But the conclusion for many is that RISC-V is good enough, and with the growing ecosystem that still makes it the better choice overall. If you are going to make a custom ISA today, it had better be a lot better than RISC-V to be worth it, I would think.
I would also think there is value for hardware makers in being on the same platform that universities and research institutions are going to be using, and that students are going to come out of university knowing.
Anyway thanks for the discussion. While I am pushing back on everything (seemingly) I do find this kind of discussion very valuable in learning better pros and cons. It spurs me to look up and learn more things.