I’ve been thinking about writing an AVR-8 emulator in AVR-8 assembly (Harvard architecture, can’t normally run code in RAM) so I can download a subroutine into what little RW memory the thing has and call it. In that case I would probably allocate some set of registers for the emulator’s own use so the ‘virtual machine’ would have, say, 24 registers instead of 32.
(An alternative to that would be something like Wozniak’s SWEET16, which implements a ‘better’ architecture, but what’s a better way to learn AVR-8 assembly than writing a self-hosting emulator?)
SWEET16 is better than the 6502, at least in code density for handling 16-bit values, but its main attribute was that it was only about 300 bytes of code. SWEET16 is certainly not better than AVR in any dimension!
I've thought about making an MSP430 emulator for the 6502/Z80, as the MSP430 is quite simple, has dense code, and is about the only 16-bit ISA to be well supported by modern compilers.
"Can't execute code from RAM" is a definite problem in certain circumstances. Is it still true with recent AVRs which have a unified address space? I know you can now use normal load instructions to read constants from ROM, but I don't know if that also extends to executing code from RAM. The ATmega328, ATmega2560 and ATtiny85 aren't the latest thing any more :-)