Compiling Quake 3 Virtual Machines To JavaScript (inolen.com)
100 points by mariuz on Feb 16, 2014 | 31 comments


It would be cool if it compiled either LCC assembly or QVM bytecode to LLVM IR. Now you have all of LLVM's optimizations! Finally, spit out the JS using Emscripten like the rest of the project.


I actually did that first: https://github.com/inolen/qvmc

It worked, but the main issue was related to object file generation.

Basically, you have to allocate one big i8 array to represent all of the QVM's data segments (including its bss segment). This doesn't translate well once compiled with Emscripten, as bss relocation is done on a per-variable basis. For one of the Quake 3 Fortress QVMs, which was only a few hundred KB, it output a ~50 MB .bc file and a ~70 MB .js file consisting mostly of zeros.

I experimented with storing each segment as its own i8 array (which could be zero-initialized for the bss segment) in a packed struct, which significantly lowered the .bc file size. However, while I was looking into what I'd need to change in Emscripten to support this, I decided instead to go the route of writing the runtime compiler.
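To illustrate the segment idea (a hypothetical sketch in JavaScript, not the actual qvmc code): if the bss segment is materialized as zero-filled memory at load time, the zeros never have to be serialized into the emitted file at all.

```javascript
// Hypothetical sketch: build the VM's memory from its segments at load
// time, so the bss zeros come from the allocator rather than being
// serialized into the output file.
function buildVmMemory(dataSegment, litSegment, bssSize) {
  const total = dataSegment.length + litSegment.length + bssSize;
  const mem = new Uint8Array(total); // zero-filled per the spec
  mem.set(dataSegment, 0);
  mem.set(litSegment, dataSegment.length);
  // Bytes [data+lit, total) are the bss segment: already zero.
  return mem;
}

const mem = buildVmMemory(new Uint8Array([1, 2]), new Uint8Array([3]), 4);
console.log(mem); // contents: 1, 2, 3, 0, 0, 0, 0
```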

Fun fact: Quake 3 mods didn't have access to malloc / free, so it wasn't uncommon for them to have large amounts of static, zero-initialized data.


I thought that too, but remember this all has to be done at runtime, so depending on LLVM would require an Emscripten-ized build of LLVM (does such a thing exist?) to be shipped with the generated JS, and likewise depending on Emscripten would require shipping a copy of Emscripten itself. And from what I hear, the runtime performance of Emscripten itself ain't that great.


Why does this have to be done at runtime, exactly? How many mods (or rather, QVM files) were there? And are people still making them?

If the answer is "less than 100K" and "no, of course not," then you can probably get away with:

1. compiling all the QVM files to their Javascript equivalents ahead-of-time;

2. stuffing them all on a CDN (with each JS blob named after the md5 of the relevant QVM source);

3. having a "compiler" in your client that just hashes the source it's about to "compile" and requests the blob with that hash from your CDN.

Of course, if the amount of generated JS is small enough, you could even just serve it all with the client. I'm doubting that one, though; it's probably at least 50MB of code. (Though it might be heavily redundant code... it could compress very well!)


> Why does this have to be done at runtime, exactly?

Because that's how Quake 3 works. What you're describing would be incompatible with all the other Quake 3 clients, and part of the point of implementing the QVM in the first place is compatibility.


This is pretty crazy impressive! The performance gain particularly. Is that much of a gain expected?


I'm impressed, but I'm not particularly surprised. The initial code was interpreted, while the new code is JIT compiled. I have no data to back this up, but I suspect that if instead of generating javascript, he generated x86 assembly, you'd see a similar speedup. Possibly a larger one.
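The gap is easy to demonstrate with a toy stack machine (illustrative only, not the real QVM): the interpreter pays dispatch overhead on every opcode, while the "compiler" emits straight-line JS once, here via `new Function` as a stand-in for the runtime compiler.

```javascript
// A two-opcode stack machine, first interpreted, then compiled to JS.
const PUSH = 0, ADD = 1;
const prog = [PUSH, 2, PUSH, 3, ADD];

function interpret(code) {
  const stack = [];
  for (let pc = 0; pc < code.length; ) {
    switch (code[pc++]) {            // dispatch cost paid per opcode
      case PUSH: stack.push(code[pc++]); break;
      case ADD:  stack.push(stack.pop() + stack.pop()); break;
    }
  }
  return stack.pop();
}

function compile(code) {
  let body = 'const s = [];\n';      // emit JS source once, up front
  for (let pc = 0; pc < code.length; ) {
    switch (code[pc++]) {
      case PUSH: body += `s.push(${code[pc++]});\n`; break;
      case ADD:  body += 's.push(s.pop() + s.pop());\n'; break;
    }
  }
  return new Function(body + 'return s.pop();');
}

console.log(interpret(prog), compile(prog)()); // both yield 5
```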


That's exactly what Quake3 did natively, and it gave a large performance boost.


Right. Pulling from JC's August 16, 1999 .plan (http://floodyberry.com/carmack/johnc_plan_1999.html#d1999081...) when he first wrote the x86 compiler:

Q3demo1: dll 52.9, compiled 50.2, interpreted 43.9

Q3demo2: dll 50.1, compiled 46.5, interpreted 38.7


What is up with that comment?


It's the guy writing TempleOS. Google around, you're in for an interesting story.



Does he have a new HN account? I love reading his stuff, but the account "losethos" has been inactive for over a month now.




He has changed the name of his OS multiple times. Might have some luck with the latest name, but you'd have to dig for it.


He still posts but his comments are auto-killed. You can see the comments if you turn on showdead in your preferences.


Very nice. Glad to see CPMA in there. :-)


'Transpiling', since it converts to a scripting language rather than native code.


Transpiling is also known as source-to-source compilation.


To add to this, transpiling is a kind of compiling where the source and target code sit at roughly the same level of abstraction. asm.js does appear to be more "low-level" than the Quake bytecode, so this is compiling, not transpiling.


asm.js is portable assembly with a jit assembler. In that sense, it's even closer to "native code" than C, as C requires a separate compilation step.
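For concreteness, here's a toy asm.js-style module (illustrative, not from the article): the coercions like `a | 0` are the type annotations that let an engine treat the code as 32-bit integer arithmetic.

```javascript
// Minimal asm.js-style module. As plain JS it runs anywhere; in an
// asm.js-aware engine the "use asm" directive enables AOT compilation.
function AsmAdd(stdlib, foreign, heap) {
  "use asm";
  function add(a, b) {
    a = a | 0;                // declare a as a 32-bit int
    b = b | 0;                // declare b as a 32-bit int
    return (a + b) | 0;      // result truncated to 32 bits
  }
  return { add: add };
}

const mod = AsmAdd();
console.log(mod.add(2, 3)); // 5
```

Note the `| 0` also gives C-like wrapping semantics: `mod.add(0x7fffffff, 1)` overflows to -2147483648.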


I am confused, there are JIT compilers for C:

http://code.google.com/p/asmjit/

http://homepage1.nifty.com/herumi/soft/xbyak_e.html

How is asm.js different to those?


He's trying to conjure up a difference that just isn't there in practice. Compilation is compilation, regardless of the number of times it's performed, and regardless of whether or not the result is stored for future use.


Embedding jit assemblers into C binaries is different than jit-ing the C language itself.


Isn't asm.js also run from a binary (the browser) that JIT-compiles the JS? C JIT compilers compile C code just-in-time; I still don't see the difference.

The two projects I linked to were the wrong examples; they're JIT assembler libraries, not C JIT compilers.

What I meant to point to were:

http://root.cern.ch/drupal/content/cling

https://metacpan.org/pod/C::TinyCompiler


Thanks for the pointers. It looks like asm.js and C are on par as portable assembly targets. One difference is that asm.js is ready to use by half a billion users; another is that no sane human should write asm.js directly, whereas some people swear by C as a language for humans. Allow me to clarify the initial statement:

> For 99.9% of end users, asm.js is even closer to "portable native code" than C, as C is usually delivered to end users as a precompiled binary via a separate compilation step, and is therefore no longer portable.


The reason people call C portable assembly is because it's about as low as you can go without coding in assembly, and still have it compile to various platforms.

You are really comparing asm.js and C as portable assembly emitters rather than targets. That said, people very occasionally use C as a target the way they use asm.js, e.g. the original C++ compilers, the Mars rover.

asm.js is seemingly as low as you can go and still have it run in a modern version of Chrome or Firefox, but those requirements prevent it from being a portable assembler in the sense that C is. Besides, as I understand it, it's sensible to write your program in another language first and compile it to asm.js.

There's a whole swathe of machines (rather than users) out there that cannot run a modern version of those browsers, i.e. asm.js is not portable to them.

When code is written in asm.js and translated from it into some other language, then asm.js will be a portable assembler in the sense that C is.


That's not true at all.

Asm.js is just a subset of JavaScript that is far more machine-friendly than it is human-friendly. Having a worse syntax than C or normal JavaScript doesn't mean that it's closer to native code or anything like that.

And the compilation step is still there in both cases. It doesn't matter if it's just-in-time compilation or ahead-of-time compilation; it's still compilation, and it's still there.


Precisely. Until there are CPUs with direct asm.js support (hopefully not though).


'Translation' would be the word of choice from PL literature. But there's such a significant transformation going on here that compiling seems quite apt.



