Why not simply use MAP_JIT?
There is no need for these weird tricks, and you're putting your process and the rest of the system at risk by elevating privileges and using task_for_pid.
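Roughly, the in-process MAP_JIT route looks like the sketch below. The install_code wrapper and the missing error handling are just for illustration, but mmap with MAP_JIT, pthread_jit_write_protect_np and sys_icache_invalidate are the actual macOS APIs, and under the hardened runtime you also need the com.apple.security.cs.allow-jit entitlement:

    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <pthread.h>
    #include <libkern/OSCacheControl.h>

    /* Copy freshly generated machine code into a MAP_JIT page and return a
       pointer that can be called. No task_for_pid involved: the process only
       ever touches its own memory. */
    static void *install_code(const void *code, size_t len) {
        void *page = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                          MAP_PRIVATE | MAP_ANON | MAP_JIT, -1, 0);
        if (page == MAP_FAILED) return NULL;

        pthread_jit_write_protect_np(0);   /* make MAP_JIT pages writable for this thread */
        memcpy(page, code, len);
        pthread_jit_write_protect_np(1);   /* flip back to executable */
        sys_icache_invalidate(page, len);  /* flush the i-cache before jumping to the new code */
        return page;
    }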
Hey, author of the article here. Thanks for the suggestion! I actually didn't know about the MAP_JIT flag to mmap before and will defo consider it. As to elevating your privs - this is just a temp solution until I work out how to add this entitlement https://developer.apple.com/documentation/bundleresources/en... to the Zig compiler. I wrote Zig's default MachO linker from scratch and it can embed ad-hoc code signatures no problem, but I haven't worked out baking the entitlements in yet.
The only entitlement that is relevant here is get-task-allow. But that would allow anyone to get a task control port for your application and do with it as they please. This functionality was not designed to be used in production; it exists for debugging.
The debugger entitlement is even more powerful, but once again, since you're modifying your own memory, you don't need it.
Wait, but what about debuggers then? Plus, hot-code reloading should only ever be used for quick development cycles when prototyping your app in debug mode, which is very much what a debugger is used for, right? Additionally, I actually based the implementation of this PoC on lldb's debugserver for macOS.
Exactly, and the way I see it, this is the only valid use case for hot-code reloading in the first place: an app in development. I'll try augmenting my linker so that the Zig compiler can bake the entitlements in, and this hopefully will remove the requirement for elevating privs via "sudo". Thanks for your comments though - it's been very enlightening :-)
Depending on the use case, you might want to have the application opt-in to the reload anyway (e.g. with before/after lifecycle callbacks), since any threads running in that address space would need to be paused, and this might lead to nasty situations if the developer isn't in control of this.
Yeah it's weird - basically all JIT compilers have this feature built-in: they usually create stubs for not-yet-called functions which invoke the JIT, and replace the stub calls with actual executable code.
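A toy sketch of the idea in C, not modelled on any particular JIT (real_body stands in for freshly compiled code): every call goes through a table, the table initially points at a stub, and the stub swaps itself out for the real code on first use.

    #include <stdio.h>

    typedef int (*fn_t)(int);

    static int real_body(int x) { return x * 2; }  /* stands in for freshly JIT'd code */
    static fn_t table[1];                          /* per-function dispatch table */

    /* The stub "compiles" the function, patches the table, then forwards the
       call. A real JIT would emit machine code here; we just swap a pointer. */
    static int stub_0(int x) {
        table[0] = real_body;
        return table[0](x);
    }

    int main(void) {
        table[0] = stub_0;             /* everything starts out pointing at the stub */
        printf("%d\n", table[0](21));  /* first call goes through the stub */
        printf("%d\n", table[0](21));  /* later calls hit the real code directly */
        return 0;
    }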
I implemented live reload (in a much simpler and more naive way) for my little game engine/framework written in C a few years ago, as a "scene" is essentially a reloadable shared library there. When it worked it was a huge productivity booster and time saver indeed, but I eventually stopped using the feature that much once I started relying more and more on passing function pointers around as callbacks - things break pretty badly when the function address changes. I had some ideas on how to tackle that, but never got around to implementing them. Language-level support looks very neat.
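For anyone who hasn't done the shared-library version: it's essentially a dlopen/dlclose dance around a couple of known entry points, roughly like the sketch below (scene.so, scene_api, scene_init and scene_update are made-up names, just for illustration):

    #include <dlfcn.h>
    #include <stdio.h>

    /* Hypothetical API exported by the reloadable scene library. */
    typedef struct {
        void (*init)(void *state);
        void (*update)(void *state, float dt);
    } scene_api;

    static void *handle;

    /* (Re)load ./scene.so and look up its entry points. Called again whenever
       the library has been rebuilt; game state lives outside the library so it
       survives the reload. */
    static int load_scene(scene_api *api) {
        if (handle) dlclose(handle);   /* unmap the old code first */
        handle = dlopen("./scene.so", RTLD_NOW);
        if (!handle) { fprintf(stderr, "%s\n", dlerror()); return -1; }
        api->init   = (void (*)(void *))dlsym(handle, "scene_init");
        api->update = (void (*)(void *, float))dlsym(handle, "scene_update");
        return (api->init && api->update) ? 0 : -1;
    }

The callback problem is then exactly the case of a function pointer outliving the dlopen handle it came from; an indirection table or handle-based lookup is the usual way around it.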
This seems like it would just crash or put your program into an inconsistent state. What happens if you are executing a transaction and mid-transaction your binary gets updated, so your transaction has now done half of what the old version needed and half of what the new version needs? I don't think it would be that hard to get into an inconsistent state.
> If that matters, you probably shouldn't be using hot code reloading regardless of the language or implementation.
Why not? Perhaps I'm missing the point, but can't this be mutex'd away? Or perhaps you're referring to the fact that the hot code reloading mechanism should provide the mutex implicitly?
Things like restoring connections can be mutex'd away in the implementation, yeah. Or whatever the actually-working version would be, maybe there needs to be a socket-owning parent process that never dies, I dunno.
But if you write
    result = 0
    while True:
        result += 2 * do_something()
and change that to this and reload:
    result = 1
    while True:
        result -= 2 * do_something()
then even "ideal, correct behavior" at a reload level can give you end result values that are not possible to get while running either version of the code end-to-end. It's not possible to "fix" that because it's doing exactly what you told it to do - either it preserves values and can produce impossible values, or it does not and it's not really hot-swapping any more. Or it does some mix of that and it has even more surprising / wrong edge cases.
There are of course uses for systems with hot-swap boundaries/transactions to prevent swapping during critical regions, or to do something like stop and replace actors rather than touching their internal state, and they exist (e.g. Erlang). But that requires your code to already be adopting those semantics, and it inherently defines boundaries on what is swapped and what is not. In that case the root "inconsistent transaction state" concern is not really a hot swapping issue, it's a failure to define your boundaries and semantics correctly - a normal bug / relying on undefined behavior.
Yep, you are absolutely right, and hence why this is a proof-of-concept after all and a start really. With the proper tweaks to the linker though, I am sure it would be possible to orchestrate updates in such a way as to not clobber any existing global state or cause weird UB (by overwriting the code a thread is currently executing, etc.). Either way, thanks so much for the feedback!
Yeah, I've never been a huge fan of hot-code reloading for that reason -- when I'm writing code I'm often trying to get something stable, and if it's having funky issues I don't want to be second-guessing whether it's the HCR or my own error. I've never found a HCR that's stable enough for me to not have that worry, and I don't think it's possible for the reason you mentioned.
Well, games have a leg up on the competition in this regard - games tend to have some update logic which is called 60 times per second.
If you can restrict the code that runs between frames, you can just replace the logic everywhere between frames, and not have to worry about what code the threads are executing at that time - since game logic is not running.
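In code that would look something like the sketch below, where the compiler's patches (however they arrive) are queued and only applied at the top of a frame, so no game logic is on the stack while the text is being rewritten. All names here are illustrative:

    #include <stdbool.h>

    /* Illustrative stubs: poll_for_patches() checks whatever IPC channel the
       compiler uses, apply_pending_patches() copies the queued code in place. */
    bool poll_for_patches(void);
    void apply_pending_patches(void);
    void update_game(double dt);
    void render(void);

    void run_frame(double dt) {
        if (poll_for_patches())
            apply_pending_patches();  /* safe point: no game logic is executing here */
        update_game(dt);              /* from here on the code must not change under us */
        render();
    }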
Great question. With the presented approach in Zig, since you work on the binary directly, after you finish your hot-code reloading session you can still run the generated binary from disk (and debug it or whatnot), as all writes to memory were also committed to the underlying file on disk. There is therefore no need to recompile your program into an executable from a dynamic library, as I guess would be the case for the approach taken by Nim/V.
The presented approach might also be more resource efficient as it writes directly to the program's memory rather than unloading and reloading a dynamic library, but this is very much a guess and I would need to do some benchmarking to get a better feel for it.
In general though, this approach is possible in Zig since first of all, we have our own linker for each target file format (ELF, MachO, COFF-coming-soon-tm), secondly, the compiler generates the executable file directly, and thirdly, incremental updates are super granular in order to minimise writes to the file as much as possible.
When I see hot code reloading, I think there are probably some server frameworks waiting to get this feature to reduce the feedback loop to a few tenths of a second of build… Nice overview!
I love the feature but I am a little worried about how much more complex it would make the toolchain. How many extra LOC does this add to the compiler and linker (and how big are they now)?
The compiler and linker are 186,865 lines of code. My `hcs` branch which adds hot code swapping support for Linux (a companion to the OP author's PoC) is 328 additions and 18 deletions [1]. And about 200 lines of those are adding ptrace constants to the std lib and a simple socket server to the compiler, just as a method of issuing commands while stdio is locked up in the running application.
Crazy right?
This is because incremental compilation & linking is actually the same problem as hot code swapping! This was just a natural fallout of the design of the compiler.
Same deal with Jakub's hot code swapping branch. It's 257 additions and 9 deletions and has the same ~150 lines of adding the socket server [2].
So in answer to your question, the PoC adds about 150 lines to a 186,865-line codebase.
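For the curious, the Linux mechanism boils down to the classic ptrace dance: stop the child, poke the new machine code into its text segment word by word, and let it continue. The sketch below is not the actual branch, just the rough shape of it; it assumes len is a multiple of the word size and skips all error handling:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Overwrite `len` bytes at `addr` in the traced child with `code`,
       one word at a time. ptrace is allowed to write into pages that are
       read-only in the child, which is what makes this work on .text. */
    static void patch_text(pid_t pid, uintptr_t addr, const uint8_t *code, size_t len) {
        ptrace(PTRACE_ATTACH, pid, NULL, NULL);
        waitpid(pid, NULL, 0);                      /* wait for the child to stop */

        for (size_t off = 0; off < len; off += sizeof(long)) {
            long word;
            memcpy(&word, code + off, sizeof(long));
            ptrace(PTRACE_POKETEXT, pid, (void *)(addr + off), (void *)word);
        }

        ptrace(PTRACE_DETACH, pid, NULL, NULL);     /* child resumes with the new code */
    }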
I'm using Zig, but this all sounds like being able to hot swap in Lisp dev, especially for games. I remember someone using Gambit on iPhone for game dev in 2010 or so pulling this off. This seems more complicated in comparison. I'm playing with this: https://github.com/michal-z/zig-gamedev
Mach for Zig looks interesting too, but it's very new.
> Instead of having your program managed by a running side-by-side loader program, what if the compiler would “simply” update the memory of the running process? You will most inevitably think I have gone completely crazy
This isn't crazy - this is how just-in-time compilers work.
A typical JIT compiler runs in the same process as the executable as part of its runtime -- it's a program modifying itself. This is describing a compiler modifying the memory of a separate running process out from underneath it.
Which does seem a little crazy, and the post doesn't go into details about how the compiler knows when it's reasonable to rewrite instructions -- it seems like you would need some runtime mechanism in the child to coordinate this, or you'd run the risk of... well, literally anything could happen, with the wrong patch at the wrong time. I feel like I might be missing something. The more I think about this the crazier it seems. :)
(And if you had this inter-process channel for coordinating changes, then the child process could also just patch itself based on messages from the compiler, removing the need for special cross-process memory-writing privileges. LiveReload for native code...)
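For reference, the cross-process write on macOS goes through the Mach task port, roughly as in the sketch below. These are the real Mach calls that debuggers (and debugserver) rely on, but the sequence is a simplification rather than a description of what the PoC does exactly; patch_remote is a made-up name, task_for_pid is the call that needs root or the get-task-allow entitlement discussed above, and on arm64 the target's instruction cache needs care on top of this.

    #include <sys/types.h>
    #include <mach/mach.h>
    #include <mach/mach_vm.h>

    /* Patch `len` bytes at `addr` inside another process. All error handling
       is elided; every call here fails without the right privileges. */
    static void patch_remote(pid_t pid, mach_vm_address_t addr,
                             const void *code, mach_msg_type_number_t len) {
        mach_port_t task;
        task_for_pid(mach_task_self(), pid, &task);  /* the privileged step */

        task_suspend(task);                          /* keep the target's threads out of the way */
        mach_vm_protect(task, addr, len, FALSE,
                        VM_PROT_READ | VM_PROT_WRITE | VM_PROT_COPY);
        mach_vm_write(task, addr, (vm_offset_t)code, len);
        mach_vm_protect(task, addr, len, FALSE,
                        VM_PROT_READ | VM_PROT_EXECUTE);
        task_resume(task);
    }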
> it seems like you would need some runtime mechanism in the child to coordinate this
I had been thinking about what mechanism you could put in place to coordinate, but you made me realize that if your language provides some kind of event loop, then that would be the right place to have the "please stop while I push this old function under the rug" functionality. And languages that have async/await have to have such a runtime, which can make the whole thing completely transparent to the programmer. But you still need the inter-process comms to let the runtime know it should wait for a bit, so your last sentence still applies: at that point have the process rewrite itself.
> how the compiler knows when it's reasonable to rewrite instructions
JIT languages tend to be also GC languages, which tend to have a very good idea of when it's safe to stop a thread, and what the values in the registers and stack mean at a particular point in time.
Yep, a JIT will typically use on-stack replacement to keep execution coherent. Hot code reloading has to deal with the semantics of the code changing out from under it, which is substantially more complicated to deal with (and many implementations do not really handle this very well).
JITs also support code reloading. E.g. the normal Clojure(Script) workflow is based on reloading, both on the JVM and on JS platforms. But functions are not altered mid-execution; rather, your top-level invocation or main loop picks up the new versions of functions from the updated namespaces after a reload. This seems more reliable in the face of function argument changes.
JITs don’t need to reload code; they run the same code and tier it up to an optimized implementation when it gets hot. No execution context is lost, at least semantically.
I don’t know much about the internals of a JIT compiler, but wouldn’t it just (recursively) deoptimize the methods which include the inlined method and do a normal method call instead which will now go to the updated implementation?