Why not simply use MAP_JIT?
There is no need for these weird tricks, and you're putting your process and the rest of the system at risk by elevating privileges and using task_for_pid.
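Roughly, the in-process MAP_JIT route looks like the sketch below. The install_code wrapper and the missing error handling are just for illustration, but mmap with MAP_JIT, pthread_jit_write_protect_np and sys_icache_invalidate are the actual macOS APIs, and under the hardened runtime you also need the com.apple.security.cs.allow-jit entitlement:

    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <pthread.h>
    #include <libkern/OSCacheControl.h>

    /* Copy freshly generated machine code into a MAP_JIT page and return a
       pointer that can be called. No task_for_pid involved: the process only
       ever touches its own memory. */
    static void *install_code(const void *code, size_t len) {
        void *page = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                          MAP_PRIVATE | MAP_ANON | MAP_JIT, -1, 0);
        if (page == MAP_FAILED) return NULL;

        pthread_jit_write_protect_np(0);   /* make MAP_JIT pages writable for this thread */
        memcpy(page, code, len);
        pthread_jit_write_protect_np(1);   /* flip back to executable */
        sys_icache_invalidate(page, len);  /* flush the i-cache before jumping to the new code */
        return page;
    }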
Hey, author of the article here. Thanks for the suggestion! I actually didn't know about the MAP_JIT flag to mmap before and will defo consider it. As to elevating your privs - this is just a temp solution until I work out how to add this entitlement https://developer.apple.com/documentation/bundleresources/en... to the Zig compiler. I wrote Zig's default MachO linker from scratch and it can embed ad-hoc code signatures no problem, but I haven't worked out baking the entitlements in yet.
The only entitlement that is relevant here is get-task-allow. But that would allow anyone to get a task control port for your application and do with it as they please. This functionality was not designed to be used in production; it exists for debugging.
The debugger entitlement is even more powerful, but once again, since you're modifying your own memory, you don't need it.
Wait, but what about debuggers then? Plus, hot-code reloading should only ever be used for quick development cycles when prototyping your app in debug mode, which is very much what a debugger is used for, right? Additionally, I actually based the implementation of this PoC on lldb's debugserver for macOS.
Exactly, and the way I see it, this is the only valid use case for hot-code reloading in the first place: an app in development. I'll try augmenting my linker so that the Zig compiler can bake the entitlements in, and this hopefully will remove the requirement for elevating privs via "sudo". Thanks for your comments though - it's been very enlightening :-)
Depending on the use case, you might want to have the application opt-in to the reload anyway (e.g. with before/after lifecycle callbacks), since any threads running in that address space would need to be paused, and this might lead to nasty situations if the developer isn't in control of this.
Yeah it's weird - basically all JIT compilers have this feature built-in: they usually create stubs for not-yet-called functions which invoke the JIT, and replace the stub calls with actual executable code.
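A toy sketch of the idea in C, not modelled on any particular JIT (real_body stands in for freshly compiled code): every call goes through a table, the table initially points at a stub, and the stub swaps itself out for the real code on first use.

    #include <stdio.h>

    typedef int (*fn_t)(int);

    static int real_body(int x) { return x * 2; }  /* stands in for freshly JIT'd code */
    static fn_t table[1];                          /* per-function dispatch table */

    /* The stub "compiles" the function, patches the table, then forwards the
       call. A real JIT would emit machine code here; we just swap a pointer. */
    static int stub_0(int x) {
        table[0] = real_body;
        return table[0](x);
    }

    int main(void) {
        table[0] = stub_0;             /* everything starts out pointing at the stub */
        printf("%d\n", table[0](21));  /* first call goes through the stub */
        printf("%d\n", table[0](21));  /* later calls hit the real code directly */
        return 0;
    }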
I implemented live reload (in a much simpler and more naive way) for my little game engine/framework written in C a few years ago, as a "scene" is essentially a reloadable shared library there. When it worked it was a huge productivity booster and time saver indeed, but I eventually stopped using the feature that much once I started relying more and more on passing function pointers around as callbacks - things break pretty badly when the function address changes. I had some ideas on how to tackle that, but never got around to implementing them. Language-level support looks very neat.
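For anyone who hasn't done the shared-library version: it's essentially a dlopen/dlclose dance around a couple of known entry points, roughly like the sketch below (scene.so, scene_api, scene_init and scene_update are made-up names, just for illustration):

    #include <dlfcn.h>
    #include <stdio.h>

    /* Hypothetical API exported by the reloadable scene library. */
    typedef struct {
        void (*init)(void *state);
        void (*update)(void *state, float dt);
    } scene_api;

    static void *handle;

    /* (Re)load ./scene.so and look up its entry points. Called again whenever
       the library has been rebuilt; game state lives outside the library so it
       survives the reload. */
    static int load_scene(scene_api *api) {
        if (handle) dlclose(handle);   /* unmap the old code first */
        handle = dlopen("./scene.so", RTLD_NOW);
        if (!handle) { fprintf(stderr, "%s\n", dlerror()); return -1; }
        api->init   = (void (*)(void *))dlsym(handle, "scene_init");
        api->update = (void (*)(void *, float))dlsym(handle, "scene_update");
        return (api->init && api->update) ? 0 : -1;
    }

The callback problem is then exactly the case of a function pointer outliving the dlopen handle it came from; an indirection table or handle-based lookup is the usual way around it.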
This seems like it would just crash or put your program into an inconsistent state. What happens if you are executing a transaction and mid-transaction your binary gets updated, so your transaction has now done half of what the old version needed and half of what the new version needs? I don't think it would be that hard to get into an inconsistent state.
> If that matters, you probably shouldn't be using hot code reloading regardless of the language or implementation.
Why not? Perhaps I'm missing the point, but can't this be mutex'd away? Or perhaps you're referring to the fact that the hot code reloading mechanism should provide the mutex implicitly?
Things like restoring connections can be mutex'd away in the implementation, yeah. Or whatever the actually-working version would be, maybe there needs to be a socket-owning parent process that never dies, I dunno.
But if you write
    result = 0
    while True:
        result += 2 * do_something()
and change that to this and reload:
    result = 1
    while True:
        result -= 2 * do_something()
then even "ideal, correct behavior" at a reload level can give you end result values that are not possible to get while running either version of the code end-to-end. It's not possible to "fix" that because it's doing exactly what you told it to do - either it preserves values and can produce impossible values, or it does not and it's not really hot-swapping any more. Or it does some mix of that and it has even more surprising / wrong edge cases.
There are of course uses for systems with hot-swap boundaries/transactions to prevent swapping during critical regions, or to do something like stop and replace actors rather than touching their internal state, and they exist (e.g. Erlang). But that requires your code to already be adopting those semantics, and it inherently defines boundaries on what is swapped and what is not. In that case the root "inconsistent transaction state" concern is not really a hot swapping issue, it's a failure to define your boundaries and semantics correctly - a normal bug / relying on undefined behavior.
Yep, you are absolutely right, and hence why this is a proof-of-concept after all and a start really. With the proper tweaks to the linker though, I am sure it would be possible to orchestrate updates in such a way as to not clobber any existing global state or cause weird UB (by overwriting the code a thread is currently executing, etc.). Either way, thanks so much for the feedback!
Yeah, I've never been a huge fan of hot-code reloading for that reason -- when I'm writing code I'm often trying to get something stable, and if it's having funky issues I don't want to be second-guessing whether it's the HCR or my own error. I've never found a HCR that's stable enough for me to not have that worry, and I don't think it's possible for the reason you mentioned.
Well, games have a leg up on the competition in this regard - games tend to have some update logic which is called 60 times per second.
If you can restrict the code that runs between frames, you can just replace the logic everywhere between frames, and not have to worry about what code the threads are executing at that time - since game logic is not running.
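In code that would look something like the sketch below, where the compiler's patches (however they arrive) are queued and only applied at the top of a frame, so no game logic is on the stack while the text is being rewritten. All names here are illustrative:

    #include <stdbool.h>

    /* Illustrative stubs: poll_for_patches() checks whatever IPC channel the
       compiler uses, apply_pending_patches() copies the queued code in place. */
    bool poll_for_patches(void);
    void apply_pending_patches(void);
    void update_game(double dt);
    void render(void);

    void run_frame(double dt) {
        if (poll_for_patches())
            apply_pending_patches();  /* safe point: no game logic is executing here */
        update_game(dt);              /* from here on the code must not change under us */
        render();
    }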
Great question. With the presented approach in Zig, since you work on the binary directly, after you finish your hot-code reloading session you can still run the generated binary from disk (and debug it or whatnot), as all writes to memory were also committed to the underlying file on disk. There is therefore no need to recompile your program into an executable from a dynamic library, as I guess would be the case for the approach taken by Nim/V.
The presented approach might also be more resource efficient as it writes directly to the program's memory rather than unloading and reloading a dynamic library, but this is very much a guess and I would need to do some benchmarking to get a better feel for it.
In general though, this approach is possible in Zig since first of all, we have our own linker for each target file format (ELF, MachO, COFF-coming-soon-tm), secondly, the compiler generates the executable file directly, and thirdly, incremental updates are super granular in order to minimise writes to the file as much as possible.
When I see hot code reloading, I think there are probably some server frameworks waiting to get this feature to reduce the feedback loop to a few tenths of a second of build… Nice overview!
I love the feature but I am a little worried about how much more complex it would make the toolchain. How many extra LOC does this add to the compiler and linker (and how big are they now)?
The compiler and linker are 186,865 lines of code. My `hcs` branch which adds hot code swapping support for Linux (a companion to the OP author's PoC) is 328 additions and 18 deletions [1]. And about 200 lines of those are adding ptrace constants to the std lib and a simple socket server to the compiler, just as a method of issuing commands while stdio is locked up in the running application.
Crazy right?
This is because incremental compilation & linking is actually the same problem as hot code swapping! This was just a natural fallout of the design of the compiler.
Same deal with Jakub's hot code swapping branch. It's 257 additions and 9 deletions and has the same ~150 lines of adding the socket server [2].
So in answer to your question, the PoC adds about 150 lines to a 186,865-line codebase.
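For the curious, the Linux mechanism boils down to the classic ptrace dance: stop the child, poke the new machine code into its text segment word by word, and let it continue. The sketch below is not the actual branch, just the rough shape of it; it assumes len is a multiple of the word size and skips all error handling:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Overwrite `len` bytes at `addr` in the traced child with `code`,
       one word at a time. ptrace is allowed to write into pages that are
       read-only in the child, which is what makes this work on .text. */
    static void patch_text(pid_t pid, uintptr_t addr, const uint8_t *code, size_t len) {
        ptrace(PTRACE_ATTACH, pid, NULL, NULL);
        waitpid(pid, NULL, 0);                      /* wait for the child to stop */

        for (size_t off = 0; off < len; off += sizeof(long)) {
            long word;
            memcpy(&word, code + off, sizeof(long));
            ptrace(PTRACE_POKETEXT, pid, (void *)(addr + off), (void *)word);
        }

        ptrace(PTRACE_DETACH, pid, NULL, NULL);     /* child resumes with the new code */
    }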
I'm using Zig, but this all sounds like being able to hot swap in Lisp dev, especially for games. I remember someone using Gambit on iPhone for game dev in 2010 or so pulling this off. This seems more complicated in comparison. I'm playing with this: https://github.com/michal-z/zig-gamedev
Mach for Zig looks interesting too, but it's very new.
> Instead of having your program managed by a running side-by-side loader program, what if the compiler would “simply” update the memory of the running process? You will most inevitably think I have gone completely crazy
This isn't crazy - this is how just-in-time compilers work.
A typical JIT compiler runs in the same process as the executable as part of its runtime -- it's a program modifying itself. This is describing a compiler modifying the memory of a separate running process out from underneath it.
Which does seem a little crazy, and the post doesn't go into details about how the compiler knows when it's reasonable to rewrite instructions -- it seems like you would need some runtime mechanism in the child to coordinate this, or you'd run the risk of... well, literally anything could happen, with the wrong patch at the wrong time. I feel like I might be missing something. The more I think about this the crazier it seems. :)
(And if you had this inter-process channel for coordinating changes, then the child process could also just patch itself based on messages from the compiler, removing the need for special cross-process memory-writing privileges. LiveReload for native code...)
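For reference, the cross-process write on macOS goes through the Mach task port, roughly as in the sketch below. These are the real Mach calls that debuggers (and debugserver) rely on, but the sequence is a simplification rather than a description of what the PoC does exactly; patch_remote is a made-up name, task_for_pid is the call that needs root or the get-task-allow entitlement discussed above, and on arm64 the target's instruction cache needs care on top of this.

    #include <sys/types.h>
    #include <mach/mach.h>
    #include <mach/mach_vm.h>

    /* Patch `len` bytes at `addr` inside another process. All error handling
       is elided; every call here fails without the right privileges. */
    static void patch_remote(pid_t pid, mach_vm_address_t addr,
                             const void *code, mach_msg_type_number_t len) {
        mach_port_t task;
        task_for_pid(mach_task_self(), pid, &task);  /* the privileged step */

        task_suspend(task);                          /* keep the target's threads out of the way */
        mach_vm_protect(task, addr, len, FALSE,
                        VM_PROT_READ | VM_PROT_WRITE | VM_PROT_COPY);
        mach_vm_write(task, addr, (vm_offset_t)code, len);
        mach_vm_protect(task, addr, len, FALSE,
                        VM_PROT_READ | VM_PROT_EXECUTE);
        task_resume(task);
    }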
> it seems like you would need some runtime mechanism in the child to coordinate this
I had been thinking about what mechanism you could put in place to coordinate, but you made me realize that if your language provides some kind of event loop, then that would be the right place to have the "please stop while I push this old function under the rug" functionality. And languages that have async/await have to have such a runtime, which can make the whole thing completely transparent to the programmer. But you still need the inter-process comms to let the runtime know it should wait for a bit, so your last sentence still applies: at that point have the process rewrite itself.
> how the compiler knows when it's reasonable to rewrite instructions
JIT languages tend to be also GC languages, which tend to have a very good idea of when it's safe to stop a thread, and what the values in the registers and stack mean at a particular point in time.
Yep, a JIT will typically use on-stack replacement to keep execution coherent. Hot code reloading has to deal with the semantics of the code changing out from under it, which is substantially more complicated to deal with (and many implementations do not really handle this very well).
JITs also support code reloading. E.g. the normal Clojure(Script) workflow is based on reloading, both on the JVM and on JS platforms. But functions are not altered mid-execution; rather, your top-level invocation or main loop picks up the new versions of functions from the updated namespaces after a reload. This seems more reliable in the face of function argument changes.
JITs don’t need to reload code; they run the same code and tier it up to an optimized implementation when it gets hot. No execution context is lost, at least semantically.
I don’t know much about the internals of a JIT compiler, but wouldn’t it just (recursively) deoptimize the methods which include the inlined method and do a normal method call instead which will now go to the updated implementation?