Hot-code reloading on macOS/arm64 with Zig (jakubkonka.com)
237 points by syrusakbary on March 17, 2022 | 54 comments



Why not simply use MAP_JIT? There is no need to do these weird tricks, plus you’re putting your process and the rest of the system at risk by elevating privilege and using task-for-pid.


Hey, author of the article here. Thanks for the suggestion! I actually didn't know about the MAP_JIT flag to mmap before and will defo consider it. As to elevating your privs - this is just a temp solution until I work out how to add this entitlement https://developer.apple.com/documentation/bundleresources/en... to the Zig compiler. I wrote Zig's default MachO linker from scratch and it can embed the adhoc code signatures no probs, but I haven't worked out baking the entitlements in yet.


I was just thinking to myself, “The author probably didn’t know about MAP_JIT…” after reading the parent comment. I sure didn’t know about it.

There’s just so much software out there and so many little bits of information one can gather.


The only entitlement that is relevant here is get-task-allow. But that would allow anyone to get a control task port for your application and do with it as they may. This functionality was not designed to be used in production — except for debugging.

The debugger entitlement is even more powerful, but once again since you’re modifying your own memory you don’t need it.


Wait, but what about debuggers then? Plus hot-code reloading should only ever be used for quick development cycles when prototyping your app in debug mode, so very much what a debugger is used for, right? Additionally, I actually based the implementation of this PoC on lldb's debugserver for macOS.


Yeah this entitlement says your application is still “in-development” and allows a debugger to modify its executable memory.


Exactly, and the way I see it, this is the only valid use case for hot-code reloading in the first place: app in-development. I'll try augmenting my linker to be able to bake in the entitlements into the Zig compiler and this hopefully will remove the requirement for elevating privs via "sudo". Thanks for your comments though - it's been very enlightening :-)


> Plus hot-code reloading should only ever be used for quick development cycles when prototyping your app in debug mode

What about evolutionary/genetic programming? That can definitely take advantage of this as well.


The entitlement exists to enable development workflows like these.


Depending on the use case, you might want to have the application opt-in to the reload anyway (e.g. with before/after lifecycle callbacks), since any threads running in that address space would need to be paused, and this might lead to nasty situations if the developer isn't in control of this.

You also wouldn't need the entitlement anymore.
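
A rough sketch of what such an opt-in could look like, in Python for brevity. The names `ReloadHooks`, `before_reload`, `after_reload` and `apply_reload` are hypothetical and are not part of the Zig PoC; the point is only that the application decides what happens around the swap.

```python
from typing import Callable, Dict, List


class ReloadHooks:
    """Hypothetical registry an application uses to opt in to reloads."""

    def __init__(self) -> None:
        self._before: List[Callable[[], None]] = []
        self._after: List[Callable[[], None]] = []

    def before_reload(self, fn: Callable[[], None]) -> Callable[[], None]:
        self._before.append(fn)
        return fn

    def after_reload(self, fn: Callable[[], None]) -> Callable[[], None]:
        self._after.append(fn)
        return fn

    def apply_reload(self, swap: Callable[[], None]) -> None:
        # Run the app's "pause" hooks, perform the swap, then the "resume" hooks.
        for fn in self._before:
            fn()
        swap()
        for fn in self._after:
            fn()


hooks = ReloadHooks()
events: List[str] = []
code: Dict[str, Callable[[], str]] = {"greet": lambda: "v1"}


@hooks.before_reload
def pause_workers() -> None:
    events.append("paused")  # a real app would park its threads here


@hooks.after_reload
def resume_workers() -> None:
    events.append("resumed")


def swap_in_v2() -> None:
    code["greet"] = lambda: "v2"  # stand-in for patching the code
    events.append("swapped")


hooks.apply_reload(swap_in_v2)
```

After the call, `events` records the pause, the swap and the resume in that order, and `code["greet"]` points at the new version.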


You’ll need to get the task port of the child process if you want to patch code in it.


This looks awesome for game development! Short feedback loops are a superpower.


This is exactly what I'm aiming to use Zig for: I'm working on a game engine in Zig[0] that I hope to leverage hot code reloading with

[0] https://hexops.com/mach/


If you watch some of Notch’s old game jam live streams, this is exactly how he used his Java environment, and it is probably why he was so invested in it.

You can see some examples here when he just tests logic changes:

https://youtu.be/MhQ70O1MiXc

(Though, he restarts often since he’s editing preloaded assets also)


The irony is that C++ got it before Java, but those environments were too resource hungry for them to succeed in the market.

Now almost 30 years later is when we see them being adopted.

"Lucid Energize Demo VHS 1993"

https://www.youtube.com/watch?v=pQQTScuApWk

Visual Age for C++ v4 had a similar capability, it was image based and allowed for Smalltalk like workflows.


Yeah it's weird - basically all JIT compilers have this feature built-in - they usually create stubs for not-yet-called functions which invoke the JIT, and replace the stub calls with actual executable code.
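
A toy illustration of the stub idea, assuming a dispatch table and a made-up `compile_function` that stands in for a real JIT backend (both are hypothetical; a real JIT patches machine code rather than a dictionary):

```python
from typing import Callable, Dict

calls_to_compiler = 0
table: Dict[str, Callable[[int], int]] = {}


def compile_function(name: str) -> Callable[[int], int]:
    """Stand-in for a JIT backend: 'compiles' a function on demand."""
    global calls_to_compiler
    calls_to_compiler += 1
    if name == "double":
        return lambda x: 2 * x
    raise KeyError(name)


def make_stub(name: str) -> Callable[[int], int]:
    def stub(x: int) -> int:
        # First call: compile, patch the table slot, then run the real code.
        real = compile_function(name)
        table[name] = real
        return real(x)

    return stub


table["double"] = make_stub("double")

first = table["double"](21)  # goes through the stub and triggers compilation
second = table["double"](4)  # calls the compiled function directly
```

The compiler is invoked once; the second call never sees the stub because the table slot has been replaced.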


Visual Studio also has had it for several years in debug mode for C++, Edit-and-Continue.

Since last year they have been working on it to make it more dynamic, improving the use cases.

It is these little things that make it so much better than the UNIX alternatives.

However, there is also CINT and now Cling in ROOT from CERN, or Live++.


I have implemented live reload (in a much simpler and naive way) for my little game engine/framework written in C a few years ago, as a "scene" is essentially a reloadable shared library there. When it worked it was a huge productivity booster and time saver indeed, but I've eventually stopped using this feature that much once I started relying more and more on passing function pointers around as callbacks - so things break pretty bad when the function address changes. I had some ideas on how to tackle that, but never got around to implementing it so far. Language level support looks very neat.


Optional late binding - function pointers that go through a lookup step (string or slot) before finding their destination.
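
A minimal sketch of the string-lookup variant, in Python. The `registry`, `register` and `late_bound` names are made up for illustration; the same shape works in C with a table of slots instead of a dictionary.

```python
from typing import Callable, Dict

registry: Dict[str, Callable[[], str]] = {}


def register(name: str):
    """Decorator that (re)binds a name to an implementation."""
    def deco(fn: Callable[[], str]) -> Callable[[], str]:
        registry[name] = fn
        return fn

    return deco


def late_bound(name: str) -> Callable[[], str]:
    """Return a callback that looks up its target by name on every call."""
    def call() -> str:
        return registry[name]()

    return call


@register("on_click")
def on_click_v1() -> str:
    return "v1"


direct = registry["on_click"]      # raw "function pointer", captured early
indirect = late_bound("on_click")  # late-bound handle, safe to pass around


# Simulate a reload that replaces the implementation.
@register("on_click")
def on_click_v2() -> str:
    return "v2"
```

After the simulated reload, `direct()` still runs the old code while `indirect()` reaches the new one, at the cost of one lookup per call.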


Yeah, that was the idea, but doing it in a way that doesn't require excessive boilerplate needs some creativity :)


This seems like it would just crash or make your program get into an inconsistent state. What happens if you are executing a transaction and, mid-transaction, your binary gets updated, so the transaction has done half of what the old version needs and half of what the new version needs? I don't think it would be that hard to get into an inconsistent state.


If that matters, you probably shouldn't be using hot code reloading regardless of the language or implementation.


> If that matters, you probably shouldn't be using hot code reloading regardless of the language or implementation.

Why not? Perhaps I'm missing the point, but can't this be mutex'd away? Or perhaps you're referring to the fact that the hot code reloading mechanism should provide the mutex implicitly?


Things like restoring connections can be mutex'd away in the implementation, yeah. Or whatever the actually-working version would be, maybe there needs to be a socket-owning parent process that never dies, I dunno.

But if you write

    result = 0
    while True:
      result += 2 * do_something()

and change that to this and reload:

    result = 1
    while True:
      result -= 2 * do_something()

then even "ideal, correct behavior" at a reload level can give you end result values that are not possible to get while running either version of the code end-to-end. It's not possible to "fix" that because it's doing exactly what you told it to do - either it preserves values and can produce impossible values, or it does not and it's not really hot-swapping any more. Or it does some mix of that and it has even more surprising / wrong edge cases.
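
To make that concrete, here is a bounded version of the two loops. The five-iteration limit, the swap point and a `do_something()` that always returns 1 are assumptions added so the example terminates; the original loops run forever.

```python
from typing import Callable, List


def do_something() -> int:
    return 1  # hypothetical stand-in for the real work


def old_step(result: int) -> int:
    return result + 2 * do_something()


def new_step(result: int) -> int:
    return result - 2 * do_something()


def run(start: int, steps: List[Callable[[int], int]]) -> int:
    """Apply each step in order, carrying `result` across a swap."""
    result = start
    for step in steps:
        result = step(result)
    return result


N = 5
old_only = run(0, [old_step] * N)                      # old code end-to-end
new_only = run(1, [new_step] * N)                      # new code end-to-end
hot_swapped = run(0, [old_step] * 3 + [new_step] * 2)  # reload after 3 iterations
```

With these assumptions the old code ends at 10, the new code at -9, and the hot-swapped run at 2, a value neither version produces on its own in five iterations.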

There are of course uses for systems with hot-swap boundaries/transactions to prevent swapping during critical regions, or to do something like stop and replace actors rather than touching their internal state, and they exist (e.g. erlang). But that requires your code to already be adopting those semantics, and it inherently defines boundaries on what is swapped and what is not. In that case the root "inconsistent transaction state" concern is not really a hot swapping issue, it's a failure to define your boundaries and semantics correctly - a normal bug / relying on undefined behavior.


Yep, you are absolutely right, and hence why this is a proof-of-concept after all and a start really. With the proper tweaks to the linker though I am sure it would be possible to orchestrate updates in such a way as to not clobber any existing global state or cause weird UB (by overwriting the code a thread is currently executing, etc.). Either way, thanks so much for the feedback!


Yeah, I've never been a huge fan of hot-code reloading for that reason -- when I'm writing code I'm often trying to get something stable, and if it's having funky issues I don't want to be second-guessing whether it's the HCR or my own error. I've never found a HCR that's stable enough for me to not have that worry, and I don't think it's possible for the reason you mentioned.


Well, games have a leg up on the competition in this regard - games tend to have some update logic which is called 60 times per second.

If you can restrict the code that runs between frames, you can just replace the logic everywhere between the frames, and not have to worry about what code the threads are executing at this time - since game logic is not running.
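
A single-threaded sketch of that frame-boundary swap. The `Game` class and `request_reload` are hypothetical names, and a real engine would also have to park worker threads before the swap.

```python
from typing import Callable, Dict, Optional

State = Dict[str, int]
Update = Callable[[State], None]


def update_v1(state: State) -> None:
    state["x"] += 1


def update_v2(state: State) -> None:
    state["x"] += 10


class Game:
    def __init__(self, update: Update) -> None:
        self.update = update
        self.pending: Optional[Update] = None
        self.state: State = {"x": 0}

    def request_reload(self, new_update: Update) -> None:
        # May arrive at any time; it is only recorded here, not applied.
        self.pending = new_update

    def run_frame(self) -> None:
        # Safe point: no game logic is executing between frames.
        if self.pending is not None:
            self.update, self.pending = self.pending, None
        self.update(self.state)


game = Game(update_v1)
game.run_frame()                # x == 1
game.run_frame()                # x == 2
game.request_reload(update_v2)  # takes effect at the next frame boundary
game.run_frame()                # x == 12
```

State survives the swap, and the new logic only ever starts at the top of a frame.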



Nice! V does it the same way as Nim though, right? From quickly browsing over the sources, it looks like it is based on dynamic library hot swapping https://github.com/vlang/v/tree/master/vlib/v/live


Not very familiar with the mechanism, sorry. The vlang community is very reactive, both in GH Issues and the Discord channel.

They can support you with good advice in the further developing of Zig, I'm sure! Good luck!


V author here. Yes, dynamic library hot swapping, but we're also planning to implement a more sophisticated way to do this.


What is the more sophisticated way? Do you have an Issue / Milestone in GitHub?


What are the pros/cons of one vs. the other?


Great question. With the presented approach in Zig, since you work on the binary directly, after you finish your hot-code reloading session, you can still run the generated binary from disk (and debug it or whatnot) as all writes to memory were also committed to the underlying file on disk. There is therefore no need to recompile your program to an executable from a dynamic library as I guess would be the case for the approach taken by Nim/V.

The presented approach might also be more resource efficient as it is writing directly to program's memory rather than unloading and reloading a dynamic library, but this is very much a guess and I would need to do some benchmarking to get a better feel for it.

In general though, this approach is possible in Zig since first of all, we have our own linker for each target file format (ELF, MachO, COFF-coming-soon-tm), secondly, the compiler generates the executable file directly, and thirdly, incremental updates are super granular in order to minimise writes to the file as much as possible.


When I see hot code reloading, I think there are probably some server frameworks that are waiting to get this feature to reduce the feedback loop to a few tenths of a second of build time… Nice overview!


I love the feature but I am a little worried about how much more complex it would make the toolchain. How many extra LOC does this add to the compiler and linker (and how big are they now)?


Compiler and linker are 186,865 lines of code. My `hcs` branch which adds hot code swapping support for Linux (a companion PoC to the one by the author of the OP) is 328 additions and 18 deletions [1]. And about 200 lines of those are adding ptrace constants to the std lib and a simple socket server to the compiler just as a method of issuing commands while the stdio is locked up in the running application.

Crazy right?

This is because incremental compilation & linking is actually the same problem as hot code swapping! This was just a natural fallout of the design of the compiler.

Same deal with Jakub's hot code swapping branch. It's 257 additions and 9 deletions and has the same ~150 lines of adding the socket server [2].

So in answer to your question, the PoC adds about 150 lines to a 186,865-line codebase.

[1]: https://github.com/ziglang/zig/compare/hcs

[2]: https://github.com/ziglang/zig/compare/hcs-macos


It's a great feature that might become one of the killer features of zig, among others. Very much worth the extra complexity.


I'm using Zig, but this all sounds like being able to hot swap in Lisp dev, especially for games. I remember someone using Gambit and iPhone for game dev in 2010 or so pulling this off. This seems more complicated in comparison. I'm playing with this: https://github.com/michal-z/zig-gamedev

Mach for Zig looks interesting too, but it's very new.


Can hot-code reloading coexist well with inlined functions?


> Instead of having your program managed by a running side-by-side loader program, what if the compiler would “simply” update the memory of the running process? You will most inevitably think I have gone completely crazy

This isn't crazy - this is how just-in-time compilers work.


A typical JIT compiler runs in the same process as the executable as part of its runtime -- it's a program modifying itself. This is describing a compiler modifying the memory of a separate running process out from underneath it.

Which does seem a little crazy, and the post doesn't go into details about how the compiler knows when it's reasonable to rewrite instructions -- it seems like you would need some runtime mechanism in the child to coordinate this, or you'd run the risk of... well, literally anything could happen, with the wrong patch at the wrong time. I feel like I might be missing something. The more I think about this the crazier it seems. :)

(And if you had this inter-process channel for coordinating changes, then the child process could also just patch itself based on messages from the compiler, removing the need for special cross-process memory-writing privileges. LiveReload for native code...)


> it seems like you would need some runtime mechanism in the child to coordinate this

I had been thinking about what mechanism you could put in place to coordinate, but you made me realize that if your language provides some kind of event loop, then that would be the right place to have the "please stop while I push this old function under the rug" functionality. And languages that have async/await have to have such a runtime, which can make the whole thing completely transparent to the programmer. But you still need the inter-process comms to let the runtime know it should wait for a bit, so your last sentence still applies: at that point have the process rewrite itself.


> how the compiler knows when it's reasonable to rewrite instructions

JIT languages tend to be also GC languages, which tend to have a very good idea of when it's safe to stop a thread, and what the values in the registers and stack mean at a particular point in time.


It is a bit crazy if you do it for hot code reloading, though, as the logic has now changed.

In the case of JIT, it may swap the bytecode but the flow of execution remains the same.

Hot code reloading usually requires a different approach.


Yep, a JIT will typically use on-stack replacement to keep execution coherent. Hot code reloading has to deal with the semantics of the code changing out from under it, which is substantially more complicated to deal with (and many implementations do not really handle this very well).


JITs also support code reloading. E.g. the normal Clojure(Script) workflow is based on reloading, on both the JVM and JS platforms. But functions are not altered mid-execution; rather, your top-level invocation or main loop picks up the new versions of functions from the updated namespaces after a reload. This seems more reliable in the face of function argument changes.
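
The same namespace-level workflow can be sketched in Python with `importlib.reload`. The module name `hot_logic` and the temporary directory are made up for the example, and this is only an analogy to the Clojure workflow, not how Clojure implements it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip bytecode caching so a quick rewrite of the source is never masked
# by a stale cached .pyc file.
sys.dont_write_bytecode = True

workdir = Path(tempfile.mkdtemp())
module_path = workdir / "hot_logic.py"  # hypothetical module
module_path.write_text("def step(x):\n    return x + 1\n")
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

hot_logic = importlib.import_module("hot_logic")


def main_loop_iteration(x: int) -> int:
    # Resolves hot_logic.step on every call, so it picks up reloaded code.
    return hot_logic.step(x)


before = main_loop_iteration(1)

# "Edit" the module: new body and an extra (defaulted) argument, then reload.
module_path.write_text("def step(x, scale=100):\n    return x * scale\n")
importlib.reload(hot_logic)

after = main_loop_iteration(1)
```

Nothing is patched mid-call: the iteration that was running finishes with the old `step`, and the next lookup through the module namespace finds the new one.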


> In the case of JIT, it may swap the bytecode but the flow of execution remains the same.

Are you able to elaborate on this further?

1a. Does this mean the process gets restarted during reload?

1b. If no restart, how are active threads handled? What if a thread should no longer exist post-reload?


JITs don’t need to reload code, they run the same code and tier it up to an optimized implementation when it gets hot. No execution context is lost, at least semantically.


> JITs don’t need to reload code

If I've inlined a method, and someone redefines that method in a language that allows that, then I'm going to need to reload that code.


I don’t know much about the internals of a JIT compiler, but wouldn’t it just (recursively) deoptimize the methods which include the inlined method and do a normal method call instead which will now go to the updated implementation?


Yes, but if a called function redefines its own parent, it still returns to the old code, not the new code.
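
Plain Python shows the same semantics without any JIT involved, which may help as an analogy: rebinding a function's name while it is on the stack does not change the activation that is already running.

```python
from typing import List

log: List[str] = []


def child() -> None:
    global parent

    def new_parent() -> None:
        log.append("new parent")

    # Rebind the name while the old parent is still mid-call.
    parent = new_parent


def parent() -> None:
    log.append("old parent: before child")
    child()
    # Control returns here, into the old body, not into new_parent.
    log.append("old parent: after child")


parent()  # runs the old body to completion
parent()  # the name now resolves to the new definition
```

The first call logs both "old parent" lines; only the second call reaches the new code.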


That is true, but JITs nevertheless perform on-stack replacement for optimization purposes.


dang, could you add https:// at the start of the link?



