Make a ZIP file containing the blob, and catenate it to the end of the executable binary. The ZIP format specifically puts all of the key metadata at the back of the file, so pretty much any ZIP tool can correctly read/list/extract data from the ZIP portion of the file. Anything that needs to be linked at runtime can just be extracted to a temp dir, and then cleaned up on exit. Bonus points for getting "free" compression on text data blobs.
We do this for Python applications, by combining a ZIP containing the "link tree" of sources/packages/modules, with a shell bootstrap script that automatically sets up the environment, import path, etc., and Python itself has built-in support for importing pure-Python modules from a ZIP file. All that's needed for native modules is a simple import hook that extracts the native objects into temp space and then loads them appropriately.
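A side note on why the trailing-ZIP trick works at all (a minimal sketch of my own, with error handling mostly omitted): ZIP readers locate the End of Central Directory record by scanning backward from the end of the file for its signature, so whatever precedes the archive, including an entire executable, is simply ignored.

    #include <stdio.h>
    #include <string.h>

    /* Return the file offset of the ZIP End-of-Central-Directory record,
     * or -1 if none is found. The EOCD sits within the last 22 + 65535
     * bytes of the file (a 22-byte fixed part plus an optional comment). */
    long find_eocd(FILE *f)
    {
        static unsigned char tail[22 + 65535];
        fseek(f, 0, SEEK_END);
        long size = ftell(f);
        long span = size < (long)sizeof(tail) ? size : (long)sizeof(tail);
        fseek(f, size - span, SEEK_SET);
        if (fread(tail, 1, (size_t)span, f) != (size_t)span) return -1;
        for (long i = span - 22; i >= 0; i--)
            if (memcmp(tail + i, "PK\x05\x06", 4) == 0)   /* EOCD signature */
                return size - span + i;
        return -1;
    }

If a stricter reader complains about the central-directory offsets being relative to the original archive rather than the combined file, Info-ZIP's zip -A ("adjust self-extracting archive") can rewrite them in place, if I remember the flag correctly.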
Indeed, JARs are just ZIP files (and can even be uncompressed), so this trick works to make self-executable JAR files that consist of a shell script concatenated with the actual JAR.
I think that approach only works on general purpose operating systems, right? It can’t be used to add assets to executables that are used on, say, embedded systems. Or WebAssembly apps.
Hey, I wrote objcopy and objdump as debugging tools for developing bfd targets when I was designing bfd. Neither was (originally) intended as a production tool. OK I still use them myself, and I haven’t looked at the bfd source code in almost 30 years.
I forget who came up with the tool chain target names, which I think was your real complaint, though I remember when. Perhaps Ian Taylor (later author of gold, which was notable for, among other things, not using bfd).
Objdump should have a feature to generate assembly or C code for an arbitrary blob (with correct byte swapping, if needed, of course).
I always try to be respectful towards other software in my writing and I fell short this time. I hope you'll accept my sincere apologies!
> the tool chain target names, which I think was your real complaint
Yes, this was solely what I was referring to (slightly thoughtlessly) as "ugly", and even then only in the sense of "where did those magic names come from?" I certainly wasn't referring to objcopy itself!
Don't worry: I'm not actually insulted. I just think it's funny that something intended for development debugging turned out to be actually useful (as I said, I still use them too, and not for debugging bfd).
My team is working on this problem in the context of creating Node.js single-executable applications. While the naive approach of just appending data at the end of the binary works, it does not play well with code signing on macOS and Windows, given that signing operates on PE and Mach-O sections.
We have recently open-sourced a small tool called Postject (https://github.com/postmanlabs/postject), which is able to inject arbitrary data as proper ELF/Mach-O/PE sections for all major operating systems (with AIX support coming). The tool also provides cross-platform C/C++ headers for easily traversing the final binary and checking whether the segment is present or not.
The full example is sadly closed-source for now. That said, we are closely collaborating with the maintainer of PKG as part of the Node.js SEA initiative, and we hope to kick off an experiment to refactor PKG to use Postject pretty soon.
I use xxd for this and the problem is far from solved. Around 5MB or so, the compile times explode. Somewhere around 7MB I could no longer self-host on 32-bit platforms and had to cross-compile from a 64-bit machine with more memory. Around 10MB the compilers start crashing. These are not very big files. The time added for the couple of resource files far, far exceeds the time it takes to compile the whole rest of the project. I had to add special compile-time options to exclude resources for development builds so I wouldn't drive myself crazy with the long compile times. I previously used all the platform-specific methods mentioned in the article and they don't have this problem, but it's an absurd amount of complexity to handle every platform's little way of doing it in my Makefile.
Considering I knew I had to make it work everywhere, and that included not depending on tools that are not standard in POSIX, I still think I made the right choice.
Granted I can replace it with xxd and two lines of awk, but that would have taken longer, and what's done's done, it's not a task with shifting requirements.
I don't know who needs to know this, but if you pull xxd.c out of the vim codebase, you can compile it by itself without building vim. It doesn't depend on any other part of vim. I just vendored that one file and ditched the rest of vim.
It did 13875457.34 bytes per second, which means my program would have done the 82 MiB file he had in 6.20 seconds, faster than hexdump, GCC, and Clang.
It used a max of 3223848 Kbytes, which is about the size of the file it was processing. (The file was 3300000020 bytes exactly.)
I also tried with a file as close to 82 MiB as I could get. It used 85168 Kbytes max, and it took 6.77 seconds.
My code could probably be optimized too. It tries to skip a header comment. It also reads all of the input file in at once, when it could probably stream it on demand. It is also checking for stuff to exclude, which takes time.
If I take out the if statement that begins with:
if (!strncmp(in + i, bc_gen_ex_start, strlen(bc_gen_ex_start)))
That should remove most of the remaining work that does not matter.
If I run that on the 82 MiB file I made, it uses 85124 Kbytes and 5.40 seconds.
This is still far slower than objcopy and incbin, but maybe it's good enough, right?
I think I could optimize it more, but I still think that's a pretty good showing against the competition, especially for portability.
Edit: I forgot to mention that I did these tests while running a fuzzer (AFL++) on 15 of my 16 cores and while watching YouTube. I didn't want to stop the fuzzer just for this (it's been running for more than 24 hours).
This can actually be solved by writing out to assembly and using an assembler to make the .o file to link in. That can deal with huge files with constant memory.
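A sketch of that route, folded into a single C translation unit via inline assembly (assuming GCC/Clang with a GNU-style assembler on an ELF target; the blob_* names and the resource.bin path are mine). The assembler streams the file in, so memory use stays flat no matter how big the blob is:

    #include <stddef.h>

    /* The assembler pulls the file in directly; the C compiler never has to
     * parse megabytes of hex literals. Labels before and after the data give
     * the start address, and the size falls out of pointer arithmetic. */
    __asm__(
        ".section .rodata\n"
        ".global blob_start\n"
        ".global blob_end\n"
        "blob_start:\n"
        ".incbin \"resource.bin\"\n"   /* path resolved by the assembler */
        "blob_end:\n"
        ".previous\n"
    );

    extern const unsigned char blob_start[];
    extern const unsigned char blob_end[];

    #define BLOB_SIZE ((size_t)(blob_end - blob_start))

This is essentially what the incbin single-header trick and the assembly generators mentioned elsewhere in the thread do under the hood.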
One missing approach is just appending the binary data to the end of the file, and then reading the resource from /proc/self/exe on Linux (or the equivalents on Mac and Windows).
It's not "portable" per-se, but all modern platforms [1] have a way to interrogate the binary contents of the currently-running executable.
Don't do this, it's ugly and relies on assumptions that aren't true. I haven't checked each spec, but it is very unlikely that your ELF/mach-O/PE/... is still valid with added junk at the end. You may try it out and it may work, but that is true for many things that may come back to bite you (or others) in spectacular ways.
I'd be interested in any example where this approach would produce an invalid executable. I have used this without issues, but of course I have certainly not tried this in every possible environment.
Computing history is chock full of examples where something "seems to work" but is actually invalid (and a mach-O treated that way would be invalid [EDIT: or just "not accepted" by some parts of the system, see below], whether it runs or not), and then Raymond Chen has to write a blog post about it decades later. Here's just one out of many as a random example: https://devblogs.microsoft.com/oldnewthing/20041026-00/?p=37...
Back to this particular case, the binary will fail strict code signing validation on macOS. It may still run because the kernel does not access the binary past the coverage of the code signature (and all the bits there are still intact), similar to how multiarch binaries work, but you will at least be severely hampered in distributing your binary, since Gatekeeper won't be happy either.
> it is very unlikely that your ELF/mach-O/PE/... is still valid with added junk at the end.
I've written loaders for all of the executable formats you mentioned, and maybe a dozen more. I know of none where this would violate the strict interpretation of the word of the spec.
The lack of files is exactly the point. Any kind of blob needs to be bundled with the compiled code in a well-defined and preferably portable way, and this is the domain where it's used the most. The way the OP described just doesn't work in that context. We had a recent thread about the new #embed which does this properly and portably.
SQLite has an append VFS that allows you to append the database to your executable (or really any file, I suppose) [0]. I believe (but haven't looked for a while and my memory is hazy) that it moves all its metadata to the end of the file rather than at the beginning.
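If I understand it correctly, the application side then looks something like this sketch (unverified; it assumes the appendvfs extension from ext/misc/appendvfs.c is compiled in and registered, and that it exposes its VFS under the name "apndvfs"):

    #include <sqlite3.h>

    /* Open a database that has been appended to the running executable.
     * exe_path would be /proc/self/exe or the platform equivalent; the
     * "apndvfs" VFS name is an assumption based on the appendvfs docs. */
    sqlite3 *open_appended_db(const char *exe_path)
    {
        sqlite3 *db = NULL;
        int rc = sqlite3_open_v2(exe_path, &db,
                                 SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE,
                                 "apndvfs");
        if (rc != SQLITE_OK) { sqlite3_close(db); return NULL; }
        return db;
    }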
This is a cmake function which generates C++ files using no external tools. It's probably not very fast, but if you don't need to handle big files and are already using cmake this is easy to integrate, adds no dependencies and works on all platforms.
What I found is that many compilers don't like to compile very large source files; so if the binaries you'd like to integrate are big, it might be better to integrate their constituent objects one by one (if applicable).
There is a 128-byte area prefixed by the character sequence @(txr):. It normally contains all zeros (empty null-terminated string). If you put a non-empty UTF-8 string there, it gets executed.
Of course, the problem of including a binary blob is trivial if it can just be declared as an array; the interesting problem is doing it to the executable, without doing any compiling or linking.
My approach to this problem was to write a file_to_obj converter. It creates an x64 object file (MSVC only, currently) that you can link against. The file contains an "unsigned char[]" plus its size, or, if using C++, you can generate an obj for a std::array of bytes. IMHO the object format is not that difficult, and every such tool should target it (i.e., if using GCC, target GCC's object format).
Just template-generate and store the data as a byte array in the language of your choice.
For example, if you are using C/C++ you can zip everything then use a small python script to generate a C/C++ header where this data is available as a uint8_t array.
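The generated header then ends up looking something like this (a sketch; the names and the zip step are just for illustration):

    /* assets.h -- generated from assets.zip by a small script; do not edit */
    #include <stddef.h>
    #include <stdint.h>

    static const uint8_t assets_zip[] = {
        0x50, 0x4b, 0x03, 0x04, 0x14, 0x00, 0x00, 0x00,   /* ZIP local header... */
        /* ...one row of hex per chunk of bytes, emitted by the generator... */
    };
    static const size_t assets_zip_len = sizeof(assets_zip);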
Keep in mind that all this data will be loaded to memory, so I don’t recommend this approach for anything north of 10mb.
On a modern VM system, the static initialized data will be mapped to memory, not loaded. So you have to worry about its virtual footprint, not physical memory use.
Text files don't have to have a NUL termination. The proper way to embed data with the .incbin directive is to add a label after the file and use that directly for pointer arithmetic or compute the size with another assembly directive.
In this case it works because the author explicitly put a NUL at the end of the string in the text file. I don't think the author was trying to suggest that you can do this with arbitrary data.
I had to do this in Fuchsia a while back, and the toolchain team kindly fixed up llvm-objcopy for this purpose. The section replacement flags now work as you would want for this use case (they didn't before).
Also, fwiw, embed isn't always the answer you want.
The thing I was patching up was a large rust program (30+mb release stripped), and so it was undesirable to always have blob changes require a relink of the program.
Embed as it is available in a few common languages now/soon is very convenient, but it is quite painful once the program link stage is expensive.
If you're already using Clang (and thus LLVM & its platform constraints), I wonder if the best way would be to link in a tiny Rust / Zig `.o` using `include_bytes!` / `@embedFile`...
If the converting to hex step is too slow, could you leverage your makefile or other build script to bootstrap a simple, faster converter to convert your data file?
> However, perhaps surprisingly, xxd is part of Vim (but not Neovim?), which is a rather heavyweight (and odd) dependency to require just to include a binary blob
ELF provides for any number of different kinds of "section" that you can have automatically mapped into your address space at startup. You just need a way for your program to know where it is. There are lots of different ways to get that.
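One classic way for the program to know where the data is: turn the blob into an object file with objcopy (as discussed elsewhere in the thread) and let the linker define the bracketing symbols. A sketch; the symbol names are derived from the input file name and the -O/-B values depend on the target, so treat these as placeholders:

    /* Build step (GNU binutils), roughly:
     *   objcopy -I binary -O elf64-x86-64 -B i386:x86-64 asset.png asset.o
     * which defines _binary_asset_png_start / _end / _size inside asset.o. */
    #include <stddef.h>

    extern const unsigned char _binary_asset_png_start[];
    extern const unsigned char _binary_asset_png_end[];

    static inline size_t asset_png_size(void)
    {
        return (size_t)(_binary_asset_png_end - _binary_asset_png_start);
    }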
I'd probably use a constant string by writing a script and build tooling that takes my binary and spits out a source file (C?) which is then compiled in...