Make a ZIP file containing the blob, and catenate it to the end of the executable binary. The ZIP format specifically puts all of the key metadata at the back of the file, so pretty much any ZIP tool can correctly read/list/extract data from the ZIP portion of the file. Anything that needs to be linked at runtime can just be extracted to a temp dir, and then cleaned up on exit. Bonus points for getting "free" compression on text data blobs.
We do this for Python applications, by combining a ZIP containing the "link tree" of sources/packages/modules, with a shell bootstrap script that automatically sets up the environment, import path, etc., and Python itself has built-in support for importing pure-Python modules from a ZIP file. All that's needed for native modules is a simple import hook that extracts the native objects into temp space and then loads them appropriately.
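A side note on why the trailing-ZIP trick works at all (a minimal sketch of my own, with error handling mostly omitted): ZIP readers locate the End of Central Directory record by scanning backward from the end of the file for its signature, so whatever precedes the archive, including an entire executable, is simply ignored.

    #include <stdio.h>
    #include <string.h>

    /* Return the file offset of the ZIP End-of-Central-Directory record,
     * or -1 if none is found. The EOCD sits within the last 22 + 65535
     * bytes of the file (a 22-byte fixed part plus an optional comment). */
    long find_eocd(FILE *f)
    {
        static unsigned char tail[22 + 65535];
        fseek(f, 0, SEEK_END);
        long size = ftell(f);
        long span = size < (long)sizeof(tail) ? size : (long)sizeof(tail);
        fseek(f, size - span, SEEK_SET);
        if (fread(tail, 1, (size_t)span, f) != (size_t)span) return -1;
        for (long i = span - 22; i >= 0; i--)
            if (memcmp(tail + i, "PK\x05\x06", 4) == 0)   /* EOCD signature */
                return size - span + i;
        return -1;
    }

If a stricter reader complains about the central-directory offsets being relative to the original archive rather than the combined file, Info-ZIP's zip -A ("adjust self-extracting archive") can rewrite them in place, if I remember the flag correctly.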
Indeed, JARs are just ZIP files (and can even be uncompressed), so this trick works to make self-executable JAR files that consist of a shell script concatenated with the actual JAR.
I think that approach only works on general purpose operating systems, right? It can’t be used to add assets to executables that are used on, say, embedded systems. Or WebAssembly apps.
Hey, I wrote objcopy and objdump as debugging tools for developing bfd targets when I was designing bfd. Neither was (originally) intended as a production tool. OK I still use them myself, and I haven’t looked at the bfd source code in almost 30 years.
I forget who came up with the tool chain target names, which I think was your real complaint, though I remember when. Perhaps Ian Taylor (later author of gold, which was notable for, among other things, not using bfd).
Objdump should have a feature to generate assembly or C code for an arbitrary blob (with correct byte swapping, if needed, of course).
I always try to be respectful towards other software in my writing and I fell short this time. I hope you'll accept my sincere apologies!
> the tool chain target names, which I think was your real complaint
Yes, this was solely what I was referring to (slightly thoughtlessly) as "ugly", and even then only in the sense of "where did those magic names come from?" I certainly wasn't referring to objcopy itself!
Don't worry: I'm not actually insulted. I just think it's funny that something intended for development debugging turned out to be actually useful (as I said, I still use them too, and not for debugging bfd).
My team is working on this problem in the context of creating Node.js single-executable applications. While the naive approach of just appending data at the end of the binary works, it does not play well with code signing on macOS and Windows, given that signing operates on PE and Mach-O sections.
We have recently open-sourced a small tool called Postject (https://github.com/postmanlabs/postject), which is able to inject arbitrary data as proper ELF/Mach-O/PE sections for all major operating systems (with AIX support coming). The tool also provides cross-platform C/C++ headers for easily traversing the final binary and checking whether the segment is present or not.
The full example is sadly closed-source for now. That said, we are closely collaborating with the maintainer of PKG as part of the Node.js SEA initiative, and we hope to kick off an experiment to refactor PKG to use Postject pretty soon.
I use xxd for this and the problem is far from solved. Around 5MB or so, the compile times explode. Somewhere around 7MB I could no longer self-host on 32-bit platforms and had to cross-compile from a 64-bit machine with more memory. Around 10MB the compilers start crashing. These are not very big files. The time added for the couple of resource files far, far exceeds the time it takes to compile the whole rest of the project. I had to add special compile-time options to exclude resources for development builds so I wouldn't drive myself crazy with the long compile times. I previously used all the platform-specific methods mentioned in the article and they don't have this problem, but it's an absurd amount of complexity to handle every platform's little way of doing it in my Makefile.
Considering I knew I had to make it work everywhere, and that included not depending on tools that are not standard in POSIX, I still think I made the right choice.
Granted I can replace it with xxd and two lines of awk, but that would have taken longer, and what's done's done, it's not a task with shifting requirements.
I don't know who needs to know this, but if you pull xxd.c out of the vim codebase, you can compile it by itself without building vim. It doesn't depend on any other part of vim. I just vendored that one file and ditched the rest of vim.
It did 13875457.34 bytes per second, which means my program would have done the 82 MiB file he had in 6.20 seconds, faster than hexdump, GCC, and Clang.
It used a max of 3223848 Kbytes, which is about the size of the file it was processing. (The file was 3300000020 bytes exactly.)
I also tried with a file as close to 82 MiB as I could get. It used 85168 Kbytes max, and it took 6.77 seconds.
My code could probably be optimized too. It tries to skip a header comment. It also reads all of the input file in at once, when it could probably stream it on demand. It is also checking for stuff to exclude, which takes time.
If I take out the if statement that begins with:
if (!strncmp(in + i, bc_gen_ex_start, strlen(bc_gen_ex_start)))
That should remove most of the remaining work that does not matter.
If I run that on the 82 MiB file I made, it uses 85124 Kbytes and 5.40 seconds.
This is still far slower than objcopy and incbin, but maybe it's good enough, right?
I think I could optimize it more, but I still think that's a pretty good showing against the competition, especially for portability.
Edit: I forgot to mention that I did these tests while running a fuzzer (AFL++) on 15 of my 16 cores and while watching YouTube. I didn't want to stop the fuzzer just for this (it's been running for more than 24 hours).
This can actually be solved by writing out to assembly and using an assembler to make the .o file to link in. That can deal with huge files with constant memory.
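A sketch of that route, folded into a single C translation unit via inline assembly (assuming GCC/Clang with a GNU-style assembler on an ELF target; the blob_* names and the resource.bin path are mine). The assembler streams the file in, so memory use stays flat no matter how big the blob is:

    #include <stddef.h>

    /* The assembler pulls the file in directly; the C compiler never has to
     * parse megabytes of hex literals. Labels before and after the data give
     * the start address, and the size falls out of pointer arithmetic. */
    __asm__(
        ".section .rodata\n"
        ".global blob_start\n"
        ".global blob_end\n"
        "blob_start:\n"
        ".incbin \"resource.bin\"\n"   /* path resolved by the assembler */
        "blob_end:\n"
        ".previous\n"
    );

    extern const unsigned char blob_start[];
    extern const unsigned char blob_end[];

    #define BLOB_SIZE ((size_t)(blob_end - blob_start))

This is essentially what the incbin single-header trick and the assembly generators mentioned elsewhere in the thread do under the hood.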
One missing approach is just appending the binary data to the end of the file, and then reading the resource from /proc/self/exe on Linux (or the equivalents on Mac and Windows).
It's not "portable" per-se, but all modern platforms [1] have a way to interrogate the binary contents of the currently-running executable.
Don't do this, it's ugly and relies on assumptions that aren't true. I haven't checked each spec, but it is very unlikely that your ELF/mach-O/PE/... is still valid with added junk at the end. You may try it out and it may work, but that is true for many things that may come back to bite you (or others) in spectacular ways.
I'd be interested in any example where this approach would produce an invalid executable. I have used this without issues, but of course I have certainly not tried this in every possible environment.
Computing history is chock full of examples where something "seems to work" but is actually invalid (and a mach-O treated that way would be invalid [EDIT: or just "not accepted" by some parts of the system, see below], whether it runs or not), and then Raymond Chen has to write a blog post about it decades later. Here's just one out of many as a random example: https://devblogs.microsoft.com/oldnewthing/20041026-00/?p=37...
Back to this particular case, the binary will fail strict code signing validation on macOS. It may still run because the kernel does not access the binary past the coverage of the code signature (and all the bits there are still intact), similar to how multiarch binaries work, but you will at least be severely hampered in distributing your binary, since Gatekeeper won't be happy either.
> it is very unlikely that your ELF/mach-O/PE/... is still valid with added junk at the end.
I've written loaders for all of the executable formats you mentioned, and maybe a dozen more. I know of none where this would violate the strict interpretation of the word of the spec.
The lack of files is exactly the point. Any kind of blob needs to be bundled with the compiled code in a well-defined and preferably portable way, and this is the domain where it's used the most. The way the OP described just doesn't work in that context. We had a recent thread about the new #embed which does this properly and portably.
SQLite has an append VFS that allows you to append the database to your executable (or really any file, I suppose) [0]. I believe (but haven't looked for a while and my memory is hazy) that it moves all its metadata to the end of the file rather than at the beginning.
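If I understand it correctly, the application side then looks something like this sketch (unverified; it assumes the appendvfs extension from ext/misc/appendvfs.c is compiled in and registered, and that it exposes its VFS under the name "apndvfs"):

    #include <sqlite3.h>

    /* Open a database that has been appended to the running executable.
     * exe_path would be /proc/self/exe or the platform equivalent; the
     * "apndvfs" VFS name is an assumption based on the appendvfs docs. */
    sqlite3 *open_appended_db(const char *exe_path)
    {
        sqlite3 *db = NULL;
        int rc = sqlite3_open_v2(exe_path, &db,
                                 SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE,
                                 "apndvfs");
        if (rc != SQLITE_OK) { sqlite3_close(db); return NULL; }
        return db;
    }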
This is a cmake function which generates C++ files using no external tools. It's probably not very fast, but if you don't need to handle big files and are already using cmake this is easy to integrate, adds no dependencies and works on all platforms.
What I found is that many compilers don't like to compile very large source files; so if the binaries you'd like to integrate are big, it might be better to integrate their constituent objects one by one (if applicable).
There is a 128-byte area prefixed by the character sequence @(txr):. It normally contains all zeros (empty null-terminated string). If you put a non-empty UTF-8 string there, it gets executed.
Of course, the problem of including a binary blob is trivial if it can just be declared as an array; the interesting problem is doing it to the executable, without doing any compiling or linking.
My approach to this problem was to write a file_to_obj converter. It creates an x64 object file (MSVC only, currently) that you can link against. The file contains an "unsigned char[]" plus its size, or, if using C++, you can generate an obj for a std::array of bytes. IMHO the object format is not that difficult, and every such tool should target it (i.e., if using GCC, target GCC's object format).
Just template-generate and store the data as a byte array in the language of your choice.
For example, if you are using C/C++ you can zip everything then use a small python script to generate a C/C++ header where this data is available as a uint8_t array.
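The generated header then ends up looking something like this (a sketch; the names and the zip step are just for illustration):

    /* assets.h -- generated from assets.zip by a small script; do not edit */
    #include <stddef.h>
    #include <stdint.h>

    static const uint8_t assets_zip[] = {
        0x50, 0x4b, 0x03, 0x04, 0x14, 0x00, 0x00, 0x00,   /* ZIP local header... */
        /* ...one row of hex per chunk of bytes, emitted by the generator... */
    };
    static const size_t assets_zip_len = sizeof(assets_zip);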
Keep in mind that all this data will be loaded to memory, so I don’t recommend this approach for anything north of 10mb.
On a modern VM system, the static initialized data will be mapped to memory, not loaded. So you have to worry about its virtual footprint, not physical memory use.
Text files don't have to have a NUL termination. The proper way to embed data with the .incbin directive is to add a label after the file and use that directly for pointer arithmetic or compute the size with another assembly directive.
In this case it works because the author explicitly put a NUL at the end of the string in the text file. I don't think the author was trying to suggest that you can do this with arbitrary data.
I had to do this in Fuchsia a while back, and the toolchain team kindly fixed up llvm-objcopy for this purpose. The section replacement flags now work as you would want for this use case (they didn't before).
Also, fwiw, embed isn't always the answer you want.
The thing I was patching up was a large rust program (30+mb release stripped), and so it was undesirable to always have blob changes require a relink of the program.
Embed as it is available in a few common languages now/soon is very convenient, but it is quite painful once the program link stage is expensive.
If you're already using Clang (and thus LLVM & its platform constraints), I wonder if the best way would be to link in a tiny Rust / Zig `.o` using `include_bytes!` / `@embedFile`...
If the converting to hex step is too slow, could you leverage your makefile or other build script to bootstrap a simple, faster converter to convert your data file?
> However, perhaps surprisingly, xxd is part of Vim (but not Neovim?), which is a rather heavyweight (and odd) dependency to require just to include a binary blob
ELF provides for any number of different kinds of "section" that you can have automatically mapped into your address space at startup. You just need a way for your program to know where it is. There are lots of different ways to get that.
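One classic way for the program to know where the data is: turn the blob into an object file with objcopy (as discussed elsewhere in the thread) and let the linker define the bracketing symbols. A sketch; the symbol names are derived from the input file name and the -O/-B values depend on the target, so treat these as placeholders:

    /* Build step (GNU binutils), roughly:
     *   objcopy -I binary -O elf64-x86-64 -B i386:x86-64 asset.png asset.o
     * which defines _binary_asset_png_start / _end / _size inside asset.o. */
    #include <stddef.h>

    extern const unsigned char _binary_asset_png_start[];
    extern const unsigned char _binary_asset_png_end[];

    static inline size_t asset_png_size(void)
    {
        return (size_t)(_binary_asset_png_end - _binary_asset_png_start);
    }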
I'd probably use a constant string by writing a script and build tooling that takes my binary and spits out a source file (C?) which is then compiled in...