From what I know about it, gobject-introspection has some nice properties, but one killer drawback: it's incompatible with cross compilation. This is apparently because it requires running binaries compiled for the target system as part of the build process. You actually can cross compile if you have an emulator for that system handy [1], but that's horrible.

Apparently the reason it requires running the compiled binaries is that GObject types are only registered at runtime, within so-called "_get_type" functions. For more typical systems, everything needed can be determined at compile time. Too bad there's no portable way to ask a C compiler to dump what it knows about a source file, but if you just want things like struct sizes and field offsets, you can compile a C file that embeds them as global variables, and then extract the variable values. For more advanced introspection there are many less-portable options including Clang's API, GCC-XML, parsing debug info, or writing your own compiler (easier for C; it seems that parts of gobject-introspection work like this).
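A minimal sketch of the global-variable trick, assuming a made-up `struct example` in place of a real type from the header being inspected. The `probe_*` names are hypothetical too. The file is only compiled (e.g. with a cross compiler and `-c`), never run:

```c
/* probe.c: hypothetical probe file. struct example stands in for a real
 * type that would normally come from the header under inspection. */
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */

struct example {
    char  tag;
    int   count;
    void *payload;
};

/* Each layout fact becomes an initialized global constant. Nothing here
 * has to execute: the values are baked into the object file's data at
 * compile time, for whatever target the compiler is generating code for. */
const size_t probe_sizeof_example           = sizeof(struct example);
const size_t probe_offsetof_example_count   = offsetof(struct example, count);
const size_t probe_offsetof_example_payload = offsetof(struct example, payload);

/* Compile-time sanity check: the first member always sits at offset 0. */
static_assert(offsetof(struct example, tag) == 0, "first member at offset 0");
```

The values can then be read back out of the object file on the build machine, for example by looking up each symbol and dumping the corresponding bytes with tools like `nm` and `objdump`. The extraction step has to account for the target's endianness and the width of `size_t`.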

Anyway, another interesting comparison is DTrace's CTF (Compact Type Format) [2], a simple binary format that describes the kernel's C struct layouts, function signatures, etc. This information is simply converted from compiler-generated debug info [3], but it's stripped down enough that it can be embedded into every kernel without too much size overhead. When the DTrace compiler is invoked to compile a user hook, it parses the CTF data and exposes the types and functions to the user's code (which is written in a custom C-like language).

Ironically, BPF has BTF, which is a very similar-looking format that encodes very similar kinds of data – but is used for a completely different purpose. Specifically, it's only used to encode types and functions defined by BPF programs, to allow the kernel to pretty-print things. But in theory BTF could be repurposed to work like CTF: you would need to generate BTF information for the kernel itself, and then Clang could be extended to support "including" BTF files in place of C headers. However, this option was apparently discussed and rejected [4]. I haven't read the original threads to find out why, but I suspect it might involve:

- Lack of existing tooling to do the above;

- Lower expressivity compared to C headers, e.g. the inability to encode macros (although this could be fixed);

- Desire to use the information for building not just BPF hooks but also full-fledged kernel modules.

[1] https://maxice8.github.io/8-cross-the-gir/

[2] https://github.com/oracle/libdtrace-ctf/blob/master/include/...

[3] https://www.freebsd.org/cgi/man.cgi?query=ctfconvert&sektion...

[4] https://lwn.net/Articles/783832/




Yeah, I'd love to have something like BTF or CTF used widely for machine-readable type information. (https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf... gives some further information there.)

The limitations regarding macros sound like the biggest issue to me (both code-like macros and just simple defined names for values via #define). I'd love to see solutions for that. What do you think that would look like?


Interesting writeup! I didn't realize there was an active attempt to generate BTF for the kernel.

Regarding macros...

Well, to start with, there's the brute-force approach of simply embedding textual macro definitions. That might be good enough for most use cases in practice: as far as I know, most BPF hooks are written in either C or the C-like bpftrace language, so expanding macros as text would probably give a sensible result for the majority of macros that aren't particularly complex. And macro definitions are already included in the DWARF info, so the DWARF-to-BTF approach from your link could be easily extended to embed them.

But it would be nice to describe macros in a more structured format, which could allow use from non-C-like languages and would probably save on file size. Some prior art I'm familiar with is rust-bindgen, which generates Rust bindings for C headers using libclang, and supports translating C macros that expand to constants. Basically it checks each macro that's defined without arguments and uses libclang to try to evaluate it as a C constant expression; this will fail for macros that expand to things other than constant expressions, but it just ignores those. If evaluation succeeds, it translates the macro to a typed Rust constant declaration.

It might be possible to do something similar for BTF. As output format, either add a new 'constant integer' node, or translate such macros as if they were enum definitions. For Linux it would probably be best to avoid a dependency on libclang, but a custom parser might work, or maybe a hackier approach based on feeding things to the C compiler like:

    enum { value_of_SOME_MACRO = ((((((((( SOME_MACRO ))))))))) };

and sorting through the resulting morass of compiler errors :)
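For the case where it does work, here is a rough sketch of the enum trick with two made-up macros (`SOME_MACRO` and `OTHER_MACRO` are placeholders, not anything from a real header). It only needs to be compiled, not run:

```c
/* Hypothetical macros standing in for ones found in a real header. */
#include <assert.h>  /* static_assert (C11) */

#define SOME_MACRO   (1 << 4)          /* an integer constant expression */
#define OTHER_MACRO  (SOME_MACRO | 3)  /* built on top of another macro */

/* An enumerator's value must be an integer constant expression, so these
 * lines compile only when the macro expands to one. A macro expanding to
 * a string, a statement, or a runtime value causes a compile error here,
 * which a generator could treat as "skip this macro". */
enum { value_of_SOME_MACRO  = (SOME_MACRO)  };
enum { value_of_OTHER_MACRO = (OTHER_MACRO) };

/* The compiler has now done the evaluation for us. */
static_assert(value_of_SOME_MACRO  == 16, "1 << 4");
static_assert(value_of_OTHER_MACRO == 19, "16 | 3");
```

One limitation of this sketch: before C23, an enumerator value has to fit in an `int`, so wide constants would need a different probe.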

Edit: Forgot to mention – functional macros would be nice too, but of course they're much harder to translate. And heck, what about inline functions? Convert them to BPF?


> there's the brute-force approach of simply embedding textual macro definitions. That might be good enough for most use cases in practice

I very much want this for usage from Rust, so that doesn't suffice.

> It might be possible to do something similar for BTF. As output format, either add a new 'constant integer' node

That sounds promising to me, for the common case.

> Edit: Forgot to mention – functional macros would be nice too, but of course they're much harder to translate. And heck, what about inline functions? Convert them to BPF?

In an ideal world, 1) emit a symbol for them so they can be used from any language, albeit not "inline", and 2) compile them to bytecode that LTO can incorporate and optimize, for languages using the same linker.

Neither of those would work for macros designed for especially unusual usages that can't possibly work as functions. (The two most common cases I can think of: macros that accept names and use them as lvalues, and macros that emit partial syntax such as paired macros emitting unmatched braces.) But honestly, flagging those and handling all the common cases via BTF information would still be a huge improvement.
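To make those two cases concrete, a small illustrative sketch (all names here are invented for the example, not taken from any real header):

```c
#include <assert.h>

/* 1) Takes names and uses them as lvalues. A function receiving the ints
 *    by value could not modify the caller's variables like this. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* 2) Paired macros emitting unmatched braces. Neither half is valid C on
 *    its own, so neither can be turned into a standalone symbol. */
#define COUNTED_BEGIN(counter) { (counter)++;
#define COUNTED_END(counter)   (counter)--; }

/* Returns x - y after the swap, showing that the locals were changed. */
int swap_demo(int x, int y)
{
    SWAP_INT(x, y);
    return x - y;
}

/* Returns the counter value observed inside the paired-macro block. */
int depth_demo(void)
{
    int depth = 0, seen = -1;
    COUNTED_BEGIN(depth)
        seen = depth;   /* 1 while inside the block */
    COUNTED_END(depth)
    assert(depth == 0); /* balanced again after the closing macro */
    return seen;
}
```

A tool could plausibly detect both patterns (the expansion assigns to an argument, or the braces don't balance) and flag the macro as untranslatable instead of emitting a broken symbol.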

Perhaps we should continue this on an IRLO (internals.rust-lang.org) thread?



