Neither main/WinMain/wmain/wWinMain are actually the methods called on program l...

codeflo · on July 10, 2023

> The pre-main runtime doesn't do much

This only applies to C++, but the CRT runs all constructors of global variables before main/WinMain. (Or rather, the CRT calls a special function that the compiler generates for this purpose and links into the executable.) In some codebases, that's quite a lot of stuff.
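
A minimal sketch of that behaviour, assuming a mainstream toolchain that runs dynamic initializers before entering main. `Tracer` and `startup_log` are hypothetical names made up for illustration; they are not part of any CRT.

```cpp
#include <cassert>
#include <string>

// Function-local static: constructed on first use, so other globals'
// constructors can call this safely regardless of initialization order.
std::string& startup_log() {
    static std::string log;
    return log;
}

// Hypothetical type whose constructor has a visible side effect.
struct Tracer {
    explicit Tracer(const char* name) {
        assert(name != nullptr);
        startup_log() += name;
        startup_log() += ';';
    }
};

// Globals with dynamic initialization. Within one translation unit they are
// constructed in definition order, by startup code that runs ahead of main.
Tracer first{"first"};
Tracer second{"second"};
```

If `startup_log()` is read as the very first statement of main, both entries should already be there, because the constructors ran during startup.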

account42 · on July 10, 2023

There is also initialization relevant for C code, e.g. strlen() will crash if you call it from the startup function directly without properly initializing msvcrt.

amluto · on July 10, 2023

> On most platforms you can get argc and argv[] through the right system calls

Not (reliably) on Linux or, as far as I know, on similar systems. argv, environ, and the aux vector come from a horrible data structure the kernel creates on the stack.

matheusmoreira · on July 10, 2023

The kernel just copies the data to the program's stack in a contiguous manner. Obtaining pointers to them can seem somewhat magical if you're writing a nolibc program but I wouldn't call it horrible.

I implemented it for my programming language with some rather simple assembly code:

https://github.com/lone-lang/lone/blob/master/arch/x86_64.c#...

https://github.com/lone-lang/lone/blob/master/arch/aarch64.c...
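
For readers who don't want to read assembly, here is a rough C++ sketch of the same idea. It assumes the usual Linux initial stack layout (argc, the argv pointers, NULL, the envp pointers, NULL, then the aux vector as word pairs). `StartupInfo` and `parse_initial_stack` are hypothetical names, not taken from the linked code, and obtaining the initial stack pointer itself still needs a few lines of assembly at the entry point.

```cpp
#include <cassert>
#include <cstdint>

// Pointers recovered from the block the kernel leaves on the stack.
struct StartupInfo {
    long   argc;
    char** argv;  // argc pointers followed by a NULL
    char** envp;  // environment pointers followed by a NULL
    char** auxv;  // (type, value) word pairs, viewed as pointer-sized slots
};

// `sp` is the stack pointer value at the ELF entry point. Every slot is one
// machine word, so the whole block is read through a single pointer type.
inline StartupInfo parse_initial_stack(char** sp) {
    assert(sp != nullptr);
    StartupInfo info{};
    info.argc = static_cast<long>(reinterpret_cast<std::intptr_t>(sp[0]));
    info.argv = sp + 1;
    info.envp = info.argv + info.argc + 1;  // skip argv's NULL terminator
    char** p = info.envp;
    while (*p != nullptr) ++p;              // walk to envp's NULL terminator
    info.auxv = p + 1;                      // aux vector starts right after it
    return info;
}
```

Nothing is copied or allocated; the result is just four values computed from one pointer, which is why the runtime can do this before anything else is initialized.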

ithkuil · on July 10, 2023

The structure may be "horrible" but why do you say it's unreliable? It's part of the kernel/user space contract.

The same information is also exposed via the /proc FS FWIW
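
A small sketch of the /proc route, assuming /proc is a mounted procfs and that `/proc/self/cmdline` holds the arguments as NUL-terminated strings (which is how Linux exposes them). The helper names are hypothetical.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Split a blob of NUL-terminated strings into separate entries.
inline std::vector<std::string> split_nul_separated(const std::string& blob) {
    std::vector<std::string> out;
    std::string::size_type start = 0;
    while (start < blob.size()) {
        std::string::size_type end = blob.find('\0', start);
        if (end == std::string::npos) end = blob.size();
        out.emplace_back(blob, start, end - start);
        start = end + 1;  // step over the terminator
    }
    return out;
}

// Returns an empty vector if the file cannot be opened, e.g. when /proc
// is not mounted or is not procfs.
inline std::vector<std::string> read_cmdline(const char* path = "/proc/self/cmdline") {
    assert(path != nullptr);
    std::ifstream in(path, std::ios::binary);
    std::string blob((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    return split_nul_separated(blob);
}
```

`/proc/self/environ` uses the same NUL-separated format, so the same splitter would apply there.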

amluto · on July 10, 2023

I’m saying you can’t reliably get the information from syscalls. The runtime (i.e. whatever implements the actual entry point declared in the ELF headers) can get it reliably, as can any other code to which the runtime gives an appropriate pointer.

You can’t assume that /proc is procfs if you’re writing a low level runtime library.

ithkuil · on July 10, 2023

Is your concern that some code in the process has clobbered the data that lives before the top of the stack?

amluto · on July 10, 2023

If you can find the top of the stack, you can read the contents with reasonable reliability. But the top of the stack is not at a fixed address, and if you are writing low enough level code (container manager, init, etc), poking around in /proc at startup is not a great idea.

If you’re writing a real runtime library, none of this matters: the kernel passes a pointer in a register at startup.

ithkuil · on July 10, 2023

Do whatever libc's getauxval does to find the aux vector. Then the env array is just below it, and argv is below that, IIRC.

EDIT: I agree it's ugly, I'm just not sure if it is fair to call it brittle
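
To make the adjacency concrete, here is a sketch that goes in the other direction: it starts at `environ`, steps over the NULL terminator, and treats what follows as the aux vector. It assumes Linux with glibc or musl, and that nothing has called setenv/putenv yet (either can move `environ` off the original stack array, after which this walk reads unrelated memory). `AuxEntry`, `auxv_after` and `aux_lookup` are hypothetical names.

```cpp
#include <cassert>
#include <elf.h>       // AT_NULL, AT_PAGESZ
#include <sys/auxv.h>  // getauxval (Linux, glibc/musl)

extern "C" char** environ;

// One aux vector entry: a pair of native words. The list ends at AT_NULL.
struct AuxEntry {
    unsigned long type;
    unsigned long value;
};

// Skip the environment pointers and their NULL terminator; on the initial
// stack the aux vector begins at the next word.
inline const AuxEntry* auxv_after(char* const* envp) {
    assert(envp != nullptr);
    while (*envp != nullptr) ++envp;
    return reinterpret_cast<const AuxEntry*>(envp + 1);
}

// Linear search; returns 0 when the type is absent, like getauxval.
inline unsigned long aux_lookup(const AuxEntry* aux, unsigned long type) {
    for (; aux->type != AT_NULL; ++aux) {
        if (aux->type == type) return aux->value;
    }
    return 0;
}
```

Early in main, `aux_lookup(auxv_after(environ), AT_PAGESZ)` should agree with `getauxval(AT_PAGESZ)`. That only holds under the assumptions above, so this is an illustration of the layout rather than something to ship.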

amluto · on July 10, 2023

The “whatever” is that glibc is the implementation of the ELF entry point. It remembers where the data structure is.