I don't think adding this to io_uring is bad at all, but I don't think it is enough to solve the problem, if for no other reason than that it requires using the machinery of io_uring, which adds quite a bit of complexity.
However, maybe I'm missing something, but it seems like Linux already has functionality that could make spawning a process a lot more efficient and thread-safe. My idea is basically to use clone or clone3 to create a new process in a new thread group that shares the original process's memory (that is, with CLONE_VM but not CLONE_THREAD), passing it a function pointer to call (instead of returning in the child process) and a heap-allocated stack for the child process to use.
Then there is no need to copy the address space, and you can do more things to prep before calling exec, since other threads can still release locks, you can write to memory, etc.
The downsides I see are that you wouldn't be able to safely modify the current environment variables since that would impact the parent process, and there might be some weirdness with the child process having copies of file descriptors instead of the originals. The first is easy to work around though, and the latter probably wouldn't be an issue in most cases.
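To make the idea concrete, here is a rough sketch of it using the glibc clone() wrapper. The names spawn_shared_vm, child_main, struct spawn_args and CHILD_STACK_SIZE are made up for illustration, and the 64 KiB stack size is an assumption. The environment is passed explicitly through envp, which is one way around the "can't modify the parent's environment" downside. It is not production code: because the child shares the parent's memory (including the calling thread's TLS, so errno is shared), it should do as little as possible before execve, and this sketch does nothing about signals.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (64 * 1024) /* assumed to be plenty for a bare execve */

/* Hypothetical argument bundle handed to the child. */
struct spawn_args {
    const char *path;
    char *const *argv;
    char *const *envp;
};

/* Runs in the child, on the heap-allocated stack, sharing the parent's memory. */
static int child_main(void *p)
{
    const struct spawn_args *a = p;
    execve(a->path, a->argv, a->envp);
    _exit(127); /* only reached if execve() failed */
}

/* Spawns path with CLONE_VM (no CLONE_THREAD), waits for it, and returns
 * its exit code, or -1 on error or abnormal termination. */
int spawn_shared_vm(const char *path, char *const argv[], char *const envp[])
{
    struct spawn_args args = { path, argv, envp };
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on most architectures, so pass its top.
     * SIGCHLD in the low bits makes the child waitable like a fork child. */
    pid_t pid = clone(child_main, stack + CHILD_STACK_SIZE,
                      CLONE_VM | SIGCHLD, &args);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);

    /* The child has exited, so nothing is using the stack any more. */
    free(stack);

    if (r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling spawn_shared_vm("/bin/sh", argv, envp) with argv set to {"/bin/sh", "-c", "exit 7", NULL} should return 7. A real implementation would not wait for the child right away, and would then need some other way to know when the stack can be freed.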
Another thought I've had is that if there were a more efficient single syscall for spawning a process that combined fork and exec, then even if it were a lot less flexible than fork/exec or the io_uring equivalent, something simple could probably meet the needs of most applications, and it would benefit performance and safety in the common case where you don't need complex setup before calling execve.
> it seems like Linux already has functionality that could make spawning a process a lot more efficient and thread-safe. My idea is basically to use clone or clone3 to create a new process in a new thread group that shares the original process's memory (that is, with CLONE_VM but not CLONE_THREAD), passing it a function pointer to call (instead of returning in the child process) and a heap-allocated stack for the child process to use.
On Linux you can do that with standard system calls, by spawning a thread with pthread_create() and then calling vfork() from that thread. vfork() pauses only the calling thread, not the entire parent process, until the vfork child calls execve() or _exit(). The effect is to create a child task which has CLONE_VM but not CLONE_THREAD, and which runs concurrently with all the other threads.
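A minimal sketch of that approach, in case it helps. The names spawn_via_vfork_thread, spawn_thread and struct spawn_req are invented for this example. The helper thread is the only one suspended by vfork(); here the caller simply joins it to collect the exit code, which keeps the example short but is not what a program that cares about concurrency would do.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical request passed to the helper thread. */
struct spawn_req {
    const char *path;
    char *const *argv;
    char *const *envp;
    int exit_code; /* filled in by the helper thread, -1 on failure */
};

/* Helper thread: only this thread is suspended while the vfork child
 * is between vfork() and execve()/_exit(). */
static void *spawn_thread(void *p)
{
    struct spawn_req *req = p;
    req->exit_code = -1;

    pid_t pid = vfork();
    if (pid == 0) {
        /* Child: shares our memory, so do nothing but exec or _exit. */
        execve(req->path, req->argv, req->envp);
        _exit(127); /* only reached if execve() failed */
    }
    if (pid > 0) {
        int status;
        pid_t r;
        do {
            r = waitpid(pid, &status, 0);
        } while (r == -1 && errno == EINTR);
        if (r == pid && WIFEXITED(status))
            req->exit_code = WEXITSTATUS(status);
    }
    return NULL;
}

/* Spawns path from a dedicated thread and returns its exit code,
 * or -1 on error or abnormal termination. */
int spawn_via_vfork_thread(const char *path, char *const argv[],
                           char *const envp[])
{
    struct spawn_req req = { path, argv, envp, -1 };
    pthread_t t;
    if (pthread_create(&t, NULL, spawn_thread, &req) != 0)
        return -1;
    pthread_join(t, NULL);
    return req.exit_code;
}
```

Build with -pthread. In a real program the calling thread would carry on with other work instead of joining immediately, and the helper thread would report the result some other way.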