
> I've to say that you've clearly thought a lot about this, and though I've not tried it (and probably won't) you seem to have done an excellent job.

Thank you. I needed that. I started the project in 2016 and have been obsessed with it since then. The weird thing is that I never had to use the workarounds or hacks mentioned by the Cygwin team [0]. For instance, the select() call does map cleanly on top of the Win32 API. I just needed to use WSAWaitForMultipleEvents() instead of WaitForMultipleObjects() (the other "easter egg"). Why the Cygwin people didn't figure this out baffles me. I guess their current code base doesn't allow the rewrite. My big breakthrough was when I realized that the "inconsistent interfaces" [1] in Win32 file handles can be implemented as virtual file systems, one for each handle type (char, disk, pipe, etc). That was my "throwing away 1000 lines of code" [2] moment.

As to the weird file names, I use the file names OpenBSD uses. My rule is to always use the file name of the header (.h) file where the system call is declared in OpenBSD. I also use their struct and constant names, prefixed with "WIN_".

The "fork is evil" thing is discussed a lot in the programming community. I myself find it quite clever. Threads are highly volatile and are very hard to program without running into race conditions. The solution is to make a copy of everything the child will be using: duplicate file descriptors, the stack, globals (rss). The kernel does all this for you in one system call. I often wonder how the people who complain about the absence of real concurrency in their programming languages [3] would actually use this feature. In my opinion the best way to use concurrency is to string individual programs into a pipeline. This will never go "evil" on you.
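To make the pipeline idea concrete, here is a minimal sketch in C of two programs joined by a pipe. The helper names start_stage() and run_pipeline() are made up for illustration, and it assumes a POSIX system with a shell at /bin/sh; it is not how any particular shell does it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: fork a child that runs `sh -c cmd` with one end of
 * the pipe installed as its stdout (writer) or stdin (reader). */
static pid_t start_stage(const char *cmd, int pipefd[2], int is_writer)
{
    char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
    pid_t pid = fork();
    if (pid != 0)
        return pid;                 /* parent, or -1 if fork() failed */

    /* Child: redirect, drop the original pipe descriptors, then exec. */
    if (dup2(is_writer ? pipefd[1] : pipefd[0],
             is_writer ? STDOUT_FILENO : STDIN_FILENO) < 0)
        _exit(127);
    close(pipefd[0]);
    close(pipefd[1]);
    execve("/bin/sh", argv, environ);
    _exit(127);                     /* only reached if execve() failed */
}

/* Hypothetical helper: run `producer | consumer` and return the exit
 * status of the consumer, or -1 on error. */
int run_pipeline(const char *producer, const char *consumer)
{
    int pipefd[2];
    if (pipe(pipefd) < 0)
        return -1;

    pid_t p1 = start_stage(producer, pipefd, 1);
    pid_t p2 = start_stage(consumer, pipefd, 0);

    /* The parent must close both ends so the consumer can see EOF. */
    close(pipefd[0]);
    close(pipefd[1]);

    int status = 0, result = -1;
    if (p1 > 0)
        waitpid(p1, NULL, 0);
    if (p2 > 0 && waitpid(p2, &status, 0) == p2 && WIFEXITED(status))
        result = WEXITSTATUS(status);
    return (p1 > 0) ? result : -1;
}
```

Each stage is its own process with its own copy of everything, so the only shared state is the pipe itself.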

[0] https://cygwin.com/cygwin-ug-net/highlights.html

[1] https://www.usenix.org/legacy/publications/library/proceedin...

[2] https://skeptics.stackexchange.com/questions/43800/did-the-c...

[3] https://news.ycombinator.com/item?id=32408577



> Thank you. I needed that. I started the project in 2016 and have been obsessed with it since then.

Well, like I said, I'm impressed.

> I myself find [fork()] quite clever.

Oh, for sure it is clever, though vfork() would have been more clever. The very nice thing fork() did was make it really easy to spawn processes in a shell, which meant not having to design a spawn() system call (these are invariably large APIs). That greatly simplified Unix development in the 70s, in both kernel-land and user-land. But vfork() didn't occur to Ritchie, Thompson, et al. I wonder how things would have gone if they had thought of it.


Wait, I sense some genuine concern here. In fact, you tricked me into learning, which is what I also do with my students. I've read the article you provided in your earlier comment, plus the one from Microsoft [0].

They expose a dark secret behind the fork() call, and I felt it too when implementing it myself. It almost gave me heart failure. So here's my simplified take: what if the parent did a malloc() and put the resulting pointer on the stack (which is what git does, BTW)? A simple copy of the stack to the child wouldn't be sufficient. The kernel would then have to follow the pointer, malloc() new memory and copy the data. It's not hard to see where this is going. What if there is malloc()ed memory in this copy? It's madness. I suspect this could bring a whole system to its knees.

This is a problem a kernel should not try to solve. Only user space knows how it uses its memory. While reading the source code of the software I include in MinC, I always thought this was bad programming, perpetuated by Torvalds' #1 rule "don't mess with user-space", leading to things like copy-on-write.

This all leads me to believe I can get away with doing flat copies of stack and globals during fork(), implement spawn(), never implement copy-on-write and patch the userland code if needed. Am I right?

[0] https://www.microsoft.com/en-us/research/wp-content/uploads/...


> Wait, I sense some genuine concern here. In fact, you tricked me into learning, which is what I also do with my students.

Haha, well, tricking you was not my intent. I'm glad I did though!

> The kernel then would have to follow [...]

The kernel doesn't have to do anything like that because the kernel doesn't know about user-space allocators. The kernel only knows about the pages used in constructing the user-land address space for the process. What you're really getting at is "fork-safety" (like thread safety, but for fork()).
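A small C sketch of what that means in practice, assuming a real fork() on a POSIX system (the function name fork_copies_heap() is made up for illustration): the child reaches the parent's malloc()ed data through the very same pointer value, because the pages behind it were duplicated along with everything else, and the child's writes stay private.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child saw the parent's heap data
 * through the same pointer and the child's write did not leak back,
 * 0 if not, -1 on error. */
int fork_copies_heap(void)
{
    char *buf = malloc(16);         /* the pointer itself lives on the stack */
    if (buf == NULL)
        return -1;
    strcpy(buf, "parent");

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: same virtual address, backed by the child's own pages. */
        int ok = (strcmp(buf, "parent") == 0);
        strcpy(buf, "child");       /* modifies only the child's copy */
        _exit(ok ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    int child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int parent_intact = (strcmp(buf, "parent") == 0);
    free(buf);
    return child_ok && parent_intact;
}
```

Nothing here walks pointers or calls malloc() on the child's behalf; the allocator's own bookkeeping is just more bytes in the copied pages.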

Whether using fork() or vfork(), in principle the child[0] process is only permitted to use async-signal-safe functions, which are typically the system calls needed to do everything up to an execve() (which is also safe).
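As a sketch of the "by the book" pattern in C, the child side below calls nothing but execve() and _exit(), both of which are async-signal-safe. The helper name run_sh() and the use of /bin/sh are assumptions for illustration only.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run `sh -c command` in a child and return its exit
 * status, or -1 on error. */
int run_sh(const char *command)
{
    /* argv is built before fork() so the child has nothing left to set up. */
    char *const argv[] = { "sh", "-c", (char *)command, NULL };

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve("/bin/sh", argv, environ);
        _exit(127);                 /* execve() only returns on failure */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_sh("exit 7") returns 7. Note the child uses _exit() rather than exit(), since exit() would run atexit handlers and flush stdio buffers inherited from the parent.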

In practice, however, many of us know how to write multi-process daemons that do very much use async-signal-UNsafe functions on both sides of the fork(), and it's OK if you know what you're doing, and if it's a _real_ fork(). If it's more like a combination of threads and vfork() then it's not safe at all to use async-signal-UNsafe functions on the child side!

And malloc() (and free()!) are absolutely NOT async-signal-safe! Which is what you noticed in thinking about this.

So a fork() that creates a new thread but not a new address space, and which swaps the stack back and forth as each of the parent or child execute, is NOT safe to use with async-signal-UNsafe functions on the child side of fork().

So your fork() implementation, if I understood what it does, is probably only safe for a certain class of programs that happen to be using fork() exactly as the fork(2) man page says.

So you might need to patch some fork()-using OpenBSD programs to function correctly in MinC. And any other fork()-using programs one might want to use under MinC may also need to be patched.

Programs using posix_spawn() will be OK _if_ OpenBSD's implementation uses vfork() and the MinC kernel implements vfork().
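For reference, a minimal C sketch of what such a program looks like from the caller's side. The helper name spawn_sh() and the /bin/sh path are assumptions; whether the library implements posix_spawn() with vfork(), fork() or something else is up to the implementation.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run `sh -c command` via posix_spawn() and return
 * its exit status, or -1 on error. */
int spawn_sh(const char *command)
{
    char *const argv[] = { "sh", "-c", (char *)command, NULL };
    pid_t pid;

    /* No file actions and no spawn attributes: the child inherits the
     * parent's descriptors and signal setup. Returns 0 on success. */
    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The caller never runs any code of its own in the child, which is exactly why this style ports so much more easily than fork()-and-continue.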

With vfork(), the danger of using anything other than async-signal-safe functions on the child side is so much clearer that it is, paradoxically and in my opinion, safer than fork().

Although I called fork() "evil", I use it lots in my code. I've written many versions of daemon(3) that have the parent exit only when the child signals that it is ready (this is to avoid race conditions in multi-service systems and testing). I've written multi-processed daemons that do use async-signal-UNsafe functions on both sides of fork(). But I don't really condone that :cry-laugh:. One has to be quite aware of the dangers, and understand them, in order to use fork() like that.
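A rough C sketch of that readiness handshake, using a pipe as the signal. This is one common way to do it, not necessarily the way any particular daemon(3) replacement does, and the function name start_child_and_wait_ready() is made up. To keep it testable, the child exits right after signalling, where a real daemon would enter its service loop and the parent would exit instead of reaping it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 once the child reported readiness,
 * 0 if the child went away without reporting, -1 on error. */
int start_child_and_wait_ready(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        /* ... a real daemon would open sockets, files, etc. here ... */
        char ready = 'R';
        ssize_t n = write(fds[1], &ready, 1);   /* "I am ready" */
        close(fds[1]);
        _exit(n == 1 ? 0 : 1);      /* sketch only: no service loop */
    }

    /* Parent: close the write end so read() returns EOF (0) if the child
     * dies before it gets to signal readiness. */
    close(fds[1]);
    char byte = 0;
    ssize_t n = read(fds[0], &byte, 1);         /* blocks until ready/EOF */
    close(fds[0]);
    waitpid(pid, NULL, 0);          /* only because the sketch's child exits */
    return (n == 1 && byte == 'R') ? 1 : 0;
}
```

Because the parent blocks until the byte arrives, anything started after the parent returns can rely on the service being up.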

BTW, I think it would be interesting to have a new try at implementing fork() in WIN32. I wonder if one can create a copy of the parent's address space in the child w/o having to use any of the LoadLibrary*() functions to load DLLs, thus avoiding the ASLR issues for example. I imagine that it must be possible, but also that it must be very tricky. You can see that abandoning fork() for vfork() and spawn-type APIs would be best for running Unix software on Windows...

[1] is an implementation of daemonization that spawns a child instead of fork()-and-continue. That has an option to exec on the child-side to make it possible to test on Unix logic that otherwise would only be tested on Windows. One could use the same approach to build multi-processed servers, where you'd spawn each child rather than fork() each child -- i.e., vfork() then execve() with a special command-line option or env var to indicate "you are a worker process". OpenSSH's sshd nowadays always execs on the child-side of fork().

[0] daemon(3) inherently violates the requirement that the child-side of fork() not use async-signal-UNsafe functions, but this is OK because the real [but unstated] requirement is that only one of the parent or child may use async-signal-UNsafe functions.

[1] https://github.com/heimdal/heimdal/blob/master/lib/roken/det...



