I was fascinated with Plan 9. It just has really cool ideas in its design.
It has been said before and I agree. The problem with Plan 9 adoption isn't Plan 9. Plan 9 is beautiful. The problem is that Linux is quite alright as well.
Also, Linux already had drivers for more hardware by the time Plan 9 was ready to be played with; Plan 9 didn't. It's a chicken-and-egg problem. I remember trying Plan 9 and it just didn't have drivers for some of my hardware. It was like Linux back in the early days.
> The problem is that Linux is quite alright as well.
The advantage of plan9 was "at scale": being able to commingle multiple systems' resources in interesting, novel ways.
This has never been, and still is not, Linux's strength. Linux is inward-focused: a monolithic kernel designed to serve as a pleasant base for anything running under it.
There are probably near a thousand different Node.js projects each implementing some kind of coordination layer between Node.js processes (ex: https://github.com/amino/amino), and similar efforts abound for applications on every other platform to code against (Hazelcast, Hadoop, &c). Why?
It's all done at the application-framework level, because the Linux operating system's design space is sitting on a box and running. The fanciest interconnect you're going to get is when you first modprobe in USB/IP (which allows network control of a USB peripheral). It's downhill from there, and back to writing applications, if you want to push on.
Linux is definitely quite alright. We're fine with Hadoop, running on the JVM, running on Linux. We were delighted to have a desktop Unix, to have X running -- the never-ending parade of "year of the Linux desktop" stories, year after year. Linux arose when we wanted better ways to run things on a box, and most problems fit on a box. Linux was what we wanted; it took care of the pressing matter of running a single box, and it was good at that. When it and Plan9 were in their early days, it was a harsh and hostile landscape, and we weren't yet good at working within the confines of the box.
Plan9 has always been for those cases where you abutted the constraints of the box and wanted to think more openly, more broadly, about the problems around us -- plan9 was for those who wanted to reap solutions that could scale and reach out in a common, interoperable, serviceable way. It was ahead of its time, and we couldn't picture the use of a fabric that would let boxes work together, in part because boxes were still rapidly advancing, individually, from the very crude, hard-to-use machines of the mainframe days into things we could begin to use. Thanks, Linux, for getting us so far, but I tend to suspect that a running environment that thinks chiefly of itself, that remains firmly about its own box, is too old-world to be of much real service these days.
> There are probably near a thousand different Node.js projects each implementing some kind of coordination layer between Node.js processes (ex: https://github.com/amino/amino), and similar efforts abound for applications on every other platform to code against (Hazelcast, Hadoop, &c). Why?
There is only one Erlang, though, and it has been around for 20+ years. It has built-in distribution and automatic failover. Yes, it is built on top of a Linux/Mac/Windows OS.
But wait, that's not all. There is a project that basically attempts to thin out the OS and replace it with Erlang -- http://erlangonxen.org/ -- an Erlang VM implemented on top of the Xen hypervisor. To make it even more interesting and bring it back home, the underlying clustering support is based on 9P (the Plan 9 distribution protocol). I think that is pretty cool.
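For the curious, 9P's framing really is that small: every message is size[4] type[1] tag[2] plus type-specific fields, all little-endian. Here's a minimal C sketch that frames a Tversion message by hand (field layout per the 9P2000 manual pages; illustrative only, not a usable client):

    /* Minimal sketch: hand-framing a 9P2000 Tversion message, just to
     * show how small the wire format is. Not a real client. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    enum { Tversion = 100, NOTAG = 0xFFFF };

    static uint8_t *put2(uint8_t *p, uint16_t v) { p[0] = v; p[1] = v >> 8; return p + 2; }
    static uint8_t *put4(uint8_t *p, uint32_t v) { p[0] = v; p[1] = v >> 8; p[2] = v >> 16; p[3] = v >> 24; return p + 4; }

    int main(void)
    {
        const char *ver = "9P2000";
        uint8_t msg[64], *p = msg + 4;   /* leave room for the size field */

        *p++ = Tversion;                 /* type[1]                        */
        p = put2(p, NOTAG);              /* tag[2]: Tversion uses NOTAG    */
        p = put4(p, 8192);               /* msize[4]: proposed max msg size*/
        p = put2(p, strlen(ver));        /* version[s]: 2-byte length...   */
        memcpy(p, ver, strlen(ver));     /* ...followed by the bytes       */
        p += strlen(ver);

        put4(msg, p - msg);              /* size[4] includes itself        */
        printf("Tversion is %ld bytes on the wire\n", (long)(p - msg));
        return 0;
    }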
I'm amazed that "drivers" are still a consideration in OS design anywhere other than embedded systems. Why aren't we writing our operating systems to target hardware-abstracting hypervisors yet? Isn't that an obvious separation of concerns?
Isn't the BIOS supposed to be the hardware-abstracting layer we run our programs directly on? That doesn't work any more (because the BIOS is a legacy mess), so now we have kernels that act as the hardware abstraction, and we write operating systems on top of those. Of course, now you need to target your operating system to a kernel.
Yes, the BIOS is exactly what's supposed to be doing the hardware abstraction, such that jobs like "read a block from a block device" are done via BIOS interrupt calls, not system calls. I think hypercalls are a good compromise these days; what is a hypervisor but a BIOS running on the CPU?
There are already specialty OSes that take this tack--things like Openmirage (http://www.openmirage.org/) and Erlang on Xen (http://erlangonxen.org/). It could be done with a general-purpose OS just as easily.
The BIOS provides the programmer with basic I/O routines, but performance-wise, let's just say it's far from efficient... And access from protected mode is quite tricky (though possible if you don't care about performance, security, and reliability -- FreeBSD does it routinely to query the supported video modes from the graphics card).
Reading your comment, you seem to believe that writing to a disk (for example) is just a matter of having a "driver". That might be possible in a world where caching doesn't exist and where you don't have to support a shitload of devices that claim to be a disk but really aren't, or other drivers pretending to be a disk, or other drivers trying to catch your writes to the disk. But really, then you don't need an OS, I guess -- just something minimal with hypercalls?
Tell me then, just as a thought experiment: if you get a kernel panic, what do you tell the disk controller? Not today?
A snarky comment, I know, but I happen to be a kernel programmer and reading your comment was really annoying.
I'm assuming a full hypervisor, like Xen, which is basically a full-blown kernel that does its own resource pooling and abstraction. A "disk" for Xen is just like a "disk" for Linux--something that presents a disk-like interface in its driver and nothing more.
Let me rephrase my previous idea. You have two kernels. One is an exokernel/library kernel, responsible only for managing the hardware[1]. It only does things when it's called into. The other is a "userspace" kernel, that does the virtual memory, process/thread management, etc. It's the one that registers interrupt vectors.
You could do this to Linux, today. Rip all the stuff that starts the machine and initializes the devices and abstracts them out, away from all the stuff that touches userspace, and put a known ABI between them. Then, take advantage of modern processors' specific support for hypervisors to make this ABI more performant.
The important idea is not that "somebody else" deals with the "drivers and stuff", leaving you free to write "just" an OS. Both the driver part and the process part can be considered part of one OS, and you can't have just one or the other, and likely the same vendor will be worrying about both. No, the important part is that each is a layer, which should be able to treat the other layer as a black box with a known interface. In exactly the same way that TCP doesn't care whether it's being carried by IP or NetBEUI, while IP doesn't care whether it's carrying TCP or UDP, moving the driver part of a kernel out into a hypervisor means that the process-management part doesn't have to care what hardware-management part it's talking to, and vice versa. An "OS stack", to match your "network stack."
Really, we already have "OS stacks" in a sense -- that's when we run a full-blown Linux kernel, running an (optionally paravirtualized) IA32/64 emulator, running another full-blown Linux kernel, running your software. This just eliminates everything redundant about that configuration, while keeping all the advantages.
---
[1] Notably, this also means that the outer kernel is responsible for permissions to the hardware. This means that each "inner kernel" running on this hardware only gets a single effective permission-set; if the one process on the inner kernel has access to a resource (a disk, say), then every process on the inner kernel has access to that resource, because the inner kernel doesn't deal with the hypercall--it goes directly to the hypervisor.
This would have been a terribly big deal 10 years ago. Now? Who runs things as more than one mutually-untrustworthy user within the same virtual machine any more? Today, if you want separate security contexts, you use containerization. Which is to say, you get the hypervisor, not the OS, to enforce resource permissions.
Until Windows NT 4, even the graphics subsystem wasn't in the kernel; eventually they had to compromise for performance reasons.
I don't know Linux well enough to answer your question about Linux, but I'm pretty sure that concerns are well separated, however they are so fundamental to what the OS does that you can't just rely on some "external library".
A hypervisor doesn't touch any hardware stuff. Basically, if you consider kernel mode to run in ring 0 and user mode to run in ring 3, the hypervisor runs in ring -1 and, to sum up, just deals with privileges. At no point does the hypervisor deal with hardware such as disks or network cards. That code lies in the ring 0 kernel. All the hypervisor says is... "yeah, ok, ring 0 kernel, you can play in that sandbox".
For example, when you run a VM, the VM emulates hardware, but the OS running in the VM still needs drivers...
Now, it's not that easy to separate the OS from the drivers, because each OS has its own ideas about how the cache should work, how disks should be managed, and what a NIC actually is.
When you write an OS, you do your best so that writing a driver can be handed off to someone else and the code can be reused (e.g. written once for multiple OSes), but you CAN'T expect to use a common framework, because, well, at some point you will want to do DMA and you don't want to tell your users that their Intel SSD will run at 10 MiB/s.
On the latter point, you can actually have the two-in-one-kernel paradigm today: use coreboot, run a kernel payload, and use containers. The container kernel hooks are essentially the userspace abstractions without the kernel redundancy.
> what is a hypervisor but a BIOS running on the CPU?
Generally, a hypervisor implies a VM which multiplexes the hardware and allows multiple OSes to run at once, which is more than any BIOS has ever done. Also, a hypervisor generally doesn't provide any hardware abstraction, to be compatible with existing OSes which expect to be alone on the hardware.
If the hypervisor is a paravirtualized one then it will provide hardware abstractions.
The Openmirage project described above is essentially an OCaml machine; it only needs drivers to talk to the generic Xen devices for disk, network, console, etc.
> Isn't the BIOS supposed to be the hardware abstracting layer we run our programs directly on?
Yes. The problem with this idea is that the BIOS fell prey to a classic vicious cycle: it wasn't terribly efficient to begin with, and on 16-bit x86 programs could bypass it entirely and go directly to the hardware, so they did.
This meant hardware interfaces became more standardized (to a point), to make specific pieces of hardware more attractive to buyers who wanted to run the widest possible variety of application software, and the BIOS was relatively neglected. This cycle continued until the BIOS had essentially code-rotted into unusability, able to boot Windows and not much else. The final nail in the coffin was when 32-bit x86 chips came out and the BIOS remained 16-bit real mode: even though protected mode could have saved the BIOS, by making it impossible for application software to go directly to the hardware, the fact that the BIOS never jumped to 32-bit code meant not even OSes could really use it.
So now the BIOS can run a POST and load a bootloader that actually knows how to load a modern OS. That's pretty much it.
Actually, it was quite ridiculous. Printing text to the screen on a 4.77 MHz IBM PC felt like using a serial terminal at about 19200 bps. My 1 MHz Apple II would appear much faster just because it could output text to the screen more efficiently (to be fair, putting a symbol on a CGA screen is a two-byte affair -- one for the symbol, one for attributes -- but still, for a supposedly much more advanced machine, it was horrible).
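("Two-byte affair" literally means each text cell is the character code followed by an attribute byte. A sketch in C, assuming a freestanding or real-mode environment where the CGA text buffer at 0xB8000 is addressable -- not something you can run from a modern user process:)

    /* Sketch of direct CGA text-mode output: each cell in the buffer at
     * 0xB8000 is two bytes, the character followed by an attribute byte
     * (foreground/background colour). Assumes a freestanding environment
     * where that address is mapped; illustrative only. */
    #include <stdint.h>

    static void cga_putc(int row, int col, char ch, uint8_t attr)
    {
        volatile uint16_t *vram = (volatile uint16_t *)0xB8000;
        vram[row * 80 + col] = ((uint16_t)attr << 8) | (uint8_t)ch;  /* 80x25 text mode */
    }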
> to be fair, putting a symbol on a CGA screen is a two-byte affair - one for the symbol, one for attributes
Text-heavy stuff was supposed to be done using MDA graphics cards, which were praised at their introduction for allowing large amounts of readable text, especially on IBM's own monitors. I'm sure a lot of business users running 3270 terminal emulators and word processors got a lot of good use out of MDA, but the fact it was, as its name stated, monochrome pretty much killed it in the home market.
Anyway, the IBM PC was very definitely built to a price point and a release date; it was also built to avoid cannibalizing any of IBM's pre-existing product line, which explains why Compaq, not IBM, would come out with the first PC built around the 80386. It was fairly impressive compared to S-100 bus computers running CP/M but the Apple II crowd would have been forgiven for not being bowled over.
Someone has to write the drivers for the hypervisor. A hypervisor is nothing but a minimal OS that provides a different set of resource management guarantees. You can't make many of those guarantees without access to the hardware.
Yes, agreed fully. The point of the distinction, here, is not that everyone can avoid writing drivers--somebody, somewhere has got to write them into something. The hypervisor is in fact an Operating System in every sense of the term (and I assumed people already knew this and would take it for granted--Xen is an OS; Linux doing KVM or LXC is (obviously) an OS; and so forth.)
But the point here is that the type of "OS" people talk about when they say they're "designing an OS" -- the "OS" Microsoft, or Apple, or Google, or Canonical is selling you -- has everything to do with how processes are managed, memory is allocated, IPC is done, etc., and nothing to do with drivers.
OS devs don't want to think about drivers; they just want them there. Driver support is, in other words, a commodity.
And what do we (hopefully) do with commodities, in the software world? Why, we break them out into (usually BSD/zlib-licensed) libraries, with a consortium of companies co-developing it for everyone's benefit! See, for example, Webkit.
Here's another way to approach the idea. Imagine if (for example) every OS was actually based on Linux; if Linux was a kernel "kit" that provided all the drivers, but then left a big pluggable hole where Microsoft and Apple and everyone else had inserted their own implementation of the stuff that matters to people in an OS.
That'd be neat, wouldn't it? Everyone would be using the same drivers; any update to the driver exokernel would help out everyone equally; all the proprietary hardware manufacturers would be forced to contribute to it because of its monopsony status. It'd have quite a few advantages[1], most importantly that as long as [major OS] supported your device, all the little niche/hobbyist OSes would get support too, for free. Just like, today, if Chrome supports some weird new CSS trick, then you can expect it'll also work in Safari, on iOS, in node-webkit, and on random Android browsers (even most of the crap proprietary ones.) Suddenly, writing your own OS isn't such a daunting task.
The reason this hasn't happened, obviously, is that "Linux" isn't just drivers; it's all the other stuff that makes Linux Linux, instead of Windows or OSX. So you can't build on top of Linux without accepting the Linux memory-manager, and the Linux process model, and the Linux scheduler, and the Linux VFS, ad nauseam.
The reason that this hasn't happened, in other words, is that we aren't making this distinction.
---
[1] It'd also have the obvious disadvantage of giving malware authors only one system to target all their exploits toward--but that also means that all the security researchers would be working together to harden the exokernel, out in the open, with no possibility of "responsible disclosure." And obviously, groups like OpenBSD would be still going their own way with their own exokernel, just like they always have.
One of the main advantages of using the hypercalls as the point-of-contact is that it standardizes and contractualizes the interface between the driver-exokernel and the userland-managing-kernel. If we pick a standard set of hypercalls--a modern version of a BIOS API--then you can swap out the Linux driver-exokernel for OpenBSD's version, or any other version, and it'd interoperate with whatever userland-managing-kernel(s) you wanted to use.
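To make the "standard set of hypercalls" idea concrete, here's a hypothetical C-level contract a driver-exokernel might expose to a userland-managing kernel. All names are invented for illustration; no real hypervisor exposes exactly this interface:

    /* Hypothetical sketch of a standardized driver-exokernel interface --
     * a "modern BIOS API" -- that any userland-managing kernel could be
     * written against. All names are invented for illustration. */
    #include <stddef.h>
    #include <stdint.h>

    typedef int64_t hv_handle;   /* opaque handle to a device the exokernel manages */

    struct hv_ops {
        /* enumerate devices of a given class ("block", "net", "console", ...) */
        int (*enumerate)(const char *class, hv_handle *out, size_t max);

        /* block devices: read/write whole sectors by LBA */
        int (*blk_read)(hv_handle dev, uint64_t lba, void *buf, size_t nsectors);
        int (*blk_write)(hv_handle dev, uint64_t lba, const void *buf, size_t nsectors);

        /* network devices: send/receive raw frames */
        int (*net_send)(hv_handle dev, const void *frame, size_t len);
        int (*net_recv)(hv_handle dev, void *frame, size_t max, size_t *len);
    };

    /* The inner kernel would be handed a populated hv_ops table at boot and
     * never touch real hardware itself; swapping the exokernel underneath
     * (Linux-derived, OpenBSD-derived, ...) would not change this contract. */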
I agree with your sentiment, but when someone comes along with the smart idea of running a knife between the business end of the OS and the driver stack, we end up with abominations like UEFI that just push the OS's duties into the firmware.
The API of a driver is just as important as the bits they twiddle because it insidiously chains you to a particular abstraction model, i.e. a specific state machine you're building on top of hardware state. This is why Linux doesn't maintain a stable driver API. If you enforce driver API stability then you're boned down the line when it comes to moving from, say, a synchronous model to an asynchronous one.
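A hypothetical illustration of that point (invented names, not Linux's actual block API): once out-of-tree drivers are built against the first signature, moving the core to the second is an ABI break, even though both are "just a block read":

    /* Hypothetical sketch of why a frozen driver API hurts. Names are
     * invented for illustration only. */
    #include <stddef.h>
    #include <stdint.h>

    struct blkdev;   /* opaque device */

    /* v1 abstraction: the caller sleeps until the data is in buf */
    int blk_read_sync(struct blkdev *dev, uint64_t lba, void *buf, size_t nsectors);

    /* v2 abstraction: the caller supplies a completion callback and returns
     * immediately -- a different state machine, not just a different name */
    int blk_read_async(struct blkdev *dev, uint64_t lba, void *buf, size_t nsectors,
                       void (*done)(void *ctx, int status), void *ctx);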
Tanenbaum considers it an extended machine and a resource manager [MOS 1.1]. So I think that calling something an OS even though it's not on the bare metal is still cromulent.
I can't reply to your comment below, but in reply to it as well as this one: I'd love to read a good article about how this would work. I may or may not be able to piece it together from your erlangonxen and openmirage links, but it would also be great if someone knowledgeable (you?) would write about it!
Well, the problem with Plan 9 is that it requires everything support the Plan 9 protocol. The entire stack has to speak Plan 9, otherwise nothing works. So anything that interacts with it would have to support the full protocol, whereas Unix is just "pipe this output stream into that input stream," which is quite a lot easier to adopt.
It's not unusual for an operating system to require its programs to use its system calls. A Unix program that doesn't speak Unix protocols is just as useless as a Plan 9 program that doesn't speak Plan 9 protocols. It's a moot point though, since you'd have a difficult time unwittingly making a compiler generate a program that doesn't work on its own operating system.
Pipes on Plan 9 use the same pipe, read, write, and close system calls as Unix does; they are an equivalent amount of work to use (or adopt, as you say).
It's important to note that "speaking the Plan 9 protocol" amounts to normal file operations like read(), write(), stat(), etc. This also works fine with pipes.
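A minimal sketch to make that concrete: the same C program drives a pipe with pipe(), write(), read(), and close() on Unix and on Plan 9 alike (written against POSIX headers here; the native Plan 9 version would use <u.h> and <libc.h>, but the calls are the same):

    /* Minimal sketch: a pipe is just two file descriptors driven with the
     * same read/write/close calls on Unix and on Plan 9. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd[2];
        char buf[64];

        if (pipe(fd) < 0) {
            perror("pipe");
            return 1;
        }
        write(fd[1], "hello, pipe\n", 12);       /* write end */
        long n = read(fd[0], buf, sizeof buf);   /* read end  */
        if (n > 0)
            fwrite(buf, 1, (size_t)n, stdout);
        close(fd[0]);
        close(fd[1]);
        return 0;
    }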
You can run Plan 9 under Xen, Amazon AWS being one of those places. I've been running it under VirtualBox as well. If you are already using virtualization, the Plan 9 device drivers may be less of an issue.
Which is a fantastic endorsement. GUIs and the Web may be nice for mucking about -- but when I work, I work on the command line (or in emacs). All I truly care about is daemons and command-line interfaces.
Of course - my daemons and command line interfaces have to produce pretty output, for people who care about other things. But those people are not me.
So I installed Plan 9 from User Space[1] last night - largely on this recommendation :)
I feel like GUIs these days cater too much to novice users. Obviously the old-school terminal/no-graphics style is pretty dated, but I have yet to see a nicely integrated solution with the advantages of GUIs and the power of the cli.
So after your first comment I was wondering what a GUI gives you, as far as tools in an IDE, that you couldn't have with a curses-style terminal-based interface.
Emacs (and vi) do a wide assortment of refactorings, auto-completions, multi-paned outputs, large-scale cut-and-pastes, etc. I couldn't think of anything that actually needed graphics. Given that, I couldn't think of any reason why an IDE needed to be a GUI.
That is to say, I can see people preferring the look, or the use of a pointing device, but I couldn't figure out any coding features that were actually GUI-only. Do you have any I couldn't think of?
Code navigation: being able to visually represent references between modules, packages, and binary files. Quite handy for navigating large code bases, typical in enterprise environments.
Another nice feature is the ability to see graphical representations of data structures. Especially handy for list- and tree-based structures.
Have you also used Acme (non-trivially)? The UI makes heavy use of text (the whole thing being a text processing environment after all) but the graphical aspect is highly significant, and most of the (fairly complex) interaction is via mouse.
The Plan 9 userspace excels at tying together text-using utils, but the traditional Unix CLI is not the means by which you tie them. Even the terminal is quite far removed from a typical command line.
Acme is based on how the whole Native Oberon OS works, which provides a much better UI experience using the same principles, especially the last version (System 3).
That is nothing new; it is how Native Oberon (1987) works, which in turn is based on how Smalltalk (1972) works.
In Native Oberon, every module can export functions as commands that are mouse callable in the UI and take input from mouse selections, which can even be UI objects (gadgets).
Most of the mouse/keyboard interactions in Rio were already available in Native Oberon.
There are Amazon AMIs for Plan 9, called plan9-fossil. You don't use ssh to get in, though. You need to use the Plan 9 "cpu" command, using the "ec2" user and a random password printed on the instance console/system log. I have a Plan 9 host running under VirtualBox that I used to access my Amazon Plan 9 host. It worked just great. Another little item: you have to allow TCP access to ports 17010 and 17013 on the instance.
Note that I put it in the background. You will want to adjust your server address as appropriate.
I should add that drawterm just opens a blank Plan 9 desktop after you authenticate. You have to open a new command session on your own. A right click brings up a menu, you want to choose "new". This will turn the pointer to a "+" shape. You then right click and hold, stretching out the command window until it's the right size. When you release the right click, you will have a rc shell to type commands into.
Neat! Thanks for the instructions - that's definitely the easiest way to play around with Plan9 that I've come across (but I guess I haven't been looking too hard...)
One of my favorite features of Plan 9 is the extremely powerful filesystem abstraction through the 9p file protocol.
It is incredible how clean the design of applications, and the communication between modules, becomes with a simple file interface. Currently I'm working on a potential extension of this idea -- an ontology-based virtual filesystem, where files are replaced by objects and the organization is defined by relations instead of hierarchical structures.
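As a small sketch of the style: controlling a service usually amounts to writing a textual command into a synthetic file it serves. The path and command below are made up for illustration, but this really is the whole "API":

    /* Sketch of the file-interface style: the "module" exposes a synthetic
     * ctl file, and clients drive it by writing plain-text commands to it.
     * The path and command here are hypothetical, purely for illustration. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char *cmd = "refresh index\n";           /* hypothetical command  */
        int fd = open("/mnt/myservice/ctl", O_WRONLY); /* hypothetical ctl file */

        if (fd < 0) {
            perror("open");
            return 1;
        }
        if (write(fd, cmd, strlen(cmd)) < 0)           /* that's the entire API */
            perror("write");
        close(fd);
        return 0;
    }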
Plan 9 was basically the successor of Unix by the same research group, more modern, taking the ideas of Unix further, a distributed operating system with full network transparency, better shell (actually, rc already replaced the Bourne shell in version 10 Unix) and lots of other goodies such as the plumber. And it served as the "reference implementation" for UTF-8.
I understand all that. My point was that if you know enough to appreciate Plan 9 then you know enough to take advantage of the *nix ecosystem as it exists today, which is far too mature for Plan 9 to catch up to. We can always dream though...
Considering Plan 9 is so great for distributed computing, do any of you with Big Data experience think it is only a matter of time before Plan 9 replaces other operating systems used in map-reduce, clustered environments?
That's partly due to historical reasons, I think. Some of the platforms they intended to run Plan 9 on had very buggy GCC ports. It's also worth noting that the "native" Plan 9 compilers had some quirks of their own (see http://doc.cat-v.org/plan_9/4th_edition/papers/compiler ). It was never really intended to be a very familiar Unix environment, and they tried to straighten up a couple of things even in the way software is written. There is a POSIX environment available (the APE), but I don't know much about it; I never really tried it.
To be fair, between the "good enough" phenomenon and the licensing problems, it never really took off, which is rather sad.
Plan9 has been free and open source for over a decade (since 2003).
At first it was released with some very unusual constraints (Lucent Public License v1.0). But those were quickly lifted (Lucent Public License v1.02).