For those who are not familiar with Lua, or who have heard the prevalent "conventional wisdom" about it, here's a quick clarification:
A. Lua's source code is really really small and extremely well-commented[1].
B. Lua is super easy to integrate... you can literally just throw Lua's entire source code into your app[2] and it'll compile just fine.
C. Lua's C API is super easy to use, and way more thought-out than Python's or Ruby's (no offense to Matz or Guido) -- yes, it is stack-based, but this actually makes its API much simpler and smaller than it would otherwise be. You can probably learn all of it in an afternoon (with the right guide).
D. Lua is uncannily similar to JavaScript in semantics, and the syntax[3] is ridiculously simple to learn. Semantically, it has way less magic to learn than either Ruby or Python.
E. Lua isn't only for scripting video games. I wish this "conventional wisdom" would just die. Yes, it is good for scripting video games, for the same reason it's excellent for scripting anything.
F. Table indices starting at 1 are not hard to reason about, not hard to use, and it's not hard to switch back and forth between Lua and languages where indices start at 0. This is a complete myth.
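To make point F concrete, here's a minimal sketch of ordinary 1-based table usage (plain Lua 5.x, nothing beyond the standard library):

```lua
-- Lua "arrays" are just tables with consecutive integer keys starting at 1.
local fruits = { "apple", "banana", "cherry" }

assert(fruits[1] == "apple")   -- first element lives at index 1, not 0
assert(#fruits == 3)           -- the length operator counts indices 1..n

-- ipairs iterates 1, 2, 3, ... and stops at the first nil
local joined = {}
for i, v in ipairs(fruits) do
  joined[#joined + 1] = i .. "=" .. v
end
assert(table.concat(joined, ",") == "1=apple,2=banana,3=cherry")
```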
All of the above are why antirez picked it for Redis, despite being a fan of Tcl. I've never used Lua, but it strikes me as 'the' thing in the highly-embeddable niche these days.
Oh thanks for helping me remember the last bullet point:
Lua is fine for small embedded environments, but it's also an excellent way to add scripting support to plain old bulky GUI apps like Firefox or Sublime Text or whathaveyou.
A contrary view, just to be fair and balanced. I really didn't enjoy using Lua, though it had its moments.
A. Lua's source code is pretty inscrutable, and written in a style that was pretty alien to me. It was also a bit horrid to single-step through because of all those damn reference copying macros. But it's actually fairly easy to add stuff to it and modify it, and this was what actually convinced me to go with Lua in the end; look-mummy-my-first-Lua-hack was surprisingly easy to put in.
B. Lack of variable declarations and global-by-default was a terrible idea. Languages should require declarations, or at least be local-by-default, and have a special interactive mode for when you want the no-declarations-and-global-by-default behaviour. Because making things "convenient" is fine, until it isn't, at which point it just becomes a bug magnet. For a language supposedly designed for use by people who don't know what they're doing, this borders on criminal; when you've got years of experience behind you, it's merely a huge pain in the arse. MaxScript gets this wrong too.
C. 1-based indexing is annoying. If it makes no difference, switch to zero, because it's then the same as other languages; if it makes a big difference, switch to zero so that it isn't so difficult. Just because you can switch back from one way of doing things to the other doesn't make 1-based indexing right! MaxScript gets this wrong as well.
D. Lack of a proper array type is silly; you have this ugly thing that's like an array, but not, and it's easy to stop it being like an array. Because it's not an array... it's a mapping. A mapping with some very odd ideas. Come on people, mappings and sequences aren't the same! Shape up!
(At this point in my argument I used to be in the habit of adding, "after all, you wouldn't skip having separate integer and floating-point types, now, would you?" - because Lua famously had just the float variety. But these days they have integers too, so I've had to retire that one.)
(The Lua guys' insistence on minimizing the number of data types can also be seen in the lack of a separate symbol type. Even MaxScript doesn't get that wrong.)
E. The C-side API is actually kind of weird, but on the plus side it's actually somewhat difficult to use improperly from C. So... I guess that's even. Maybe there shouldn't even be a point E, but... well. The C-side API really does stick in my mind.
Aside from that, no complaints!
(This post is at least partly just so that other people who don't like Lua know that they aren't alone.)
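For readers who haven't hit it, the global-by-default complaint in point B is easy to demonstrate; a minimal sketch in plain Lua (the misspelled name is deliberate):

```lua
local count = 0

local function increment()
  -- Typo: "cuont" silently creates a brand-new GLOBAL variable
  -- instead of raising an error about an undeclared name.
  cuont = count + 1
end

increment()
assert(count == 0)      -- the local we meant to update is untouched
assert(cuont == 1)      -- the typo lives on as a global...
assert(_G.cuont == 1)   -- ...visible in the globals table _G
```

No error, no warning - the bug just sits there until something reads the wrong variable.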
As someone who chose the Lua runtime as a basis for his own toy language, these are my observations:
A. True. Paradoxically Lua's source is well-documented in places that arguably didn't need documentation in the first place, but it tends to contain no comments whatsoever in tricky places like the VM or the implementation of internal data objects.
B. Global-by-default has bugged me as well. Thankfully it's not difficult to patch the interpreter so that declarations are mandatory and variables are local, though.
C. It's a matter of taste in the end. I struggled for a while on whether to convert it to 0-based indexing or not, but so far have left things as they are. Depending on your programming style, you might end up not using indexes directly all that much.
D. The distinction between having separate array and hash parts in a table kind-of works, but only if you make the conscious decision never to work with numerical indexes. Generally, I think this type of array/map hybrid is something PHP got right with their ordered hash maps, whereas Lua's way of doing things has a few gotchas.
- For a lot of applications, having only one number type works surprisingly well. I just wish they had opted for morphing the internal number representation according to usage in the script; instead they added a separate int type in Lua 5.3, which feels a little out of place for my taste.
- I really don't think Lua needs a separate symbol type.
- Things I didn't like include how often you have to check for certain conditions in Lua to avoid raising fatal errors. Not being able to use keywords inside the table dot notation. The idea that the self operator works on a function in the table instead of its metatable. The lackluster core library, especially when it comes to table usage. The fact that a require()'d file doesn't know its own file name. No default to-string serialization of tables. The pattern that control flow statements always require code block components instead of single expressions. Goto statements.
Here's a link to my project which addresses my personal Lua pain points: http://np-lang.org/
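Two of the gotchas mentioned above - reserved words in dot notation and the lack of default table serialization - are easy to show in plain Lua:

```lua
local t = {}

-- t.end = 1          -- syntax error: 'end' is a reserved word
t["end"] = 1          -- bracket syntax with a string key works fine
assert(t["end"] == 1)

-- And there is no default serialization of table CONTENTS:
-- tostring(t) yields something like "table: 0x...", not the entries.
assert(tostring(t):match("^table") ~= nil)
```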
You should have a look at Squirrel (http://www.squirrel-lang.org/) - it addresses most of the points above but is still very lightweight and easy to embed. The main disadvantage is that the documentation and community are not as well developed as Lua's.
Variables being global by default is definitely a wart. You can mitigate it a lot with a linter or with runtime checks (by setting a metatable on the globals table). Local-by-default would be a bad idea and worse than global-by-default, IMO, as it would mess up lexical scoping. The ideal would be a compile-time error by default.
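The runtime-check mitigation mentioned above can be sketched by installing a metatable on the globals table _G; the error messages here are made up, not anything standard:

```lua
-- Turn accidental global reads and writes into immediate errors.
-- __newindex fires only for keys NOT already present in _G, so the
-- standard library (pcall, error, assert, ...) keeps working.
setmetatable(_G, {
  __newindex = function(t, name, value)
    error("attempt to create global '" .. name .. "'", 2)
  end,
  __index = function(t, name)
    error("attempt to read undeclared global '" .. name .. "'", 2)
  end,
})

local ok = pcall(function() oops = 1 end)   -- would create a global
assert(ok == false)

ok = pcall(function() return nothere end)   -- would read a global
assert(ok == false)

local x = 42                                 -- declared locals are unaffected
assert(x == 42)
```

This catches the typo-creates-a-global bug at the moment it happens, at the cost of forbidding all deliberate globals too (you'd use rawset for those).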
It would be fine, I think. Just create the variable in the scope you want it, then assign to it somewhere else. If you get it wrong, you get it wrong, and there's scope for weird bugs from that - but the effect is at least localized, lexically.
(Of course, the right thing to do is have explicit variable declarations...)
Localized bugs are still bugs. Better always require a variable declaration and have no defaults.
Usually languages use local-by-default to avoid the need for variable declarations in the first place (in a misguided manner, IMO), rather than to avoid globals.
A. C is a beautiful language. I believe a more productive attitude towards any use of C - as a binding/engine layer, or whatever - is to assume there is no one familiar style. (What Lua's source does have is good compliance with its own coding rules.)
B. You control the VM, completely. The Lua language is designed with the VM integrator in mind. Lua frameworks are not something you inherit. So .. Why does global name-spacing versus local definition bother you? Show me your offensive global/local case; in every single case, it can be solved with better design. At the very least, walk _G yourself.
C/D. It's the 21st century; 1-based indexing makes sense to most of the non-hacker world, since you can't have a "0th" box in which to store things in the real world. Except of course you can: you just make one and label it, arbitrarily, the '0' box .. and in fact you do have the ability to index everything 0-based yourself in Lua, so have at it. Lua provides the 'one type to rule all things', the Lua table{}, to give you the options you need. Want an array? Use your new_array{} as an array. Want a string dictionary? Have at it. Need a hash? Ditto. Bonus points: understand what you're doing, and get features from the table type, like scatter detection and packing/indexing/ordering operations. Lua metatables are your friend, not an enemy. Learn to use the table type and you will become a convert.
E. C-side API. You do have to understand the design constraints of the VM. That's about all you have to do. Is that really so weird?
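The 'one type to rule all things' argument above can be sketched concretely; Point and its methods below are made-up names, not anything from Lua's standard library:

```lua
-- One table type serves as array, dictionary, and object.
local arr  = { 10, 20, 30 }                     -- array-style: keys 1..3
local dict = { red = "#f00", green = "#0f0" }   -- dictionary-style

assert(arr[2] == 20)
assert(dict.red == "#f00")

-- With a metatable, the same table type becomes an "object":
local Point = {}
Point.__index = Point

function Point.new(x, y)
  return setmetatable({ x = x, y = y }, Point)
end

function Point:len2()   -- ':' sugar passes the receiver as 'self'
  return self.x * self.x + self.y * self.y
end

assert(Point.new(3, 4):len2() == 25)
```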
Arrays start at zero because what you're counting is how far you're moving a pointer from the initial address in memory, not the items themselves. Array[i] is literally 'read what's at the start address plus i * sizeof(element)', so Array[0] is the element at offset zero. Zero-based indices make perfect sense when you know what they actually indicate.
Although to be fair, starting with one and actually counting the items makes sense too, and seems more intuitive - and since Lua has no pointer arithmetic, the offset argument may not apply to Lua at all.
a) it doesn't matter b) lua doesn't have these pointer things you speak of. If you are going to hate on 1 based indexing at least have the fine style of EWD.
Off-by-one errors are enough of a problem without having to remember that Lua decided [1] means something other than what [1] means in almost every other language, so plan accordingly.
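For the record, you can index a Lua table from 0 if you insist, but the standard library won't follow you, which is exactly where the off-by-one traps live (plain Lua sketch):

```lua
-- Storing at index 0 is legal; it just goes in the table's hash part.
local t = { [0] = "a", "b", "c" }   -- "b" lands at index 1, "c" at 2

assert(t[0] == "a")
assert(#t == 2)                      -- '#' only counts indices 1..n

-- ipairs also starts at 1, silently skipping t[0]:
local seen = {}
for _, v in ipairs(t) do seen[#seen + 1] = v end
assert(table.concat(seen) == "bc")
```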
Symbol types[1] are useful for storing identifiers, method names or enums. They are common in LISPs and in Lua you use (immutable) strings for a similar purpose[2].
The main advantage of symbol types is that they have a more restricted API than strings and that they might be more efficient (integer vs. string representation).
Since you pointed out that Lua strings are immutable you're aware of what I'm about to say, but just to make it explicit:
The efficiency argument is a common one for symbols in general, but in Lua it doesn't really hold, because string comparison has constant cost (i.e., it's a simple pointer test, since strings are interned thanks to their immutability).
(This may or may not be an issue, but it still offended me somewhat, since you're getting all of the compile-time cost at runtime, combined with none of the expressiveness. Not a tradeoff that impresses me, personally, even if the speed is fine - but it takes all sorts.)
> Lua is uncannily similar to JavaScript in semantics
I see what you mean, but before people run away screaming: the similarity lies more in the way it is amenable to Self-style prototype-based object orientation, and not in the various "wat"-style semantic quirks of JavaScript.
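To make that concrete: a minimal sketch of Self/JavaScript-style prototype delegation via the __index metafield (the names here are made up):

```lua
-- Prototype-based delegation, much like JavaScript's prototype chain:
-- a failed lookup on an object is retried on its prototype.
local animal = { sound = "..." }
function animal.speak(self)
  return self.name .. " says " .. self.sound
end

local dog = setmetatable({ name = "Rex", sound = "woof" },
                         { __index = animal })

-- 'speak' is not on dog itself; the lookup is delegated to the prototype.
assert(dog:speak() == "Rex says woof")

-- Fields absent from the object fall through to the prototype's defaults,
-- just as with JS prototypes:
local cat = setmetatable({ name = "Tom" }, { __index = animal })
assert(cat.sound == "...")
```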
A very interesting article. While the idea of using scripting to extend OS functionality has been around for a long time, surprisingly few attempts have been made to implement it.
Of the potential uses, an OS built-in webserver would be especially appealing. Many scripting languages provide tcp server/client APIs so it doesn't seem too much of a stretch to embed this functionality in a kernel. For example, Tcl's "socket" core command makes it very easy to set up a tcp server or client.
If performance or security is an issue, some interpreted languages can also be compiled to native code. There are Scheme implementations that are embeddable in C and can load interpreted or compiled modules.
One thing I didn't completely grasp is the distinction of "embedding" vs. "extending". Are these really disjoint ideas? I found this puzzling:
"... Extension scripts are loaded into instances of a kernel-embedded Tcl interpreter that execute in independent system processes [22]. A set of extension bindings exposes to the scripts the necessary resources for extending the kernel. What distinguishes our approach from [micro]Choices is our support of scripting by embedding the language interpreter, in addition to scripting by extending it." [emphasis is mine]
On the whole, the idea seems familiar and sensible where sane security policies and safety constraints are applied to actions of the embedded scripting language.
Yes, there is a difference between embedding and extending. Extending means your app is just a module that lives in the language's interpreter. Embedding means the language interpreter lives in your app.
One advocate of extending prefers it simply because he got bitten by his own poor design.
Another advocate prefers it simply because he prefers JavaScript, and the easiest way to embed it is to write your app as C plugins that hook into V8.
The Python community will unfailingly tell you to choose extending over embedding, delegating their reasoning to a highly questionable rant.
The 'Python community' says those things because the CPython codebase is a PITA to embed, that is the only reason. With a clean design, like the Lua interpreter, embedding vs extending is about equal.
Re: extending vs. embedding, it depends on whether your main app is written in Lua, with C bindings (extending), or in C, with the C code calling the Lua interpreter (embedding). See for instance this Python guide: https://docs.python.org/2/extending/
I'm working on something that goes in this direction but it will still take a lot of work: https://github.com/mntmn/home-lisp
I think the "peripheral agnostic" idea might not work out, though.
A while ago, I hacked some forth interpreter (I don't recall which one) into the Linux kernel. It was just fooling around though, and didn't do anything useful or hook into anything interesting. This looks like a much more practical approach to that kind of thing.
A lot of the discussion here has been around Lua, but not around what they do with it.
I think that putting yet another language and runtime in the OS kernel is a bad idea. It makes the amount of code to validate much bigger. I much prefer the Mill Computing approach to security, where "everybody works the same" including the kernel (see http://millcomputing.com/docs/security/ if you have not seen it already). That's also the idea behind many microkernels.
Although the Mill CPU embeds basic mechanisms in the hardware to make it easier, I believe this could be achieved on modern x86 or ARM as well with reasonable performance. You don't need privileged instructions to deal with packet routing or CPU throttling, you only need controlled memory access to specific regions of memory (e.g. device registers, buffers, etc). I see no reason why the scripts could not do their work from user-space, with kernel-controlled access to the regions of memory they need to operate. And the overhead in doing that is practically zero (basically, it's the cost of the TLB translations, which is paid for every single memory access anyway).
Obviously, you also need inter-process and inter-CPU synchronization, interrupt handling, etc. But all of these are already presented in a virtualised form to user-space.
Also, if you want the flexibility of a scripting language, you are presumably willing to forgo quite a bit of performance in the process. Is the cost of a user-kernel transition really that relevant in this scenario? And if it is, there are mechanisms to mitigate that cost, e.g. pooling requests, deferring work, or sending it to another CPU.
I'm a bit appalled by the way they measure the overhead (Section 5 of the article). They say that their CPU rethrottling code runs in "only" 8 microseconds. Compared to the rescheduling interval that may seem small, but 8 microseconds in kernel code is actually a lot: it's 1/125 of a regular 1 ms tick. All this just to change the frequency of the CPU?
In short, I agree with the objectives (making the OS more extensible and more flexible), but the way they did it seems dangerous and suboptimal.
> I think that putting yet another language and runtime in the OS kernel is a bad idea.
I'm of the opposite opinion. My experience with distributions and host-OS integrated libs+bins, from a product perspective, leans me towards the new-school idea of putting the Linux kernel on a bare machine with nothing but Lua - because this is already an established configuration and approach in embedded computing. Lua is used industrially as a configuration and embedded scripting language; it is one of the small/light/good-enough gems of the open source world.
A full-stack Linux OS with strictly Lua front-end, from embedded to desktop, is an achievable target.. and indeed a worthy goal.
There will always be new languages. Always. But what really matters is the usage factor. Putting an extreme case forward as a simplification of the field will always, thus, be an interesting exercise. As we can see with the router-Lua guys, often profitable.
Imagine that space encroaching on Android, et al. There are some who say that a bridging maneuver against the disastrous, developer-mindset-killing "iOS vs. Android" battle has been waged, and its flag is Lua.
I really like NetBSD. It has a clean design and it was the first Unix-like OS that I really picked up on, right around the v1.5 - v1.6 releases (2001 to 2002).
I haven't been able to run it much since then due to owning incompatible hardware, but after seeing how well it runs in emulation, I am considering recycling some PC hardware and running it on top of Microsoft's free Hyper-V Server 2012. The only problem is that Hyper-V Server doesn't support wireless networking and where I want to put it isn't anywhere near a router, so I am unsure how to handle networking.
The article is well-written and makes a good point that scripting languages open new avenues for OS extension.
However the question remains whether Lua is the right candidate for the job, being not-so-open yet. Obviously the hacker cultures in NetBSD world and Lua world are still quite different, so I see no official integration in any foreseeable future.
A lightweight LISP or Scheme, on the other hand, might be a good candidate for a long-term implementation of the article's claims.
Yes, the MIT License is one of Lua's very, very key features. Those who use Lua and ship it in the millions to paying customers are only able to do so, the way they're doing it, because of the freedoms of the MIT License.
Apropos Lua not being 'open', or being a 'bad language': as a user of Lua across the full spectrum, I can definitely say "meh!" to all nay-sayers. For every detraction there is a significant advantage: LuaJIT, cache-coherency, access to high-performance components, all well-compartmentalized thanks to a simple discipline .. these are the things to love about Lua. But please, anyone not yet familiar with the subject: approach Lua not critically but from the perspective of the progressive - if you haven't grok'ed the way you can drop the VM into anything and pass it byte-code, then please do that first. Grok the VM, guys. Fact is: Lua can be glommed onto anything.
Schemix was "a Scheme system implemented as a patch to the Linux kernel ... for exploration of the Linux kernel and for rapid, interactive prototyping of Linux drivers and other new kernel features".
Lua is free open-source software, distributed under a very liberal license (the well-known MIT license). It may be used for any purpose, including commercial purposes, at absolutely no cost. Just download it and use it.
I wouldn't say Lua is similar to C as far as syntax and semantics go (for example, array indexes start from 1). That said, Lua is great at working with C - it's very easy to call C code from Lua and vice versa.
The Lua module is already in NetBSD, that's pretty official integration.
There is also a large overlap between people who use NetBSD and people who use Lua, probably due to the fact that they are both relatively small C-based projects that try to do one thing well. Indeed, the linked paper is written by two NetBSD committers, the main author of Lua, and another person on the Lua team.
[1]: http://www.lua.org/source/5.2/
[2]: like this: https://github.com/sdegutis/mjolnir/tree/master/Mjolnir/lua
[3]: here's Lua's EBNF in its entirety, fitting on a single screen for me: http://www.lua.org/manual/5.2/manual.html#9