And it pairs well with another article on the front page. [0]
Which I bring up because they disagree on a particular point. And that is how a script without a shebang gets run as a script.
> This is done by registering a set of callback functions, and these callbacks get invoked when the kernel is asked to execute a binary file. The kernel invokes the callbacks on this list, and the first one that claims to recognize the file takes responsibility for getting it properly loaded into memory. If nobody on the list accepts it, then as a last resort the kernel will attempt to treat it as a shell script without a shebang line. And if that doesn't fly, then you'll get that "Exec format error" message described above.
But the article I linked to says the shell actually handles it. And based off of its research (terribly reproduced below), I'm inclined to believe it.
echo echo Hello world > test.sh
chmod +x test.sh
strace ./test.sh
strace sh -c ./test.sh
You'll see the first one errors with `ENOEXEC`, but the second one does not. Also, in my head, I don't know how the kernel would know what shell to choose, or that it should even expect to have access to a shell.
So I read the other article, and I saw that bit that disagreed with my essay. My first thought was, "Oh, of course that's how it works. How did I get that so wrong?" My only excuse was that this essay was originally a tech talk and I was under a deadline. (But I really should have caught it when I wrote it up as an essay.)
So I was going to go edit my essay, when I learned that my essay was also posted on Hacker News. And now I discover that someone has already called out my error before I could fix it. Sigh.
Anyway, I just thought I should acknowledge this before I go to fix it.
And kudos to you for following up with your own exploration. It's so often the case that the most interesting stuff is hidden in what other people are wrong about.
Both articles are correct, from me reading them. When you invoke a shell script directly, it gets passed to the kernel to try and execve. The kernel returns ENOEXEC when it detects it doesn't have a shebang. The shell catches the error, and then as a last resort, tries opening the file and interpreting its instructions.
I'll quote the line more explicitly from this article:
> If nobody on the list accepts it, then as a last resort the kernel will attempt to treat it as a shell script without a shebang line.
They said that the kernel is responsible for invoking the shell. I honestly think this was just a brain fart and the author meant to put shell and not kernel. With both words flying around in your head, it's an easy mistake to make.
But, the again, the article goes on to talk about how it decides to even try that last step:
> Interesting side note: The kernel decides whether or not to try to parse a file as a shell script by whether or not it contains a line break in the first few hundred bytes — specifically if it contains a line break before the first zero byte. Thus a data file that just happens to have a "\n" near the top can produce some odd-looking error messages if you try to execute it.
I decided to do a bit more testing to make sure that the newline in the script wasn't causing the kernel to do anything different. What I noticed is the output of strace is identical between the different variations of the strace invocation, with one difference, with a new line, there's an extra read call, but that's just for the shell to see what's left to run.
I guess my next step is to look at the kernel source itself. I'll probably end up doing that in a bit.
So, I've dug into the kernel code. I can't find anywhere that has a fallback mechanism. When it fails, the errors bubble up. I might not be looking in all the correct places, but I believe the shell is responsible for attempting to execute the process.
I also put together two version of the same call to a shebangless script in Python, one with `shell=True` and the other without. It's only the one that calls into the shell that successfully runs the script. The strace outputs corroborate my theory.
Without shell=True (truncated)
[pid 961626] execve("./sh.sh", ["./sh.sh"], 0x7fff7bae94a0 /* 66 vars */) = -1 ENOEXEC (Exec format error)
With shell=True (truncated)
[pid 961623] execve("/bin/sh", ["/bin/sh", "-c", "./sh.sh"], 0x7ffd75009e50 /* 66 vars */) = 0
[pid 961624] execve("./sh.sh", ["./sh.sh"], 0x5980a07c70a8 /* 66 vars */) = -1 ENOEXEC (Exec format error)
[pid 961624] execve("/bin/sh", ["/bin/sh", "./sh.sh"], 0x5980a07c70a8 /* 66 vars */) = 0
And it pairs well with another article on the front page. [0]
Which I bring up because they disagree on a particular point. And that is how a script without a shebang gets run as a script.
> This is done by registering a set of callback functions, and these callbacks get invoked when the kernel is asked to execute a binary file. The kernel invokes the callbacks on this list, and the first one that claims to recognize the file takes responsibility for getting it properly loaded into memory. If nobody on the list accepts it, then as a last resort the kernel will attempt to treat it as a shell script without a shebang line. And if that doesn't fly, then you'll get that "Exec format error" message described above.
But the article I linked to says the shell actually handles it. And based off of its research (terribly reproduced below), I'm inclined to believe it.
You'll see the first one errors with `ENOEXEC`, but the second one does not. Also, in my head, I don't know how the kernel would know what shell to choose, or that it should even expect to have access to a shell.[0]: https://news.ycombinator.com/item?id=43646698