(in reality it's barely even that: it only handles GET and POST methods, discards every header, …, so it's an HTTP server only in the sense that it will, kinda sorta, respond to HTTP requests).
So there is no keep-alive in these tests, including from the lb to the application? That explains why the qps is so low on a single core for all versions tested, if nginx has to open a new socket to zerohttpd for each request. Not sure how useful this is, as keeping the connection to your lb alive is important for throughput.
Not in all use-cases. If your backend is serving long-lived HTTP streams (big downloads; chunked SSE streams; websocket sessions), it may make more sense to close and re-open those sockets between sessions, since they live long enough to establish TCP window characteristics that may not apply to the session succeeding them (e.g. an interactive-RPC websocket session, reusing a TCP connection previously used to stream a GB of data using huge packets, will start off quite a bit slower for its use-case than a “fresh” TCP session would.)
Keep-alive is a win in most situations, but especially so in any kind of benchmarking, since it's very easy to hit bottlenecks in opening and accepting new connections. If you are opening and closing thousands of connections per second, ephemeral-port space and TCP tuning can become the limiting factor, regardless of your server architecture.
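To put a rough number on the "running out of ports" point: here's a back-of-envelope sketch, assuming Linux's default ephemeral port range (32768–60999, per `net.ipv4.ip_local_port_range`) and the common 60-second TIME_WAIT duration. These are typical defaults, not measured values from the benchmark in question.

```python
# Back-of-envelope only: without keep-alive, each finished connection to the
# same backend ip:port leaves its ephemeral port stuck in TIME_WAIT before
# it can be reused. Values below are common Linux defaults, not measurements.
EPHEMERAL_PORTS = 60999 - 32768 + 1  # default net.ipv4.ip_local_port_range
TIME_WAIT_SECONDS = 60               # typical TIME_WAIT duration

max_new_conns_per_sec = EPHEMERAL_PORTS / TIME_WAIT_SECONDS
print(f"~{max_new_conns_per_sec:.0f} new connections/s before port exhaustion")
```

That works out to roughly 470 new connections per second to a single backend ip:port. The real ceiling depends on sysctls like `net.ipv4.tcp_tw_reuse`, but it shows why a no-keep-alive benchmark can bottleneck on the kernel's connection bookkeeping rather than on the server architecture being measured.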
As mentioned in the first part of the series, the main idea is to compare and contrast Linux server architectures. ZeroHTTPd doesn't implement the HTTP protocol in full and you can easily crash it because of the way it uses memory buffers. It is not safe to run it on the internet.
Its purpose is not to show how to implement an HTTP server, but to show how different architectures of Linux network servers are written and perform.