On a VM, you often don't. I get a login: prompt on a bhyve virtual Debian or FreeBSD in seconds.

Probably you're running in a mode which has no "hints" to avoid probe-delay costs. Or you depend on subsystems (SCSI?) which impose device spin-up delays to avoid surges in current draw on the PSU, so e.g. RAID disk arrays delay boot to cohere and stabilise before they come under CAM control.

But usually, it's because people run un-optimised, generic, factory-shipped configurations, because 30s is lost in the noise. If you dive down the rabbit hole, you can get a LOT faster.
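If you want a taste of that rabbit hole on FreeBSD, here is a minimal sketch of the kind of tunables involved; the values are illustrative only, and defaults vary by release:

    # /boot/loader.conf -- illustrative values, not a recommendation
    autoboot_delay="-1"        # skip the loader menu countdown entirely
    kern.cam.scsi_delay="500"  # shrink the SCSI/CAM bus settle delay (default 5000 ms)
    hw.memtest.tests="0"       # skip the early boot-time memory test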



Never mind seconds, I have the FreeBSD kernel boot (aka time-to-init(8)) down to 40 ms in Firecracker.
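For anyone curious what that looks like in practice, a minimal sketch of a Firecracker config file; the kernel path and sizes are placeholders, not the actual setup described above:

    {
      "boot-source": { "kernel_image_path": "/path/to/freebsd-kernel.bin" },
      "machine-config": { "vcpu_count": 1, "mem_size_mib": 128 }
    }

    # launch straight from the config file, skipping the REST API
    firecracker --no-api --config-file vm.json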


> so e.g. RAID disk arrays delay boot to cohere and stabilise before they come under CAM control.

Huh. So is that why it takes server hardware two to four minutes to get past the BIOS screen? Gradual PSU-protecting spin-up of hardware RAID?


Depends on the specific server hardware, of course. Generally, yes: the hardware RAID controller does consume a great deal of time (often displaying a splash screen with progress on the console, if one is connected at boot).

Server hardware also tends to do other hardware checks/validation before booting. Some of these checks can be disabled, and others cannot.

Then there are systems like the IBM iSeries running IBM i (OS/400), where the firmware hands over to the OS to perform a great deal of system-integrity checking before enabling services and I/O. Our Power7 "400"-series system used to take upwards of 20 minutes to come online before we decommissioned it.

Generally speaking, for a server, boot time is not a factor. Nobody needs their physical server to boot and be ready in 5 seconds... so it's never been a priority. Stability and integrity are much more important in this arena.


> Generally speaking, for a server, boot time is not a factor. Nobody needs their physical server to boot and be ready in 5 seconds

Nobody has had that option. I very much have use cases for this, and while the firmware remains a persistent problem, I've been able to bring up the kernel and userspace on server hardware in a fraction of a second.

I'd love to have the firmware stop adding so much time here.


I'm curious what that use case might be, if you are able to elaborate.

Physical servers tend to have extreme uptimes, compared to desktops and even VMs. It's a fairly rare occurrence to need to reboot a physical server (by rare, I mean only when doing updates that require a reboot, such as a kernel update, and these days there are mechanisms to avoid rebooting even under that scenario; see the sketch below).

Needing to reboot often and be available in seconds just isn't a thing most folks operating physical servers need. Typically, folks who require that operate VMs instead, which can freely be rebooted without powering down the hardware.
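On the "mechanisms to avoid rebooting" point: one such mechanism is kernel live patching. A hypothetical example using the Linux kpatch tooling (the module name is made up):

    # apply a kernel patch to the running system, no reboot
    kpatch load livepatch-fix-example.ko
    kpatch list    # show loaded patch modules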


> Physical servers tend to have extreme uptimes,

And that might be a bug, not a feature.

If you are running on physical servers you’re likely to be overprovisioning, at least a bit, to be able to absorb peak load.

But most of the time you’re not running at peak load.

If you can scale your workloads up and down, via containers or anything else, that excess capacity is wasted: you have servers turned on, doing nothing but burning electricity.

In that context you might want to just turn off servers and turn them back on when needed.

And in that context you want servers to boot quickly.
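As a sketch of what "turn them back on when needed" can look like in practice: most server BMCs expose power control over IPMI, and commodity hardware can often use Wake-on-LAN. The hostname, credentials and MAC address below are all made up:

    # power a parked node back on via its BMC
    ipmitool -I lanplus -H bmc-node17.example.com -U admin -P secret chassis power on

    # or, for hardware with Wake-on-LAN enabled in the NIC/firmware
    wakeonlan 00:25:90:ab:cd:ef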


In that context, does it not make more sense to rent the infrastructure instead of hosting in-house? Or use commodity hardware that does not have all the redundancy and resiliency baked in at the hardware level - after all, it's completely unnecessary for short-lived services.

We have solutions for these scenarios already - VMs, serverless, containers, etc. I am having a really difficult time understanding the scenario where someone must use a "proper" server, with all its long-runtime, redundant, resilient hardware configuration, but needs it to turn on and off quickly.

Square peg, round hole. There are much better solutions available, starting with commodity hardware and ending with "the cloud".


> In that context, does it not make more sense to rent the infrastructure instead of hosting in-house?

Yes, but it also makes sense to be able to rent it for short periods of time, billing by the second. And sometimes you do need bare-metal, and a VM won't do.


Many people want to use physical servers via the same kind of infrastructure that gets used for VMs. Consider something comparable to AWS's `.metal` instance types or the Kubernetes "bare metal" provider, for running things that don't run in a VM (such as VMs, since nested virtualization is variously unavailable or slow).
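As a concrete sketch of renting that kind of capacity on demand with the AWS CLI (the instance type is a real .metal type; the AMI ID is a placeholder):

    # bare-metal EC2 instance, billed per second like other Linux on-demand instances
    aws ec2 run-instances --instance-type c5.metal \
        --image-id ami-0123456789abcdef0 --count 1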


Perhaps I'm just "old-school", but I still am not seeing the reason why this physical server needs to be capable of coming online in seconds. When renting a physical server, one typically understands the trade-offs being made versus operating a pure VM infrastructure, even for development purposes. VMs alleviate concerns about hardware for the most part.

Physical servers just are not typically spun up and down in short succession - making the boot time completely irrelevant for all but the most extreme use cases. Many/most of the things a physical server is doing during POST and boot are necessary to guarantee reliability and integrity.

Perhaps off-the-shelf desktop hardware is more what should be prescribed for your scenario.


> I still am not seeing the reason why this physical server needs to be capable of coming online in seconds.

Spinning up in response to requests, without having to be constantly running (and being billed for).


Is this not the perfect (and intended) use case for VMs, containers, "serverless" functions and other mechanisms that all ride on top of already-booted physical hardware?

It seems to me this particular use case is a lot like wanting a forklift capable of driving at highway speeds. Can it be done? Sure. But then you sacrifice a lot of what made the forklift good at its intended job.

There is nothing to be gained from running short-lived code directly on physical servers. Today's hypervisors are very good at staying out of the way, i.e. there is little to no performance penalty from running a VM, particularly one with dedicated resources.

In any event, what you desire can already be achieved, just not with your typical IBM/Dell/HP/whatever server platforms. The things that make these machines desirable for long-running systems (and earn them the designation of "server") are unnecessary if the goal is to boot, execute some code, shut down, repeat. Off-the-shelf commodity "desktop" hardware is fully capable of booting in seconds and executing code; you just lose all the redundancy/reliability, but that's not needed under this scenario anyway.


My system boots slowly because I have a legacy SCSI card installed, and its boot ROM takes ages to get running and scan the bus. Servers are more likely to have these sorts of facilities onboard.


Absolutely. Although I wouldn't peg this to just legacy support - in my experience even modern SAS arrays take ages to come online.

As the GP noted, modern systems coordinate the array through an integral backplane and control the spin-up of individual disks, plus a lot more. The main difference between a "server" and a desktop is the reliability/integrity/redundancy built into the system, which comes at the cost of boot time. A worthy trade-off in most cases, and once the system is "alive", the expected performance is achieved.


> Generally speaking, for a server, boot time is not a factor. Nobody needs their physical server to boot and be ready in 5 seconds... so it's never been a priority. Stability and integrity are much more important in this arena.

Standard compute isn't the use case. Intel offers Slim Bootloader for systems that require very fast booting, as close to "instant on" as you can get. An example would be a collision-avoidance system in a vehicle: a safety-critical system that requires the ability to recover quickly.


RAM tests can be quite time-consuming. Not sure why they aren’t parallelized.



