
> Primarily because dedicated servers are a lot less efficient

Assuming you're operating at scale, I don't see why that would be the case. And if you're not, what's the point?

> The more different things you can pack on a machine while still ensuring that the high-priority/low-latency jobs get prompt access to the resources that they've reserved, the higher overall utilization you can achieve (and hence bring costs down),

Yes, that's the theory. But in practice you're assuming that static, up-front control over placement beats the dynamic control the operating system has while actually running multiple jobs. So you'll either need to over-provision, and then you're back to square one on utilization, or you'll under-provision and run into performance issues. TANSTAAFL.

(For instance, what's to stop each container from shipping its own copy of the same library, say SSL, as a dependency?)




Yes, I think it's safe to say that Google operates at scale.

A lot of user-facing services at Google have to be over-provisioned in order to handle the cyclical usage patterns (the daily query peak is far higher than the average for most services) and to be able to survive the loss of a datacenter or two. This results in a lot of under-utilized servers for a big fraction of the time. So by packing lots of medium and low priority jobs on those same servers (and over-committing the resources on the server), you can soak up the slack resources; in the event that the resources are needed by the user-facing service, the kernel containers ensure that all the less latency-sensitive jobs on the machine don't compete for resources with the user-facing service.
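
To make that concrete, here's a rough sketch of the same idea with today's cgroup v2 interface (not what Borg actually used, and the group names and limits are made up): the serving job gets a high CPU weight so it wins whenever there's contention, while batch work is free to soak up whatever is idle.

    # Minimal sketch, assuming cgroup v2 mounted at /sys/fs/cgroup and root privileges.
    # Group names ("serving", "batch") and limits are illustrative, not Borg's.
    from pathlib import Path

    CGROUP_ROOT = Path("/sys/fs/cgroup")

    def make_group(name: str, cpu_weight: int, memory_high: str) -> Path:
        """Create a child cgroup with a CPU weight and a soft memory limit."""
        group = CGROUP_ROOT / name
        group.mkdir(exist_ok=True)
        # Higher cpu.weight wins when the machine is contended; when it isn't,
        # lower-weight groups can use as much idle CPU as they like.
        (group / "cpu.weight").write_text(str(cpu_weight))
        # memory.high throttles the group before it can squeeze out the others.
        (group / "memory.high").write_text(memory_high)
        return group

    # Enable the cpu and memory controllers for child groups.
    (CGROUP_ROOT / "cgroup.subtree_control").write_text("+cpu +memory")

    serving = make_group("serving", cpu_weight=10000, memory_high="max")
    batch = make_group("batch", cpu_weight=10, memory_high="8G")

Under contention the serving group gets roughly 1000x the CPU share of the batch group; on an idle machine the batch group can still burn every spare cycle, which is exactly the over-commit win.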

It's true that the performance isolation when there are tens of jobs running on the same machine isn't going to completely match the performance isolation of running a service on a dedicated server, even with kernel resource isolation via containers, but you have to make cost trade-offs somewhere. The number of Borg services that could justify requesting dedicated machines was very small.

And to address your other concern about the OS not having much insight into what's going on: Borg containers generally consisted of a single process, running on the machine's normal kernel. The containerization was just for in-kernel resource accounting/isolation (using Linux control groups, rather than anything fancier like LXC or Xen).
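
For anyone curious what "just in-kernel resource accounting" looks like, here's a minimal sketch using the modern cgroup v2 files (the Borg-era interface was older; the service binary and group path here are hypothetical, and it assumes the group from the previous snippet exists):

    # Hedged sketch: one process per cgroup, accounting read straight from kernel files.
    import subprocess
    from pathlib import Path

    group = Path("/sys/fs/cgroup/serving")

    # Launch the single service process and move it into the group; from then on
    # the kernel charges its CPU and memory usage to this cgroup.
    proc = subprocess.Popen(["/usr/bin/my-service"])   # hypothetical binary
    (group / "cgroup.procs").write_text(str(proc.pid))

    # The accounting is just files the kernel keeps up to date.
    print((group / "cpu.stat").read_text())        # usage_usec, throttling stats
    print((group / "memory.current").read_text())  # bytes currently charged

There's no separate image, filesystem, or init system involved; the "container" is purely the kernel tracking and limiting one process's resource usage.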


I think it's a fairly safe assumption to say that Google (and FB and a bunch of other extremely large web properties) run into different problems than those that are faced on a day-to-day basis by most run-of-the-mill web companies.

Thank you for the insight into the number of processes inside a typical Borg container. So it was basically a kind of 'heavy process' rather than a complete application with all its dependencies (including other processes the main one depended on) packaged in; that's something I wasn't expecting at all.


> Borg containers consisted generally of a single process...

Really?

My impression was that typically you'd have a process for the service, a borgmon process for monitoring, and maybe another process to ship logs off in the background.

Developers would only think about the service process (which itself typically was a fairly thin shim in front of other services), but a borg container would have more than that going on in it.


The borgmon process would be a completely separate job on separate machines (generally with a lot fewer instances).

The logsaver would also be a separate job, although typically running co-located 1:1 with instances of the actual service job. The service and the logsaver would have access to the same chunk of disk (where the logs were generated) but otherwise they were separate as far as the kernel was concerned. (As far as Borg was concerned they were very much related, but that was at a much higher level than the kernel).


You are right. I was remembering that I'd seen those three together in the borg file, and was thinking of them as co-located because of that.



