
After working in a few pretty large environments I have some pretty strong opinions about hostnames. I've seen everything from cute theme based names to function/role based names to meaningless uuid based names.

Usually you see the theme based names in much smaller environments, though I've seen it up to hundreds of machines. One problem I have with this approach is that people begin to personify machines, excusing their behavior. "Oh that's just akira being akira." This is counter to actually understanding and diagnosing problems. You're also relying on potentially faulty/emotional memories rather than having an actual stored account of issues. Not to mention, once you have a sufficient number of machines you're going to need a decent service database to know what runs what anyways.

Function/Role based names are what I have the most historical experience with. CBS Interactive, CNET, YouTube, and Dropbox all went this route. This often seems like a great idea and can get you really far. Most configuration management allows you to define classes/nodes/etc based on regular expressions. Sudoers has built in support for hostname globbing. It's also easy to tell what a machine's role is when you get an alert. So why wouldn't you want to go this route?
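
To make the regex-matching idea concrete, here is a minimal Python sketch of how a tool might derive a role from a hostname. The patterns, role names, and the role_for helper are all made up for illustration; this is not how any particular configuration management system implements it.

```python
import re

# Hypothetical role patterns; hostnames and role names are illustrative only.
# Order matters: the first matching pattern wins.
ROLE_PATTERNS = [
    (re.compile(r"^mobile-web\d+$"), "mobile-web"),
    (re.compile(r"^web\d+$"), "web"),
    (re.compile(r"^mysql\d+$"), "database"),
]


def role_for(hostname):
    """Return the role for the short hostname, or None if nothing matches."""
    short = hostname.split(".", 1)[0]  # drop any domain suffix
    for pattern, role in ROLE_PATTERNS:
        if pattern.match(short):
            return role
    return None
```

With these assumed patterns, role_for("mysql255") gives "database", while a box named "misc1" matches nothing, which is exactly where this scheme stops being helpful.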

This method has some pitfalls that become increasingly burdensome as you grow to a larger number of machines. You might have many types of services that run on a single hardware class. This introduces overhead whenever you need to update a machine's role. Imagine it's Saturday and your mobile-web pool is suddenly underprovisioned. Well, we have plenty of capacity in our web pool. Just need to rename the host in the machine database, update DNS, update DHCP, etc. While this might not seem like a lot of effort, it definitely adds up. When I worked on a fantasy sports product we were constantly renaming hosts as various seasons came around. Same thing with just keeping spare/provisioned hosts around. How do you know how to name the host so that someone can just grab it when they need it? You also eventually get boxes that serve more roles than their names describe. You'll end up with a box called misc or admin and no one remembers that one day someone set it up as your static origin.

Another fun problem is the inability to describe more in depth what a box does. mysql255 doesn't tell you if it's a master/slave, which data is on it, should it be backed up? I've actually seen places that encoded all of this in a hostname. This is the extreme but it does happen.

The interesting thing about both theme and function/role based names is that to use them effectively in a large environment you already need good tools for managing the roles because the hostname is not effective enough.

At Dropbox, they started with functional/role based names but we decided to move to positional hostnames. These encode the dc, rack, position, and chassis. One immediate benefit we saw with this was quickly identifying when a rack or quad goes down. It's very apparent in monitoring just at a glance. We also have a service database that maps hostnames to tags/roles and those tags appear alongside alerts so you can tell right away which service is affected. We get benefits in easy preprovisioning and reprovisioning. Our configuration has become more generic and easier to comprehend. We can run multiple roles easily on one machine and it's easily discoverable which machines serve which role. In puppet we use an External Node Classifier rather than regex. We only have to worry about base config when a rack is initially installed.
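
A rough Python sketch of the positional idea, under loud assumptions: the hostname layout "<dc>-r<rack>-u<unit>-c<chassis>", the SERVICE_DB dictionary, and every function name here are hypothetical stand-ins, not the actual Dropbox format or tooling. It only shows how position can be parsed from the name while roles come from a separate lookup.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional


class Position(NamedTuple):
    dc: str
    rack: int
    unit: int
    chassis: int


# Hypothetical layout: <dc>-r<rack>-u<unit>-c<chassis>, e.g. "dc1-r12-u07-c2".
HOST_RE = re.compile(
    r"^(?P<dc>[a-z]+\d+)-r(?P<rack>\d+)-u(?P<unit>\d+)-c(?P<chassis>\d+)$"
)

# Stand-in for a service database mapping hostnames to tags/roles.
SERVICE_DB = {
    "dc1-r12-u07-c1": {"web"},
    "dc1-r12-u07-c2": {"web", "static-origin"},
    "dc1-r13-u01-c1": {"mysql"},
}


def parse_position(hostname: str) -> Optional[Position]:
    """Decode the physical position from a hostname, or None if it does not fit."""
    m = HOST_RE.match(hostname.split(".", 1)[0])
    if m is None:
        return None
    return Position(m["dc"], int(m["rack"]), int(m["unit"]), int(m["chassis"]))


def annotate_alert(hostname: str) -> str:
    """Attach the host's tags to an alert line so the affected service is visible."""
    tags = sorted(SERVICE_DB.get(hostname, set()))
    label = ", ".join(tags) if tags else "untagged"
    return f"{hostname} [{label}]"


def racks_with_alerts(hostnames):
    """Count alerting hosts per (dc, rack) so a whole-rack failure stands out."""
    counts = Counter()
    for name in hostnames:
        pos = parse_position(name)
        if pos is not None:
            counts[(pos.dc, pos.rack)] += 1
    return counts
```

Grouping alerts by (dc, rack) is what makes a dead rack obvious at a glance, and the tag lookup replaces what a role-based hostname would otherwise have to carry.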

That's not to say this method is flawless. One of the biggest drawbacks to this approach is that machines become harder to talk about and typing them is more difficult. I'd argue that you shouldn't care about specific machines unless they're a problem and in that instance copy/paste should always be used to avoid typos. There are some cool things you can do with PowerDNS' Pipe Backend to get dynamic resolution to service names. Another problem with this approach is it just requires a lot more tooling to get started. Obviously not everyone has the time to build all of the infrastructure around this ideal.

Anyways, sorry for the rant, but everywhere I've worked we've started with role based names and regretted it in the long term. Now that I've finally been able to live what I've long dreamed of, I couldn't imagine choosing another solution.

Obviously hostname conventions are a contentious issue so I won't say anyone is wrong, but these are my experiences.



One of my biggest pet peeves is when people name a host after a product, say 'mysql-prod-101', then later they decide to switch from MySQL to Oracle. I worked in an environment where a number of hosts contained 'tomcat' in the name but they were all running weblogic. Better to name based on the role, as you say, so the mysql-prod-101 would be better named db-prod-101. However, I've come to think that nowadays with configuration management it's not really necessary to give hostnames at all. People usually don't like that suggestion when I bring it up, but if you are using something like puppet's 'facter' to run actions only on hosts that match certain 'facts', what do you really need hostnames for?

Edit: I just reread your paragraph about using positional hostnames. This seems like a good use, because the main problem I see with role based hostnames is that hosts can take on multiple roles. But in general the hostname seems unimportant if you are tagging your hosts with roles.


Ya, all our dev tools servers are named after the product they're running. (Well, most of them; one is named after a mythological being; one is named after its function.) As a user, this is frustrating because rather than remember what function I'm trying to access (e.g. mail, wiki, bugs, etc.) I have to remember what Atlassian decided to name their product that implements the function I want.

(Fortunately some entries in /etc/hosts + a local Apache server set up with RedirectRules fixes this annoyance, at least for myself…)


I've been part of some large-scale setups where usually location (even numbers = DC A, odd numbers = DC B), operating system (linux/aix/win etc.) and machine type (physical vs. virtual) are encoded in the hostname.

It only gets error-prone if you have two hosts with similar names in the same project (made-up example: lnx4739p34 vs. lnx4793p34); I've seen hostnames getting reassigned because of regular mix-ups.

If you happen to do ssh logins and manual troubleshooting, to make sure you're not messing with the wrong machine it's great to have puppet maintain CMDB data like service level, machine purpose, owner, and open tickets in a nicely formatted /etc/motd file, along with an automatic "who" output in .bashrc to see who else is logged in.
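
As a small illustration of the motd idea, here is a Python sketch that renders a CMDB record into a banner. The field names in the record and the render_motd helper are assumptions for the example; how the data reaches the file (puppet template, cron job, etc.) is left out.

```python
def render_motd(hostname, record):
    """Format CMDB fields as an aligned text banner suitable for /etc/motd."""
    tickets = record.get("open_tickets", [])
    fields = [
        ("Host", hostname),
        ("Service level", record.get("service_level", "unknown")),
        ("Purpose", record.get("purpose", "unknown")),
        ("Owner", record.get("owner", "unknown")),
        ("Open tickets", ", ".join(tickets) if tickets else "none"),
    ]
    # Pad labels to a common width so the values line up in a column.
    width = max(len(label) for label, _ in fields)
    lines = [f"{label.ljust(width)} : {value}" for label, value in fields]
    rule = "=" * max(len(line) for line in lines)
    return "\n".join([rule, *lines, rule]) + "\n"
```

Calling render_motd("lnx4739p34", {...}) returns the banner as a string; anything missing from the record shows up as "unknown" rather than being silently dropped.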

Another question regarding hostnames is the use of (sub)domains. I find it highly annoying to deal with hosts like "mysql04.customer.vlan33.berlin2.intranet.my-company.com" instead of "lnx4739p34.myco.net", regardless of the global location.


The company I work for has about 20 sites with over 200 servers per site. We have cobbler boxes in each site (and more recently, puppetmasters). While some of our utility servers run as VMs under either VMware or KVM, for the bulk of them we want every last scrap of performance out of the server that we can, so we primarily provision on physical machines and treat those physical machines as just containers.

Our naming scheme for the servers is similar to how you describe Dropbox - there is a generic prefix and then a rack+uposition number (e.g. gen0112). This has worked out pretty well. When we build a site or replace a server, we record the MAC address. Then we can assign role-based child hostnames to the physical server name, update the cobbler config, and everything gets built accordingly.
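
A minimal Python sketch of that scheme, with assumptions flagged: it guesses that "gen0112" splits into two digits of rack and two digits of U position, and the INVENTORY contents (MAC address, alias names) are invented for illustration. It does not touch cobbler itself; it only shows the physical-name-to-role-alias mapping.

```python
import re

# Assumption: a prefix, then two digits of rack, then two digits of U position.
PHYSICAL_RE = re.compile(r"^(?P<prefix>[a-z]+)(?P<rack>\d{2})(?P<unit>\d{2})$")

# Hypothetical host database: physical name -> recorded MAC and role-based aliases.
INVENTORY = {
    "gen0112": {"mac": "00:16:3e:00:01:0c", "aliases": ["web03", "cache01"]},
}


def parse_physical(name):
    """Split a physical hostname into prefix, rack, and U position."""
    m = PHYSICAL_RE.match(name)
    if m is None:
        raise ValueError(f"not a physical hostname: {name!r}")
    return {"prefix": m["prefix"], "rack": int(m["rack"]), "unit": int(m["unit"])}


def alias_records(inventory):
    """Yield (alias, physical name) pairs, i.e. role names pointing at the box."""
    for physical, entry in sorted(inventory.items()):
        for alias in entry["aliases"]:
            yield alias, physical
```

Reassigning a role then means editing the aliases in the inventory; the physical name never changes.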

Of course, there is a nice host database to track all these mappings as well as scripts to generate and push those cobbler configurations.


> One problem I have with this approach is that people begin to personify machines, excusing their behavior.

Does that hold with all themes? My first employer named computers after aircraft parts; easy to remember, made it clear that people were talking about a computer rather than anything else, but it never seemed to lead us to personify them.

> I'd argue that you shouldn't care about specific machines unless they're a problem

True once you've scaled up to the point that all your services are distributed, but before then it's worth having an awareness of what runs on what box. Memorable names help with that.

One other downside you've missed with location-based is that it makes it harder to move a machine.


I saw this post and came to write a comment about "save yourself some trouble, make it random or positional and just have a functional inventory system which you'll need anyway" but hey, look, it's gmjosack beating me to it! ;)


Party on HN! Sorry I'm late. :)


I think it depends mainly on the stakeholders. If we're talking about physical machines, I suggest location/hardware based naming. For virtual machines, I'd choose function/role based naming. Though, this depends on type and size of engineering staff that needs to work with it. For end users, I'd add service based names, possibly as cnames. And I'd plan to change the naming scheme as the requirements change. Has anyone a naming scheme that lasted more than ten years?


The hostnames I develop with are like "imdopsjpl741", difficult to remember. I prefer names like "rover" and "elephant" but can understand how that could be worse for the server farm ops people.



