Whenever you are unsure whether to use a clever solution or follow the globally accepted standard in your work as a DevOps or Software engineer, always choose the standard.
Among other things, in an incident, people’s brains aren’t working - the more they have to remember about your particular system, the more likely they are to forget something.
While I agree in this particular instance, there are two types of things future engineers have to clean up after: their predecessors thinking too small (and picking the easy route) or too big (and adding needless complexity).
Neither is necessarily, in all instances, less of a mess to clean up than the other.
it was already a 40 year old standard at the time you're talking about.
awareness of UTC being the correct choice has definitely increased over time, but UTC being the correct choice has not changed.
you say reddit servers use UTC now, which implies there was a cutover at some point. were you still at reddit when that happened? were you still hands-on with server maintenance? any anecdotes or war stories from that switchover you want to share?
because I can easily imagine parts of the system taking a subtle dependency on Arizona being Reddit Standard Time, and the transition to UTC causing headaches when that assumption was broken. your memory of this "clever" trick might be different if you had to clean up the eventual mess as well.
Hold on, I'm not a sysadmin guy. Are you folks saying the server should not know what part of the world it's in, that basically it should think it's in Greenwich?
I would have thought you configure the server to know where it is and have its clock set correctly for the local time zone, and the software running on the server should operate on UTC.
From a logging perspective, there is a time when an event happens. The timestamp for that should be absolute. Then there's the interaction with the viewer of the event, the person looking at the log, and where he is. If the timestamp is absolute, the event can be translated to the viewer's local time. If the event happens in a different TZ, for example a sysadmin sitting in PST looking at a box in EST, it's easier to translate via the sysadmin's TZ env, and any other sysadmin's TZ anywhere in the world, than to fiddle with the timestamp of the original event. It's a minor irritation if you run your server in UTC and have to add or subtract the offset, e.g. if you want your cron to run at 6PM EDT, you have to write the cron for 0 22 * * *. You also have to do this mental arithmetic when you look at your local system logs: activities at 22:00:00 seem suspicious, but are they really? Avoid the headaches and set all your systems to UTC, and throw the logs into a tool that does the time translation for you.
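To make the cron arithmetic concrete, here is a rough Python sketch of the 6PM Eastern conversion. The helper name local_to_utc, the dates, and the choice of the America/New_York zone are my own assumptions for illustration, and zoneinfo needs Python 3.9+ plus a tz database on the machine.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs system tz data or the tzdata package


def local_to_utc(year, month, day, hour, minute, tz_name):
    """Hypothetical helper: interpret a wall-clock time in tz_name and return it in UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


# 6PM Eastern in July (EDT, UTC-4) and in January (EST, UTC-5).
summer = local_to_utc(2024, 7, 1, 18, 0, "America/New_York")
winter = local_to_utc(2024, 1, 15, 18, 0, "America/New_York")

print(summer.strftime("%H:%M"))  # 22:00, i.e. the "0 22 * * *" entry
print(winter.strftime("%H:%M"))  # 23:00, the same local time once DST ends
```

Note that the 22:00 entry only matches 6PM local while EDT is in effect; on a UTC box the job stays at 22:00 UTC year round, which is 5PM local in winter.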
The server does not "know" anything about the time, that is, it's really about the sysadmin knowing what happened and when.
1) Most software gets its timestamps from the system clock
2) If you have a mismatch between the system time and the application time, then you just have log timestamps that don't match up; it's a nightmare - even more so around DST/ST transitions
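For anyone who hasn't been bitten by the DST part yet, a small Python sketch of why local-time log stamps get ugly at the fall-back transition. The zone and date are just an example I picked (US Eastern, 3 November 2024), and zoneinfo needs Python 3.9+ with tz data available.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs system tz data or the tzdata package

tz = ZoneInfo("America/New_York")

# On 2024-11-03 clocks fall back from 02:00 EDT to 01:00 EST,
# so the wall-clock time 01:30 happens twice. fold=1 selects the second one.
first = datetime(2024, 11, 3, 1, 30, tzinfo=tz)           # 01:30 EDT
second = datetime(2024, 11, 3, 1, 30, fold=1, tzinfo=tz)  # 01:30 EST

# A log line written in local time without an offset looks identical for both.
print(first.strftime("%Y-%m-%d %H:%M"), second.strftime("%Y-%m-%d %H:%M"))

# In UTC the two instants are an hour apart and unambiguous.
first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)
gap_seconds = (second_utc - first_utc).total_seconds()
print(first_utc.strftime("%H:%M"), second_utc.strftime("%H:%M"), gap_seconds)
```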
you've got it backwards - the server clock should be in UTC, and if an individual piece of software needs to know the location, that should be provided to it separately.
for example, I've got a server in my garage that runs Home Assistant. the overall server timezone is set to UTC, but I've configured Home Assistant with my "real" timezone so that I can define automation rules based on my local time.
Home Assistant also knows my GPS coordinates so that it can fetch weather, fire automation rules based on sunrise/sunset, etc. that wouldn't be possible with only the timezone.
Windows assumes computer clocks are local time. It can be configured to assume UTC. Other operating systems assume computer clocks are UTC. Many log tools are not time zone aware.
that's the difference between "aware" and "naive" timestamps. Python has a section explaining it in their docs (though the concept applies to any language):
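a quick sketch of the distinction using only the standard library (the variable names are mine, purely for illustration):

```python
from datetime import datetime, timezone

# Naive: no tzinfo attached, so the value is only a wall-clock reading.
naive = datetime(2024, 3, 1, 12, 0)

# Aware: carries an offset, so it identifies one specific instant.
aware = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)

print(naive.tzinfo is None)  # True
print(aware.utcoffset())     # 0:00:00

# Python refuses to order a naive value against an aware one,
# because it cannot know which instant the naive value refers to.
try:
    naive < aware
    mixed_comparison_allowed = True
except TypeError:
    mixed_comparison_allowed = False

print(mixed_comparison_allowed)  # False
```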
Yes, that's exactly what I'm saying :). In fact, I've run servers where I didn't even physically know where it was located. It wouldn't have been hard to find out given some digging with traceroute, but it didn't matter. It was something I could SSH into and do everything I needed to without caring where it was.
Everyone else down-thread has clarified the why of it. Keep all of your globally distributed assets all running on a common clock (UTC) so that you can readily correlate things that have happened between them (and the rest of the world) without having to do a bunch of timezone math all the time.
Would he by any chance refer to it as Zulu or Zebra time? The Z-suffix shorthand for UTC/GMT standardisation has nautical roots IIRC and the nomenclature was adopted in civil aviation also. I sometimes say Zulu time and my own dad, whose naval aspirations were crushed by poor eyesight, is amongst the few that don’t double-take.
I can’t quantify how much time my team wasted, while diagnosing production glitches, on checking the wrong time offsets, but it was substantial. One of our systems wasn’t using UTC, and given enough time the fact that Slack wasn’t using it either does become an issue. When an outage transitions to All Hands on Deck, everyone needs to get caught up on what’s going on, preferably under their own power, so you don’t suffer the Adding Resources to a Late Project problem.
So that first alert that came in ?? minutes ago you need to align with the telemetry and logs in order to see what the servers were doing right before everything went to shit.
What if it's your personal machine? I'm thinking about jobs I've set up... thing is, I actually do want those to align to DST in most cases. For example, ZFS scrub should start after I leave for work so that it has the greatest chance of being done by the time I get home. (It's too loud to run overnight.)
This shouldn’t be hard to deal with if the timestamp is always serialized with the offset: I’m much more picky about always persisting the offset than about always persisting UTC
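A minimal Python sketch of what I mean by persisting the offset, assuming ISO 8601 strings as the storage format. The UTC-7 offset and the event time are made-up values for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical event recorded on a machine running at UTC-7.
local_offset = timezone(timedelta(hours=-7))
event = datetime(2024, 6, 1, 9, 30, tzinfo=local_offset)

# Serialize with the offset included.
stored = event.isoformat()
print(stored)  # 2024-06-01T09:30:00-07:00

# Anyone reading it back can recover both the instant and the original local time.
restored = datetime.fromisoformat(stored)
as_utc = restored.astimezone(timezone.utc)
print(as_utc.isoformat())  # 2024-06-01T16:30:00+00:00
```

The point is that the stored string is unambiguous either way; converting to UTC becomes a display decision instead of something you have to guess at later.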
Please stick with UTC across the board, people. Someone, someday, may have to clean up your mess.