I wonder why the blog post claims setting clock source to 'tsc' is considered da...

bandrami · on March 7, 2017

Because if the clock rate changes, tsc can become out of sync.

https://lwn.net/Articles/209101/

pgaddict · on March 7, 2017

Not really. Recent CPUs (at least those from Intel, which is what EC2 runs on) implement constant_tsc, so the frequency does not affect the tsc.
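To check this on a given instance, here is a minimal sketch that looks for the `constant_tsc` and `nonstop_tsc` flags in `/proc/cpuinfo`. It assumes Linux on x86 and the usual `flags : ...` line format; the helper name `tsc_flags` is made up for illustration.

```python
def tsc_flags(cpuinfo_text):
    """Return the TSC-related CPU flags found on the first 'flags' line."""
    wanted = {"tsc", "constant_tsc", "nonstop_tsc"}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are whitespace-separated on a single line.
            return wanted & set(value.split())
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(sorted(tsc_flags(f.read())))
    except OSError:
        # Assumption: this path only exists on Linux.
        print("no /proc/cpuinfo here (not Linux?)")
```

If `constant_tsc` is missing from the output, the frequency-scaling concern raised above may still apply to that CPU.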

A worse issue is that the counters may not be synchronized between cpus, which may be an issue when the process moves between sockets.

But I wouldn't call that "dangerous", it's simply a feature of the clock source. If that's an issue for your program, you should use CLOCK_MONOTONIC anyway and not rely on gettimeofday() doing the right thing.
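As a rough illustration of the CLOCK_MONOTONIC advice, the sketch below times a function with Python's `time.monotonic()`, which on Linux is backed by `clock_gettime(CLOCK_MONOTONIC)`. The helper name `elapsed_monotonic` is hypothetical.

```python
import time


def elapsed_monotonic(fn):
    """Run fn() and return (result, seconds) measured on the monotonic clock."""
    start = time.monotonic()
    result = fn()
    # A monotonic clock never goes backwards, so the difference is never negative,
    # unlike a wall-clock difference taken across an NTP step or manual clock change.
    return result, time.monotonic() - start


result, seconds = elapsed_monotonic(lambda: sum(range(100_000)))
print(f"sum={result} took {seconds:.6f}s")
```

The same pattern with `time.time()` (wall clock, like gettimeofday()) can report a negative or wildly wrong interval if the system clock is adjusted in between.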

blibble · on March 7, 2017

how does constant_tsc interact with VMs being silently migrated from one physical machine to another?

kondro · on March 7, 2017

EC2 doesn't do any kind of live machine migration. The only time an instance may start on a different host is when it is stopped and then started. Even reboots don't move it.

You see this a lot when AWS lets you know about maintenance on a physical host and gives you the option to avoid the automated move by doing these steps manually at a time of your choosing before the maintenance window.

pgaddict · on March 7, 2017

Not sure, but it can't be better than moving processes between CPUs I guess. Also, does EC2 silently move VMs like this?

poofyleek · on March 7, 2017

Even without migration, synchronization can be an issue. On older multi-core machines, tsc was not always synchronized across cores; modern systems take care of this. Core clock frequency changes are also handled, so tsc ticks at a constant rate. However, when hypervisors such as VMware or paravirtualization like Xen come into play, there are further issues, because the RDTSC instruction either has to be passed through to physical hardware or emulated via a trap. When it is emulated, a number of considerations come into play. Xen actually has PVRDTSC features that are normally not used but can be effective in paravirtual environments.

gettimeofday() and clock_gettime() are used liberally in a great deal of existing software, for historical reasons among many others. One reason is that the calls look deceptively "atomic", "isolated", or "self-contained", so liberal use is common. A lot of issues come from their use, especially in time-sensitive applications (e.g. WAN optimization), and especially in virtual environments. The complex issues are described elsewhere and are kind of fun to read: https://www.vmware.com/pdf/vmware_timekeeping.pdf and https://xenbits.xen.org/docs/4.3-testing/misc/tscmode.txt

The issue becomes even more complex in distributed systems, beyond what NTP addresses. Some systems, like Erlang, have provisions for this: http://erlang.org/doc/apps/erts/time_correction.html#OS_Syst... Other systems use vector clocks. And some systems, like Google TrueTime as used in Spanner, synchronize using GPS receivers and atomic clocks. GPS pulses are also commonly used on trading floors and in HFT software. This is a very interesting area of study.

pgaddict · on March 7, 2017

It's complex stuff, no doubt about that.

For me, it's much simpler - I come from the PostgreSQL world, so gettimeofday() is pretty much what EXPLAIN ANALYZE does to instrument queries. Good time source means small overhead, bad time source means instrumented queries may take multiples of actual run time (and be skewed in various ways). No fun.
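To get a feel for how much a clock read costs on a given box, here is a crude sketch that averages the cost of many calls. The number includes Python interpreter overhead, so it is only useful for comparing clock sources or machines against each other, not as an absolute figure; PostgreSQL's own `pg_test_timing` tool is the better instrument. The helper name `clock_read_cost_ns` is made up.

```python
import time


def clock_read_cost_ns(read_clock, calls=200_000):
    """Average nanoseconds per read_clock() call, interpreter overhead included."""
    start = time.perf_counter_ns()
    for _ in range(calls):
        read_clock()
    return (time.perf_counter_ns() - start) / calls


for name, fn in [("time.time", time.time), ("time.monotonic", time.monotonic)]:
    print(f"{name}: {clock_read_cost_ns(fn):.0f} ns/call")
```

Running this before and after switching the clock source should show whether reads got cheaper, which is roughly what instrumented queries would see.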

poofyleek · on March 7, 2017

It is complex and interesting. I am a novice database user, but I do know many databases use gettimeofday() quite a lot. Just strace any SELECT query. Most databases I have used, including PostgreSQL, also have to implement MVCC, which mostly depends on timestamps. Imagine the time drift induced by hypervisor CPU and memory pressure, or even drift across a distributed cluster of database nodes. It hurts my head to think of the cases that will give me the wrong values, or a wrong estimate for getting the values. It is an interesting area.

pgaddict · on March 13, 2017

MVCC has nothing to do with timestamps, particularly not with timestamps generated from gettimeofday(), but with XIDs, which you might imagine as a monotonic sequence of integers assigned at the start of a transaction. You might call that a timestamp, but the trouble is that what matters is commit order, and the XID has nothing to do with that. Which is why MVCC requires 'snapshots' - a list of transactions that are in progress.
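A toy model of that idea, heavily simplified and not PostgreSQL's actual code: it ignores deletes, aborts, subtransactions, and hint bits. The names `Snapshot` and `inserted_row_visible` are invented for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    xmax: int  # first XID not yet assigned when the snapshot was taken
    in_progress: frozenset = field(default_factory=frozenset)  # XIDs still running then


def inserted_row_visible(xid, snapshot, committed):
    """True if a row inserted by transaction `xid` is visible to `snapshot`."""
    if xid >= snapshot.xmax:
        return False  # started after the snapshot was taken
    if xid in snapshot.in_progress:
        return False  # was still running at snapshot time, even if it commits later
    return xid in committed  # finished before the snapshot; visible only if it committed


# XID 100 starts first, but XID 101 commits first. A snapshot taken in between
# sees 101's rows and not 100's, so XID order alone does not decide visibility.
snap = Snapshot(xmax=102, in_progress=frozenset({100}))
print(inserted_row_visible(101, snap, committed={101}))       # True
print(inserted_row_visible(100, snap, committed={100, 101}))  # False
```

No wall-clock time appears anywhere in the check, which is the point being made above.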

jdamato · on March 7, 2017

We don't really know what EC2 does or precisely the type of hardware your VM will be spun up on. I've erred on the side of being cautious due to the vast amount of work being invested in timekeeping in various hypervisors. If EC2 knows that the TSC clocksource is safe on all of its hardware, perhaps modifying the Amazon Linux AMI to set TSC as the default clocksource would reassure many folks, myself included.
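For reference, a small sketch that reports which clocksource a Linux guest is currently using and which ones the kernel offers, by reading the standard sysfs files. The function name `read_clocksources` is hypothetical, and the paths are assumed to exist only on Linux.

```python
from pathlib import Path

# Standard Linux sysfs location for the system clocksource.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources, or (None, []) if sysfs is absent."""
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available


current, available = read_clocksources()
print("current:", current, "available:", available)
```

This only reads the setting; whether `tsc` is safe to select on a particular host is exactly the open question here.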

Advanced users who can run their own analysis, or whose applications can withstand potential time warps, are of course free to ignore my warning at their own risk ;)