Yeah, nah, aye

Building a Stratum 1 Time Server

NTP, at a glance, (yet more in-depth than I had intended)

For the uninitiated, NTP (the Network Time Protocol) is a protocol for agreeing on exactly what the time of day is across a computer network. All personal computers these days (and for the last few decades) keep track of the time of day. There is a saying I will butcher and paraphrase, giving no attribution to the original author (forgive me):

Someone with one clock knows what the time is. Someone with two clocks… they are less certain.

This becomes the case especially when multiple computers are kept on the same premises. It becomes more important when information is shared between these computers.

Just imagine you're sending a letter to a penpal, Greg, in the next city, whose clock is slow by a week (an extreme example, clearly). The dates on his replies may well confuse you if you try to correlate copies of your letters with his replies:

  1. You send a letter on the 13th of October (as you observe the date)
  2. The letter spends 2 days in transit
  3. It arrives at Greg's house on what you observe as the 15th
  4. He writes and sends his reply that same day, and dates it the 8th (7 days behind, remember)
  5. The letter arrives 2 days later for you

You read this letter in awe of his anticipation of your questions 5 days prior to you ever writing them. Or perhaps you later come back to the letter and want to interleave the conversation; just put it in chronological order, surely? You need some way of agreeing between yourselves just which day is called what.
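If you want to check the dates in that example, here is a minimal sketch. The year is arbitrary and the variable names are made up purely for illustration.

```python
from datetime import date, timedelta

# Values from the penpal example; the year is an arbitrary assumption.
sent_on = date(2017, 10, 13)            # the day you post your letter
transit = timedelta(days=2)             # time each letter spends in the post
greg_clock_error = timedelta(days=-7)   # Greg's calendar runs a week slow

arrives_at_greg = sent_on + transit                # the 15th, by your calendar
reply_dated = arrives_at_greg + greg_clock_error   # the 8th, by Greg's calendar
reply_arrives = arrives_at_greg + transit          # the 17th, by your calendar

# How far before your own letter does the reply claim to have been written?
apparent_lead = sent_on - reply_dated

print(f"Reply dated {reply_dated}, received {reply_arrives}")
print(f"Reply appears to predate your letter by {apparent_lead.days} days")
```

The reply really was written two days after your letter, but the date on it says otherwise.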

Clearly no sane person (save for a hermit, or someone lost in a remote place) would allow themselves to have their life organised around a time setting that is one week slow. I have used a large timeframe to scale it out with factors such as the speed (or lack thereof) with which letters are delivered.

Imagine this scaled down to the speeds with which computers send packets of data (analogous to letters) over a network (analogous to the postal service), and it should become clearer that the difference between two computers' clocks can cause the same sort of confusion while being far too small for a human to care about; much, much less than 7 days. Accurate timing becomes ever more important the more sophisticated and performant networks become.

Ok, NTP is some time thing, whatever

Most consumer PCs these days ship with an NTP client. For example, Windows computers are generally set up to synchronise their time with one of Microsoft's time servers over the internet. Linux has a large number of NTP clients to synchronise your computer's clock over the internet as well. Examples include ntpd, chrony and systemd-timesyncd.

Some of these clients are more sophisticated than others. Some clients will ask a time server on the internet for the time, set the local computer's clock to that time, and then exit, not synchronising until you run the client again. These sorts of clients are sufficient for computers with relatively short uptimes (or very accurate clocks). Clocks drift, unfortunately. Not only do we have an issue of communicating the current time of day between multiple clocks, but we also have the problem of making sure everyone counts intervals of time as being the correct length. Without some way of checking this, after some time passes, we may be back where we started, and I am now running 10 seconds behind you.

The main cause of clock drift is manufacturing tolerance. Today, we use crystal oscillators in clocks because they are cheap and plentiful. The physical size of a crystal determines the frequency at which it will oscillate when excited with electricity. The accuracy with which these crystals are cut can be high or low, depending on how much money you want to pay for it. Naturally, cheaper is better for end consumers, so the crystals put in RTCs on consumer stuff are "okay". But there is still a variance from one crystal to the next.

The naive way to correct for this drift is to periodically—say once an hour—ask a time server on the internet what the time is and set our clock to that. This works for the most part, but it means that our clock isn't very accurate. What happens during the 60 minutes that come between each demand for the time? We grow more and more uncertain about the validity of the answer our clock is giving. The solution? There are two main ones:

  1. Ask for the time (poll) more often - say every minute.
  2. Measure the drift and predict the clock's behaviour over the next period

Choosing (1) is a foolish approach. Asking someone for the time very often may make them dislike you, especially when you can choose to do (2) and not need to ask them very frequently at all.
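As a rough illustration of option (2), the sketch below estimates a drift rate from two polls and extrapolates it forward. It assumes the drift is constant (a straight line), and the function names and measurements are hypothetical; real NTP clients are considerably more careful than this.

```python
def estimate_drift(t0, offset0, t1, offset1):
    """Seconds of error gained per second elapsed, from two offset measurements."""
    return (offset1 - offset0) / (t1 - t0)


def predict_offset(t, t_ref, offset_ref, drift):
    """Extrapolate the offset at time t, assuming the drift rate stays constant."""
    return offset_ref + drift * (t - t_ref)


# Hypothetical polls: (seconds since first poll, local time minus server time)
drift = estimate_drift(0.0, 0.000, 3600.0, 0.072)   # gained 72 ms in one hour
predicted = predict_offset(7200.0, 3600.0, 0.072, drift)

print(f"Estimated drift: {drift * 1e6:.1f} ppm")
print(f"Predicted offset after another hour: {predicted * 1000:.1f} ms")
```

With a prediction like this in hand, we can correct the clock between polls instead of pestering the server every minute.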

With that explained, what the more sophisticated NTP clients generally do is ask a server (more commonly 2–3 servers) what the time is, compare it with the local time, and slightly speed up or slow down a clock defined in software (i.e. not the RTC crystal) to slowly correct for the difference. If your local clock is 5 seconds ahead of what the servers on the internet agree the time is, then make the length of your seconds 105% of what they should be. The humans won't notice that 1 second is actually lasting 1.05 seconds.

Let's ride that trend for 1 minute and see where that gets us. After 1 minute, we have fallen back closer to the actual time; we've shaved 3 seconds (5% of 60 seconds) off our fast clock. We only have to lose two seconds now, so let's approach a little slower to avoid overshooting. We might make our seconds only 103% as long as they should be. We'd have to ride this for just over a minute (2 seconds at 3% takes about 67 seconds) to find ourselves agreeing with the time servers on the internet. We ask them what the time is after this, and we find that we are 1 second slow! How did this happen?
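The arithmetic in that walk-through, as a small sketch. It uses the same simplified model as the text (stretching seconds by some fraction removes that fraction of the elapsed time from the offset), and the function names are my own invention.

```python
def seconds_removed(slew_fraction, duration_s):
    """Offset removed by running with stretched seconds for duration_s."""
    return slew_fraction * duration_s


def time_to_remove(offset_s, slew_fraction):
    """How long a given slew must run to remove offset_s."""
    return offset_s / slew_fraction


offset = 5.0                                    # we start 5 seconds fast
offset -= seconds_removed(0.05, 60.0)           # one minute at 105%
print(f"Offset after the first minute: {offset:.2f} s")

remaining = time_to_remove(offset, 0.03)        # finish off at 103%
print(f"Time needed at 103%: {remaining:.1f} s")
```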

Our clock isn't a perfect standard, turns out it counts a little slower than it should by its very nature. Let's decrease the length of our seconds to something just below what they "should be" and see where that gets us. Let's choose 99%. We might check back one minute later and find that our clock is still keeping correct time with this new second that's 99% what the hardware thinks a second should be. Great, let's leave it for two minutes and see. If all goes well, this 99% adjustment would stay in place, and our clock would not gain or lose against the time servers. We can decrease the frequency with which we ask them for the correct time since we are now constantly compensating for our hardware clock's drift.
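Here is a toy simulation of that idea: a hardware clock that runs about 1% slow, and a software correction that is nudged at each poll until the clock stops gaining or losing. This is only a sketch under my own assumptions (a perfectly constant hardware error, noise-free measurements, made-up names); it is not how chrony or ntpd implement their control loops.

```python
def simulate(hardware_rate_error, polls, interval_s):
    """Return (offset, correction) after each poll.

    hardware_rate_error: fractional error of the raw clock (-0.01 = 1% slow).
    correction: fractional rate adjustment applied in software.
    """
    correction = 0.0
    offset = 0.0            # local time minus true time, in seconds
    history = []
    for _ in range(polls):
        before = offset
        # Over one interval the clock gains (error + correction) * interval.
        offset += (hardware_rate_error + correction) * interval_s
        # The rate error still left over shows up as a change in offset.
        measured_rate = (offset - before) / interval_s
        correction -= measured_rate
        history.append((offset, correction))
    return history


history = simulate(hardware_rate_error=-0.01, polls=4, interval_s=60.0)
for poll, (offset, correction) in enumerate(history, start=1):
    print(f"poll {poll}: offset {offset:+.3f} s, correction {correction:+.4f}")
```

After the first poll the correction settles at roughly +1% and the offset stops changing. The offset that built up before the correction kicked in would then be slewed away as described earlier.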

This is generally very accurate for most things people and computers do. For example, some of my machines are reportedly accurate to within less than 500 µs.

That's pretty accurate, good to know. I'm out of here

Sweet, see ya. Building anything more accurate than this without a reason is just that: without reason. I decided to build a Stratum 1 time server for my LAN anyway. Why? Because it's neat. In the end, I was able to achieve a clock roughly 10 to 50 times as stable as when my server was syncing over the network; preliminary figures are showing the jitter to be about ±50 ns, which is 1 part in 20 million.

Stratum? What?

Stratum can be thought of as a sort of ranking system. The time network starts with sources that have some sort of atomic clock on them. Such sources are labelled Stratum 0; they are 0 steps away from an atomic clock. Then, servers which get the time directly from Stratum 0 sources are called Stratum 1; they are 1 step away from an atomic clock. Servers that feed from Stratum 1 are classed Stratum 2. This pattern continues.
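To make the counting concrete, here is a tiny sketch that works out a machine's stratum by following its chain of upstream sources. The machine names and the `upstream` table are hypothetical.

```python
def stratum(node, upstream):
    """Count the hops from node back to a reference clock (stratum 0)."""
    hops = 0
    while upstream[node] is not None:
        node = upstream[node]
        hops += 1
    return hops


# Hypothetical network: each machine maps to the source it gets time from.
upstream = {
    "gps-satellite": None,      # the atomic clock itself: stratum 0
    "pi": "gps-satellite",      # one step away
    "desktop": "pi",
    "laptop": "desktop",
}

for name in upstream:
    print(f"{name}: stratum {stratum(name, upstream)}")
```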

Essentially, the lower the Stratum, the better. Stratum 1 is completely feasible and relatively cheap to achieve with consumer parts, so I opted for this. I will build a Stratum 0 time source, maybe sometime in the future!

Enough talking, get on with it

Okeydokey. There are quite a few atomic clocks that you can feed from:

  1. For free,
  2. Wirelessly, and
  3. Without them knowing

The coolest thing about these machines is that they are zooming around in space.

GPS satellites keep atomic clocks on them, because crazy-accurate timing is pretty much the basis of GPS. Have a poor clock on a GPS satellite and you can't tell if you're in Waikikamukau or standing in your back yard. In essence, a GPS receiver uses the measured time differences between the signals from different satellites in the sky to work out its distance from each one. Radio waves travel at the speed of light, so even a tiny timing error becomes a large distance error; you'd better have some accurate clocks up there.
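To get a feel for why the clocks matter so much, this sketch converts a timing error into the distance a radio signal covers in that time. The list of example errors is my own choice.

```python
SPEED_OF_LIGHT_M_S = 299_792_458  # metres per second, exact by definition


def range_error_m(clock_error_s):
    """Distance a radio signal travels during clock_error_s seconds."""
    return SPEED_OF_LIGHT_M_S * clock_error_s


for label, error_s in [("1 ms", 1e-3), ("1 µs", 1e-6), ("1 ns", 1e-9)]:
    print(f"A clock error of {label} is about {range_error_m(error_s):,.2f} m")
```

A single microsecond of error is already about 300 metres.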

So all we need is some form of GPS receiver connected to a computer in order to get a very accurate time of day. Luckily, crowds like Uputronics make one that fits right onto the ever-cheap and ever-affordable Raspberry Pi 3. Perfect.

I purchased the board from Uputronics (and I'm sure other products work) as well as a Raspberry Pi 3. Nothing like an excuse to get some updated kit after using the B+ models since they came out in 2012.

The board essentially just plugs on, and you plug your antenna into it. The GPS chipset communicates NMEA data to the UART port on the Raspberry Pi, and you can ask for time information through this.
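For the curious, this is roughly what pulling the time out of an NMEA sentence looks like. GPSd does all of this (and far more) for you, so treat it as a sketch only: it handles just the RMC sentence type, the sample sentence is made up, and the helper names are my own.

```python
from datetime import datetime, timezone
from functools import reduce


def nmea_checksum(body):
    """XOR of every character between '$' and '*', as two hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def parse_rmc_time(sentence):
    """Return the UTC date and time carried by an RMC sentence."""
    body, _, checksum = sentence.strip().lstrip("$").partition("*")
    if checksum.upper() != nmea_checksum(body):
        raise ValueError("bad NMEA checksum")
    fields = body.split(",")
    if not fields[0].endswith("RMC"):
        raise ValueError("not an RMC sentence")
    hhmmss, ddmmyy = fields[1], fields[9]
    # Fractional seconds are dropped; strptime maps two-digit years 00-68 to 20xx.
    parsed = datetime.strptime(ddmmyy + hhmmss[:6], "%d%m%y%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


# A made-up sentence; the checksum is computed here rather than typed by hand.
body = "GPRMC,104223.00,A,3747.000,S,17516.000,E,0.0,0.0,131017,,,A"
sentence = f"${body}*{nmea_checksum(body)}"

print(sentence)
print(parse_rmc_time(sentence).isoformat())
```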

Strictly, all that is needed in software is to disable the Raspberry Pi's Bluetooth chipset to stop it from capturing the UART port at /dev/ttyAMA0, then set up GPSd and add its NMEA feed as a source in your favourite NTP server.

GPSd

Any good distro will package GPSd. This software does the heavy lifting by dealing with many different GPS chipsets and a couple of different protocols required to drive the chipsets, providing a set of uniform interfaces to other pieces of software. I want to make GPSd read from the GPS chipset even if it thinks no program is requesting information from it. I also want it to read from /dev/ttyAMA0. In /etc/gpsd, I ensure the following are set (leaving defaults as they were):

GPSD_OPTIONS="-n"
DEVICES="/dev/ttyAMA0"

Then, I enable gpsd to start on boot using systemctl.

chrony

Now all that strictly needs doing is to tell chrony to peek at the SHM (shared memory) that GPSd sets up and pushes the GPS data into. This way, chrony can see the time. Set up a minimal chrony config as necessary, with the exception that you do not need to set up any internet servers to poll (though this is still useful). Then, add the SHM as a source of time:

refclock SHM 0 offset 0.0 delay 0.0 refid NMEA

The offset and delay options can be tweaked to account for the time it takes for the data to be received by the GPS, and transmitted across the wire to the Pi, then parsed by GPSd and shared with chrony. I won't bother with this; I'll jump onto the next step.

Improving accuracy with PPS

The NMEA strings are a good source of time, but what happens if the amount of data the GPS is sending us goes up and down? The time it takes for these differing amounts of data to be transmitted across the wire will vary, and our time accuracy will suffer. The GPS board I used exposes a feature available in most (all?) GPS chipsets: PPS (pulse per second).
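To put rough numbers on that, here is a sketch of how long different amounts of NMEA data take to cross a serial line. The 9600 baud rate and 8N1 framing are assumptions on my part (common defaults, but check your own receiver), and the byte counts are just examples.

```python
def transmit_time_ms(num_bytes, baud=9600, bits_per_byte=10):
    """Time to send num_bytes over a serial line, in milliseconds.

    bits_per_byte=10 assumes 8N1 framing: 1 start bit, 8 data bits, 1 stop bit.
    """
    return num_bytes * bits_per_byte / baud * 1000


# Example sizes: a short sentence, a maximum-length one, a burst of several.
for num_bytes in (40, 82, 400):
    print(f"{num_bytes:3d} bytes take {transmit_time_ms(num_bytes):6.1f} ms")
```

Tens of milliseconds of variation is huge next to the microseconds we are chasing.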

PPS is a feature wherein a squarewave with (ideally) a very sharp rising or falling edge is sent from the GPS chip to something else—the Pi in my case—once every second. This edge is supposed to be very precise and align correctly with the start of a second. Sounds perfect!

To use the PPS signal with the Raspberry Pi, kernel support is desirable. If the kernel can receive the interrupt from the PPS and pass it on to our NTP server very quickly, then we can greatly lower the remaining uncertainty about exactly what the time is. There is a device tree overlay available with Arch Linux ARM (and presumably Raspbian). Enabling the pps-gpio dtoverlay and rebooting presented me with a node at /dev/pps0. At this point, you might like to run ppstest from the pps-tools package (in the AUR) and check that your GPS receiver, PPS line, and kernel support are all functioning correctly.
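If you capture some ppstest output, a few lines of Python can show how far each pulse lands from a whole second on the system clock. The sample lines below are made up, and the regular expression assumes the "assert SECONDS.NANOSECONDS" layout that I understand ppstest prints; adjust it if your output differs. The numbers only mean much once the system clock is already roughly in sync.

```python
import re

# Assumed layout: "... assert 1507891343.000001200, sequence: 100 ..."
ASSERT_RE = re.compile(r"assert (\d+)\.(\d{9})")


def edge_offsets_ns(lines):
    """Offset of each pulse from the nearest whole second, in nanoseconds."""
    offsets = []
    for line in lines:
        match = ASSERT_RE.search(line)
        if match is None:
            continue
        nanos = int(match.group(2))
        # A pulse just before the second boundary counts as a negative offset.
        if nanos >= 500_000_000:
            nanos -= 1_000_000_000
        offsets.append(nanos)
    return offsets


# Made-up sample lines for illustration.
sample = [
    "source 0 - assert 1507891343.000001200, sequence: 100 - clear  0.000000000, sequence: 0",
    "source 0 - assert 1507891344.999999100, sequence: 101 - clear  0.000000000, sequence: 0",
    "source 0 - assert 1507891346.000000400, sequence: 102 - clear  0.000000000, sequence: 0",
]

offsets = edge_offsets_ns(sample)
print("Offsets (ns):", offsets)
print("Spread (ns):", max(offsets) - min(offsets))
```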

Then, a bit of tweaking is necessary, at least on Arch Linux ARM. I found that I had to compile a custom build of chrony with support for PPS. The problem wasn't that the packagers had explicitly disabled support, but that their build lacked a package required for chrony to detect PPS support at build time. Make sure pps-tools is installed and rebuild chrony from asp.

After installing your shiny new chrony, edit the config and add something like these options:

refclock PPS /dev/pps0 lock NMEA refid PPS
refclock SHM 0 offset 0.0 delay 0.0 refid NMEA noselect

This tells chrony to look for a refclock of type PPS at /dev/pps0, locking it to the refclock called "NMEA", and to call this refclock "PPS". It also tells chrony to add a refclock based on SHM 0 called "NMEA" and not to select this source itself.

We tell chrony not to select the NMEA source as a basis for our timing because we determined it to be inaccurate. It's better to have chrony avoid it and instead base its timing off the much more accurate PPS.

Note that PPS is "locked" to the NMEA refclock because the PPS does not provide any time information on its own; it just tells us when seconds are, not what the time is at each second. NMEA will tell us what the time is at each second though, so we pair the two up. NMEA tells us "It's 10:42:23" and the PPS tells us "wait for it… NOW!".
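A sketch of that pairing: the coarse NMEA-derived estimate is only used to decide which second a pulse belongs to, and the pulse itself supplies the precision. This is my own simplified illustration with made-up numbers and function names; it is not chrony's actual algorithm.

```python
def pulse_true_time(pps_local, coarse_offset):
    """Whole second that a pulse marks.

    pps_local: local clock reading when the pulse edge arrived (epoch seconds).
    coarse_offset: local minus true time as estimated from NMEA; it only
    needs to be correct to well within half a second.
    """
    return round(pps_local - coarse_offset)


def precise_offset(pps_local, coarse_offset):
    """Local minus true time, using the pulse edge for precision."""
    return pps_local - pulse_true_time(pps_local, coarse_offset)


# Made-up scenario: the pulse really marks second 1507891343, the local
# clock is 2.3004 s fast, and NMEA (delayed on the wire) suggests 2.15 s.
pps_local = 1507891345.3004
coarse = 2.15

print("Pulse marks second:", pulse_true_time(pps_local, coarse))
print(f"Precise offset: {precise_offset(pps_local, coarse):.4f} s")
```

The NMEA estimate was out by 150 ms, yet the combined answer is as good as the pulse edge.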

Start chronyd and hopefully you'll see it soon lock onto the PPS time source.

I run Arch Linux ARM on the Raspberry Pi, and the kernel provided already has PPS support out of the box. Just enable the device tree overlay and you're set:

dtoverlay=pps-gpio

Clock drift can change depending on a number of factors, the primary one being temperature. Crystal oscillators, the technology we use in clocks to keep the time, are fairly sensitive to temperature. Temperature can vary quite a bit across the day or under different system loads as components heat and cool, so as you might imagine, your computer may count a second as a fairly wide range of lengths over even a short period of a few hours. We're not talking large differences, but any small difference will add up; after all, there are roughly 86400 seconds in a day.
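To see how quickly small differences add up, this sketch turns a drift rate in parts per million (ppm) into the error accumulated over a day. The drift figures are illustrative guesses, not measurements of any particular crystal.

```python
SECONDS_PER_DAY = 86_400


def daily_error_s(drift_ppm):
    """Seconds gained or lost per day for a drift given in parts per million."""
    return drift_ppm * 1e-6 * SECONDS_PER_DAY


# Illustrative drift rates only.
for ppm in (1, 20, 100):
    print(f"{ppm:3d} ppm is about {daily_error_s(ppm):.3f} s per day")
```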