|The NTP FAQ and HOWTO: Understanding and using the Network Time Protocol (A first try on a non-technical Mini-HOWTO and FAQ on NTP)|
A reference clock is some device or machinery that spits out the current time. The special thing about these things is accuracy: Reference clocks must be accurately following some time standard.
Typical candidates for reference clocks are (very expensive) cesium clocks. Cheaper (and thus more popular) ones are receivers for some time signals broadcasted by national standard agencies. A typical example would be a GPS (Global Positioning System) receiver that gets the time from satellites. These satellites in turn have a cesium clock that is periodically corrected to provide maximum accuracy.
Less expensive (and accurate) reference clocks use one of the terrestrial broadcasts known as DCF77, MSF, and WWV.
In NTP these time references are also named stratum 0, the highest possible quality. (Each system that has its time synchronized to some reference clock can also be a time reference for other systems, but the stratum will increase for each synchronization.)
A reference clock will provide the current time, that's for sure. NTP will compute some additional statistical values that describe the quality of time it sees. Among these values are: offset (or phase), jitter (or dispersion), frequency error, and stability (See also Section 3.3). Thus each NTP server will maintain an estimate of the quality of its reference clocks and of itself.
There are serveral ways how a NTP client will know about NTP servers to use:
|Servers to be polled can be configured manually|
|Servers can send the time directly to a peer|
|Servers may send out the time using multicast or broadcast addresses|
Ideally the reference time is the same everywhere in the world. Once synchronized, there should not be any unexpected changes between the clock of the operating system and the reference clock. Therefore, NTP has no special methods to handle the situation.
Instead, ntpd's reaction will depend on the offset between the local clock and the reference time. For a tiny offset ntpd will adjust the local clock as usual; for small and larger offsets, ntpd will reject the reference time for a while. In the latter case the operation system's clock will continue with the last corrections effective while the new reference time is being rejected. After some time, small offsets (significantly less than a second) will be slewed (adjusted slowly), while larger offsets will cause the clock to be stepped (set anew). Huge offsets are rejected, and ntpd will terminate itself, believing something very strange must have happened.
Naturally, the algorithm is also applied when ntpd is started for the first time or after reboot.
A server operating at stratum 1 belongs to the class of best NTP servers available, because it has a reference clock (See What is a reference clock?) attached to it. As accurate reference clocks are expensive, only rather few of these servers are publically available.
A stratum 1 server should not only have a precise and well-maintained and calibrated reference clock, but also should be highly available as other systems may rely on its time service. Maybe that's the reason why not every NTP server with a reference clock is publically available.
Time can be passed from one time source to another, typically starting from a reference clock connected to a stratum 1 server. Servers synchronized to a stratum 1 server will be stratum 2. Generally the stratum of a server will be one more than the stratum of its reference (See also Q: 220.127.116.11.).
Synchronizing a client to a network server consists of several packet exchanges where each exchange is a pair of request and reply. When sending out a request, the client stores its own time (originate timestamp) into the packet being sent. When a server receives such a packet, it will in turn store its own time (receive timestamp) into the packet, and the packet will be returned after putting a transmit timestamp into the packet. When receiving the reply, the receiver will once more log its own receipt time to estimate the travelling time of the packet. The travelling time (delay) is estimated to be half of "the total delay minus remote processing time", assuming symmetrical delays.
Those time differences can be used to estimate the time offset between both machines, as well as the dispersion (maximum offset error). The shorter and more symmetric the round-trip time, the more accurate the estimate of the current time.
Time is not believed until several packet exchanges have taken place, each passing a set of sanity checks. Only if the replies from a server satisfy the conditions defined in the protocol specification, the server is considered valid. Time cannot be synchronized from a server that is considered invalid by the protocol. Some essential values are put into multi-stage filters for statistical purposes to improve and estimate the quality of the samples from each server. All used servers are evaluated for a consistent time. In case of disagreements, the largest set of agreeing servers (truechimers) is used to produce a combined reference time, thereby declaring other servers as invalid (falsetickers).
Usually it takes about five minutes (five good samples) until a NTP server is accepted as synchronization source. Interestingly, this is also true for local reference clocks that have no delay at all by definition.
After initial synchronization, the quality estimate of the client usually improves over time. As a client becomes more accurate, one or more potential servers may be considered invalid after some time.
NTP uses UDP/IP packets for data transfer because of
the fast connection setup and response times. The official port number for the
NTP (that ntpd and ntpdate listen and talk to) is
There was a nice answer from Don Payette in news://comp.protocols.time.ntp, slightly adapted:
The NTP timestamp is a 64 bit binary value with an implied fraction point between the two 32 bit halves. If you take all the bits as a 64 bit unsigned integer, stick it in a floating point variable with at least 64 bits of mantissa (usually double) and do a floating point divide by
2^32, you'll get the right answer.
As an example the 64 bit binary value:00000000000000000000000000000001 10000000000000000000000000000000equals a decimal 1.5. The multipliers to the right of the point are 1/2, 1/4, 1/8, 1/16, etc.
To get the 200 picoseconds, take a one and divide it by
4294967296), you get
233E-12seconds. A picosecond is
In addition one should know that the epoch for NTP starts
1900 while the epoch in UNIX starts in
1970. Therefore the following values both correspond to
UNIX: 39aea96e.000b3a75 00111001 10101110 10101001 01101110. 00000000 00001011 00111010 01110101 NTP: bd5927ee.bc616000 10111101 01011001 00100111 11101110. 10111100 01100001 01100000 00000000
When polling servers, a similar algorithm as described in Q: 18.104.22.168. is used. Basically the jitter (white phase noise) should not exceed the wander (random walk frequency noise). The polling interval tries to be close to the point where the total noise is minimal, known as Allan intercept, and the interval is always a power of two. The minimum and maximum allowable exponents can be specified using minpoll and maxpoll respectively (See Q: 22.214.171.124.). If a local reference clock with low jitter is selected to synchronize the system clock, remote servers may be polled more frequently than without a local reference clock in recent version of ntpd. The intended purpose is to detect a faulty reference clock in time.
Of course the final achievable accuracy depends on the time source being used. Basically, no client can be more accurate than its server. In addition the quality of network connection also influences the final accuracy. Slow and non predictable networks with varying delays are very bad for good time synchronization.
A time difference of less than 128ms between server and client is required to maintain NTP synchronization. The typical accuracy on the Internet ranges from about 5ms to 100ms, possibly varying with network delays. A recent survey suggests that 90% of the NTP servers have network delays below 100ms, and about 99% are synchronized within one second to the synchronization peer.
With PPS synchronization an accuracy of 50Ás and a stability below 0.1 PPM is achievable on a Pentium PC (running Linux for example). However, there are some hardware facts to consider. Judah Levine wrote:
In addition, the FreeBSD system I have been playing with has a clock oscillator with a temperature coefficient of about 2 PPM per degree C. This results in time dispersions on the order of lots of microseconds per hour (or lots of nanoseconds per second) due solely to the cycling of the room heating/cooling system. This is pretty good by PC standards. I have seen a lot worse.
Terje Mathisen wrote in reply to a question about the actual offsets achievable: "I found that 400 of the servers had offsets below 2ms, (...)"
David Dalton wrote about the same subject:
The true answer is: It All Depends.....
Mostly, it depends on your networking. Sure, you can get your machines within a few milliseconds of each other if they are connected to each other with normal 10-Base-T Ethernet connections and not too many routers hops in between. If all the machines are on the same quiet subnet, NTP can easily keep them within one millisecond all the time. But what happens if your network get congested? What happens if you have a broadcast storm (say 1,000 broadcast packets per second) that causes your CPU to go over 100% load average just examining and discarding the broadcast packets? What happens if one of your routers loses its mind? Your local system time could drift well outside the "few milliseconds" window in situations like these.
As time should be a continuous and steady stream, ntpd
updates the clock in small quantities. However, to keep up with clock errors,
such corrections have to be applied frequently. If
adjtime() is used, ntpd will update the system clock
every second. If
ntp_adjtime() is available, the
operating system can compensate clock errors automatically, requiring only
infrequent updates. See also Section 5.2 and Q: 126.96.36.199..
NTP maintains an internal clock quality indicator. If the clock seems stable, updates to the correction parameters happen less frequent. If the clock seems instable, more frequent updates are scheduled. Sometimes the update interval is also termed stiffness of the PLL, because only small changes are possible for long update intervals.
There's a decision value named poll
adjust that can be queried with ntpdc's loopinfo
command. A value of
-30 means to decrease the polling
interval, while a value of
30 means to increase it (within
the bounds of minpoll and maxpoll). The
value of watchdog timer is the time since the last
ntpdc> loopinfo offset: -0.000102 s frequency: 16.795 ppm poll adjust: 6 watchdog timer: 63 s
Recent versions of ntpd (like 4.1.0) seem to update the correction values less frequently, possibly causing problems. Even if the reference time sources are polled more frequently, the local system clock is adjusted less often.
While in theory estimates of the clock error are maintained, there were practically some software bugs that made these numbers questionable. For example the new kernel clock model dealing with nanosecond resolution (nanokernel) produced overly optimistic estimates regarding the clock offset. The bug has been fixed in August 2000, but also different versions of the NTP daemon may produce different estimates for the same hardware.
The limit actually depends on several factors, like speed of the main processor and network bandwidth, but the limit is quite high. Terje Mathisen once presented a calculation:
2 packets/256 seconds * 500 K machines -> 4 K packets/second (half in each direction).
Packet size is close to minimum, definitely less than 128 bytes even with cryptographic authentication:
4 K * 128 -> 512 KB/s.
So, as long as you had a dedicated 100 Mbit/s full duplex leg from the central switch for each server, it should see average networks load of maximim 2-3%.
The stratum is a measure for synchronization distance. Opposed to jitter or delay the stratum is a more static measure. Basically (and from the perspective from a client) it is the number of servers to a reference clock. So a reference clock itself appears at stratum 0, while the closest servers are at stratum 1. On the network there is no valid NTP message with stratum 0.
A server synchronized to a stratum
n server will be running at stratum
n + 1. The upper limit for
15. The purpose of
stratum is to avoid synchronization loops by preferring
servers with a lower stratum.
In a synchonization loop, the time derived from one source along a specific path of servers is used as reference time again within such a path. This may cause an excessive accumulation of errors that is to be avoided. Therefore NTP uses different means to accomplish that:
The Internet address of a time source is used as reference identifier to avoid duplicates. The reference identifier is limited to 32 bits however.
The stratum as described in Q: 188.8.131.52. is used to form an acyclic synchronization network.
More precisely, the algorithm finds a shortest path spanning tree with metric based on synchronization distance dominated by hop count. The reference identifier provides additional information to avoid neighbor loops under conditions where the topology is changing rapidly. This is a very well known problem with algorithms such as this. See any textbook on computer network routing algorithms. Computer Networks by Bertsekas and Gallagher is a good one.
In IPv6 the reference ID field is a timestamp that can be used for the same purpose.
The default polling value after restart of NTP is the
value specified by minpoll. The default values for
minpoll and maxpoll are
6 (64 seconds) and
10 (1024 seconds)
For xntp3-5.93e the smallest and
largest allowable polling values are
4 (16 seconds) and
14 (4.5 hours) respectively. For actual polling intervals
larger than 1024 seconds the kernel discipline is switched to FLL
For ntp-4.0.99f the smallest and
largest allowable polling values are
4 (16 seconds) and
17 (1.5 days) respectively. These values come from the
include file ntp.h. The revised kernel discipline
automatically switches to FLL mode if the update interval is longer than 2048
seconds. Below 256 seconds PLL mode is used, and in between these limits the
mode can be selected using the
Actually there is none: Short polling intervals update the parameters frequently and are sensitive to jitter and random errors. Long intervals may require larger corrections with significant errors between the updates. However there seems to be an optimum between those two. For common operating system clocks this value happens to be close to the default maximum polling time, 1024s. See also Q: 184.108.40.206..
In order to keep the right time, xntpd must make adjustments to the system clock. Different operating systems provide different means, but the most popular ones are listed below.
Basically there are four mechanisms (system calls) an NTP implementation can use to discipline the system clock (For details see the different RFCs found in Table 4):
step (set) the time. This method is used if the time if off by
more than 128ms.
slew (gradually change) the time. Slewing the
time means to change the virtual frequency of the software clock to make the
clock go faster or slower until the requested correction is achieved. Slewing
the clock for a larger amount of time may require some time, too. For example
standard Linux adjusts the time with a rate of 0.5ms per second.
ntp_adjtime(2) to control
several parameters of the software clock (also known as kernel
discipline). Among these parameters are:
Adjust the offset of the software clock, possibly correcting the virtual frequency as well
Adjust the virtual frequency of the software clock directly
Enable or disable PPS event processing
Control processing of leap seconds
Read and set some related characteristic values of the clock
hardpps() is a function that is
only called from an interrupt service routine inside the operating system. If
hardpps() will update the frequency and offset
correction of the kernel clock in response to an external signal (See also
This statement was derived from a mail message by Professor David L. Mills in response to a suspected bug in version 4.1.0.
...says Professor David L. Mills...