Graham's Blog Archive

Jitter

Posted by Graham
January 29th, 2014

The audible effect of jitter on music is usually to distort high-frequency sounds, making them sound like an artificial effect similar to phasing (or flanging) – cymbals and natural sibilance tend to be over-emphasized.

Jitter cycles the timing of these high frequencies above and below their rightful place in the sound spectrum by tiny fractions of a second, producing phase noise that was not in the original.

The usual analogy is a photograph taken with shaky hands, as most people can understand that. However, the photograph shows distortion of the entire picture – light to dark. It would be better to imagine the brightest parts of the picture having ‘camera shake’ whilst the darker parts remain sharp, but that is difficult to show.

The musical picture, however, is not difficult to imagine as shaky in the ‘bright’ parts and ‘focussed’ in the lows – because that is exactly what jitter does.

Therefore jitter is timing distortion, but how much can we tolerate before we can no longer enjoy the music? A number of studies have been carried out to find how much jitter there has to be before it gets noticed. Listener groups are subjected to increasing amounts of jitter on the music until they notice signal degradation. From the results it would seem that jitter of 10 – 15 ns is acceptable for the best in high-fidelity listening, and for casual listening that figure rises to around 100 ns (ns = nanoseconds).

We can translate nanoseconds into frequency using the simple formula f = 1/T, so 10 ns is 100 MHz, which is way beyond audibility. What about 100 ns? That translates to 10 MHz – again way beyond audibility. So by what mechanism can people discern that there is a difference? Perhaps it folds down into the audible range (two frequencies mixing to produce a new, lower frequency), but then again, this is not modulation.

No, the only ‘argument’ left is that of angular displacement, which requires a different formula: f = 1/(2πt). Using that we find 10 ns represents 16 MHz and 100 ns represents 1.6 MHz – closer to the audible spectrum.
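If you want to check those figures, here is a small Python sketch (my own illustration, nothing to do with any test equipment) applying both formulas to the 10 ns and 100 ns cases:

    # Convert a jitter interval into an equivalent frequency using the
    # two formulas discussed above.
    from math import pi

    def period_to_freq(t):
        """f = 1/T: treat the jitter interval as a whole period."""
        return 1.0 / t

    def angular_to_freq(t):
        """f = 1/(2*pi*t): treat the interval as an angular displacement."""
        return 1.0 / (2.0 * pi * t)

    for t in (10e-9, 100e-9):   # 10 ns and 100 ns
        print(f"{t * 1e9:.0f} ns -> 1/T = {period_to_freq(t) / 1e6:.0f} MHz, "
              f"1/(2*pi*t) = {angular_to_freq(t) / 1e6:.1f} MHz")
    # 10 ns  -> 1/T = 100 MHz, 1/(2*pi*t) = 15.9 MHz
    # 100 ns -> 1/T = 10 MHz,  1/(2*pi*t) = 1.6 MHz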

I am unaware of any studies into how humans detect phase displacement, and I remember an Audio Engineering Society demonstration that ‘proved’ we cannot tell the difference between an audio signal of one phase or another. But can we tell if part of an audio signal is being phase-manipulated whilst the rest of it is left alone? Jitter effects show that it is possible.

Therefore, jitter being noticeable in tests suggests that we can hear differences in phase when a highish frequency is stepped ‘out of line’ by the time equivalent of 1.6 MHz (100 ns).

So do you think you can hear 10 ns of jitter? The equivalent frequency is 16 MHz. Remember, the listening panels reported they couldn’t detect anything at jitter of around 10 – 15 ns.

Therefore it would tend to suggest that a D-A converter (DAC) jitter specification of 10 – 15 ns is perfectly adequate for high fidelity listening.

From the above we understand that jitter affects the highs much more than the lows – as jitter is increased it is noticed on the high frequencies first. And if jitter is sufficiently low it isn’t noticed at all.
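The reason is straightforward: the amplitude error a timing slip causes is the slope of the waveform times the slip, and the slope of a sine wave is proportional to its frequency. A quick Python sketch of that worst case (my own back-of-envelope figures, not a measurement):

    # Worst-case amplitude error caused by a fixed timing error: the slope of a
    # unit-amplitude sine at frequency f is at most 2*pi*f, so the error is
    # roughly 2*pi*f*dt - it grows in direct proportion to frequency.
    from math import pi, log10

    JITTER = 10e-9   # a 10 ns timing error

    for f in (100, 1_000, 10_000, 20_000):      # signal frequency in Hz
        err = 2 * pi * f * JITTER
        print(f"{f:>6} Hz: error level {20 * log10(err):6.1f} dB relative to the signal")
    # 100 Hz: about -104 dB; 20 kHz: about -58 dB - the highs suffer first.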

So how do we measure jitter? Here I’m not going to get bogged down in the technicalities, because it could take a slim volume to explain properly – I don’t have the time and possibly neither do you – but there are books about it. However, without the test equipment the measurement requires, you’d find the books a frustrating read.

In a nutshell, I use the j-test utility (software) on an Audio Precision audio analyser. It produces the required test signals and an FFT (fast Fourier transform) graph, and from that a jitter spectrum and an integration readout.
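I can’t reproduce the Audio Precision utility here, but the general principle can be sketched in a few lines of Python: sample a high-frequency tone with a deliberately jittered clock, window it, take an FFT, and the jitter shows up as sidebands either side of the tone. All the figures below are made up purely for illustration:

    import numpy as np

    FS = 44_100                  # sampling rate (Hz)
    N = 1 << 15                  # FFT length
    F0 = FS / 4                  # high-frequency test tone (lands exactly on a bin)
    FJ = FS / 16                 # sinusoidal jitter frequency (chosen to land on a bin)
    JITTER_RMS = 1e-9            # 1 ns RMS of deliberately injected clock jitter

    n = np.arange(N)
    ideal_t = n / FS
    # Displace the sample instants by a small sinusoidal timing error (the jitter).
    jittered_t = ideal_t + np.sqrt(2) * JITTER_RMS * np.sin(2 * np.pi * FJ * ideal_t)
    x = np.sin(2 * np.pi * F0 * jittered_t)

    spectrum = np.abs(np.fft.rfft(x * np.hanning(N))) / (N / 4)   # 0 dB = test tone
    db = 20 * np.log10(spectrum + 1e-20)
    # The jitter appears as a pair of sidebands at F0 +/- FJ around the tone.
    for f in (F0 - FJ, F0, F0 + FJ):
        k = int(round(f * N / FS))
        print(f"{f:8.0f} Hz  {db[k]:7.1f} dB")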

Jitter can exist down to 10 Hz – below that it isn’t jitter but is more accurately described as wander, due to subsonic irregularities, and so the lower limit for testing is set at 10 Hz.

However, that can lead to misleading results, as jitter is most pronounced at high frequencies. The subsonic irregularities don’t just stop once they reach 10 Hz – they die down or settle from 10 Hz upwards, suggesting that we should concentrate the measurement from a decade higher, at 100 Hz. A white paper from the 117th Audio Engineering Society convention suggests we do exactly that, but state in our results that we used a 100 Hz corner, so that another tester could parallel the test and should therefore report the same or a similar result.

The j-test jitter spectrum graph could be read off to give the jitter, but that must be qualified by the frequency at which it was measured. However, that would not give an honest appraisal (though the result would look out of this world!).

The jitter should be stated over the bandwidth of the test, which is the spectrum of the audio being reproduced at a particular sampling rate: 44.1 kHz yielding a 20 kHz audio bandwidth, for example.

As there is no upper limit for the test, save for the actual audio bandwidth, it is classed as wideband. Therefore, the jitter reading should be integrated over the bandwidth of the test.
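To make ‘integrated’ concrete, here is a rough Python sketch of the idea (mine, not the analyser’s method): take the jitter contribution of each component in the spectrum between the 100 Hz corner and the top of the audio band, and root-sum-square them.

    import numpy as np

    def integrated_jitter(freqs_hz, jitter_s, f_low=100.0, f_high=20_000.0):
        """Root-sum-square of a jitter spectrum between two corner frequencies."""
        freqs_hz = np.asarray(freqs_hz, dtype=float)
        jitter_s = np.asarray(jitter_s, dtype=float)
        band = (freqs_hz >= f_low) & (freqs_hz <= f_high)
        return np.sqrt(np.sum(jitter_s[band] ** 2))

    # Made-up example: three spectral components, one below the 100 Hz corner.
    print(integrated_jitter([50, 1_000, 8_000], [500e-12, 80e-12, 60e-12]))
    # ~1.0e-10 s, i.e. about 100 ps - the 50 Hz component is excluded by the corner.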

However, there is still one factor for which it is difficult or impossible to find an answer, and that is because j-test occupies only half of the audio bandwidth – it stops at 10 kHz for a 44.1 kHz sampling frequency (12 kHz for 48 kHz, …, 48 kHz for 192 kHz).

The question is: do we integrate over the whole bandwidth or half of it? What bearing does that have on the results? The square root of a 10 kHz bandwidth is 100, whereas for 20 kHz it is 141 – the difference is going to be 1 to root 2.

As the test ends at half the bandwidth and there is nothing to measure above that point, it is argued that, there being nothing to integrate above that point, the results are correct for the full bandwidth even though only half of it is integrated. If you are able to follow that, well done! I mention it simply so that you are aware that published results could be indicating only around 70% of the truth.
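For anyone who wants to check the 70% figure, the arithmetic is just the square-root-of-bandwidth ratio:

    from math import sqrt

    half_band = sqrt(10_000)          # proportional to the integration over 10 kHz
    full_band = sqrt(20_000)          # proportional to the integration over 20 kHz
    print(half_band, full_band)       # 100.0 and ~141.4
    print(half_band / full_band)      # ~0.707 - roughly 70% of the full-bandwidth figure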

Jitter plot

In tests on my new Majestic DAC design, the integrated result equated to approximately 116 ps (picoseconds), measured with the 100 Hz corner (as detailed above).

Another question that could be asked is whether that is RMS or peak. Here again there is no indication in the test utility, and only the knowledge that the FFT is measuring power tells us that it is RMS.

Time for a sanity check: considering that jitter levels of 10 ns – 86 times higher than my test result – cannot be distinguished, does it really matter to nit-pick whether it’s 116 ps or in fact root 2 higher (164 ps)? It is still far lower than you’ll ever hear.
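The sanity-check arithmetic itself, for the record:

    result_ps = 116                    # integrated jitter measured on the Majestic DAC (ps)
    threshold_ps = 10_000              # the 10 ns figure from the listening tests (ps)
    print(threshold_ps / result_ps)    # ~86 - the audibility threshold is about 86x higher
    print(result_ps * 2 ** 0.5)        # ~164 ps if the root-2 correction were applied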

Even if I included the sub-100 Hz artefacts, which comprise the remnants of wander and the ever-present line-voltage noise (50/60 Hz) all around us, the integrated or total jitter (TJ) is still only around 1 ns.

However, there are those who, I’m sure, will be swayed by claims of only a few ps of jitter in the belief that they’re getting a better deal. They really need to ask the question: what qualifies such results – what parameters were used?

Jitter plot

And finally, noise. My FFT trace showed virtually none of the spikes which should surround the fundamental test frequency and which indicate jitter. So what was j-test actually reading? At around -140 dB, I think most who understand such things will realise that the noise floor is contributing here – the best Wolfson WM8741 DAC specification being -128 dB.

So what am I measuring – jitter or noise? And if a bit of both, perhaps we need the same convention as is used for distortion: THD+N? However, that would be difficult, as jitter is an integration of timing error whereas noise is amplitude – mixed metaphors?

In conclusion, jitter is a valid measurement which tells us whether a DAC is good or not, but we should realise that the pursuit of vanishingly low jitter is futile, because 10 ns and below cannot be discerned. And when we get into the lower hundreds of ps we may not be looking at jitter at all – but noise.