Clock and data recovery¶

Markus G. Kuhn's Optical TEMPEST paper gives a simple clock and data recovery algorithm. I always thought of CDR as black fucking magic, but this looked simple enough that I could maybe try it.

In [2]:

%pylab inline
import seaborn
style.use('seaborn-poster')
rcParams['lines.linewidth'] = .5

/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/usr/lib/python3/dist-packages/matplotlib/__init__.py:874: UserWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle; please use the latter.
  warnings.warn(self.msg_depr % (key, alt_key))

Populating the interactive namespace from numpy and matplotlib

Here's the bits we're going to transmit.

In [3]:

#bits = where(random.random(32) > .5, 0, 1)
bits = array([0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0,
              0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0])
bits

Out[3]:

array([0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1,
       0, 1, 0, 1, 0, 0, 1, 0, 1, 0])

In [4]:

fs = int(1e6)  # sampling frequency
baud = 9600
bit_time = fs / baud   # samples per bit
jitter = -31           # first bit starts 31 samples before we start sampling
measurement_time = int(ceil(bit_time * (len(bits) - 1)))
t = arange(measurement_time) / float(fs)
bit_index = ((t - jitter / fs) * baud).round().astype(int)
a = where(bits[bit_index], 5, 0)
subplot(411); ylim(-1, 6); plot(t, a); ylabel('V'); xlabel('s'); title('a: original pristine NRZ binary signal')
bit_time

Out[4]:

104.16666666666667

We have 104 samples per bit, so we can tolerate a lot of noise. Like, if we knew the clock, we ought to be able to tolerate about 20 dB of noise, which would be noise of amplitude 10 times higher than the signal: 25 volts RMS instead of our signal's 2.5 volts RMS. But we don't know the clock, so let's keep this conservative, only about 10 dB more noise than signal.

In [6]:

b = a + random.normal(0, 10, size=a.shape)
subplot(411); plot(t, b); ylabel('V'); xlabel('s'); title('b: sadly corrupted with noise :\'(')
(a**2).mean()**0.5, ((b-a)**2).mean()**0.5

Out[6]:

(3.477036566062219, 10.040687902230339)

In [7]:

pulse = ones(int(floor(bit_time)))                 # the pulse shape of a single bit
c = convolve(b, pulse, mode='full') / sum(pulse)   # output is N+M-1 samples. this is not the fastest way to box filter
ct = (arange(len(c)) - len(pulse)//2) / float(fs)  # c time
subplot(411); plot(ct, c); xlabel('s'); title('c: box-filtered to remove noise while minimally slowing transitions')
len(c)

Out[7]:

Here I've set the time axis to do a zero-phase convolution, which is of course non-causal: the average value of the samples around a point in time depend on the samples after it as well as before it. Kuhn's paper presents this in a causal way, where this convolution induces a half-bit linear phase shift, but I think this way is less confusing.

So, that signal mostly has the right levels, at least in the middle of the bits. Let's see if we can figure where the middles of the bits are; first we take its distance from its mean:

In [8]:

d = abs(c - c.mean())
subplot(411); plot(ct, d); xlabel('s'); title('d: distance of filtered signal from its mean')
d

Out[8]:

array([2.02690831, 1.87569661, 1.96668749, ..., 2.13933567, 2.17683606,
       2.23453115])

Now, that tells us where the middles of some of the bits are, the ones that have transitions on both sides; let's average that over a longer period of time by convolving with an impulse train at the right baud rate:

In [9]:

n_impulses = 16
impulses = zeros(int(ceil(n_impulses * bit_time)))
impulses[(arange(n_impulses) * bit_time).round().astype(int)] = 1
it = arange(len(impulses)) / float(fs)
subplot(211); plot(it, impulses, label="impulse train at bit transition frequency"); legend(); ylim(-1, 2); tick_params(labelbottom=0)
e = convolve(d, impulses, mode='full') / impulses.sum()
et = (arange(len(e)) - len(pulse)//2 - len(impulses)//2) / float(fs)
ewr = int(round(bit_time / 3))   # e window radius
e_is_max = array([(e[i] >= e[i-ewr:i+ewr]).all() for i in range(len(e))])

subplot(212, sharex=gca()); plot(et, e*5 + 2, label="e: d convolved with impulse train (scaled and filtered)")
axhline(c.mean(), color="#ff6699"); xlim(t.min(), t.max()); plot(ct, c, label="c: filtered noisy signal")

for ti in et[e_is_max]:
    if t.min() <= ti <= t.max():
        axvline(ti, color="#ff99cc")
        
legend(loc=0); xlabel('s')
e_is_max.sum()

Out[9]:

So e is our recovered clock signal that tells us where to sample c, and there we have vertical lines superimposed on the plot of c to show where we've detected a local maximum of e and will sample c. You can see that generally they are the points in c that are furthest from the mean, which is indicated by a horizontal line.

(The above way of convolving with a convolution is also not the fastest way to convolve with an impulse train, but also, there isn't a straightforward way to do windowed max in Numpy.)

Note that outside of the time interval where there's actually a signal at all, we got another hundred-plus clock cycles.

In [10]:

subplot(411); plot(et, e_is_max); plot(et, e)

Out[10]:

[<matplotlib.lines.Line2D at 0x7f0411706ef0>]

So let's extract our samples:

In [11]:

f = []
ft = []
sample_times = set(ti for ti in et[e_is_max] if t.min() <= ti <= t.max())
for ci, cti in zip(c, ct):
    if cti in sample_times:
        f.append(ci)
        ft.append(cti)
        
f = array(f)
        
subplot(411); stem(ft, f); xlim(t.min(), t.max()); plot(ct, c); axhline(c.mean(), color="#ff6699"); ylabel('V')
subplot(412, sharex=gca()); plot(t, a); ylim(-1, 6)
f.round(2)

Out[11]:

array([ 5.88, -1.04,  3.75, -0.01, -0.24,  5.99,  4.79,  5.1 ,  5.27,
        2.97, -0.21,  0.11,  5.6 ,  1.19, -0.04, -0.9 ,  0.93,  0.02,
        3.42,  4.85,  5.6 , -0.77,  5.91, -2.06,  5.44,  0.31, -0.41,
        4.08, -0.32,  6.54, -2.07])

So now let's threshold it and recover our bits:

In [16]:

subplot(411); stem(ft, f > c.mean()); ylim(-1, 2)
subplot(412, sharex=gca()); plot(t, a); ylim(-1, 6)

Out[16]:

(-1, 6)

If we compare this to the original bits that were sent, we find they match up except that we're missing the first bit.

In [17]:

where(f > c.mean(), 1, 0), bits

Out[17]:

(array([1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
        1, 0, 1, 0, 0, 1, 0, 1, 0]),
 array([0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1,
        0, 1, 0, 1, 0, 0, 1, 0, 1, 0]))

In [ ]: