# Generalised learning of time-series: Ornstein-Uhlenbeck processes¶

(c) 2019
Mehmet Süzen



## Reconstructive cross-validation¶

$\mathbf{y} \in \mathbb{R}^{n}$, vector of length $n$

$t_{i} \in \mathbb{R}^{n}$ where $y_{i} = y(t_{i})$ and $i \in \mathbb{Z}_{+}$

where $t_{n} > t_{n-1} > ... > t_{1}$

the ordered training dataset reads, $(y_{i} , t_{i})$

### Ordered data: Out-of-sample set for testing¶

As a continuation of $\mathbf{y}$ is defined as out-of-sample set, $\mathbf{w} \in \mathbb{R}^{p}$, vector of length $p$, $p > n$,

where $w_{j} = w(t_{j})$ and $j \in \mathbb{Z}_{+}$, where $j > p$

where $t_{m+p} > t_{p+n-1} > ... > t_{n+1}$

the ordered training dataset reads, $(w_{j} , t_{j})$

### Partitions and conventional CV-folds as missing data¶

Disjoint $k$ sets of partitions of $\mathbf{y}$ are $\mathbf{y}^{1}, \mathbf{y}^{2}, ...,\mathbf{y}^{k}$ each set having randomly assigned $y_{i}$, and partitions are approximately equal in size $$|\mathbf{y}^{1}| \approx |\mathbf{y}^{2}| \approx ... \approx |\mathbf{y}^{k}|.$$

A training-fold is defined as a union of all partitions except the removed partion, $$Y^{m} = \bigcup_{l=1, l \ne m}^{k} \mathbf{y}^{l},$$ Missing data would be on the corresponding removed partition $\mathbf{y}^{m}$.

### Reconstruction folds¶

Due to ordered nature of the series, the standard CV approach would not be used in different folds which yields to an absurd situation of predicting past using the future values. To overcome this, a reconstruction of full training series $y$ is denoted by $R^{j}$. Using each training-fold $Y^{j}$ via a secondary model $\mathbb{M}_{2}$. A technique could be interpoloation or more advanced filtering approaches, resulting $$R^{m} = Y^{m} + \hat{\mathbf{y}^{m}}.$$

The secondary model could retain the given points on the training-fold $Y^{j}$ in this approach. $\hat{\mathbf{y}^{m}}$ is the reconstructed portion.

### Reconstruction error¶

The total error due to reconstructed model $\mathbb{M}_{2}$ expresssed as $g_{r}(\mathbf{y}, \hat{\mathbf{y}_{}})$, here for example we write down as a mean absolute percent error, different metrics can be used,

$$g_{r} = \frac{1}{k} \sum_{m=1}^{k} |(\mathbf{y}^{m} - \hat{\mathbf{y}^{m}})|/\mathbf{y}^{m}.$$

### CV error on prediction task¶

The primary model, $\mathbb{M}_{1}$, is build on each $R^{m}$ and predictions on out-of-sample set $\mathbf{w}$ is applied. This results in set of predictions $\hat{\mathbf{w}^{m}}$, the error is expressed as $g_{p}(\mathbf{w}, \hat{\mathbf{w}_{}})$

$$g_{p} = \frac{1}{k} \sum_{q=1}^{k} (\mathbf{w} - \hat{\mathbf{w}^{m}}).$$

The total error in rCV is computed as follows. $g_{rCV} = g_{r}*g_{p}$. The lower the number better the generalisation, however both $g_{r}$ and $g_{p}$ should be judge seperately to detect any anomalies.

### Reconstruction of time-series via Gaussian-Process¶

Given $(y_{i}, t_{i})$ ordered pairs as time-series, we aim at inferring, i.e., reconstructing, missing values at time-points $t_{j}$. The missing values $y_{j}$ can be identified via Bayesian interpretation of kernel regularisation,

$y_{j} = K_{ji} L^{-1} y_{i}$

where $L=K_{ii} + \mathbb{I}$. The kernel matrices $K_{ji}, K_{ii}$ is build via kernel $\exp(-\mathbf{D}_m/2.0)$ where $\mathbf{D}_m$ is the distance matrix over time-points.

### Generating Ornstein-Uhlenbeck process¶

On can generate Ornstein-Uhlenbeck process drawing numbers from multivariate Gaussian with specific covariance structure. $$\mathbf{y}_{ou}(t_{i}) \approx \mathscr{N}(\mathbf{\mu}, \Sigma).$$

Taking $\mathbf{\mu}$ as all at $2.0$, and $\Sigma$ build via kernel $\exp(-\mathbf{D}_m/2.0)$ where $\mathbf{D}_m$ is the distance matrix over time-points.

We generated Ornstein-Uhlenbeck (OU) time-series for a regular $1000$ time points with a spacing of $\Delta t = 0.1$ with different length scales, mean values $\mu$, and additional $250$ time points for testing.

### Estimating a time-series learning curve via reconstructive k-folds¶

Time-series learning curves are not that common due to fact that data sample sizes are limited. However with reconstruction approach we provided above, one can build a learning curve $L$, based on number of folds

$$L(k) = g_{rCV}^{k}$$

The reason behind this, see paper.

## Python Methods¶

In :
%load_ext lab_black
%matplotlib inline
import numpy as np
import sys
import matplotlib
import matplotlib.pyplot as plt

"numpy version:", np.__version__, "matplotlib :", matplotlib.__version__, "Python version:", sys.version

The lab_black extension is already loaded. To reload it, use:

Out:
('numpy version:',
'1.17.2',
'matplotlib :',
'3.1.1',
'Python version:',
'3.7.3 (default, Mar 27 2019, 16:54:48) \n[Clang 4.0.1 (tags/RELEASE_401/final)]')

### Generate 1D Ornstein-Uhlenbeck Process¶

In :
import numpy as np

def ornstein_uhlenbeck_series_1d(t_vec, seed=424242, mu=np.array([])):
"""

Generate Ornstein-Uhlenbeck process in 1-dimension.
Sigma assumed to be a distance matrix over all t_vec.

t_vec : Ordered values time points
seed  : random seed, defaults to  424242
mu    : mean vector, defaults to all zeros via []

Returns
y(t) : 1-d Ornstein-Uhlenbeck process on the given time-points

"""
Dm = np.array([np.abs(t - t_vec) for t in t_vec])
sigma = np.exp(-Dm / 2.0)
n = mu.shape
m = t_vec.shape
if n == 0:
mu = np.zeros(m)
np.random.seed(seed)
return np.random.multivariate_normal(mu, sigma, 1)


### Generate 1D Ornstein-Uhlenbeck Process with out-of-sample data¶

In :
def ornstein_uhlenbeck_series_1d_oos(ty, nsample, mu_shift=5.0, oosp=0.25, seed=424242):
"""

Generate Ornstein-Uhlenbeck process in 1-dimension and its out-of-sample set.
Sigma assumed to be a distance matrix over time points.

ty : End time point for the training series that fold will be generated. [0,ty].
Out-of-sample portion will be 25% of nsample by default, See oosp.
where t_w > t_y (t_y, tw]
nsample : Number of samples to use between [0, t_y]. See oosp too.
mu_shift : A real number to shift mean value of OU from zeros, defaults to 5.0.
oosp : Defaults to 0.25. Ratio of nsample to be number of
out-of-sample time points.

Returns
OU data with time points (t_y, y, t_w, w)

"""
t_y = np.linspace(0.0, ty, nsample + 1)
mu = np.ones(t_y.shape) + mu_shift
y = ornstein_uhlenbeck_series_1d(t_y, mu=mu, seed=seed)
t_w = np.linspace(ty + ty / nsample, ty + ty * oosp, nsample * oosp + 1)
mu = np.ones(t_w.shape) + 5.0
w = ornstein_uhlenbeck_series_1d(t_w, mu=mu, seed=seed)
return (t_y, y, t_w, w)


### Generate k-fold indices given 1-D time-series and k-value¶

In :
def generate_reconstruction_folds(y, t_y, npartition, seed=424242):
"""

Generate folds and missing points

y   : y value
t_y : corresponding t values
npartition: Number of folds to generate, k-fold's k

Return
rfolds : List of tuple of (y_m, t_m, t_r)
y_m : fold values
t_m : corresponding time
t_r : missing time points

"""
np.random.seed(seed)
nsample = y.shape
# Each partition upper index: k-fold
inx_part = np.int64(np.round(np.linspace(0.0, 1.0, npartition + 1) * nsample))
# Partition ranges
inx_pairs_part = [(x, x) for x in zip(inx_part[:-1], inx_part[1:])]
# Shuffled indices
sample_inx = np.arange(0, nsample)
np.random.shuffle(sample_inx)
rfolds = []
for ps in inx_pairs_part:
missing_inx = sample_inx[ps : ps]
t_r = np.sort(t_y[missing_inx])
y_m = y[m_inx]
t_m = t_y[m_inx]
ix = np.argsort(t_m)
t_m = t_m[ix]
y_m = y_m[ix]
rfolds.append((y_m, t_m, t_r))
return rfolds


### Generate k-fold indices given 1-D time-series and k-value: index only version¶

In :
def generate_reconstruction_folds_inx(nsample, npartition, seed=424242):
"""

Generate folds and missing points indices

nsample   : Number of points in the time-series
npartitions : Number of folds to consider
seed : random number seed, defaults to 424242

Return
rfolds : List of tuple of (m_inx, r_inx)
m_inx : fold indices
r_inx : missing indices

"""
np.random.seed(seed)
# Each partition upper index: k-fold : Cut [0,1] to npartitions and map to nsample
inx_part = np.int64(np.round(np.linspace(0.0, 1.0, npartition + 1) * nsample))
# Partition ranges
inx_pairs_part = [(x, x) for x in zip(inx_part[:-1], inx_part[1:])]
# Shuffled indices
sample_inx = np.arange(0, nsample)
np.random.shuffle(sample_inx)
rfolds = []
for ps in inx_pairs_part:
r_inx = sample_inx[ps : ps]
m_inx = np.setdiff1d(sample_inx, r_inx)
rfolds.append((m_inx, r_inx))
return rfolds


### Solve 1-D Ornstein-Uhlenbeck process with Gaussian Process¶

In :
def gp_ou_1d(y_i, t_i, t_j):
"""

Solve for Gaussian process for unknown y_j
Kernel matrices from  Ornstein-Uhlenbeck process
with unit penalty term, on identy matrix.

y_i : Ordered values time points
t_i : Time values
t_j : Time values to estimate

Returns
y_j_bar : 1-d Ornstein-Uhlenbeck process estimation on t_j time-points

"""
D_ii = np.array([np.abs(t - t_i) for t in t_i])
K_ii = np.exp(-D_ii / 2.0)
D_ij = np.array([np.abs(t - t_i) for t in t_j])
K_ij = np.exp(-D_ij / 2.0)
L = K_ii + np.identity(K_ii.shape)
y_j_bar = np.dot(np.matmul(K_ij, np.linalg.pinv(L)), y_i)
return y_j_bar


### Solve 1-D Ornstein-Uhlenbeck process with Gaussian Process: Index version¶

In :
def gp_ou_1d_inx(y, t, m_inx, r_inx):
"""

Solve for Gaussian process for unknown y_j
Kernel matrices from  Ornstein-Uhlenbeck process
with unit penalty term, on identy matrix.

y : Ordered values time points
t : Time values
m_inx : fold indices
r_inx : missing indices (points to reconstruct)

Returns
y_j_bar : 1-d Ornstein-Uhlenbeck process estimation on t_j time-points

"""
t_i = t[m_inx]
t_j = t[r_inx]
y_i = y[m_inx]
y_j = y[r_inx]
y_j_bar = gp_ou_1d(y_i, t_i, t_j)
return y_j_bar


### Reconstruct 1-D Ornstein-Uhlenbeck process with Gaussian Process for given k-folds¶

In :
def gp_ou_1d_reconstruct_folds(y, t, rfolds):
"""

Solve for Gaussian process for given fold indices,
i.e, reconstruction.
Kernel matrices from  Ornstein-Uhlenbeck process
with unit penalty term, on identy matrix.

y : values
t : corresponding time points
rfolds : List of tuple of (m_inx, r_inx)
m_inx : fold indices
r_inx : missing indices

Returns
rcon  : List of (R_m, g_r)
R_m reconstructed y and correponding MAPE g_r on each
fold

"""
rcon = []
for r in rfolds:
m_inx, r_inx = r
y_i = y[m_inx]
t_i = t[m_inx]
t_j = t[r_inx]
y_j = y[r_inx]
y_j_hat = gp_ou_1d(y_i, t_i, t_j)
g_r = np.sum(np.abs(y_j - y_j_hat) / y_j) / y_j.shape
t_org = np.append(t_i, t_j)
y_hat = np.append(y_i, y_j_hat)
ixx = np.argsort(t_org)
rcon.append((y_hat[ixx], g_r))
return rcon


### Solve 1-D Ornstein-Uhlenbeck process with Gaussian Process for predicting future series¶

In :
def gp_ou_1d_folds_predict(rcon, t_y, w, t_w):
"""

Given folds compute predictions.

rcon  : List of (R_m, g_r)
R_m reconstructed y and correponding MAPE g_r on each fold
t_y  : time points in recunstructed folds
w    : future values
t_w  : future time-points

Returns
wcon : List of (R_m, g_r)
w_hat predicted w and correponding MAPE g_p on each fold

"""
wcon = []
for rc in rcon:
R_m, _ = rc
w_hat = gp_ou_1d(R_m, t_y, t_w)
g_p = np.sum((w - w_hat) / w) / w.shape  # MAPE
wcon.append((w_hat, g_p))
return wcon


### rCV on Ornstein-Uhlenbeck data¶

In :
import logging

def rCV_ou(ty, nsample, npartition, mu_shift=5.0, oosp=0.25, seed=424242):
"""

Given time values, number of samples, and number of partitions:
Generate Ornstein-Uhlenbeck data, covariance is a distance matrix,
and perform rCV on it.

ty : End time point for the training series that fold will be generated. [0,t_y]/
End time point for the out-of-sample test portion, where t_w > t_y (t_y, tw]
Out-of-sample portion will be 25% of nsample by default, See oosp.
nsample : Number of samples to use between [0, t_y]. See oosp too.
npartitions : k for k-fold.
mu_shift : A real number to shift mean value of OU from zeros, defaults to 5.0.
oosp : Defaults to 0.25. Ratio of nsample to be number of
out-of-sample time points.
seed : Seed to use for randomization, defaults to 424242

Returns:
rCV_data
A dictionary {
'ou_data':ou_data,
'fold_reconstructions':rcon,
'predictions':wcon,
"g_rCV": g_rCV,
}
ou_data : A tuple (t_y, y, t_w, w) OU data for training and
outof sample.
rcon  : List of (R_m, g_r)
R_m reconstructed y and correponding MAPE g_r on each fold.
wcon : List of (R_m, g_r)
w_hat predicted w and correponding MAPE g_p on each fold
g_rCV : Reconstructive-CV error
gr : Mean Reconstruction error
gw : Mean Prediction error

"""
t_y, y, t_w, w = ornstein_uhlenbeck_series_1d_oos(ty, nsample, seed=seed)
ou_data = (t_y, y, t_w, w)
rfolds = generate_reconstruction_folds_inx(nsample + 1, npartition, seed=seed)
rcon = gp_ou_1d_reconstruct_folds(y, t_y, rfolds)
wcon = gp_ou_1d_folds_predict(rcon, t_y, w, t_w)
gr = np.mean([r for r in rcon])
gw = np.mean([w for w in wcon])
g_rCV = gr * gw
rCV_data = {
"ou_data": ou_data,
"fold_reconstructions": rcon,
"predictions": wcon,
"g_rCV": g_rCV,
"gr": gr,
"gw": gw,
}
return rCV_data


## Functional tests and data generation¶

### Generate OU process¶

In :
t_y = np.linspace(0.0, 10, 1001)
mu = np.ones(t_y.shape) + 5.0
y = ornstein_uhlenbeck_series_1d(t_y, mu=mu)
t_w = np.linspace(10.1, 12.5, 251)
mu = np.ones(t_w.shape) + 5.0
w = ornstein_uhlenbeck_series_1d(t_w, mu=mu)

In :
# plt.plot(t_y, y, t_w, w, "-", "--")


### Generate OU process with OOS¶

In :
ty = 10.0
nsample = 1000
(t_y, y, t_w, w) = ornstein_uhlenbeck_series_1d_oos(ty, nsample)

# plt.plot(t_y, y, t_w, w, "-", "--")


### Generate series and k-fold indices and reconstruct¶

In :
nsample = 1001  # samples
npartition = 10  # fold
t_y = np.linspace(0.0, 10, nsample)
mu = np.ones(t_y.shape) + 2.0
y = ornstein_uhlenbeck_series_1d(t_y, mu=mu)
t_w = np.linspace(10.1, 10.5, 41)
mu = np.ones(t_w.shape) + 2.0
w = ornstein_uhlenbeck_series_1d(t_w, mu=mu)
rfolds = generate_reconstruction_folds_inx(nsample, npartition)

In :
rcon = gp_ou_1d_reconstruct_folds(y, t_y, rfolds)

In :
rcon

Out:
(array([3.70006446, 3.838436  , 3.80567831, ..., 1.41902056, 1.6361343 ,
1.65034412]), 0.08964514140227492)

### Predict future points using reconstructed folds¶

In :
wcon = gp_ou_1d_folds_predict(rcon, t_y, w, t_w)


### Test single fold reconstruction¶

In :
m_inx, r_inx = rfolds
y_i = y[m_inx]
t_i = t_y[m_inx]
t_j = t_y[r_inx]

In :
y_j_hat = gp_ou_1d(y_i, t_i, t_j)

In :
y_j_hat_inx = gp_ou_1d_inx(y, t_y, m_inx, r_inx)

In :
y_org = np.append(y[m_inx], y[r_inx])
y_hat = np.append(y[m_inx], y_j_hat)

In :
t_org = np.append(t_i, t_j)
ixx = np.argsort(t_org)

In :
# plt.plot(t_org[ixx], y_org[ixx], t_org[ixx], y_hat[ixx], "-", "--")


### Test single future prediction with GP¶

In :
w_hat = gp_ou_1d(y, t_y, t_w)

In :
yw_org = np.append(y, w)
yw_hat = np.append(y, w_hat)

In :
t_yw = np.append(t_y, t_w)

In :
np.sum((w - w_hat) / w) / w.shape  # MAPE

Out:
0.07653746287282844

## rCV with 1D Ornstein-Uhlenbeck¶

In :
ty = 10.0
nsample = 1000
npartition = 10
rCV_data = rCV_ou(ty, nsample, npartition)

In :
rCV_data.keys()

Out:
dict_keys(['ou_data', 'fold_reconstructions', 'predictions', 'g_rCV', 'gr', 'gw'])
In :
# g_r erros
[rr for rr in rCV_data["fold_reconstructions"]]

Out:
[0.028486701249059415,
0.03203318558792611,
0.03066018781182881,
0.028455409391295778,
0.02670547724709341,
0.030581226711995195,
0.026295584174107694,
0.028457964097179756,
0.028106758965915028,
0.026584472342276965]

## Results and Figures¶

### 1D OU generated data¶

In :
t_y, y, t_w, w = rCV_data["ou_data"]

In :
%matplotlib inline
font = {"family": "normal", "weight": "bold", "size": 14}

plt.rc("font", **font)
plt.plot(t_y, y, "-", label="y : Historic for rCV ")
plt.plot(t_w, w, "--", label="w : Out-of-Sample (OOS)")
plt.legend(loc="lower left")
plt.ylim([0, 7.0])
plt.xlabel("Time (arbitrary units)", **font)
plt.ylabel("y, w (arbitrary units)", **font)
plt.title("Ornstein-Uhlenbeck Process  ", **font)
plt.savefig("plots/ou_data.eps", format="eps", dpi=1000, bbox_inches="tight")
plt.cla()
plt.clf()
plt.gca()
plt.gcf()
plt.close()

findfont: Font family ['normal'] not found. Falling back to DejaVu Sans.
The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.
The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.
The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.
The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.


### Reconstructed folds absolute difference¶

In :
y_folds = [yy for yy in rCV_data["fold_reconstructions"]]

In :
k = 1
dy_mean = []
for y_f in y_folds:
dy = np.abs(y - y_f)
dy_mean.append(dy.mean())
plt.subplot(10, 1, k)
plt.plot(t_y, dy, "-")
plt.ylabel("1")
plt.axis("off")
k = k + 1
plt.suptitle("Reconstructed series absolute errors", **font)

plt.savefig(
"plots/ou_reconstructed_diff.eps", format="eps", dpi=1000, bbox_inches="tight"
)
plt.cla()
plt.clf()
plt.gca()
plt.gcf()
plt.close()

In :
np.mean(dy_mean)

Out:
0.014222398558916067

### Error summary¶

In :
# Fold construction errors MAPE
[r for r in rCV_data["fold_reconstructions"]]

Out:
[0.028486701249059415,
0.03203318558792611,
0.03066018781182881,
0.028455409391295778,
0.02670547724709341,
0.030581226711995195,
0.026295584174107694,
0.028457964097179756,
0.028106758965915028,
0.026584472342276965]
In :
# Prediction errors MAPE
[w for w in rCV_data["predictions"]]

Out:
[0.4661134459119515,
0.4701922271506014,
0.4669851635984956,
0.4671724893188641,
0.4693673160492926,
0.47414297989464843,
0.46736160215005484,
0.4678060255919754,
0.4674375313291001,
0.4672283786360845]
In :
print("Mean construction error")
mr = np.mean([r for r in rCV_data["fold_reconstructions"]])
print(mr)
print("Mean prediction error")
mp = np.mean([w for w in rCV_data["predictions"]])
print(mp)
print("g_rCV")
rCV_data["g_rCV"]

Mean construction error
0.02863669675786782
Mean prediction error
0.4683807159631068
g_rCV

Out:
0.01341287653026851
In :
np.mean([r for r in rcon]) * np.mean([w for w in wcon])

Out:
0.0075447474228444375

### Learning curve data and curve plot¶

We generate the learning curve defined above and report based on different perfromances.

In :
# Run different npartitions to get performance over
ty = 10.0
nsample = 1000
npartitions = np.linspace(2, 30, 29, dtype=np.int64)  # k-folds
gr_ = []
gw_ = []
grw_ = []
for k in npartitions:
rCV_data = rCV_ou(ty, nsample, k)
grw_.append(rCV_data["g_rCV"])
gr_.append(rCV_data["gr"])
gw_.append(rCV_data["gw"])

In :
plt.rc("font", **font)
plt.plot(npartitions, np.array(gr_) * 100, linestyle="--", marker="o")
plt.title("Learning curve with reconstruction error")
plt.ylabel("Reconstruction MAPE (%)")
plt.xlabel("Number of partitions in reconstruction")
plt.savefig("plots/learning_curve_gr.eps", format="eps", dpi=1000, bbox_inches="tight")
plt.cla()
plt.clf()
plt.gca()
plt.gcf()
plt.close()

findfont: Font family ['normal'] not found. Falling back to DejaVu Sans.

In :
plt.rc("font", **font)
plt.plot(npartitions, np.array(gw_) * 100, linestyle="--", marker="o")
plt.title("Learning curve with prediction error")
plt.ylabel("Prediction MAPE (%)")
plt.xlabel("Number of partitions in reconstruction")
plt.savefig("plots/learning_curve_gw.eps", format="eps", dpi=1000, bbox_inches="tight")
plt.cla()
plt.clf()
plt.gca()
plt.gcf()
plt.close()

In :
plt.rc("font", **font)
plt.plot(npartitions, np.array(grw_), linestyle="--", marker="o")
plt.title("Learning curve with rCV error")
plt.ylabel("rCV error")
plt.xlabel("Number of partitions in reconstruction")
plt.savefig("plots/learning_curve_grw.eps", format="eps", dpi=1000, bbox_inches="tight")
plt.cla()
plt.clf()
plt.gca()
plt.gcf()
plt.close()