Testing Seasonality with Randomeness¶

We investigate the seasonality of the EWS ETF, following on from the work of Mitchell Rosenthal.

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import yfinance as yf

In [2]:

plt.style.use("seaborn-deep")
plt.rcParams['figure.figsize'] = (10,6)
plt.rcParams['font.size'] = 14

Reproducing original results¶

Mitch observed that 89% of the time, April had a positive return. We first attempt to replicate this.

In [3]:

df = yf.download("EWS")["Adj Close"]
df.head()

[*********************100%***********************]  1 of 1 completed

Out[3]:

Date
1996-03-18    11.169745
1996-03-19    11.281445
1996-03-20    11.169745
1996-03-21    11.169745
1996-03-22    11.393141
Name: Adj Close, dtype: float64

In [4]:

df.plot();

In [5]:

# Reproduce original results
def compute_pct_up_months(df):
    # df is a dataframe of daily close prices
    df_mth = df.resample("M").last()
    month_up = df_mth.pct_change().dropna() > 0
    pct_up = month_up.groupby(month_up.index.month).mean()
    return pct_up

In [6]:

pct_up = compute_pct_up_months(df)
pct_up

Out[6]:

Date
1     0.520000
2     0.440000
3     0.640000
4     0.807692
5     0.384615
6     0.538462
7     0.560000
8     0.320000
9     0.560000
10    0.560000
11    0.560000
12    0.680000
Name: Adj Close, dtype: float64

In [7]:

fig,ax = plt.subplots()
ax.axhline(pct_up.mean(), color="r", linestyle="--", label="mean")
ax.legend()
pct_up.plot.bar(ax=ax);
ax.set_xlabel("month")
ax.set_ylabel("pct_up")
plt.tight_layout()
# plt.savefig("replicate.png", dpi=400)

Although our numbers are slightly different (expected given a slightly different methodology and more data), the anomaly is replicated.

Random data¶

In this section we will compute different paths of history by reshuffling the dataframe. This is a simple way of generating random data with the same mean/volatility over the entire sample period (though obviously different for rolling windows).

This helps us understand if the anomaly is "statistically significant".

In [8]:

rets = df.pct_change().dropna()
rets

Out[8]:

Date
1996-03-19    0.010000
1996-03-20   -0.009901
1996-03-21    0.000000
1996-03-22    0.020000
1996-03-25    0.009804
                ...   
2021-06-15    0.002954
2021-06-16   -0.015987
2021-06-17   -0.005130
2021-06-18   -0.014611
2021-06-21    0.014392
Name: Adj Close, Length: 6358, dtype: float64

In [9]:

def generate_path(rets):
    # rets is the daily EWS returns
    ret_shuffled = rets.sample(frac=1)
    ret_shuffled.index = rets.index
    px_new = (1 + ret_shuffled).cumprod()
    return px_new

In [10]:

px_new = generate_path(rets)
px_new.plot();

In [11]:

px_new = generate_path(rets)
pct_up = compute_pct_up_months(px_new)
pct_up

Out[11]:

Date
1     0.600000
2     0.680000
3     0.640000
4     0.461538
5     0.423077
6     0.500000
7     0.400000
8     0.400000
9     0.440000
10    0.440000
11    0.600000
12    0.400000
Name: Adj Close, dtype: float64

In [12]:

NUM_PATHS = 10000
paths = []

for i in range(NUM_PATHS):
    px = generate_path(rets).rename(i)
    paths.append(px)
paths_df = pd.concat(paths, axis=1)

max_up = compute_pct_up_months(paths_df).max()

In [13]:

fig,ax = plt.subplots()
paths_df.iloc[:, :100].plot(ax=ax, legend=None, alpha=0.3, color="purple", linewidth=1)
plt.tight_layout()
# plt.savefig("mc.png", dpi=400)

In [14]:

# The value whose significance we are trying to evaluate
threshold_pct = 0.8

fig, ax = plt.subplots()
max_up.hist(ax=ax, grid=False, bins=10);
ax.axvline(threshold_pct, color="r", linestyle="--");
ax.set_xlabel("Highest return seasonality")
ax.set_ylabel("Number of paths");

plt.tight_layout()
# plt.savefig("seasonality_hist.png", dpi=400)

In [15]:

p_val = (max_up > threshold_pct).sum() / NUM_PATHS
p_val

Out[15]:

0.0076

In [ ]: