We investigate the seasonality of the EWS ETF, following on from the work of Mitchell Rosenthal.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
plt.style.use("seaborn-deep")
plt.rcParams['figure.figsize'] = (10,6)
plt.rcParams['font.size'] = 14
Mitch observed that 89% of the time, April had a positive return. We first attempt to replicate this.
df = yf.download("EWS")["Adj Close"]
df.head()
[*********************100%***********************] 1 of 1 completed
Date 1996-03-18 11.169745 1996-03-19 11.281445 1996-03-20 11.169745 1996-03-21 11.169745 1996-03-22 11.393141 Name: Adj Close, dtype: float64
df.plot();
# Reproduce original results
def compute_pct_up_months(df):
# df is a dataframe of daily close prices
df_mth = df.resample("M").last()
month_up = df_mth.pct_change().dropna() > 0
pct_up = month_up.groupby(month_up.index.month).mean()
return pct_up
pct_up = compute_pct_up_months(df)
pct_up
Date 1 0.520000 2 0.440000 3 0.640000 4 0.807692 5 0.384615 6 0.538462 7 0.560000 8 0.320000 9 0.560000 10 0.560000 11 0.560000 12 0.680000 Name: Adj Close, dtype: float64
fig,ax = plt.subplots()
ax.axhline(pct_up.mean(), color="r", linestyle="--", label="mean")
ax.legend()
pct_up.plot.bar(ax=ax);
ax.set_xlabel("month")
ax.set_ylabel("pct_up")
plt.tight_layout()
# plt.savefig("replicate.png", dpi=400)
Although our numbers are slightly different (expected given a slightly different methodology and more data), the anomaly is replicated.
In this section we will compute different paths of history by reshuffling the dataframe. This is a simple way of generating random data with the same mean/volatility over the entire sample period (though obviously different for rolling windows).
This helps us understand if the anomaly is "statistically significant".
rets = df.pct_change().dropna()
rets
Date 1996-03-19 0.010000 1996-03-20 -0.009901 1996-03-21 0.000000 1996-03-22 0.020000 1996-03-25 0.009804 ... 2021-06-15 0.002954 2021-06-16 -0.015987 2021-06-17 -0.005130 2021-06-18 -0.014611 2021-06-21 0.014392 Name: Adj Close, Length: 6358, dtype: float64
def generate_path(rets):
# rets is the daily EWS returns
ret_shuffled = rets.sample(frac=1)
ret_shuffled.index = rets.index
px_new = (1 + ret_shuffled).cumprod()
return px_new
px_new = generate_path(rets)
px_new.plot();
px_new = generate_path(rets)
pct_up = compute_pct_up_months(px_new)
pct_up
Date 1 0.600000 2 0.680000 3 0.640000 4 0.461538 5 0.423077 6 0.500000 7 0.400000 8 0.400000 9 0.440000 10 0.440000 11 0.600000 12 0.400000 Name: Adj Close, dtype: float64
NUM_PATHS = 10000
paths = []
for i in range(NUM_PATHS):
px = generate_path(rets).rename(i)
paths.append(px)
paths_df = pd.concat(paths, axis=1)
max_up = compute_pct_up_months(paths_df).max()
fig,ax = plt.subplots()
paths_df.iloc[:, :100].plot(ax=ax, legend=None, alpha=0.3, color="purple", linewidth=1)
plt.tight_layout()
# plt.savefig("mc.png", dpi=400)
# The value whose significance we are trying to evaluate
threshold_pct = 0.8
fig, ax = plt.subplots()
max_up.hist(ax=ax, grid=False, bins=10);
ax.axvline(threshold_pct, color="r", linestyle="--");
ax.set_xlabel("Highest return seasonality")
ax.set_ylabel("Number of paths");
plt.tight_layout()
# plt.savefig("seasonality_hist.png", dpi=400)
p_val = (max_up > threshold_pct).sum() / NUM_PATHS
p_val
0.0076