In this case study, we will implement hierarchical risk parity based on clustering methods and compare it against other portfolio allocation methods. The case study is based on the paper “Building Diversified Portfolios that Outperform Out-of-Sample” (2016) by Marcos Lopez de Prado.
Our goal in this case study is to maximize risk-adjusted returns by using a clustering-based algorithm on a dataset of stocks to allocate capital into different asset classes.
First, we check whether the additional packages needed are present and install them if not. Let us check whether the cvxopt package is present, and install it if not. This package is checked separately as it is not included in the requirements.txt of this book's repository, since it is not used in any other case study of this book.
import pkg_resources
import pip

installedPackages = {pkg.key for pkg in pkg_resources.working_set}
if 'cvxopt' not in installedPackages:
    !pip install cvxopt==1.2.5
# Load libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas import read_csv, set_option
from pandas.plotting import scatter_matrix
import seaborn as sns
from sklearn.preprocessing import StandardScaler
from datetime import date
#Import Model Packages
import scipy.cluster.hierarchy as sch
from sklearn.cluster import AgglomerativeClustering
from scipy.cluster.hierarchy import fcluster
from scipy.cluster.hierarchy import dendrogram, linkage, cophenet
from sklearn.metrics import adjusted_mutual_info_score
from sklearn import cluster, covariance, manifold
#Package for optimization of mean variance optimization
import cvxopt as opt
from cvxopt import blas, solvers
# load dataset
dataset = read_csv('SP500Data.csv',index_col=0)
#Disable the warnings
import warnings
warnings.filterwarnings('ignore')
type(dataset)
pandas.core.frame.DataFrame
# shape
dataset.shape
(448, 502)
# peek at data
set_option('display.width', 100)
dataset.head(5)
| Date | ABT | ABBV | ABMD | ACN | ATVI | ADBE | AMD | AAP | AES | AMG | ... | WLTW | WYNN | XEL | XRX | XLNX | XYL | YUM | ZBH | ZION | ZTS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2018-01-02 | 58.790001 | 98.410004 | 192.490005 | 153.839996 | 64.309998 | 177.699997 | 10.98 | 106.089996 | 10.88 | 203.039993 | ... | 146.990005 | 164.300003 | 47.810001 | 29.370001 | 67.879997 | 68.070000 | 81.599998 | 124.059998 | 50.700001 | 71.769997 |
| 2018-01-03 | 58.919998 | 99.949997 | 195.820007 | 154.550003 | 65.309998 | 181.039993 | 11.55 | 107.050003 | 10.87 | 202.119995 | ... | 149.740005 | 162.520004 | 47.490002 | 29.330000 | 69.239998 | 68.900002 | 81.529999 | 124.919998 | 50.639999 | 72.099998 |
| 2018-01-04 | 58.820000 | 99.379997 | 199.250000 | 156.380005 | 64.660004 | 183.220001 | 12.12 | 111.000000 | 10.83 | 198.539993 | ... | 151.259995 | 163.399994 | 47.119999 | 29.690001 | 70.489998 | 69.360001 | 82.360001 | 124.739998 | 50.849998 | 72.529999 |
| 2018-01-05 | 58.990002 | 101.110001 | 202.320007 | 157.669998 | 66.370003 | 185.339996 | 11.88 | 112.180000 | 10.87 | 199.470001 | ... | 152.229996 | 164.490005 | 46.790001 | 29.910000 | 74.150002 | 69.230003 | 82.839996 | 125.980003 | 50.869999 | 73.360001 |
| 2018-01-08 | 58.820000 | 99.489998 | 207.800003 | 158.929993 | 66.629997 | 185.039993 | 12.28 | 111.389999 | 10.87 | 200.529999 | ... | 151.410004 | 162.300003 | 47.139999 | 30.260000 | 74.639999 | 69.480003 | 82.980003 | 126.220001 | 50.619999 | 74.239998 |

5 rows × 502 columns
We will look at the data visualization in later sections.
Next, we check for NAs in the data: columns with too many missing values are dropped, and the remaining missing values are filled with the last available value.
#Checking for any null values
print('Null Values =',dataset.isnull().values.any())
dataset.shape
Null Values = True
(448, 502)
Getting rid of the columns with more than 30% missing values.
missing_fractions = dataset.isnull().mean().sort_values(ascending=False)
missing_fractions.head(10)
drop_list = sorted(list(missing_fractions[missing_fractions > 0.3].index))
dataset.drop(labels=drop_list, axis=1, inplace=True)
dataset.shape
(448, 498)
Given that there are still null values after dropping those columns, we fill them rather than dropping the affected rows.
# Fill the missing values with the last value available in the dataset.
dataset=dataset.fillna(method='ffill')
dataset.head(2)
| Date | ABT | ABBV | ABMD | ACN | ATVI | ADBE | AMD | AAP | AES | AMG | ... | WLTW | WYNN | XEL | XRX | XLNX | XYL | YUM | ZBH | ZION | ZTS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2018-01-02 | 58.790001 | 98.410004 | 192.490005 | 153.839996 | 64.309998 | 177.699997 | 10.98 | 106.089996 | 10.88 | 203.039993 | ... | 146.990005 | 164.300003 | 47.810001 | 29.370001 | 67.879997 | 68.070000 | 81.599998 | 124.059998 | 50.700001 | 71.769997 |
| 2018-01-03 | 58.919998 | 99.949997 | 195.820007 | 154.550003 | 65.309998 | 181.039993 | 11.55 | 107.050003 | 10.87 | 202.119995 | ... | 149.740005 | 162.520004 | 47.490002 | 29.330000 | 69.239998 | 68.900002 | 81.529999 | 124.919998 | 50.639999 | 72.099998 |

2 rows × 498 columns
For the purpose of clustering, we will use the daily returns of the stocks. Additionally, we will train on one part of the data and test on the rest. Let us prepare the dataset for training and testing by separating the last 20% of the dataset for testing and then generating the return series for each set.
X = dataset.copy(deep=True)
row = len(X)
train_len = int(row * 0.8)
X_train = X.head(train_len)
X_test = X.tail(row - train_len)

#Calculate percentage return
returns = X_train.pct_change().dropna()
returns_test = X_test.pct_change().dropna()
The items to be clustered are the stocks, which appear as the columns of the return series, while the rows (dates) provide the observations. Hence the data is in the right format to be fed to the clustering algorithms.
The first step is to look for clusters of correlations using the agglomerative hierarchical clustering technique. The scipy hierarchy module has a dendrogram function, which takes the value returned by the linkage function of the same module.
linkage does the actual clustering in one line of code and returns a list of the clusters joined, in the format: Z = [stock_1, stock_2, distance, sample_count]
def correlDist(corr):
    # A distance matrix based on correlation, where 0 <= d[i,j] <= 1
    # This is a proper distance metric
    dist = ((1 - corr) / 2.) ** .5  # distance matrix
    return dist
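As a quick sanity check of the correlation-to-distance mapping (the values below follow directly from the formula above, not from the dataset), perfectly correlated assets have zero distance and perfectly anti-correlated assets have the maximum distance of 1:

# Sanity check of correlDist on hypothetical correlation values
print(correlDist(1.0))   # 0.0    -- perfectly correlated, zero distance
print(correlDist(0.0))   # ~0.7071 -- uncorrelated
print(correlDist(-1.0))  # 1.0    -- perfectly anti-correlated, maximum distance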
#Calculate linkage
dist = correlDist(returns.corr())
link = linkage(dist, 'ward')
link[0]
array([24. , 25. , 0.09110116, 2. ])
Computation of linkages is followed by the visualization of the clusters through a dendrogram, which displays a cluster tree. The leaves are the individual stocks and the root is the final single cluster. The “distance” between each cluster is shown on the y-axis, and thus the longer the branches are, the less correlated two clusters are.
#Plot Dendrogram
plt.figure(figsize=(20, 7))
plt.title("Dendrograms")
dendrogram(link,labels = X.columns)
plt.show()
'Quasi-diagonalization' is a process usually known as matrix seriation, which can be performed using hierarchical clustering. This process reorganizes the covariance matrix so that similar investments are placed together. This quasi-diagonalization of the matrix allows us to distribute weights optimally following an inverse-variance allocation.
def getQuasiDiag(link):
    # Sort clustered items by distance
    link = link.astype(int)
    sortIx = pd.Series([link[-1, 0], link[-1, 1]])
    numItems = link[-1, 3]  # number of original items
    while sortIx.max() >= numItems:
        sortIx.index = range(0, sortIx.shape[0] * 2, 2)  # make space
        df0 = sortIx[sortIx >= numItems]  # find clusters
        i = df0.index
        j = df0.values - numItems
        sortIx[i] = link[j, 0]  # item 1
        df0 = pd.Series(link[j, 1], index=i + 1)
        sortIx = sortIx.append(df0)  # item 2
        sortIx = sortIx.sort_index()  # re-sort
        sortIx.index = range(sortIx.shape[0])  # re-index
    return sortIx.tolist()
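As a quick illustration (a minimal sketch, assuming the Ward linkage `link` computed above), the seriated order returned by getQuasiDiag can be mapped back to ticker symbols:

# Map the seriated indices back to ticker symbols (hypothetical check)
sortIx_demo = getQuasiDiag(link)
print(returns.columns[sortIx_demo][:10].tolist())  # first ten tickers in seriated order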
This step distributes the allocation through recursive bisection based on cluster covariance.
def getClusterVar(cov, cItems):
    # Compute variance per cluster
    cov_ = cov.loc[cItems, cItems]  # matrix slice
    w_ = getIVP(cov_).reshape(-1, 1)
    cVar = np.dot(np.dot(w_.T, cov_), w_)[0, 0]
    return cVar
def getRecBipart(cov, sortIx):
    # Compute HRP allocation
    w = pd.Series(1, index=sortIx)
    cItems = [sortIx]  # initialize all items in one cluster
    while len(cItems) > 0:
        cItems = [i[j:k] for i in cItems
                  for j, k in ((0, len(i) // 2), (len(i) // 2, len(i)))
                  if len(i) > 1]  # bi-section
        for i in range(0, len(cItems), 2):  # parse in pairs
            cItems0 = cItems[i]  # cluster 1
            cItems1 = cItems[i + 1]  # cluster 2
            cVar0 = getClusterVar(cov, cItems0)
            cVar1 = getClusterVar(cov, cItems1)
            alpha = 1 - cVar0 / (cVar0 + cVar1)
            w[cItems0] *= alpha  # weight 1
            w[cItems1] *= 1 - alpha  # weight 2
    return w
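To see how one bisection step splits the weight, consider a hypothetical example (the variances below are made up for illustration): the split factor alpha is larger for the lower-variance sub-cluster, so it receives the larger share of the weight.

# One bisection step with made-up cluster variances
cVar0, cVar1 = 0.02, 0.08
alpha = 1 - cVar0 / (cVar0 + cVar1)
print(alpha, 1 - alpha)  # 0.8 0.2 -- the lower-variance cluster gets 80% of the weight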
The main premise of this case study is to develop an alternative to asset allocation based on Markowitz’s Minimum-Variance Portfolio. So, in this step, we define functions to compare the performance of the following asset allocation methods:
1. MVP - Markowitz’s Minimum-Variance Portfolio
2. HRP - Hierarchical Risk Parity
def getMVP(cov):
    cov = cov.T.values
    n = len(cov)
    N = 100
    mus = [10 ** (5.0 * t / N - 1.0) for t in range(N)]
    # Convert to cvxopt matrices
    S = opt.matrix(cov)
    #pbar = opt.matrix(np.mean(returns, axis=1))
    pbar = opt.matrix(np.ones(cov.shape[0]))
    # Create constraint matrices
    G = -opt.matrix(np.eye(n))  # negative n x n identity matrix
    h = opt.matrix(0.0, (n, 1))
    A = opt.matrix(1.0, (1, n))
    b = opt.matrix(1.0)
    # Calculate efficient frontier weights using quadratic programming
    solvers.options['show_progress'] = False
    portfolios = [solvers.qp(mu * S, -pbar, G, h, A, b)['x']
                  for mu in mus]
    ## Calculate risks and returns for the frontier
    returns = [blas.dot(pbar, x) for x in portfolios]
    risks = [np.sqrt(blas.dot(x, S * x)) for x in portfolios]
    ## Calculate the 2nd-degree polynomial of the frontier curve
    m1 = np.polyfit(returns, risks, 2)
    x1 = np.sqrt(m1[2] / m1[0])
    # Calculate the optimal portfolio
    wt = solvers.qp(opt.matrix(x1 * S), -pbar, G, h, A, b)['x']
    return list(wt)
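To make the cvxopt call above concrete, here is a minimal sketch of the same long-only quadratic program on a made-up two-asset covariance matrix (all numbers are assumptions for illustration): minimize x'Sx - pbar'x subject to x >= 0 and the weights summing to 1.

# Minimal QP sketch with a hypothetical 2-asset covariance matrix
S_demo = opt.matrix(np.array([[0.04, 0.006], [0.006, 0.01]]))
pbar_demo = opt.matrix(np.ones(2))
G_demo = -opt.matrix(np.eye(2))   # -x <= 0, i.e., long-only
h_demo = opt.matrix(0.0, (2, 1))
A_demo = opt.matrix(1.0, (1, 2))  # weights sum to 1
b_demo = opt.matrix(1.0)
solvers.options['show_progress'] = False
x_demo = solvers.qp(S_demo, -pbar_demo, G_demo, h_demo, A_demo, b_demo)['x']
print(list(x_demo))  # weights tilted toward the lower-variance asset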
def getIVP(cov, **kargs):
    # Compute the inverse-variance portfolio
    ivp = 1. / np.diag(cov)
    ivp /= ivp.sum()
    return ivp
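A quick illustration of the inverse-variance weighting (using a made-up diagonal covariance matrix): the asset with one quarter of the variance receives four times the weight.

# Hypothetical 2-asset diagonal covariance: variances 0.04 and 0.01
cov_demo = np.array([[0.04, 0.0], [0.0, 0.01]])
print(getIVP(cov_demo))  # [0.2 0.8]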
def getHRP(cov, corr):
    # Construct a hierarchical portfolio
    dist = correlDist(corr)
    link = sch.linkage(dist, 'single')
    #plt.figure(figsize=(20, 10))
    #dn = sch.dendrogram(link, labels=cov.index.values)
    #plt.show()
    sortIx = getQuasiDiag(link)
    sortIx = corr.index[sortIx].tolist()
    hrp = getRecBipart(cov, sortIx)
    return hrp.sort_index()
def get_all_portfolios(returns):
    cov, corr = returns.cov(), returns.corr()
    hrp = getHRP(cov, corr)
    mvp = getMVP(cov)
    mvp = pd.Series(mvp, index=cov.index)
    portfolios = pd.DataFrame([mvp, hrp], index=['MVP', 'HRP']).T
    #portfolios = pd.DataFrame([ivp, hrp], index=['IVP', 'HRP']).T
    return portfolios
portfolios = get_all_portfolios(returns)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(30, 20))
ax1.pie(portfolios.iloc[:, 0]);
ax1.set_title('MVP', fontsize=30)
ax2.pie(portfolios.iloc[:, 1]);
ax2.set_title('HRP', fontsize=30)
#portfolios.plot.pie(subplots=True, figsize=(20, 10), legend=False);
The first pie chart shows the asset allocation of MVP, and the second shows HRP. We clearly see more diversification in HRP as compared to MVP. The inverse-variance portfolio (IVP), which HRP applies within each cluster, evenly spreads weights across all assets and looks similar to HRP; however, it ignores the correlation structure between the instruments, which is why it is included just for comparison purposes. Let us look at the backtesting results.
Insample_Result=pd.DataFrame(np.dot(returns,np.array(portfolios)), \
columns=['MVP', 'HRP'], index = returns.index)
OutOfSample_Result=pd.DataFrame(np.dot(returns_test,np.array(portfolios)), \
columns=['MVP', 'HRP'], index = returns_test.index)
Insample_Result.cumsum().plot(figsize=(10, 5), title ="In-Sample Results")
OutOfSample_Result.cumsum().plot(figsize=(10, 5), title ="Out Of Sample Results")
#In-sample results
stddev = Insample_Result.std() * np.sqrt(252)
sharp_ratio = (Insample_Result.mean() * np.sqrt(252)) / Insample_Result.std()
Results = pd.DataFrame(dict(stdev=stddev, sharp_ratio = sharp_ratio))
Results
|     | stdev | sharp_ratio |
|---|---|---|
| MVP | 0.085516 | 0.785019 |
| HRP | 0.126944 | 0.523599 |
#Out-of-sample results
stddev_oos = OutOfSample_Result.std() * np.sqrt(252)
sharp_ratio_oos = (OutOfSample_Result.mean() * np.sqrt(252)) / OutOfSample_Result.std()
Results_oos = pd.DataFrame(dict(stdev_oos=stddev_oos, sharp_ratio_oos = sharp_ratio_oos))
Results_oos
|     | stdev_oos | sharp_ratio_oos |
|---|---|---|
| MVP | 0.102761 | 0.786621 |
| HRP | 0.125610 | 0.836159 |
Although the in-sample results of MVP look promising, the out-of-sample Sharpe ratio and overall return of the portfolio constructed using the hierarchical clustering approach are better. The diversification that HRP achieves across uncorrelated assets makes the methodology more robust against shocks.
Conclusion
Markowitz’s minimum-variance-based portfolio allocation is less diverse and is concentrated in a few stocks. In the hierarchical clustering-based allocation, the weights are more diverse and distributed across many assets. Such portfolios therefore offer better tail-risk management.
Finally, we looked at the backtesting framework, which enables us to compute and analyze the in-sample and out-of-sample return and Sharpe ratio of each hypothetical portfolio. This helps in determining the best performer among the portfolio allocation methods. The hierarchical clustering approach yielded the best out-of-sample results.