Suppose this is your first time writing code. Perhaps you want to run a simple regression on two series of asset prices to find the equity beta. Let's take a step-by-step approach to complete the task.
Step 1: Download two assets' prices from the web
Step 2: Put them into matrix form
Step 3: Run the OLS
Step 4: Plot data
We will use the yfinance package (https://pypi.org/project/yfinance/) to download Yahoo Finance data from the web. We need to (1) install and (2) import this package.
!pip install yfinance # installs the package; comment this line out once it is installed
import yfinance as yf # to import
Successfully installed requests-2.28.1 yfinance-0.1.85
# download monthly adjusted closing prices for TSLA and SPY
mystock = yf.download("TSLA", start="2011-01-01", end="2022-05-31", interval='1mo')['Adj Close'].rename('TSLA')
index = yf.download("SPY", start="2011-01-01", end="2022-05-31", interval='1mo')['Adj Close'].rename('SPY')
We need the pandas module, so let's install and import it. https://pandas.pydata.org/
#!pip install pandas # Actually, you already have this if you installed Anaconda.
import pandas as pd
# combine the two price series into one table, called a pandas DataFrame
data = pd.concat([mystock, index], axis=1)
# drop missing observations
data2 = data.dropna()
# compute monthly returns and drop the first observation
data3 = data2.pct_change().dropna()
data3
Date | TSLA | SPY |
---|---|---|
2011-02-01 | -0.008714 | 0.034738 |
2011-03-01 | 0.161574 | -0.004206 |
2011-04-01 | -0.005405 | 0.033432 |
2011-05-01 | 0.092029 | -0.011215 |
2011-06-01 | -0.033510 | -0.021720 |
... | ... | ... |
2022-01-01 | -0.113609 | -0.049413 |
2022-02-01 | -0.070768 | -0.029517 |
2022-03-01 | 0.238009 | 0.034377 |
2022-04-01 | -0.191945 | -0.084935 |
2022-05-01 | -0.129197 | 0.002257 |
136 rows × 2 columns
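To make the return calculation concrete, here is a minimal sketch on made-up prices (the numbers and the column names `A` and `B` are illustrative assumptions, not market data). `pct_change()` computes P_t / P_{t-1} - 1 row by row, so the first row has no prior price and comes out as NaN, which is why it is dropped.

```python
import pandas as pd

# hypothetical month-start prices for two made-up assets
prices = pd.DataFrame(
    {"A": [100.0, 110.0, 99.0], "B": [50.0, 50.0, 55.0]},
    index=pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01"]),
)

# simple return: P_t / P_{t-1} - 1; the first row is NaN
returns = prices.pct_change()

# the same calculation written out by hand
manual = prices / prices.shift(1) - 1

# drop the first (NaN) row, as in the cell above
returns = returns.dropna()
manual = manual.dropna()
print(returns)
```

Three price observations yield two returns, just as the 137 monthly prices above yield 136 returns.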
# need to import matplotlib. You already have this in your Jupyter environment, so no need to install.
import matplotlib.pyplot as plt
data3.plot(subplots=False, figsize=(10, 6)) # plot returns to see volatility levels
(data2 / data2.iloc[0] * 100).plot(figsize = (10, 6), subplots=False) # plot the wealth change of $100 investment over time
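The $100 wealth line can be built in two equivalent ways: by rescaling prices so the first observation equals 100, or by compounding the simple returns. The sketch below checks this on made-up prices (the values and the name `A` are assumptions for illustration only).

```python
import pandas as pd

# hypothetical prices for one made-up asset
prices = pd.Series([20.0, 25.0, 22.5, 27.0], name="A")

# route 1: rescale prices so the first observation equals 100
wealth_from_prices = prices / prices.iloc[0] * 100

# route 2: compound the simple returns, starting from $100
returns = prices.pct_change().fillna(0)  # the first period has no return, so use 0
wealth_from_returns = 100 * (1 + returns).cumprod()

# the two routes agree up to floating-point error
print((wealth_from_prices - wealth_from_returns).abs().max())
```

Because the notebook uses adjusted prices, route 1 is enough there; route 2 is handy when only a return series is available.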
data3.plot.hist(bins=50, alpha=0.7, edgecolor='black', subplots=False, figsize=(10,6))
data3.plot.scatter(x='SPY', y='TSLA', c='blue',figsize=(10,6))
We need to install and import the statsmodels module. https://www.statsmodels.org/stable/index.html
#!pip install statsmodels
import statsmodels.formula.api as smf
import statsmodels.api as sm
# run OLS
formula = 'TSLA ~ SPY' # set dep var and indep var
results = smf.ols(formula, data3).fit() # run OLS
print(results.summary()) # print
                            OLS Regression Results
==============================================================================
Dep. Variable:                   TSLA   R-squared:                       0.159
Model:                            OLS   Adj. R-squared:                  0.153
Method:                 Least Squares   F-statistic:                     25.35
Date:                Sun, 13 Nov 2022   Prob (F-statistic):           1.51e-06
Time:                        19:48:56   Log-Likelihood:                 54.624
No. Observations:                 136   AIC:                            -105.2
Df Residuals:                     134   BIC:                            -99.42
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0319      0.015      2.198      0.030       0.003       0.061
SPY            1.7553      0.349      5.035      0.000       1.066       2.445
==============================================================================
Omnibus:                       43.834   Durbin-Watson:                   1.592
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              109.883
Skew:                           1.285   Prob(JB):                     1.38e-24
Kurtosis:                       6.576   Cond. No.                         24.9
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
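The `std err` and `t` columns of the summary can be reproduced by hand, which helps when reading the table. Below is a minimal sketch on simulated returns (the seed, the return parameters, and the "true" beta of 1.5 are arbitrary assumptions, not the TSLA/SPY data). It uses the nonrobust formulas: residual variance s² = e'e / (n - k), coefficient covariance s²(X'X)⁻¹, and t = coef / std err.

```python
import numpy as np

rng = np.random.default_rng(0)                 # arbitrary seed, for reproducibility
n = 136                                        # same sample size as above; data are simulated
x = rng.normal(0.01, 0.04, n)                  # made-up market returns
y = 0.02 + 1.5 * x + rng.normal(0, 0.15, n)    # made-up stock returns

X = np.column_stack([np.ones(n), x])           # add the constant column
k = X.shape[1]                                 # number of estimated coefficients

coef = np.linalg.solve(X.T @ X, X.T @ y)       # OLS estimates [intercept, slope]
resid = y - X @ coef
s2 = resid @ resid / (n - k)                   # residual variance with n - k degrees of freedom
cov_coef = s2 * np.linalg.inv(X.T @ X)         # nonrobust covariance of the estimates
std_err = np.sqrt(np.diag(cov_coef))
t_stat = coef / std_err

r_squared = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
print(coef.round(4), std_err.round(4), t_stat.round(3), round(r_squared, 3))
```

The same arithmetic applies to the table above: 1.7553 / 0.349 ≈ 5.03, which matches the reported t of 5.035 up to rounding of the standard error.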
We need to install and import the matplotlib module. https://matplotlib.org/
#!pip install matplotlib #again, if you installed Anaconda, you have this already.
import matplotlib.pyplot as plt
fig, ax=plt.subplots(figsize=(10,6))
fig = sm.graphics.plot_partregress_grid(results, fig=fig)
#!pip install scipy # Again, you probably have this installed in your Jupyter environment already
from scipy import stats
beta,alpha,r_value,p_value,std_err = stats.linregress(data3['SPY'],data3["TSLA"])
print(beta.round(4))
print(alpha.round(4))
print(r_value.round(2))
print(p_value.round(4))
1.7553
0.0319
0.4
0.0
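The `r_value` returned by `linregress` is the correlation between the two series, and it ties the other numbers together: the slope equals r times the ratio of standard deviations, and R² equals r². Here is a minimal sketch on simulated data (the seed and parameters are arbitrary assumptions) that rebuilds the slope, intercept, and R² from the correlation alone.

```python
import numpy as np

rng = np.random.default_rng(1)                  # arbitrary seed
x = rng.normal(0.01, 0.04, 200)                 # made-up market returns
y = 0.02 + 1.5 * x + rng.normal(0, 0.15, 200)   # made-up stock returns

r = np.corrcoef(x, y)[0, 1]                     # sample correlation
slope = r * y.std(ddof=1) / x.std(ddof=1)       # beta = r * sd(y) / sd(x)
intercept = y.mean() - slope * x.mean()         # the fitted line passes through the means
r_squared = r ** 2                              # holds for a single regressor

print(round(slope, 4), round(intercept, 4), round(r_squared, 4))
```

In the output above, 0.4² = 0.16, consistent with the R-squared of 0.159 in the statsmodels summary once the rounding of `r_value` is taken into account.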
# find the annualized covariance matrix (monthly covariance times 12)
cov = data3.cov() * 12
print(cov)
print('\n') # to give a space
print(round(cov.iloc[0,1]/cov.iloc[1,1], 4))
          TSLA       SPY
TSLA  0.376964  0.034167
SPY   0.034167  0.019465


1.7553
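The ratio printed last matches the regression slope because, with a single regressor, the OLS slope is the covariance of the two return series divided by the variance of the regressor. Writing $r_i$ for TSLA returns and $r_m$ for SPY returns (notation introduced here for illustration), the annualization factor of 12 cancels, so monthly and annualized covariances give the same beta:

$$ \hat{\beta} = \frac{\operatorname{Cov}(r_i, r_m)}{\operatorname{Var}(r_m)} = \frac{12\,\operatorname{Cov}(r_i, r_m)}{12\,\operatorname{Var}(r_m)} = \frac{0.034167}{0.019465} \approx 1.7553 $$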
We need to install and import NumPy. You probably have it already, so skip the installation and just import it. https://numpy.org/
$$ b=\begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{k} \end{bmatrix}= (X^{\prime}X)^{-1}X^{\prime}Y $$
So the OLS coefficient estimate equals the inverse of (X transpose times X), multiplied by (X transpose times the Y vector).
# warnings are annoying, so I include the lines below to suppress them. You do not need to do this.
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
import numpy as np
X = data3['SPY']
y = data3['TSLA']
X_ols = sm.add_constant(X) # add a constant vector
#print(X_ols)
# compute beta using matrix operation
beta = np.linalg.inv(X_ols.T.dot(X_ols)).dot(X_ols.T.dot(y))
print(round(beta[1], 4))
1.7553
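The explicit inverse mirrors the textbook formula, but in practice the normal equations are usually solved without forming the inverse, which tends to be numerically more stable. The sketch below, on simulated data (the seed and parameters are arbitrary assumptions), compares three routes that should agree: `np.linalg.inv`, `np.linalg.solve`, and `np.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(2)                   # arbitrary seed
n = 120
x = rng.normal(0.01, 0.04, n)                    # made-up market returns
y = 0.02 + 1.5 * x + rng.normal(0, 0.15, n)      # made-up stock returns
X = np.column_stack([np.ones(n), x])             # constant column plus regressor

b_inv = np.linalg.inv(X.T @ X) @ (X.T @ y)       # textbook formula (X'X)^{-1} X'Y
b_solve = np.linalg.solve(X.T @ X, X.T @ y)      # solve (X'X) b = X'Y directly
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares on X itself

print(b_inv.round(4), b_solve.round(4), b_lstsq.round(4))
```

With two well-scaled columns all three give the same answer; the difference matters mainly when regressors are nearly collinear.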
type(y)
pandas.core.series.Series
x = np.array(X).reshape((-1, 1))
x.shape
(136, 1)
y=np.array(y).reshape((-1,1))
y.shape
(136, 1)
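The reshape is needed because scikit-learn expects the regressors as a 2-D array with one row per observation and one column per variable. Here is a minimal sketch of what `reshape(-1, 1)` does, using made-up numbers; `to_frame()` is shown as an alternative way to get a 2-D object from a Series.

```python
import numpy as np
import pandas as pd

s = pd.Series([0.01, -0.02, 0.03], name="SPY")  # made-up returns

one_d = s.to_numpy()            # 1-D array
two_d = one_d.reshape(-1, 1)    # 2-D column; -1 lets NumPy infer the row count
frame = s.to_frame()            # a one-column DataFrame is also 2-D

print(one_d.shape, two_d.shape, frame.shape)  # prints: (3,) (3, 1) (3, 1)
```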
from sklearn.linear_model import LinearRegression
model = LinearRegression().fit(x,y)
r_sq = model.score(x, y)
print(f"coefficient of determination: {r_sq}")
print(f"intercept: {model.intercept_}")
print(f"slope: {model.coef_}")
coefficient of determination: 0.1590925668519182
intercept: [0.03187907]
slope: [[1.75529024]]