Suppose this is your first time writing code. Perhaps you want to run a simple regression using two series of asset prices to find the equity beta. Let's use a step-by-step approach to complete the task.
Step 1: Download two assets' prices from the web
Step 2: Put them into matrix form
Step 3: Run the OLS
Step 4: Plot data
We will use the yfinance package (https://pypi.org/project/yfinance/) to download Yahoo Finance data from the web. We need to (1) install and (2) import this package.
!pip install yfinance # installs the package; once it is installed, comment this line out with a leading #
import yfinance as yf # to import
# download
mystock = yf.download("TSLA", start="2011-01-01", end="2022-05-31", interval='1mo')['Adj Close'].rename('TSLA')
index = yf.download("SPY", start="2011-01-01", end="2022-05-31", interval='1mo')['Adj Close'].rename('SPY')
[*********************100%***********************] 1 of 1 completed
[*********************100%***********************] 1 of 1 completed
We need the pandas module, so let's install and import it. https://pandas.pydata.org/
#!pip install pandas # Actually, you already have this if you installed Anaconda.
import pandas as pd
# combine the two price series into one table, called a pandas DataFrame
data = pd.concat([mystock, index], axis=1)
# drop missing observations
data2 = data.dropna()
# compute monthly returns and drop the first observation
data3 = data2.pct_change().dropna()
data3
Date | TSLA | SPY |
---|---|---|
2011-02-01 | -0.008714 | 0.034737 |
2011-03-01 | 0.161574 | -0.004206 |
2011-04-01 | -0.005405 | 0.033431 |
2011-05-01 | 0.092029 | -0.011214 |
2011-06-01 | -0.033510 | -0.021720 |
... | ... | ... |
2022-01-01 | -0.113609 | -0.049413 |
2022-02-01 | -0.070768 | -0.029517 |
2022-03-01 | 0.238009 | 0.034377 |
2022-04-01 | -0.191945 | -0.084935 |
2022-05-01 | -0.129197 | 0.002257 |
136 rows × 2 columns
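To see what `pct_change().dropna()` does, here is a minimal sketch on a tiny made-up price table. The asset names `A` and `B` and their prices are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
import pandas as pd

# Hypothetical prices for two made-up assets (not real market data)
prices = pd.DataFrame(
    {"A": [100.0, 110.0, 99.0], "B": [50.0, 50.0, 55.0]},
    index=pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01"]),
)

# pct_change computes P_t / P_(t-1) - 1 for each column.
# The first row has no previous price, so it is NaN and dropna() removes it.
returns = prices.pct_change().dropna()
print(returns)
```

Three prices give two returns, which is why the table above starts in 2011-02 even though the download starts in 2011-01.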
We need to install and import the statsmodels module. https://www.statsmodels.org/stable/index.html
#!pip install statsmodels
import statsmodels.formula.api as smf
import statsmodels.api as sm
# run OLS
formula = 'TSLA ~ SPY' # set dep var and indep var
results = smf.ols(formula, data3).fit() # run OLS
print(results.summary()) # print
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                   TSLA   R-squared:                       0.159
Model:                            OLS   Adj. R-squared:                  0.153
Method:                 Least Squares   F-statistic:                     25.35
Date:                Fri, 26 Aug 2022   Prob (F-statistic):           1.51e-06
Time:                        06:44:56   Log-Likelihood:                 54.624
No. Observations:                 136   AIC:                            -105.2
Df Residuals:                     134   BIC:                            -99.42
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0319      0.015      2.198      0.030       0.003       0.061
SPY            1.7553      0.349      5.035      0.000       1.066       2.445
==============================================================================
Omnibus:                       43.835   Durbin-Watson:                   1.592
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              109.887
Skew:                           1.285   Prob(JB):                     1.37e-24
Kurtosis:                       6.576   Cond. No.                         24.9
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
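In the summary, the SPY coefficient (1.7553) is the estimated beta and the Intercept (0.0319) is the estimated alpha. If you want these numbers as variables rather than reading them off the printout, the fitted results object exposes them directly. The sketch below uses synthetic returns so that it runs without a download; the column names `stock` and `market` and the true slope of 1.5 are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic returns (made up for illustration, not market data)
rng = np.random.default_rng(0)
market = rng.normal(0.01, 0.04, size=500)
stock = 0.002 + 1.5 * market + rng.normal(0.0, 0.02, size=500)
toy = pd.DataFrame({"stock": stock, "market": market})

fit = smf.ols("stock ~ market", toy).fit()

alpha_hat = fit.params["Intercept"]      # estimated intercept (alpha)
beta_hat = fit.params["market"]          # estimated slope (beta)
beta_se = fit.bse["market"]              # standard error of the slope
beta_ci = fit.conf_int().loc["market"]   # 95% confidence interval: columns 0 and 1

print(round(beta_hat, 4), round(beta_se, 4), round(fit.rsquared, 3))
print(beta_ci.tolist())
```

With the real data, the same attributes on `results` (for example `results.params["SPY"]`) should give the values shown in the summary table.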
We need to install and import the matplotlib module. https://matplotlib.org/
#!pip install matplotlib # again, if you installed Anaconda, you already have this.
import matplotlib.pyplot as plt
fig, ax=plt.subplots(figsize=(10,6))
fig = sm.graphics.plot_partregress_grid(results, fig=fig)
#!pip install scipy
from scipy import stats
beta,alpha,r_value,p_value,std_err = stats.linregress(data3['SPY'],data3["TSLA"])
print(beta.round(4))
print(alpha.round(4))
print(r_value.round(2))
print(p_value.round(4))
1.7553
0.0319
0.4
0.0
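# find the annualized covariance matrix (monthly covariance times 12)
cov = data3.cov() * 12
print(cov)
print('\n') # to give a space
# beta = Cov(TSLA, SPY) / Var(SPY); the factor of 12 cancels in the ratio
print(round(cov.iloc[0,1]/cov.iloc[1,1], 4))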
          TSLA       SPY
TSLA  0.376962  0.034166
SPY   0.034166  0.019465


1.7553
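The covariance approach works because, in a regression with one explanatory variable, the OLS slope equals Cov(stock, market) / Var(market). Multiplying the covariance matrix by 12 annualizes both the numerator and the denominator, so it does not change the ratio. Here is a minimal sketch that checks both points on synthetic returns; the data and the names `stock` and `market` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic returns (made up for illustration, not market data)
rng = np.random.default_rng(1)
market = rng.normal(0.01, 0.04, size=200)
stock = 1.2 * market + rng.normal(0.0, 0.03, size=200)
toy = pd.DataFrame({"stock": stock, "market": market})

cov_monthly = toy.cov()
cov_annual = cov_monthly * 12   # same scaling as in the cell above

# beta = Cov(stock, market) / Var(market)
beta_monthly = cov_monthly.loc["stock", "market"] / cov_monthly.loc["market", "market"]
beta_annual = cov_annual.loc["stock", "market"] / cov_annual.loc["market", "market"]

# OLS slope from a degree-1 polynomial fit, for comparison
slope, intercept = np.polyfit(toy["market"], toy["stock"], 1)

print(round(beta_monthly, 4), round(beta_annual, 4), round(slope, 4))
```

All three printed numbers should agree up to floating-point rounding.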
We need to install and import numpy. You probably have it already, so skip the installation and just import it. https://numpy.org/
# warnings are annoying, so I include the lines below to suppress them. You do not need to do this.
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
import numpy as np
X = data3['SPY']
y = data3['TSLA']
X_ols = sm.add_constant(X) # add a constant vector
#print(X_ols)
# compute the OLS coefficients with matrix operations: (X'X)^(-1) X'y
beta = np.linalg.inv(X_ols.T.dot(X_ols)).dot(X_ols.T.dot(y))
print(round(beta[1], 4))
1.7553
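The matrix formula used here is the OLS normal-equation solution, b = (X'X)^(-1) X'y, where the first column of X is a column of ones for the intercept. Inverting X'X explicitly is fine for learning; as a side note, `np.linalg.solve` and `np.linalg.lstsq` reach the same coefficients without forming the inverse, which is generally the numerically safer habit. The sketch below compares the three on synthetic data; the variable names and the simulated numbers are assumptions for illustration.

```python
import numpy as np

# Synthetic returns (made up for illustration, not market data)
rng = np.random.default_rng(2)
x = rng.normal(0.01, 0.04, size=120)
y = 0.005 + 1.3 * x + rng.normal(0.0, 0.03, size=120)

# Design matrix: a column of ones (intercept) next to the regressor
X = np.column_stack([np.ones_like(x), x])

b_inv = np.linalg.inv(X.T @ X) @ (X.T @ y)        # explicit inverse, as in the cell above
b_solve = np.linalg.solve(X.T @ X, X.T @ y)       # solve the normal equations directly
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares solver

print(np.round(b_inv, 4))   # [intercept, slope]
print(np.allclose(b_inv, b_solve), np.allclose(b_inv, b_lstsq))
```

In each result the second element is the slope, which corresponds to `beta[1]` in the cell above.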