We begin by constructing a model for equity price returns by decomposition into enterprise and speculative returns. That model is then tested using S&P 500 data going back to the year 1871 (well-known database assembled by Robert Shiller, Nobel laureate at Yale).
We find that the popular percentage approximation model is not appropriate as the time horizon increases. In contrast, the exactitude of the logarithmic model is demonstrated by multivariate regression.
The enterprise and speculative returns are then plotted to understand their respective relationship to price dynamics. The Great Recession produced spikes in the data unlike any other in history.
Total return is thus the sum of enterprise and speculative returns, plus dividend yield. We examine the dividend payout ratio relative to earnings, and find a local estimate for forecasting.
Finally we demonstrate forecasting using the above decomposition of returns and the most recent data. Those results are then compared to an univariate Holt-Winters time-series model in Appendix 1.
Appendices 2 and 3 covers the historical characteristics of the three separable components of total return.
Reference section briefly relates our findings to Bogle (2015).
VIEWING: if you encounter this message, [Math Processing Error], it means GitHub is timing out on rendering LaTeX equations. Please view this notebook at Jupyter instead. Issue #3 has been raised in this regard.
Dependencies:
CHANGE LOG
2016-03-30 Raise LaTeX math issue #3, and provide alternative Jupyter view.
Total return summary revised.
2016-03-28 Refactor data retrieval to yi_quandl module.
2016-03-18 Rough draft, test data source directly.
from fecon235.fecon235 import *
# PREAMBLE-p6.15.1223 :: Settings and system details
from __future__ import absolute_import, print_function
system.specs()
pwd = system.getpwd() # present working directory as variable.
print(" :: $pwd:", pwd)
# If a module is modified, automatically reload it:
%load_ext autoreload
%autoreload 2
# Use 0 to disable this feature.
# Notebook DISPLAY options:
# Represent pandas DataFrames as text; not HTML representation:
import pandas as pd
pd.set_option( 'display.notebook_repr_html', False )
# Beware, for MATH display, use %%latex, NOT the following:
# from IPython.display import Math
# from IPython.display import Latex
from IPython.display import HTML # useful for snippets
# e.g. HTML('<iframe src=http://en.mobile.wikipedia.org/?useformat=mobile width=700 height=350></iframe>')
from IPython.display import Image
# e.g. Image(filename='holt-winters-equations.png', embed=True) # url= also works
from IPython.display import YouTubeVideo
# e.g. YouTubeVideo('1j_HxD4iLn8', start='43', width=600, height=400)
from IPython.core import page
get_ipython().set_hook('show_in_pager', page.as_hook(page.display_page), 0)
# Or equivalently in config file: "InteractiveShell.display_page = True",
# which will display results in secondary notebook pager frame in a cell.
# Generate PLOTS inside notebook, "inline" generates static png:
%matplotlib inline
# "notebook" argument allows interactive zoom and resize.
We will test a model for equity price returns which shall exclude the usual considerations for interest rates, the discounting of dividends, and the so-called "equity risk premium."
Let us begin with the obvious:
$$ p_t = x_t s_t \qquad\hbox{where}\quad s_t = p_t / x_t $$Price per share at time t will be represented by $p_t$, and the earnings per share by $x_t$. Thus $s_t$ is the usual price/earnings ratio.
Taking the log of both sides, we can get two equations for different epochs:
$$ \eqalign{ \log p_t &= \log x_t + \log s_t \cr \log p_{t-n} &= \log x_{t-n} + \log s_{t-n} \cr}$$Subtracting the second equation from the first, given $n > 0$, we obtain:
$$ \eqalign{ \log {p_t \over p_{t-n}} &= \log {x_t \over x_{t-n}} + \log {s_t \over s_{t-n}} \cr P_t &= X_t + S_t \qquad\hbox{for some implicit } n.\cr }$$Note: $\log(1+\psi) \approx \psi$, for small $\psi$ where $\psi$ can be expressed in percentage terms. The capitalized variables thus can be approximately interpreted as follows: percentage price change $P_t$ is equal to percentage earnings growth $X_t$ plus the percentage change in price/earnings ratio $S_t$ -- over $n$ periods.
However, more exactly, a capitalized variable is the difference in logarithmic values of the lower case variables over $n$ periods. As we shall see, $X_t$ can be interpreted as enterprise return, whereas $S_t$ can be characterized as speculative return.
We shall rely upon Quandl MULTPL database which actively updates the famous series made public by Robert Shiller in his Nobel prize-winning work. Shiller keeps his data in an Excel spreadsheet, but our downloaded data will be in pandas Dataframe format. We have also doubled-checked the data with CSV formatted okfn.org and its GitHub datasets repository.
The economics department at Yale also maintains the Shiller database at Quandl YALE, but unfortunately, there seems to be a nine month latency. Here is Shiller's own description of the data:
"Stock market data used in my book, Irrational Exuberance [Princeton University Press 2000, Broadway Books 2001, 2nd ed., 2005] are available for download: U.S. Stock Markets 1871-Present and CAPE Ratio. This data set consists of monthly stock price, dividends, and earnings data and the consumer price index (to allow conversion to real values), all starting January 1871. The price, dividend, and earnings series are from the same sources as described in Chapter 26 of my earlier book (Market Volatility [Cambridge, MA: MIT Press, 1989]), although now I use monthly data, rather than annual data. Monthly dividend and earnings data are computed from the S&P four-quarter totals for the quarter since 1926, with linear interpolation to monthly figures. Dividend and earnings data before 1926 are from Cowles and associates, Common Stock Indexes [2nd ed. Principia Press 1939], interpolated from annual data. Stock price data are monthly averages of daily closing prices."
For our earnings data, we shall ignore Shiller's CAPE ratio, and instead use the raw nominal data (not inflation-adjusted), although real data has also been reconstructed by Shiller.
The explicit munging details of the download functions are located in our yi_quandl module.
# m4 implies monthly frequency:
# price:
p = get( m4spx_1871_p )
# earnings over 12-months:
x = get( m4spx_1871_e )
# We want non-inflation adjusted data for this notebook.
# Compute price/earnings ratio:
s = todf( p / x )
# Apply numpy natural logarithm:
plog = np.log( p )
xlog = np.log( x )
slog = np.log( s )
START specific tests from the next cell:
# RE-RUN the REGRESSION starting from this cell, varying specs:
# Raw data starts 1871-01-01, but shall allow lags up to 24 months:
start = '1873-01-01'
end = '2100-01-01'
# Compute changes over freq of n periods:
n = 12
# Percentage transform is approximate, but widely used,
# whereas logarithmic transform gives exact results.
transform = "%"
transform = "log"
## This is the approximate PERCENTAGE transforms:
if transform == "%":
P = div( pcent(p, n), n ) * 12
X = div( pcent(x, n), n ) * 12
S = div( pcent(s, n), n ) * 12
# div() is our numpy function for Python 2 and 3 division.
# The div() and the 12 annualizes the percentage.
## This is the exact LOG differencing transforms:
if transform == "log":
P = div( dif(plog, n), n ) * 12
X = div( dif(xlog, n), n ) * 12
S = div( dif(slog, n), n ) * 12
# div() is our numpy function for Python 2 and 3 division.
# The div() and the 12 annualizes the result.
# Create a mega Dataframe:
mega = paste([ p, x, s, P, X, S ])
# Then name the columns:
mega.columns = ["p", "x", "s", "P", "X", "S"]
stats( mega )
The correlation matrix above gives us some partial insight into pairwise relationships. We would expect P and X to be strongly correlated.
Next we conduct a multivariate regression by specifying the form of the theoretical model. We suppress estimation of the intercept by using "0" within regressformula(). Obviously, the coefficients for X and S should to be close to 1 -- but the results are quite unexpected for the approximative model.
# Get more info on regression by: regressformula??
result = regressformula( mega[start:end], "P ~ 0 + X + S" )
# ^possible truncation of Dataframe
print( result.summary() )
print("Confirming: transform=", transform, ", n=", n, ", start=", start, ", end=", end)
END specific tests from multivariate regression.
In many studies, the percentage approximation is used to examine returns, perhaps because conceptually it is readily understood. By executing the above regression while commenting out the transform="log" line, we can replicate the following results for various n values. As n increases, the R-squared statistic worsens considerably and the coefficients deviate farther away from the theoretical value of 1. Let $\varepsilon_t$ be the estimation error, then:
$$\eqalign{ \varepsilon_t &= \hat{P_t} - P_t \cr \hat{P_t} &= 0.78 X_t + 0.91 S_t + \varepsilon_t \qquad \hbox{for } n=1 \quad R^2=0.90 \cr \hat{P_t} &= 0.41 X_t + 0.61 S_t + \varepsilon_t \qquad \hbox{for } n=3 \quad R^2=0.57 \cr \hat{P_t} &= 0.19 X_t + 0.30 S_t + \varepsilon_t \qquad \hbox{for } n=6 \quad R^2=0.27 \cr \hat{P_t} &= 0.15 X_t + 0.23 S_t + \varepsilon_t \qquad \hbox{for } n=12 \quad R^2=0.21 \cr }$$The errors can be traced to the data reality that $\psi$ cannot be assumed to be small, for there were indeed some extreme changes in the course of 143 years. Lengthening $n$ also contributes to violating that numerical approximation.
Thankfully the data (and algorithms) confirm the mathematical derivation of our model when the transform="log" line is in effect:
$$\eqalign{ \hat{P_t} &= 1.00 X_t + 1.00 S_t + \varepsilon_t \qquad \hbox{for all } n \quad R^2=1.00 \cr }$$Increasing $n$ has no effect on our results, which was entirely expected. This can be verified by re-executing the appropriate sections of this notebook with different parameters.
In view of our findings, we make the this revision: the logarithmic price change $P_t$ is exactly equal to the logarithmic earnings change $X_t$ plus the change in logarithmic price/earnings ratio $S_t$ -- over $n$ periods.
The pairwise correlations, however, were somewhat surprisingly low. For example, for $n=12$, the correlation between $P_t$ and $X_t$ was 0.37, and the correlation between $P_t$ and $S_t$ was 0.28. But of course, as an ensemble, they mesh together perfectly.
The two variables $X_t$ and $S_t$ have the same functional form and the same coefficient weights, yet the mean of the former is 0.03781 and the mean of the latter 0.00432, which differs by a factor of 8.75 (when $n=12$). This indicates earnings growth contributes far more to price appreciation than variation in the price/earnings ratio. However, in Appendix 2 we see how the volatility of the price/earnings ratio could obscure this finding over a short observation period.
Since our data has monthly frequency, we shall focus on cases where $n$ is a multiple of 12. This is to mitigate the possibility of observing seasonal effects.
Since the logarithms of the raw data are clearly crucial, let us see their stats:
rawlog = paste([plog, xlog, slog])
rawlog.columns = ["pl", "xl", "sl"]
rawlog = rawlog[start:end]
# We intentionally use "start" and "end"
# to remain consistent with above analysis.
stats( rawlog )
Next we visualise their history. Over the long-term, the logarithmic scale (on the y-axis) is more suitable than the ordinary scale to discern changes in value. For example, the drop in logarithmic earnings during the Great Recession was surprisingly far greater than during the Great Depression!
# log of SPX price:
plot( rawlog.pl )
# log of SPX earnings:
plot( rawlog.xl )
# log of SPX price/earnings ratio:
plot( rawlog.sl )
Looking for possible outliers, we are struck by the violent downward spike occuring during the Great Recession in the earnings data.
Looking for possible trends, we can see how the price dynamics is fundamentally related to earnings growth. In fact, the correlation between pl and xl is 0.98 -- which justifies calling earnings growth $X_t$: the enterprise return portion of total return (to be discussed in the final sections).
Prior to 1989, sl looked almost trendless, bouncing roughly between 2 and 3. Now raw s is the price/earnings ratio, which is basically the multiplier to earnings at which an "investor" has implicitly paid at market. An investor's optimism for a higher re-evaluation, from other investors, of some given earnings level can be encapsulated by positive change in the price/earnings rato. This is the reason behind calling $S_t$ the market's speculative return.
# This is raw price/earnings:
stat( s )
The max on raw price/earnings is 123.73 which is not an outlier but due to the dramatic drop in earnings during the Great Recession (quite evident from the plots). By restricting the datetime index to end at 2000-01-01, we estimate the all-time mean price/earnings to be around 15 (note $log 15 = 2.71$ and our remark on trendless sl). The standard deviation drops from 8.4 to 4.6 due to that restriction.
# Price/earnings prior to the year 2000:
stat( s[:'2000-01-01'] )
Roughly speaking, dividends are taken out of earnings and paid out to investors. We are interested in the payout ratio which will later help give better forecasts of the dividend yield.
# Dividends paid over 12-months, monthly frequency:
d = get( m4spx_1871_d )
## Uncomment to see raw 12-month dividend:
# d
# We want to compute the PAYOUT ratio = dividends/earnings
divearn = todf( d / x )
## Uncomment to see payout ratio data:
# divearn
# Plot payout ratio = dividends/earnings:
plot( divearn )
stat( divearn )
On average, we would expect divearn to be less than 1, since earnings are the primary source for paying out the dividends.
Sometimes, however, entreprises may hold steady their dividends while earnings plunge -- for example, during the Great Recession, the payout ratio went as high as 4. But as the plot shows this is highly unusual.
Historically we can observe a slight downward trend in the payout ratio over the very long-term. The mean payout ratio is 0.61.
The total return $T_t$ for equities is composed of the logarithmic price change plus the dividend yield, over $n$ periods. We shall let $\lambda$ be the dividend payout ratio relative to earnings $x_i$, such that the index $i$ is understood to cover periods $t-n$ through $t$. Thus, we can write:
$$ T_t = P_t + {{\lambda \sum x_i } \over p_{t-n}} $$Inserting our model in place of $P_t$, we obtain total return as follows:
$$ T_t = X_t + S_t + {{\lambda \sum x_i } \over p_{t-n}} $$The term lambda $\lambda$ should be a stochastic variable, but we shall treat it locally as a constant (see plot of divearn). We utilize the median, rather than the mean, since that is a far more robust statistic.
To estimate the other variables we localize our data to the last three years, which gives 36 monthly points of data.
# For localizing forecasts:
nlocal = 36
# We can also make "backcasts" by backdating this date variable,
# or specify a date far in the future to use the most current data:
last = '2100-01-01'
# Estimated local lambda:
Elambda = np.median( tail(divearn[:last], nlocal) )
Elambda
# Estimated annualized dividend yield "dy" as percentage:
Edy = Elambda * (tailvalue(x[:last]) / tailvalue(p[:last])) * 100
Edy
Here we used a robust local estimate of lambda, the dividend payout ratio relative to earnings, to forecast dividend yield using the "last" known earnings and price levels.
The history of dividend yields is examined in Appendix 3.
We tested our model using the change in logarithmic earnings, so here we will use the geometric mean return on the localized raw data.
The geometric return is akin to our logarithmic model, and is to be distinguished from the terrible arithmetic return, though it too can be expressed in percentage form.
# Get more info by executing: georet??
Xgeoret = georet( tail(x[:last], nlocal), yearly=12 )
Xgeoret
# Geometric return is expressed as percentage,
# so here is our estimator for big X:
EX = Xgeoret[0]
EX
We tested our model using the change in logarithmic price/earnings ratio, so here we will use the geometric mean return on the localized raw data.
# yearly argument in georet annualizes:
Sgeoret = georet( tail(s[:last], nlocal), yearly=12 )
Sgeoret
# Geometric return is expressed as percentage,
# so here our estimator for big S:
ES = Sgeoret[0]
ES
In terms of our Python variables, here is our model:
$$ \hat{T} = \hat{X} + \hat{S} + \hat{dy} $$where an "E" prefix is used to represent estimated hat variables.
# FORECAST of TOTAL RETURN, finally...
ET = EX + ES + Edy
ET
print("FORECAST SUMMARY, given 'last' set at", last)
print("TOTAL RETURN (per annum) forecast: ", ET, "%")
print(" - Enterprise return: ", EX, "%")
print(" - Speculative return: ", ES, "%")
print(" - Dividend yield: ", Edy, "%")
print("Dividend payout ratio:", Elambda, "(relative to earnings)")
print("Price/earnings ratio, last:", tailvalue( s[:last] ))
Let us apply the Holt-Winters method to logarithmic price alone. This is a pure time-series technique for the purpose of cross-checking the multivariate regression model which we used for forecasting.
# Forecast 12-months ahead:
fc = forecast( plog[:last], h=12 )
## Uncomment to see forecasts:
# fc
# Change in logarithmic price:
fc.loc[12] - fc.loc[0]
last
2016-03-18: The forecasted price return is about -5.31% per annum, so adding the estimated dividend yield of +1.74%, the forecasted total return is -3.57%. Thus the univariate method on price alone differs considerably from our multivariate model which forecasted +8.76% for total return.
Averaging forecasts from different techniques usually increases accuracy. Combining the point forecasts for total return from two models: +2.6% average.
We examine their respective geometric mean returns over the entire Shiller database.
# Enterprise return since 1871:
georet( x, yearly=12 )
# Speculative return since 1871:
georet( s, yearly=12 )
By examining their geometric mean returns, we see that enterprise returns (3.62%) contributes far more to the total return than speculative returns (0.42%).
Our georet() also computes arithmetic mean return and its standard deviation (second and third elements respectively). Speculative returns contributes more volatility in observed total returns than enterprise returns.
Among the three components of total return, the dividend yield has on average has made the largest contribution historically. Also, it has never contributed negatively to total return.
# Express dividend yield as percentage per annum:
dy = todf( (d/p) * 100 )
stat( dy )
The average dividend yield since 1871 has been 4.4% per annum. But the plot of the exponentially smoothed values seems to indicate a steady decline over time.
Continuance of such a decline in dividend yield would then make earnings growth the main driver of total return in the 21st century.
# History of dividend yields:
plot( ema(dy, alpha=0.10) )
# ema means Exponential Moving Average.
Keynes (1936) enumerated two sources of stock returns: enterprise and speculative. Enterprise refers to the actual business results of corporations. Speculation, on the other hand, involved “forecasting the psychology of the market,” the ephemeral, emotional response of investors seeking to anticipate how other investors will react to particular events.
Keynes made no known attempt to quantify the distinction between enterprise and speculation, so Bogle tackled the task by developing his BSRM/S model, "Bogle Sources of Return Model for Stocks."
Bogle (2015) states:
"The sheer simplicity of the BSRM/S commends it over highly complex models, none of which (as far as we can determine) has offered similar accuracy."
This is quite a claim since Bogle has been involved in finance since 1951 (Princeton honors thesis), and is the founder of The Vanguard Group.
Bogle (2015) does not show how he arrived at his model mathematically:
"Although we find this validation of the model gratifying, we suspect a fit this close to the actual total return has a lot to do with luck."
Perhaps he was lucky in terms of small errors between realized and out-of-samples results, but the analytics as shown in this notebook is sound.
John Bogle sought a deeper reasoning behind his expectations of future total return from a portfolio of equities. A sample explanation could be phrased as follows: "I expect annual earning growth of 8%, and in ten years a terminal price/earning ratio of 14 -- given that ratio is 15.5 currently, such a 1% reduction per annum will decrease the total return by like amount. Therefore, with a dividend yield of 3%, my expected total return is 8% - 1% + 3% = 10% over the coming decade."
This notebook shows the (arithmetic) percentage reasoning is prone is large errors, whereas a logarithmic (geometric) version is exact. Shiller's reconstructed data since 1871 allows us to examine the separable components of total return: enterprise and speculative returns, plus dividend yield. An understanding of their respective contributions helps us to form better informed expectations of future total return on equities.