In answering this question, we derive a model for GDP growth based on observations from wage growth.
Dependencies:
- Linux, bash
- Python: matplotlib, pandas
- Modules: yi_1tools.py, yi_fred.py, yi_plot.py, yi_timeseries.py
CHANGE LOG
2014-12-07 Update code and commentary.
2014-08-15 First version.
# NOTEBOOK settings and system details:
# Assume that the backend is LINUX (our particular distro is Ubuntu, running bash shell):
print '\n :: TIMESTAMP of last notebook execution:'
!date
print '\n :: IPython version:'
!ipython --version
# Automatically reload modified modules:
%load_ext autoreload
%autoreload 2 # 0 will disable autoreload.
# Generate plots inside notebook:
%matplotlib inline
# DISPLAY options
from IPython.display import Image
# e.g. Image(filename='holt-winters-equations.png', embed=True)
from IPython.display import YouTubeVideo
# e.g. YouTubeVideo('1j_HxD4iLn8')
from IPython.display import HTML # useful for snippets
# e.g. HTML('<iframe src=http://en.mobile.wikipedia.org/?useformat=mobile width=700 height=350></iframe>')
import pandas as pd
print '\n :: pandas version:'
print pd.__version__
# pandas DataFrames are represented as HTML in the notebook by default; switch to plain text:
# [Deprecated: pd.core.format.set_printoptions( notebook_repr_html=True ) ]
pd.set_option( 'display.notebook_repr_html', False )
# MATH display, use %%latex, rather than the following:
# from IPython.display import Math
# from IPython.display import Latex
print '\n :: Working directory (set as $workd):'
workd, = !pwd
print workd + '\n'
 :: TIMESTAMP of last notebook execution:
Tue Dec 9 19:52:44 PST 2014

 :: IPython version:
2.3.0

 :: pandas version:
0.15.0

 :: Working directory (set as $workd):
/home/yaya/Dropbox/ipy/fecon235/nb
# Some useful modules:
from yi_1tools import *
from yi_fred import *
from yi_plot import *
from yi_timeseries import *
# Total US population in millions, released monthly:
pop = getfred( m4pop )/1000.0
plotfred( pop )
georet( pop, 12 )
[1.14, 1.14, 0.09, 12]
This gives the annualized geometric growth rate, but one might also look at the fertility rate that supports the population, e.g. about 2.1 children per woman is the replacement level that sustains it. (cf. the fertility rate in Japan, which has been declining over the decades.)
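As a rough illustration of what an annualized geometric growth rate means, here is a minimal sketch. The function name and the sample figures are hypothetical, and this is not necessarily how `georet` arrives at its result.

```python
def annualized_geometric_growth(first, last, periods, periods_per_year=12):
    """Annualized geometric growth rate implied by two endpoint values.

    Hypothetical helper for illustration only; not part of the yi_* modules.
    """
    years = float(periods) / periods_per_year
    return (float(last) / first) ** (1.0 / years) - 1.0

# Made-up example: a monthly series rising from 100 to 110.25 over 24 months.
rate = annualized_geometric_growth(100.0, 110.25, 24)
print("annualized growth: %.2f%%" % (rate * 100))
```

The geometric rate compounds, so it is the constant yearly rate that would carry the first value to the last one over the same span.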
# Fraction of population which works:
emppop = getfred( m4emppop )/100.0
Workers would be employed adults, which presumably excludes children (20% of pop) and most elderly persons (14% of pop). There is a dramatic drop in the working fraction, from about 64% in 2001 to about 59% recently.
plotfred( emppop )
# Total US workers in millions:
workers = todf( pop * emppop )
plotfred( workers )
georet( workers, 12 )
[1.18, 1.19, 1.17, 12]
# Deflator:
defl = getfred( m4defl )
# REAL GDP in billions, built from nominal GDP (m4gdpus) times the deflator:
gdp = getfred( m4gdpus )
# The release cycle is quarterly, but we resample to monthly,
# in order to sync with the deflator.
gdpr = todf( defl * gdp )
# We do NOT use m4gdpusr directly because that is in 2009 dollars!
# Our deflator always uses current dollars.
plotfred( gdpr )
georet( gdpr, 12 )
[2.86, 2.86, 1.08, 12]
Real GDP geometric rate of growth is 2.9% per annum (collectively due to the working population, presumably).
# Real GDP per worker -- NOT per capita:
gdprworker = todf( gdpr / workers )
plotfred( gdprworker )
# plotted in thousands of dollars
georet( gdprworker, 12 )
[1.69, 1.7, 1.37, 12]
Each worker has been more productive over the years, contributing more to GDP, at an annual pace of 1.7%.
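As a sanity check on the growth rates reported so far, the sketch below compounds the growth in the number of workers with the growth in real GDP per worker. The helper name is hypothetical, and the three series may not cover identical date ranges, so only approximate agreement with the reported real GDP growth should be expected.

```python
def compound(*rates_pct):
    """Compound several percentage growth rates into a single percentage rate."""
    total = 1.0
    for r in rates_pct:
        total *= 1.0 + r / 100.0
    return (total - 1.0) * 100.0

workers_growth = 1.18      # first element of georet( workers, 12 ), percent per annum
per_worker_growth = 1.69   # first element of georet( gdprworker, 12 ), percent per annum

implied = compound(workers_growth, per_worker_growth)
print("implied real GDP growth: %.2f%% per annum" % implied)
```

The implied figure lands close to the 2.9% per annum quoted for real GDP, which is what one would expect if GDP equals workers times GDP per worker.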
# Annual income, not real, assuming 40 hours per week, 50 weeks per year:
inc = getfred( m4wage )*2000
# But maybe working hours have decreased recently?
# REAL income in thousands per worker:
rinc = todf((defl * inc)/1000.0)
plotfred( rinc )
# Ratio of real GDP to real income per worker:
gdpinc = todf( gdprworker / rinc )
Implicitly, we are assuming all workers earn wages at the nonfarm, non-supervisory private-sector rate. That is not a bad assumption for our purposes, provided changes in labor rates apply uniformly across the various categories, since we are interested in the multiplier effect.
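The income conversion used above can be sketched on its own. The function name, the hourly wage, and the deflator value below are hypothetical, chosen only to show the arithmetic of 40 hours per week for 50 weeks per year.

```python
HOURS_PER_YEAR = 40 * 50  # 40 hours per week, 50 weeks per year

def real_income_thousands(hourly_wage, deflator):
    """Annual real income in thousands of dollars from an hourly wage.

    Hypothetical helper mirroring rinc = (defl * inc) / 1000.
    """
    annual_nominal = hourly_wage * HOURS_PER_YEAR
    return deflator * annual_nominal / 1000.0

# Made-up inputs: 20 dollars per hour, deflator of 1.0 (current dollars).
example = real_income_thousands(20.00, 1.0)
print("real annual income: %.1f thousand" % example)
```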
plotfred( gdpinc )
This means that, currently, each wage dollar paid yields $2.27 worth of products or services.
As a cross-check, we know each worker has become more productive in producing national wealth.
Hypothesis: over the years, technology has exerted upward pressure on productivity, and downward pressure on wages. In other words, the slope of gdpinc is a function of technological advances. (Look for counterexamples in other countries.)
# Fit and plot the simplified time trend:
plotfred( trend( gdpinc ))
:: regresstime slope = 0.00175176905443
Long-term: the monthly slope of about 0.00175 means that each year, on average, adds 0.021 (0.00175 times 12) to the gdpinc multiplier.
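The conversion from the fitted slope to a yearly increment is a one-line calculation, sketched here under the assumption that the slope is per monthly observation (the series were resampled to monthly frequency).

```python
monthly_slope = 0.00175176905443   # regresstime slope reported above
months_per_year = 12               # assumes one observation per month

annual_increment = monthly_slope * months_per_year
print("annual increment to multiplier: %.3f" % annual_increment)
```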
holtfred( gdpinc, 24 )
    Forecast
0   2.265296
1   2.259082
2   2.259629
3   2.260176
4   2.260724
5   2.261271
6   2.261818
7   2.262366
8   2.262913
9   2.263460
10  2.264008
11  2.264555
12  2.265102
13  2.265650
14  2.266197
15  2.266744
16  2.267292
17  2.267839
18  2.268387
19  2.268934
20  2.269481
21  2.270029
22  2.270576
23  2.271123
24  2.271671
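For readers unfamiliar with Holt-type forecasts, the following is a minimal sketch of Holt's linear (double exponential smoothing) method. It is not the implementation behind `holtfred`; the function name, the smoothing parameters `alpha` and `beta`, and the sample data are all assumptions made for illustration.

```python
def holt_linear_forecast(values, horizon, alpha=0.5, beta=0.5):
    """Forecast `horizon` steps ahead with Holt's linear method.

    Hypothetical sketch: level and trend are initialized from the first two
    points, then updated recursively over the remaining observations.
    """
    level = values[0]
    trend = values[1] - values[0]
    for x in values[1:]:
        prev_level = level
        # New level blends the observation with the previous one-step projection.
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        # New trend blends the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

# Made-up, perfectly linear series: the forecast simply continues the line.
forecast = holt_linear_forecast([1.0, 1.5, 2.0, 2.5], 2)
print(forecast)
```

On noisy data the forecast is a straight line extended from the last smoothed level at the last smoothed trend, which matches the steady increments visible in the forecast table.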
We found evidence of a time-variant multiplier $m_t$ such that $G_t = m_t w_t$. Let's focus on GDP growth, expressed as the usual percentage change.
%%latex
\begin{aligned}
\frac{G_{t+1} - G_t}{G_t} = \frac{m_{t+1} w_{t+1}}{m_t w_t} - 1
\end{aligned}
Notice that the right-hand side is just the growth rate of $m_t w_t$. So, abusing notation, we can write $\%(G) = \%(m w)$
Empirically the multiplier varies in a very linear fashion as a function of time. So let's evaluate GDP growth numerically, using the most recent multiplier and its expected historical increment, assuming wages have increased by 5% year-over-year:
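A short worked step may help here. Writing $\%(m)$ and $\%(w)$ for the growth rates of the multiplier and the wage (expressed as fractions, a notation assumed for this sketch), the growth rate of the product splits as follows:

```latex
\begin{aligned}
1 + \%(G) &= \frac{m_{t+1} w_{t+1}}{m_t w_t}
           = \bigl(1 + \%(m)\bigr)\bigl(1 + \%(w)\bigr) \\
\%(G) &= \%(m) + \%(w) + \%(m)\,\%(w)
\end{aligned}
```

For small growth rates the cross term is negligible, so GDP growth is roughly the sum of multiplier growth and wage growth.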
%%latex
\begin{aligned}
\left( \frac{2.270 + 0.021}{2.270} \right) \times 1.05 - 1 = 0.0597
\end{aligned}
Under these assumptions GDP would have grown by almost exactly 6%. So let's note the ratio of GDP growth to wage growth: 6/5 = 1.2
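The numerical evaluation above can be reproduced with a few lines. The function name is hypothetical; the inputs are the multiplier, yearly increment, and wage growth quoted in the text.

```python
def implied_gdp_growth(multiplier, annual_increment, wage_growth):
    """GDP growth implied by G = m * w after one year.

    Hypothetical helper: the multiplier rises by `annual_increment` and the
    wage grows by `wage_growth` (a fraction, e.g. 0.05 for 5%).
    """
    multiplier_ratio = (multiplier + annual_increment) / multiplier
    return multiplier_ratio * (1.0 + wage_growth) - 1.0

g = implied_gdp_growth(2.270, 0.021, 0.05)
print("implied GDP growth: %.4f" % g)
print("ratio to wage growth: %.2f" % (g / 0.05))
```

The ratio printed at the end is the unrounded counterpart of the 6/5 figure.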
# Latest wage annual income data, in thousands of real dollars:
tail( rinc, 13 )
                    Y
T
2013-10-01  41.154417
2013-11-01  41.205776
2013-12-01  41.246674
2014-01-01  41.279944
2014-02-01  41.441691
2014-03-01  41.349480
2014-04-01  41.296676
2014-05-01  41.270318
2014-06-01  41.271898
2014-07-01  41.293348
2014-08-01  41.429776
2014-09-01  41.385427
2014-10-01  41.400000
As of October 2014, real wage growth was +0.60% YoY; thus we can predict that 2014Q4 real GDP growth will be +0.73%.
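The year-over-year wage figure follows directly from the first and last rows of the table above. A minimal sketch, with variable names chosen here for illustration:

```python
rinc_oct_2013 = 41.154417   # real income, thousands, 2013-10-01
rinc_oct_2014 = 41.400000   # real income, thousands, 2014-10-01

yoy_pct = (rinc_oct_2014 / rinc_oct_2013 - 1.0) * 100.0
print("real wage growth YoY: %+.2f%%" % yoy_pct)
```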
stat2( gdprworker[y], rinc[y] )
 ::  FIRST variable:
count    667.000000
mean      63.761559
std       15.429600
min       36.432570
25%       53.950416
50%       62.867692
75%       75.177827
max       94.040646
Name: Y, dtype: float64

 ::  SECOND variable:
count    610.000000
mean      36.882474
std        2.224209
min       32.852482
25%       34.894259
50%       36.603215
75%       38.345661
max       41.441691
Name: Y, dtype: float64

 ::  CORRELATION
0.652122415059

-------------------------Summary of Regression Analysis-------------------------

Formula: Y ~ <x> + <intercept>

Number of Observations:         607
Number of Degrees of Freedom:   2

R-squared:         0.4253
Adj R-squared:     0.4243

Rmse:             10.5027

F-stat (1, 605):   447.6566, p-value:     0.0000

Degrees of Freedom: model 1, resid 605

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
--------------------------------------------------------------------------------
             x     4.0905     0.1933      21.16     0.0000     3.7116     4.4695
     intercept   -84.5124     7.1390     -11.84     0.0000   -98.5049   -70.5199
---------------------------------End of Summary---------------------------------
# Examine year-over-year percentage growth:
stat2( pcent(gdpr, 12)[y], pcent(rinc, 12)[y] )
 ::  FIRST variable:
count    655.000000
mean       2.915700
std        2.469703
min       -3.828793
25%        1.770101
50%        3.072340
75%        4.321166
max        9.208310
Name: Y, dtype: float64

 ::  SECOND variable:
count    598.000000
mean       0.451297
std        1.379578
min       -3.708588
25%       -0.536752
50%        0.526949
75%        1.369087
max        4.591182
Name: Y, dtype: float64

 ::  CORRELATION
0.443828748187

-------------------------Summary of Regression Analysis-------------------------

Formula: Y ~ <x> + <intercept>

Number of Observations:         595
Number of Degrees of Freedom:   2

R-squared:         0.1970
Adj R-squared:     0.1956

Rmse:              2.2085

F-stat (1, 593):   145.4659, p-value:     0.0000

Degrees of Freedom: model 1, resid 593

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
--------------------------------------------------------------------------------
             x     0.7903     0.0655      12.06     0.0000     0.6619     0.9187
     intercept     2.4286     0.0952      25.50     0.0000     2.2419     2.6152
---------------------------------End of Summary---------------------------------
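To make the coefficient table easier to read, the sketch below evaluates the fitted line from the year-over-year regression. The function name and the 1% input are hypothetical; the slope and intercept are copied from the summary above.

```python
SLOPE = 0.7903       # coefficient on x (YoY % growth in real income)
INTERCEPT = 2.4286   # intercept (YoY % growth in real GDP when x is zero)

def fitted_gdp_growth(wage_growth_pct):
    """Fitted YoY real GDP growth (percent) for a given YoY wage growth (percent)."""
    return INTERCEPT + SLOPE * wage_growth_pct

# Made-up input: 1% year-over-year real wage growth.
print("fitted GDP growth: %.2f%%" % fitted_gdp_growth(1.0))
```

With an R-squared near 0.20, this line explains only a modest share of the variation, so any single fitted value should be read loosely.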
The mw below represents the series $m_t w_t$ in our analytical model described above.
mw = todf( gdpinc * rinc, 'mw' )
mwpc = todf( pcent( mw, 12), 'mwpc' )
gdprpc = todf( pcent( gdpr, 12), 'Gpc' )
dataf = paste( [gdprpc, mwpc] )
# The 0 in the formula means no intercept:
result = regressformula( dataf['1964':], 'Gpc ~ 0 + mwpc' )
print result.summary()
                            OLS Regression Results
==============================================================================
Dep. Variable:                    Gpc   R-squared:                       0.802
Model:                            OLS   Adj. R-squared:                  0.802
Method:                 Least Squares   F-statistic:                     2413.
Date:                Tue, 09 Dec 2014   Prob (F-statistic):          2.42e-211
Time:                        19:53:23   Log-Likelihood:                -1142.8
No. Observations:                 595   AIC:                             2288.
Df Residuals:                     594   BIC:                             2292.
Df Model:                           1
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
mwpc           1.3924      0.028     49.120      0.000         1.337     1.448
==============================================================================
Omnibus:                       96.415   Durbin-Watson:                   0.102
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              167.474
Skew:                          -0.979   Prob(JB):                     4.30e-37
Kurtosis:                       4.709   Cond. No.                         1.00
==============================================================================
R-squared for dataf after 1964 looks respectable at around 0.80; however, the fit is terrible after the Great Recession.
The coefficient implies this fitted equation: $\%(G) = 1.39 \times \%(m w)$.
In contrast, our analytic model (as opposed to the regression model in Appendix 2) suggested, for the most recent data: $\%(G) = 1.20 \times \%(w)$
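The no-intercept fit reported above forces the regression line through the origin. A minimal sketch of how such a slope is estimated by least squares follows; the function name and the data are hypothetical, and this is not the code path used by `regressformula`.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (no intercept)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / sxx

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0]
y = [1.5, 2.5, 4.5]
b = slope_through_origin(x, y)
print("slope through origin: %.4f" % b)
```

Dropping the intercept tends to inflate R-squared relative to a fit with an intercept, which is worth keeping in mind when comparing the 0.80 here with the 0.20 of the earlier regression.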