Notebook

Expectated vs Realized Income Growth in A Standard Life Cycle Model¶

This notebook uses the income process in Cocco, Gomes & Maenhout (2005) to demonstrate that estimates of a regression of expected income changes on realized income changes are sensitive to the size of transitory shocks.

We first load some tools from the HARK toolkit.

In [1]:

import statsmodels.api as sm
from linearmodels.panel.model import PanelOLS
from HARK.distribution import calc_expectation
from HARK.ConsumptionSaving.ConsIndShockModel import (
    IndShockConsumerType,
    init_lifecycle,
)

from HARK.Calibration.Income.IncomeTools import (
    parse_income_spec,
    parse_time_params,
    CGM_income,
)

from HARK.datasets.life_tables.us_ssa.SSATools import parse_ssa_life_table
import pandas as pd
from copy import copy

We now create a population of agents with the income process of Cocco, Gomes & Maenhout (2005), which is implemented as a default calibration in the toolkit.

In [2]:

birth_age = 21
death_age = 66
adjust_infl_to = 1992
income_calib = CGM_income
education = "HS"

# Income specification
income_params = parse_income_spec(
    age_min=birth_age,
    age_max=death_age,
    adjust_infl_to=adjust_infl_to,
    **income_calib[education],
    SabelhausSong=True,
)

# We need survival probabilities only up to death_age-1, because survival
# probability at death_age is 1.
liv_prb = parse_ssa_life_table(
    female=True, cross_sec=True, year=2004, min_age=birth_age, max_age=death_age - 1
)

# Parameters related to the number of periods implied by the calibration
time_params = parse_time_params(age_birth=birth_age, age_death=death_age)

# Update all the new parameters
params = copy(init_lifecycle)
params.update(time_params)
# params.update(dist_params)
params.update(income_params)
params.update(
    {
        "LivPrb": liv_prb,
        "pLvlInitStd": 0.0,
        "PermGroFacAgg": 1.0,
        "UnempPrb": 0.0,
        "UnempPrbRet": 0.0,
        "track_vars": ["pLvl", "t_age", "PermShk", "TranShk"],
        "AgentCount": 200,
        "T_sim": 500,
    }
)

Agent = IndShockConsumerType(**params)
Agent.solve()

We simulate a population of agents

In [3]:

# Run the simulations
Agent.initialize_sim()
Agent.simulate();

$\newcommand{\Ex}{\mathbb{E}}$ $\newcommand{\PermShk}{\psi}$ $\newcommand{\pLvl}{\mathbf{p}}$ $\newcommand{\pLvl}{P}$ $\newcommand{\yLvl}{\mathbf{y}}$ $\newcommand{\yLvl}{Y}$ $\newcommand{\PermGroFac}{\Gamma}$ $\newcommand{\UnempPrb}{\wp}$ $\newcommand{\TranShk}{\theta}$

We assume a standard income process with transitory and permanent shocks: The consumer's Permanent noncapital income $\pLvl$ grows by a predictable factor $\PermGroFac$ and is subject to an unpredictable multiplicative shock $\Ex_{t}[\PermShk_{t+1}]=1$,

\begin{eqnarray} \pLvl_{t+1} & = & \pLvl_{t} \PermGroFac_{t+1} \PermShk_{t+1}, \notag \end{eqnarray}

and, if the consumer is employed, actual income $Y$ is permanent income multiplied by a transitory shock $\Ex_{t}[\TranShk_{t+1}]=1$, \begin{eqnarray} \yLvl_{t+1} & = & \pLvl_{t+1} \TranShk_{t+1}, \notag \end{eqnarray}

$\Gamma_{t}$ captures the predictable life cycle profile of income growth (faster when young, slower when old). See our replication of CGM-2005 for a detailed account of how these objects map to CGM's notation.

Now define $\newcommand{\yLog}{y}\newcommand{\pLog}{p}\yLog = \log \yLvl,\pLog=\log \pLvl$ and similarly for other variables.

Using this notation, we construct all the necessary inputs to the regressors. The main input is the expected income growth of every agent at every time period, which is given by \begin{equation} \begin{split} \Ex_t[\yLvl_{t+1}/\yLvl_{t}] &= \mathbb{E}_t[\left(\frac{\theta_{t+1}\pLvl_{t} \PermGroFac_{t+1} \PermShk_{t+1}}{\theta_{t}P_{t}}\right)]\\ &= \left(\frac{\PermGroFac_{t+1}}{\theta_{t}}\right)\\ \Ex_t[\yLog_{t+1} - \yLog_{t}] & = \log \Gamma_{t+1}-\log \theta_t \end{split} \end{equation}

In [4]:

exp = [
    calc_expectation(Agent.IncShkDstn[i], func=lambda x: x[0] * x[1])
    for i in range(Agent.T_cycle)
]
exp_df = pd.DataFrame(
    {
        "exp_prod": exp,
        "PermGroFac": Agent.PermGroFac,
        "Age": [x + birth_age for x in range(Agent.T_cycle)],
    }
)

raw_data = {
    "Age": Agent.history["t_age"].T.flatten() + birth_age - 1,
    "pLvl": Agent.history["pLvl"].T.flatten(),
    "PermShk": Agent.history["PermShk"].T.flatten(),
    "TranShk": Agent.history["TranShk"].T.flatten(),
}

Data = pd.DataFrame(raw_data)

# Create an individual id
Data["id"] = (Data["Age"].diff(1) < 0).cumsum()

Data["Y"] = Data.pLvl * Data.TranShk

# Find Et[Yt+1 - Yt]
Data = Data.join(exp_df.set_index("Age"), on="Age", how="left")
Data["ExpIncChange"] = Data["pLvl"] * (
    Data["PermGroFac"] * Data["exp_prod"] - Data["TranShk"]
)

Data["Y_change"] = Data.groupby("id")["Y"].diff(1)

A corresponding version of this relationship can be estimated in simulated data: \begin{equation*} \Ex_t[\Delta y_{i,t+1}] = \gamma_{0} + \gamma_{1} \Delta y_{i,t} + f_i + \epsilon_{i,t} \end{equation*} We now estimate an analogous regression in our simulated population.

In [5]:

Data = Data.set_index(["id", "Age"])

# Create the variables they actually use
Data["ExpBin"] = 0
Data.loc[Data["ExpIncChange"] > 0, "ExpBin"] = 1
Data.loc[Data["ExpIncChange"] < 0, "ExpBin"] = -1

Data["ChangeBin"] = 0
Data.loc[Data["Y_change"] > 0, "ChangeBin"] = 1
Data.loc[Data["Y_change"] < 0, "ChangeBin"] = -1

mod = PanelOLS(Data.ExpBin, sm.add_constant(Data.ChangeBin), entity_effects=True)
fe_res = mod.fit()
print(fe_res)

                          PanelOLS Estimation Summary                           
================================================================================
Dep. Variable:                 ExpBin   R-squared:                        0.1955
Estimator:                   PanelOLS   R-squared (Between):             -0.0730
No. Observations:              100000   R-squared (Within):               0.1955
Date:                Fri, Nov 24 2023   R-squared (Overall):              0.1889
Time:                        15:15:16   Log-likelihood                -1.288e+05
Cov. Estimator:            Unadjusted                                           
                                        F-statistic:                   2.371e+04
Entities:                        2422   P-value                           0.0000
Avg Obs:                       41.288   Distribution:                 F(1,97577)
Min Obs:                       2.0000                                           
Max Obs:                       46.000   F-statistic (robust):          2.371e+04
                                        P-value                           0.0000
Time periods:                      45   Distribution:                 F(1,97577)
Avg Obs:                       2222.2                                           
Min Obs:                       1951.0                                           
Max Obs:                       2425.0                                           
                                                                                
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          0.1368     0.0028     48.687     0.0000      0.1313      0.1423
ChangeBin     -0.4404     0.0029    -153.98     0.0000     -0.4460     -0.4348
==============================================================================

F-test for Poolability: 1.2900
P-value: 0.0000
Distribution: F(2421,97577)

Included effects: Entity

The estimated $\hat{\gamma}_{1}$ is negative because in usual life-cycle calibrations, transitory shocks are volatile enough that mean reversion of transitory fluctuations is a stronger force than persistent trends in income age-profiles.

However, with less volatile transitory shocks, the regression coefficient would be positive. We demonstrate this by shutting off transitory shocks, simulating another population of agents, and re-running the regression.

In [6]:

params_no_transitory = copy(params)
params_no_transitory.update({"TranShkStd": [0.0] * len(params["TranShkStd"])})

# Create agent
Agent_nt = IndShockConsumerType(**params_no_transitory)
Agent_nt.solve()
# Run the simulations
Agent_nt.initialize_sim()
Agent_nt.simulate();

In [7]:

exp = [
    calc_expectation(Agent_nt.IncShkDstn[i], func=lambda x: x[0] * x[1])
    for i in range(Agent_nt.T_cycle)
]
exp_df = pd.DataFrame(
    {
        "exp_prod": exp,
        "PermGroFac": Agent_nt.PermGroFac,
        "Age": [x + birth_age for x in range(Agent.T_cycle)],
    }
)

raw_data = {
    "Age": Agent_nt.history["t_age"].T.flatten() + birth_age - 1,
    "pLvl": Agent_nt.history["pLvl"].T.flatten(),
    "PermShk": Agent_nt.history["PermShk"].T.flatten(),
    "TranShk": Agent_nt.history["TranShk"].T.flatten(),
}

Data = pd.DataFrame(raw_data)

# Create an individual id
Data["id"] = (Data["Age"].diff(1) < 0).cumsum()

Data["Y"] = Data.pLvl * Data.TranShk

# Find Et[Yt+1 - Yt]
Data = Data.join(exp_df.set_index("Age"), on="Age", how="left")
Data["ExpIncChange"] = Data["pLvl"] * (
    Data["PermGroFac"] * Data["exp_prod"] - Data["TranShk"]
)

Data["Y_change"] = Data.groupby("id")["Y"].diff(1)

In [8]:

# Create variables
Data["ExpBin"] = 0
Data.loc[Data["ExpIncChange"] > 0, "ExpBin"] = 1
Data.loc[Data["ExpIncChange"] < 0, "ExpBin"] = -1

Data["ChangeBin"] = 0
Data.loc[Data["Y_change"] > 0, "ChangeBin"] = 1
Data.loc[Data["Y_change"] < 0, "ChangeBin"] = -1

Data = Data.set_index(["id", "Age"])
mod = PanelOLS(Data.ExpBin, sm.add_constant(Data.ChangeBin), entity_effects=True)
fe_res = mod.fit()
print(fe_res)

                          PanelOLS Estimation Summary                           
================================================================================
Dep. Variable:                 ExpBin   R-squared:                        0.0086
Estimator:                   PanelOLS   R-squared (Between):             -0.0259
No. Observations:              100000   R-squared (Within):               0.0086
Date:                Fri, Nov 24 2023   R-squared (Overall):              0.0088
Time:                        15:15:18   Log-likelihood                -1.394e+05
Cov. Estimator:            Unadjusted                                           
                                        F-statistic:                      848.82
Entities:                        2422   P-value                           0.0000
Avg Obs:                       41.288   Distribution:                 F(1,97577)
Min Obs:                       2.0000                                           
Max Obs:                       46.000   F-statistic (robust):             848.82
                                        P-value                           0.0000
Time periods:                      45   Distribution:                 F(1,97577)
Avg Obs:                       2222.2                                           
Min Obs:                       1951.0                                           
Max Obs:                       2425.0                                           
                                                                                
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          0.1129     0.0031     36.103     0.0000      0.1067      0.1190
ChangeBin      0.0934     0.0032     29.135     0.0000      0.0871      0.0997
==============================================================================

F-test for Poolability: 1.1591
P-value: 0.0000
Distribution: F(2421,97577)

Included effects: Entity

The estimated $\hat{\gamma}_{1}$ when there are no transitory shocks is positive.