Review 2B Key
by Professor Throckmorton
for Time Series Econometrics
W&M ECON 408/PUBP 616
Home prices and rents might be cointegrated. If rent is above its equilibrium level, then home ownership becomes more appealing, which should drive rents down and home prices up, restoring the long-run equilibrium relationship. Was that true in the lead-up to the 2007-2009 Great Recession?
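In symbols (a sketch, using \(p_t\) and \(r_t\) as shorthand for log home prices and log rents): cointegration means there is some \(\beta\) such that the deviation \(u_t = p_t - \beta r_t\) is stationary even though each series has a unit root, and an error-correction mechanism pulls the system back when \(u_t\) is large,
\[
\Delta p_t = \alpha_p u_{t-1} + \text{short-run terms} + \varepsilon_{p,t}, \qquad
\Delta r_t = \alpha_r u_{t-1} + \text{short-run terms} + \varepsilon_{r,t},
\]
with the loadings \(\alpha_p\) and \(\alpha_r\) signed so that deviations die out over time.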
1)
Read data on home prices (FRED: CSUSHPINSA) and rental prices (FRED: CUUR0000SEHA).
Put the data in a dataframe and reindex it to month end.
Display the head and tail of the dataframe in a table (not a plot).
# Libraries
from fredapi import Fred
import pandas as pd
# Set up access to FRED
fred_api_key = pd.read_csv('fred_api_key.txt', header=None).iloc[0,0]
fred = Fred(api_key=fred_api_key)
# Series to get
series = ['CSUSHPINSA','CUUR0000SEHA']
rename = ['home','rent']
# Get and append data to list
dl = []
for idx, string in enumerate(series):
    var = fred.get_series(string).to_frame(name=rename[idx])
    dl.append(var)
    print(var.head(2)); print(var.tail(2))
home
1975-01-01 NaN
1975-02-01 NaN
home
2025-04-01 329.638
2025-05-01 331.107
rent
1914-12-01 21.0
1915-01-01 NaN
rent
2025-05-01 433.698
2025-06-01 434.594
# Concatenate data to create data frame (time-series table)
raw = pd.concat(dl, axis=1).sort_index()
# Make all columns numeric
raw = raw.apply(pd.to_numeric, errors='coerce')
# Resample/reindex to month end
raw = raw.resample('ME').last().dropna()
# Display dataframe
display(raw)
 | home | rent |
---|---|---|
1987-01-31 | 63.733 | 121.300 |
1987-02-28 | 64.132 | 121.700 |
1987-03-31 | 64.468 | 121.800 |
1987-04-30 | 64.973 | 122.000 |
1987-05-31 | 65.547 | 122.300 |
... | ... | ... |
2025-01-31 | 323.652 | 429.506 |
2025-02-28 | 325.107 | 430.603 |
2025-03-31 | 327.658 | 431.798 |
2025-04-30 | 329.638 | 432.956 |
2025-05-31 | 331.107 | 433.698 |
461 rows × 2 columns
2)
Transform the data with \(100\times \log\).
Remove seasonality as needed.
Conduct ADF unit root tests to verify that the series are I(\(1\)) over 1990-2006.
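In symbols, the transformations below are (a sketch; \(Y_t\) is a raw index level): \(y_t = 100\ln Y_t\), the year-over-year difference \(\Delta_{12} y_t = y_t - y_{t-12}\), which removes the seasonal pattern and is roughly an annual growth rate in percent, and its first difference \(\Delta\Delta_{12} y_t\). These correspond to the home, dhome, d2home (and rent, drent, d2rent) columns constructed below.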
# Scientific computing
import numpy as np
data = pd.DataFrame()
# Transform data
data['home'] = 100*np.log(raw['home'])
data['dhome'] = data['home'].diff(12)
data['d2home'] = data['dhome'].diff(1)
data['rent'] = 100*np.log(raw['rent'])
data['drent'] = data['rent'].diff(12)
data['d2rent'] = data['drent'].diff(1)
display(data)
 | home | dhome | d2home | rent | drent | d2rent |
---|---|---|---|---|---|---|
1987-01-31 | 415.470248 | NaN | NaN | 479.826682 | NaN | NaN |
1987-02-28 | 416.094346 | NaN | NaN | 480.155900 | NaN | NaN |
1987-03-31 | 416.616898 | NaN | NaN | 480.238036 | NaN | NaN |
1987-04-30 | 417.397180 | NaN | NaN | 480.402104 | NaN | NaN |
1987-05-31 | 418.276744 | NaN | NaN | 480.647704 | NaN | NaN |
... | ... | ... | ... | ... | ... | ... |
2025-01-31 | 577.966886 | 4.037125 | 0.129897 | 606.263571 | 4.156625 | -0.027557 |
2025-02-28 | 578.415436 | 3.866708 | -0.170417 | 606.518655 | 4.005756 | -0.150868 |
2025-03-31 | 579.197038 | 3.320426 | -0.546282 | 606.795789 | 3.915179 | -0.090577 |
2025-04-30 | 579.799508 | 2.691851 | -0.628575 | 607.063611 | 3.902339 | -0.012840 |
2025-05-31 | 580.244159 | 2.225996 | -0.465856 | 607.234844 | 3.741261 | -0.161078 |
461 rows × 6 columns
from statsmodels.tsa.stattools import adfuller
# Function to organize ADF test results
def adf_test(data, const_trend):
    keys = ['Test Statistic','p-value','# of Lags','# of Obs']
    values = adfuller(data, regression=const_trend)
    test = pd.DataFrame.from_dict(dict(zip(keys, values[0:4])),
                                  orient='index', columns=[data.name])
    return test
# Select sample
start_date, end_date = '01-01-1990', '12-31-2006'
sample = data[start_date:end_date]
display(sample)
# ADF unit root tests
dl = []
for column in sample.columns:
    test = adf_test(sample[column], const_trend='c')
    dl.append(test)
results = pd.concat(dl, axis=1)
display(results)
 | home | dhome | d2home | rent | drent | d2rent |
---|---|---|---|---|---|---|
1990-01-31 | 433.764362 | 3.878123 | -0.421044 | 491.118322 | 3.980999 | -0.085827 |
1990-02-28 | 433.842735 | 3.480830 | -0.397292 | 491.265489 | 3.822121 | -0.158878 |
1990-03-31 | 434.107442 | 3.148408 | -0.332422 | 491.632461 | 4.036422 | 0.214301 |
1990-04-30 | 434.431179 | 2.858522 | -0.289886 | 491.998093 | 4.173482 | 0.137060 |
1990-05-31 | 434.765515 | 2.622813 | -0.235709 | 492.216831 | 4.164170 | -0.009312 |
... | ... | ... | ... | ... | ... | ... |
2006-08-31 | 521.713443 | 4.707097 | -1.152809 | 542.141956 | 3.692450 | 0.213446 |
2006-09-30 | 521.601127 | 3.640424 | -1.066673 | 542.539045 | 3.814687 | 0.122237 |
2006-10-31 | 521.522919 | 2.921304 | -0.719120 | 542.934563 | 3.890497 | 0.075810 |
2006-11-30 | 521.292831 | 2.177308 | -0.743996 | 543.328523 | 3.965768 | 0.075271 |
2006-12-31 | 521.073674 | 1.717451 | -0.459857 | 543.807931 | 4.218161 | 0.252393 |
204 rows × 6 columns
 | home | dhome | d2home | rent | drent | d2rent |
---|---|---|---|---|---|---|
Test Statistic | -0.776473 | -2.059508 | -2.501802 | 1.557229 | -1.936576 | -3.251748 |
p-value | 0.825935 | 0.261105 | 0.115036 | 0.997723 | 0.315055 | 0.017178 |
# of Lags | 14.000000 | 15.000000 | 14.000000 | 4.000000 | 12.000000 | 11.000000 |
# of Obs | 189.000000 | 188.000000 | 189.000000 | 199.000000 | 191.000000 | 192.000000 |
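For reference, adfuller with regression='c' fits an ADF test regression of the form (a sketch)
\[
\Delta y_t = c + \gamma\, y_{t-1} + \sum_{i=1}^{k} \phi_i\, \Delta y_{t-i} + \varepsilon_t,
\]
and tests \(H_0: \gamma = 0\) (unit root) against \(H_1: \gamma < 0\) (stationary), with the lag length \(k\) chosen automatically (by AIC by default). The large p-values for home, rent, dhome, and drent above are consistent with treating those series as having unit roots over this sample.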
3)
What is the lag order for a VECM selected by AIC? Make sure maxlags is not constraining the result.
Conduct a Johansen cointegration test for the lag order. Set det_order=-1.
# The year-over-year growth rates dhome and drent are treated as the I(1) series (see the ADF tests above)
sample_I1 = sample[['dhome','drent']]
# Select number of lags in VECM
from statsmodels.tsa.vector_ar.vecm import select_order
lag_order_results = select_order(
    sample_I1, maxlags=20, deterministic='co')
print(f'Selected lag order (AIC) = {lag_order_results.aic}')
print(f'Selected lag order (BIC) = {lag_order_results.bic}')
Selected lag order (AIC) = 15
Selected lag order (BIC) = 1
# Johansen cointegration tests
from statsmodels.tsa.vector_ar.vecm import coint_johansen
test = coint_johansen(sample_I1, det_order=-1, k_ar_diff=lag_order_results.bic)
test_stats = test.lr1       # trace test statistics
crit_vals = test.cvt[:, 1]  # 5% critical values
# Print results
for r_0, (test_stat, crit_val) in enumerate(zip(test_stats, crit_vals)):
    print(f'H_0: r <= {r_0}')
    print(f' Test Stat. = {test_stat:.2f}, 5% Crit. Value = {crit_val:.2f}')
    if test_stat > crit_val:
        print(' => Reject null hypothesis.')
    else:
        print(' => Fail to reject null hypothesis.')
H_0: r <= 0
Test Stat. = 13.17, 5% Crit. Value = 12.32
=> Reject null hypothesis.
H_0: r <= 1
Test Stat. = 0.09, 5% Crit. Value = 4.13
=> Fail to reject null hypothesis.
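As an optional cross-check (a sketch, assuming a recent statsmodels version and the same sample_I1 and lag order as above), select_coint_rank wraps the Johansen trace test and reports the selected rank directly:
# Optional cross-check: let statsmodels select the cointegration rank via the trace test
from statsmodels.tsa.vector_ar.vecm import select_coint_rank
rank_test = select_coint_rank(sample_I1, det_order=-1,
                              k_ar_diff=lag_order_results.bic,
                              method='trace', signif=0.05)
print(rank_test.summary())
print(f'Selected cointegration rank = {rank_test.rank}')
A rank of 1 here would match the conclusion above: one cointegrating relation between dhome and drent.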
4)
Estimate a VECM given your answers to the previous questions.
For which variable is the weight on the error correction term significant? Interpret that result.
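With one cointegrating relation (from the Johansen test) and one lagged difference (the BIC lag order), the model being estimated is, in sketch form with \(z_t = (\text{dhome}_t,\ \text{drent}_t)'\),
\[
\Delta z_t = c + \alpha \beta' z_{t-1} + \Gamma_1 \Delta z_{t-1} + u_t,
\]
where deterministic='co' puts the constant \(c\) outside the cointegrating relation, \(\alpha\) holds the loading (error-correction) weights the question asks about, and \(\beta\) is the cointegrating vector.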
# Estimate VECM
from statsmodels.tsa.vector_ar.vecm import VECM
model_vecm = VECM(sample_I1, deterministic='co',
                  k_ar_diff=lag_order_results.bic,
                  coint_rank=1)  # one cointegrating relation from the Johansen test
results_vecm = model_vecm.fit()
display(results_vecm.summary())
Lagged endogenous terms and constant (equation for dhome):

 | coef | std err | z | P>|z| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
const | -0.0438 | 0.020 | -2.138 | 0.033 | -0.084 | -0.004 |
L1.dhome | 0.9364 | 0.028 | 33.666 | 0.000 | 0.882 | 0.991 |
L1.drent | -0.0771 | 0.058 | -1.336 | 0.182 | -0.190 | 0.036 |

Lagged endogenous terms and constant (equation for drent):

 | coef | std err | z | P>|z| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
const | 0.0662 | 0.024 | 2.741 | 0.006 | 0.019 | 0.114 |
L1.dhome | -0.1193 | 0.033 | -3.636 | 0.000 | -0.184 | -0.055 |
L1.drent | -0.1052 | 0.068 | -1.545 | 0.122 | -0.239 | 0.028 |

Loading coefficient on the error-correction term (equation for dhome):

 | coef | std err | z | P>|z| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
ec1 | -0.0042 | 0.002 | -2.301 | 0.021 | -0.008 | -0.001 |

Loading coefficient on the error-correction term (equation for drent):

 | coef | std err | z | P>|z| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
ec1 | 0.0064 | 0.002 | 2.965 | 0.003 | 0.002 | 0.011 |

Cointegrating relation (beta):

 | coef | std err | z | P>|z| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
beta.1 | 1.0000 | 0 | 0 | 0.000 | 1.000 | 1.000 |
beta.2 | -5.0155 | 1.900 | -2.640 | 0.008 | -8.739 | -1.292 |
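The loadings and cointegrating vector can also be read straight off the fitted model (a quick sketch using the results_vecm object above):
# alpha: loading (error-correction) coefficients, one row per equation (dhome, drent)
print(results_vecm.alpha)
# beta: cointegrating vector, normalized so the first element is 1
print(results_vecm.beta)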
The loading coefficient for home prices on the error-correction term is negative and statistically significant, which means that short-run changes in home prices drive the system back toward the long-run equilibrium.
I.e., if home prices are high relative to rents, then people will prefer to rent, which drives home prices back down toward equilibrium (and rents should follow).