VECM#

by Professor Throckmorton
for Time Series Econometrics
W&M ECON 408/PUBP 616
Slides

Summary#

  • The Vector Error Correction Model (VECM) is designed for analyzing cointegrated time series that are first-order integrated, I(\(1\)).

  • VECM builds on the concept of cointegration, i.e., each time series might be I(\(1\)), but certain linear combinations of them are stationary, I(\(0\)).

  • VECM is derived from a reduced-form VAR in levels when the data are cointegrated.

  • VECM includes an error correction term that reflects the deviation from the lagged long-run average or “equilibrium”.

VECM#

  • Suppose \(\mathbf{y}_t\) follows a VAR(\(p\)) with I(\(1\)), possibly cointegrated, variables in levels; then

    \[\begin{gather*} \Delta \mathbf{y}_t = \boldsymbol{\Pi} \mathbf{y}_{t-1} + \sum_{h=1}^{p-1} \boldsymbol{\Gamma}_h \Delta \mathbf{y}_{t-h} + \boldsymbol{\upsilon}_t \end{gather*}\]
  • \(\Delta \mathbf{y}_t\) is the first difference of a vector of \(n\) non-stationary, I(\(1\)), variables, making them stationary. \(\sum_{h=1}^{p-1} \boldsymbol{\Gamma}_h \Delta \mathbf{y}_{t-h}\) captures the short-run dynamics of the system, and \(\boldsymbol{\upsilon}_t\) is a vector of white noise errors.

  • \(\boldsymbol{\Pi} \mathbf{y}_{t-1}\) is the error correction term, i.e., the long-run equilibrium relationship between variables in levels, where \(\boldsymbol{\Pi}\) is the cointegration structure.

    • If \(\boldsymbol{\Pi} = \boldsymbol{0}\), there is no cointegration and no need for a VECM.

    • If \(\boldsymbol{\Pi}\) has reduced rank, \(r < n\), then \(\boldsymbol{\Pi} = \boldsymbol{\alpha} \boldsymbol{\beta}'\) (see the sketch after this list), where:

      • \(\boldsymbol{\beta}_{n \times r}\) is a matrix of cointegrating vectors.

      • \(\boldsymbol{\alpha}_{n \times r}\) is a matrix of adjustment coefficients, determining how each variable returns to its long-run equilibrium.

    • If \(\boldsymbol{\Pi}\) is full rank, \(r = n\), then \(\mathbf{y}_t\) is already stationary and there is no need for a VECM.
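
  • As a concrete illustration, here is a minimal numpy sketch (toy numbers, not estimates) of the reduced-rank case, building \(\boldsymbol{\Pi} = \boldsymbol{\alpha} \boldsymbol{\beta}'\) with \(n = 2\) variables and \(r = 1\) cointegrating relationship:

import numpy as np
# Toy adjustment coefficients, alpha (n x r)
alpha = np.array([[-0.3],
                  [ 0.1]])
# Toy cointegrating vector, beta (n x r)
beta = np.array([[ 1.0],
                 [-0.9]])
# Pi = alpha * beta' is 2 x 2 but has rank r = 1 < n = 2
Pi = alpha @ beta.T
print(Pi)
print(np.linalg.matrix_rank(Pi))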

Rank of a matrix#

  • A square matrix \(\mathbf{A}_{n \times n}\) is full rank if its rank is \(n\). It’s not full rank, or has reduced rank, if its rank \(<n\).

  • The following conditions are all equivalent — if any one is true, the matrix is full rank (warning: this is not a complete list)

    1. Linearly independent columns: No column can be written as a linear combination of the others.

    2. Linearly independent rows: No row is a linear combination of other rows.

    3. Non-zero determinant: \(\det(\mathbf{A}) \ne 0\) or \(|\mathbf{A}| \ne 0\)

    4. Invertibility: there exists \(\mathbf{A}^{-1}\) such that \(\mathbf{A} \mathbf{A}^{-1} = I_n\)
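
  • Condition 4 can be checked directly in code. A minimal sketch (using numpy and the singular matrix from the first example below): np.linalg.inv raises a LinAlgError when a matrix is not full rank.

import numpy as np
# Singular matrix: row 1 is 2x row 2, so the rank is 1 < 2
A = np.array([[2, 4],
              [1, 2]])
print(np.linalg.det(A))  # 0.0, consistent with condition 3
try:
    np.linalg.inv(A)     # condition 4 fails
except np.linalg.LinAlgError as err:
    print('Not invertible:', err)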

Examples#

  • Consider the following square matrix

    \[\begin{gather*} \mathbf{A} = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix} \end{gather*}\]
    • Column 2 is \(2\times\) column 1 \(\rightarrow\) \(\mathbf{A}\) is not full rank

    • Row 1 is \(2\times\) row 2 \(\rightarrow\) \(\mathbf{A}\) is not full rank

  • Consider the following square matrix

    \[\begin{gather*} \mathbf{A} = \begin{bmatrix} 3 & 5 \\ 2 & 4 \end{bmatrix} \end{gather*}\]
    • We can’t multiply one row or column by a constant to get the other.

    • The determinant of a \(2 \times 2\) matrix \(\mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\) is \(|\mathbf{A}| = ad - bc\)

    • Plug in the values \(|\mathbf{A}| = (3)(4) - (5)(2) = 12 - 10 = 2\)

    • Since \(|\mathbf{A}| \neq 0\), the matrix is full rank.

# Scientific computing
import numpy as np
# Matrix
A = np.array([
    [1, 2], 
    [2, 4]
])
# Rank
rank = np.linalg.matrix_rank(A)
# Determinant
det = np.linalg.det(A)
# Display
print(f'det = {det:.4f}')
print(rank)
det = 0.0000
1
# Matrix
A = np.array([
    [3, 5], 
    [2, 4]
])
# Rank
rank = np.linalg.matrix_rank(A)
# Determinant
det = np.linalg.det(A)
# Display
print(f'det = {det:.4f}')
print(rank)
det = 2.0000
2
  • For a \(3 \times 3\) square matrix

    \[\begin{gather*} \mathbf{A} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \end{gather*}\]
  • The determinant formula is \(a(ei - fh) - b(di - fg) + c(dh - eg)\).

  • That is a cofactor expansion along the first row: each entry multiplies the determinant of the \(2 \times 2\) submatrix that remains after deleting that entry’s row and column, with alternating signs. A quick numerical check follows.

  • See https://en.wikipedia.org/wiki/Determinant.
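
  • As a quick numerical check of the formula (the example values below are arbitrary), compare the cofactor expansion to np.linalg.det:

import numpy as np
# Example 3x3 matrix (arbitrary values)
A = np.array([[2., 1., 0.],
              [1., 3., 2.],
              [0., 1., 1.]])
(a, b, c), (d, e, f), (g, h, i) = A
# Cofactor expansion along the first row
det_manual = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
print(det_manual)        # 1.0
print(np.linalg.det(A))  # matches, up to floating-point error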

Cointegration#

  • If some linear combination of non-stationary I(\(1\)) time series is stationary, then the variables are cointegrated, i.e., they share a stable long-run linear relationship despite short-run fluctuations.

  • Suppose we want to test how many cointegrating relationships, \(r\), exist among \(n\) I(\(1\)) time series.

  • For example, if \(r = 0\), then there are no cointegrating relationships, and if \(r = n-1\), then all variables are cointegrated with each other.

  • To justify estimating a VECM, there should be at least one cointegrating relationship between the variables; otherwise, just estimate a VAR after first differencing. (A small simulation sketch follows.)
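
  • A minimal simulation sketch of the idea (simulated data, not the FRED series used later): \(x_t\) is a random walk and \(y_t = x_t + \varepsilon_t\) with stationary noise \(\varepsilon_t\), so \(y_t\) and \(x_t\) are each I(\(1\)) but the linear combination \(y_t - x_t\) is stationary, i.e., the cointegrating vector is \((1, -1)\).

import numpy as np
rng = np.random.default_rng(0)
T = 500
# x is a random walk, so it is I(1)
x = np.cumsum(rng.normal(size=T))
# y shares x's stochastic trend plus stationary I(0) noise
y = x + rng.normal(size=T)
# The combination y - x is just the noise, so it is I(0)
z = y - x
print(z.mean(), z.std())  # roughly 0 and 1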

Johansen Cointegration Test#

  • First, use the ADF (unit root) test to confirm that all time series are I(\(1\)).

  • The Johansen Cointegration Test has nested hypotheses

    • Suppose there are \(r_0 = 0\) cointegrating relationships

      • Null Hypothesis: \(H_0: r \leq r_0\) (i.e., the number of cointegrating relationships \(\leq r_0\))

      • Alternative Hypothesis: \(H_A: r > r_0\)

    • For each null hypothesis \(H_0: r \leq r_0\), if the test statistic \(>\) critical value \(\Rightarrow\) reject \(H_0\).

    • Then update \(r_0 = 1, 2, 3, \ldots\) until the test fails to reject

  • In practice, use the coint_johansen() function from statsmodels.tsa.vector_ar.vecm.

    • Choose lag length (\(p-1\)) for the VAR in differences.

    • Decide on deterministic components (constant/time trend) in the test.

  • Proceed to estimate a VECM if cointegration is found.

Example#

  • The U.S. macroeconomy has gone through many important generational changes that might affect the cointegrating relationships between aggregate time series variables

  • E.g., the Great Depression, the post-war period, the Great Inflation of the late 1960s to early 1980s, the Great Moderation, the Great Recession (and the slow recovery), and the post-COVID period

  • Let’s test whether consumption and income are cointegrated for the 25 years prior to the Great Moderation (1960-1984)

  • Measure them with log real PCE and log real GDP

Read Data#

# Libraries
from fredapi import Fred
import pandas as pd
# Set up access to FRED
fred_api_key = pd.read_csv('fred_api_key.txt', header=None).iloc[0,0]
fred = Fred(api_key=fred_api_key)
# Series to get
series = ['GDP','PCE','GDPDEF']
rename = ['gdp','cons','price']
# Get and append data to list
dl = []
for idx, string in enumerate(series):
    var = fred.get_series(string).to_frame(name=rename[idx])
    dl.append(var)
    print(var.head(2)); print(var.tail(2))
            gdp
1946-01-01  NaN
1946-04-01  NaN
                  gdp
2025-01-01  29962.047
2025-04-01  30331.117
             cons
1959-01-01  306.1
1959-02-01  309.6
               cons
2025-05-01  20615.2
2025-06-01  20685.2
             price
1947-01-01  11.141
1947-04-01  11.299
              price
2025-01-01  127.429
2025-04-01  128.059

Create Dataframe#

# Concatenate data to create data frame (time-series table)
raw = pd.concat(dl, axis=1).sort_index()
# Resample/reindex to quarterly frequency
raw = raw.resample('QE').last()
# Display dataframe
display(raw.head(2))
display(raw.tail(2))
                  gdp     cons    price
1946-03-31        NaN      NaN      NaN
1946-06-30        NaN      NaN      NaN

                  gdp     cons    price
2025-03-31  29962.047  20578.5  127.429
2025-06-30  30331.117  20685.2  128.059

Transform Data#

data = pd.DataFrame()
# log real GDP
data['logGDP'] = 100*(np.log(raw['gdp']/raw['price']))
data['dlogGDP'] = data['logGDP'].diff(1)
# log real Consumption
data['logCons'] = 100*(np.log(raw['cons']/raw['price']))
data['dlogCons'] = data['logCons'].diff(1)
# Select sample
sample = data['01-01-1960':'12-31-1984']
display(sample.head(2))
display(sample.tail(2))
                logGDP   dlogGDP     logCons  dlogCons
1960-03-31  356.027682  2.226717  306.351449  1.839417
1960-06-30  355.485842 -0.541840  306.068692 -0.282757

                logGDP   dlogGDP     logCons  dlogCons
1984-09-30  441.310256  0.959766  393.568460  0.634225
1984-12-31  442.127526  0.817270  394.731279  1.162819

Plot Sample#

# Data in levels
sample_lev = sample[['logGDP','logCons']]
# Plot data
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(6.5,2.5))
ax.plot(sample_lev['logGDP'], 
        label='100*log(Real GDP)', linestyle='-')
ax.plot(sample_lev['logCons'], 
        label='100*log(Real Cons.)', linestyle='--')
ax.grid(); ax.legend(loc='upper left');
[Figure: 100*log(Real GDP) and 100*log(Real Cons.), 1960-1984]

ADF Unit Root Test#

from statsmodels.tsa.stattools import adfuller
# Function to organize ADF test results
def adf_test(data,const_trend):
    keys = ['Test Statistic','p-value','# of Lags','# of Obs']
    values = adfuller(data,regression=const_trend)
    test = pd.DataFrame.from_dict(dict(zip(keys,values[0:4])),
                                  orient='index',columns=[data.name])
    return test
dl = []
for column in sample.columns:
    test = adf_test(sample[column],const_trend='c')
    dl.append(test)
results = pd.concat(dl, axis=1)
display(results)
                  logGDP    dlogGDP    logCons   dlogCons
Test Statistic -1.341693  -4.949552  -1.158324  -5.421674
p-value         0.609889   0.000028   0.691262   0.000003
# of Lags       2.000000   1.000000   2.000000   1.000000
# of Obs       97.000000  98.000000  97.000000  98.000000
  • Note: the argument const_trend sets the deterministic terms in the test regression: 'c' includes a constant, and 'ct' includes a constant and a time trend.

  • Fail to reject null for both time series in levels

  • Reject the null for both first-differenced time series

  • Since they are both I(\(1\)), proceed to the Johansen Cointegration Test

Cointegration Test#

# Johansen Cointegration Test
from statsmodels.tsa.vector_ar.vecm import coint_johansen
test = coint_johansen(sample_lev, det_order=-1, k_ar_diff=1)
test_stats = test.lr1; crit_vals = test.cvt[:, 1]
# Print results
for r_0, (test_stat, crit_val) in enumerate(zip(test_stats, crit_vals)):
    print(f'H_0: r <= {r_0}')
    print(f'  Test Stat. = {test_stat:.2f}, 5% Crit. Value = {crit_val:.2f}')
    if test_stat > crit_val:
        print('  => Reject null hypothesis.')
    else:
        print('  => Fail to reject null hypothesis.')
H_0: r <= 0
  Test Stat. = 34.30, 5% Crit. Value = 12.32
  => Reject null hypothesis.
H_0: r <= 1
  Test Stat. = 0.03, 5% Crit. Value = 4.13
  => Fail to reject null hypothesis.
  • There is evidence that GDP and consumption are cointegrated.

  • det_order = -1 omits deterministic terms (a constant or a linear time trend) from the cointegrating relationship.

  • k_ar_diff = 1 means we are using a VECM with 1 lag of differences.

Estimation#

  • There is evidence that both GDP and consumption are I(\(1\)) and they are cointegrated.

  • If the data were not cointegrated, we would first difference them and estimate a VAR instead.

  • Proceed with estimating a bivariate VECM

    \[\begin{align*} \Delta \tilde{y}_{t} &= \mu_1 + \alpha_1 (\beta_1 \tilde{y}_{t-1} + \beta_2 \tilde{c}_{t-1}) + \Gamma_{11} \Delta \tilde{y}_{t-1} + \Gamma_{12} \Delta \tilde{c}_{t-1} + \cdots + \varepsilon_{1,t} \\ \Delta \tilde{c}_{t} &= \mu_2 + \alpha_2 (\beta_1 \tilde{y}_{t-1} + \beta_2 \tilde{c}_{t-1}) + \Gamma_{21} \Delta \tilde{y}_{t-1} + \Gamma_{22} \Delta \tilde{c}_{t-1} + \cdots +\varepsilon_{2,t} \end{align*}\]

    where \(\tilde{y} =\) log real GDP and \(\tilde{c} =\) log real PCE and \(\beta_1 \tilde{y}_{t-1} + \beta_2 \tilde{c}_{t-1}\) is their cointegrating relationship.

Model Selection#

# Select number of lags in VECM
from statsmodels.tsa.vector_ar.vecm import select_order
lag_order_results = select_order(
    sample_lev, maxlags=8, deterministic='co')
print(f'Selected lag order (AIC) = {lag_order_results.aic}')
Selected lag order (AIC) = 1

In select_order:

  • deterministic sets which deterministic terms appear in the VECM; e.g., deterministic = 'co' puts a constant outside the error correction term to estimate the intercepts, since \(\Delta \tilde{y}_{t}\) and \(\Delta \tilde{c}_{t}\) have non-zero means

  • Selection criteria for the lag order include AIC and BIC (among others)

# Determine number of cointegrating relationships
from statsmodels.tsa.vector_ar.vecm import select_coint_rank
coint_rank_results = select_coint_rank(
    sample_lev, method='trace', det_order=-1, k_ar_diff=lag_order_results.aic)
print(f'Cointegration rank = {coint_rank_results.rank}')
Cointegration rank = 1

In select_coint_rank:

  • method='trace' uses the Johansen Cointegration Test

  • det_order=-1 means there are no deterministic terms in the cointegrating relationship(s), the same assumption we made when using coint_johansen directly above.

  • k_ar_diff is set to the lag order selected by AIC.

Results#

from statsmodels.tsa.vector_ar.vecm import VECM
# Estimate VECM
model_vecm = VECM(sample_lev, deterministic='co', 
            k_ar_diff=lag_order_results.aic, 
            coint_rank=coint_rank_results.rank)
results_vecm = model_vecm.fit()
tables = results_vecm.summary().tables
# Print summary tables
for tab in tables:
    print(tab)
Det. terms outside the coint. relation & lagged endog. parameters for equation logGDP

                coef   std err         z     P>|z|    [0.025    0.975]
const        16.8363     4.371     3.852     0.000     8.269    25.403
L1.logGDP     0.1289     0.100     1.290     0.197    -0.067     0.325
L1.logCons    0.3300     0.123     2.677     0.007     0.088     0.572

Det. terms outside the coint. relation & lagged endog. parameters for equation logCons

                coef   std err         z     P>|z|    [0.025    0.975]
const         9.3086     4.419     2.106     0.035     0.647    17.970
L1.logGDP     0.1865     0.101     1.845     0.065    -0.012     0.385
L1.logCons   -0.1434     0.125    -1.151     0.250    -0.388     0.101

Loading coefficients (alpha) for equation logGDP

                coef   std err         z     P>|z|    [0.025    0.975]
ec1          -0.2361     0.063    -3.764     0.000    -0.359    -0.113

Loading coefficients (alpha) for equation logCons

                coef   std err         z     P>|z|    [0.025    0.975]
ec1          -0.1218     0.063    -1.921     0.055    -0.246     0.002

Cointegration relations for loading-coefficients-column 1

                coef   std err         z     P>|z|    [0.025    0.975]
beta.1        1.0000         0         0     0.000     1.000     1.000
beta.2       -0.9444     0.015   -64.031     0.000    -0.973    -0.916
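
  • To map the tables back to the VECM equations written above, the fitted pieces are also available as attributes of the results object; a small sketch (alpha and beta are used in the next section; gamma and det_coef are the statsmodels attributes holding the short-run coefficients \(\boldsymbol{\Gamma}_1\) and the constants \(\mu_1, \mu_2\)):

# Short-run coefficients (Gamma), one row per equation
print(results_vecm.gamma)
# Constants outside the cointegrating relation (mu_1, mu_2)
print(results_vecm.det_coef)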

Cointegrating Relationship#

  • \(\boldsymbol{\Pi} \mathbf{y}_{t-1}\) is the error correction term, i.e., the long-run equilibrium relationship between variables

  • If \(\boldsymbol{\Pi}\) has reduced rank, \(r < n\), then \(\boldsymbol{\Pi} = \boldsymbol{\alpha} \boldsymbol{\beta}'\), where \(\boldsymbol{\beta}_{n \times r}\) is a matrix of cointegrating vectors.

Pi = results_vecm.alpha@results_vecm.beta.T
rankPi = np.linalg.matrix_rank(Pi)
print(f'alpha = {results_vecm.alpha}')
print(f'beta = {results_vecm.beta}')
print(f'Pi = {Pi}')
print(f'rank(Pi) = {rankPi}')
alpha = [[-0.23609997]
 [-0.12179596]]
beta = [[ 1.       ]
 [-0.9444435]]
Pi = [[-0.23609997  0.22298308]
 [-0.12179596  0.1150294 ]]
rank(Pi) = 1

Error Correction Term#

  • Interpreting the cointegration coefficients, or the error correction term (ECT)

    \[\begin{align*} ECT_{t-1} &= \hat{\boldsymbol{\beta}}' [\tilde{y}_{t-1}, \tilde{c}_{t-1}]' = \tilde{y}_{t-1} - 0.94 \tilde{c}_{t-1} \\ \rightarrow \tilde{y}_{t-1} &= 0.94 \tilde{c}_{t-1} + ECT_{t-1} \end{align*}\]
  • The long-run linear relationship between GDP and consumption is strongly positive, \(\tilde{y}_t \approx 0.94 \tilde{c}_t\), and the ECT measures how far away the system is from equilibrium.

  • \(\hat{\alpha}_1 = -0.24\) (and is significant) measures how much GDP corrects disequilibrium, i.e., returns to equilibrium

    • The significance indicates which variables adjust, e.g., \(\hat{\alpha}_2\) was not significant, so consumption probably does not correct disequilibrium.

    • The sign indicates the direction of adjustment of each variable towards the long-run equilibrium, e.g., since \(\hat{\alpha}_1 < 0\), GDP pushes the system back to equilibrium

    • The magnitude is the speed or strength of adjustment, e.g., since \(|\hat{\alpha}_1| > |\hat{\alpha}_2|\), GDP adjusts more quickly than consumption.
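
    • As a rough back-of-envelope (this ignores the short-run \(\boldsymbol{\Gamma}\) dynamics and assumes deviations decay geometrically at rate \(1 - |\hat{\alpha}_1|\) per quarter), the implied half-life of a deviation from equilibrium is about 2.6 quarters:

import numpy as np
# Back-of-envelope: half-life of a deviation from equilibrium,
# assuming geometric decay at rate (1 - |alpha_1|) each quarter
alpha_1 = -0.236
half_life = np.log(0.5)/np.log(1 - abs(alpha_1))
print(f'half-life = {half_life:.1f} quarters')  # about 2.6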

# Error Correction Term
ECT = sample_lev@results_vecm.beta
ECT.name = 'ECT'
# Plot data
fig, ax = plt.subplots(figsize=(6.5,2.5))
ax.plot(ECT); ax.grid(); ax.set_title(ECT.name);
[Figure: time series plot of the ECT, 1960-1984]
# Unit Root Test   
test = adf_test(ECT,'c')
display(test)
                      ECT
Test Statistic  -3.330790
p-value          0.013557
# of Lags       11.000000
# of Obs        88.000000

Result: reject that the ECT has a unit root and conclude that it is probably stationary, as desired.

Impulse Response Functions#

  • The estimated VECM cannot be used to compute IRFs

  • IRFs are only defined for a VAR with stationary time series or a VAR with cointegrated time series in levels (e.g., Christiano et al., 2005)

  • Instead we would estimate a reduced-form VAR(\(p\)) separately to compute IRFs, where \(p\) is the VECM’s selected lag order in differences plus one (a VECM with \(p-1\) lagged differences corresponds to a VAR(\(p\)) in levels)

  • Then we can ask the question, how much does an increase in income change consumption and for how long?

  • Since GDP and consumption are both clearly endogenous to many other things in the macroeconomy, it doesn’t make sense to use a Cholesky decomposition (i.e., orthogonalization) to identify shocks.

# VAR model
from statsmodels.tsa.api import VAR
# make the VAR model
model_var = VAR(sample_lev)
# Estimate VAR(p)
results_var = model_var.fit(model_vecm.k_ar_diff + 1)
# Assign impulse response functions (IRFs)
irf = results_var.irf(20)
# Plot IRFs
fig = irf.plot(orth=False,impulse='logGDP',figsize=(6.5,4));
fig.suptitle(" ");
[Figure: IRFs to a logGDP impulse (orth=False): responses of logGDP and logCons]

Forecasting#

  • With the estimated VAR, forecasts can also be made

  • Want to compare forecasts to actual data

  • This may not be a great example because the 1980s are still relatively volatile

# Lag order and end of sample
p = results_var.k_ar
last_obs = sample_lev.values[-p:] 
# Forecast 6 quarters ahead
h = 6
forecast = results_var.forecast(y=last_obs, steps=h)
forecast_df = pd.DataFrame(forecast, columns=sample_lev.columns)
dates = pd.date_range(start='03-31-1985', periods=h, freq='QE')
forecast_df.index = dates  # Assign to index
# Actual Data
start_date = pd.Timestamp('12-31-1984')-pd.DateOffset(months=3*p)
end_date = pd.Timestamp('12-31-1984')+pd.DateOffset(months=3*h)
print(f'start date = {start_date}, end date = {end_date}')
sample_forecast = data[start_date:end_date]
start date = 1984-06-30 00:00:00, end date = 1986-06-30 00:00:00
fig, ax = plt.subplots(figsize=(6.5,2))
forecast_df['logGDP'].plot(ax=ax, linestyle='--')
sample_forecast['logGDP'].plot(ax=ax, linestyle='-')
ax.grid(); ax.autoscale(tight=True); 
[Figure: logGDP forecast (dashed) vs. actual (solid)]
fig, ax = plt.subplots(figsize=(6.5,2))
forecast_df['logCons'].plot(ax=ax, linestyle='--')
sample_forecast['logCons'].plot(ax=ax, linestyle='-')
ax.grid(); ax.autoscale(tight=True); 
[Figure: logCons forecast (dashed) vs. actual (solid)]