
VECM

by Professor Throckmorton
for Time Series Econometrics
W&M ECON 408/PUBP 616
Slides

Summary

  • Vector Error Correction Model (VECM) is specifically used for analyzing cointegrated time series that are first-order integrated, I(1).

  • VECM builds on the concept of cointegration, i.e., each time series may be I(1), yet certain linear combinations of them are stationary, I(0).

  • VECM is derived from a reduced-form VAR in levels when the data is cointegrated.

  • VECM includes an error correction term that reflects the deviation from the lagged long-run average or “equilibrium”.

VECM

  • Suppose $\mathbf{y}_t$ is a VAR($p$) with I(1), possibly cointegrated, variables in levels, then

    \begin{gather*} \Delta \mathbf{y}_t = \boldsymbol{\Pi} \mathbf{y}_{t-1} + \sum_{h=1}^{p-1} \boldsymbol{\Gamma}_h \Delta \mathbf{y}_{t-h} + \boldsymbol{\upsilon}_t \end{gather*}
  • $\Delta \mathbf{y}_t$ is the first difference of a vector of $n$ non-stationary, I(1), variables, making them stationary. $\sum_{h=1}^{p-1} \boldsymbol{\Gamma}_h \Delta \mathbf{y}_{t-h}$ captures the short-run dynamics of the system, and $\boldsymbol{\upsilon}_t$ is a vector of white noise errors.

  • $\boldsymbol{\Pi} \mathbf{y}_{t-1}$ is the error correction term, i.e., the long-run equilibrium relationship between variables in levels, where $\boldsymbol{\Pi}$ is the cointegration structure.

    • If $\boldsymbol{\Pi} = \boldsymbol{0}$, there is no cointegration and no need for VECM.

    • If $\boldsymbol{\Pi}$ has reduced rank, $r < n$, then $\boldsymbol{\Pi} = \boldsymbol{\alpha} \boldsymbol{\beta}'$ (see the numerical sketch after this list), where:

      • $\boldsymbol{\beta}_{n \times r}$ is a matrix of cointegrating vectors.

      • $\boldsymbol{\alpha}_{n \times r}$ is a matrix of adjustment coefficients, determining how each variable returns to its long-run equilibrium.

    • If $\boldsymbol{\Pi}$ is full rank, $r = n$, then $\mathbf{y}_t$ is already stationary and there is no need for a VECM.
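
As a quick numerical sketch of the reduced-rank case (the values of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ below are made up for illustration, with $n = 2$ and $r = 1$), $\boldsymbol{\Pi} = \boldsymbol{\alpha}\boldsymbol{\beta}'$ is a $2 \times 2$ matrix with rank 1:

# Scientific computing
import numpy as np
# Hypothetical adjustment coefficients (n x r) and cointegrating vector (n x r)
alpha = np.array([[-0.2], [0.1]])
beta = np.array([[1.0], [-0.9]])
# Pi = alpha @ beta' is n x n but only has rank r = 1
Pi = alpha @ beta.T
print(Pi)
print(f'rank(Pi) = {np.linalg.matrix_rank(Pi)}')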

Rank of a matrix

  • A square matrix $\mathbf{A}_{n \times n}$ is full rank if its rank is $n$. It’s not full rank, or has reduced rank, if its rank $< n$.

  • The following conditions are all equivalent — if any one is true, the matrix is full rank (warning: this is not a complete list)

    1. Linearly independent columns: No column can be written as a linear combination of the others.

    2. Linearly independent rows: No row is a linear combination of other rows.

    3. Non-zero determinant: $\det(\mathbf{A}) \ne 0$ or $|\mathbf{A}| \ne 0$

    4. Invertibility: there exists $\mathbf{A}^{-1}$ such that $\mathbf{A} \mathbf{A}^{-1} = I_n$

Examples

  • Consider the following square matrix

    \begin{gather*} \mathbf{A} = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix} \end{gather*}
    • Column 2 is $2\times$ column 1 $\rightarrow$ $\mathbf{A}$ is not full rank

    • Row 1 is $2\times$ row 2 $\rightarrow$ $\mathbf{A}$ is not full rank

  • Consider the following square matrix

    \begin{gather*} \mathbf{A} = \begin{bmatrix} 3 & 5 \\ 2 & 4 \end{bmatrix} \end{gather*}
    • We can’t multiply one row or column by a constant to get the other.

    • The determinant of a $2 \times 2$ matrix is $|\mathbf{A}| = ad - bc$

    • Plug in the values: $|\mathbf{A}| = (3)(4) - (5)(2) = 12 - 10 = 2$

    • Since $|\mathbf{A}| \neq 0$, the matrix is full rank.

# Scientific computing
import numpy as np
# Matrix
A = np.array([
    [1, 2], 
    [2, 4]
])
# Rank
rank = np.linalg.matrix_rank(A)
# Determinant
det = np.linalg.det(A)
# Display
print(f'det = {det:.4f}')
print(rank)
det = 0.0000
1
# Matrix
A = np.array([
    [3, 5], 
    [2, 4]
])
# Rank
rank = np.linalg.matrix_rank(A)
# Determinant
det = np.linalg.det(A)
# Display
print(f'det = {det:.4f}')
print(rank)
det = 2.0000
2
  • For a $3 \times 3$ square matrix

    \begin{gather*} \mathbf{A} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \end{gather*}
  • The determinant formula is $a(ei - fh) - b(di - fg) + c(dh - eg)$.

  • That is a signed sum of three $2 \times 2$ determinants, i.e., a cofactor expansion along the first row.

  • See Determinant.
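
As a quick numerical check (the matrix below is an arbitrary example), the cofactor expansion matches numpy's determinant:

# Verify the 3x3 cofactor expansion against numpy on an arbitrary matrix
import numpy as np
a, b, c, d, e, f, g, h, i = 2, -1, 3, 0, 4, 1, 5, 2, -2
A = np.array([[a, b, c], [d, e, f], [g, h, i]])
# Cofactor expansion along the first row
det_manual = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
print(f'manual det = {det_manual}')
print(f'numpy det  = {np.linalg.det(A):.4f}')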

Cointegration

  • If some linear combination of non-stationary I(1) time series is stationary, then the variables are cointegrated, i.e., they share a stable long-run linear relationship despite short-run fluctuations (illustrated in the sketch after this list).

  • Suppose we want to test how many cointegrating relationships, $r$, exist among $n$ I(1) time series.

  • For example, if $r = 0$, then there are no cointegrating relationships, and if $r = n-1$ then all variables are cointegrated with each other.

  • To justify estimating a VECM, there should be at least one cointegrating relationship between the variables; otherwise, just estimate a VAR after first differencing.
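
A minimal simulation sketch of this idea (all values below are arbitrary and separate from the GDP/consumption example later): two series built from a common random walk are each I(1), but the linear combination y1 - 2*y2 cancels the shared trend and is stationary.

# Simulate two cointegrated I(1) series that share a common stochastic trend
import numpy as np
from statsmodels.tsa.stattools import adfuller
rng = np.random.default_rng(0)
T = 500
trend = np.cumsum(rng.normal(size=T))    # common random walk (stochastic trend)
y1 = trend + rng.normal(size=T)          # I(1): trend plus stationary noise
y2 = 0.5*trend + rng.normal(size=T)      # I(1): same trend, different loading
combo = y1 - 2*y2                        # linear combination cancels the trend
# Expect high ADF p-values for the levels (unit root) and a low p-value for the combination
for name, series in [('y1', y1), ('y2', y2), ('y1 - 2*y2', combo)]:
    print(f'{name}: ADF p-value = {adfuller(series)[1]:.3f}')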

Johansen Cointegration Test

  • First, use the ADF (unit root) test to confirm that all time series are I(1).

  • The Johansen Cointegration Test has nested hypotheses

    • Suppose there are $r_0 = 0$ cointegrating relationships

      • Null Hypothesis: $H_0: r \leq r_0$ (i.e., the number of cointegrating relationships $\leq r_0$)

      • Alternative Hypothesis: $H_A: r > r_0$

    • For each null hypothesis $H_0: r \leq r_0$, if the test statistic $>$ critical value $\Rightarrow$ reject $H_0$.

    • Then update $r_0 = 1, 2, 3, \ldots$ until the test fails to reject

  • In practice, use coint_johansen() function from statsmodels.tsa.vector_ar.vecm.

    • Choose the lag length ($p-1$) for the VAR in differences.

    • Decide on deterministic components (constant/time trend) in the test.

  • Proceed to estimate a VECM if cointegration is found.

Example

  • The U.S. macroeconomy has gone through many important generational changes that might affect the cointegrating relationships between aggregate time series variables

  • E.g., The Great Depression, Post-War Period, Great Inflation of the late 1960s to early 1980s, The Great Moderation, The Great Recession (and the slow recovery), Post COVID

  • Let’s test whether consumption and income are cointegrated for the 25 years prior to the Great Moderation (1960-1984)

  • Measure them with log real PCE and log real GDP

Read Data

# Libraries
from fredapi import Fred
import pandas as pd
# Setup access to FRED
fred_api_key = pd.read_csv('fred_api_key.txt', header=None).iloc[0,0]
fred = Fred(api_key=fred_api_key)
# Series to get
series = ['GDP','PCE','GDPDEF']
rename = ['gdp','cons','price']
# Get and append data to list
dl = []
for idx, string in enumerate(series):
    var = fred.get_series(string).to_frame(name=rename[idx])
    dl.append(var)
    print(var.head(2)); print(var.tail(2))
            gdp
1946-01-01  NaN
1946-04-01  NaN
                  gdp
2025-01-01  29962.047
2025-04-01  30331.117
             cons
1959-01-01  306.1
1959-02-01  309.6
               cons
2025-05-01  20615.2
2025-06-01  20685.2
             price
1947-01-01  11.141
1947-04-01  11.299
              price
2025-01-01  127.429
2025-04-01  128.059

Create Dataframe

# Concatenate data to create data frame (time-series table)
raw = pd.concat(dl, axis=1).sort_index()
# Resample/reindex to quarterly frequency
raw = raw.resample('QE').last()
# Display dataframe
display(raw.head(2))
display(raw.tail(2))

Transform Data

data = pd.DataFrame()
# log real GDP
data['logGDP'] = 100*(np.log(raw['gdp']/raw['price']))
data['dlogGDP'] = data['logGDP'].diff(1)
# log real Consumption
data['logCons'] = 100*(np.log(raw['cons']/raw['price']))
data['dlogCons'] = data['logCons'].diff(1)
# Select sample
sample = data['01-01-1960':'12-31-1984']
display(sample.head(2))
display(sample.tail(2))

Plot Sample

# Data in levels
sample_lev = sample[['logGDP','logCons']]
# Plot data
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(6.5,2.5))
ax.plot(sample_lev['logGDP'], 
        label='100*log(Real GDP)', linestyle='-')
ax.plot(sample_lev['logCons'], 
        label='100*log(Real Cons.)', linestyle='--')
ax.grid(); ax.legend(loc='upper left');
[Figure: 100*log(Real GDP) and 100*log(Real Cons.), 1960-1984]

ADF Unit Root Test

from statsmodels.tsa.stattools import adfuller
# Function to organize ADF test results
def adf_test(data,const_trend):
    keys = ['Test Statistic','p-value','# of Lags','# of Obs']
    values = adfuller(data,regression=const_trend)
    test = pd.DataFrame.from_dict(dict(zip(keys,values[0:4])),
                                  orient='index',columns=[data.name])
    return test
dl = []
for column in sample.columns:
    test = adf_test(sample[column],const_trend='c')
    dl.append(test)
results = pd.concat(dl, axis=1)
display(results)
  • Note: the argument const_trend sets whether the test includes a constant only ('c') or a constant and a linear time trend ('ct').

  • Fail to reject null for both time series in levels

  • Reject the null for both first-differenced time series

  • Since they are both I(1), proceed to the Johansen Cointegration Test

  • Stochastic trend, i.e., $y_t = \mu + y_{t-1} + \varepsilon_t$

    • Can drift without bound — non-stationary

  • Deterministic trend, i.e., $y_t = \mu + \delta t + \varepsilon_t$

    • Deviations from the time trend are mean-reverting

  • Both deterministic and stochastic trends, i.e., $y_t = \mu + \delta t + y_{t-1} + \varepsilon_t$

  • First differencing will remove either trend, as the simulation sketch below illustrates.
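
As a quick illustration (the values of $\mu$, $\delta$, and the sample size below are arbitrary), simulating the first two cases shows that after first differencing the drift or slope becomes the mean and no trend remains:

# Simulate a stochastic trend vs. a deterministic trend (arbitrary mu, delta)
import numpy as np
rng = np.random.default_rng(1)
T = 400
mu, delta = 0.5, 0.5
eps = rng.normal(size=T)
t = np.arange(T)
y_stochastic = np.cumsum(mu + eps)     # y_t = mu + y_{t-1} + eps_t (random walk with drift)
y_deterministic = mu + delta*t + eps   # y_t = mu + delta*t + eps_t (trend stationary)
# First differences are stationary with a constant mean in both cases
for name, y in [('stochastic', y_stochastic), ('deterministic', y_deterministic)]:
    dy = np.diff(y)
    print(f'{name}: mean of first difference = {dy.mean():.2f}')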

# Test data for deterministic time trend
dl = []
for column in sample_lev.columns:
    test = adf_test(sample_lev[column],'ct')
    dl.append(test)
results = pd.concat(dl, axis=1)
display(results)
  • In adfuller, setting regression = 'ct' allows for both a constant and linear time trend.

  • Fail to reject the null that the data (after removing linear trend) has a unit root, i.e., removing the linear trend did not make the data stationary.

  • The data probably contains a stochastic trend, i.e., it is difference stationary rather than trend stationary.

  • Regardless, first differencing the data for the VECM will remove these trends, so we don’t need to account for them later.

Cointegration Test

# Johansen Cointegration Test
from statsmodels.tsa.vector_ar.vecm import coint_johansen
test = coint_johansen(sample_lev, det_order=-1, k_ar_diff=1)
# Trace statistics (lr1) and 5% critical values (column 1 of cvt)
test_stats = test.lr1; crit_vals = test.cvt[:, 1]
# Print results
for r_0, (test_stat, crit_val) in enumerate(zip(test_stats, crit_vals)):
    print(f'H_0: r <= {r_0}')
    print(f'  Test Stat. = {test_stat:.2f}, 5% Crit. Value = {crit_val:.2f}')
    if test_stat > crit_val:
        print('  => Reject null hypothesis.')
    else:
        print('  => Fail to reject null hypothesis.')
H_0: r <= 0
  Test Stat. = 34.30, 5% Crit. Value = 12.32
  => Reject null hypothesis.
H_0: r <= 1
  Test Stat. = 0.03, 5% Crit. Value = 4.13
  => Fail to reject null hypothesis.
  • There is evidence that GDP and consumption are cointegrated.

  • det_order = -1 omits a common constant or linear time trend from the cointegrating relationship.

  • k_ar_diff = 1 means we are using a VECM with 1 lag of differences.

Estimation

  • There is evidence that both GDP and consumption are I(1) and they are cointegrated.

  • If the data were not cointegrated, we would first difference them and estimate a VAR instead.

  • Proceed with estimating a bivariate VECM

    \begin{align*} \Delta \tilde{y}_{t} &= \mu_1 + \alpha_1 (\beta_1 \tilde{y}_{t-1} + \beta_2 \tilde{c}_{t-1}) + \Gamma_{11} \Delta \tilde{y}_{t-1} + \Gamma_{12} \Delta \tilde{c}_{t-1} + \cdots + \varepsilon_{1,t} \\ \Delta \tilde{c}_{t} &= \mu_2 + \alpha_2 (\beta_1 \tilde{y}_{t-1} + \beta_2 \tilde{c}_{t-1}) + \Gamma_{21} \Delta \tilde{y}_{t-1} + \Gamma_{22} \Delta \tilde{c}_{t-1} + \cdots + \varepsilon_{2,t} \end{align*}

    where $\tilde{y} =$ log real GDP, $\tilde{c} =$ log real PCE, and $\beta_1 \tilde{y}_{t-1} + \beta_2 \tilde{c}_{t-1}$ is their cointegrating relationship.

Model Selection

# Select number of lags in VECM
from statsmodels.tsa.vector_ar.vecm import select_order
lag_order_results = select_order(
    sample_lev, maxlags=8, deterministic='co')
print(f'Selected lag order (AIC) = {lag_order_results.aic}')
Selected lag order (AIC) = 1

In select_order:

  • deterministic sets which deterministic terms appear in the VECM, e.g., deterministic = 'co' puts a constant outside the error correction term to estimate the intercepts, since $\Delta \tilde{y}_{t}$ and $\Delta \tilde{c}_{t}$ have non-zero means

  • Selection criteria for the lag order include AIC and BIC (among others); the sketch below shows how to compare them
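
Assuming the lag_order_results object from the cell above (attribute names as in recent statsmodels versions), a minimal sketch for comparing the criteria:

# Compare lag orders selected by different information criteria
print(f'AIC  selects p-1 = {lag_order_results.aic}')
print(f'BIC  selects p-1 = {lag_order_results.bic}')
print(f'HQIC selects p-1 = {lag_order_results.hqic}')
# Full table of criteria across candidate lag lengths
print(lag_order_results.summary())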

# Determine number of cointegrating relationships
from statsmodels.tsa.vector_ar.vecm import select_coint_rank
coint_rank_results = select_coint_rank(
    sample_lev, method='trace', det_order=-1, k_ar_diff=lag_order_results.aic)
print(f'Cointegration rank = {coint_rank_results.rank}')
Cointegration rank = 1

In select_coint_rank:

  • method='trace' uses the trace statistic of the Johansen Cointegration Test

  • det_order=-1 means there are no deterministic terms in the cointegrating relationship(s), the same assumption we made when using coint_johansen directly above.

  • k_ar_diff is set to the lag order selected by the AIC criterion.

Results

from statsmodels.tsa.vector_ar.vecm import VECM
# Estimate VECM
model_vecm = VECM(sample_lev, deterministic='co', 
            k_ar_diff=lag_order_results.aic, 
            coint_rank=coint_rank_results.rank)
results_vecm = model_vecm.fit()
tables = results_vecm.summary().tables
# Print summary tables
#for _, tab in enumerate(tables):
#    print(tab.as_html())
Det. terms and lagged differences for equation logGDP:

                coef   std err        z    P>|z|    [0.025    0.975]
const        16.8363     4.371    3.852    0.000     8.269    25.403
L1.logGDP     0.1289     0.100    1.290    0.197    -0.067     0.325
L1.logCons    0.3300     0.123    2.677    0.007     0.088     0.572

Det. terms and lagged differences for equation logCons:

                coef   std err        z    P>|z|    [0.025    0.975]
const         9.3086     4.419    2.106    0.035     0.647    17.970
L1.logGDP     0.1865     0.101    1.845    0.065    -0.012     0.385
L1.logCons   -0.1434     0.125   -1.151    0.250    -0.388     0.101

Loading coefficients (alpha) for equation logGDP:

                coef   std err        z    P>|z|    [0.025    0.975]
ec1          -0.2361     0.063   -3.764    0.000    -0.359    -0.113

Loading coefficients (alpha) for equation logCons:

                coef   std err        z    P>|z|    [0.025    0.975]
ec1          -0.1218     0.063   -1.921    0.055    -0.246     0.002

Cointegration relations (beta):

                coef   std err        z    P>|z|    [0.025    0.975]
beta.1        1.0000         0        0    0.000     1.000     1.000
beta.2       -0.9444     0.015  -64.031    0.000    -0.973    -0.916

Cointegrating Relationship

  • $\boldsymbol{\Pi} \mathbf{y}_{t-1}$ is the error correction term, i.e., the long-run equilibrium relationship between variables

  • If $\boldsymbol{\Pi}$ has reduced rank, $r < n$, then $\boldsymbol{\Pi} = \boldsymbol{\alpha} \boldsymbol{\beta}'$, where $\boldsymbol{\beta}_{n \times r}$ is a matrix of cointegrating vectors.

Pi = results_vecm.alpha@results_vecm.beta.T
rankPi = np.linalg.matrix_rank(Pi)
print(f'alpha = {results_vecm.alpha}')
print(f'beta = {results_vecm.beta}')
print(f'Pi = {Pi}')
print(f'rank(Pi) = {rankPi}')
alpha = [[-0.23609997]
 [-0.12179596]]
beta = [[ 1.       ]
 [-0.9444435]]
Pi = [[-0.23609997  0.22298308]
 [-0.12179596  0.1150294 ]]
rank(Pi) = 1

Error Correction Term

  • Interpreting the cointegration coefficients, or the error correction term (ECT)

    \begin{align*} ECT_{t-1} &= \hat{\beta}' [\tilde{y}_{t-1},\tilde{c}_{t-1}] = \tilde{y}_{t-1} - 0.94 \tilde{c}_{t-1} \\ \rightarrow \tilde{y}_{t-1} &= 0.94 \tilde{c}_{t-1} + ECT_{t-1} \end{align*}
  • The long-run linear relationship between GDP and consumption is strongly positive, $\tilde{y}_t \approx 0.94 \tilde{c}_t$, and the ECT measures how far away the system is from equilibrium.

  • $\hat{\alpha}_1 = -0.24$ (and is significant) measures how much GDP corrects disequilibrium, i.e., returns to equilibrium

    • The significance indicates which variables adjust, e.g., $\hat{\alpha}_2$ was not significant at the 5% level, so consumption probably does not correct disequilibrium.

    • The sign indicates the direction of adjustment of each variable towards the long-run equilibrium, e.g., since $\hat{\alpha}_1 < 0$, GDP falls when it is above its long-run relationship with consumption, pushing the system back to equilibrium

    • The magnitude is the speed or strength of adjustment, e.g., since $|\hat{\alpha}_1| > |\hat{\alpha}_2|$, GDP adjusts more quickly than consumption.

# Error Correction Term
ECT = sample_lev@results_vecm.beta
ECT.name = 'ECT'
# Plot data
fig, ax = plt.subplots(figsize=(6.5,2.5))
ax.plot(ECT); ax.grid(); ax.set_title(ECT.name);
[Figure: Error Correction Term (ECT), 1960-1984]
# Unit Root Test   
test = adf_test(ECT,'c')
display(test)

Result: reject that the ECT has a unit root and conclude that it is probably stationary, as desired.

Impulse Response Functions

  • The estimated VECM cannot be used to compute IRFs

  • IRFs are only defined for a VAR with stationary time series or a VAR with cointegrated time series in levels (e.g., Christiano et al., 2005)

  • Instead we would estimate a reduced-form VAR($p$) separately to compute IRFs, where $p$ comes from the selected lag order from the VECM estimation

  • Then we can ask the question, how much does an increase in income change consumption and for how long?

  • Since GDP and consumption are both clearly endogenous to many other things in the macroeconomy, it doesn’t make sense to use a Cholesky decomposition (i.e., orthogonalization) to estimate shocks.

# VAR model
from statsmodels.tsa.api import VAR
# make the VAR model
model_var = VAR(sample_lev)
# Estimate VAR(p)
results_var = model_var.fit(model_vecm.k_ar_diff + 1)
# Assign impulse response functions (IRFs)
irf = results_var.irf(20)
# Plot IRFs
fig = irf.plot(orth=False,impulse='logGDP',figsize=(6.5,4));
fig.suptitle(" ");
[Figure: IRFs to a logGDP impulse (non-orthogonalized)]

Forecasting

  • With the estimated VAR, forecasts can also be made

  • Want to compare forecasts to actual data

  • This may not be a great example because the mid-1980s were still relatively volatile

# Lag order and end of sample
p = results_var.k_ar
last_obs = sample_lev.values[-p:] 
# Forecast 6 quarters ahead
h = 6;
forecast = results_var.forecast(y=last_obs, steps=h)
forecast_df = pd.DataFrame(forecast, columns=sample_lev.columns)
dates = pd.date_range(start='03-31-1985', periods=h, freq='QE')
forecast_df.index = dates  # Assign to index
# Actual Data
start_date = pd.Timestamp('12-31-1984')-pd.DateOffset(months=3*p)
end_date = pd.Timestamp('12-31-1984')+pd.DateOffset(months=3*h)
print(f'start date = {start_date}, end date = {end_date}')
sample_forecast = data[start_date:end_date]
start date = 1984-06-30 00:00:00, end date = 1986-06-30 00:00:00
fig, ax = plt.subplots(figsize=(6.5,2))
forecast_df['logGDP'].plot(ax=ax, linestyle='--')
sample_forecast['logGDP'].plot(ax=ax, linestyle='-')
ax.grid(); ax.autoscale(tight=True); 
[Figure: logGDP forecast (dashed) vs. actual]
fig, ax = plt.subplots(figsize=(6.5,2))
forecast_df['logCons'].plot(ax=ax, linestyle='--')
sample_forecast['logCons'].plot(ax=ax, linestyle='-')
ax.grid(); ax.autoscale(tight=True); 
[Figure: logCons forecast (dashed) vs. actual]