Article

Cointegration and Adjustment in the CVAR(∞) Representation of Some Partially Observed CVAR(1) Models

Department of Economics, University of Copenhagen, Øster Farimagsgade 5, building 26, DK-1353 Copenhagen K, Denmark
Received: 18 September 2018 / Revised: 29 October 2018 / Accepted: 8 January 2019 / Published: 10 January 2019
(This article belongs to the Special Issue Celebrated Econometricians: Katarina Juselius and Søren Johansen)

Abstract

A multivariate CVAR(1) model for some observed variables and some unobserved variables is analysed using its infinite order CVAR representation of the observations. Cointegration and adjustment coefficients in the infinite order CVAR are found as functions of the parameters in the CVAR(1) model. Conditions for weak exogeneity for the cointegrating vectors in the approximating finite order CVAR are derived. The results are illustrated by two simple examples of relevance for modelling causal graphs.
Keywords: adjustment coefficients; cointegrating coefficients; CVAR; causal models

1. Introduction

In a conceptual exploration of long-run causal order, Hoover (2018) applies the CVAR(1) model for the processes $X_t = (x_{1t}, \dots, x_{pt})'$ and $T_t = (T_{1t}, \dots, T_{mt})'$ to model a causal graph. The process $(X_t; T_t)$ is a solution to the equations
$$\Delta X_{t+1} = M X_t + C T_t + \varepsilon_{t+1}, \qquad \Delta T_{t+1} = \eta_{t+1}, \tag{1}$$
where the error terms $\varepsilon_t$ are independent identically distributed (i.i.d.) Gaussian variables with mean 0 and variance $\Omega_\varepsilon = \operatorname{diag}(\omega_{11}, \dots, \omega_{pp}) > 0$, and are independent of the errors $\eta_t$, which are i.i.d. Gaussian with mean 0 and variance $\Omega_\eta$.
Thus, the stochastic trends $T_t$ are nonstationary random walks, and conditions will be given below for $X_t$ to be $I(1)$, that is, $X_t$ is nonstationary but $\Delta X_t$ is stationary. This will imply that $M X_t + C T_t$ is stationary, so that $X_t$ and $T_t$ cointegrate.
The entry $M_{ij} \neq 0$ means that $x_j$ causes $x_i$, which is written $x_j \to x_i$, and $C_{ij} \neq 0$ means that $T_j \to x_i$; it is further assumed that $M_{ii} \neq 0$. Note that the model assumes that there are no causal links from $X_t$ to $T_t$, so that $T_t$ is strongly exogenous.
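A simple example for three variables, $x_1, x_2, x_3$, and a trend $T$, is the graph
$$T \to x_1 \to x_2 \to x_3,$$
where the matrices are given by
$$M = \begin{pmatrix} \ast & 0 & 0 \\ \ast & \ast & 0 \\ 0 & \ast & \ast \end{pmatrix}, \qquad C = \begin{pmatrix} \ast \\ 0 \\ 0 \end{pmatrix},$$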
where ∗ indicates a nonzero coefficient.
Provided that $I_p + M$ has all eigenvalues in the open unit disk, it is seen that
$$M X_{t+1} + C T_{t+1} = (I_p + M)(M X_t + C T_t) + M \varepsilon_{t+1} + C \eta_{t+1},$$
determines a stationary process defined for all $t$. We define a nonstationary solution to (1) for $t = 0, 1, \dots$ by
$$X_t = -M^{-1} C \sum_{i=1}^{t} \eta_i + M^{-1} \sum_{i=0}^{\infty} (I_p + M)^i (M \varepsilon_{t-i} + C \eta_{t-i}) \quad \text{and} \quad T_t = \sum_{i=1}^{t} \eta_i. \tag{2}$$
Note that the starting values are
$$X_0 = M^{-1} \sum_{i=0}^{\infty} (I_p + M)^i (M \varepsilon_{-i} + C \eta_{-i}) \quad \text{and} \quad T_0 = 0.$$
It is seen that $\Delta X_{t+1}$, $\Delta T_{t+1}$ and $M X_t + C T_t$ are stationary processes for all $t$, and that $(X_t; T_t)$ is a solution to Equation (1). In the following, we assume that $(X_t; T_t)$ is defined by (2) for $t = 0, 1, \dots$
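As an illustration, the chain $T \to x_1 \to x_2 \to x_3$ can be simulated and the recursion for $M X_t + C T_t$ checked along the simulated path. This is only a sketch: the numerical values in `M` and `C`, the unit error variances, and the start at $X_0 = 0$ (instead of the stationary starting value) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coefficients for the chain T -> x1 -> x2 -> x3 (illustrative values).
M = np.array([[-0.5, 0.0, 0.0],
              [0.4, -0.6, 0.0],
              [0.0, 0.3, -0.7]])
C = np.array([[1.0], [0.0], [0.0]])
p, m, n = 3, 1, 500

# I_p + M must have all eigenvalues in the open unit disk.
spectral_radius = np.max(np.abs(np.linalg.eigvals(np.eye(p) + M)))

eps = rng.normal(size=(n + 1, p))   # assumed Omega_eps = I_p
eta = rng.normal(size=(n + 1, m))   # assumed Omega_eta = I_m

X = np.zeros((n + 1, p))            # X_0 = 0 is a simplification
T = np.zeros((n + 1, m))            # T_0 = 0 as in the text
for t in range(n):
    X[t + 1] = X[t] + M @ X[t] + C @ T[t] + eps[t + 1]
    T[t + 1] = T[t] + eta[t + 1]

# S_t = M X_t + C T_t should satisfy S_{t+1} = (I_p + M) S_t + M eps_{t+1} + C eta_{t+1}.
S = X @ M.T + T @ C.T
S_next = S[:-1] @ (np.eye(p) + M).T + eps[1:] @ M.T + eta[1:] @ C.T
recursion_gap = np.max(np.abs(S[1:] - S_next))
```

With these values, `spectral_radius` is below one and `recursion_gap` is at the level of rounding error, while the levels `X` and `T` wander like random walks.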
The paper by Hoover gives a detailed and general discussion of the problems of recovering causal structures from nonstationary observations $X_t$, or subsets of $X_t$, when $T_t$ is unobserved, that is, $X_t = (X_{1t}; X_{2t})$, where the observations $X_{1t}$ are $p_1$-dimensional and the unobserved processes $X_{2t}$ and $T_t$ are $p_2$- and $m$-dimensional, respectively, with $p = p_1 + p_2$. It is assumed that there are at least as many observations as trends, that is, $p_1 \geq m$.
Model (1) is therefore rewritten as
$$\begin{aligned} \Delta X_{1,t+1} &= M_{11} X_{1t} + M_{12} X_{2t} + C_1 T_t + \varepsilon_{1,t+1}, \\ \Delta X_{2,t+1} &= M_{21} X_{1t} + M_{22} X_{2t} + C_2 T_t + \varepsilon_{2,t+1}, \\ \Delta T_{t+1} &= \eta_{t+1}. \end{aligned} \tag{3}$$
Note that there is now a causal link from the observed process $X_{1t}$ to the unobserved process $X_{2t}$ if $M_{21} \neq 0$.
It follows from (3) that $X_{1t}$ is $I(1)$ and cointegrated with $p_1 - m$ cointegrating vectors $\beta$, see Theorem 1. Therefore, $\Delta X_{1t}$ has an infinite order autoregressive representation, see (Johansen and Juselius 2014, Lemma 2), which is written as
$$\Delta X_{1,t+1} = \alpha \beta' X_{1t} + \sum_{i=1}^{\infty} \Gamma_i \Delta X_{1,t+1-i} + \nu^{\beta}_{t+1}, \tag{4}$$
where the operator norm $\|\Gamma_i\| = \lambda_{\max}^{1/2}(\Gamma_i' \Gamma_i)$ is $O(\rho^i)$ for some $0 < \rho < 1$. The matrices $\alpha$ and $\beta$ are $p_1 \times (p_1 - m)$ of rank $p_1 - m$, and $\nu^{\beta}_{t+1} = \Delta X_{1,t+1} - E(\Delta X_{1,t+1} \mid \mathcal{F}^{\beta}_t)$, where $\mathcal{F}^{\beta}_t = \sigma(\Delta X_{1s}, s \leq t, \beta' X_{1t})$. Thus, $X_{1t}$ is not measurable with respect to $\mathcal{F}^{\beta}_t$, but $\beta' X_{1t}$ is. Here, the prediction errors $\nu^{\beta}_{t+1}$ are i.i.d. $N_{p_1}(0, \Sigma)$, where $\Sigma$ is calculated below. The representation of $X_{1t}$, similar to (2), is
$$X_{1t} = \beta_\perp (\alpha_\perp' \Gamma \beta_\perp)^{-1} \alpha_\perp' \sum_{i=1}^{t} \nu^{\beta}_i + \sum_{i=0}^{\infty} C_i \nu^{\beta}_{t-i}, \quad t = 0, 1, \dots, \tag{5}$$
where $\Gamma = I_{p_1} - \sum_{i=1}^{\infty} \Gamma_i$ and $\|C_i\| = O(\rho^i)$. Here, $\beta_\perp$ is a $p_1 \times m$ matrix of full rank for which $\beta' \beta_\perp = 0$, and similarly for $\alpha_\perp$. This shows that $X_{1t}$ is a cointegrated $I(1)$ process, that is, $X_{1t}$ is nonstationary, while $\beta' X_{1t}$ and $\Delta X_{1t}$ are stationary.
A statistical analysis, including estimation of $\alpha$, $\beta$, and $\Gamma$, can be conducted for the observations $X_{1t}$, $t = 1, \dots, T$, using an approximating finite order CVAR, see Saikkonen (1992) and Saikkonen and Lütkepohl (1996).
Hoover (2018) investigates, in particular, whether weak exogeneity for $\beta$ in the approximating finite order CVAR, that is, a zero row in $\alpha$, is a useful tool for finding the causal structure in the graph.
The present note solves the problem of finding expressions for the parameters $\alpha$ and $\beta$ in the CVAR($\infty$) model (4) for the observation $X_{1t}$, as functions of the parameters in model (3), and finds conditions on these for the presence of a zero row in $\alpha$, and hence weak exogeneity for $\beta$ in the approximating finite order CVAR.

2. The Assumptions and Main Results

First, some definitions and assumptions are given, then the main results on α and β are presented and proved in Theorems 1 and 2. These results rely on Theorem A1 on the solution of an algebraic Riccati equation, which is given and proved in the Appendix A.
In the following, a $k \times k$ matrix is called stable if all its eigenvalues are contained in the open unit disk. If $A$ is a $k_1 \times k_2$ matrix of rank $k \leq \min(k_1, k_2)$, an orthogonal complement, $A_\perp$, is defined as a $k_1 \times (k_1 - k)$ matrix of rank $k_1 - k$ for which $A' A_\perp = 0$. If $k_1 = k$, $A_\perp = 0$. Note that $A_\perp$ is only defined up to multiplication from the right by a $(k_1 - k) \times (k_1 - k)$ matrix of full rank. Throughout, $E_t(\cdot)$ and $\operatorname{Var}_t(\cdot)$ denote conditional expectation and variance given the sigma-field $\mathcal{F}_{0,t} = \sigma\{X_{1,s}, 0 \leq s \leq t\}$, generated by the observations.
Assumption 1.
In Equation (3), it is assumed that
(i) $\varepsilon_{1t}$, $\varepsilon_{2t}$, and $\eta_t$ are mutually independent and i.i.d. Gaussian with mean zero and variances $\Omega_1$, $\Omega_2$, and $\Omega_\eta$, where $\Omega_1$ and $\Omega_2$ are diagonal matrices,
(ii) $I_{p_1} + M_{11}$, $I_{p_2} + M_{22}$ and $I_p + M$ are stable,
(iii) $C_{1.2} = C_1 - M_{12} M_{22}^{-1} C_2$ has full rank $m$.
Let $(X_{1t}; X_{2t}; T_t)$, $t = 0, 1, \dots, n$, be the solution to (3) given in (2), such that $\Delta X_t$ and $M X_t + C T_t$ are stationary.
Assumption 1(ii) on $M_{11}$, $M_{22}$ and $M$ is taken from Hoover (2018) to ensure that, for instance, the process $X_t$ given by the equations $X_t = (I_p + M) X_{t-1} + \text{input}$ is stationary if the input is stationary, such that the nonstationarity of $X_t$ in model (3) is created by the trends $T_t$, and not by the own dynamics of $X_t$ as given by $M$. It follows from this assumption that $M$ is nonsingular, because $I_p + M$ is stable (a zero eigenvalue of $M$ would be a unit eigenvalue of $I_p + M$), and similarly for $M_{11}$ and $M_{22}$. Moreover, $M_{11.2} = M_{11} - M_{12} M_{22}^{-1} M_{21}$ is nonsingular because
$$\det M = \det M_{22} \, \det M_{11.2} \neq 0.$$

The Main Results

The first result on β is a simple consequence of model (3).
Theorem 1.
Assumption 1 implies that the cointegrating rank is $r = p_1 - m$, and that the coefficients $\beta$ and $\beta_\perp$ in the CVAR($\infty$) representation for $X_{1t}$, see (4), are given for $p_1 > m$ as
$$\beta_\perp = M_{11.2}^{-1} C_{1.2} \quad \text{and} \quad \beta = M_{11.2}' (C_{1.2})_\perp. \tag{6}$$
For $p_1 = m$, $\beta_\perp$ has rank $p_1$, and there is no cointegration: $\alpha = \beta = 0$.
Proof of Theorem 1.
From the model Equation (3), it follows, by eliminating $X_{2t}$ from the first two equations, that
$$\Delta X_{1,t+1} - M_{12} M_{22}^{-1} \Delta X_{2,t+1} = M_{11.2} X_{1t} + C_{1.2} T_t + \varepsilon_{1,t+1} - M_{12} M_{22}^{-1} \varepsilon_{2,t+1}.$$
Solving for the nonstationary terms gives
$$M_{11.2} X_{1t} + C_{1.2} T_t = \Delta X_{1,t+1} - M_{12} M_{22}^{-1} \Delta X_{2,t+1} - \varepsilon_{1,t+1} + M_{12} M_{22}^{-1} \varepsilon_{2,t+1}. \tag{7}$$
Multiplying by $\beta' M_{11.2}^{-1}$, it is seen that $\beta' X_{1t}$ is stationary if $\beta' M_{11.2}^{-1} C_{1.2} = 0$. By Assumption 1(iii), $C_{1.2}$ has rank $m$, so that $\beta$ has rank $p_1 - m$, which proves (6). ☐
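The formulas in (6) are straightforward to evaluate numerically. The sketch below uses hypothetical parameter values with $p_1 = 2$, $p_2 = 1$, $m = 1$ that satisfy Assumption 1(ii) and (iii), and an SVD-based helper `orth_complement` (a name introduced here, not from the paper) for the orthogonal complement.

```python
import numpy as np

def orth_complement(A, tol=1e-10):
    """Orthonormal basis of the orthogonal complement of the column space of A."""
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol))
    return U[:, rank:]

# Hypothetical parameter values (illustration only): p1 = 2, p2 = 1, m = 1.
M11 = np.array([[-0.5, 0.1], [0.2, -0.6]])
M12 = np.array([[0.3], [0.0]])
M21 = np.array([[0.0, 0.2]])
M22 = np.array([[-0.4]])
C1 = np.array([[1.0], [0.5]])
C2 = np.array([[0.3]])

M22_inv = np.linalg.inv(M22)
M11_2 = M11 - M12 @ M22_inv @ M21      # M_{11.2}
C1_2 = C1 - M12 @ M22_inv @ C2         # C_{1.2}

beta_perp = np.linalg.solve(M11_2, C1_2)     # p1 x m
beta = M11_2.T @ orth_complement(C1_2)       # p1 x (p1 - m)

# Determinant factorisation behind the nonsingularity of M_{11.2}.
M = np.block([[M11, M12], [M21, M22]])
det_gap = np.linalg.det(M) - np.linalg.det(M22) * np.linalg.det(M11_2)
```

Because an orthogonal complement is only defined up to a full-rank factor from the right, `beta` is one of many equivalent choices; the product `beta.T @ beta_perp` is zero for any of them.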
The result for $\alpha$ is more involved and is given in Theorem 2. The proof is a further analysis of (7) and involves, first, the representation of $X_{1t}$ in terms of a sum of prediction errors $\nu^{\beta}_t = \Delta X_{1t} - E(\Delta X_{1t} \mid \mathcal{F}^{\beta}_{t-1})$, see (5), and second, a representation of $E(T_t \mid \mathcal{F}_{0,t}) = E(T_t \mid X_{10}, \dots, X_{1t})$ as the (weighted) sum of the prediction errors $\nu_{0t} = \Delta X_{1t} - E(\Delta X_{1t} \mid \mathcal{F}_{0,t-1})$. The second representation requires a result from control theory on the solution of an algebraic Riccati equation, together with some results based on the Kalman filter for the calculation of the conditional mean and variance of the unobserved processes $X_{2t}$, $T_t$ given the observations $X_{1s}$, $0 \leq s \leq t$. These are collected as Theorem A1 in Appendix A.
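For the discussion of these results, it is useful to reformulate (3) by defining the unobserved variables and errors
$$T^*_t = \begin{pmatrix} X_{2t} \\ T_t \end{pmatrix}, \quad \eta^*_t = \begin{pmatrix} \varepsilon_{2t} \\ \eta_t \end{pmatrix}, \quad \Omega^* = \operatorname{Var}(\eta^*_t) = \begin{pmatrix} \Omega_2 & 0 \\ 0 & \Omega_\eta \end{pmatrix}, \tag{8}$$
and the matrices
$$Q = \begin{pmatrix} I_{p_2} + M_{22} & C_2 \\ 0 & I_m \end{pmatrix}, \quad M^*_{21} = \begin{pmatrix} M_{21} \\ 0 \end{pmatrix}, \quad C^* = (M_{12}, C_1). \tag{9}$$
Then, (3) becomes
$$X_{1,t+1} = (I_{p_1} + M_{11}) X_{1t} + C^* T^*_t + \varepsilon_{1,t+1}, \qquad T^*_{t+1} = M^*_{21} X_{1t} + Q T^*_t + \eta^*_{t+1}. \tag{10}$$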
One can then show, see Theorem A1, that based on properties of the Gaussian distribution, a recursion can be found for the calculation of $E_t = E_t(T^*_t) = E(T^*_t \mid \mathcal{F}_{0,t})$ and $V_t = \operatorname{Var}_t(T^*_t) = \operatorname{Var}(T^*_t \mid \mathcal{F}_{0,t})$, using the matrices in (8) and (9), by the equations
$$V_{t+1} = Q V_t Q' + \Omega^* - Q V_t C^{*\prime} (C^* V_t C^{*\prime} + \Omega_1)^{-1} C^* V_t Q', \tag{11}$$
$$E_{t+1} = M^*_{21} X_{1t} + Q E_t + Q V_t C^{*\prime} (C^* V_t C^{*\prime} + \Omega_1)^{-1} \nu_{0,t+1}. \tag{12}$$
It then follows from results from control theory that $V = \lim_{t \to \infty} \operatorname{Var}_t(T^*_t)$ exists and satisfies the algebraic Riccati equation
$$V = Q V Q' + \Omega^* - Q V C^{*\prime} (C^* V C^{*\prime} + \Omega_1)^{-1} C^* V Q'. \tag{13}$$
Moreover, the prediction errors $\nu_{0t} = \Delta X_{1t} - E(\Delta X_{1t} \mid \mathcal{F}_{0,t-1})$ are independent $N_{p_1}(0, \Sigma_t)$ for $\Sigma_t = C^* V_t C^{*\prime} + \Omega_1$, and the prediction errors $\nu^{\beta}_t = \Delta X_{1t} - E(\Delta X_{1t} \mid \mathcal{F}^{\beta}_{t-1})$ are independent identically distributed $N_{p_1}(0, \Sigma)$ for $\Sigma = C^* V C^{*\prime} + \Omega_1$. Finally, $E_t(T_t)$ has the representation in the prediction errors $\nu_{0i}$,
$$E_t(T_t) = E_0(T_0) + (0, I_m) \sum_{i=1}^{t} V_i C^{*\prime} \Sigma_i^{-1} \nu_{0i}, \tag{14}$$
where $E_0(T_0) = E(T_0 \mid X_{10}) = 0$.
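The variance recursion can be iterated numerically until it settles at a solution of the algebraic Riccati equation. The following is a minimal sketch; the function name `riccati_limit`, the parameter values, the unit error variances and the starting value `V0` are all assumptions chosen for illustration.

```python
import numpy as np

def riccati_limit(Q, C_star, Omega_star, Omega_1, V0, tol=1e-13, max_iter=10_000):
    """Iterate V_{t+1} = Q V_t Q' + Omega* - Q V_t C*' Sigma_t^{-1} C* V_t Q' until it settles."""
    V = V0
    for _ in range(max_iter):
        Sigma_t = C_star @ V @ C_star.T + Omega_1
        gain = Q @ V @ C_star.T @ np.linalg.inv(Sigma_t)
        V_next = Q @ V @ Q.T + Omega_star - gain @ C_star @ V @ Q.T
        V_next = 0.5 * (V_next + V_next.T)   # keep the iterate symmetric
        if np.max(np.abs(V_next - V)) < tol:
            return V_next
        V = V_next
    return V

# Hypothetical values with p1 = 2, p2 = 1, m = 1.
Q = np.array([[0.6, 0.3], [0.0, 1.0]])         # [[I + M22, C2], [0, I_m]]
C_star = np.array([[0.3, 1.0], [0.0, 0.5]])    # (M12, C1)
Omega_star = np.eye(2)                         # assumed Var of (eps_2, eta)
Omega_1 = np.eye(2)                            # assumed Var of eps_1

V = riccati_limit(Q, C_star, Omega_star, Omega_1, V0=np.eye(2))
Sigma = C_star @ V @ C_star.T + Omega_1
residual = (Q @ V @ Q.T + Omega_star
            - Q @ V @ C_star.T @ np.linalg.inv(Sigma) @ C_star @ V @ Q.T - V)
```

A `residual` close to zero indicates that the returned `V` satisfies the Riccati equation to numerical precision.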
Comparing the representation (5) for $X_{1t}$ and (14) for $E_t(T_t)$ gives a more precise relation between the coefficients of the nonstationary terms in (7). The main result of the paper is to show how this leads to expressions for the coefficients $\alpha$ and $\alpha_\perp$ as functions of the parameters in model (3).
Theorem 2.
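Assumption 1 implies that the coefficients $\alpha$ and $\alpha_\perp$ in the CVAR($\infty$) representation of $X_{1t}$ are given for $p_1 > m$ as
$$\alpha_\perp = \Sigma^{-1} (M_{12} V_{2T} + C_1 V_{TT}), \qquad \alpha = \Sigma (M_{12} V_{2T} + C_1 V_{TT})_\perp, \tag{15}$$
where
$$\Sigma = \operatorname{Var}(\nu^{\beta}_t) = C^* V C^{*\prime} + \Omega_1 = (M_{12}, C_1) \begin{pmatrix} V_{22} & V_{2T} \\ V_{T2} & V_{TT} \end{pmatrix} (M_{12}, C_1)' + \Omega_1. \tag{16}$$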
Proof of Theorem 2.
The left hand side of (7) has two nonstationary terms. The observation $X_{1t}$ is represented in (5) in terms of a random walk in the prediction errors $\nu^{\beta}_i$, plus a stationary term, and $T_t$ is a random walk in $\eta_i$. Calculating the conditional expectation given the sigma-field $\mathcal{F}_{0,t}$, $T_t$ is replaced by $E_t(T_t)$, which in (14) is represented as a weighted sum of $\nu_{0i}$. Thus, the conditional expectation of (7) gives
$$M_{11.2} X_{1t} + C_{1.2} E_t(T_t) = E_t(\Delta X_{1,t+1} - M_{12} M_{22}^{-1} \Delta X_{2,t+1}), \tag{17}$$
where the right hand side is bounded in mean:
$$E|E_t(\Delta X_{1,t+1} - M_{12} M_{22}^{-1} \Delta X_{2,t+1})| \leq c \{E|\Delta X_{1,t+1}| + E|\Delta X_{2,t+1}|\} \leq c.$$
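Setting $t = [nu]$ and dividing by $n^{1/2}$, it follows from (5) that
$$n^{-1/2} X_{1[nu]} \xrightarrow{D} \beta_\perp (\alpha_\perp' \Gamma \beta_\perp)^{-1} \alpha_\perp' W_\nu(u),$$
where $W_\nu(u)$ is the Brownian motion generated by the i.i.d. prediction errors $\nu^{\beta}_t$.
From (14), it can be proved that
$$n^{-1/2} E_{[nu]}(T_{[nu]}) = (0, I_m) \, n^{-1/2} \sum_{t=1}^{[nu]} V_t C^{*\prime} \Sigma_t^{-1} \nu_{0t} \xrightarrow{D} (0, I_m) V C^{*\prime} \Sigma^{-1} W_\nu(u). \tag{19}$$
This follows by replacing $V_t$, $\Sigma_t$ by $V$, $\Sigma$, because for $\delta_t = V_t C^{*\prime} \Sigma_t^{-1} - V C^{*\prime} \Sigma^{-1} \to 0$, it holds that
$$\operatorname{Var}\Big(n^{-1/2} \sum_{t=1}^{[nu]} \delta_t \nu_{0t}\Big) = n^{-1} \sum_{t=1}^{[nu]} \delta_t \Sigma_t \delta_t' \to 0, \quad n \to \infty.$$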
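Next we can replace $\nu_{0t}$ by $\nu^{\beta}_t$ as follows: For $t = 0, 1, \dots$, the sum
$$\alpha \beta' X_{1t} + \sum_{i=1}^{t} \Gamma_i \Delta X_{1,t+1-i} = \alpha \beta' X_{1t} + \Gamma_1 \Delta X_{1t} + \dots + \Gamma_t \Delta X_{11},$$
is measurable with respect to both $\mathcal{F}^{\beta}_t$ and $\mathcal{F}_{0,t}$, such that
$$\nu_{0,t+1} - \nu^{\beta}_{t+1} = -E\Big(\sum_{i=t+1}^{\infty} \Gamma_i \Delta X_{1,t+1-i} \,\Big|\, \mathcal{F}_{0,t}\Big) + \sum_{i=t+1}^{\infty} \Gamma_i \Delta X_{1,t+1-i}.$$
Then
$$E|\nu_{0,t+1} - \nu^{\beta}_{t+1}| \leq c \sum_{i=t+1}^{\infty} \rho^i E|\Delta X_{1,t+1-i}| = O(\rho^t),$$
and therefore
$$E\Big|n^{-1/2} \sum_{t=1}^{[nu]} (\nu^{\beta}_{t+1} - \nu_{0,t+1})\Big| \leq n^{-1/2} \sum_{t=1}^{[nu]} E|\nu^{\beta}_{t+1} - \nu_{0,t+1}| \leq c \, n^{-1/2} \sum_{t=1}^{[nu]} \rho^t \to 0, \quad n \to \infty,$$
which proves (19).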
Finally, setting $t = [nu]$ and normalizing (17) by $n^{1/2}$, it follows that in the limit
$$M_{11.2} \beta_\perp (\alpha_\perp' \Gamma \beta_\perp)^{-1} \alpha_\perp' W_\nu(u) + C_{1.2} (0, I_m) V C^{*\prime} \Sigma^{-1} W_\nu(u) = 0 \quad \text{for } u \in [0, 1].$$
This relation shows that the coefficient to $W_\nu(u)$ is zero, so that $\alpha_\perp$ can be chosen as
$$\alpha_\perp = \Sigma^{-1} C^* V (0, I_m)' = \Sigma^{-1} (M_{12} V_{2T} + C_1 V_{TT}),$$
and therefore $\alpha = \Sigma (M_{12} V_{2T} + C_1 V_{TT})_\perp$, which proves (15). ☐
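To see the expressions in (15) at work, one can compute $V$ by iterating the variance recursion and then form $\alpha_\perp$ and $\alpha$. The sketch below is an illustration under assumed parameter values ($p_1 = 2$, $p_2 = 1$, $m = 1$, unit error variances) and a fixed number of iterations; `orth_complement` is a helper introduced here.

```python
import numpy as np

def orth_complement(A, tol=1e-10):
    """Orthonormal basis of the orthogonal complement of the column space of A."""
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    return U[:, int(np.sum(s > tol)):]

# Hypothetical parameter values (illustration only).
p1, p2, m = 2, 1, 1
M12 = np.array([[0.3], [0.0]])
M22 = np.array([[-0.4]])
C1 = np.array([[1.0], [0.5]])
C2 = np.array([[0.3]])

Q = np.block([[np.eye(p2) + M22, C2], [np.zeros((m, p2)), np.eye(m)]])
C_star = np.hstack([M12, C1])
Omega_star, Omega_1 = np.eye(p2 + m), np.eye(p1)

# Iterate the variance recursion; 2000 steps is an arbitrary, generous choice.
V = np.eye(p2 + m)
for _ in range(2000):
    Sigma_t = C_star @ V @ C_star.T + Omega_1
    V = Q @ V @ Q.T + Omega_star - Q @ V @ C_star.T @ np.linalg.solve(Sigma_t, C_star @ V @ Q.T)

Sigma = C_star @ V @ C_star.T + Omega_1
V_2T, V_TT = V[:p2, p2:], V[p2:, p2:]
G = M12 @ V_2T + C1 @ V_TT                 # p1 x m
alpha_perp = np.linalg.solve(Sigma, G)     # Sigma^{-1} (M12 V_2T + C1 V_TT)
alpha = Sigma @ orth_complement(G)         # Sigma (M12 V_2T + C1 V_TT)_perp
```

By construction `alpha.T @ alpha_perp` vanishes; whether `alpha` has a zero row depends on the parameter values, which is the topic of the next section.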

3. Two Examples of Simplifying Assumptions

It follows from Theorem 2 that in order to investigate a zero row in $\alpha$, the matrix $V$ is needed. This is easy to calculate from the recursion (11) for a given value of the parameters, but the properties of $V$ are more difficult to evaluate. In general, $\alpha$ does not contain a zero row, but if $M_{12} V_{2T} = 0$, the expressions for $\alpha_\perp$ and $\alpha$ simplify, so that simple conditions on $M_{12}$ and $C_1$ imply a zero row in $\alpha$ and hence give weak exogeneity in the statistical analysis of the approximating finite order CVAR. This extra condition, $M_{12} V_{2T} = 0$, implies that
$$\Sigma = (M_{12}, C_1) V (M_{12}, C_1)' + \Omega_1 = M_{12} V_{22} M_{12}' + C_1 V_{TT} C_1' + \Omega_1,$$
and
$$(M_{12} V_{2T} + C_1 V_{TT})_\perp = (C_1 V_{TT})_\perp = C_{1\perp},$$
such that, using $C_1' C_{1\perp} = 0$, $\alpha$ simplifies to
$$\alpha = (M_{12} V_{22} M_{12}' + C_1 V_{TT} C_1' + \Omega_1) C_{1\perp} = (M_{12} V_{22} M_{12}' + \Omega_1) C_{1\perp}.$$
Thus, a condition for a zero row in $\alpha$ is
$$e_i' \alpha = e_i' M_{12} V_{22} M_{12}' C_{1\perp} + \omega_i e_i' C_{1\perp} = 0, \tag{20}$$
because $\Omega_1 = \operatorname{diag}(\omega_1, \dots, \omega_{p_1})$. This is simple to check by inspecting the matrices $M_{12}$ and $C_1$ in model (3). In the following, two cases are given where such a simple solution is available.
Case 1 ($M_{12} = 0$). If the unobserved process $X_{2t}$ does not cause the observation $X_{1t}$, then $M_{12} = 0$. Therefore, $M_{12} V_{2T} = 0$ and from (20) it follows that
$$e_i' \alpha = \omega_i e_i' C_{1\perp} = 0.$$
Thus, $\alpha$ has a zero row if $C_{1\perp}$ has a zero row.
An example of $M_{12} = 0$ is the chain $T \to x_1 \to x_2 \to x_3$, where $X_1 = \{x_1, x_2, x_3\}$ is observed and $X_2 = 0$, and hence $M_{12} = 0$ and $C_2 = 0$. Then, because $T \to x_1$,
$$C_1 = \begin{pmatrix} \ast \\ 0 \\ 0 \end{pmatrix}, \qquad C_{1\perp} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Thus, the first row of $C_{1\perp}$ is a zero row, such that $x_1$ is weakly exogenous.
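A small numerical version of this example may help. With all of $x_1, x_2, x_3$ observed there is no unobserved $X_2$, so the state is the scalar trend and the Riccati recursion is scalar. The nonzero entry of `C1`, the diagonal of `Omega_1` and `omega_eta` are hypothetical values chosen for the sketch.

```python
import numpy as np

# Chain T -> x1 -> x2 -> x3 with all three variables observed (p1 = 3, p2 = 0, m = 1).
C1 = np.array([[0.8], [0.0], [0.0]])             # only x1 is caused by T (0.8 is illustrative)
C1_perp = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Omega_1 = np.diag([1.0, 0.5, 2.0])               # assumed diagonal error variance
omega_eta = 1.0                                  # assumed variance of eta

# Scalar variance recursion for V = V_TT (here Q = 1 and C* = C1).
v = 1.0
for _ in range(500):
    Sigma_t = v * (C1 @ C1.T) + Omega_1
    v = v + omega_eta - v**2 * (C1.T @ np.linalg.solve(Sigma_t, C1))[0, 0]

Sigma = v * (C1 @ C1.T) + Omega_1
alpha_perp = np.linalg.solve(Sigma, C1 * v)      # Sigma^{-1} C1 V_TT
alpha = Sigma @ C1_perp                          # Sigma (C1 V_TT)_perp = Sigma C1_perp
```

Here `alpha` equals `Omega_1 @ C1_perp`, so its first row is zero regardless of the value of `v`, in line with $e_i' \alpha = \omega_i e_i' C_{1\perp}$.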
To formulate the next case, a definition of strong orthogonality of two matrices is introduced.
Definition 1.
Let $A$ be a $k \times k_1$ matrix and $B$ a $k \times k_2$ matrix. Then, $A$ and $B$ are called strongly orthogonal if $A' D B = 0$ for all diagonal matrices $D$, or equivalently if $A_{ji} B_{j\ell} = 0$ for all $i, j, \ell$.
Thus, if $A_{ji} \neq 0$, we assume that row $j$ of $B$ is zero, and if $B_{j\ell} \neq 0$, row $j$ of $A$ is zero. A simple example is
A = 0 0 0 , B = 0 0 .
Thus, the definition means that if two matrices are strongly orthogonal, it is due to the positions of the zeros and not to linear combination of nonzero numbers being zero.
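The definition translates directly into a row-wise check. The function name `strongly_orthogonal` and the example matrices below are made up for illustration; the second pair shows two vectors that are orthogonal in the usual sense but not strongly orthogonal.

```python
import numpy as np

def strongly_orthogonal(A, B):
    """True if A' D B = 0 for every diagonal D, i.e. A[j, i] * B[j, l] == 0 for all i, j, l."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.shape[0] != B.shape[0]:
        raise ValueError("A and B must have the same number of rows")
    # A row j violates the condition only if both A and B have a nonzero entry in it.
    nonzero_rows_A = np.any(A != 0.0, axis=1)
    nonzero_rows_B = np.any(B != 0.0, axis=1)
    return not bool(np.any(nonzero_rows_A & nonzero_rows_B))

# Hypothetical example: the nonzero rows of A and B do not overlap.
A = np.array([[2.0, 0.0], [0.0, 0.0], [1.0, 3.0]])
B = np.array([[0.0], [5.0], [0.0]])
ok = strongly_orthogonal(A, B)

# Orthogonal through cancellation of nonzero numbers, hence not strongly orthogonal.
a = np.array([[1.0], [1.0]])
b = np.array([[1.0], [-1.0]])
plain_inner_product = (a.T @ b)[0, 0]
not_ok = strongly_orthogonal(a, b)
```

In the second pair, `plain_inner_product` is zero, yet choosing $D = \operatorname{diag}(1, 0)$ gives $a' D b = 1$, so the check returns `False`.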
Thus, in particular, if $M_{12}$ and $C_1$ are strongly orthogonal, and if $T$ causes a variable in $X_1$, then $X_2$ does not cause that variable. The expression for $V$ simplifies in the following case.
Lemma 1.
If $C_2 = 0$ and $M_{12}' \Omega_1^{-1} C_1 = 0$, then $Q = \operatorname{blockdiag}(I_{p_2} + M_{22}; I_m)$, and $V_{2T} = 0$ such that $V = \operatorname{blockdiag}(V_{22}; V_{TT})$.
Proof of Lemma 1.
We first prove that $V_t$ is block diagonal for $t = 0$. From (2), it follows that
$$\begin{pmatrix} X_{10} \\ X_{20} \end{pmatrix} = M^{-1} \sum_{i=0}^{\infty} (I_p + M)^i (M \varepsilon_{-i} + C \eta_{-i}) \quad \text{and} \quad T_0 = 0.$$
Thus, if $\Phi$ denotes the variance of $(X_{10}; X_{20})$, then
$$V_0 = \operatorname{Var}\left( \begin{pmatrix} X_{20} \\ T_0 \end{pmatrix} \Big|\, X_{10} \right) = \begin{pmatrix} \Phi_{22.1} & 0 \\ 0 & 0 \end{pmatrix},$$
and hence block diagonal. Assume, therefore, that $V_t = \operatorname{blockdiag}(V_{t22}; V_{tTT})$ and consider the expression for $V_{t+1}$, see (11). In this expression, $Q$ is block diagonal (because $C_2 = 0$) and $Q V_t Q'$ and $\Omega^*$ are block diagonal, and the same holds for $Q V_t^{1/2}$. Thus, it is enough to show that
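$$V_t^{1/2} C^{*\prime} \{C^* V_t C^{*\prime} + \Omega_1\}^{-1} C^* V_t^{1/2},$$
is block diagonal. To simplify the notation, define the normalized matrices
$$\check{M} = \Omega_1^{-1/2} M_{12} V_{t22}^{1/2} \quad \text{and} \quad \check{C} = \Omega_1^{-1/2} C_1 V_{tTT}^{1/2}.$$
Then, by assumption,
$$\check{M}' \check{C} = V_{t22}^{1/2} M_{12}' \Omega_1^{-1} C_1 V_{tTT}^{1/2} = 0,$$
so that, using $V_{t2T} = 0$,
$$V_t^{1/2} C^{*\prime} (C^* V_t C^{*\prime} + \Omega_1)^{-1} C^* V_t^{1/2} = (\check{M}, \check{C})' (\check{M} \check{M}' + \check{C} \check{C}' + I_{p_1})^{-1} (\check{M}, \check{C}).$$
A direct calculation shows that
$$(\check{M} \check{M}' + \check{C} \check{C}' + I_{p_1})^{-1} = I_{p_1} - \check{M} (I_{p_2} + \check{M}' \check{M})^{-1} \check{M}' - \check{C} (I_m + \check{C}' \check{C})^{-1} \check{C}',$$
and that
$$\check{M}' \{I_{p_1} - \check{M} (I_{p_2} + \check{M}' \check{M})^{-1} \check{M}' - \check{C} (I_m + \check{C}' \check{C})^{-1} \check{C}'\} \check{C} = 0,$$
such that $(\check{M}, \check{C})' (\check{M} \check{M}' + \check{C} \check{C}' + I_{p_1})^{-1} (\check{M}, \check{C})$ is block diagonal.
Then, $V_t^{1/2} C^{*\prime} \{C^* V_t C^{*\prime} + \Omega_1\}^{-1} C^* V_t^{1/2}$ and hence $V_{t+1}$ are block diagonal. Taking the limit for $t \to \infty$, it is seen that also $V$ is block diagonal. ☐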
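The induction step in the proof can be checked numerically by running the variance recursion from a block diagonal start and tracking the off-diagonal block. The parameter values below are hypothetical; they are chosen so that $C_2 = 0$ and $M_{12}' \Omega_1^{-1} C_1 = 0$, and the starting value `V` is an assumed block diagonal matrix rather than the $V_0$ of the proof.

```python
import numpy as np

# Hypothetical values with p1 = 2, p2 = 1, m = 1, C2 = 0 and M12' Omega_1^{-1} C1 = 0.
M12 = np.array([[0.0], [0.3]])
M22 = np.array([[-0.4]])
C1 = np.array([[1.0], [0.0]])
C2 = np.zeros((1, 1))
Omega_1 = np.diag([1.0, 2.0])
Omega_star = np.diag([0.5, 1.0])      # blockdiag(Omega_2, Omega_eta)

Q = np.block([[np.eye(1) + M22, C2], [np.zeros((1, 1)), np.eye(1)]])
C_star = np.hstack([M12, C1])
condition = M12.T @ np.linalg.inv(Omega_1) @ C1

V = np.eye(2)                          # assumed block diagonal starting value
max_offdiag = 0.0
for _ in range(1000):
    Sigma_t = C_star @ V @ C_star.T + Omega_1
    V = Q @ V @ Q.T + Omega_star - Q @ V @ C_star.T @ np.linalg.solve(Sigma_t, C_star @ V @ Q.T)
    # Track the largest off-diagonal entry seen along the recursion.
    max_offdiag = max(max_offdiag, abs(V[0, 1]), abs(V[1, 0]))
```

Under these assumptions `max_offdiag` stays at zero (up to rounding), so every iterate, and hence the limit, is block diagonal.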
Case 2 ($C_2 = 0$, and $M_{12}$ and $C_1$ are strongly orthogonal). Because $C_2 = 0$ and $M_{12}' \Omega_1^{-1} C_1 = 0$ (strong orthogonality with $D = \Omega_1^{-1}$), Lemma 1 shows that $V_{2T} = 0$, so that the condition $M_{12} V_{2T} = 0$ and (20) hold. Moreover, strong orthogonality also implies that $M_{12}' C_1 = 0$ such that $M_{12} = C_{1\perp} \xi$ for some $\xi$. Hence
$$e_i' \alpha = e_i' M_{12} V_{22} M_{12}' C_{1\perp} + \omega_i e_i' C_{1\perp} = e_i' C_{1\perp} (\xi V_{22} M_{12}' C_{1\perp} + \omega_i I_{p_1 - m}),$$
and therefore, a zero row in $C_{1\perp}$ gives a zero row in $\alpha$.
Consider again the chain $T \to x_1 \to x_2 \to x_3$, but assume now that $x_2$ is not observed. Thus, $X_1 = \{x_1, x_3\}$ and $X_2 = \{x_2\}$. Here, $T$ causes $x_1$, and $x_2$ causes $x_3$, so that
$$M_{12} = \begin{pmatrix} 0 \\ \ast \end{pmatrix}, \qquad C_1 = \begin{pmatrix} \ast \\ 0 \end{pmatrix}, \qquad C_2 = 0.$$
Note that $M_{12}' D C_1 = 0$ for all diagonal $D$ because $T$ and $X_2$ cause disjoint subsets of $X_1$. This, together with $C_2 = 0$, implies that $V$ is block diagonal and that (21) holds. Thus, $x_i$ is weakly exogenous, $e_i' \alpha = 0$, if
$$e_i' C_{1\perp} = e_i' \begin{pmatrix} 0 \\ \ast \end{pmatrix} = 0,$$
which holds for $i = 1$.
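The chain with $x_2$ unobserved can also be worked through numerically, comparing the general formula for $\alpha$ from Theorem 2 with the simplified one. All numerical values below are assumptions for illustration, and `orth_complement` is a helper introduced here.

```python
import numpy as np

def orth_complement(A, tol=1e-10):
    """Orthonormal basis of the orthogonal complement of the column space of A."""
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    return U[:, int(np.sum(s > tol)):]

# Observed X1 = (x1, x3), unobserved X2 = (x2); T -> x1 and x2 -> x3 (illustrative values).
M12 = np.array([[0.0], [0.3]])
M22 = np.array([[-0.4]])
C1 = np.array([[1.0], [0.0]])
Omega_1 = np.diag([1.0, 2.0])
Omega_star = np.diag([0.5, 1.0])

Q = np.diag([1.0 + M22[0, 0], 1.0])   # block diagonal because C2 = 0
C_star = np.hstack([M12, C1])

V = np.eye(2)
for _ in range(1000):
    Sigma_t = C_star @ V @ C_star.T + Omega_1
    V = Q @ V @ Q.T + Omega_star - Q @ V @ C_star.T @ np.linalg.solve(Sigma_t, C_star @ V @ Q.T)

Sigma = C_star @ V @ C_star.T + Omega_1
G = M12 @ V[:1, 1:] + C1 @ V[1:, 1:]  # M12 V_2T + C1 V_TT
alpha = Sigma @ orth_complement(G)    # general formula from Theorem 2

C1_perp = np.array([[0.0], [1.0]])
alpha_simple = (M12 @ V[:1, :1] @ M12.T + Omega_1) @ C1_perp   # simplified formula
```

The two versions agree up to the sign of the basis vector returned by the SVD, and the first entry of `alpha` is zero, which corresponds to weak exogeneity of $x_1$ in this example.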

4. Conclusions

This paper investigates the problem of finding adjustment and cointegrating coefficients for the infinite order CVAR representation of a partially observed simple CVAR(1) model. The main tools are some classical results for the solution of the algebraic Riccati equation, and the results are exemplified by an analysis of CVAR(1) models for causal graphs in two cases where simple conditions for weak exogeneity are derived in terms of the parameters of the CVAR(1) model.

Funding

This research received no external funding.

Acknowledgments

The author would like to thank Kevin Hoover for long discussions on the problem and its solution, and Massimo Franchi for reading a first version of the paper and for pointing out the excellent book by Lancaster and Rodman, and two anonymous referees who helped clarify some of the proofs.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A.

The next theorem shows how the Kalman filter can be used to calculate $\operatorname{Var}_t(T^*_t)$ and $E_t(T^*_t)$ using the same technique as for the common trends model, and proves the existence of the limit of $V_t$. The last result follows from the theory of the algebraic Riccati equation, see Lancaster and Rodman (1995), in the following LR(1995).
Theorem A1.
Let $X_{1t}$ and $T^*_t$ be given by model (10) and let Assumption 1 be satisfied. Then, $V_t = \operatorname{Var}_t(T^*_t)$ and $E_t = E_t(T^*_t)$ are given recursively, using the starting values $E_0$ and $V_0$, by
$$V_{t+1} = Q V_t Q' + \Omega^* - Q V_t C^{*\prime} \Sigma_t^{-1} C^* V_t Q', \tag{A1}$$
$$E_{t+1} = M^*_{21} X_{1t} + Q E_t + Q V_t C^{*\prime} \Sigma_t^{-1} \nu_{0,t+1}, \tag{A2}$$
where
$$\Sigma_t = C^* V_t C^{*\prime} + \Omega_1, \tag{A3}$$
and the prediction errors
$$\nu_{0,t+1} = X_{1,t+1} - E_t(X_{1,t+1}) \tag{A4}$$
are independent $N_{p_1}(0, \Sigma_t)$.
The sequence $V_t$, starting with $V_0$, converges to a finite positive limit $V$, which satisfies the algebraic Riccati equation,
$$V = Q V Q' + \Omega^* - Q V C^{*\prime} \Sigma^{-1} C^* V Q', \qquad \Sigma = C^* V C^{*\prime} + \Omega_1. \tag{A5}$$
Furthermore,
$$Q - Q V C^{*\prime} \Sigma^{-1} C^* \tag{A6}$$
is stable, and $E_t(T_t)$ satisfies the equation
$$E_{t+1}(T_{t+1}) = E_t(T_t) + (0, I_m) V_t C^{*\prime} \Sigma_t^{-1} \nu_{0,t+1}. \tag{A7}$$
Proof of Theorem A1.
The variance $V_t = \operatorname{Var}_t(T^*_t)$ can be calculated recursively, using the properties of the Gaussian distribution, as
$$\operatorname{Var}_{t+1}(T^*_{t+1}) = \operatorname{Var}_t(T^*_{t+1} \mid X_{1,t+1}) = \operatorname{Var}_t(T^*_{t+1}) - \operatorname{Cov}_t(T^*_{t+1}; X_{1,t+1}) \operatorname{Var}_t(X_{1,t+1})^{-1} \operatorname{Cov}_t(X_{1,t+1}; T^*_{t+1}). \tag{A8}$$
From the model Equation (10), it follows that
$$\operatorname{Var}_t(T^*_{t+1}) = \operatorname{Var}_t\{M^*_{21} X_{1t} + Q T^*_t + \eta^*_{t+1}\} = Q \operatorname{Var}_t(T^*_t) Q' + \Omega^*, \tag{A9}$$
$$\operatorname{Cov}_t(T^*_{t+1}; X_{1,t+1}) = \operatorname{Cov}_t\{T^*_{t+1}; (I_{p_1} + M_{11}) X_{1t} + C^* T^*_t + \varepsilon_{1,t+1}\} = Q \operatorname{Var}_t(T^*_t) C^{*\prime}, \tag{A10}$$
$$\operatorname{Var}_t(X_{1,t+1}) = \operatorname{Var}_t\{(I_{p_1} + M_{11}) X_{1t} + C^* T^*_t + \varepsilon_{1,t+1}\} = C^* \operatorname{Var}_t(T^*_t) C^{*\prime} + \Omega_1. \tag{A11}$$
Then, (A8)–(A11) give the recursion for $V_t = \operatorname{Var}_t(T^*_t)$ in (A1). Similarly, for the conditional mean, it is seen that
$$E_{t+1}(T^*_{t+1}) = E_t(T^*_{t+1} \mid X_{1,t+1}) = E_t(T^*_{t+1}) + \operatorname{Cov}_t(T^*_{t+1}; X_{1,t+1}) \operatorname{Var}_t(X_{1,t+1})^{-1} \nu_{0,t+1}, \qquad E_t(T^*_{t+1}) = M^*_{21} X_{1t} + Q E_t(T^*_t),$$
which implies (A2) with prediction error $\nu_{0,t+1} = \Delta X_{1,t+1} - E_t(\Delta X_{1,t+1})$.
Note that (A1) is the usual recursion from the Kalman filter equations for the state space model obtained from (10) for $M^*_{21} = 0$, see Durbin and Koopman (2012). Note also, however, that (A2) is not the usual recursion from the common trends model, because of the first term containing $M^*_{21}$. It is seen from (A1) that if $V_t$ converges to $V$, then $V$ has to satisfy the algebraic Riccati equation (A5) and $\Sigma$ is given as indicated.
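The recursions (A1), (A2) and (A4) can be run on simulated data, which also gives a numerical check of (A7). This is a sketch only: the parameter values, the unit error variances, the simulation start at zero and the filter starting values $E_0 = 0$, $V_0 = I$ are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
p1, p2, m, n = 2, 1, 1, 200

# Hypothetical parameter values (illustration only).
M11 = np.array([[-0.5, 0.1], [0.2, -0.6]])
M12 = np.array([[0.3], [0.0]])
M21 = np.array([[0.0, 0.2]])
M22 = np.array([[-0.4]])
C1 = np.array([[1.0], [0.5]])
C2 = np.array([[0.3]])

A = np.eye(p1) + M11
Q = np.block([[np.eye(p2) + M22, C2], [np.zeros((m, p2)), np.eye(m)]])
C_star = np.hstack([M12, C1])
M21_star = np.vstack([M21, np.zeros((m, p1))])
Omega_1, Omega_star = np.eye(p1), np.eye(p2 + m)
select_T = np.hstack([np.zeros((m, p2)), np.eye(m)])     # the matrix (0, I_m)

# Simulate model (10), started at zero for simplicity.
X1 = np.zeros((n + 1, p1))
T_star = np.zeros((n + 1, p2 + m))
for t in range(n):
    X1[t + 1] = A @ X1[t] + C_star @ T_star[t] + rng.normal(size=p1)
    T_star[t + 1] = M21_star @ X1[t] + Q @ T_star[t] + rng.normal(size=p2 + m)

E, V = np.zeros(p2 + m), np.eye(p2 + m)                  # assumed E_0 and V_0
max_gap = 0.0
for t in range(n):
    Sigma_t = C_star @ V @ C_star.T + Omega_1            # (A3)
    nu = X1[t + 1] - A @ X1[t] - C_star @ E              # (A4)
    K = V @ C_star.T @ np.linalg.inv(Sigma_t)
    E_next = M21_star @ X1[t] + Q @ E + Q @ K @ nu       # (A2)
    V = Q @ V @ Q.T + Omega_star - Q @ K @ C_star @ V @ Q.T   # (A1)
    # (A7): the trend block of E_t changes only through the prediction error.
    gap = select_T @ E_next - (select_T @ E + select_T @ K @ nu)
    max_gap = max(max_gap, float(np.max(np.abs(gap))))
    E = E_next
```

The quantity `max_gap` is at rounding-error level, reflecting that $(0, I_m) Q = (0, I_m)$ and $(0, I_m) M^*_{21} = 0$.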
The result that $V_t$ converges to a finite positive limit follows from LR (1995, Theorem 17.5.3), where the assumptions, in the present notation, are
a.1 $(Q; I_{p_2+m})$ is controllable,
a.2 $(Q; I_{p_2+m})$ is stabilizable,
a.3 $(C^*; Q)$ is detectable.
Before giving the proof, some definitions from control theory are given, which are needed for checking the conditions of the results in LR(1995).
Let $A$ be a $k \times k$ matrix and $B$ be a $k \times k_1$ matrix.
d.1 The pair $\{A, B\}$ is called controllable if
$$\operatorname{rank}(B; AB; \dots; A^{k-1} B) = k,$$
LR(1995, (4.1.3)).
d.2 The pair $\{A; B\}$ is stabilizable if there is a $k_1 \times k$ matrix $K$ such that $A + BK$ is stable, LR(1995, page 90, line 5-).
d.3 Finally, $\{B; A\}$ is detectable means that $\{A'; B'\}$ is stabilizable, LR(1995, page 91, line 6-).
The first assumption, a.1, is easy to check: that the pair $(Q; I_{p_2+m})$ is controllable, see d.1, means that
$$\operatorname{rank}(I_{p_2+m}; Q I_{p_2+m}; \dots; Q^{p_2+m-1} I_{p_2+m}) = p_2 + m,$$
which holds because the first block is the identity matrix.
The second assumption, a.2, follows because controllability implies stabilizability, see LR (1995, Theorem 4.4.2).
Finally, d.3 shows that $(C^*; Q)$ detectable means $(Q'; C^{*\prime})$ stabilizable, and LR(1995, Theorem 4.5.6 (b)), see also Hautus (1969), shows that $(Q'; C^{*\prime})$ is stabilizable if and only if
$$\operatorname{rank}(Q - \lambda I_{p_2+m}; C^*) = \operatorname{rank} M(\lambda) = p_2 + m \quad \text{for all } |\lambda| \geq 1, \qquad M(\lambda) = \begin{pmatrix} M_{12} & C_1 \\ I_{p_2} + M_{22} - \lambda I_{p_2} & C_2 \\ 0 & I_m - \lambda I_m \end{pmatrix}.$$
For $\lambda = 1$, using $C_{1.2} = C_1 - M_{12} M_{22}^{-1} C_2$ and Assumption 1, it follows that
$$\operatorname{rank}(M(1)) = \operatorname{rank} \begin{pmatrix} M_{12} & C_1 \\ M_{22} & C_2 \end{pmatrix} = \operatorname{rank} \begin{pmatrix} 0 & C_{1.2} \\ M_{22} & C_2 \end{pmatrix} = \operatorname{rank}(C_{1.2}) + \operatorname{rank}(M_{22}) = m + p_2.$$
For $|\lambda| \geq 1$, $\lambda \neq 1$, using Assumption 1(ii), it is seen that
$$\operatorname{rank}(M(\lambda)) = \operatorname{rank}(I_{p_2} + M_{22} - \lambda I_{p_2}) + \operatorname{rank}(I_m - \lambda I_m) = p_2 + m,$$
because $\lambda$ is not an eigenvalue of the stable matrix $I_{p_2} + M_{22}$ when $|\lambda| \geq 1$.
Thus, $(Q'; C^{*\prime})$ is stabilizable, and assumptions a.1, a.2, a.3 hold, such that LR (1995, Theorem 17.5.3) applies. This proves that the limit $V = \lim_{t \to \infty} V_t$ exists and (A6) holds.
Multiplying (A2) by $(0, I_m)$, it is seen, using $(0, I_m) Q = (0, I_m)$ and $(0, I_m) M^*_{21} = 0$, that a recursion for $E_t(T_t)$ is given by (A7). ☐
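The rank condition used for detectability can be checked numerically. Since $Q - \lambda I$ has full rank unless $\lambda$ is an eigenvalue of $Q$, it suffices to test the eigenvalues of $Q$ with $|\lambda| \geq 1$. The function name `is_detectable`, the tolerance and the parameter values are assumptions for this sketch; the second example violates Assumption 1(iii) on purpose.

```python
import numpy as np

def is_detectable(C, Q, tol=1e-9):
    """Hautus-type test: rank of [Q - lam I; C] equals dim(Q) for each eigenvalue lam of Q with |lam| >= 1."""
    k = Q.shape[0]
    for lam in np.linalg.eigvals(Q):
        if abs(lam) >= 1.0 - tol:
            stacked = np.vstack([Q - lam * np.eye(k), C])
            if np.linalg.matrix_rank(stacked, tol=tol) < k:
                return False
    return True

# Hypothetical values with p2 = 1, m = 1: Q = [[I + M22, C2], [0, I_m]], C* = (M12, C1).
M12 = np.array([[0.3], [0.0]])
M22 = np.array([[-0.4]])
C2 = np.array([[0.3]])
Q = np.block([[np.eye(1) + M22, C2], [np.zeros((1, 1)), np.eye(1)]])

C1_good = np.array([[1.0], [0.5]])              # C_{1.2} has rank m = 1
C1_bad = M12 @ np.linalg.inv(M22) @ C2          # makes C_{1.2} = 0

detectable_good = is_detectable(np.hstack([M12, C1_good]), Q)
detectable_bad = is_detectable(np.hstack([M12, C1_bad]), Q)
```

With `C1_good` the pair passes the test, whereas with `C1_bad` the rank drops at $\lambda = 1$, which illustrates why the full rank of $C_{1.2}$ is needed.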

References

  1. Durbin, James, and Siem Jan Koopman. 2012. Time Series Analysis by State Space Methods, 2nd ed. Oxford: Oxford University Press.
  2. Hautus, Malo L. J. 1969. Controllability and observability conditions of linear autonomous systems. Koninklijke Nederlandse Akademie van Wetenschappen. Indagationes Mathematicae 12: 443–48.
  3. Hoover, Kevin D. 2018. Long-Run Causal Order: A Preliminary Investigation. Durham: Department of Economics and Department of Philosophy, Duke University.
  4. Johansen, Søren, and Katarina Juselius. 2014. An asymptotic invariance property of the common trends under linear transformations of the data. Journal of Econometrics 17: 310–15.
  5. Lancaster, Peter, and Leiba Rodman. 1995. Algebraic Riccati Equations. Oxford: Clarendon Press.
  6. Saikkonen, Pentti. 1992. Estimation and testing of cointegrated systems by an autoregressive approximation. Econometric Theory 8: 1–27.
  7. Saikkonen, Pentti, and Helmut Lütkepohl. 1996. Infinite order cointegrated vector autoregressive processes. Estimation and Inference. Econometric Theory 12: 814–44.