Article

Likelihood Ratio Tests of Restrictions on Common Trends Loading Matrices in I(2) VAR Systems

1
Amsterdam School of Economics and Tinbergen Institute, University of Amsterdam, 1001 NJ Amsterdam, The Netherlands
2
Joint Research Centre, European Commission, 21027 Ispra (VA), Italy
*
Author to whom correspondence should be addressed.
Submission received: 28 February 2017 / Revised: 30 May 2017 / Accepted: 21 June 2017 / Published: 29 June 2017
(This article belongs to the Special Issue Recent Developments in Cointegration)

Abstract

Likelihood ratio tests of over-identifying restrictions on the common trends loading matrices in I(2) VAR systems are discussed. It is shown how hypotheses on the common trends loading matrices can be translated into hypotheses on the cointegration parameters. Algorithms for (constrained) maximum likelihood estimation are presented, and their asymptotic properties are sketched. The techniques are illustrated in an analysis of the PPP and the UIP between Switzerland and the US.
JEL Classification:
C32

1. Introduction

The duality between the common trends representation and the vector equilibrium-correction model (VECM) form in cointegrated systems allows researchers to formulate hypotheses of economic interest on either of the two. The VECM is centered on the adjustment with respect to disequilibria in the system; in this way it facilitates the interpretation of cointegrating relations as (deviations from) equilibria.
The common trends representation instead highlights how variables in the system are pushed around by common stochastic trends, which are often interpreted as the main persistent economic factors influencing the long run. Both representations provide insights on the economic system under scrutiny. Examples of both perspectives are given in Juselius (2017a, 2017b).
The common trends and VECM representations are connected through representation results such as the Granger Representation Theorem, in the case of I(1) systems, see Engle and Granger (1987) and Johansen (1991), and the Johansen Representation Theorem, for the case of I(2) systems, see Johansen (1992). In particular, both representation theorems show that the loading matrix of the common stochastic trends of highest order is a basis of the orthogonal complement of the matrix of cointegrating relations. Because of this property, these two matrices are linked, and any one of them can be written as a function of the other one.
This paper focuses on I(2) vector autoregressive (VAR) systems, and it considers the situation where (possibly over-identifying) economic hypotheses are entertained for the factor loading matrix of the I(2) trends. It is shown how they can then be translated into hypotheses on the cointegrating relations, which appear in the VECM representation; the latter forms the basis for maximum likelihood (ML) estimation of I(2) VAR models. In this way, constrained ML estimators are obtained and the associated likelihood ratio (LR) tests of these hypotheses can be defined. These tests are discussed in the present paper; Wald tests on just-identified loading matrices of the I(1) and I(2) common trends have already been proposed by Paruolo (1997, 2002).
The running example of the paper is taken from Juselius and Assenmacher (2015), which is the working paper version of Juselius and Assenmacher (2017). The following notation is used: for a full column-rank matrix $a$, $\operatorname{col} a$ denotes the space spanned by the columns of $a$, and $a_\perp$ indicates a basis of the orthogonal complement of $\operatorname{col} a$. For a matrix $b$ of the same dimensions as $a$, and for which $b'a$ is of full rank, let $b_a := b(a'b)^{-1}$; a special case is $a = b$, for which $\bar a := a_a = a(a'a)^{-1}$. Let also $P_a := a(a'a)^{-1}a'$ indicate the orthogonal projection matrix onto $\operatorname{col} a$, and let the matrix $P_{a_\perp} = I - P_a$ denote the orthogonal projection matrix onto its orthogonal complement. Finally, $e_j$ is used to indicate the $j$-th column of an identity matrix of appropriate dimension.
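Since the algebra of the paper leans heavily on these operators, it may help to see them once in code. The following numpy sketch (dimensions and random values are illustrative, not from the paper) verifies the basic identities behind the notation:

```python
import numpy as np

rng = np.random.default_rng(0)
p, r = 5, 2
a = rng.standard_normal((p, r))              # full column-rank p x r matrix

abar = a @ np.linalg.inv(a.T @ a)            # bar a := a (a'a)^{-1}
P_a = abar @ a.T                             # P_a := a (a'a)^{-1} a'

# a_perp: a basis of the orthogonal complement of col(a), via the full SVD
U, _, _ = np.linalg.svd(a, full_matrices=True)
a_perp = U[:, r:]                            # p x (p - r)

assert np.allclose(a.T @ a_perp, 0)                     # a' a_perp = 0
assert np.allclose(abar.T @ a, np.eye(r))               # bar a' a = I_r
assert np.allclose(P_a + a_perp @ a_perp.T, np.eye(p))  # P_a + P_{a_perp} = I
```

Any other full-rank choice of `a_perp` spanning the same space would serve equally well, since only $\operatorname{col} a_\perp$ is identified.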
The rest of this paper is organized as follows: Section 2 contains the motivation and the definition of the problem considered in the paper. The identification of the I(2) common trends loading matrix under linear restrictions is analysed in Section 3. The relationship between the identified parametrization of I(2) common trends loading matrix and an identified version of the cointegration matrix is also discussed. Section 4 considers a parametrization of the VECM, and discusses its identification. ML estimation of this model is discussed in Section 5; the asymptotic distributions of the resulting ML estimator of the I(2) loading matrix and the LR statistic of the over-identifying restrictions are sketched in Section 6. Section 7 reports an illustration of the techniques developed in the paper on a system of US and Swiss prices, interest rates and exchange rate. Section 8 concludes, while two appendices report additional technical material.

2. Common Trends Representation for I(2) Systems

This section introduces quantities of interest and presents the motivation of the paper. Consider a p-variate VAR(k) process X t :
$$X_t = A_1 X_{t-1} + \dots + A_k X_{t-k} + \mu_0 + \mu_1 t + \varepsilon_t, \qquad (1)$$
where $A_i$, $i = 1, \dots, k$, are $p \times p$ matrices, $\mu_0$ and $\mu_1$ are $p \times 1$ vectors, and $\varepsilon_t$ is a $p \times 1$ i.i.d. $N(0, \Omega)$ vector, with $\Omega$ positive definite. Under the conditions of the Johansen Representation Theorem, see Appendix A, called the I(2) conditions, $X_t$ admits a common trends I(2) representation of the form
$$X_t = C_2 S_{2t} + C_1 S_{1t} + Y_t + v_0 + v_1 t, \qquad (2)$$
where $S_{2t} := \sum_{i=1}^{t}\sum_{s=1}^{i}\varepsilon_s$ are the I(2) stochastic trends (cumulated random walks), $S_{1t} := \Delta S_{2t} = \sum_{i=1}^{t}\varepsilon_i$ is a random walk component, and $Y_t$ is an I(0) linear process.
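The construction of the stochastic trends can be sketched by direct simulation; the dimensions below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
T, p = 200, 3
eps = rng.standard_normal((T, p))    # i.i.d. innovations eps_1, ..., eps_T

S1 = np.cumsum(eps, axis=0)          # S_{1t} = sum_{i<=t} eps_i: random walk, I(1)
S2 = np.cumsum(S1, axis=0)           # S_{2t} = sum_{i<=t} S_{1i}: cumulated random walk, I(2)

# differencing lowers the order of integration by one:
assert np.allclose(np.diff(S2, axis=0), S1[1:])   # Delta S_{2t} = S_{1t}
assert np.allclose(np.diff(S1, axis=0), eps[1:])  # Delta S_{1t} = eps_t
```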
Cointegration occurs when the matrix $C_2$ has reduced rank $r_2 < p$, such that $C_2 = ab'$, where $a$ and $b$ are $p \times r_2$ and of full column rank. This observation lends itself to the following interpretation: $b'S_{2t}$ defines the $r_2$ common I(2) trends, while $a$ acts as the loading matrix of $X_t$ on the I(2) trends. The reduced rank of $C_2$ implies that there exist $m := p - r_2$ linearly independent cointegrating vectors, collected in a $p \times m$ matrix $\tau$, satisfying $\tau'C_2 = 0$; hence $\tau'X_t$ is I(1). Combining this with $C_2 = ab'$, it is clear that $a = \tau_\perp$, i.e., the columns of the loading matrix span the orthogonal complement of the cointegration space $\operatorname{col}\tau$. Interest in this paper is on hypotheses on $a = \tau_\perp$.¹
Observe that $C_2 = ab'$ is invariant to the choice of basis of either $\operatorname{col} a$ or $\operatorname{col} b$. In fact, $(a, b)$ can be replaced by $(aQ, b(Q^{-1})')$ with $Q$ square and nonsingular without affecting $C_2$. One way to resolve this identification problem is to impose restrictions on the entries of $a = \tau_\perp$; enough restrictions of this kind make the choice of $\tau_\perp$ unique. Such an approach to identification is common in confirmatory factor analysis in the statistics literature, see Jöreskog et al. (2016).
If more restrictions are imposed than needed for identification, they are over-identifying. Such over-identifying restrictions on $\tau_\perp$ usually correspond to (similarly over-identifying) restrictions on $\tau$, see Section 3 below. Although economic hypotheses may directly imply restrictions on the cointegrating vectors in $\tau$, in some cases it is more natural to formulate restrictions on the I(2) loading matrix $\tau_\perp$. This is illustrated by the two following examples.

2.1. Example 1

Kongsted (2005) considers a model for $X_t = (m_t : y_t^n : p_t)'$, where $m_t$, $y_t^n$ and $p_t$ denote the nominal money stock, nominal income and the price level, respectively (all variables in logs); here ‘:’ indicates horizontal concatenation. He assumes that the system is I(2), with $r_2 = 1$. Given the definition of the variables, Kongsted (2005) considers the natural question of whether real money $m_t - p_t$ and real income $y_t^n - p_t$ are at most I(1). This corresponds to an (over-identified) cointegrating matrix $\tau$ and loading vector $\tau_\perp$ of the form
$$\tau = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -1 & -1 \end{pmatrix}, \qquad \tau_\perp = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$
The form of $\tau$ corresponds to the fact that the I(1) linear combinations $\tau'X_t$ are (linear combinations of) $(m_t - p_t : y_t^n - p_t)'$, as required. On the other hand, the restriction on $\tau_\perp$ says that each of the three series has exactly the same I(2) trend, with the same scale factor. Both formulations are easily interpretable.
Note that the hypothesis on $\tau_\perp$ involves two over-identifying restrictions (the second and third components are equal to the first component), in addition to a normalization (the first component equals 1). Similarly, the restriction that the matrix consisting of the first two rows of $\tau$ equals $I_2$ is a normalization; the two over-identifying restrictions are that the entries in both columns sum to 0.
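A quick numerical check of this example, using exactly the matrices displayed above:

```python
import numpy as np

tau = np.array([[ 1.,  0.],
                [ 0.,  1.],
                [-1., -1.]])              # cointegrating vectors: m - p and y^n - p
tau_perp = np.array([[1.], [1.], [1.]])   # common nominal I(2) trend, loaded equally

assert np.allclose(tau.T @ tau_perp, 0)   # tau_perp spans the orthogonal complement of col(tau)
assert np.allclose(tau.sum(axis=0), 0)    # the two over-identifying restrictions: columns sum to 0
```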
As this first example shows, knowing $\tau$ is the same as knowing $\tau_\perp$, and vice versa.²

2.2. Example 2

Juselius and Assenmacher (2015) consider a 7-dimensional VAR with $X_t = (p_{1t} : p_{2t} : e_{12t} : b_{1t} : b_{2t} : s_{1t} : s_{2t})'$ and $r_2 = 2$, where $p_{it}$, $b_{it}$, $s_{it}$ are the (log of the) price index, the long interest rate and the short interest rate of country $i$ at time $t$, respectively, and $e_{12t}$ is the log of the exchange rate between country 1 (Switzerland) and country 2 (the US) at time $t$. They expect the common trends representation to have a loading matrix $\tau_\perp$ of the form:
$$\tau_\perp = \begin{pmatrix} \phi_{11} & 0 \\ \phi_{21} & \phi_{22} \\ \phi_{31} & \phi_{32} \\ 0 & \phi_{42} \\ 0 & \phi_{52} \\ 0 & \phi_{62} \\ 0 & \phi_{72} \end{pmatrix}, \qquad (3)$$
where $\phi_{ij}$ indicates an entry not restricted to 0.
The second I(2) trend is loaded on the interest rates b 1 t , b 2 t , s 1 t , s 2 t , as well as on US prices p 2 t and the exchange rate e 12 t ; this can be interpreted as a financial (or ‘speculative’) trend affecting world prices. The first I(2) trend, instead, is only loaded on p 1 t , p 2 t , e 12 t and embodies a ‘relative price’ I(2) trend; it can be interpreted as the Swiss contribution to the trend in prices.
The cointegrating matrix $\tau$ in this example is of dimension $7 \times 5$. It is not obvious what type of restrictions on $\tau$ correspond to the structure in (3). However, it is $\tau$ rather than $\tau_\perp$ that enters the likelihood function (as will be analyzed in Section 4). The rest of the paper shows that the restrictions in (3) are over-identifying, how they can be translated into hypotheses on $\tau$, and how they can be tested via LR tests.

3. Hypotheses on the Common Trends Loadings

This section discusses linear hypotheses on $\tau_\perp$ and their relation to $\tau$. First, attention is focused on the case of linear hypotheses on the normalized version $\tau_{\perp c} := \tau_\perp(c_\perp'\tau_\perp)^{-1}$ of $\tau_\perp$. Here $c_\perp$ is a full-column-rank matrix of the same dimension as $\tau_\perp$ such that $c_\perp'\tau_\perp$ is square and nonsingular.³ This normalization was introduced by Johansen (1991) in the context of the I(1) model in order to isolate the (just-)identified parameters in the cointegration matrix.
Later, linear hypotheses formulated directly on $\tau_\perp$ are discussed. The main result of this section is the fact that the parameters of interest appear linearly both in $\tau_{\perp c}$ and in $\tau_c$ in the first case; this is not necessarily true in the second case.
The central relation employed in this section (for both cases) is the following identity:
$$\tau_{\perp c} := \tau_\perp(c_\perp'\tau_\perp)^{-1} = \left(I - c(\tau'c)^{-1}\tau'\right)\bar c_\perp = \left(I - c\,\tau_c'\right)\bar c_\perp, \qquad (4)$$
where $\bar c_\perp := c_\perp(c_\perp'c_\perp)^{-1}$ and $\tau_c := \tau(c'\tau)^{-1}$. This identity readily follows from the oblique projections identity
$$I = \tau_\perp(c_\perp'\tau_\perp)^{-1}c_\perp' + c(\tau'c)^{-1}\tau',$$
see e.g. Srivastava and Khatri (1979, p. 19), by post-multiplication by $\bar c_\perp$.
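Identity (4) and the oblique projections identity can be verified numerically; the sketch below draws arbitrary full-rank matrices (dimensions are illustrative) and checks both:

```python
import numpy as np

rng = np.random.default_rng(2)
p, r2 = 5, 2
m = p - r2
inv = np.linalg.inv

tau = rng.standard_normal((p, m))              # cointegration matrix (p x m)
c = rng.standard_normal((p, m))                # normalization matrix with c' tau nonsingular
U, _, _ = np.linalg.svd(c, full_matrices=True)
c_perp = U[:, m:]                              # basis of col(c)^perp
U, _, _ = np.linalg.svd(tau, full_matrices=True)
tau_perp = U[:, m:]                            # basis of col(tau)^perp

# oblique projections identity: I = tau_perp (c_perp' tau_perp)^{-1} c_perp' + c (tau' c)^{-1} tau'
I_check = tau_perp @ inv(c_perp.T @ tau_perp) @ c_perp.T + c @ inv(tau.T @ c) @ tau.T
assert np.allclose(I_check, np.eye(p))

# identity (4): tau_perp (c_perp' tau_perp)^{-1} = (I - c (tau' c)^{-1} tau') bar c_perp
lhs = tau_perp @ inv(c_perp.T @ tau_perp)
rhs = (np.eye(p) - c @ inv(tau.T @ c) @ tau.T) @ c_perp @ inv(c_perp.T @ c_perp)
assert np.allclose(lhs, rhs)
```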

3.1. Linear Hypotheses on $\tau_{\perp c}$

Johansen (1991) noted that the function $a_b := a(b'a)^{-1}$ is invariant with respect to the choice of basis of the space spanned by $a$. In fact, consider in the present context any alternative basis $\tau_\perp^*$ of the space spanned by $\tau_\perp$; this has representation $\tau_\perp^* = \tau_\perp Q$ for $Q$ square and of full rank. Inserting $\tau_\perp^*$ in place of $\tau_\perp$ in the definition of $\tau_{\perp c} := \tau_\perp(c_\perp'\tau_\perp)^{-1}$, one finds
$$\tau_{\perp c}^* = \tau_\perp^*(c_\perp'\tau_\perp^*)^{-1} = \tau_\perp Q(c_\perp'\tau_\perp Q)^{-1} = \tau_\perp(c_\perp'\tau_\perp)^{-1} = \tau_{\perp c}.$$
Hence $\tau_{\perp c}$, similarly to the cointegration matrix in the I(1) model in Johansen (1991), is (just-)identified.
To facilitate stating hypotheses on the unconstrained elements of $\tau_{\perp c}$, the following representation of $\tau_{\perp c}$ appears useful:
$$\tau_{\perp c} = \bar c_\perp + c\,\vartheta, \qquad (5)$$
where $\vartheta$ is an $m \times r_2$ matrix of free coefficients in $\tau_{\perp c}$.⁴ For example, one may have
$$c_\perp = \begin{pmatrix} 0_{3\times 2} \\ I_2 \end{pmatrix}, \quad c = \begin{pmatrix} I_3 \\ 0_{2\times 3} \end{pmatrix}, \quad \tau_{\perp c} = \bar c_\perp + c\begin{pmatrix} \vartheta_{11} & \vartheta_{12} \\ \vartheta_{21} & \vartheta_{22} \\ \vartheta_{31} & \vartheta_{32} \end{pmatrix} = \begin{pmatrix} \vartheta_{11} & \vartheta_{12} \\ \vartheta_{21} & \vartheta_{22} \\ \vartheta_{31} & \vartheta_{32} \\ 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad (6)$$
with $p = 5$, $m = 3$, $r_2 = 2$.
Consider over-identifying linear restrictions on the columns of ϑ in (5). Typically, such restrictions will come in the form of zero (exclusion) restrictions or unit restrictions, where the latter would indicate equal loadings of a specific variable and the variable on which the column of τ c has been normalized. The general formulation of such restrictions is
$$\vartheta_i = k_i + K_i\phi_i, \qquad i = 1, \dots, r_2, \qquad (7)$$
where $\vartheta_i$ is the $i$-th column of $\vartheta$, $k_i$ and $K_i$ are conformable known vectors and matrices, and $\phi_i$ contains the remaining unknown parameters in $\vartheta_i$. If only zero restrictions are imposed, then $k_i = 0_m$.
The formulation in (7) includes several notable special cases. For instance, if all $K_i$ are equal to a common matrix $K_1$ and $k_i = 0_m$, one obtains the hypothesis that the columns of $\vartheta$ are contained in a given linear space, $\operatorname{col}\vartheta \subseteq \operatorname{col}K_1$. Another example is given by the case where one column $\vartheta_1$ is known, $\vartheta = (k_1 : \vartheta_2 : \dots : \vartheta_{r_2})$; this corresponds to the choice $\vartheta_1 = k_1$ with $K_1$ and $\phi_1$ void, and $k_2 = \dots = k_{r_2} = 0$, $K_2 = \dots = K_{r_2} = I_m$.
The restrictions in (7) may be summarized as
$$\operatorname{vec}\vartheta = k + K\phi, \qquad (8)$$
where $k = (k_1' : \dots : k_{r_2}')'$, $K = \operatorname{blkdiag}(K_1, \dots, K_{r_2})$ and $\phi = (\phi_1' : \dots : \phi_{r_2}')'$. Here $\operatorname{blkdiag}(B_1, B_2, \dots, B_n)$ indicates a matrix with the (not necessarily square) blocks $B_1, B_2, \dots, B_n$ along the main diagonal. Formulation (8) generalises (7).
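As an illustration of (7)-(8), the following sketch (with hypothetical restriction choices) assembles $k$ and $K$ for an exclusion restriction on the first column of $\vartheta$ and an unrestricted second column:

```python
import numpy as np
from scipy.linalg import block_diag

m, r2 = 3, 2
# column 1: only its first entry is free (zero restrictions on rows 2 and 3)
K1 = np.eye(m)[:, [0]]            # m x 1 selection matrix
k1 = np.zeros(m)                  # pure exclusion restrictions: k_1 = 0_m
# column 2: unrestricted
K2 = np.eye(m)
k2 = np.zeros(m)

k = np.concatenate([k1, k2])      # k = (k_1' : k_2')'
K = block_diag(K1, K2)            # K = blkdiag(K_1, K_2), here 6 x 4

phi = np.array([2.0, 3.0, 4.0, 5.0])             # free parameters (phi_1' : phi_2')'
theta = (k + K @ phi).reshape(m, r2, order="F")  # undo the column-major vec

assert np.allclose(theta[:, 0], [2.0, 0.0, 0.0])
assert np.allclose(theta[:, 1], [3.0, 4.0, 5.0])
```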
The main result of this section is stated in the next theorem.
Theorem 1
(Hypotheses on $\tau_{\perp c}$). Assume that $\vartheta$ satisfies linear restrictions of the type (8); then these restrictions are translated into a linear hypothesis on $\operatorname{vec}\tau_c$ via
$$\operatorname{vec}\tau_c = \left(\operatorname{vec}\bar c - (I_m \otimes c_\perp)K_{m,r_2}k\right) - (I_m \otimes c_\perp)K_{m,r_2}K\phi, \qquad (9)$$
where $K_{m,n}$ is the commutation matrix satisfying $K_{m,n}\operatorname{vec}A = \operatorname{vec}A'$, with $A$ of dimensions $m \times n$, see Magnus and Neudecker (2007).
Proof. 
Substituting (5) into the analogue of (4) with the roles of $(\tau, c)$ and $(\tau_\perp, c_\perp)$ interchanged yields $\tau_c = \bar c - c_\perp\vartheta'$; substituting (8) and vectorizing using standard properties of the vec operator, see Magnus and Neudecker (2007), delivers (9). ☐
The previous theorem shows that, when one can express a linear hypothesis on the coefficients in $\vartheta$ that are unrestricted in $\tau_{\perp c}$, the same linear hypothesis is translated into a linear restriction on $\operatorname{vec}\tau_c$. Note that the proof simply exploits (4).
Identification of the restricted coefficients $\phi$ under these hypotheses can be addressed in a straightforward way. In fact, the parameters in $\vartheta$ are identified; hence $\phi$ is identified provided that the matrix $K$ is of full column rank, which in turn implies that the Jacobian matrix $\partial\operatorname{vec}\tau_c/\partial\phi' = -(I_m \otimes c_\perp)K_{m,r_2}K$ in (9) has full column rank.
Because, in practice, econometricians may explore the form of $\tau_\perp$ via unrestricted estimates of $\tau_{\perp c}$, see Paruolo (2002), before formulating restrictions on $\tau_\perp$, stating hypotheses on the unrestricted coefficients in $\tau_{\perp c}$ appears a natural sequential step.
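The linearity asserted by Theorem 1 can be checked numerically: with $c$ and $c_\perp$ as in (6), a $\tau_{\perp c}$ of the form (5) implies a $c$-normalized cointegration matrix $\tau_c$ that is linear in $\vartheta$. The sketch below uses the relation $\tau_c = \bar c - c_\perp\vartheta'$, which follows from (4); the numerical values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
p, r2 = 5, 2
m = p - r2
inv = np.linalg.inv

c_perp = np.vstack([np.zeros((m, r2)), np.eye(r2)])   # as in (6)
c = np.vstack([np.eye(m), np.zeros((r2, m))])

theta = rng.standard_normal((m, r2))                  # free coefficients
tau_perp = c_perp @ inv(c_perp.T @ c_perp) + c @ theta  # tau_perp_c = bar c_perp + c theta

# tau: any basis of the orthogonal complement of tau_perp, then c-normalized
U, _, _ = np.linalg.svd(tau_perp, full_matrices=True)
tau = U[:, r2:]
tau_c = tau @ inv(c.T @ tau)

assert np.allclose(tau_c, c @ inv(c.T @ c) - c_perp @ theta.T)  # tau_c = bar c - c_perp theta'
assert np.allclose(c.T @ tau_c, np.eye(m))                      # normalization c' tau_c = I_m
assert np.allclose(tau_c.T @ tau_perp, 0)                       # orthogonality
```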
The next subsection discusses the alternative approach of specifying hypotheses directly on $\tau_\perp$.

3.2. Linear Hypotheses on $\tau_\perp$

In case placing restrictions on the unrestricted coefficients in $\tau_{\perp c}$ is not what the econometrician wants, this subsection considers linear hypotheses on $\tau_\perp$ directly. It is shown that it is sometimes possible to translate linear hypotheses on $\tau_\perp$ into linear hypotheses on $\tau_{\perp c}$ for some $c_\perp$. It is also shown that this is always possible for $r_2 = 2$, for which a constructive proof is provided.
Analogously to (7), consider linear hypotheses on the columns of $\tau_\perp$ of the following type:
$$\tau_{\perp, i} = h_i + H_i\phi_i, \qquad i = 1, \dots, r_2, \qquad (10)$$
summarized as
$$\operatorname{vec}\tau_\perp = h + H\phi. \qquad (11)$$
In this case, non-zero vectors $h_i$ represent normalizations of the columns of the loading matrix and, as before, $\phi_i$ collects the unknown parameters in $\tau_{\perp, i}$.
Theorem 2 (Hypotheses on $\tau_\perp$).
Assume that $\tau_\perp = \tau_\perp(\phi)$ satisfies linear restrictions of the type (11); then these restrictions are translated in general into a non-linear hypothesis on $\operatorname{vec}\tau_c$ via
$$\tau_c = \left(I - c_\perp\left(\tau_\perp(\phi)'c_\perp\right)^{-1}\tau_\perp(\phi)'\right)\bar c, \qquad (12)$$
and the Jacobian of the transformation from $\phi$ to $\operatorname{vec}\tau_c$ is
$$J(\phi) := \frac{\partial\operatorname{vec}\tau_c(\phi)}{\partial\phi'} = -\left(\tau_c(\phi)' \otimes c_\perp\left(\tau_\perp(\phi)'c_\perp\right)^{-1}\right)K_{p,r_2}H. \qquad (13)$$
This parametrization is smooth on the open set in the parameter space $\Phi$ of $\phi$ where $c_\perp'\tau_\perp(\phi)$ is of full rank.
Proof. 
Equation (12) is a restatement of (4), with the roles of $(\tau, c)$ and $(\tau_\perp, c_\perp)$ interchanged. Differentiation of (12) delivers (13). ☐
One can note that the Jacobian matrix in (13) can be used to check local identification using the results in Rothenberg (1971).
The result of Theorem 2 is in contrast with the result of Theorem 1, because the latter delivers a linear hypothesis on $\operatorname{vec}\tau_c$ while Theorem 2 gives in general non-linear restrictions on $\tau_c$. One may hence ask the following question: when is it possible to reduce the more general linear hypothesis on $\tau_\perp$ given in (11) to the simpler linear hypothesis on $\vartheta$ given in (8)?
In the special case of $r_2 = 2$, the following theorem states that this can always be obtained. This applies for instance to the motivating example (3), where one can choose some $c_\perp$ so that $c_\perp'\tau_\perp$ is equal to the identity, as shown below. Consider the formulation (10) with $r_2 = 2$, and assume that no normalizations have been imposed yet, so that $h_1 = h_2 = 0$. It is assumed that $\tau_\perp$, under the equation-by-equation restrictions, satisfies the usual rank conditions for identification, see Johansen (1995, Theorem 1):
$$\operatorname{rank}\left(R_i'\tau_\perp\right) = 1 \quad\text{for } i = 1, 2, \qquad (14)$$
where $R_i := H_{i\perp}$.
Theorem 3 (Case $r_2 = 2$).
Let $\tau_\perp$ obey the restrictions $\tau_\perp = (H_1\phi_1 : H_2\phi_2)$ satisfying the rank conditions (14); then one can choose normalization conditions on $\phi_1$ and $\phi_2$ so that there exists a matrix $c_\perp$ such that $c_\perp'\tau_\perp = I_2$. This implies that hypotheses on $\tau_\perp$ can be stated in terms of $\vartheta$ in (5), and, by Theorem 1, a linear hypothesis on $\operatorname{vec}\vartheta$ corresponds to a linear hypothesis on $\operatorname{vec}\tau_c$.
Proof. 
Because $R_1'\tau_\perp = (0 : R_1'H_2\phi_2)$ has rank 1, one can select (at least) one linear combination of the columns of $R_1$, $R_1 a_1$ say, so that $\phi_2$ is normalized to one in the direction $b_2 := H_2'R_1 a_1$, i.e., $b_2'\phi_2 = 1$. Similarly, $R_2'\tau_\perp = (R_2'H_1\phi_1 : 0)$ has rank 1, and one can select (at least) one linear combination of the columns of $R_2$, $R_2 a_2$ say, so that $\phi_1$ is normalized to one in the direction $b_1 := H_1'R_2 a_2$, i.e., $b_1'\phi_1 = 1$. Next define $c_\perp = (R_2 a_2 : R_1 a_1)$, which by construction satisfies $c_\perp'\tau_\perp = I_2$. ☐
The proof of the previous theorem provides a way to construct $c_\perp$ when $r_2 = 2$ and the usual rank condition for identification (14) holds. The rest of the paper focuses attention on the case of linear restrictions on $\vartheta$ in (8), which can be translated linearly into restrictions on $\tau_c$ as shown in Theorem 1.

3.3. Example 2 Continued

Consider (3); this hypothesis is of the type $\tau_\perp = (H_1\phi_1 : H_2\phi_2)$ with
$$H_1 = \begin{pmatrix} I_3 \\ 0_{4\times 3} \end{pmatrix}, \qquad H_2 = \begin{pmatrix} 0_{1\times 6} \\ I_6 \end{pmatrix},$$
and hence $R_1 = H_{1\perp} = (0_{4\times 3} : I_4)'$ and $R_2 = H_{2\perp} = (1 : 0_{1\times 6})' = e_1$. In this case one can define $c_\perp = (e_1 : e_4)$ and $c = (e_2 : e_3 : e_5 : e_6 : e_7)$, where $e_j$ is the $j$-th column of $I_7$.
It is simple to verify that, under the additional normalization restrictions $\phi_{11} = 1$ and $\phi_{42} = 1$, $\tau_\perp$ in (3) satisfies $c_\perp'\tau_\perp = I_2$. Therefore, define $\tau_{\perp c}$ as (3) under these normalization restrictions. Using formula (4) one can see that
$$\tau_c = \left(I - c_\perp\tau_{\perp c}'\right)\bar c = \begin{pmatrix} -\phi_{21} & -\phi_{31} & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ -\phi_{22} & -\phi_{32} & -\phi_{52} & -\phi_{62} & -\phi_{72} \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad (15)$$
so that $\operatorname{vec}\tau_c$ is linear in $\phi$, as predicted by Theorem 3.
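The computation in (15) can be replicated numerically; the $\phi$ values below are arbitrary placeholders:

```python
import numpy as np

inv = np.linalg.inv
I7 = np.eye(7)
e = lambda j: I7[:, [j - 1]]                      # e_j: j-th column of I_7

# tau_perp as in (3), with normalizations phi_11 = phi_42 = 1; other values arbitrary
phi = dict(p21=0.3, p31=-0.7, p22=0.5, p32=0.2, p52=1.1, p62=-0.4, p72=0.8)
tau_perp = np.zeros((7, 2))
tau_perp[0, 0] = 1.0                              # phi_11 = 1
tau_perp[1] = [phi["p21"], phi["p22"]]
tau_perp[2] = [phi["p31"], phi["p32"]]
tau_perp[3, 1] = 1.0                              # phi_42 = 1
tau_perp[4, 1] = phi["p52"]
tau_perp[5, 1] = phi["p62"]
tau_perp[6, 1] = phi["p72"]

c_perp = np.hstack([e(1), e(4)])
c = np.hstack([e(2), e(3), e(5), e(6), e(7)])

assert np.allclose(c_perp.T @ tau_perp, np.eye(2))     # c_perp' tau_perp = I_2

# (15): tau_c = (I - c_perp tau_perp_c') bar c, linear in the phi's
tau_c = (I7 - c_perp @ tau_perp.T) @ c @ inv(c.T @ c)
assert np.allclose(c.T @ tau_c, np.eye(5))             # c' tau_c = I_5
assert np.allclose(tau_c.T @ tau_perp, 0)              # cointegration relations kill the I(2) trends
assert np.allclose(tau_c[0], [-phi["p21"], -phi["p31"], 0, 0, 0])  # first row as in (15)
```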

4. The VECM Parametrization

This section describes the I(2) parametrization employed in the statistical analysis of the paper. Consider the following $\tau$-parametrization ($\tau$-par) of the VECM for I(2) VAR systems⁵, see Mosconi and Paruolo (2017):
$$\Delta^2 X_t = \alpha\left(\rho'\tau'X_{t-1} + \psi'\Delta X_{t-1}\right) + \lambda\tau'\Delta X_{t-1} + \Upsilon\,\mathbf{\Delta^2 X}_{t-1} + \varepsilon_t, \qquad (16)$$
with $\Upsilon\,\mathbf{\Delta^2 X}_{t-1} := \sum_{j=1}^{k-2}\Upsilon_j\Delta^2 X_{t-j}$. Recall that $m = p - r_2$ is the total number of cointegrating relations, i.e., the number of I(1) linear combinations $\tau'X_t$. The number of linear combinations of $\tau'X_t$ that cointegrate with $\Delta X_t$ to I(0), i.e., the number of I(0) linear combinations $\rho'\tau'X_t + \psi'\Delta X_t$, is indicated⁶ by $r \le m$. Here $\alpha$ is $p \times r$, $\tau$ is $p \times m$ and the other parameter matrices are conformable; the parameters are $(\alpha, \rho, \tau, \psi, \lambda, \Upsilon, \Omega)$, all freely varying, and $\Omega$ is assumed to be positive definite. When $\lambda$ is restricted as $\lambda = \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\kappa$ with $\kappa$ a $(p-r) \times m$ matrix of freely varying parameters, the $\tau$-par reduces to the parametrization of Johansen (1997); this restriction on $\lambda$ is not imposed here.

4.1. Identification of τ

The parameters in the $\tau$-par (16) are not identified; in particular $\tau'$ can be replaced by $B\tau'$ with $B$ square and nonsingular, provided $\rho'$ and $\lambda$ are simultaneously replaced by $\rho'B^{-1}$ and $\lambda B^{-1}$. This is because $\tau$ enters the likelihood in (16) only via the products $\rho'\tau' = (\rho'B^{-1})(B\tau')$ and $\lambda\tau' = (\lambda B^{-1})(B\tau')$. The transformation that generates observationally equivalent parameters, i.e., the post-multiplication of $\tau$ by a square and invertible matrix $B'$, is the same type of transformation that induces observational equivalence in the classical system of simultaneous equations, see Sargan (1988), or in the set of cointegrating equations in I(1) systems, see Johansen (1995). This leads to the following result.
Theorem 4 (Identification of $\tau$ in the $\tau$-par).
Assume that $\tau_c$ is specified as the restricted $\tau_c$ in (9), which is implied by the general linear hypothesis (8) on $\vartheta$; then the restricted $\tau_c$ is identified within the $\tau$-par if and only if (rank condition)
$$\operatorname{rank}\left(R_\tau'(I_m \otimes \tau)\right) = m^2, \qquad R_\tau\ (mp \times m_\tau) := G_\perp, \qquad G := -(I_m \otimes c_\perp)K_{m,r_2}K,$$
where $m_\tau := mp - \dim\phi$. The corresponding order condition is $m_\tau \ge m^2$, or equivalently $m r_2 \ge \dim\phi$.
Alternatively, consider the general linear hypothesis (11) on $\tau_\perp$; then the constrained $\tau_c$ in (12) is identified in a neighborhood of the point $\phi = \phi^0$ provided the Jacobian $J(\phi^0) := \partial\operatorname{vec}\tau_c(\phi^0)/\partial\phi'$ in (13) is of full column rank.
Proof. 
The rank condition follows from Sargan (1988), given that the class of transformations that induce observational equivalence is the same as the classical one for systems of simultaneous equations. The local identification condition follows from Rothenberg (1971). ☐

4.2. The Identification of Remaining Parameters

This subsection discusses conditions for remaining parameters of the τ -par to be identified, when τ is identified as in Theorem 4. These additional conditions are used in the discussion of the ML algorithms of the next section.
The VECM can be rewritten as
$$\Delta^2 X_t = \nu\varsigma'\begin{pmatrix} \tau'X_{t-1} \\ \Delta X_{t-1} \end{pmatrix} + \Upsilon\,\mathbf{\Delta^2 X}_{t-1} + \varepsilon_t, \qquad\text{with}\quad \varsigma := \begin{pmatrix} \rho & 0 \\ \psi & \tau \end{pmatrix}, \quad \nu := (\alpha : \lambda). \qquad (17)$$
One can see that the equilibrium correction term $\nu\varsigma'(X_{t-1}'\tau : \Delta X_{t-1}')'$ may be replaced by $\nu^*\varsigma^{*\prime}(X_{t-1}'\tau^* : \Delta X_{t-1}')'$ without changing the likelihood, where $\tau^{*\prime} := B\tau'$, $\nu^* := \nu Q^{-1} = (\alpha A^{-1} : \lambda B^{-1} - \alpha A^{-1}C)$, and
$$Q := \begin{pmatrix} A & CB \\ 0 & B \end{pmatrix}, \qquad W := \begin{pmatrix} B & 0 \\ 0 & I_p \end{pmatrix}, \qquad \varsigma^{*\prime} := Q\varsigma'W^{-1} = \begin{pmatrix} A\rho'B^{-1} & A\psi' + CB\tau' \\ 0 & B\tau' \end{pmatrix};$$
here $A$ ($r \times r$) and $B$ ($m \times m$) are square nonsingular matrices, and $C$ is a generic $r \times m$ matrix. Hence one observes that $(\alpha, \rho, \tau, \psi, \lambda, \Upsilon, \Omega)$ is observationally equivalent to $(\alpha^*, \rho^*, \tau^*, \psi^*, \lambda^*, \Upsilon, \Omega)$. $A$, $B$ and $C$ define the class of observationally equivalent transformations in the $\tau$-par for all parameters, including $\tau$. When $\tau$ is identified, one has $B = I_m$ in the above formulae.
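This observational equivalence can be verified by checking that the implied reduced-form coefficients $\Pi = \alpha\rho'\tau'$ and $\Gamma = \alpha\psi' + \lambda\tau'$ are unchanged under the transformation; all parameter values in this sketch are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
p, r, m = 5, 1, 3
inv = np.linalg.inv

alpha, rho = rng.standard_normal((p, r)), rng.standard_normal((m, r))
psi, lam = rng.standard_normal((p, r)), rng.standard_normal((p, m))
tau = rng.standard_normal((p, m))

A = rng.standard_normal((r, r)) + np.eye(r)   # nonsingular r x r
B = rng.standard_normal((m, m)) + np.eye(m)   # nonsingular m x m
C = rng.standard_normal((r, m))               # generic r x m

# transformed parameters, as in the text
alpha_s = alpha @ inv(A)
lam_s = lam @ inv(B) - alpha @ inv(A) @ C
rho_s = (A @ rho.T @ inv(B)).T                # rho*' = A rho' B^{-1}
psi_s = (A @ psi.T + C @ B @ tau.T).T         # psi*' = A psi' + C B tau'
tau_s = (B @ tau.T).T                         # tau*' = B tau'

# the reduced-form coefficients of (16) are unchanged
Pi = alpha @ rho.T @ tau.T
Gamma = alpha @ psi.T + lam @ tau.T
assert np.allclose(alpha_s @ rho_s.T @ tau_s.T, Pi)
assert np.allclose(alpha_s @ psi_s.T + lam_s @ tau_s.T, Gamma)
```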
Consider additional restrictions on $\varphi$ of the type:
$$R_\varphi\operatorname{vec}\varphi = q_\varphi, \qquad \varphi := (\rho' : \psi'), \qquad (18)$$
where $R_\varphi$ is of dimension $m_\varphi \times f_\varphi$ with $f_\varphi = r(p + m)$. The next theorem states rank conditions for these restrictions to identify the remaining parameters.
Theorem 5 (Identification of other parameters in the $\tau$-par).
Assume that $\tau$ is identified as in Theorem 4; the restrictions (18) identify $\varphi$ and all other parameters in the $\tau$-par if and only if (rank condition)
$$\operatorname{rank}\left(R_\varphi(\varsigma \otimes I_r)\right) = r(r + m). \qquad (19)$$
A necessary but not sufficient condition (order condition) for this is that
$$m_\varphi \ge r(r + m). \qquad (20)$$
Proof. 
Because $\tau$ is identified, one has $B = I_m$ in $Q$, so that $W = I$ and $\varsigma^{*\prime} = Q\varsigma'$. For the identification of $\varphi$, observe that $\varphi = (I_r : 0)\varsigma'$, so that $\varphi^* - \varphi = (I_r : 0)(\varsigma^{*\prime} - \varsigma') = (I_r : 0)(Q - I_{r+m})\varsigma'$. Because both $\varphi$ and $\varphi^*$ satisfy (18), one has $0 = R_\varphi\operatorname{vec}(\varphi^* - \varphi) = R_\varphi(\varsigma \otimes I_r)\operatorname{vec}\left((I_r : 0)(Q - I_{r+m})\right)$. This implies that $(I_r : 0)(Q - I_{r+m}) = 0$, i.e., that both $A = I_r$ and $C = 0_{r\times m}$, and hence that $\varphi$ is identified, if and only if $\operatorname{rank}(R_\varphi(\varsigma \otimes I_r)) = r(r + m)$. This completes the proof. ☐
Observe that the identification properties of the $\tau$-par differ from those of the parametrization of Johansen (1997), where $\lambda = \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\kappa$ is restricted, and hence the adding-and-subtracting associated with $C$ above is not permitted.

4.3. Deterministic Terms

The $\tau$-par in (16) does not involve deterministic terms. Allowing a constant and a trend to enter the VAR Equation (1) in a way that rules out quadratic trends, one obtains the following equilibrium correction I(2) model, for simplicity still called the $\tau$-par below:
$$\Delta^2 X_t = \alpha\left(\rho'\tilde\tau'\tilde X_{t-1} + \tilde\psi'\Delta\tilde X_{t-1}\right) + \lambda\tau'\Delta X_{t-1} + \Upsilon\,\mathbf{\Delta^2 X}_{t-1} + \varepsilon_t. \qquad (21)$$
Here $\tilde X_{t-1} = (X_{t-1}' : t)'$ so that $\Delta\tilde X_{t-1} = (\Delta X_{t-1}' : 1)'$; and $\tilde\tau = (\tau' : \tau_1')'$ and $\tilde\psi = (\psi' : \psi_0')'$, where $\tau_1$ ($1 \times m$) and $\psi_0$ ($1 \times r$) collect the coefficients of the deterministic terms.
This parametrization satisfies the conditions of the Johansen Representation Theorem and it generates deterministic trends up to first order, as shown in Appendix A. This is the I(2) model used in the application, with the addition of unrestricted dummy variables.

5. Likelihood Maximization

This section discusses likelihood maximization of the $\tau$-par of the I(2) model (16) under linear, possibly over-identifying, restrictions on $\tau_c$, i.e., on $\vartheta$ in (5). The same treatment applies to (21), replacing $(X_{t-1}, \Delta X_{t-1})$ and $(\tau, \psi)$ with the extended counterparts defined below (21). The formulation (16) is preferred here for simplicity of exposition.
The alternating maximization procedure proposed here is closely related, but not identical, to the algorithms proposed by Doornik (2017b); related algorithms were discussed in Paruolo (2000b). Restricted ML estimation in the I(1) model was discussed in Boswijk and Doornik (2004).

5.1. Normalizations

Consider restrictions (8), which are translated into linear hypotheses on $\tau_c$ in (9), written as
$$\operatorname{vec}\tau_c = \left(\operatorname{vec}\bar c - (I_m \otimes c_\perp)K_{m,r_2}k\right) - (I_m \otimes c_\perp)K_{m,r_2}K\phi =: g + G\phi,$$
where by construction $g$ and $G$ satisfy $(I_m \otimes c')g = \operatorname{vec}I_m$ and $(I_m \otimes c')G = 0$, such that $c'\tau_c = I_m$.
Next, consider just-identifying restrictions on the remaining parameters. For $\psi$, the linear combinations of first differences entering the multicointegration relations, one can consider
$$c'\psi = 0 \iff \psi = c_\perp\delta, \qquad (22)$$
where $\delta$ is the $r_2 \times r$ matrix of multicointegration parameters. This restriction differs from the restriction $\psi = \tau\delta$ which is considered e.g. in Juselius (2017a, 2017b), and it was proposed and analysed by Boswijk (2000).
Furthermore, the $m \times r$ matrix $\rho$ can be normalized as follows:
$$d'\rho = I_r \iff \rho = \bar d + d_\perp\varrho, \qquad (23)$$
where $d$ is some known $m \times r$ matrix, and where $\varrho$, of dimension $(m - r) \times r$, contains freely varying parameters.
It can be shown, using Theorem 5, that restrictions (22) and (23) identify the remaining parameters. In fact, (22) and (23) can be written as $\varphi V = v$, where $V := \operatorname{blkdiag}(d, c)$ and $v := (I_r : 0_{r\times m})$. Vectorizing, one obtains an equation $R_\varphi\operatorname{vec}\varphi = q_\varphi$ of the form (18) with $R_\varphi = (V' \otimes I_r)$ and $q_\varphi = \operatorname{vec}v$. The rank condition (19) is satisfied, since $R_\varphi(\varsigma \otimes I_r) = (V'\varsigma) \otimes I_r = I_{r(m+r)}$ because
$$V'\varsigma = \begin{pmatrix} d'\rho & 0 \\ c'\psi & c'\tau \end{pmatrix} = \begin{pmatrix} I_r & 0 \\ 0 & I_m \end{pmatrix},$$
where the last equality follows from (22) and (23) and $\tau = \tau_c$, so that $c'\tau_c = I_m$.
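A small numerical check that the normalizations (22) and (23) indeed yield $V'\varsigma = I$; dimensions and values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)
p, m, r = 5, 3, 2
r2 = p - m
inv = np.linalg.inv

c = np.vstack([np.eye(m), np.zeros((r2, m))])               # normalization matrix
c_perp = np.vstack([np.zeros((m, r2)), np.eye(r2)])
tau = np.vstack([np.eye(m), rng.standard_normal((r2, m))])  # c-normalized: c' tau = I_m

d = rng.standard_normal((m, r))                             # known m x r matrix
U, _, _ = np.linalg.svd(d, full_matrices=True)
d_perp = U[:, r:]
dbar = d @ inv(d.T @ d)

rho = dbar + d_perp @ rng.standard_normal((m - r, r))       # (23): d' rho = I_r
delta = rng.standard_normal((r2, r))
psi = c_perp @ delta                                        # (22): c' psi = 0

varsigma = np.block([[rho, np.zeros((m, m))], [psi, tau]])
V = np.block([[d, np.zeros((m, m))], [np.zeros((p, r)), c]])
assert np.allclose(V.T @ varsigma, np.eye(r + m))           # V' varsigma = I, so (19) holds
```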

5.2. The Concentrated Likelihood Function

The model (16), after concentrating out the unrestricted parameter matrix Υ , can be represented by the equations
$$Z_{0t} = \alpha\left(\rho'\tau'Z_{2t} + \psi'Z_{1t}\right) + \lambda\tau'Z_{1t} + \varepsilon_t(\xi), \qquad (24)$$
where $\xi$ indicates the vector of free parameters in $(\alpha, \varrho, \phi, \delta, \lambda)$, and $Z_{0t}$, $Z_{1t}$ and $Z_{2t}$ are residual vectors of the regressions of $\Delta^2 X_t$, $\Delta X_{t-1}$ and $X_{t-1}$, respectively, on $\mathbf{\Delta^2 X}_{t-1}$;⁷ this derivation follows similarly to Chapter 6.1 in Johansen (1996). The associated log-likelihood function, concentrated with respect to $\Upsilon$, is given by
$$\ell(\xi, \Omega) = -\frac{T}{2}\log|\Omega| - \frac{1}{2}\sum_{t=1}^{T}\varepsilon_t(\xi)'\Omega^{-1}\varepsilon_t(\xi).$$
In the rest of this section, $\varepsilon_t$ is used as shorthand for $\varepsilon_t(\xi)$.
Algorithms for the maximization of the concentrated log-likelihood function $\ell(\xi, \Omega)$ are proposed below. The first one, called al1, considers the alternating maximization of $\ell(\xi, \Omega)$ over $(\alpha, \varrho, \delta, \lambda, \Omega)$ for a fixed value of $\phi$ (called the $\alpha$-step), and over $(\phi, \delta)$ for a given value of $(\alpha, \varrho, \lambda, \Omega)$ (called the $\tau$-step).
A variant of this algorithm, called al2, can be defined by fixing $\delta$ in the $\tau$-step to the value of $\delta$ obtained in the $\alpha$-step. It can be shown that the increase in $\ell(\xi, \Omega)$ obtained in one combination of $\alpha$-step and $\tau$-step of al1 is greater than or equal to the one obtained by al2. The proof of this result is reported in Proposition A1 in Appendix B. Because of this property, and because al2 may display very slow convergence in practice, al1 is implemented in the illustration below.
The rest of this section presents algorithms al1 and al2, defining first the τ -step, then the α -step and finally discussing the starting values, a line search and normalizations.

5.2.1. τ Step

Taking differentials, one has $d\ell = -\sum_{t=1}^{T}\varepsilon_t'\Omega^{-1}\,d\varepsilon_t$. Keeping $(\alpha, \varrho, \lambda)$ fixed, and using $\psi = c_\perp\delta$, one finds
$$d\varepsilon_t = -\alpha\left(\rho'(d\tau)'Z_{2t} + (d\delta)'c_\perp'Z_{1t}\right) - \lambda(d\tau)'Z_{1t}, \qquad\text{with}\quad \operatorname{vec}d\tau = G\,d\phi.$$
Stacking observations, the residuals can be written in terms of $\phi$ and $\operatorname{vec}\delta$ as $\operatorname{vec}(\varepsilon') = \operatorname{vec}(Z_0') - U_1(g + G\phi) - U_2\operatorname{vec}\delta$, where $Z_j := (Z_{j1} : \dots : Z_{jT})$, $j = 0, 1, 2$, $U_1 := (\alpha\rho' \otimes Z_2') + (\lambda \otimes Z_1')$ and $U_2 := (\alpha \otimes Z_1'c_\perp)$. The first-order conditions $\partial\ell/\partial\phi = 0$ and $\partial\ell/\partial\operatorname{vec}\delta = 0$ are then solved by
$$\begin{pmatrix} \hat\phi \\ \operatorname{vec}\hat\delta \end{pmatrix} = \begin{pmatrix} G'U_1'(\Omega^{-1}\otimes I_T)U_1G & G'U_1'(\Omega^{-1}\otimes I_T)U_2 \\ U_2'(\Omega^{-1}\otimes I_T)U_1G & U_2'(\Omega^{-1}\otimes I_T)U_2 \end{pmatrix}^{-1}\begin{pmatrix} G'U_1' \\ U_2' \end{pmatrix}(\Omega^{-1}\otimes I_T)\left(\operatorname{vec}(Z_0') - U_1 g\right). \qquad (25)$$
Note that (25) is the GLS estimator in a regression of $\operatorname{vec}(Z_0') - U_1 g$ on $(U_1 G : U_2)$. This defines the $\tau$-step for al1.
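The design matrices $U_1$ and $U_2$ can be checked against the regression terms of (24) directly; the following sketch verifies the two Kronecker-product identities on which (25) rests (all inputs are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(6)
T, p, r2, r = 40, 4, 2, 1
m = p - r2

alpha = rng.standard_normal((p, r))
rho = rng.standard_normal((m, r))
lam = rng.standard_normal((p, m))
tau = rng.standard_normal((p, m))
c_perp = rng.standard_normal((p, r2))
delta = rng.standard_normal((r2, r))
Z1, Z2 = rng.standard_normal((p, T)), rng.standard_normal((p, T))

vec = lambda A: A.reshape(-1, order="F")      # column-major vec

U1 = np.kron(alpha @ rho.T, Z2.T) + np.kron(lam, Z1.T)
U2 = np.kron(alpha, Z1.T @ c_perp)

# U1 and U2 reproduce the (transposed, stacked) regression terms of (24):
lhs1 = vec((alpha @ rho.T @ tau.T @ Z2 + lam @ tau.T @ Z1).T)
assert np.allclose(U1 @ vec(tau), lhs1)
lhs2 = vec((alpha @ delta.T @ c_perp.T @ Z1).T)   # alpha psi' Z1 with psi = c_perp delta
assert np.allclose(U2 @ vec(delta), lhs2)
```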
The $\tau$-step for al2 is defined similarly, but keeping $\delta$ fixed. In this case it is simple to see that
$$\hat\phi = \left(G'U_1'(\Omega^{-1}\otimes I_T)U_1G\right)^{-1}G'U_1'(\Omega^{-1}\otimes I_T)\left(\operatorname{vec}(Z_0') - U_1 g - \operatorname{vec}(Z_1'\psi\alpha')\right). \qquad (26)$$

5.2.2. α Step

When $\phi$ is fixed (and hence $\tau$ is fixed), one can construct $Z_{3t} := \tau'Z_{1t}$ and
$$Z_{4t} := \begin{pmatrix} \bar d'\tau'Z_{2t} \\ d_\perp'\tau'Z_{2t} \\ c_\perp'Z_{1t} \end{pmatrix}, \qquad \gamma := \begin{pmatrix} I_r \\ \varrho \\ \delta \end{pmatrix}.$$
The concentrated model (24) can then be written as a reduced rank regression:
$$Z_{0t} = \alpha\gamma'Z_{4t} + \lambda Z_{3t} + \varepsilon_t, \qquad (27)$$
for which the Gaussian ML estimator of $(\alpha, \gamma, \lambda)$ has a closed-form solution, see Johansen (1996). Specifically, let $M_{ij} := T^{-1}\sum_{t=1}^{T}Z_{it}Z_{jt}'$, $i, j = 0, 3, 4$, and $S_{ij} := M_{ij} - M_{i3}M_{33}^{-1}M_{3j}$, $i, j = 0, 4$. If $v_i$, $i = 1, \dots, r$, are the eigenvectors corresponding to the largest $r$ eigenvalues of the problem
$$\left(\mu S_{44} - S_{40}S_{00}^{-1}S_{04}\right)v = 0, \qquad (28)$$
and $v = (v_1 : \dots : v_r)$ is the matrix of the corresponding eigenvectors, then the optimal solutions for $(\varrho, \delta, \alpha, \lambda)$ are given by
$$\hat\gamma = \begin{pmatrix} I_r \\ \hat\varrho \\ \hat\delta \end{pmatrix} = v(e'v)^{-1}, \qquad \hat\alpha = S_{04}\hat\gamma(\hat\gamma'S_{44}\hat\gamma)^{-1}, \qquad \hat\lambda = (M_{03} - \hat\alpha\hat\gamma'M_{43})M_{33}^{-1}, \qquad (29)$$
where $e = (I_r : 0)'$. Maximization with respect to $\Omega$ is performed using $\Omega(\xi) = T^{-1}\sum_{t=1}^{T}\varepsilon_t(\xi)\varepsilon_t(\xi)'$, replacing $\xi$ with the $\hat\xi$ formed from the previous expressions, namely taking $(\alpha, \varrho, \delta, \lambda)$ equal to $(\hat\alpha, \hat\varrho, \hat\delta, \hat\lambda)$ in the above display and $\phi = \hat\phi$ from the $\tau$-step. Using the $S_{ij}$ matrices, one can also compute $\hat\Omega$ directly as $\hat\Omega = S_{00} - S_{04}\hat\gamma(\hat\gamma'S_{44}\hat\gamma)^{-1}\hat\gamma'S_{40}$. This completes the definition of the $\alpha$-step.
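The $\alpha$-step is thus a standard reduced rank regression. A schematic numpy/scipy rendering of (28)-(29) on synthetic residual series (dimensions illustrative, and `scipy.linalg.eigh` is used for the generalized symmetric eigenproblem) is:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(7)
T, p, r = 500, 4, 2
q3, q4 = 3, 4                                   # dims of Z3t and Z4t (illustrative)
Z0 = rng.standard_normal((p, T))
Z3 = rng.standard_normal((q3, T))
Z4 = rng.standard_normal((q4, T))

M = lambda Zi, Zj: Zi @ Zj.T / T
M33i = np.linalg.inv(M(Z3, Z3))
S = lambda Zi, Zj: M(Zi, Zj) - M(Zi, Z3) @ M33i @ M(Z3, Zj)
S00, S04, S40, S44 = S(Z0, Z0), S(Z0, Z4), S(Z4, Z0), S(Z4, Z4)

# solve |mu S44 - S40 S00^{-1} S04| = 0 as a generalized symmetric eigenproblem
mu, v = eigh(S40 @ np.linalg.inv(S00) @ S04, S44)
v = v[:, ::-1][:, :r]                           # eigenvectors of the r largest roots

e = np.vstack([np.eye(r), np.zeros((q4 - r, r))])
gamma = v @ np.linalg.inv(e.T @ v)              # normalized to (I_r : ...)' as in (29)
alpha = S04 @ gamma @ np.linalg.inv(gamma.T @ S44 @ gamma)
lam = (M(Z0, Z3) - alpha @ gamma.T @ M(Z4, Z3)) @ M33i
Omega = S00 - S04 @ gamma @ np.linalg.inv(gamma.T @ S44 @ gamma) @ gamma.T @ S40

assert np.allclose(e.T @ gamma, np.eye(r))      # top block of gamma is I_r
assert np.allclose(Omega, Omega.T)              # residual covariance symmetric
assert np.all(np.linalg.eigvalsh(Omega) > 0)    # and positive definite
```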

5.2.3. Starting Values and Line Search

If the system is just-identified, consistent starting values for all parameters can be obtained by imposing the identifying restrictions on the two-stage estimator for the I(2) model (2SI2), see Johansen (1995) and Paruolo (2000a). In case of over-identification, this method may be used to produce starting values for ( α , ϱ , λ ) , which may then be used as input for the first τ -step to obtain starting values for ϕ and δ .
Let η be the vector containing all free parameters in ( α , ϱ , δ , λ ) , and let ξ := ( ϕ′ : η′ )′. Denote by ξ_{j−1} = ( ϕ_{j−1}′ : η_{j−1}′ )′ the value of ξ at iteration j − 1 of the algorithms, and by ξ̂_j = ( ϕ̂_j′ : η̂_j′ )′ the value of ξ obtained by applying a τ-step and an α-step of algorithms al1 and al2 at iteration j, starting from ξ_{j−1}. In an I(1) context, Doornik (2017a) found that better convergence properties can be obtained if a line search is added. For this purpose, define the final value of the j-th iteration as
ξ_j(ω) = ξ_{j−1} + ω ( ξ̂_j − ξ_{j−1} ) ,
where ω is chosen in R_+ = ( 0 , ∞ ) using a line search; note that values of ω greater than 1 are admissible. A simple (albeit admittedly sub-optimal) implementation of the line search is employed in Doornik (2017a); it consists of evaluating the log-likelihood function ℓ( ξ , Ω(ξ) ), with Ω(ξ) = T⁻¹ Σ_{t=1}^T ε_t(ξ) ε_t(ξ)′, setting ξ equal to ξ_j(ω) for ω ∈ { 1.2 , 2 , 4 , 8 }, and choosing the value of ω with the highest log-likelihood. This simple choice of line search is used in the empirical illustration.
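A minimal sketch of such a line search is given below; the candidate grid follows the set {1.2, 2, 4, 8} used above, while the fallback to ω = 1 (the plain step) is an added safeguard, not part of the quoted procedure:

```python
import numpy as np

def line_search_step(loglik, xi_prev, xi_new, omegas=(1.2, 2.0, 4.0, 8.0)):
    """Pick omega maximizing loglik(xi_prev + omega * (xi_new - xi_prev)).

    Falls back to the plain step (omega = 1) if no candidate improves on it.
    """
    delta = xi_new - xi_prev
    best_omega, best_val = 1.0, loglik(xi_prev + delta)
    for om in omegas:
        val = loglik(xi_prev + om * delta)
        if val > best_val:
            best_omega, best_val = om, val
    return xi_prev + best_omega * delta, best_omega

# Toy concave 'log-likelihood' with maximizer at xi = 4:
loglik = lambda xi: -float((xi - 4.0) ** 2)
xi_j, omega = line_search_step(loglik, np.array(0.0), np.array(1.0))
```

In the toy example the step direction has length 1 and the maximizer sits at 4, so the search selects ω = 4 and lands exactly on the maximizer.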

5.3. Standard Errors

The asymptotic variance matrix of the ML estimators may be obtained, as usual, from the inverse of the observed (concentrated) information matrix. Writing (24) as Z_{0t} = Π Z_{2t} + Γ Z_{1t} + ε_t, and letting θ = ( vec(Π′)′ : vec(Γ′)′ )′, the observed concentrated information matrix for the reduced-form parameter vector θ is obtained from
I_θ = − ∂²ℓ(θ) / ∂θ ∂θ′ =
[ Ω⁻¹ ⊗ Z_2 Z_2′   Ω⁻¹ ⊗ Z_2 Z_1′ ]
[ Ω⁻¹ ⊗ Z_1 Z_2′   Ω⁻¹ ⊗ Z_1 Z_1′ ] .
This leads to the following information matrix in terms of the parameters ( ϕ , η ) :
I_{ϕ,η} = ( J_ϕ : J_η )′ I_θ ( J_ϕ : J_η ) ,
where J_ϕ = ∂θ/∂ϕ′ and J_η = ∂θ/∂η′. From Π = α ρ′ τ′ and Γ = α ψ′ + λ τ′, one obtains
J_ϕ = ( ( α ρ′ ⊗ I_p )′ : ( λ ⊗ I_p )′ )′ G .
Define η = ( vec ( α ) : vec ( ϱ ) : vec ( δ ) : vec ( λ ) ) , so that J η = [ J α : J ϱ : J δ : J λ ] , with
J_α = ( ( I_p ⊗ τ ρ )′ : ( I_p ⊗ ψ )′ )′ , J_ϱ = ( ( α ⊗ τ d )′ : 0 )′ , J_δ = ( 0 : ( α ⊗ c )′ )′ , J_λ = ( 0 : ( I_p ⊗ τ )′ )′ .
With these ingredients, one finds
var̂( ϕ̂ ) = [ Ĵ_ϕ′ Î_θ Ĵ_ϕ − Ĵ_ϕ′ Î_θ Ĵ_η ( Ĵ_η′ Î_θ Ĵ_η )⁻¹ Ĵ_η′ Î_θ Ĵ_ϕ ]⁻¹ ,
where Î_θ, Ĵ_ϕ and Ĵ_η are the expressions given above, evaluated at the ML estimators. Standard errors of individual parameter estimates are obtained as the square roots of the diagonal elements of var̂( ϕ̂ ). Asymptotic normality of the resulting t-statistics (under the null hypothesis), and χ² asymptotic null distributions of likelihood ratio test statistics for the over-identifying restrictions, depend on the conditions for asymptotic mixed normality being satisfied; this is discussed next.
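The expression for var̂(ϕ̂) is the (ϕ, ϕ) block of the inverse of the full information matrix I_{ϕ,η}, by the partitioned-inverse (Schur complement) formula. A quick numerical check with random placeholder matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
n_theta, n_phi, n_eta = 8, 3, 4
A = rng.normal(size=(n_theta, n_theta))
I_theta = A @ A.T + n_theta * np.eye(n_theta)    # stand-in for the observed information
J_phi = rng.normal(size=(n_theta, n_phi))
J_eta = rng.normal(size=(n_theta, n_eta))

# var(phi_hat) from the formula in the text
A11 = J_phi.T @ I_theta @ J_phi
A12 = J_phi.T @ I_theta @ J_eta
A22 = J_eta.T @ I_theta @ J_eta
var_phi = np.linalg.inv(A11 - A12 @ np.linalg.solve(A22, A12.T))

# ... equals the (phi, phi) block of the inverse of the full information matrix
J = np.hstack([J_phi, J_eta])
full_inv = np.linalg.inv(J.T @ I_theta @ J)
assert np.allclose(var_phi, full_inv[:n_phi, :n_phi])

se = np.sqrt(np.diag(var_phi))                   # standard errors
```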

6. Asymptotics

The asymptotic distribution of the ML estimator in the I(2) model has been discussed in Johansen (1997, 2006). As shown there and discussed in Boswijk (2000), the limit distribution of the ML estimator is not jointly mixed normal as in the I(1) case. As a consequence, the limit distribution of LR test statistics of generic hypotheses need not be χ 2 under the null hypothesis.
In some special cases, the asymptotic distribution of the just-identified ML estimator of the cointegration parameters can be shown to be asymptotically mixed normal. Consider the case r_1 = 0 (i.e., r = m), and assume as before that no deterministic terms are included in the model. In this case, the limit distribution of the cointegration parameters in Theorem 4 of Johansen (2006), J06 hereafter, can be described in terms of the estimated parameters B̂_0 := τ̄ ( ψ̂ − ψ ) and B̂_2 := τ̄ ( τ̂ − τ ), where τ̂ is identified as τ_c with c = τ. Note that the components C and B_1 in that theorem do not appear here, because r_1 = 0. One has
( T B̂_0′ : T² B̂_2′ )′ →_w B := ( ∫_0^1 H*(s) H*(s)′ ds )⁻¹ ∫_0^1 H*(s) dW_1(s)′
with H * ( u ) : = ( H 0 ( u ) : H 2 ( u ) ) ,
H_2(u) := ∫_0^u H_0(s) ds , H_0(u) := τ C_2 W(u) , W_1(u) := ( α′ Ω⁻¹ α )⁻¹ α′ Ω⁻¹ W(u) ,
and where T^{−1/2} Σ_{i=1}^{⌊Tu⌋} ε_i →_w W(u), a vector Brownian motion with covariance matrix Ω.8
As noted in J06, B has a mixed normal distribution with mean 0, because H*(u) is a function of α_⊥′ W(u), which is independent of W_1(u). Moreover, in the case r_1 = 0, the C component of the ML limit distribution does not appear, so that the whole limit distribution of the cointegration parameters is jointly mixed normal, unlike in the case r_1 > 0.
One can see that hypothesis (8) defines a smooth restriction on the B_2 parameters.9 More precisely, B_2 depends smoothly on ϕ_2 only, B_2 = B_2(ϕ_2), where ϕ_2 contains the ϕ parameters in (8). Note also that B_0 depends on the parameters in ψ, which are unrestricted by (8); hence B_0 depends only on ϕ_1, B_0 = B_0(ϕ_1), where ϕ_1 contains the parameters in δ in (22).
The conditions of Theorem 5 in J06 are next shown to be verified, and hence the LR test of hypothesis (8) is asymptotically χ² with degrees of freedom equal to the number of constraints, in case r_1 = 0. In fact, B_0(ϕ_1) and B_2(ϕ_2) are smoothly parametrized by the continuously identified parameters ϕ_1 and ϕ_2. Because B_2 does not depend on ϕ_1, one easily deduces ∂B_2/∂ϕ_1 = ∂²B_2/∂ϕ_1∂ϕ_1′ = 0 in (37) of J06. Similarly, one has ϕ_1 = ϕ_1(B), with ∂B_0/∂ϕ_1 and ∂B_2/∂ϕ_2 of full rank; hence (38) of J06 is satisfied. This shows that the LR statistic is asymptotically χ² under the null, for r_1 = 0.
In case r_1 = ( m − r ) > 0, the asymptotic distribution of τ̂ is defined in terms of ( B , C ) in J06, p. 92, which is not jointly mixed normal. In such cases, Boswijk (2000) showed that inference is mixed normal if the restrictions on τ̂_c can be asymptotically linearized in ( B , C ), and separated into two sets of restrictions, the first group involving B only and the second group involving C only. Because the conditions of Theorem 5 in J06 cannot be easily verified for general linear hypotheses of the form (8) in this case, they need to be checked case by case. The authors intend to develop more readily verifiable conditions for χ² inference on τ in future research.

7. Illustration

Following Juselius and Assenmacher (2015), consider a 7-dimensional VAR with
X t = ( p 1 t : p 2 t : e 12 t : b 1 t : b 2 t : s 1 t : s 2 t ) ,
where p_{it}, b_{it} and s_{it} are the log price index, the long-term interest rate and the short-term interest rate of country i at time t, respectively, and e_{12t} is the log of the exchange rate between country 1 (Switzerland) and country 2 (the US) at time t. The results are based on quarterly data over the period 1975:1–2013:3. The model has two lags, a restricted linear trend as in (21), which appears only in the equilibrium-correction term, appended to the vector of lagged levels, and a number of dummy variables; see Juselius and Assenmacher (2017), an updated version of Juselius and Assenmacher (2015), for further details on the empirical model. The data set used here is taken from Juselius and Assenmacher (2017).
Specification (3) is based on the prediction that r 2 = 2 . Based on I(2) cointegration tests, Juselius and Assenmacher (2017) choose a model with r = m = 5 , which indeed implies r 2 = 2 , but also r 1 = m r = 0 ; arguably, however, the test results in Table 1 of their paper also support the hypothesis ( r , r 1 ) = ( 4 , 1 ) , which has the same number r 2 = 2 of common I ( 2 ) trends. The latter model would be selected applying the sequential procedure in Nielsen and Rahbek (2007) using a 5 % or 10 % significance level in each test in the sequence.
Consider the case ( r , r 1 ) = ( 5 , 0 ) . The over-identifying restrictions on τ implied by (3) are incorporated in the parametrization (3), with normalizations ϕ 11 = ϕ 42 = 1 , which in turn leads to the over-identified structure for τ c in (15), to be estimated by ML. The restricted ML estimate of τ c is (standard errors in parentheses):
τ̂_c =
[ 1             0             ]
[ 1.49 (0.11)   25.14 (5.23)  ]
[ 1.88 (0.72)   35.70 (29.81) ]
[ 0             1             ]
[ 0             1.91 (0.53)   ]
[ 0             1.23 (0.29)   ]
[ 0             3.02 (0.95)   ] .
The LR statistic for the 3 over-identifying restrictions equals 16.11. Using the χ²(3) asymptotic limit distribution, one finds an asymptotic p-value of 0.001; the hypothesized structure on τ is therefore rejected.
For comparison, consider also the case ( r , r 1 ) = ( 4 , 1 ) , for which the LR test for cointegration ranks has a p-value of 0.13 . The resulting restricted estimate of τ c is:
τ̂_c =
[ 1             0             ]
[ 1.38 (0.09)   24.67 (5.22)  ]
[ 1.07 (0.56)   30.10 (22.42) ]
[ 0             1             ]
[ 0             1.75 (0.52)   ]
[ 0             1.20 (0.28)   ]
[ 0             2.97 (1.02)   ] .
The estimates and standard errors are similar to those obtained under the hypothesis ( r , r 1 ) = ( 5 , 0 ) . The LR statistic for the over-identifying restrictions now equals 10.08 . If one conjectured that the limit distribution of the LR test is also χ 2 ( 3 ) in this case, one would obtain an asymptotic p-value of 0.018 , so the evidence against the hypothesized structure of τ appears slightly weaker in this model.
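The two p-values can be reproduced from the closed-form χ²(3) survival function, P(X > x) = erfc(√(x/2)) + √(2x/π) e^{−x/2}, which holds for 3 degrees of freedom:

```python
import math

def chi2_3_sf(x):
    """P(X > x) for X ~ chi-squared with 3 degrees of freedom (closed form)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_50 = chi2_3_sf(16.11)   # model (r, r1) = (5, 0)
p_41 = chi2_3_sf(10.08)   # model (r, r1) = (4, 1)
print(round(p_50, 3), round(p_41, 3))   # 0.001 0.018
```

As a sanity check, the function evaluated at the 5% critical value 7.815 returns approximately 0.05.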
The results for both model ( r , r_1 ) = ( 5 , 0 ) and model ( r , r_1 ) = ( 4 , 1 ) are in line with the preferred specification of Juselius and Assenmacher (2017), who select an over-identified structure for τ that is not nested in (15), and that therefore implies a different impact of the common I(2) trends.

8. Conclusions

Hypotheses on the loading matrix of the I(2) common trends are of economic interest. This paper makes explicit how such hypotheses, including over-identifying ones, are related to hypotheses on the cointegration parameters. Likelihood-maximization algorithms are proposed and discussed, along with LR tests of the hypotheses.
The application of these LR tests to a system of prices, exchange rates and interest rates for Switzerland and the US shows support for the existence of two I(2) common trends. These may represent a ‘speculative’ trend and a ‘relative prices’ trend, but there is little empirical support for the corresponding exclusion restrictions in the loading matrix.

Acknowledgments

Helpful comments and suggestions from two anonymous referees and the Academic Editor, Katarina Juselius, are gratefully acknowledged.

Author Contributions

Both authors contributed equally to the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Theorem A1 (Johansen Representation Theorem).
Let the vector process X_t satisfy A(L) X_t = μ_0 + μ_1 t + ε_t, where A(L) := I_p − Σ_{i=1}^k A_i L^i is a matrix lag polynomial of degree k, and where ε_t is an i.i.d. ( 0 , Ω ) sequence. Assume that A(z) is of full rank for all |z| < 1 + c, c > 0, with the exception of z = 1. Let A, Ȧ and Ä denote A(1) and the first and second derivatives of A(z) with respect to z, evaluated at z = 1; finally, define Γ = Ȧ − A. Then X_t is I(2) if and only if the following conditions hold:
(i) 
A = α β′ where α, β are p × r matrices of full column rank r < p,
(ii) 
P_{α_⊥} Γ P_{β_⊥} = α_1 β_1′ where α_1 , β_1 are p × r_1 matrices of full column rank r_1 < p − r,
(iii) 
α_2′ Θ β_2 is of full rank r_2 := p − r − r_1, where Θ := ½ Ä + Ȧ β̄ ᾱ′ Ȧ, α_2 := ( α , α_1 )_⊥ and β_2 := ( β , β_1 )_⊥,
(iv) 
μ 1 = α β D for some β D ,
(v) 
α_2′ μ_0 = α_2′ Γ β̄ β_D .
Under these conditions, X t admits a common trends I(2) representation of the form
X_t = C_2 Σ_{i=1}^t Σ_{s=1}^i ε_s + C_1 Σ_{i=1}^t ε_i + C(L) ε_t + v_0 + v_1 t ,
where
C_2 = β_2 ( α_2′ Θ β_2 )⁻¹ α_2′ ,
C ( L ) ε t is an I(0) linear process, and v 0 and v 1 depend on the VAR coefficients and on the initial values of the process.
Proof. 
See Johansen (1992), Johansen (2009) and Rahbek et al. (1999), which also contain expressions for C 1 , C * ( L ) and ( v 0 , v 1 ) . ☐
It is next shown that conditions (iv) and (v) are satisfied by the τ-parametrization (21). In fact, condition (iv) holds for β_D = ρ′ τ_1. Note that Γ = α ψ′ + λ τ′, β = τ ρ and P_{α_⊥} Γ P_{β_⊥} = P_{α_⊥} λ τ′ P_{β_⊥} = α_1 β_1′. The l.h.s. of (v) is
α_2′ μ_0 = α_2′ λ τ_1 .
Next write the r.h.s. of (v) using τ′ τ ρ ( ρ′ τ′ τ ρ )⁻¹ ρ′ = I − ρ_⊥ ( ρ_⊥′ ( τ′ τ )⁻¹ ρ_⊥ )⁻¹ ρ_⊥′ ( τ′ τ )⁻¹ by oblique projections; one finds
α_2′ Γ β̄ β_D = α_2′ λ τ′ τ ρ ( ρ′ τ′ τ ρ )⁻¹ ρ′ τ_1 = α_2′ λ τ_1 − α_2′ λ ρ_⊥ ( ρ_⊥′ ( τ′ τ )⁻¹ ρ_⊥ )⁻¹ ρ_⊥′ ( τ′ τ )⁻¹ τ_1 = α_2′ λ τ_1 ,
where the last equality holds because α_2′ λ ρ_⊥ = 0, as shown below. Note in fact that β_1 = τ̄ ρ_⊥ lies in col( β_⊥ ) and α_2 lies in col( α_⊥ ); hence one can write
α_2′ λ ρ_⊥ = α_2′ λ τ′ τ̄ ρ_⊥ = α_2′ P_{α_⊥} λ τ′ P_{β_⊥} β_1 = α_2′ P_{α_⊥} Γ P_{β_⊥} β_1 = α_2′ α_1 β_1′ β_1 = 0 .
Hence, because (A3) equals (A4), condition (v) is satisfied.
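The algebra of C_2 = β_2 ( α_2′ Θ β_2 )⁻¹ α_2′ can be illustrated numerically: with β_2 constructed as an orthogonal complement of ( β : β_1 ) and arbitrary placeholder matrices for α_2 and Θ (not derived from an actual VAR), one has C_2 Θ β_2 = β_2, and the cointegration vectors annihilate C_2:

```python
import numpy as np

rng = np.random.default_rng(3)
p, r, r1 = 4, 1, 1
r2 = p - r - r1

beta = rng.normal(size=(p, r))
beta1 = rng.normal(size=(p, r1))
# beta2 spans the orthogonal complement of (beta : beta1)
Q, _ = np.linalg.qr(np.hstack([beta, beta1]), mode='complete')
beta2 = Q[:, r + r1:]
alpha2 = rng.normal(size=(p, r2))    # placeholder for (alpha : alpha1)_perp
Theta = rng.normal(size=(p, p))      # placeholder for 0.5*Addot + ...

C2 = beta2 @ np.linalg.inv(alpha2.T @ Theta @ beta2) @ alpha2.T

assert np.allclose(C2 @ Theta @ beta2, beta2)   # C2 Theta beta2 = beta2
assert np.allclose(beta.T @ C2, 0)              # cointegration vectors annihilate C2
assert np.allclose(beta1.T @ C2, 0)
```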

Appendix B

This Appendix contains a proof that the increase in ℓ from one combination of α-step and τ-step of al1 is greater than or equal to the one obtained by al2. In order to state the argument in somewhat greater generality, define a parameter vector θ partitioned into 3 components, denoted ( θ_1 , θ_2 , θ_3 ), where each θ_j represents a subvector of parameters, of dimensions n_1, n_2, n_3 respectively. Let ℓ(θ) be the log-likelihood function. Define also the following switching algorithms, both starting at the same initial value ( θ_1^{(j−1)} , θ_2^{(j−1)} , θ_3^{(j−1)} ):
Definition A1.
algo1 (3 way switching)
Step 1:
for fixed θ 1 , maximize ℓ with respect to ( θ 2 , θ 3 ) ;
Step 2:
for fixed θ 2 , maximize ℓ with respect to ( θ 1 , θ 3 ) .
Let ℓ( θ^{(1,j)} ) be the value of ℓ corresponding to the application of steps 1 and 2 of algo1.
Definition A2.
algo2 (Pure switching)
Step 1:
for fixed θ 1 , maximize ℓ with respect to ( θ 2 , θ 3 ) ;
Step 2:
for fixed ( θ 2 , θ 3 ) , maximize ℓ with respect to θ 1 .
Let ℓ( θ^{(2,j)} ) be the value of ℓ corresponding to the application of steps 1 and 2 of algo2.
Proposition A1 (Pure versus 3-way switching).
One has ℓ( θ^{(1,j)} ) ≥ ℓ( θ^{(2,j)} ).
Proof. 
In order to see this, let
( θ_2^* , θ_3^* ) = arg max_{θ_2 , θ_3} ℓ( θ_1^{(j−1)} , θ_2 , θ_3 ) .
Step 1 is the same for algo1 and algo2. In the second step of algo1 one considers
ℓ( θ^{(1,j)} ) = max_{θ_1 , θ_3} ℓ( θ_1 , θ_2^* , θ_3 ) ,
while for algo2 one considers
ℓ( θ^{(2,j)} ) = max_{θ_1} ℓ( θ_1 , θ_2^* , θ_3^* ) .
The conclusion that ℓ( θ^{(1,j)} ) ≥ ℓ( θ^{(2,j)} ) follows from the fact that the maximization problem (A6) is a constrained version of (A5), obtained by imposing θ_3 = θ_3^*. ☐
The argument in the proof also implies that the larger the dimension n_3 of the block that algo1 re-optimizes in both steps, the larger the potential gain of algo1 over algo2.
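Proposition A1 can be illustrated on a concave quadratic 'log-likelihood' ℓ(θ) = b′θ − ½ θ′Aθ, for which each block maximization is an exact linear solve (the matrices below are arbitrary placeholders, with scalar blocks n_1 = n_2 = n_3 = 1):

```python
import numpy as np

A = np.array([[2., 1., 1.],
              [1., 2., 1.],
              [1., 1., 2.]])       # positive definite
b = np.array([1., 2., 3.])
ll = lambda th: b @ th - 0.5 * th @ A @ th      # concave 'log-likelihood'

def block_max(theta, free):
    """Maximize ll over the coordinates in `free`, holding the rest fixed."""
    theta = theta.copy()
    fixed = [i for i in range(3) if i not in free]
    rhs = b[free] - A[np.ix_(free, fixed)] @ theta[fixed]
    theta[free] = np.linalg.solve(A[np.ix_(free, free)], rhs)
    return theta

start = np.zeros(3)
step1 = block_max(start, [1, 2])        # both algorithms: max over (theta2, theta3)
theta_algo1 = block_max(step1, [0, 2])  # algo1: max over (theta1, theta3)
theta_algo2 = block_max(step1, [0])     # algo2: max over theta1 only

assert ll(theta_algo1) >= ll(theta_algo2)   # Proposition A1
```

With these values one iteration gives ℓ(θ^{(1,j)}) = 67/27 ≈ 2.481 against ℓ(θ^{(2,j)}) = 22/9 ≈ 2.444, so the inequality is strict here.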

References

  1. Boswijk, H. Peter. 2000. Mixed normality and ancillarity in I(2) systems. Econometric Theory 16: 878–904. [Google Scholar] [CrossRef]
  2. Boswijk, H. Peter, and Jurgen A. Doornik. 2004. Identifying, estimating and testing restricted cointegrated systems: An overview. Statistica Neerlandica 58: 440–65. [Google Scholar] [CrossRef]
  3. Doornik, Jurgen A. 2017a. Accelerated estimation of switching algorithms: The cointegrated VAR model and other applications. Working paper, University of Oxford, Oxford, UK. [Google Scholar]
  4. Doornik, Jurgen A. 2017b. Maximum likelihood estimation of the I(2) model under linear restrictions. Econometrics 5: 19. [Google Scholar] [CrossRef]
  5. Engle, Robert F., and Clive W. J. Granger. 1987. Co-integration and error correction: Representation, estimation, and testing. Econometrica 55: 251–76. [Google Scholar] [CrossRef]
  6. Johansen, Søren. 1991. Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59: 1551–80. [Google Scholar] [CrossRef]
  7. Johansen, Søren. 1992. A representation of vector autoregressive processes integrated of order 2. Econometric Theory 8: 188–202. [Google Scholar] [CrossRef]
  8. Johansen, Søren. 1995. Identifying restrictions of linear equations with applications to simultaneous equations and cointegration. Journal of Econometrics 69: 111–32. [Google Scholar] [CrossRef]
  9. Johansen, Søren. 1996. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. 2nd printing. Oxford: Oxford University Press. [Google Scholar]
  10. Johansen, Søren. 1997. A likelihood analysis of the I(2) model. Scandinavian Journal of Statistics 24: 433–62. [Google Scholar] [CrossRef]
  11. Johansen, Søren. 2006. Statistical analysis of hypotheses on the cointegrating relations in the I(2) model. Journal of Econometrics 132: 81–115. [Google Scholar] [CrossRef]
  12. Johansen, Søren. 2009. Representation of cointegrated autoregressive processes with application to fractional processes. Econometric Reviews 28: 121–45. [Google Scholar] [CrossRef]
  13. Jöreskog, Karl G., Ulf H. Olsson, and Fan Y. Wallentin. 2016. Multivariate Analysis with LISREL. Basel: Springer International Publishing. [Google Scholar]
  14. Juselius, Katarina. 2017a. A theory consistent CVAR scenario for a standard monetary model using data-generated expectations. Working paper, University of Copenhagen, Copenhagen, Denmark. [Google Scholar]
  15. Juselius, Katarina. 2017b. Using a theory-consistent CVAR scenario to test an exchange rate model based on imperfect knowledge. Working paper, University of Copenhagen, Copenhagen, Denmark. [Google Scholar]
  16. Juselius, Katarina, and Katrin Assenmacher. 2015. Real exchange rate persistence: The case of the Swiss franc-US dollar rate. SNB Working paper 2015-03, Swiss National Bank, Zürich, Switzerland. [Google Scholar]
  17. Juselius, Katarina, and Katrin Assenmacher. 2017. Real Exchange Rate Persistence and the Excess Return Puzzle: The Case of Switzerland Versus the US. Journal of Applied Econometrics. forthcoming. [Google Scholar] [CrossRef]
  18. Kongsted, Hans Christian. 2005. Testing the nominal-to-real transformation. Journal of Econometrics 124: 205–25. [Google Scholar] [CrossRef]
  19. Magnus, Jan R., and Heinz Neudecker. 2007. Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd ed. New York: Wiley. [Google Scholar]
  20. Mosconi, Rocco, and Paolo Paruolo. 2017. Cointegration and error correction in I(2) vector autoregressive models: Identification, estimation and testing. Mimeo, Politecnico di Milano, Milano, Italy. [Google Scholar]
  21. Nielsen, Heino Bohn, and Anders Rahbek. 2007. The likelihood ratio test for cointegration ranks in the I(2) model. Econometric Theory 23: 615–37. [Google Scholar] [CrossRef]
  22. Paruolo, Paolo. 1997. Asymptotic inference on the moving average impact matrix in cointegrated I(1) VAR systems. Econometric Theory 13: 79–118. [Google Scholar] [CrossRef]
  23. Paruolo, Paolo. 2000a. Asymptotic efficiency of the two stage estimator in I(2) systems. Econometric Theory 16: 524–50. [Google Scholar] [CrossRef]
  24. Paruolo, Paolo. 2000b. On likelihood-maximizing algorithms for I(2) VAR models. Mimeo, University of Insubria, Varese, Italy. [Google Scholar]
  25. Paruolo, Paolo. 2002. Asymptotic inference on the moving average impact matrix in cointegrated I(2) VAR systems. Econometric Theory 18: 673–90. [Google Scholar] [CrossRef]
  26. Rahbek, Anders, Hans Christian Kongsted, and Clara Jørgensen. 1999. Trend-stationarity in the I(2) cointegration model. Journal of Econometrics 90: 265–89. [Google Scholar] [CrossRef]
  27. Rothenberg, Thomas J. 1971. Identification in parametric models. Econometrica 39: 577–91. [Google Scholar] [CrossRef]
  28. Sargan, J. Denis. 1988. Lectures on Advanced Econometric Theory. Edited by Meghnad Desai. Oxford: Basil Blackwell. [Google Scholar]
  29. Srivastava, Muni S., and C. G. Khatri. 1979. An Introduction to Multivariate Statistics. New York: North Holland. [Google Scholar]
1.
In the I(2) cointegration literature, τ is also referred to as β 2 , see the Johansen Representation Theorem in Appendix A.
2.
Up to normalizations, see below.
3.
When c′ τ is square and nonsingular, one can prove that c_⊥′ τ_⊥ is also square and nonsingular; see, e.g., Johansen (1996, Exercise 3.7).
4.
This equation is obtained by using the orthogonal projection of τ_c on the column spaces of c and c_⊥, and applying the equality c′ τ_c = I_{r_2}, which follows by definition.
5.
In the general VAR(k) model (1), ε t in (16) is replaced by μ 0 + μ 1 t + ε t ; see Section 4.3 below.
6.
The difference m − r = p − r − r_2 is referred to as either s or r_1 in the I(2) cointegration literature; see Appendix A.
7.
If a restricted constant and linear trend are included in the model, as in (21), then Z 1 t and Z 2 t are defined as the residual vectors of regressions of Δ X t 1 and X t 1 , respectively, on X t 1 .
8.
Here →_w indicates weak convergence and ⌊ · ⌋ denotes the greatest integer part.
9.
In the rest of this section the notation ϕ_1, ϕ_2 and ∂B_i/∂ϕ_j is used in accordance with the notation in J06.

Share and Cite

MDPI and ACS Style

Boswijk, H.P.; Paruolo, P. Likelihood Ratio Tests of Restrictions on Common Trends Loading Matrices in I(2) VAR Systems. Econometrics 2017, 5, 28. https://0-doi-org.brum.beds.ac.uk/10.3390/econometrics5030028
