Article

Indirect Inference Estimation of Spatial Autoregressions

1 Department of Economics, Purdue University, West Lafayette, IN 47907, USA
2 School of Economics, Nanjing Audit University, Nanjing 211815, China
3 National Academy of Development and Strategy, Renmin University of China, Beijing 100872, China
* Author to whom correspondence should be addressed.
Submission received: 1 May 2020 / Revised: 5 August 2020 / Accepted: 31 August 2020 / Published: 3 September 2020

Abstract

The ordinary least squares (OLS) estimator for spatial autoregressions may be consistent as pointed out by Lee (2002), provided that each spatial unit is influenced aggregately by a significant portion of the total units. This paper presents a unified asymptotic distribution result of the properly recentered OLS estimator and proposes a new estimator that is based on the indirect inference (II) procedure. The resulting estimator can always be used regardless of the degree of aggregate influence on each spatial unit from other units and is consistent and asymptotically normal. The new estimator does not rely on distributional assumptions and is robust to unknown heteroscedasticity. Its good finite-sample performance, in comparison with existing estimators that are also robust to heteroscedasticity, is demonstrated by a Monte Carlo study.
JEL Classification:
C21; C10; C13

1. Introduction

Spatial autoregressions (SAR) have attracted considerable attention, both applied and theoretical, in economics and other social science disciplines since the classical work of Cliff and Ord (1981). Their popularity is mainly due to the parsimonious representation of cross-sectional correlation by a weight matrix. Correlation in spatial data arises naturally from competition, copycatting, spillovers, and aggregation, to name just a few sources. In these contexts, the space embodied in the weight matrix can be defined in terms of not only geographical distance but also economic distance.
The spatial autoregression model extends autocorrelation in time series to the spatial dimension in the sense that a spatially “lagged” dependent variable is included as a regressor in the structural equation. In time series, the autoregression model can be estimated consistently by ordinary least squares (OLS), but for the spatial autoregression, OLS is usually regarded as an inconsistent method; Robinson (2008) provided an excellent discussion of the intuition behind this. Estimation strategies such as maximum likelihood (ML), quasi maximum likelihood (QML), instrumental variables (IV), and the generalized method of moments (GMM) have been proposed in the literature. The ML is the most efficient, but it imposes stringent distributional assumptions on the data generating process, and both the ML and QML rule out heteroscedasticity in the error term. Further, the (Q)ML method involves calculating the determinant or eigenvalues of a matrix of the same size as the sample, and thus many researchers dismiss its use in moderately large samples and advocate the more flexible IV and GMM estimators, which may incur less computational burden and are also robust to heteroscedasticity.
Lee (2002) overturned the conventional wisdom regarding the OLS estimator in spatial autoregressions when exogenous regressors are included. He showed that while the OLS estimator is inconsistent for spatial autoregressions with a sparse weight matrix, it can be consistent when spatial units have small spatial impacts on other units but each unit may be influenced aggregately by a significant portion of the total units. For the special case of the so-called pure SAR model, namely, when there is no other exogenous regressor, Lee (2002) demonstrated that regardless of the structure of the weight matrix, the OLS estimator is always inconsistent.
In practice, one may have limited knowledge to judge whether a spatial unit is influenced by a significant portion of the total units. When a researcher is constructing the weight matrix, she may know the number of neighboring units for each unit, but she may be unable to tell whether the number of neighbors is significant in finite samples. Thus, this poses a challenge for practitioners regarding the usefulness of the OLS estimator: it may or may not be consistent, depending on the degree of aggregate influence on each unit from other units, which may hardly manifest itself in finite samples. This paper carefully analyzes the asymptotic distribution theory for the OLS estimator. A unified asymptotic distribution result for the recentered OLS estimator is presented under different regimes for the spatial weight matrix. Given the asymptotic result for the recentered OLS estimator, a new estimator based on the indirect inference (II) procedure is proposed.
Kyriacou et al. (2017) used the II procedure in a novel way to correct the inconsistency of the OLS estimator in the pure SAR model under homoscedasticity. Even though they provided promising simulation results under some mild heteroscedasticity, they did not show rigorously how to construct a consistent estimator with no restrictions on the form of heteroscedasticity. In contrast, this paper considers the SAR model with exogenous regressors and adds to the existing spatial literature with unknown heteroscedasticity.1
Note that the problem of inconsistency (of the OLS estimator) is solely due to the presence of the endogenous spatially lagged variable. Once the spatial autoregression parameter is estimated consistently, the OLS procedure can be used to estimate the remaining parameters, though the asymptotic variance needs to be modified accordingly to take into account the uncertainty in the estimated spatial autoregression parameter.
The structure of this paper is as follows. In the next section, the asymptotic behavior of the OLS estimator is discussed under different spatial scenarios. The II estimator, which aims to correct the possible inconsistency of the OLS estimator, is defined and its asymptotic distribution is derived. A very important message from this section is that regardless of the degree of aggregate influence on each spatial unit from other units, the II procedure can always be used and the resulting estimator is consistent and asymptotically normal. Section 3 discusses the special case of pure SAR. Section 4 provides Monte Carlo evidence of the effectiveness of the II estimation strategy. It shows that the II estimator possesses good finite-sample performance relative to other consistent estimators that are also robust to unknown heteroscedasticity and that the II method may be favored for the purpose of hypothesis testing, especially for testing the spatial autoregression parameter. Section 5 concludes. Some useful lemmas are collected in Appendix A and proofs of the results presented in Section 2 and Section 3 are given in Appendix B.
Throughout this paper, K denotes a generic positive constant, arbitrarily large but bounded, that does not depend on the sample size n and whose value may vary in different contexts. I_n is the identity matrix of dimension n and 1_n is an n × 1 vector of ones. For an n × 1 vector a_n, a_{i,n} denotes its i-th element, and for an n × n matrix A_n, a_{ij,n} denotes its ij-th element. ‖A_n‖_∞ = max_{1≤i≤n} Σ_{j=1}^n |a_{ij,n}| and ‖A_n‖_1 = max_{1≤j≤n} Σ_{i=1}^n |a_{ij,n}| are the maximum row sum norm and maximum column sum norm, respectively. A sequence of matrices {A_n} is uniformly bounded in row sums if ‖A_n‖_∞ ≤ K and uniformly bounded in column sums if ‖A_n‖_1 ≤ K. tr and ⊙ are the matrix trace and Hadamard product operators, respectively. Dg(a_n) denotes the square diagonal matrix with the vector a_n on the main diagonal, dg(A_n) is the n × 1 column vector that collects in order the diagonal elements of the square matrix A_n, and Dg(A_n) = Dg(dg(A_n)). The subscript 0 signifies the true parameter value, and a prime denotes transposition.

2. Main Results

Consider the SAR model
y_n = λ W_n y_n + X_n β + u_n = Z_n θ + u_n,  (1)
where n is the total number of cross-sectional units, Z_n = (W_n y_n, X_n), θ = (λ, β′)′, y_n is an n × 1 vector collecting observations on the dependent variable, X_n is an n × k matrix of observations on k exogenous nonstochastic regressors with coefficient vector β, W_n is an n × n matrix of spatial weights with zero diagonal, λ is the spatial autoregression coefficient, and u_n is an n-dimensional vector of error terms.
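As a concrete illustration, the reduced form y_n = S_n^{-1}(X_n β + u_n) can be simulated directly. The sketch below is a hypothetical example, not part of the paper: it assumes a row-normalised circulant weight matrix and arbitrary parameter values.

```python
import numpy as np

# Minimal sketch of the data generating process in (1), under illustrative,
# assumed choices: a row-normalised circulant weight matrix, n = 200, and
# arbitrary parameter values (none of these come from the paper).
rng = np.random.default_rng(0)
n, lam = 200, 0.4

W = np.zeros((n, n))
for i in range(n):                                   # each unit has two neighbours
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5      # zero diagonal, rows sum to 1

X = np.column_stack([np.ones(n), rng.normal(3.0, 1.0, n)])
beta = np.array([1.0, 0.5])
u = rng.normal(0.0, 1.0, n) * np.sqrt(rng.uniform(0.5, 1.5, n))  # heteroscedastic

S = np.eye(n) - lam * W                  # S_n = I_n - lambda * W_n
y = np.linalg.solve(S, X @ beta + u)     # equilibrium solution y_n = S_n^{-1}(X_n beta + u_n)
```

Solving the linear system once is preferable to forming the inverse of S explicitly.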
For ease of presentation, the following matrix notation is introduced:
S_n(λ) = I_n − λ W_n,  G_n(λ) = W_n S_n^{-1}(λ),  M_n = I_n − X_n (X_n′ X_n)^{-1} X_n′,  D_n(λ) = Dg(M_n G_n(λ)),  E_n(λ) = M_n G_n(λ) − D_n(λ).
When a matrix is presented without its argument λ, it is understood to be evaluated at the true parameter value λ_0; namely, S_n = S_n(λ_0), G_n = G_n(λ_0), D_n = D_n(λ_0), and E_n = E_n(λ_0).
If λ were known (equal to its true value), the model would become a standard linear model with S_n y_n as the dependent variable; otherwise, W_n y_n appears on the right-hand side of (1) as a spatially lagged or weighted variable. The OLS estimator of θ_0 is θ̂_n = (Z_n′ Z_n)^{-1} Z_n′ y_n, which may be inconsistent. Since the inconsistency of θ̂_n is solely due to the endogenous W_n y_n, the properties of λ̂_n (the first element of θ̂_n) are discussed first. Once a consistent estimator of λ_0 is available, a consistent estimator of β_0 follows immediately (see Theorem 4 below).2
Let r_n = r_{1n} + r_{2n} with r_{1n} = u_n′ M_n G_n u_n and r_{2n} = β_0′ X_n′ G_n′ M_n u_n, and let d_n = d_{1n} + d_{2n} + d_{3n} with d_{1n} = u_n′ G_n′ M_n G_n u_n, d_{2n} = 2 β_0′ X_n′ G_n′ M_n G_n u_n, and d_{3n} = β_0′ X_n′ G_n′ M_n G_n X_n β_0. By using the partitioned regression formula and substituting W_n y_n = G_n X_n β_0 + G_n u_n, one may write
λ̂_n − λ_0 = (y_n′ W_n′ M_n u_n) / (y_n′ W_n′ M_n W_n y_n) = r_n / d_n = (r_{1n} + r_{2n}) / (d_{1n} + d_{2n} + d_{3n}).  (2)
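The decomposition in (2) can be checked numerically on simulated data. The design below (weight matrix, parameters, seed) is an illustrative assumption, not the paper's.

```python
import numpy as np

# Verify the decomposition lambda_hat - lambda_0 = r_n / d_n on simulated data.
rng = np.random.default_rng(1)
n, lam0 = 150, 0.3
W = np.zeros((n, n))
for i in range(n):                                   # assumed circulant weights
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(3, 1, n)])
beta0 = np.array([1.0, 0.5])
u = rng.normal(size=n)
S = np.eye(n) - lam0 * W
y = np.linalg.solve(S, X @ beta0 + u)

# OLS on the structural equation y = lambda*W*y + X*beta + u
Z = np.column_stack([W @ y, X])
lam_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)[0]

# Components of the decomposition, using the true G_n and M_n
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)    # annihilator of X
G = W @ np.linalg.inv(S)
r1 = u @ M @ G @ u
r2 = beta0 @ X.T @ G.T @ M @ u
d1 = u @ G.T @ M @ G @ u
d2 = 2 * beta0 @ X.T @ G.T @ M @ G @ u
d3 = beta0 @ X.T @ G.T @ M @ G @ X @ beta0
assert np.isclose(lam_hat - lam0, (r1 + r2) / (d1 + d2 + d3))
```

The assertion holds as an exact algebraic identity, up to floating-point error.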
The following assumptions are made throughout this paper.
Assumption 1.
(i) For i ≠ j, w_{ij,n} = O(h_n^{-1}), where the rate sequence {h_n} is uniformly bounded away from zero and lim_{n→∞} h_n/n = 0; (ii) the sequence {W_n} is uniformly bounded in row and column sums; (iii) w_{ii,n} = 0.
Assumption 2.
(i) S_n^{-1} exists; (ii) the sequence {S_n^{-1}} is uniformly bounded in row and column sums.
Assumption 3.
The error terms {u_{i,n}} in u_n = (u_{1,n}, …, u_{n,n})′ have the following properties: (i) E(u_{i,n}) = 0; (ii) E(|u_{i,n}|^{4+δ}) < ∞ for some positive constant δ; (iii) u_{i,n} and u_{j,n} are independent for any i ≠ j.
Assumption 4.
λ_0 is contained in a compact parameter space Λ. For any admissible λ ∈ Λ, {S_n^{-1}(λ)} is uniformly bounded in row and column sums.
Assumption 5.
(i) The elements of X_n are uniformly bounded constants for all n; (ii) the probability limit Γ of n^{-1} Z_n′ Z_n exists and is nonsingular.
Assumption 6.
(i) lim_{n→∞} n^{-1} Var(r_n) exists and is nonzero; (ii) lim_{n→∞} n^{-1} Var(u_n′ E_n u_n + β_0′ X_n′ G_n′ M_n u_n) exists and is nonzero.
Intuitions and related discussions for Assumptions 1–5 are provided in Kelejian and Prucha (2010) and Lee (2001, 2002, 2004). Assumption 1(i) follows naturally when W_n is row- or column-normalized, as is typically the case. Assumption 1(ii) and Assumption 2 limit the degree of dependence among the spatial units and originate from Kelejian and Prucha (1999). Given Assumption 2, the equilibrium solution of y_n is y_n = S_n^{-1} X_n β_0 + S_n^{-1} u_n. Under Assumption 3, let σ_{i,n}^2 = E(u_{i,n}^2), Σ_n = Dg(σ_{1,n}^2, …, σ_{n,n}^2), and Σ_n^{(j)} = Dg(μ_{1,n}^{(j)}, …, μ_{n,n}^{(j)}) with μ_{i,n}^{(j)} = E(u_{i,n}^j), j = 3, 4. Lee (2002) emphasized that Assumption 5(ii) is related to an identification condition for estimation in the least squares and IV frameworks. It rules out possible multicollinearity between X_n and G_n X_n β_0 for large n and implies that the limit of d_{3n}/n is bounded away from zero.3 Assumption 6 ensures that the asymptotic variances of the (properly recentered) OLS estimator and the resulting II estimator are positive.4

2.1. The Asymptotic Behavior of the OLS Estimator

From Lemma A6 in Appendix A, r_{1n} and d_{1n} are both O_P(n/h_n), r_{2n} and d_{2n} are both O_P(√n), and d_{3n} = O(n). The asymptotic properties of the OLS estimator crucially depend on the magnitude of h_n.
When h_n is bounded, the OLS estimator cannot be consistent, since now
λ̂_n − λ_0 = (n^{-1} r_{1n}) / (n^{-1} d_{1n} + n^{-1} d_{3n}) + o_P(1),  (3)
but the probability limit of the numerator n^{-1} r_{1n} is typically nonzero.
If h_n → ∞, d_{3n} dominates the denominator in (3), and then
λ̂_n − λ_0 = [h_n^{-1} · (h_n/n) r_{1n} + n^{-1/2} · n^{-1/2} r_{2n}] / (n^{-1} d_{3n}) + o_P(1) →_p 0,  (4)
indicating that λ̂_n is consistent as long as h_n → ∞.
More interestingly, the behavior of λ̂_n depends on how fast h_n diverges to infinity. If h_n tends to infinity at a rate slower than √n, then r_{1n} dominates r_{2n} in the numerator, and
h_n (λ̂_n − λ_0) = [(h_n/n) r_{1n}] / (n^{-1} d_{3n}) + o_P(1),  (5)
which implies that λ̂_n converges at the slower rate h_n, but it does not converge to λ_0 at rate h_n, as plim_{n→∞} (h_n/n) r_{1n} = lim_{n→∞} (h_n/n) tr(Σ_n M_n G_n) is typically nonzero.
If h_n is of exact order √n, r_{1n} and r_{2n} are of the same order in the numerator, so
√n (λ̂_n − λ_0) = (n^{-1/2} r_{1n} + n^{-1/2} r_{2n}) / (n^{-1} d_{3n}) + o_P(1),  (6)
indicating that λ̂_n converges at rate √n, but at this rate it does not converge to λ_0, as in general plim_{n→∞} n^{-1/2} r_{1n} = lim_{n→∞} n^{-1/2} tr(Σ_n M_n G_n) is nonzero.
If h_n tends to infinity at a rate faster than √n (and yet slower than n), namely, lim_{n→∞} √n/h_n = 0 (and lim_{n→∞} h_n/n = 0), then r_{2n} dominates r_{1n} in the numerator,
√n (λ̂_n − λ_0) = (n^{-1/2} r_{2n}) / (n^{-1} d_{3n}) + o_P(1),
indicating that λ̂_n converges to λ_0 at rate √n and is asymptotically normal if one applies a central limit theorem to n^{-1/2} r_{2n}.
One sees that the asymptotic behavior of λ̂_n depends on the magnitude of h_n, which may be unknown in practice. The following theorem shows that if λ̂_n is properly recentered, a unified asymptotic distribution result follows.
Theorem 1.
Under Assumptions 1–6, the OLS estimator λ̂_n of λ_0 in the SAR model (1) has the following asymptotic distribution:
√n ( λ̂_n − λ_0 − tr(Σ_n M_n G_n) / (y_n′ W_n′ M_n W_n y_n) ) →_d N(0, v),  (7)
where
v = lim_{n→∞} n Var(r_n) / [tr(Σ_n G_n′ M_n G_n) + β_0′ X_n′ G_n′ M_n G_n X_n β_0]^2
with
Var(r_n) = tr[Σ_n^{(4)} (M_n G_n ⊙ M_n G_n)] + tr[Σ_n M_n G_n Σ_n (M_n G_n + G_n′ M_n)] + β_0′ X_n′ G_n′ M_n Σ_n M_n G_n X_n β_0 + 2 β_0′ X_n′ G_n′ M_n dg(Σ_n^{(3)} M_n G_n).
Remark 1.
The recentering term tr(Σ_n M_n G_n) / (y_n′ W_n′ M_n W_n y_n) is in fact E(r_n)/d_n. One could instead have recentered λ̂_n − λ_0 by E(r_n)/E(d_n). (This is the approach taken by Kyriacou et al. (2017) when dealing with the special case of the pure SAR model.) But then, by following an expansion similar to that in the proof of Theorem 1 (see Appendix B.1), one finds that the asymptotic variance of the resulting recentered estimator is much more complicated, involving the variances of r_n and d_n as well as their covariance.
Remark 2.
When lim_{n→∞} √n/h_n = 0, the recentering term tr(Σ_n M_n G_n) / (y_n′ W_n′ M_n W_n y_n) is in fact o_P(n^{-1/2}) and the asymptotic distribution of √n(λ̂_n − λ_0) centers at zero, so one does not need to recenter λ̂_n − λ_0 by tr(Σ_n M_n G_n) / (y_n′ W_n′ M_n W_n y_n); nor does one need to recenter it by y_n′ S_n′ M_n D_n M_n S_n y_n / (y_n′ W_n′ M_n W_n y_n), which is also o_P(n^{-1/2}), as in Theorem 2 to be introduced.
Remark 3.
When h_n diverges, one may single out the dominating terms in Var(r_n) and E(d_n) so that a finer expression of v = lim_{n→∞} [Var(r_n)/n] / [E(d_n)/n]^2 can be presented. For example, under divergent h_n, one can replace Var(r_n) with Var(r_{2n}) = β_0′ X_n′ G_n′ M_n Σ_n M_n G_n X_n β_0 and replace E(d_n) with its dominating term β_0′ X_n′ G_n′ M_n G_n X_n β_0. When h_n is bounded, however, such replacements are not available and one needs to keep all the terms in Var(r_n) and E(d_n). Moreover, since Var(r_n) depends on higher-order moments of u_n, namely, Σ_n^{(3)} and Σ_n^{(4)}, with bounded h_n the asymptotic variance v of the recentered OLS estimator depends on them, too.
Remark 4.
In view of Remark 3, under divergent h_n, if further Σ_n = σ_0^2 I_n, namely, under homoscedasticity with σ_0^2 = E(u_{i,n}^2), then v corresponds to the top left element of σ_0^2 Γ^{-1}. This, together with Remark 2, is in line with the observation in Lee (2002) that the (consistent, uncentered) OLS estimator (when lim_{n→∞} √n/h_n = 0) has the same limiting distribution as the optimal IV estimator and, under normality, the same limiting distribution as the ML estimator. It also implies that in the other cases of divergent h_n (slower than or equal to rate √n), as long as the OLS estimator is properly recentered, it achieves the same limiting distribution.
Remark 5.
Theorem 1 gives a unified representation of the asymptotic distribution of the properly recentered OLS estimator, regardless of the possibly unknown magnitude of h n . Further, it facilitates the construction of the indirect inference estimator to be introduced that corrects the inconsistency, when present, of the OLS estimator.

2.2. The Indirect Inference Estimator

One can see from Theorem 1 in the previous subsection that the OLS estimator λ̂_n may have an asymptotic bias. Yet a direct feasible bias correction is not possible, since the bias itself depends on unknown parameters, including λ_0, which may not be consistently estimated by the OLS estimator λ̂_n. Following Phillips (2012) and Kyriacou et al. (2017), one may define a binding function that involves the bias of λ̂_n. Unfortunately, the resulting binding function then involves the unknown Σ_n, which appears in the recentering quantity for the OLS estimator in (7). The strategy in this paper is to replace tr(Σ_n M_n G_n) = E(u_n′ M_n G_n u_n) with a term that is of the same order as u_n′ M_n G_n u_n and at the same time does not directly involve Σ_n.
Recall that D_n = Dg(M_n G_n). Then under a general form of unknown heteroscedasticity, tr(Σ_n M_n G_n) = tr(Σ_n D_n) = E(u_n′ D_n u_n), where u_n = S_n y_n − X_n β_0. If λ_0 were known, then β_0 could be consistently estimated by β̃_n = β̃_n(λ_0) = (X_n′ X_n)^{-1} X_n′ S_n y_n. Let ũ_n = ũ_n(λ_0) = S_n y_n − X_n β̃_n = M_n S_n y_n. Now one may replace E(u_n′ D_n u_n) with ũ_n′ D_n ũ_n = y_n′ S_n′ M_n D_n M_n S_n y_n and use y_n′ S_n′ M_n D_n M_n S_n y_n / (y_n′ W_n′ M_n W_n y_n) as the recentering quantity.
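The feasible recentering quantity can be sketched numerically. The design below is an illustrative assumption, and, as in Theorem 2, the recentering term is evaluated at the true λ_0 (known here because the data are simulated).

```python
import numpy as np

# Sketch of the recentering quantity of Theorem 2 on simulated data
# (weight matrix and parameter values are illustrative assumptions).
rng = np.random.default_rng(2)
n, lam0 = 150, 0.3
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(3, 1, n)])
beta0 = np.array([1.0, 0.5])
u = rng.normal(size=n) * np.sqrt(rng.uniform(0.5, 1.5, n))
S = np.eye(n) - lam0 * W
y = np.linalg.solve(S, X @ beta0 + u)

M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
G = W @ np.linalg.inv(S)
D = np.diag(np.diag(M @ G))                          # D_n = Dg(M_n G_n)

# OLS estimate of lambda and the feasible recentering term
Z = np.column_stack([W @ y, X])
lam_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)[0]
u_tilde = M @ S @ y                                  # residuals when lambda_0 is known
recenter = (u_tilde @ D @ u_tilde) / (y @ W.T @ M @ W @ y)
lam_recentered = lam_hat - recenter

# identity used in the text: u_tilde' D u_tilde = y' S' M D M S y
assert np.isclose(u_tilde @ D @ u_tilde, y @ S.T @ M @ D @ M @ S @ y)
```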
Theorem 2.
Under Assumptions 1–6, the OLS estimator λ̂_n of λ_0 in the SAR model (1) has the following asymptotic distribution:
√n ( λ̂_n − λ_0 − y_n′ S_n′ M_n D_n M_n S_n y_n / (y_n′ W_n′ M_n W_n y_n) ) →_d N(0, η),
where
η = lim_{n→∞} n { tr[Σ_n E_n Σ_n (E_n + E_n′)] + β_0′ X_n′ G_n′ M_n Σ_n M_n G_n X_n β_0 } / [tr(Σ_n G_n′ M_n G_n) + β_0′ X_n′ G_n′ M_n G_n X_n β_0]^2.
Remark 6.
When tr(Σ_n M_n G_n), which involves the unknown Σ_n and appears in the recentering term of the OLS estimator, is replaced, the asymptotic variance η of the newly recentered estimator no longer involves Σ_n^{(3)} and Σ_n^{(4)}. This stands in contrast to the asymptotic variance v (see Remark 3). So replacing tr(Σ_n M_n G_n) with y_n′ S_n′ M_n D_n M_n S_n y_n facilitates not only the construction of the indirect inference estimator to be introduced but also the inference procedure.
Given Theorem 2 and the observed sample data y_n and X_n, one can always define the sample binding function. Recall that S_n(λ) = I_n − λ W_n and D_n(λ) = Dg(M_n G_n(λ)) = Dg(M_n W_n S_n^{-1}(λ)) are functions of the parameter λ (as well as of X_n). So the binding function can be defined as
b_n(λ) = λ + y_n′ S_n′(λ) M_n D_n(λ) M_n S_n(λ) y_n / (y_n′ W_n′ M_n W_n y_n)  (9)
and the II estimator inverts this binding function:
λ̂_n^{II} = b_n^{-1}(λ̂_n).  (10)
Intuitively, the II estimator defined as such tries to match λ ^ n from the observed data to its expectation, at least approximately. Typically, the expectation may be approximated, to an arbitrary degree of accuracy, via the method of simulations, as in the original spirit of Gouriéroux et al. (1993) and Smith (1993); however, in simulations one needs to make some distributional assumption to generate the pseudo error term. Instead one may use some analytical approximation as in Phillips (2012), Kyriacou et al. (2017), and this paper.
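The inversion in (10) can be sketched numerically. The data generating process below is an illustrative assumption, and the bisection assumes that b_n is increasing on the search interval, in line with b_n′(λ) = 1 + O_P(h_n^{-1}).

```python
import numpy as np

# Sketch of the II step: invert the sample binding function b_n numerically.
rng = np.random.default_rng(3)
n, lam0 = 150, 0.3
W = np.zeros((n, n))
for i in range(n):                                   # assumed circulant weights
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(3, 1, n)])
beta0 = np.array([1.0, 0.5])
u = rng.normal(size=n) * np.sqrt(rng.uniform(0.5, 1.5, n))
y = np.linalg.solve(np.eye(n) - lam0 * W, X @ beta0 + u)

M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
Z = np.column_stack([W @ y, X])
lam_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)[0]       # OLS estimate of lambda

def binding(lam):
    """b_n(lambda) = lambda + y'S(lam)'M D(lam) M S(lam)y / y'W'M W y."""
    S_l = np.eye(n) - lam * W
    D_l = np.diag(np.diag(M @ W @ np.linalg.inv(S_l)))
    num = y @ S_l.T @ M @ D_l @ M @ S_l @ y
    return lam + num / (y @ W.T @ M @ W @ y)

lo, hi = -0.95, 0.95                                 # assumed to bracket the solution
for _ in range(60):                                  # bisection on b_n(lam) = lam_hat
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if binding(mid) < lam_hat else (lo, mid)
lam_ii = 0.5 * (lo + hi)
```

A library root-finder could be used in place of the hand-rolled bisection; nothing in the procedure depends on how the scalar equation is solved.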
In the definition of the II estimator as in (10), it is implicitly assumed that the binding function b n ( λ ) is invertible. Note that b n ( λ ) is a function of λ and the sample data y n and X n and thus it is random.5
Assumption 7.
For all λ ∈ Λ, the binding function (9) is monotonic in λ with probability 1, and when h_n is bounded, b_n′(λ_0) →_{a.s.} b_0 ≠ 0, where
b_n′(λ_0) = 1 + [y_n′ S_n′ M_n Dg(M_n G_n^2) M_n S_n y_n − 2 y_n′ W_n′ M_n D_n M_n S_n y_n] / (y_n′ W_n′ M_n W_n y_n).  (11)
One can see from expression (A.10) in Appendix B.3 that the derivative of the binding function with respect to λ is
b_n′(λ) = 1 + O_P(h_n^{-1}).
For divergent h_n, one can show that the O_P(h_n^{-1}) term converges almost surely to zero for all λ ∈ Λ, so in large samples Assumption 7 is more likely to hold. Assumption 7 lists the conditions under which the II estimator exists and is consistent. It would be desirable to lay down primitive conditions on the data matrix X_n, the weight matrix W_n, and the parameter space Λ under which Assumption 7 would be satisfied. Given the sample data, one may plot the binding function against λ to verify numerically the validity of this assumption. Simulations as in Gospodinov et al. (2017) may also help to establish this assumption's credibility.
Theorem 3.
For the SAR model (1), under Assumptions 1–7, the II estimator λ̂_n^{II} of λ_0, defined as in (10) and based on the binding function (9), has the following asymptotic distribution:
√n (λ̂_n^{II} − λ_0) →_d N(0, η/b_0^2).
Remark 7.
Since b_n′(λ_0) converges almost surely to 1 when h_n diverges, the asymptotic distribution of the II estimator is then identical to that of the properly recentered OLS estimator. Further, under divergent h_n, β_0′ X_n′ G_n′ M_n Σ_n M_n G_n X_n β_0 dominates Var(u_n′ E_n u_n + β_0′ X_n′ G_n′ M_n u_n) and β_0′ X_n′ G_n′ M_n G_n X_n β_0 dominates E(d_n), so η = v + o(1). In light of Remark 4, this means that under homoscedasticity, the II estimator defined as in (10) is as efficient as the optimal IV estimator and can be as efficient as the ML estimator if the spatial data are normally distributed. (The same conclusion holds for the slightly modified II estimator, designed specifically under homoscedasticity, in Appendix B.6.)
Remark 8.
Theorem 3 shows that regardless of the magnitude of h_n, which researchers may not know in practice, one can always apply the II procedure after the OLS estimation is done. At worst, when lim_{n→∞} √n/h_n = 0, this procedure is redundant (see Remark 2), but even then the resulting II estimator has exactly the same asymptotic distribution as the consistent OLS estimator (since η = v + o(1); see Remark 7). Otherwise, the II procedure provides a correction to the inconsistent OLS estimator.
Once the spatial autoregression parameter λ is consistently estimated by λ̂_n^{II}, one can estimate the parameter vector β by
β̂_n^{II} = (X_n′ X_n)^{-1} X_n′ Ŝ_n y_n,  (13)
where Ŝ_n = S_n(λ̂_n^{II}) = S_n − (λ̂_n^{II} − λ_0) W_n.
Theorem 4.
For the SAR model (1), under Assumptions 1–7, the OLS estimator of β_0 defined as in (13) has the asymptotic distribution
√n (β̂_n^{II} − β_0) →_d N(0, V),
and jointly,
√n (θ̂_n^{II} − θ_0) = √n ( λ̂_n^{II} − λ_0 ; β̂_n^{II} − β_0 ) →_d N(0, Ω), with Ω = ( η/b_0^2  γ′ ; γ  V ),
where V, assumed to exist and be positive definite, is given by (A.14) and γ is given by (A.15), respectively, in Appendix B.4.
Remark 9.
One can see from Appendix B.4 that the expression of V contains the traditional OLS variance term under heteroscedasticity, lim_{n→∞} n (X_n′ X_n)^{-1} X_n′ Σ_n X_n (X_n′ X_n)^{-1}, as well as terms that reflect the additional uncertainty introduced by λ̂_n^{II} in the definition of β̂_n^{II} in (13).
In practice, in order to make asymptotically valid inference from the II estimation strategy, one needs to estimate η/b_0^2, V, and γ in Theorems 3 and 4 by η(θ̂_n^{II})/[b_n′(λ̂_n^{II})]^2, V(θ̂_n^{II}, Σ̂_n), and γ(θ̂_n^{II}, Σ̂_n), respectively, where Σ̂_n = Dg(û_{1,n}^2, …, û_{n,n}^2) and the û_{i,n}'s are the sample residuals from the II estimation.6

3. The Special Case of Pure SAR

It is worthwhile to discuss the case when there is no X n , namely, the so-called pure SAR model
y_n = λ W_n y_n + u_n.  (14)
This case is of special interest since there is no IV available. On the other hand, the QML estimator is not consistent under heteroscedasticity. Kyriacou et al. (2017) were the first to explore the possibility of using the II procedure to correct the inconsistency of the OLS estimator under some mild form of heteroscedasticity and their results were quite promising. In this paper, no restrictions are imposed on the form of the unknown heteroscedasticity.
Given the expansion (3), one can see that
plim_{n→∞} (λ̂_n − λ_0) = [plim_{n→∞} (h_n/n) r_{1n}] / [plim_{n→∞} (h_n/n) d_{1n}] = [lim_{n→∞} (h_n/n) tr(Σ_n G_n)] / [lim_{n→∞} (h_n/n) tr(Σ_n G_n′ G_n)] ≠ 0,
regardless of the magnitude of h_n. Proceeding similarly as before, as long as h_n = o(n), one can write r_{1n} − E(r_{1n}) = O_P(√(n/h_n)), d_{1n} − E(d_{1n}) = O_P(√(n/h_n)), and E(d_{1n}) = O(n/h_n) (see Lemma A6 in Appendix A). Then, by a Nagar-type (Nagar (1959)) expansion,
√(n/h_n) ( λ̂_n − λ_0 − E(r_{1n})/d_{1n} ) = √(n/h_n) [r_{1n} − E(r_{1n})] / { E(d_{1n}) + [d_{1n} − E(d_{1n})] } = √(n/h_n) [r_{1n} − E(r_{1n})] / E(d_{1n}) + o_P(1).
Assumption 6 needs to be modified accordingly to ensure that the asymptotic variance of the properly recentered λ̂_n exists and is positive. Now let D_n = Dg(G_n) and E_n = G_n − D_n.
Assumption 8.
(i)
v = lim_{n→∞} n { tr[Σ_n^{(4)} (G_n ⊙ G_n)] + tr[Σ_n G_n Σ_n (G_n + G_n′)] } / ( h_n [tr(Σ_n G_n′ G_n)]^2 )
exists and is positive; (ii)
η = lim_{n→∞} n tr[Σ_n E_n Σ_n (E_n + E_n′)] / ( h_n [tr(Σ_n G_n′ G_n)]^2 )
exists and is positive.
Corollary 1.
Under Assumptions 1–4 and 8, the OLS estimator λ̂_n of λ in the pure SAR model (14) has the following asymptotic distribution:
√(n/h_n) ( λ̂_n − λ_0 − tr(Σ_n G_n) / (y_n′ W_n′ W_n y_n) ) →_d N(0, v),
where v is defined as in Assumption 8(i).
Corollary 2.
Under Assumptions 1–4 and 8, the OLS estimator λ̂_n of λ in the pure SAR model (14) has the following asymptotic distribution:
√(n/h_n) ( λ̂_n − λ_0 − y_n′ S_n′ D_n S_n y_n / (y_n′ W_n′ W_n y_n) ) →_d N(0, η),
where η is defined as in Assumption 8(ii).
Let the sample binding function be
b_n(λ) = λ + y_n′ S_n′(λ) D_n(λ) S_n(λ) y_n / (y_n′ W_n′ W_n y_n).  (18)
Accordingly, Assumption 7 is modified as follows.
Assumption 9.
For all λ ∈ Λ, the binding function (18) is monotonic in λ with probability 1 and b_n′(λ_0) →_{a.s.} b_0 ≠ 0, where
b_n′(λ_0) = 1 + [y_n′ S_n′ Dg(G_n^2) S_n y_n − 2 y_n′ W_n′ D_n S_n y_n] / (y_n′ W_n′ W_n y_n).  (19)
Corollary 3.
For the pure SAR model (14), under Assumptions 1–4, 8, and 9, the II estimator λ̂_n^{II} of λ that is based on the binding function (18) has the following asymptotic distribution:
√(n/h_n) (λ̂_n^{II} − λ_0) →_d N(0, η/b_0^2),  (20)
where η is defined as in Assumption 8(ii).
Remark 10.
If Σ_n = σ_0^2 I_n (namely, under homoscedasticity) and one uses E(r_{1n})/E(d_{1n}) = tr(G_n)/tr(G_n′ G_n) as the recentering term, which fortunately does not involve the unknown variance parameter σ_0^2, then the binding function, its derivative, and the asymptotic distribution of the resulting II estimator can be modified accordingly as in Kyriacou et al. (2017). In contrast to the SAR model with X_n, the recentered OLS estimator and the II estimator in the pure SAR model have convergence rate √(n/h_n).
Remark 11.
Note that b_n′(λ_0) does not converge almost surely to 1 when h_n is divergent, unlike the case when X_n is present. This is because y_n′ W_n′ W_n y_n = O_P(n/h_n) in the pure case whereas y_n′ W_n′ M_n W_n y_n = O_P(n) when X_n is present, while the numerators of the second terms on the right-hand sides of (19) and (11) are both O_P(n/h_n).
Remark 12.
Admittedly, the convergence rate in (20) depends on h_n. However, this does not prevent one from using the II estimator to estimate the pure SAR model, since the binding function (18) does not involve h_n. For inference purposes, since η defined as in Assumption 8(ii) carries the scaling factor n/h_n, once λ̂_n^{II} and the sample residuals û_{i,n} are available, the standard error of λ̂_n^{II} can be calculated as √{ tr[Σ̂_n Ê_n Σ̂_n (Ê_n + Ê_n′)] / [b_n′(λ̂_n^{II}) tr(Σ̂_n Ĝ_n′ Ĝ_n)]^2 }, where Ĝ_n = G_n(λ̂_n^{II}), Ê_n = E_n(λ̂_n^{II}), and Σ̂_n = Dg(û_{1,n}^2, …, û_{n,n}^2).
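The full pure-SAR recipe of this section (OLS, inversion of (18), and the plug-in standard error of Remark 12) can be sketched as follows. The group design, parameter values, and seed are illustrative assumptions, and b_n′ is approximated here by a central difference instead of the closed form (19).

```python
import numpy as np

# Sketch: II estimation and plug-in standard error in the pure SAR model.
rng = np.random.default_rng(4)
sizes = rng.integers(3, 21, size=25)                 # assumed group sizes in {3,...,20}
n = int(sizes.sum())
W = np.zeros((n, n))
sig2, start = [], 0
for m in sizes:                                      # group-interaction weights
    W[start:start + m, start:start + m] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
    sig2.extend([1.0 / m] * m)                       # variance structure V2
    start += m

lam0 = 0.6
u = rng.normal(size=n) * np.sqrt(np.array(sig2))
y = np.linalg.solve(np.eye(n) - lam0 * W, u)

lam_hat = (y @ W.T @ y) / (y @ W.T @ W @ y)          # OLS in the pure SAR model

def binding(lam):                                    # b_n(lambda) as in (18)
    S_l = np.eye(n) - lam * W
    D_l = np.diag(np.diag(W @ np.linalg.inv(S_l)))   # D_n(lambda) = Dg(G_n(lambda))
    return lam + (y @ S_l.T @ D_l @ S_l @ y) / (y @ W.T @ W @ y)

lo, hi = -0.95, 0.95                                 # assumed bracket; bisection
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if binding(mid) < lam_hat else (lo, mid)
lam_ii = 0.5 * (lo + hi)

# Plug-in standard error in the spirit of Remark 12
S_hat = np.eye(n) - lam_ii * W
G_hat = W @ np.linalg.inv(S_hat)
E_hat = G_hat - np.diag(np.diag(G_hat))
Sig_hat = np.diag((S_hat @ y) ** 2)                  # Dg of squared residuals
bprime = (binding(lam_ii + 1e-6) - binding(lam_ii - 1e-6)) / 2e-6
num = np.trace(Sig_hat @ E_hat @ Sig_hat @ (E_hat + E_hat.T))
den = (bprime * np.trace(Sig_hat @ G_hat.T @ G_hat)) ** 2
se = np.sqrt(num / den)
```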

4. Monte Carlo Evidence

In this section, Monte Carlo simulations demonstrate the performance of λ̂_n^{II} (as well as β̂_n^{II} in (13)) in finite samples, in comparison with consistent estimators that are also robust under heteroscedasticity: the optimal robust GMM estimator of Lin and Lee (2010) and the modified QML (MQML) estimator of Liu and Yang (2015). For the optimal robust GMM estimator of Lin and Lee (2010), u_n′ Σ_n^{-1} [G_n(λ) − Dg(G_n(λ))] u_n and u_n′ Σ_n^{-1} (G_n(λ) X_n β, X_n) are used as the optimal moment conditions; see Debarsy et al. (2015). They involve λ (appearing in G_n(λ)) and β, as well as the covariance matrix Σ_n. For λ and β, an initial estimate is constructed from the simple 2SLS with W_n X_n and X_n as IVs. One may assume a model for Σ_n and then estimate it so as to construct the moment conditions. In this section, two choices are made in this regard: one is to use u_n′ [G_n − Dg(G_n)] u_n and u_n′ (G_n X_n β, X_n) as the moment conditions, and the other is to use u_n′ Σ_n^{-1} [G_n − Dg(G_n)] u_n and u_n′ Σ_n^{-1} (G_n X_n β, X_n) with the true Σ_n (known in simulations) plugged in. The two resulting estimators are denoted by GMM and GMM(Σ_n), respectively, in Table 1, Table 2, Table 3 and Table 4. One would expect that in practice the performance of the optimal robust GMM estimator with an estimated Σ_n appearing in the moment conditions would most likely lie between the two.
In each experiment, for each estimator, the bias and root mean squared error (RMSE) from 1000 Monte Carlo replications are reported. The empirical rejection probabilities of the relevant t tests for testing each parameter equal to its true value at the 5% level are also reported, denoted by P(5%), where the asymptotic variances for λ̂_n^{II} and β̂_n^{II}, as discussed in the last paragraph of Section 2, are estimated with the unknowns replaced by their estimates based on the II procedure.7
For the purpose of comparison, the experimental design in Lin and Lee (2010) is followed closely. The spatial scenario under a group interaction weight matrix of Case (1991) is considered. The exogenous variables include a constant term and two independently distributed random variables following N ( 3 , 1 ) and U ( 1 , 2 ) , respectively. The size of each group is determined by a U ( 3 , 20 ) random variable. The error terms follow a zero-mean normal distribution with variances varying across groups. Two variance structures (V1 and V2) are considered. V1: for each group, if the group size is greater than 10, then the error variance is the same as the group size; otherwise, the variance is the inverse of the square of the group size. V2: for each group, the error variance is the inverse of the group size. Two sets of parameter configurations are used: θ 0 = ( λ 0 , 0.8 , 0.2 , 1.5 ) and θ 0 = ( λ 0 , 0.2 , 0.2 , 0.1 ) , named P1 and P2, respectively. Different degrees of spatial autocorrelation are considered: λ 0 = 0.2 , 0.6 , 0.9 . Results are reported in Table 1, Table 2, Table 3 and Table 4.
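The group-interaction design described above can be sketched in code. The number of groups and the seed below are illustrative assumptions; the weights, group sizes, and variance structures V1 and V2 follow the description in the text.

```python
import numpy as np

# Sketch of the group-interaction design (Case 1991): group sizes drawn from
# U(3, 20), equal weights 1/(m-1) within a group, zero diagonal, and the two
# group-level error variance structures V1 and V2 described in the text.
rng = np.random.default_rng(5)
sizes = rng.integers(3, 21, size=30)          # integer group sizes in {3,...,20}
n = int(sizes.sum())

W = np.zeros((n, n))
v1, v2, start = [], [], 0
for m in sizes:
    W[start:start + m, start:start + m] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
    v1.extend([float(m) if m > 10 else m ** -2.0] * m)   # V1: size, or 1/size^2
    v2.extend([1.0 / m] * m)                             # V2: inverse group size
    start += m
v1, v2 = np.array(v1), np.array(v2)
```

Here h_n corresponds to the group sizes, which stay bounded between 3 and 20, so this design sits in the regime where OLS is inconsistent and the II correction matters.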
Some interesting observations arise. Firstly, all the consistent estimators deliver almost unbiased results across all the experimental configurations, though the estimated intercept term associated with $X_n$ is relatively more biased. Secondly, among the consistent estimators, the optimal robust GMM using the true $\Sigma_n$ usually achieves the smallest RMSE. The other three estimators have very similar performance in terms of RMSE. Thirdly, for the purpose of hypothesis testing, the II-based procedure appears as good as the one based on (the infeasible) GMM($\Sigma_n$), with empirical rejection rates matching the nominal size very closely. The MQML of Liu and Yang (2015) tends to deliver under-sized t tests for the spatial autoregression parameter $\lambda$ when its value is relatively high. For example, in Table 4, one sees rejection rates of 0.4% and 0.1%, under $R = 100$ and $R = 200$, respectively, for testing $\lambda$ equal to its true value when $\lambda_0 = 0.9$ from a 5% t test based on Liu and Yang (2015). This under-size problem, when the degree of spatial correlation is high, also carries over to the t test associated with the intercept parameter. The GMM estimator, when one is unsure of the error variance structure and uses $u_n'(G_n-\mathrm{Dg}(G_n))u_n$ and $u_n'(G_nX_n\beta, X_n)$ as the moment conditions, delivers very disappointing size performance in Table 4 when testing either the spatial autoregression parameter $\lambda$ or the intercept parameter: the rejection rates reach around 20% at the 5% nominal size.
Given the simulated data, it is worthwhile to look at a plot of the binding function to check whether it is monotonic, as required by Assumption 7. Figure 1 is drawn for 1000 simulated data sets under variance structure V1 and parameter configuration P1 with $R = 100$.8 Recall that for a given $\theta_0 = (\lambda_0, \beta_0')'$ and the exogenous $X_n$, the data generating process generates the observable data $y_n$, and the binding function $b_n(\lambda) = \lambda + y_n'S_n'(\lambda)M_nD_n(\lambda)M_nS_n(\lambda)y_n/y_n'W_n'M_nW_ny_n$ is then a function of $\lambda$. Figure 1 clearly illustrates that the binding function is monotonic in $\lambda$, and thus the monotonicity condition in Assumption 7 is numerically supported. Figures drawn for the simulated data under other configurations display similar patterns and are omitted.
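Numerically, the binding function can be evaluated on a grid of $\lambda$ values and inspected for monotonicity, as the figure does. The sketch below assumes $D_n(\lambda) = \mathrm{Dg}(M_nG_n(\lambda))$, which is consistent with the derivative expression in the appendix but should be checked against the definition in the main text; all names are illustrative.

```python
import numpy as np

def binding_function(lam_grid, y, X, W):
    """Evaluate b_n(lambda) = lambda
         + y'S(lambda)' M D(lambda) M S(lambda) y / (y'W'M W y)
    on a grid, taking D(lambda) = Dg(M G(lambda)) (an assumption made
    for this sketch)."""
    n = len(y)
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)  # annihilator of X
    Wy = W @ y
    denom = Wy @ M @ Wy                                # y'W'M W y
    vals = []
    for lam in lam_grid:
        S = np.eye(n) - lam * W
        G = W @ np.linalg.inv(S)                       # G(lambda) = W S(lambda)^{-1}
        D = np.diag(np.diag(M @ G))                    # Dg(M G(lambda))
        Sy = S @ y
        vals.append(lam + (Sy @ M @ D @ M @ Sy) / denom)
    return np.array(vals)
```

Plotting the returned values against the grid reproduces the kind of curve shown in Figure 1, and `np.all(np.diff(vals) > 0)` gives a crude numerical monotonicity check for one data set.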
One may wonder about the performance of the proposed II estimator under homoscedasticity, relative to the QML estimator of Lee (2004) and the best GMM estimator of Lee (2007).9 Table 5 and Table 6 report the Monte Carlo results under parameter configurations P1 and P2, but now the error term is simulated as a standard normal random variable. The exogenous variables are simulated in the same way as before. From Table 5 and Table 6, one observes that the II estimator is slightly better than the best GMM estimator of Lee (2007), usually delivering smaller finite-sample bias and lower RMSE. Both methods have good finite-sample size performance in terms of the 5% t test. The finite-sample performance of the QML, on the other hand, is quite different from what the asymptotic theory predicts: its bias is more severe than that of the other two, its RMSE is higher, and its size performance is very poor.
The simulation results suggest that the II estimator could be used at least as a complement to other consistent estimators proposed in the literature that are robust to unknown heteroscedasticity. The II method may be favored for the purpose of hypothesis testing, especially for testing the spatial autoregression parameter. It also has very good finite-sample performance under homoscedasticity.

5. Concluding Remarks

Lee (2002) challenged the traditional wisdom that the OLS estimator is biased and inconsistent in spatial autoregressions and showed that it may be consistent under some special circumstances if there are exogenous regressors included. This paper thoroughly examines the asymptotic behavior of the OLS estimator under different specifications of the degree of aggregate influence on each unit from other units and provides a unified asymptotic distribution result of the recentered OLS estimator. Based on this, an indirect inference estimator, which is consistent and asymptotically normal, is introduced. The new estimator is relatively easy to calculate, does not rely on distributional assumptions on the data, and is robust to heteroscedasticity. Monte Carlo experiments in this paper show the good finite-sample performance of the II estimator in comparison with other consistent estimators that are robust to unknown heteroscedasticity.
In this paper, no attempt is made to compare the asymptotic variances of the GMM and II estimators. The II estimator in this paper may be interpreted as an estimator that uses one moment condition, namely, matching the OLS estimator $\hat{\lambda}_n$ with its approximate analytical expectation. In contrast, the GMM estimator in Lin and Lee (2010) is based on a set of exact expectations of bilinear and quadratic forms in $u_n$. The OLS estimator itself is based on an incorrect moment condition, namely, exogeneity of $W_ny_n$. It is not clear whether correcting an incorrect moment condition is as efficient as using a set of correct moment conditions. A fruitful strategy is perhaps to design a combined estimator.10 Another possible extension is to consider the more general higher-order SARAR model (spatial autoregressive model with spatial autoregressive disturbances) with heteroscedastic innovations, as in Badinger and Egger (2011) and Jin and Lee (2019). In this more general setup, the II procedure may be implemented as follows. One can first derive the approximate analytical expectation of the OLS estimator of the parameters in the SAR part, taking the parameters in the disturbance part as given, and thus design a "corrected" SAR estimator. Then, based on the residuals that arise from the "corrected" SAR estimator, one can derive the approximate analytical expectation of the OLS estimator of the parameters in the disturbance part. Finally, one can jointly estimate all the parameters in the SAR and disturbance parts by using the two sets of approximate analytical expectations. Some preliminary simulations show very promising results from this approach. Rigorous treatments of these extensions are left for future studies.

Author Contributions

The three authors of the paper have contributed equally, via joint efforts, regarding ideas, research, and writing. Conceptualization, Y.B. and X.L.; methodology, Y.B.; software, Y.B. and X.L.; validation, Y.B., X.L. and L.Y.; formal analysis, Y.B., X.L. and L.Y.; investigation, Y.B.; resources, not applicable; writing–original draft preparation, Y.B.; writing–review and editing, Y.B., X.L. and L.Y.; visualization, Y.B.; supervision, not applicable; project administration, Y.B., X.L. and L.Y.; funding acquisition, L.Y. All authors have read and agreed to the published version of the manuscript.

Funding

Lihong Yang’s research was partially supported by the National Natural Science Foundation of China under Grant No. 71573269.

Acknowledgments

The authors are grateful to the three anonymous referees, seminar participants at Huazhong Agricultural University, Huazhong University of Science and Technology, Nanjing Audit University, Shandong University, South China Normal University, and University of California, Riverside, and conference participants at the 2019 CES conference (Dalian) for their helpful comments. Jeff Ello from the Krannert Computing Center at Purdue University kindly created a virtual machine from a computer cluster to facilitate the simulations conducted in an early version of this paper. The authors are responsible for all remaining errors.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Lemmas

This appendix collects several lemmas that are useful for deriving the main results. Some of these results (without proofs) were either derived or presented in different ways, see Kelejian and Prucha (1999, 2001) and Lee (2001, 2002, 2004).
Lemma A1.
If { A n } and { B n } are uniformly bounded in row and column sums, then so are { A n + B n } and { A n B n } .
Lemma A2.
Suppose $\{A_n\}$ has elements of order $O(h_n^{-1})$. If $\{B_n\}$ is uniformly bounded in column sums, then the elements of $A_nB_n$ are $O(h_n^{-1})$; if $\{B_n\}$ is uniformly bounded in row sums, then the elements of $B_nA_n$ are $O(h_n^{-1})$. In either case, $\mathrm{tr}(A_nB_n) = O(n/h_n)$.
Lemma A3.
For a product involving (powers of) $G_n$ and $G_n'$, $\prod_{l=1}^{m}(G_n^{i_1}G_n'^{i_2})^{j_l}$, where $i_1, i_2 \ge 0$, $i_1 + i_2 > 0$, $j_l \ge 0$, $\sum_{l=1}^{m}j_l > 0$, with $i_1$, $i_2$, and $j_l$ all being integers, under Assumptions 1 and 2, its elements are of order $O(h_n^{-1})$ and its trace is of order $O(n/h_n)$.
Proof. 
Under Assumptions 1 and 2, from Lemma A2, the elements of $G_n = W_nS_n^{-1}$ are $O(h_n^{-1})$ and $\mathrm{tr}(G_n) = O(n/h_n)$. From Lemma A1, $G_n = W_nS_n^{-1}$ is uniformly bounded in row and column sums. By successively using Lemma A1, $G_n^{i_1-1}$ is uniformly bounded in row and column sums, and through Lemma A2, the elements of $G_n^{i_1} = G_nG_n^{i_1-1}$ are $O(h_n^{-1})$ and $\mathrm{tr}(G_n^{i_1}) = O(n/h_n)$. The same claim applies to $G_n'^{i_2}$, which is also uniformly bounded in row and column sums. Then $G_n^{i_1}G_n'^{i_2}$ is uniformly bounded in row and column sums with its elements being $O(h_n^{-1})$ and $\mathrm{tr}(G_n^{i_1}G_n'^{i_2}) = O(n/h_n)$. Proceeding similarly, one can see that the product $\prod_{l=1}^{m}(G_n^{i_1}G_n'^{i_2})^{j_l}$ shares these properties too. □
Lemma A4.
For the sequence $\{u_n\}$ with elements following Assumption 3, let $A_n$ and $B_n$ be nonrandom. Then
$$E(u_n'A_nu_n) = \mathrm{tr}(\Sigma_nA_n), \qquad E(u_nu_n'A_nu_n) = \mathrm{dg}(\Sigma_n^{(3)}A_n), \qquad (A.1)$$
$$E(u_n'A_nu_n\,u_n'B_nu_n) = \mathrm{tr}(\Sigma_n^{(4)}\mathrm{Dg}(A_n)\mathrm{Dg}(B_n)) + \mathrm{tr}(\Sigma_nA_n)\,\mathrm{tr}(\Sigma_nB_n) + \mathrm{tr}[\Sigma_nA_n\Sigma_n(B_n+B_n')].$$
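As a quick check of the first two identities, write $\Sigma_n = E(u_nu_n') = \mathrm{diag}(\sigma_{1,n}^2,\ldots,\sigma_{n,n}^2)$ and use the independence of the errors under Assumption 3:

```latex
E(u_n' A_n u_n)
  = \sum_{i}\sum_{j} a_{ij,n}\, E(u_{i,n} u_{j,n})
  = \sum_{i} a_{ii,n}\, \sigma_{i,n}^2
  = \operatorname{tr}(\Sigma_n A_n),
```

and the $i$-th element of $E(u_nu_n'A_nu_n)$ is $\sum_{j,k}a_{jk,n}E(u_{i,n}u_{j,n}u_{k,n}) = a_{ii,n}\mu_{3,i,n}$, which is the $i$-th diagonal element of $\Sigma_n^{(3)}A_n$.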
Lemma A5.
Under Assumptions 1–5,
$$u_n'G_nu_n = O_P(n/h_n), \quad u_n'M_nG_nu_n = O_P(n/h_n), \quad u_n'G_n'M_nG_nu_n = O_P(n/h_n),$$
$$\beta_0'X_n'G_n'M_nu_n = O_P(\sqrt{n}), \quad \beta_0'X_n'G_n'M_nG_nu_n = O_P(\sqrt{n}), \quad \beta_0'X_n'G_n'M_nG_nX_n\beta_0 = O(n).$$
Proof. 
From (A.1), $E(u_n'G_nu_n) = \mathrm{tr}(\Sigma_nG_n)$. Under Assumption 3, $\Sigma_n$ is uniformly bounded, so $\mathrm{tr}(\Sigma_nG_n) \le K\,\mathrm{tr}(G_n) = O(n/h_n)$ from Lemma A3. Lee (2004) shows that, under Assumption 5, $\lim_{n\to\infty} n^{-1}(X_n, G_nX_n\beta_0)'(X_n, G_nX_n\beta_0)$ is nonsingular if and only if both the limits of $n^{-1}X_n'X_n$ and $n^{-1}\beta_0'X_n'G_n'M_nG_nX_n\beta_0$ are nonsingular, indicating that $X_n'X_n = O(n)$ and $\beta_0'X_n'G_n'M_nG_nX_n\beta_0 = O(n)$. Also from Lee (2004), $M_n$ is uniformly bounded in row and column sums, and then $E(u_n'M_nG_nu_n) = \mathrm{tr}(\Sigma_nM_nG_n) \le K\,\mathrm{tr}(M_nG_n) = O(n/h_n)$ from Lemmas A2 and A3. Similarly, one can show $E(u_n'G_n'M_nG_nu_n) = O(n/h_n)$. As for $\beta_0'X_n'G_n'M_nu_n$, its expectation is zero and its variance is $\beta_0'X_n'G_n'M_n\Sigma_nM_nG_nX_n\beta_0$, which is bounded by $K\beta_0'X_n'G_n'M_nG_nX_n\beta_0 = O(n)$; it then follows that $\beta_0'X_n'G_n'M_nu_n = O_P(\sqrt{n})$. As for $\beta_0'X_n'G_n'M_nG_nu_n$, note that $G_n'M_nG_nG_n'M_nG_n$ is uniformly bounded in row and column sums through Lemmas A1 and A2. Given that the elements of $X_n$ are uniformly bounded, one has $\beta_0'X_n'G_n'M_nG_n\Sigma_nG_n'M_nG_nX_n\beta_0 = O(n)$, and it follows that $\beta_0'X_n'G_n'M_nG_nu_n = O_P(\sqrt{n})$. □
Lemma A6.
Under Assumptions 1–5, $E(r_n) = O(n/h_n)$, $\mathrm{Var}(r_n) = O(n)$, $E(d_n) = O(n)$, and $\mathrm{Var}(d_n) = O(n)$. When there is no $X_n$, $E(r_{1n}) = O(n/h_n)$, $\mathrm{Var}(r_{1n}) = O(n/h_n)$, $E(d_{1n}) = O(n/h_n)$, and $\mathrm{Var}(d_{1n}) = O(n/h_n)$.
Proof. 
Given Lemmas A4 and A5,
$$E(r_n) = \mathrm{tr}(\Sigma_nM_nG_n) = O(n/h_n)$$
and
$$E(d_n) = \mathrm{tr}(\Sigma_nG_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0 = O(n)$$
are obvious. Using Lemma A4,
$$\begin{aligned}
\mathrm{Var}(r_n) &= \mathrm{Var}(u_n'M_nG_nu_n) + \mathrm{Var}(\beta_0'X_n'G_n'M_nu_n) + 2\,\mathrm{Cov}(u_n'M_nG_nu_n,\ \beta_0'X_n'G_n'M_nu_n)\\
&= E[(u_n'M_nG_nu_n)^2] - [E(u_n'M_nG_nu_n)]^2 + E(u_n'M_nG_nX_n\beta_0\beta_0'X_n'G_n'M_nu_n) + 2\beta_0'X_n'G_n'M_nE(u_nu_n'M_nG_nu_n)\\
&= \mathrm{tr}(\Sigma_n^{(4)}\mathrm{Dg}(M_nG_n)\mathrm{Dg}(M_nG_n)) + \mathrm{tr}[\Sigma_nM_nG_n\Sigma_n(M_nG_n + G_n'M_n)] + \beta_0'X_n'G_n'M_n\Sigma_nM_nG_nX_n\beta_0 + 2\beta_0'X_n'G_n'M_n\mathrm{dg}(\Sigma_n^{(3)}M_nG_n),
\end{aligned}$$
where, in view of Lemmas A3 and A5, $\beta_0'X_n'G_n'M_n\Sigma_nM_nG_nX_n\beta_0 = O(n)$ is the leading term. Similarly,
$$\begin{aligned}
\mathrm{Var}(d_n) &= E[(u_n'G_n'M_nG_nu_n)^2] - [E(u_n'G_n'M_nG_nu_n)]^2 + 4E(u_n'G_n'M_nG_nX_n\beta_0\beta_0'X_n'G_n'M_nG_nu_n) + 4\beta_0'X_n'G_n'M_nG_nE(u_nu_n'G_n'M_nG_nu_n)\\
&= \mathrm{tr}(\Sigma_n^{(4)}\mathrm{Dg}(G_n'M_nG_n)\mathrm{Dg}(G_n'M_nG_n)) + 2\,\mathrm{tr}[\Sigma_nG_n'M_nG_n\Sigma_nG_n'M_nG_n] + 4\beta_0'X_n'G_n'M_nG_n\Sigma_nG_n'M_nG_nX_n\beta_0 + 4\beta_0'X_n'G_n'M_nG_n\mathrm{dg}(\Sigma_n^{(3)}G_n'M_nG_n),
\end{aligned}$$
in which $\beta_0'X_n'G_n'M_nG_n\Sigma_nG_n'M_nG_nX_n\beta_0 = O(n)$ is the leading term. For the case when there is no $X_n$, the results are obvious. □
Lemma A7.
Suppose $\{A_n\}$ is a sequence of matrices with uniformly bounded row and column sums. Let $\{b_n\}$ be a sequence of constant vectors with uniformly bounded elements and $\sup_n n^{-1}\sum_{i=1}^{n}|b_{i,n}|^{2+\eta_1} < \infty$ for some $\eta_1 > 0$. For the sequence $\{u_n\}$ that satisfies Assumption 3, let $Q_n = b_n'u_n + u_n'A_nu_n$. Then
$$\frac{Q_n - E(Q_n)}{\sqrt{\mathrm{Var}(Q_n)}} \xrightarrow{d} N(0, 1).$$

Appendix B. Proofs

The proofs of Theorems 1–4 in Section 2 and Corollary 2 in Section 3 are provided in this appendix and those of Corollaries 1 and 3, which follow similarly, are skipped.

Appendix B.1. Proof of Theorem 1

Proof. 
By a Nagar-type (Nagar (1959)) expansion,
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{\mathrm{tr}(\Sigma_nM_nG_n)}{y_n'W_n'M_nW_ny_n}\right) = \sqrt{n}\,\frac{r_{1n} - E(r_{1n}) + r_{2n}}{d_n} = \sqrt{n}\,\frac{r_{1n} - E(r_{1n}) + r_{2n}}{E(d_n) + d_n - E(d_n)} = \sqrt{n}\,\frac{r_{1n} - E(r_{1n}) + r_{2n}}{E(d_n)}\left(1 + \frac{d_n - E(d_n)}{E(d_n)}\right)^{-1} = \sqrt{n}\,\frac{r_{1n} - E(r_{1n}) + r_{2n}}{E(d_n)} + o_P(1),$$
where, in light of the proof of Lemma A6, $r_{1n} - E(r_{1n}) = O_P(\sqrt{n/h_n})$, $r_{2n} = O_P(\sqrt{n})$, and $E(d_n) = O(n)$. So when $h_n$ diverges,
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{\mathrm{tr}(\Sigma_nM_nG_n)}{y_n'W_n'M_nW_ny_n}\right) = \sqrt{n}\,\frac{r_{2n}}{E(d_n)} + o_P(1),$$
and one can apply Lemma A7 to $r_{2n} = \beta_0'X_n'G_n'M_nu_n$; when $h_n$ is bounded, one can apply Lemma A7 to $r_n = r_{1n} + r_{2n} = u_n'M_nG_nu_n + \beta_0'X_n'G_n'M_nu_n$. From Lemma A6, one sees that $\mathrm{Var}(r_{2n}) = \mathrm{Var}(r_n) + o(n)$ when $h_n$ diverges. Therefore, regardless of $h_n$,
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{\mathrm{tr}(\Sigma_nM_nG_n)}{y_n'W_n'M_nW_ny_n}\right) \xrightarrow{d} N\left(0,\ \lim_{n\to\infty}\frac{n\,\mathrm{Var}(r_n)}{[E(d_n)]^2}\right).$$
 □

Appendix B.2. Proof of Theorem 2

Proof. 
Note that
$$\tilde u_n'D_n\tilde u_n = u_n'D_nu_n + (\tilde\beta_n - \beta_0)'X_n'D_nX_n(\tilde\beta_n - \beta_0) - 2(\tilde\beta_n - \beta_0)'X_n'D_nu_n = u_n'D_nu_n + O_P(1) + O_P(n^{-1/2})O_P(\sqrt{n/h_n^2}),$$
in view of $X_n'X_n = O(n)$, $\tilde\beta_n - \beta_0 = O_P(n^{-1/2})$, and $\mathrm{Var}(X_n'D_nu_n) = X_n'D_n\Sigma_nD_nX_n = O(n/h_n^2)$. Then
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{y_n'S_n'M_nD_nM_nS_ny_n}{y_n'W_n'M_nW_ny_n}\right) = \sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{u_n'D_nu_n}{d_n}\right) + \sqrt{n}\,\frac{u_n'D_nu_n - \tilde u_n'D_n\tilde u_n}{d_n} = \sqrt{n}\,\frac{r_n - u_n'D_nu_n}{d_n} + o_P(1) = \sqrt{n}\,\frac{r_n - u_n'D_nu_n}{E(d_n)} + o_P(1).$$
It follows from Lemma A7 that
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{y_n'S_n'M_nD_nM_nS_ny_n}{y_n'W_n'M_nW_ny_n}\right) \xrightarrow{d} N(0, \eta),$$
where
$$\eta = \lim_{n\to\infty}\frac{n\,\mathrm{Var}(r_n - u_n'D_nu_n)}{[E(d_n)]^2} = \lim_{n\to\infty}\frac{n\,\mathrm{Var}(u_n'E_nu_n + \beta_0'X_n'G_n'M_nu_n)}{[E(d_n)]^2} = \lim_{n\to\infty}\frac{n\big\{\mathrm{tr}[\Sigma_nE_n\Sigma_n(E_n + E_n')] + \beta_0'X_n'G_n'M_n\Sigma_nM_nG_nX_n\beta_0\big\}}{[E(d_n)]^2}$$
by using Lemma A4 and the fact that $\mathrm{dg}(E_n) = 0$. □

Appendix B.3. Proof of Theorem 3

Proof. 
One can apply the extended delta method as in Phillips (2012) to derive the asymptotic distribution of $\sqrt{n}(\hat\lambda_n^{II} - \lambda_0)$. For this purpose, one needs to check a technical condition, namely, that $\partial b_n^{-1}(\lambda)/\partial\lambda$ is asymptotically locally equicontinuous at $\lambda_0$: for given $\zeta > 0$, if $s_n \to \infty$ and $s_n/\sqrt{n} \to 0$,
$$\sup_{s_n|\lambda-\lambda_0|<\zeta}\left|\frac{\partial b_n^{-1}(\lambda)/\partial\lambda - \partial b_n^{-1}(\lambda_0)/\partial\lambda}{\partial b_n^{-1}(\lambda_0)/\partial\lambda}\right| = \sup_{s_n|\lambda-\lambda_0|<\zeta}\left|\frac{b_n'(\lambda_0) - b_n'(\lambda)}{b_n'(\lambda)}\right| \xrightarrow{a.s.} 0.$$
The first derivative of the binding function is
$$b_n'(\lambda) = 1 + \frac{y_n'S_n'(\lambda)M_n\mathrm{Dg}(M_nG_n^2(\lambda))M_nS_n(\lambda)y_n - 2y_n'W_n'M_n\mathrm{Dg}(M_nG_n(\lambda))M_nS_n(\lambda)y_n}{y_n'W_n'M_nW_ny_n}. \qquad (A.10)$$
By substituting $y_n = S_n^{-1}X_n\beta_0 + S_n^{-1}u_n$ and using Lemmas A2–A4, one can see that the second term in (A.10) converges almost surely to a bounded constant for all $\lambda \in \Lambda$. In a similar way, one can show that
$$b_n''(\lambda) = \frac{2y_n'S_n'(\lambda)M_n\mathrm{Dg}(M_nG_n^3(\lambda))M_nS_n(\lambda)y_n - 4y_n'W_n'M_n\mathrm{Dg}(M_nG_n^2(\lambda))M_nS_n(\lambda)y_n + 2y_n'W_n'M_n\mathrm{Dg}(M_nG_n(\lambda))M_nW_ny_n}{y_n'W_n'M_nW_ny_n}$$
also converges almost surely to a bounded constant. Thus, for some $\lambda^*$ that lies between $\lambda_0$ and $\lambda$,
$$\left|\frac{b_n'(\lambda_0) - b_n'(\lambda)}{b_n'(\lambda)}\right| = |\lambda - \lambda_0|\left|\frac{b_n''(\lambda^*)}{b_n'(\lambda)}\right| < \frac{\zeta}{s_n}\left|\frac{b_n''(\lambda^*)}{b_n'(\lambda)}\right| \xrightarrow{a.s.} 0.$$
With all these results, (12) follows immediately from Theorem 1 of Phillips (2012) and Theorem 2 in this paper. □

Appendix B.4. Proof of Theorem 4

Proof. 
Upon substitution,
$$\begin{aligned}
\hat\beta_n^{II} &= (X_n'X_n)^{-1}X_n'\hat S_nS_n^{-1}X_n\beta_0 + (X_n'X_n)^{-1}X_n'\hat S_nS_n^{-1}u_n\\
&= (X_n'X_n)^{-1}X_n'[S_n - (\hat\lambda_n^{II} - \lambda_0)W_n]S_n^{-1}X_n\beta_0 + (X_n'X_n)^{-1}X_n'[S_n - (\hat\lambda_n^{II} - \lambda_0)W_n]S_n^{-1}u_n\\
&= \beta_0 + (X_n'X_n)^{-1}X_n'u_n - (\hat\lambda_n^{II} - \lambda_0)\big[(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 + (X_n'X_n)^{-1}X_n'G_nu_n\big],
\end{aligned}$$
where $\hat\lambda_n^{II} - \lambda_0 = O_P(n^{-1/2})$ (from Theorem 3). Further, $(X_n'X_n)^{-1}X_n'u_n = O_P(n^{-1/2})$, $(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 = O(1)$, and $(X_n'X_n)^{-1}X_n'G_nu_n = O_P(n^{-1/2})$ (from Lemma A5). Note that if one expands $\hat\lambda_n^{II} - \lambda_0 = b_n^{-1}(\hat\lambda_n) - b_n^{-1}(b_n(\lambda_0))$,
$$\sqrt{n}(\hat\lambda_n^{II} - \lambda_0) = \sqrt{n}\,\frac{\hat\lambda_n - b_n(\lambda_0)}{b_n'(\lambda_0)} + O_P(n^{-1/2}). \qquad (A.11)$$
From the proof of Theorem 2,
$$\hat\lambda_n - b_n(\lambda_0) = \frac{r_n - u_n'D_nu_n}{E(d_n)} + o_P(n^{-1/2}) = \frac{u_n'E_nu_n + \beta_0'X_n'G_n'M_nu_n}{E(d_n)} + o_P(n^{-1/2}). \qquad (A.12)$$
Given the above results, one has
$$\begin{aligned}
\sqrt{n}(\hat\beta_n^{II} - \beta_0) &= \sqrt{n}(X_n'X_n)^{-1}X_n'u_n - \sqrt{n}(\hat\lambda_n^{II} - \lambda_0)(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 + o_P(1)\\
&= \sqrt{n}(X_n'X_n)^{-1}X_n'u_n - \sqrt{n}\,\frac{\hat\lambda_n - b_n(\lambda_0)}{b_n'(\lambda_0)}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 + o_P(1)\\
&= \sqrt{n}(X_n'X_n)^{-1}X_n'u_n - \frac{\sqrt{n}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0}{b_n'(\lambda_0)}\cdot\frac{u_n'E_nu_n + \beta_0'X_n'G_n'M_nu_n}{E(d_n)} + o_P(1). \qquad (A.13)
\end{aligned}$$
One can check that Lemma A7 can be applied to each element or any nonstochastic linear combination of elements of $\sqrt{n}(\hat\beta_n^{II} - \beta_0)$ under such a representation. So $\sqrt{n}(\hat\beta_n^{II} - \beta_0)$ converges to a normal distribution, with the asymptotic covariance matrix
$$\begin{aligned}
V = \lim_{n\to\infty}\Big\{&\, n(X_n'X_n)^{-1}X_n'\Sigma_nX_n(X_n'X_n)^{-1} + \frac{\eta}{b_0^2}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\beta_0'X_n'G_n'X_n(X_n'X_n)^{-1}\\
&- \frac{n(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\beta_0'X_n'G_n'M_n\Sigma_nX_n(X_n'X_n)^{-1}}{b_0[\mathrm{tr}(\Sigma_nG_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0]}\\
&- \frac{n(X_n'X_n)^{-1}X_n'\Sigma_nM_nG_nX_n\beta_0\beta_0'X_n'G_n'X_n(X_n'X_n)^{-1}}{b_0[\mathrm{tr}(\Sigma_nG_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0]}\Big\}.
\end{aligned}$$
The asymptotic covariance between $\hat\beta_n^{II}$ and $\hat\lambda_n^{II}$ follows from the expansion of $\sqrt{n}(\hat\lambda_n^{II} - \lambda_0)$ (see (A.11) and (A.12)) and that of $\sqrt{n}(\hat\beta_n^{II} - \beta_0)$ (see (A.13)), given by
$$\gamma = \lim_{n\to\infty}\left\{\frac{n}{b_0E(d_n)}(X_n'X_n)^{-1}X_n'\Sigma_nM_nG_nX_n\beta_0 - \frac{\eta}{b_0^2}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\right\}.$$
 □

Appendix B.5. Proof of Corollary 2

Proof. 
By substitution and using the Nagar (1959) expansion,
$$\sqrt{n/h_n}\left(\hat\lambda_n - \lambda_0 - \frac{y_n'S_n'D_nS_ny_n}{y_n'W_n'W_ny_n}\right) = \sqrt{n/h_n}\,\frac{r_{1n} - u_n'D_nu_n}{d_{1n}} = \sqrt{n/h_n}\,\frac{u_n'E_nu_n}{d_{1n}} = \sqrt{n/h_n}\,\frac{u_n'E_nu_n}{E(d_{1n})} + o_P(1),$$
where $E(u_n'E_nu_n) = 0$ and $\mathrm{Var}(u_n'E_nu_n) = \mathrm{tr}[\Sigma_nE_n\Sigma_n(E_n + E_n')] = O(n/h_n)$, and the asymptotic distribution follows when one applies Lemma A7 to the quadratic form $u_n'E_nu_n$. □

Appendix B.6. The Case of Homoscedastic Error Term

If $\Sigma_n = \sigma_0^2I_n$ (and further $\Sigma_n^{(j)} = \mu_jI_n$, $\mu_j = E(u_{i,n}^j)$, $j = 3, 4$), then the recentering term in (7) becomes $\sigma_0^2\,\mathrm{tr}(M_nG_n)/y_n'W_n'M_nW_ny_n$. To make the II procedure feasible, one may replace $\sigma_0^2\,\mathrm{tr}(M_nG_n)/y_n'W_n'M_nW_ny_n$ by $n^{-1}y_n'S_n'M_nS_ny_n\,\mathrm{tr}(M_nG_n)/y_n'W_n'M_nW_ny_n$. Similar to the proof of Theorem 2, one can put
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{n^{-1}y_n'S_n'M_nS_ny_n\,\mathrm{tr}(M_nG_n)}{y_n'W_n'M_nW_ny_n}\right) = \sqrt{n}\,\frac{r_n^* - E(r_n^*)}{E(d_n)} + o_P(1) \xrightarrow{d} N(0, \omega),$$
where $\omega = \lim_{n\to\infty} n\,\mathrm{Var}(r_n^*)/[E(d_n)]^2$, $r_n^* = u_n'M_nG_n^*u_n + \beta_0'X_n'G_n'M_nu_n$, and $G_n^* = G_n - n^{-1}\mathrm{tr}(M_nG_n)I_n$. In particular,
Var ( r n * ) = μ 4 tr ( M n G n * M n G n * ) + σ 0 2 tr [ M n G n * ( M n G n * + G n * M n ) ] + σ 0 2 β 0 X n G n M n M n G n X n β 0 + 2 μ 3 β 0 X n G n M n dg ( M n G n * ) .
Define the binding function as $b_n(\lambda) = \lambda + n^{-1}y_n'S_n'(\lambda)M_nS_n(\lambda)y_n\,\mathrm{tr}(M_nG_n(\lambda))/y_n'W_n'M_nW_ny_n$, so that $b_n'(\lambda_0) = 1 + (n\,y_n'W_n'M_nW_ny_n)^{-1}[y_n'S_n'M_nS_ny_n\,\mathrm{tr}(M_nG_n^2) - 2y_n'W_n'M_nS_ny_n\,\mathrm{tr}(M_nG_n)]$. Assume $b_n'(\lambda_0) \xrightarrow{a.s.} b_0 \ne 0$. The asymptotic distribution of $(\hat\lambda_n^{II}, \hat\beta_n^{II\prime})'$ resulting from this binding function follows similarly from the proofs of Theorems 3 and 4, given by
$$\sqrt{n}\begin{pmatrix}\hat\lambda_n^{II} - \lambda_0\\ \hat\beta_n^{II} - \beta_0\end{pmatrix} \xrightarrow{d} N\left(0,\ \begin{pmatrix}\omega/b_0^2 & \gamma'\\ \gamma & V\end{pmatrix}\right),$$
where
$$\begin{aligned}
V = \lim_{n\to\infty}\Big\{&\, n\sigma_0^2(X_n'X_n)^{-1} + \frac{\omega}{b_0^2}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\beta_0'X_n'G_n'X_n(X_n'X_n)^{-1}\\
&- \frac{n(X_n'X_n)^{-1}X_n'G_nX_n\beta_0[\sigma_0^2\beta_0'X_n'G_n'M_n + \mu_3\,\mathrm{dg}(M_nG_n^*)']X_n(X_n'X_n)^{-1}}{b_0[\sigma_0^2\mathrm{tr}(G_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0]}\\
&- \frac{n(X_n'X_n)^{-1}X_n'[\sigma_0^2M_nG_nX_n\beta_0 + \mu_3\,\mathrm{dg}(M_nG_n^*)]\beta_0'X_n'G_n'X_n(X_n'X_n)^{-1}}{b_0[\sigma_0^2\mathrm{tr}(G_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0]}\Big\}
\end{aligned}$$
and
$$\gamma = \lim_{n\to\infty}\left\{\frac{n}{b_0E(d_n)}(X_n'X_n)^{-1}X_n'[\sigma_0^2M_nG_nX_n\beta_0 + \mu_3\,\mathrm{dg}(M_nG_n^*)] - \frac{\omega}{b_0^2}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\right\}.$$

References

  1. Badinger, Harald, and Peter Egger. 2011. Estimation of higher-order spatial autoregressive cross-section models with heteroscedastic disturbances. Papers in Regional Science 90: 213–35. [Google Scholar] [CrossRef]
  2. Case, Anne C. 1991. Spatial patterns in household demand. Econometrica 59: 953–65. [Google Scholar] [CrossRef] [Green Version]
  3. Cheng, Xu, Zhipeng Liao, and Ruoyao Shi. 2019. On uniform asymptotic risk of averaging GMM estimators. Quantitative Economics 3: 931–97. [Google Scholar] [CrossRef]
  4. Cliff, Andrew David, and J. Keith Ord. 1981. Spatial Processes: Models and Applications. London: Pion Ltd. [Google Scholar]
  5. Debarsy, Nicolas, Fei Jin, and Lung-Fei Lee. 2015. Large sample properties of the matrix exponential spatial specification with an application to FDI. Journal of Econometrics 188: 1–21. [Google Scholar] [CrossRef] [Green Version]
  6. Gospodinov, Nikolay, Ivana Komunjer, and Serena Ng. 2017. Simulated minimum distance estimation of dynamic models with errors-in-variables. Journal of Econometrics 200: 181–93. [Google Scholar] [CrossRef]
  7. Gouriéroux, Christian, Alain Monfort, and Eric Renault. 1993. Indirect inference. Journal of Applied Econometrics 8: S85–S118. [Google Scholar] [CrossRef]
  8. Jin, Fei, and Lung-Fei Lee. 2019. GEL estimation and tests of spatial autoregressive models. Journal of Econometrics 208: 585–612. [Google Scholar] [CrossRef]
  9. Kelejian, Harry H., and Ingmar R. Prucha. 1999. A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review 40: 509–33. [Google Scholar] [CrossRef] [Green Version]
  10. Kelejian, Harry H., and Ingmar R. Prucha. 2001. On the asymptotic distribution of the Moran I test statistic with applications. Journal of Econometrics 104: 219–57. [Google Scholar] [CrossRef] [Green Version]
  11. Kelejian, Harry H., and Ingmar R. Prucha. 2010. Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. Journal of Econometrics 157: 53–67. [Google Scholar] [CrossRef] [Green Version]
  12. Kyriacou, Maria, Peter C. B. Phillips, and Francesca Rossi. 2017. Indirect inference in spatial autoregression. The Econometrics Journal 20: 168–89. [Google Scholar] [CrossRef]
  13. Lam, Clifford, and Pedro C. L. Souza. 2019. Estimation and selection of spatial weight matrix in a spatial lag model. Journal of Business and Economic Statistics 3: 693–710. [Google Scholar] [CrossRef]
  14. Lee, Lung-Fei. 2001. Asymptotic Distribution of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive Models I: Spatial Autoregressive Processes. Working Paper. Columbus: Department of Economics, Ohio State University. [Google Scholar]
  15. Lee, Lung-Fei. 2002. Consistency and efficiency of least squares estimation for mixed regressive, spatial autoregressive models. Econometric Theory 18: 252–77. [Google Scholar] [CrossRef]
  16. Lee, Lung-Fei. 2004. Asymptotic distribution of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72: 1899–925. [Google Scholar] [CrossRef]
  17. Lee, Lung-Fei. 2007. GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. Journal of Econometrics 137: 489–514. [Google Scholar] [CrossRef]
  18. Lin, Xu, and Lung-Fei Lee. 2010. GMM estimation of spatial autoregressive models with unknown heteroskedasticity. Journal of Econometrics 157: 34–52. [Google Scholar] [CrossRef]
  19. Liu, Shew Fan, and Zhenlin Yang. 2015. Modified QML estimation of spatial autoregressive models with unknown heteroskedasticity and nonnormality. Regional Science and Urban Economics 52: 50–70. [Google Scholar] [CrossRef]
  20. Nagar, Anirudh L. 1959. The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations. Econometrica 27: 575–95. [Google Scholar] [CrossRef]
  21. Phillips, Peter C. B. 2012. Folklore theorems, implicit maps, and indirect inference. Econometrica 80: 425–54. [Google Scholar]
  22. Robinson, Peter M. 2008. Correlation testing in time series, spatial and cross-sectional data. Journal of Econometrics 147: 5–16. [Google Scholar] [CrossRef] [Green Version]
  23. Smith, Anthony A., Jr. 1993. Estimating nonlinear time-series models using simulated vector autoregressions. Journal of Applied Econometrics 8: S63–S84. [Google Scholar] [CrossRef] [Green Version]
  24. Zhang, Xinyu, and Jihai Yu. 2018. Spatial weights matrix selection and model averaging for spatial autoregressive models. Journal of Econometrics 203: 1–18. [Google Scholar] [CrossRef]
1.
Recent literature on dealing with heteroscedasticity in the spatial framework includes Kelejian and Prucha (2010), Badinger and Egger (2011), Liu and Yang (2015), Jin and Lee (2019), among others. An essential idea in this strand of literature is to use some moment conditions that are robust to unknown heteroscedasticity.
2.
It should be pointed out that when $\lim_{n\to\infty}\sqrt{n}/h_n = 0$ (and $\lim_{n\to\infty}h_n/n = 0$, where $h_n^{-1}$ is the order of magnitude of the elements of $W_n$), $\hat\theta_n$ is consistent, as shown in Lee (2002), and thus one may not need to seek a consistent estimator of $\lambda_0$ separately and then use it to construct a consistent estimator of $\beta_0$. In practice, one may not know a priori the rate of $h_n$, but the II estimator to be introduced is always consistent regardless of the rate of $h_n$.
3.
Multicollinearities can happen, for example, when X n = 1 n and W n is row-normalized. Lee (2004) showed that under homoscedasticity, however, the QML estimator can still be consistent in spite of violation of this condition. Since the II estimator to be discussed in this paper is to correct the possible inconsistency of the OLS estimator, Assumption 5(ii) is maintained.
4.
The asymptotic variances are given by lim n n 1 Var ( r n ) / [ plim ( d n / n ) ] 2 = lim n [ Var ( r n ) / n ] / [ E ( d n ) / n ] 2 and lim n n 1 Var ( u n E n u n + β 0 X n G n M n u n ) / [ plim ( d n / n ) ] 2 = lim n [ Var ( u n E n u n + β 0 X n G n M n u n ) / n ] / [ E ( d n ) / n ] 2 , respectively, for the (properly recentered) OLS estimator and the resulting II estimator. Their explicit expressions are given respectively in Theorems 1 and 2 to be introduced. Assumption 5(ii) implies that plim n ( d n / n ) = plim n ( y n W n M n W n y n / n ) exists and is nonzero. It can be shown (see Appendix A) that Var ( r n ) = Var ( u n M n G n u n ) + β 0 X n G n M n Σ n M n G n X n β 0 + 2 Cov ( β 0 X n G n M n Σ n M n G n X n β 0 , u n M n G n u n ) , where the covariance term disappears under normality, and Var ( u n E n u n + β 0 X n G n M n u n ) = Var ( u n E n u n ) + β 0 X n G n M n Σ n M n G n X n β 0 . When h n diverges, β 0 X n G n M n Σ n M n G n X n β 0 is the dominating term in Var ( r n ) as well as Var ( u n E n u n + β 0 X n G n M n u n ) . Then the usual condition that plim n n 1 Z n Σ n Z n exists and is nonsingular is sufficient for Assumption 6 to hold. When h n is bounded, a more precise characterization of a sufficient condition is not immediately obvious. Essentially, it requires, in addition to the existence and nonsingularity of plim n n 1 Z n Σ n Z n , the existence of lim n n 1 Var ( u n E n u n ) and lim n n 1 Var ( u n M n G n u n ) , where Var ( u n E n u n ) = O ( n / h n ) and Var ( u n M n G n u n ) = O ( n / h n ) .
5.
The use of observed, endogenous but non-simulated, variables within the binding function does not appear to be common. An interesting example is Gospodinov et al. (2017), where the authors used observed data within the binding function to hedge against misspecification bias. In their set-up of the autoregressive distributed lag model with a latent scalar predictor under the presence of measurement error, a similar technical difficulty exists regarding the invertibility condition of their binding function; they resorted to simulations to approximate the binding function, and the invertibility condition was then numerically verified based on the approximated binding function.
6.
This follows similarly from the proof of Proposition 2 in Lin and Lee (2010).
7.
Neither Lin and Lee (2010) nor Liu and Yang (2015) reported how the inference procedures based on their estimators would perform in finite samples.
8.
Each sub-figure contains 1000 lines, one for each simulated data set.
9.
The authors thank a referee for suggesting this comparison. Since one needs to concentrate out the scalar error variance instead of the nuisance matrix Σ n , the II procedure needs to be modified, see Appendix B.6.
10.
Very recently, Zhang and Yu (2018) and Lam and Souza (2019) proposed combining spatial weight matrices in recognition of possible misspecification of the weight matrix and Cheng et al. (2019) suggested combining a conservative GMM estimator based on valid moment conditions and an aggressive GMM estimator based on both valid and possibly misspecified moment conditions.
Figure 1. b n ( λ ) under variance structure V1 and parameter configuration P1, R = 100 .
Table 1. Estimation of spatial autoregressions (SAR) with variance structure V1 and parameter configuration P1.
           MQML                  GMM                   GMM(Σ_n)              II
R    θ_0   Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2   −0.009 0.067 3.6%    −0.015 0.073 5.8%      0.000 0.010 4.9%    −0.009 0.067 3.8%
     0.8    0.034 0.367 4.5%     0.050 0.377 5.4%      0.001 0.036 5.6%     0.033 0.367 4.7%
     0.2   −0.002 0.099 5.0%    −0.002 0.099 5.1%      0.000 0.008 5.5%    −0.002 0.099 5.0%
     1.5   −0.004 0.088 4.9%    −0.005 0.088 5.1%      0.000 0.007 4.1%    −0.004 0.088 4.9%
     0.6   −0.005 0.035 3.0%    −0.008 0.039 7.2%      0.000 0.005 5.3%    −0.005 0.035 4.1%
     0.8    0.032 0.366 3.8%     0.047 0.378 5.9%      0.001 0.037 5.6%     0.031 0.367 4.7%
     0.2   −0.001 0.102 5.5%    −0.001 0.102 5.5%      0.000 0.008 5.7%    −0.001 0.102 5.5%
     1.5    0.004 0.089 4.9%     0.004 0.089 4.9%      0.000 0.007 4.6%     0.004 0.089 4.9%
     0.9   −0.001 0.009 0.3%    −0.002 0.010 7.7%      0.000 0.001 6.1%    −0.001 0.009 4.3%
     0.8    0.022 0.367 4.0%     0.037 0.378 6.0%      0.003 0.040 6.5%     0.020 0.367 5.6%
     0.2   −0.002 0.102 4.4%    −0.002 0.102 4.4%      0.000 0.008 5.3%    −0.002 0.102 4.4%
     1.5    0.000 0.093 5.9%     0.000 0.094 6.0%      0.000 0.007 4.7%     0.000 0.094 6.1%
200  0.2    0.000 0.047 5.6%    −0.003 0.051 8.0%      0.000 0.007 5.3%     0.000 0.047 6.1%
     0.8   −0.005 0.255 5.1%     0.003 0.261 5.8%      0.001 0.026 6.9%    −0.005 0.255 5.2%
     0.2    0.002 0.071 5.8%     0.001 0.071 5.7%      0.000 0.006 5.4%     0.002 0.071 5.8%
     1.5    0.003 0.061 3.3%     0.002 0.061 3.4%      0.000 0.005 5.4%     0.003 0.061 3.3%
     0.6   −0.003 0.025 2.2%    −0.005 0.028 8.6%      0.000 0.004 5.1%    −0.003 0.025 5.0%
     0.8    0.021 0.256 4.1%     0.030 0.264 6.0%      0.000 0.026 5.2%     0.020 0.256 4.7%
     0.2   −0.001 0.069 4.5%    −0.001 0.069 4.6%      0.000 0.006 5.7%    −0.001 0.069 4.5%
     1.5    0.000 0.062 4.2%    −0.001 0.062 4.3%      0.000 0.005 4.9%     0.000 0.062 4.2%
     0.9   −0.001 0.006 0.7%    −0.001 0.007 8.3%      0.000 0.001 6.0%    −0.001 0.006 4.7%
     0.8    0.022 0.262 4.2%     0.030 0.269 6.7%      0.000 0.026 4.5%     0.021 0.262 5.7%
     0.2   −0.001 0.072 5.5%    −0.001 0.072 5.3%      0.000 0.005 4.8%    −0.001 0.072 5.5%
     1.5    0.000 0.063 3.8%     0.000 0.063 3.9%      0.000 0.005 4.4%     0.000 0.063 3.8%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, root mean squared error (RMSE), and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms are independent normal with mean zero. The variance structure is such that the error variance in the j-th group is m_j if m_j > 10 and 1/m_j^2 otherwise, j = 1, …, R.
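As a sketch of this simulation design (the function name and seed below are ours, not from the paper), the block-diagonal social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)] can be built as follows, treating m_j as a draw from the discrete uniform distribution on {3, …, 20}:

```python
import numpy as np

def social_interaction_W(group_sizes):
    """Block-diagonal W_n: within each group of size m, every unit
    places weight 1/(m-1) on each of its m-1 peers and 0 on itself."""
    n = sum(group_sizes)
    W = np.zeros((n, n))
    start = 0
    for m in group_sizes:
        W[start:start + m, start:start + m] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
        start += m
    return W

rng = np.random.default_rng(0)               # seed is arbitrary
group_sizes = rng.integers(3, 21, size=100)  # m_j ~ IID discrete U(3, 20), R = 100
W = social_interaction_W(group_sizes)
# W is row-normalized with a zero diagonal, as required of a SAR weight matrix
```

Each block is symmetric and row-stochastic, which is what makes the OLS estimator's behavior depend on the group sizes in the way the tables illustrate.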
Table 2. Estimation of SAR with variance structure V1 and parameter configuration P2.
           MQML                  GMM                   GMM(Σ_n)              II
R    θ_0   Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2   −0.010 0.076 5.5%    −0.014 0.078 5.9%     −0.003 0.049 5.4%    −0.009 0.076 5.7%
     0.2    0.007 0.335 5.7%     0.013 0.336 6.0%      0.005 0.056 5.8%     0.007 0.335 5.8%
     0.2    0.001 0.102 5.7%     0.001 0.102 5.7%     −0.001 0.008 6.3%     0.001 0.102 5.7%
     0.1    0.000 0.095 5.9%     0.000 0.096 5.9%      0.000 0.007 6.0%     0.000 0.095 5.9%
     0.6   −0.005 0.039 2.8%    −0.007 0.040 5.2%     −0.002 0.026 4.8%    −0.004 0.039 4.6%
     0.2   −0.004 0.330 4.1%     0.002 0.331 4.3%      0.005 0.058 5.3%    −0.004 0.330 4.2%
     0.2    0.005 0.100 4.3%     0.005 0.100 4.4%      0.000 0.008 6.4%     0.005 0.100 4.3%
     0.1    0.000 0.091 5.6%     0.000 0.091 5.7%      0.000 0.007 5.3%     0.000 0.091 5.6%
     0.9   −0.001 0.010 0.9%    −0.002 0.010 4.6%      0.000 0.006 3.9%    −0.001 0.010 5.0%
     0.2    0.013 0.335 5.2%     0.018 0.336 6.1%      0.001 0.055 4.1%     0.012 0.335 5.7%
     0.2   −0.001 0.102 5.2%    −0.001 0.102 5.4%      0.000 0.008 7.3%    −0.001 0.102 5.2%
     0.1    0.002 0.089 5.1%     0.002 0.089 5.1%      0.000 0.007 5.9%     0.002 0.089 5.1%
200  0.2   −0.007 0.054 5.4%    −0.009 0.055 5.9%     −0.005 0.034 5.6%    −0.007 0.054 5.9%
     0.2    0.013 0.231 4.7%     0.016 0.232 4.7%      0.006 0.040 6.0%     0.013 0.231 4.7%
     0.2   −0.002 0.071 4.3%    −0.002 0.071 4.4%      0.000 0.005 5.1%    −0.002 0.071 4.3%
     0.1    0.000 0.064 4.7%     0.000 0.064 4.7%      0.000 0.005 5.1%     0.000 0.064 4.7%
     0.6   −0.003 0.027 2.8%    −0.004 0.027 5.3%     −0.001 0.017 4.1%    −0.003 0.027 4.9%
     0.2   −0.001 0.233 5.2%     0.001 0.233 5.4%      0.002 0.039 4.4%    −0.002 0.233 5.2%
     0.2    0.003 0.071 4.7%     0.003 0.071 4.7%      0.000 0.005 5.1%     0.003 0.071 4.7%
     0.1    0.000 0.063 5.5%     0.000 0.063 5.5%      0.000 0.005 5.8%     0.000 0.063 5.5%
     0.9   −0.001 0.007 1.3%    −0.001 0.007 5.7%     −0.001 0.005 4.9%    −0.001 0.007 5.3%
     0.2    0.012 0.232 3.4%     0.015 0.233 4.2%      0.004 0.039 5.0%     0.012 0.232 4.1%
     0.2   −0.002 0.072 5.0%    −0.002 0.072 5.1%      0.000 0.006 5.5%    −0.002 0.072 5.0%
     0.1    0.002 0.063 5.0%     0.002 0.063 5.0%      0.000 0.005 6.7%     0.002 0.063 5.0%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms are independent normal with mean zero. The variance structure is such that the error variance in the j-th group is m_j if m_j > 10 and 1/m_j^2 otherwise, j = 1, …, R.
Table 3. Estimation of SAR with variance structure V2 and parameter configuration P1.
           MQML                  GMM                   GMM(Σ_n)              II
R    θ_0   Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2    0.000 0.014 4.6%     0.000 0.014 5.2%     −0.001 0.013 4.9%     0.000 0.014 5.0%
     0.8    0.001 0.047 3.6%     0.000 0.047 3.7%      0.003 0.042 3.7%     0.001 0.046 3.7%
     0.2    0.000 0.008 4.1%     0.000 0.008 4.1%      0.000 0.008 5.1%     0.000 0.008 4.2%
     1.5    0.000 0.008 5.3%     0.000 0.008 5.4%      0.000 0.007 4.6%     0.000 0.008 5.3%
     0.6    0.000 0.008 4.3%     0.000 0.008 4.9%     −0.001 0.007 4.3%     0.000 0.008 4.6%
     0.8    0.001 0.049 4.5%     0.002 0.049 5.7%      0.002 0.043 4.3%     0.001 0.049 5.5%
     0.2    0.000 0.008 5.3%     0.000 0.008 5.1%      0.000 0.008 4.9%     0.000 0.008 5.2%
     1.5    0.000 0.008 5.3%     0.000 0.008 5.4%      0.000 0.007 4.7%     0.000 0.008 5.3%
     0.9    0.000 0.002 5.1%     0.000 0.002 7.8%      0.000 0.002 5.9%     0.000 0.002 6.8%
     0.8    0.002 0.050 4.1%     0.002 0.050 5.8%      0.002 0.045 4.9%     0.002 0.049 4.6%
     0.2    0.000 0.008 5.4%     0.000 0.008 5.4%      0.000 0.008 5.6%     0.000 0.008 5.4%
     1.5    0.000 0.008 5.2%     0.000 0.008 5.2%      0.000 0.007 4.7%     0.000 0.008 5.0%
200  0.2    0.000 0.010 3.8%     0.000 0.010 4.7%      0.000 0.009 3.8%     0.000 0.010 4.4%
     0.8    0.001 0.033 4.6%     0.001 0.033 4.1%      0.001 0.030 2.6%     0.001 0.033 4.6%
     0.2    0.000 0.006 4.6%     0.000 0.006 4.6%      0.000 0.005 4.7%     0.000 0.006 4.6%
     1.5    0.000 0.005 4.9%     0.000 0.005 4.6%      0.000 0.005 5.5%     0.000 0.005 4.9%
     0.6    0.000 0.005 3.9%     0.000 0.006 5.4%      0.000 0.005 4.5%     0.000 0.005 4.6%
     0.8    0.000 0.034 3.7%    −0.001 0.034 5.0%      0.000 0.030 3.8%     0.000 0.034 3.9%
     0.2    0.000 0.006 5.9%     0.000 0.006 6.1%      0.000 0.006 5.1%     0.000 0.006 5.9%
     1.5    0.000 0.006 5.5%     0.000 0.006 5.4%      0.000 0.005 4.8%     0.000 0.006 5.6%
     0.9    0.000 0.001 3.9%     0.000 0.001 5.9%      0.000 0.001 5.4%     0.000 0.001 5.0%
     0.8   −0.001 0.035 4.3%    −0.002 0.035 5.4%      0.000 0.031 5.8%    −0.001 0.034 5.0%
     0.2    0.000 0.006 3.8%     0.000 0.006 3.9%      0.000 0.005 4.4%     0.000 0.006 3.8%
     1.5    0.000 0.006 5.1%     0.000 0.006 5.2%      0.000 0.005 4.6%     0.000 0.006 5.5%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms are independent normal with mean zero. The variance structure is such that the error variance in the j-th group is 1/m_j, j = 1, …, R.
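The two group-level heteroscedasticity rules used across Tables 1–4 can be sketched as follows (function names are ours; each returns the per-observation variance vector implied by the group sizes):

```python
import numpy as np

def v1_variances(group_sizes):
    """V1: error variance in group j is m_j when m_j > 10, else 1/m_j^2."""
    return np.concatenate([np.full(m, float(m) if m > 10 else 1.0 / m**2)
                           for m in group_sizes])

def v2_variances(group_sizes):
    """V2: error variance in group j is 1/m_j."""
    return np.concatenate([np.full(m, 1.0 / m) for m in group_sizes])

sigma2_v1 = v1_variances([4, 12])  # four entries of 1/16, then twelve entries of 12.0
sigma2_v2 = v2_variances([4, 12])  # four entries of 0.25, then twelve entries of 1/12
```

V1 makes the variance jump discontinuously with group size, while V2 shrinks it smoothly, which is why the two designs stress the heteroscedasticity-robust estimators differently.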
Table 4. Estimation of SAR with variance structure V2 and parameter configuration P2.
           MQML                  GMM                   GMM(Σ_n)              II
R    θ_0   Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2   −0.005 0.058 5.2%    −0.004 0.080 19.0%    −0.005 0.051 6.4%    −0.005 0.058 5.9%
     0.2    0.006 0.066 5.0%     0.006 0.086 15.4%     0.006 0.058 6.4%     0.006 0.066 6.7%
     0.2    0.000 0.008 4.8%     0.000 0.008 4.9%      0.000 0.008 5.2%     0.000 0.008 4.8%
     0.1    0.000 0.008 4.6%     0.000 0.008 4.6%      0.000 0.007 4.8%     0.000 0.008 4.6%
     0.6   −0.003 0.029 1.4%    −0.001 0.042 19.7%    −0.003 0.027 6.5%    −0.003 0.029 5.5%
     0.2    0.007 0.067 1.5%     0.003 0.091 17.9%     0.007 0.061 6.2%     0.006 0.067 5.2%
     0.2    0.000 0.008 3.7%     0.000 0.008 3.6%      0.000 0.008 4.2%     0.000 0.008 3.8%
     0.1    0.000 0.007 4.4%     0.000 0.007 4.0%      0.000 0.007 5.6%     0.000 0.007 4.4%
     0.9   −0.001 0.008 0.4%    −0.001 0.011 19.2%    −0.001 0.007 5.4%    −0.001 0.007 5.8%
     0.2    0.009 0.067 0.4%     0.008 0.093 17.8%     0.009 0.059 4.4%     0.007 0.065 5.0%
     0.2    0.000 0.009 4.8%     0.000 0.009 5.7%      0.000 0.008 6.1%     0.000 0.009 5.1%
     0.1    0.000 0.008 5.5%     0.000 0.008 5.7%      0.000 0.007 5.6%     0.000 0.008 5.6%
200  0.2   −0.003 0.039 5.4%    −0.004 0.058 20.1%    −0.004 0.035 5.9%    −0.003 0.039 6.0%
     0.2    0.005 0.045 4.9%     0.006 0.064 17.1%     0.005 0.041 6.4%     0.004 0.045 5.7%
     0.2   −0.001 0.006 5.3%    −0.001 0.006 5.6%      0.000 0.006 5.4%    −0.001 0.006 5.3%
     0.1    0.000 0.005 5.8%     0.000 0.005 5.7%      0.000 0.005 5.3%     0.000 0.005 5.8%
     0.6   −0.002 0.020 0.7%    −0.001 0.029 16.8%    −0.003 0.018 4.8%    −0.001 0.020 4.6%
     0.2    0.003 0.045 1.1%     0.002 0.062 15.6%     0.005 0.040 4.3%     0.002 0.044 4.5%
     0.2    0.000 0.006 5.0%     0.000 0.006 5.0%      0.000 0.005 4.9%     0.000 0.006 5.0%
     0.1    0.000 0.005 5.3%     0.000 0.005 5.2%      0.000 0.005 5.2%     0.000 0.005 5.1%
     0.9    0.000 0.005 0.1%     0.000 0.008 17.3%     0.000 0.004 5.4%     0.000 0.005 4.7%
     0.2    0.002 0.045 0.2%     0.002 0.064 15.9%     0.003 0.039 3.7%     0.002 0.044 3.9%
     0.2    0.000 0.006 4.4%     0.000 0.006 5.3%      0.000 0.006 4.6%     0.000 0.006 4.7%
     0.1    0.000 0.005 5.8%     0.000 0.005 5.8%      0.000 0.005 5.1%     0.000 0.005 5.9%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms are independent normal with mean zero. The variance structure is such that the error variance in the j-th group is 1/m_j, j = 1, …, R.
Table 5. Estimation of SAR under homoscedasticity with parameter configuration P1.
           QML                    GMM                   II
R    θ_0   Bias   RMSE  P(5%)     Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2    0.014 0.043 19.0%    −0.007 0.040 7.5%     −0.003 0.037 5.6%
     0.8   −0.032 0.146 18.5%     0.025 0.144 6.7%      0.013 0.138 6.3%
     0.2   −0.002 0.029 15.9%    −0.002 0.029 6.1%     −0.002 0.029 6.1%
     1.5   −0.001 0.026 17.7%    −0.002 0.026 6.1%     −0.001 0.026 5.9%
     0.6    0.026 0.032 34.1%    −0.004 0.020 5.4%     −0.002 0.019 5.2%
     0.8   −0.137 0.188 26.5%     0.022 0.137 5.4%      0.010 0.131 4.7%
     0.2   −0.001 0.029 17.3%    −0.001 0.029 5.1%      0.000 0.029 5.1%
     1.5   −0.005 0.026 15.3%     0.000 0.026 4.5%      0.001 0.026 4.4%
     0.9    0.011 0.011 51.3%    −0.001 0.005 5.2%      0.000 0.005 4.1%
     0.8   −0.213 0.248 39.9%     0.022 0.140 5.5%      0.009 0.134 5.0%
     0.2   −0.002 0.028 14.5%    −0.001 0.028 4.3%     −0.001 0.028 3.9%
     1.5   −0.011 0.029 19.8%     0.001 0.027 5.1%      0.001 0.027 4.7%
200  0.2    0.017 0.032 23.0%    −0.003 0.026 5.6%     −0.001 0.025 4.3%
     0.8   −0.044 0.103 18.6%     0.009 0.094 5.0%      0.003 0.090 3.6%
     0.2   −0.001 0.020 15.7%    −0.001 0.020 4.1%     −0.001 0.020 4.2%
     1.5    0.001 0.017 14.1%     0.001 0.017 4.6%      0.001 0.017 4.9%
     0.6    0.028 0.030 62.1%    −0.002 0.014 5.3%     −0.001 0.013 4.2%
     0.8   −0.145 0.172 47.1%     0.009 0.096 4.8%      0.003 0.093 3.8%
     0.2    0.000 0.021 18.4%     0.000 0.021 4.5%      0.000 0.021 4.5%
     1.5   −0.005 0.019 18.1%     0.000 0.018 5.3%      0.001 0.018 4.8%
     0.9    0.011 0.011 85.1%     0.000 0.004 6.3%      0.000 0.003 4.1%
     0.8   −0.220 0.237 68.7%     0.009 0.097 5.9%      0.004 0.093 5.0%
     0.2   −0.002 0.021 17.7%     0.000 0.021 4.5%      0.000 0.021 4.4%
     1.5   −0.014 0.023 25.0%    −0.002 0.019 4.6%     −0.001 0.019 4.5%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms follow a standard normal distribution.
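One draw from the homoscedastic design in Tables 5 and 6 comes from the SAR reduced form y = (I_n − λW_n)^{-1}(X_nβ + ε_n). The sketch below (our notation; a single group is used for brevity, and the seed is arbitrary) illustrates this with the middle block of configuration P1:

```python
import numpy as np

rng = np.random.default_rng(1)

m = 10                                # one group of size m for illustration
W = (np.ones((m, m)) - np.eye(m)) / (m - 1)

lam = 0.6                             # spatial parameter (middle block of P1)
beta = np.array([0.8, 0.2, 1.5])      # coefficients on (1, x_i1, x_i2)
X = np.column_stack([np.ones(m),
                     rng.normal(3.0, 1.0, m),    # x_i1 ~ IID N(3, 1)
                     rng.uniform(1.0, 2.0, m)])  # x_i2 ~ IID U(1, 2)
eps = rng.standard_normal(m)          # homoscedastic standard normal errors

# reduced form: y solves (I - lam * W) y = X beta + eps
y = np.linalg.solve(np.eye(m) - lam * W, X @ beta + eps)
```

Solving the linear system directly, rather than inverting I − λW, is the standard numerically stable way to generate y in such experiments.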
Table 6. Estimation of SAR under homoscedasticity with parameter configuration P2.
           QML                    GMM                   II
R    θ_0   Bias   RMSE  P(5%)     Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2    0.041 0.078 13.3%    −0.007 0.056 5.7%     −0.004 0.055 5.6%
     0.2   −0.037 0.121 14.7%     0.017 0.112 5.3%      0.011 0.109 5.1%
     0.2   −0.002 0.029 16.2%    −0.004 0.029 5.9%     −0.002 0.029 6.0%
     0.1    0.000 0.026 18.0%    −0.001 0.026 6.6%      0.000 0.026 6.4%
     0.6    0.072 0.076 29.7%    −0.004 0.029 4.6%     −0.002 0.028 4.3%
     0.2   −0.149 0.181 25.9%     0.007 0.108 5.3%      0.001 0.107 5.6%
     0.2   −0.001 0.029 16.4%    −0.001 0.029 5.2%      0.001 0.029 5.2%
     0.1    0.000 0.026 18.2%     0.000 0.027 6.6%      0.001 0.027 6.2%
     0.9    0.026 0.027 48.5%    −0.001 0.008 5.3%     −0.001 0.008 5.3%
     0.2   −0.214 0.235 38.4%     0.010 0.110 5.3%      0.004 0.109 4.9%
     0.2   −0.003 0.028 15.8%     0.000 0.028 4.0%      0.001 0.028 4.2%
     0.1   −0.002 0.025 15.2%     0.000 0.026 5.1%      0.001 0.026 4.9%
200  0.2    0.046 0.064 23.3%    −0.004 0.038 4.6%     −0.003 0.038 4.5%
     0.2   −0.046 0.093 18.7%     0.009 0.079 5.3%      0.005 0.078 4.7%
     0.2   −0.001 0.021 17.5%    −0.002 0.021 6.0%     −0.001 0.021 5.6%
     0.1    0.000 0.018 15.8%     0.000 0.018 4.3%      0.000 0.018 4.0%
     0.6    0.074 0.076 70.7%    −0.002 0.020 4.8%     −0.001 0.020 4.6%
     0.2   −0.152 0.170 49.3%     0.004 0.081 6.3%      0.000 0.079 5.9%
     0.2   −0.002 0.021 17.6%    −0.001 0.021 5.0%      0.001 0.021 5.0%
     0.1    0.000 0.019 17.8%     0.000 0.019 6.1%      0.001 0.019 6.4%
     0.9    0.027 0.027 95.2%    −0.001 0.005 4.6%      0.000 0.005 4.1%
     0.2   −0.218 0.229 72.6%     0.005 0.081 6.0%      0.001 0.079 5.2%
     0.2   −0.003 0.020 15.3%     0.000 0.020 4.8%      0.001 0.020 4.8%
     0.1   −0.001 0.018 15.0%     0.000 0.018 4.2%      0.001 0.018 3.9%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms follow a standard normal distribution.

Share and Cite

Bao, Y.; Liu, X.; Yang, L. Indirect Inference Estimation of Spatial Autoregressions. Econometrics 2020, 8, 34. https://0-doi-org.brum.beds.ac.uk/10.3390/econometrics8030034