Article

Indirect Inference Estimation of Spatial Autoregressions

1 Department of Economics, Purdue University, West Lafayette, IN 47907, USA
2 School of Economics, Nanjing Audit University, Nanjing 211815, China
3 National Academy of Development and Strategy, Renmin University of China, Beijing 100872, China
* Author to whom correspondence should be addressed.
Submission received: 1 May 2020 / Revised: 5 August 2020 / Accepted: 31 August 2020 / Published: 3 September 2020

Abstract

The ordinary least squares (OLS) estimator for spatial autoregressions may be consistent as pointed out by Lee (2002), provided that each spatial unit is influenced aggregately by a significant portion of the total units. This paper presents a unified asymptotic distribution result of the properly recentered OLS estimator and proposes a new estimator that is based on the indirect inference (II) procedure. The resulting estimator can always be used regardless of the degree of aggregate influence on each spatial unit from other units and is consistent and asymptotically normal. The new estimator does not rely on distributional assumptions and is robust to unknown heteroscedasticity. Its good finite-sample performance, in comparison with existing estimators that are also robust to heteroscedasticity, is demonstrated by a Monte Carlo study.
JEL Classification:
C21; C10; C13

1. Introduction

Spatial autoregressions (SAR) have attracted considerable attention, both applied and theoretical, in economics and other social science disciplines since the classical work of Cliff and Ord (1981). Their popularity is mainly due to the parsimonious representation of cross-sectional correlation by a weight matrix. Correlation in spatial data arises naturally from competition, copycatting, spillovers, and aggregation, to name just a few sources. In these contexts, the space embodied in the weight matrix can be defined in terms of not only geographical distance but also economic distance.
The spatial autoregression model extends autocorrelation in time series to the spatial dimension in the sense that a spatially “lagged” dependent variable is included as a regressor in the structural equation. In time series, the autoregression model can be estimated consistently by ordinary least squares (OLS), but for the spatial autoregression, OLS is usually regarded as an inconsistent method; Robinson (2008) provided an excellent discussion of the intuition behind this. Estimation strategies such as maximum likelihood (ML), quasi maximum likelihood (QML), instrumental variables (IV), and the generalized method of moments (GMM) have been proposed in the literature. The ML is the most efficient, but it imposes stringent distributional assumptions on the data generating process, and both the ML and QML rule out heteroscedasticity in the error term. Further, the (Q)ML method involves calculating the determinant or eigenvalues of a matrix of the same size as the sample, and thus many researchers dismiss its use in moderately large samples and advocate the more flexible IV and GMM estimators, which may incur less computational burden and are also robust to heteroscedasticity.
Lee (2002) overturned the conventional wisdom regarding the OLS estimator in spatial autoregressions when exogenous regressors are included. He showed that while the OLS estimator is inconsistent for spatial autoregressions with a sparse weight matrix, it can be consistent when spatial units have small spatial impacts on other units but each unit may be influenced aggregately by a significant portion of the total units. For the special case of the so-called pure SAR model, namely, when there is no other exogenous regressor, Lee (2002) demonstrated that regardless of the structure of the weight matrix, the OLS estimator is always inconsistent.
In practice, one may have limited knowledge to judge whether a spatial unit is influenced by a significant portion of the total units. When a researcher is constructing the weight matrix, she may know the number of neighboring units for each unit, but she may be unable to tell whether the number of neighbors is significant in finite samples. Thus, this poses a challenge for practitioners regarding the usefulness of the OLS estimator: it may or may not be consistent, depending on the degree of aggregate influence on each unit from other units, which may hardly manifest itself in finite samples. This paper carefully analyzes the asymptotic distribution theory for the OLS estimator. A unified asymptotic distribution result for the recentered OLS estimator is presented under different regimes for the spatial weight matrix. Given the asymptotic result for the recentered OLS estimator, a new estimator based on the indirect inference (II) procedure is proposed.
Kyriacou et al. (2017) used the II procedure in a novel way to correct the inconsistency of the OLS estimator in the pure SAR model under homoscedasticity. Even though they provided promising simulation results under some mild heteroscedasticity, they did not show rigorously how to construct a consistent estimator with no restrictions on the form of heteroscedasticity. In contrast, this paper considers the SAR model with exogenous regressors and adds to the existing spatial literature with unknown heteroscedasticity.1
Note that the problem of inconsistency (of the OLS estimator) is solely due to the presence of the endogenous spatially lagged variable. Once the spatial autoregression parameter is estimated consistently, the OLS procedure can be used to estimate the remaining parameters, though the asymptotic variance needs to be modified accordingly to take into account the uncertainty in the estimated spatial autoregression parameter.
The structure of this paper is as follows. In the next section, the asymptotic behavior of the OLS estimator is discussed under different spatial scenarios. The II estimator, which aims to correct the possible inconsistency of the OLS estimator, is defined and its asymptotic distribution is derived. A very important message from this section is that regardless of the degree of aggregate influence on each spatial unit from other units, the II procedure can always be used and the resulting estimator is consistent and asymptotically normal. Section 3 discusses the special case of pure SAR. Section 4 provides Monte Carlo evidence of the effectiveness of the II estimation strategy. It shows that the II estimator possesses good finite-sample performance relative to other consistent estimators that are also robust to unknown heteroscedasticity and that the II method may be favored for the purpose of hypothesis testing, especially for testing the spatial autoregression parameter. Section 5 concludes. Some useful lemmas are collected in Appendix A and proofs of the results presented in Section 2 and Section 3 are given in Appendix B.
Throughout this paper, K denotes a generic positive constant, arbitrarily large but bounded, that does not depend on the sample size n and whose value may vary in different contexts. I_n is the identity matrix of dimension n and 1_n is an n × 1 vector of ones. For an n × 1 vector a_n, a_{i,n} denotes its i-th element, and for an n × n matrix A_n, a_{ij,n} denotes its ij-th element. ‖A_n‖_∞ = max_{1≤i≤n} Σ_{j=1}^n |a_{ij,n}| and ‖A_n‖_1 = max_{1≤j≤n} Σ_{i=1}^n |a_{ij,n}| are the maximum row sum norm and maximum column sum norm, respectively. A sequence of matrices {A_n} is uniformly bounded in row sums if ‖A_n‖_∞ ≤ K and uniformly bounded in column sums if ‖A_n‖_1 ≤ K. tr and ⊙ are the matrix trace and Hadamard product operators, respectively. Dg(a_n) denotes the square diagonal matrix with the vector a_n on the main diagonal, dg(A_n) is the n × 1 column vector that collects in order the diagonal elements of the square matrix A_n, and Dg(A_n) = Dg(dg(A_n)). The subscript 0 signifies the true parameter value, and a prime denotes transposition.

2. Main Results

Consider the SAR model
y_n = λ W_n y_n + X_n β + u_n = Z_n θ + u_n,  (1)
where n is the total number of cross-sectional units, Z_n = (W_n y_n, X_n), θ = (λ, β′)′, y_n is an n × 1 vector collecting observations on the dependent variable, X_n is an n × k matrix of observations on k exogenous nonstochastic regressors with coefficient vector β, W_n is an n × n matrix of spatial weights with zero diagonal, λ is the spatial autoregression coefficient, and u_n is an n-dimensional vector of error terms.
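As a concrete illustration, the reduced form y_n = S_n^{-1}(X_n β + u_n) can be simulated directly. The sketch below is a hypothetical example, not part of the paper: it assumes a row-normalised circulant weight matrix and arbitrary parameter values.

```python
import numpy as np

# Minimal sketch of the data generating process in (1), under illustrative,
# assumed choices: a row-normalised circulant weight matrix, n = 200, and
# arbitrary parameter values (none of these come from the paper).
rng = np.random.default_rng(0)
n, lam = 200, 0.4

W = np.zeros((n, n))
for i in range(n):                                   # each unit has two neighbours
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5      # zero diagonal, rows sum to 1

X = np.column_stack([np.ones(n), rng.normal(3.0, 1.0, n)])
beta = np.array([1.0, 0.5])
u = rng.normal(0.0, 1.0, n) * np.sqrt(rng.uniform(0.5, 1.5, n))  # heteroscedastic

S = np.eye(n) - lam * W                  # S_n = I_n - lambda * W_n
y = np.linalg.solve(S, X @ beta + u)     # equilibrium solution y_n = S_n^{-1}(X_n beta + u_n)
```

Solving the linear system once is preferable to forming the inverse of S explicitly.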
For ease of presentation, the following matrix notation is introduced:
S_n(λ) = I_n − λ W_n,  G_n(λ) = W_n S_n^{-1}(λ),  M_n = I_n − X_n (X_n′ X_n)^{-1} X_n′,  D_n(λ) = Dg(M_n G_n(λ)),  E_n(λ) = M_n G_n(λ) − D_n(λ).
When a matrix is presented without its argument λ, it is understood to be evaluated at the true parameter value λ_0; namely, S_n = S_n(λ_0), G_n = G_n(λ_0), D_n = D_n(λ_0), and E_n = E_n(λ_0).
If λ were known (equal to its true value), the model would become a standard linear model with S_n y_n as the dependent variable; otherwise, W_n y_n appears on the right-hand side of (1) as a spatially lagged or weighted variable. The OLS estimator of θ_0 is θ̂_n = (Z_n′ Z_n)^{-1} Z_n′ y_n, which may be inconsistent. Since the inconsistency of θ̂_n is solely due to the endogenous W_n y_n, the properties of λ̂_n (the first element of θ̂_n) are discussed first. Once a consistent estimator of λ_0 is available, a consistent estimator of β_0 follows immediately (see Theorem 4 below).2
Let r_n = r_{1n} + r_{2n} with r_{1n} = u_n′ M_n G_n u_n and r_{2n} = β_0′ X_n′ G_n′ M_n u_n, and let d_n = d_{1n} + d_{2n} + d_{3n} with d_{1n} = u_n′ G_n′ M_n G_n u_n, d_{2n} = 2 β_0′ X_n′ G_n′ M_n G_n u_n, and d_{3n} = β_0′ X_n′ G_n′ M_n G_n X_n β_0. By using the partitioned regression formula and substituting W_n y_n = G_n X_n β_0 + G_n u_n, one may write
λ̂_n − λ_0 = (y_n′ W_n′ M_n u_n) / (y_n′ W_n′ M_n W_n y_n) = r_n / d_n = (r_{1n} + r_{2n}) / (d_{1n} + d_{2n} + d_{3n}).  (2)
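The decomposition in (2) can be checked numerically on simulated data. The design below (weight matrix, parameters, seed) is an illustrative assumption, not the paper's.

```python
import numpy as np

# Verify the decomposition lambda_hat - lambda_0 = r_n / d_n on simulated data.
rng = np.random.default_rng(1)
n, lam0 = 150, 0.3
W = np.zeros((n, n))
for i in range(n):                                   # assumed circulant weights
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(3, 1, n)])
beta0 = np.array([1.0, 0.5])
u = rng.normal(size=n)
S = np.eye(n) - lam0 * W
y = np.linalg.solve(S, X @ beta0 + u)

# OLS on the structural equation y = lambda*W*y + X*beta + u
Z = np.column_stack([W @ y, X])
lam_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)[0]

# Components of the decomposition, using the true G_n and M_n
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)    # annihilator of X
G = W @ np.linalg.inv(S)
r1 = u @ M @ G @ u
r2 = beta0 @ X.T @ G.T @ M @ u
d1 = u @ G.T @ M @ G @ u
d2 = 2 * beta0 @ X.T @ G.T @ M @ G @ u
d3 = beta0 @ X.T @ G.T @ M @ G @ X @ beta0
assert np.isclose(lam_hat - lam0, (r1 + r2) / (d1 + d2 + d3))
```

The assertion holds as an exact algebraic identity, up to floating-point error.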
The following assumptions are made throughout this paper.
Assumption 1.
(i) For i ≠ j, w_{ij,n} = O(h_n^{-1}), where the rate sequence {h_n} is uniformly bounded away from zero and lim_{n→∞} h_n/n = 0; (ii) the sequence {W_n} is uniformly bounded in row and column sums; (iii) w_{ii,n} = 0.
Assumption 2.
(i) S_n^{-1} exists; (ii) the sequence {S_n^{-1}} is uniformly bounded in row and column sums.
Assumption 3.
The error terms {u_{i,n}} in u_n = (u_{1,n}, …, u_{n,n})′ have the following properties: (i) E(u_{i,n}) = 0; (ii) E(|u_{i,n}|^{4+δ}) < ∞ for some positive constant δ; (iii) u_{i,n} and u_{j,n} are independent for any i ≠ j.
Assumption 4.
λ_0 is contained in a compact parameter space Λ. For any admissible λ ∈ Λ, {S_n^{-1}(λ)} is uniformly bounded in row and column sums.
Assumption 5.
(i) The elements of X_n are uniformly bounded constants for all n; (ii) the probability limit Γ of n^{-1} Z_n′ Z_n exists and is nonsingular.
Assumption 6.
(i) lim_{n→∞} n^{-1} Var(r_n) exists and is nonzero; (ii) lim_{n→∞} n^{-1} Var(u_n′ E_n u_n + β_0′ X_n′ G_n′ M_n u_n) exists and is nonzero.
Intuitions and related discussions for Assumptions 1–5 are provided in Kelejian and Prucha (2010) and Lee (2001, 2002, 2004). Assumption 1(i) follows naturally when W_n is row- or column-normalized, as is typically the case. Assumption 1(ii) and Assumption 2 limit the degree of dependence among the spatial units and originate from Kelejian and Prucha (1999). Given Assumption 2, the equilibrium solution of y_n is y_n = S_n^{-1} X_n β_0 + S_n^{-1} u_n. Under Assumption 3, let σ_{i,n}^2 = E(u_{i,n}^2), Σ_n = Dg(σ_{1,n}^2, …, σ_{n,n}^2), and Σ_n^{(j)} = Dg(μ_{1,n}^{(j)}, …, μ_{n,n}^{(j)}) with μ_{i,n}^{(j)} = E(u_{i,n}^j), j = 3, 4. Lee (2002) emphasized that Assumption 5(ii) is related to an identification condition for estimation in the least squares and IV frameworks. It rules out possible multicollinearity between X_n and G_n X_n β_0 for large n and implies that the limit of d_{3n}/n is bounded away from zero.3 Assumption 6 ensures that the asymptotic variances of the (properly recentered) OLS estimator and the resulting II estimator are positive.4

2.1. The Asymptotic Behavior of the OLS Estimator

From Lemma A6 in Appendix A, r_{1n} and d_{1n} are both O_P(n/h_n), r_{2n} and d_{2n} are both O_P(√n), and d_{3n} = O(n). The asymptotic properties of the OLS estimator crucially depend on the magnitude of h_n.
When h_n is bounded, the OLS estimator cannot be consistent, since now
λ̂_n − λ_0 = (n^{-1} r_{1n}) / (n^{-1} d_{1n} + n^{-1} d_{3n}) + o_P(1),  (3)
but the probability limit of the numerator n^{-1} r_{1n} is typically nonzero.
If h_n → ∞, d_{3n} dominates the denominator in (3), and then
λ̂_n − λ_0 = [h_n^{-1} · (h_n/n) r_{1n} + n^{-1/2} · n^{-1/2} r_{2n}] / (n^{-1} d_{3n}) + o_P(1) →_p 0,  (4)
indicating that λ̂_n is consistent as long as h_n → ∞.
More interestingly, the behavior of λ̂_n depends on how fast h_n diverges to infinity. If h_n tends to infinity at a rate slower than √n, then r_{1n} dominates r_{2n} in the numerator, and
h_n (λ̂_n − λ_0) = [(h_n/n) r_{1n}] / (n^{-1} d_{3n}) + o_P(1),  (5)
which implies that λ̂_n converges at the slower rate h_n, but it does not converge to λ_0 at rate h_n, as plim_{n→∞} (h_n/n) r_{1n} = lim_{n→∞} (h_n/n) tr(Σ_n M_n G_n) is typically nonzero.
If h_n is of exact order √n, r_{1n} and r_{2n} are of the same order in the numerator, so
√n (λ̂_n − λ_0) = (n^{-1/2} r_{1n} + n^{-1/2} r_{2n}) / (n^{-1} d_{3n}) + o_P(1),  (6)
indicating that λ̂_n converges at rate √n, but at this rate it does not converge to λ_0, as in general plim_{n→∞} n^{-1/2} r_{1n} = lim_{n→∞} n^{-1/2} tr(Σ_n M_n G_n) is nonzero.
If h_n tends to infinity at a rate faster than √n (and yet slower than n), namely, lim_{n→∞} √n/h_n = 0 (and lim_{n→∞} h_n/n = 0), then r_{2n} dominates r_{1n} in the numerator,
√n (λ̂_n − λ_0) = (n^{-1/2} r_{2n}) / (n^{-1} d_{3n}) + o_P(1),
indicating that λ̂_n converges to λ_0 at rate √n and is asymptotically normal if one applies a central limit theorem to n^{-1/2} r_{2n}.
One sees that the asymptotic behavior of λ̂_n depends on the magnitude of h_n, which may be unknown in practice. The following theorem shows that if λ̂_n is properly recentered, a unified asymptotic distribution result follows.
Theorem 1.
Under Assumptions 1–6, the OLS estimator λ̂_n of λ_0 in the SAR model (1) has the following asymptotic distribution:
√n ( λ̂_n − λ_0 − tr(Σ_n M_n G_n) / (y_n′ W_n′ M_n W_n y_n) ) →_d N(0, v),  (7)
where
v = lim_{n→∞} n Var(r_n) / [tr(Σ_n G_n′ M_n G_n) + β_0′ X_n′ G_n′ M_n G_n X_n β_0]^2
with
Var(r_n) = tr[Σ_n^{(4)} (M_n G_n ⊙ M_n G_n)] + tr[Σ_n M_n G_n Σ_n (M_n G_n + G_n′ M_n)] + β_0′ X_n′ G_n′ M_n Σ_n M_n G_n X_n β_0 + 2 β_0′ X_n′ G_n′ M_n dg(Σ_n^{(3)} M_n G_n).
Remark 1.
The recentering term tr(Σ_n M_n G_n) / (y_n′ W_n′ M_n W_n y_n) is in fact E(r_n)/d_n. One could instead have recentered λ̂_n − λ_0 by E(r_n)/E(d_n). (This is the approach taken by Kyriacou et al. (2017) when dealing with the special case of the pure SAR model.) But then, by following an expansion similar to that in the proof of Theorem 1 (see Appendix B.1), one finds that the asymptotic variance of the resulting recentered estimator is much more complicated, involving the variances of r_n and d_n as well as their covariance.
Remark 2.
When lim_{n→∞} √n/h_n = 0, the recentering term tr(Σ_n M_n G_n) / (y_n′ W_n′ M_n W_n y_n) is in fact o_P(n^{-1/2}) and the asymptotic distribution of √n(λ̂_n − λ_0) centers at zero, so one does not need to recenter λ̂_n − λ_0 by tr(Σ_n M_n G_n) / (y_n′ W_n′ M_n W_n y_n); nor does one need to recenter it by y_n′ S_n′ M_n D_n M_n S_n y_n / (y_n′ W_n′ M_n W_n y_n), which is also o_P(n^{-1/2}), as in Theorem 2 to be introduced.
Remark 3.
When h_n diverges, one may single out the dominating terms in Var(r_n) and E(d_n) so that a finer expression of v = lim_{n→∞} [Var(r_n)/n] / [E(d_n)/n]^2 can be presented. For example, under divergent h_n, one can replace Var(r_n) with Var(r_{2n}) = β_0′ X_n′ G_n′ M_n Σ_n M_n G_n X_n β_0 and replace E(d_n) with its dominating term β_0′ X_n′ G_n′ M_n G_n X_n β_0. When h_n is bounded, however, such replacements are not available and one needs to keep all the terms in Var(r_n) and E(d_n). Moreover, since Var(r_n) depends on higher-order moments of u_n, namely, Σ_n^{(3)} and Σ_n^{(4)}, with bounded h_n the asymptotic variance v of the recentered OLS estimator depends on them, too.
Remark 4.
In view of Remark 3, under divergent h_n, if further Σ_n = σ_0^2 I_n, namely, under homoscedasticity with σ_0^2 = E(u_{i,n}^2), then v corresponds to the top left element of σ_0^2 Γ^{-1}. This, together with Remark 2, is in line with the observation in Lee (2002) that the (consistent, uncentered) OLS estimator (when lim_{n→∞} √n/h_n = 0) has the same limiting distribution as the optimal IV estimator and, under normality, the same limiting distribution as the ML estimator. It also implies that in the other cases of divergent h_n (slower than or equal to rate √n), as long as the OLS estimator is properly recentered, it achieves the same limiting distribution.
Remark 5.
Theorem 1 gives a unified representation of the asymptotic distribution of the properly recentered OLS estimator, regardless of the possibly unknown magnitude of h n . Further, it facilitates the construction of the indirect inference estimator to be introduced that corrects the inconsistency, when present, of the OLS estimator.

2.2. The Indirect Inference Estimator

One can see from Theorem 1 in the previous subsection that the OLS estimator λ̂_n may have an asymptotic bias. Yet a direct feasible bias correction is not possible, since the bias itself depends on unknown parameters, including λ_0, which may not be consistently estimated by the OLS estimator λ̂_n. Following Phillips (2012) and Kyriacou et al. (2017), one may define a binding function that involves the bias of λ̂_n. Unfortunately, the resulting binding function then involves the unknown Σ_n, which appears in the recentering quantity for the OLS estimator in (7). The strategy in this paper is to replace tr(Σ_n M_n G_n) = E(u_n′ M_n G_n u_n) with a term that is of the same order as u_n′ M_n G_n u_n and at the same time does not directly involve Σ_n.
Recall that D_n = Dg(M_n G_n). Then under a general form of unknown heteroscedasticity, tr(Σ_n M_n G_n) = tr(Σ_n D_n) = E(u_n′ D_n u_n), where u_n = S_n y_n − X_n β_0. If λ_0 were known, then β_0 could be consistently estimated by β̃_n = β̃_n(λ_0) = (X_n′ X_n)^{-1} X_n′ S_n y_n. Let ũ_n = ũ_n(λ_0) = S_n y_n − X_n β̃_n = M_n S_n y_n. Now one may replace E(u_n′ D_n u_n) with ũ_n′ D_n ũ_n = y_n′ S_n′ M_n D_n M_n S_n y_n and use y_n′ S_n′ M_n D_n M_n S_n y_n / (y_n′ W_n′ M_n W_n y_n) as the recentering quantity.
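The feasible recentering quantity can be sketched numerically. The design below is an illustrative assumption, and, as in Theorem 2, the recentering term is evaluated at the true λ_0 (known here because the data are simulated).

```python
import numpy as np

# Sketch of the recentering quantity of Theorem 2 on simulated data
# (weight matrix and parameter values are illustrative assumptions).
rng = np.random.default_rng(2)
n, lam0 = 150, 0.3
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(3, 1, n)])
beta0 = np.array([1.0, 0.5])
u = rng.normal(size=n) * np.sqrt(rng.uniform(0.5, 1.5, n))
S = np.eye(n) - lam0 * W
y = np.linalg.solve(S, X @ beta0 + u)

M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
G = W @ np.linalg.inv(S)
D = np.diag(np.diag(M @ G))                          # D_n = Dg(M_n G_n)

# OLS estimate of lambda and the feasible recentering term
Z = np.column_stack([W @ y, X])
lam_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)[0]
u_tilde = M @ S @ y                                  # residuals when lambda_0 is known
recenter = (u_tilde @ D @ u_tilde) / (y @ W.T @ M @ W @ y)
lam_recentered = lam_hat - recenter

# identity used in the text: u_tilde' D u_tilde = y' S' M D M S y
assert np.isclose(u_tilde @ D @ u_tilde, y @ S.T @ M @ D @ M @ S @ y)
```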
Theorem 2.
Under Assumptions 1–6, the OLS estimator λ̂_n of λ_0 in the SAR model (1) has the following asymptotic distribution:
√n ( λ̂_n − λ_0 − y_n′ S_n′ M_n D_n M_n S_n y_n / (y_n′ W_n′ M_n W_n y_n) ) →_d N(0, η),
where
η = lim_{n→∞} n { tr[Σ_n E_n Σ_n (E_n + E_n′)] + β_0′ X_n′ G_n′ M_n Σ_n M_n G_n X_n β_0 } / [tr(Σ_n G_n′ M_n G_n) + β_0′ X_n′ G_n′ M_n G_n X_n β_0]^2.
Remark 6.
When tr(Σ_n M_n G_n), which involves the unknown Σ_n and appears in the recentering term of the OLS estimator, is replaced, the asymptotic variance η of the newly recentered estimator no longer involves Σ_n^{(3)} and Σ_n^{(4)}. This stands in contrast to the asymptotic variance v (see Remark 3). So replacing tr(Σ_n M_n G_n) with y_n′ S_n′ M_n D_n M_n S_n y_n facilitates not only the construction of the indirect inference estimator to be introduced but also the inference procedure.
Given Theorem 2 and the observed sample data y_n and X_n, one can always define the sample binding function. Recall that S_n(λ) = I_n − λ W_n and D_n(λ) = Dg(M_n G_n(λ)) = Dg(M_n W_n S_n^{-1}(λ)) are functions of the parameter λ (as well as of X_n). So the binding function can be defined as
b_n(λ) = λ + y_n′ S_n′(λ) M_n D_n(λ) M_n S_n(λ) y_n / (y_n′ W_n′ M_n W_n y_n)  (9)
and the II estimator inverts this binding function:
λ̂_n^{II} = b_n^{-1}(λ̂_n).  (10)
Intuitively, the II estimator defined as such tries to match λ ^ n from the observed data to its expectation, at least approximately. Typically, the expectation may be approximated, to an arbitrary degree of accuracy, via the method of simulations, as in the original spirit of Gouriéroux et al. (1993) and Smith (1993); however, in simulations one needs to make some distributional assumption to generate the pseudo error term. Instead one may use some analytical approximation as in Phillips (2012), Kyriacou et al. (2017), and this paper.
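The inversion in (10) can be sketched numerically. The data generating process below is an illustrative assumption, and the bisection assumes that b_n is increasing on the search interval, in line with b_n′(λ) = 1 + O_P(h_n^{-1}).

```python
import numpy as np

# Sketch of the II step: invert the sample binding function b_n numerically.
rng = np.random.default_rng(3)
n, lam0 = 150, 0.3
W = np.zeros((n, n))
for i in range(n):                                   # assumed circulant weights
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(3, 1, n)])
beta0 = np.array([1.0, 0.5])
u = rng.normal(size=n) * np.sqrt(rng.uniform(0.5, 1.5, n))
y = np.linalg.solve(np.eye(n) - lam0 * W, X @ beta0 + u)

M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
Z = np.column_stack([W @ y, X])
lam_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)[0]       # OLS estimate of lambda

def binding(lam):
    """b_n(lambda) = lambda + y'S(lam)'M D(lam) M S(lam)y / y'W'M W y."""
    S_l = np.eye(n) - lam * W
    D_l = np.diag(np.diag(M @ W @ np.linalg.inv(S_l)))
    num = y @ S_l.T @ M @ D_l @ M @ S_l @ y
    return lam + num / (y @ W.T @ M @ W @ y)

lo, hi = -0.95, 0.95                                 # assumed to bracket the solution
for _ in range(60):                                  # bisection on b_n(lam) = lam_hat
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if binding(mid) < lam_hat else (lo, mid)
lam_ii = 0.5 * (lo + hi)
```

A library root-finder could be used in place of the hand-rolled bisection; nothing in the procedure depends on how the scalar equation is solved.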
In the definition of the II estimator as in (10), it is implicitly assumed that the binding function b n ( λ ) is invertible. Note that b n ( λ ) is a function of λ and the sample data y n and X n and thus it is random.5
Assumption 7.
For all λ ∈ Λ, the binding function (9) is monotonic in λ with probability 1, and when h_n is bounded, b_n′(λ_0) →_{a.s.} b_0 ≠ 0, where
b_n′(λ_0) = 1 + [y_n′ S_n′ M_n Dg(M_n G_n^2) M_n S_n y_n − 2 y_n′ W_n′ M_n D_n M_n S_n y_n] / (y_n′ W_n′ M_n W_n y_n).  (11)
One can see from expression (A.10) in Appendix B.3 that the derivative of the binding function with respect to λ is
b_n′(λ) = 1 + O_P(h_n^{-1}).
For divergent h_n, one can show that the O_P(h_n^{-1}) term converges almost surely to zero for all λ ∈ Λ, so in large samples Assumption 7 is more likely to hold. Assumption 7 lists the conditions under which the II estimator exists and is consistent. It would be desirable to lay down primitive conditions on the data matrix X_n, the weight matrix W_n, and the parameter space Λ under which Assumption 7 would be satisfied. Given the sample data, one may plot the binding function against λ to verify numerically the validity of this assumption. Simulations as in Gospodinov et al. (2017) may also help to establish this assumption's credibility.
Theorem 3.
For the SAR model (1), under Assumptions 1–7, the II estimator λ̂_n^{II} of λ_0, defined as in (10) and based on the binding function (9), has the following asymptotic distribution:
√n (λ̂_n^{II} − λ_0) →_d N(0, η/b_0^2).
Remark 7.
Since b_n′(λ_0) converges almost surely to 1 when h_n diverges, the asymptotic distribution of the II estimator is then identical to that of the properly recentered OLS estimator. Further, under divergent h_n, β_0′ X_n′ G_n′ M_n Σ_n M_n G_n X_n β_0 dominates Var(u_n′ E_n u_n + β_0′ X_n′ G_n′ M_n u_n) and β_0′ X_n′ G_n′ M_n G_n X_n β_0 dominates E(d_n), so η = v + o(1). In light of Remark 4, this means that under homoscedasticity, the II estimator defined as in (10) is as efficient as the optimal IV estimator and can be as efficient as the ML estimator if the spatial data are normally distributed. (The same conclusion holds for the slightly modified II estimator, designed specifically under homoscedasticity, in Appendix B.6.)
Remark 8.
Theorem 3 shows that regardless of the magnitude of h_n, which researchers may not know in practice, one can always apply the II procedure after the OLS estimation is done. At worst, when lim_{n→∞} √n/h_n = 0, this procedure is redundant (see Remark 2), but even then the resulting II estimator has exactly the same asymptotic distribution as the consistent OLS estimator (since η = v + o(1); see Remark 7). Otherwise, the II procedure provides a correction to the inconsistent OLS estimator.
Once the spatial autoregression parameter λ is consistently estimated by λ̂_n^{II}, one can estimate the parameter vector β by
β̂_n^{II} = (X_n′ X_n)^{-1} X_n′ Ŝ_n y_n,  (13)
where Ŝ_n = S_n(λ̂_n^{II}) = S_n − (λ̂_n^{II} − λ_0) W_n.
Theorem 4.
For the SAR model (1), under Assumptions 1–7, the OLS estimator of β_0 defined as in (13) has the asymptotic distribution
√n (β̂_n^{II} − β_0) →_d N(0, V),
and jointly,
√n (θ̂_n^{II} − θ_0) = √n ( λ̂_n^{II} − λ_0 ; β̂_n^{II} − β_0 ) →_d N(0, Ω), with Ω = ( η/b_0^2  γ′ ; γ  V ),
where V, assumed to exist and be positive definite, is given by (A.14) and γ is given by (A.15), respectively, in Appendix B.4.
Remark 9.
One can see from Appendix B.4 that the expression of V contains the traditional OLS variance term under heteroscedasticity, lim_{n→∞} n (X_n′ X_n)^{-1} X_n′ Σ_n X_n (X_n′ X_n)^{-1}, as well as terms that reflect the additional uncertainty introduced by λ̂_n^{II} in the definition of β̂_n^{II} in (13).
In practice, in order to make asymptotically valid inference from the II estimation strategy, one needs to estimate η/b_0^2, V, and γ in Theorems 3 and 4 by η(θ̂_n^{II})/[b_n′(λ̂_n^{II})]^2, V(θ̂_n^{II}, Σ̂_n), and γ(θ̂_n^{II}, Σ̂_n), respectively, where Σ̂_n = Dg(û_{1,n}^2, …, û_{n,n}^2) and the û_{i,n}'s are the sample residuals from the II estimation.6

3. The Special Case of Pure SAR

It is worthwhile to discuss the case when there is no X n , namely, the so-called pure SAR model
y_n = λ W_n y_n + u_n.  (14)
This case is of special interest since there is no IV available. On the other hand, the QML estimator is not consistent under heteroscedasticity. Kyriacou et al. (2017) were the first to explore the possibility of using the II procedure to correct the inconsistency of the OLS estimator under some mild form of heteroscedasticity and their results were quite promising. In this paper, no restrictions are imposed on the form of the unknown heteroscedasticity.
Given the expansion (3), one can see that
plim_{n→∞} (λ̂_n − λ_0) = [plim_{n→∞} (h_n/n) r_{1n}] / [plim_{n→∞} (h_n/n) d_{1n}] = [lim_{n→∞} (h_n/n) tr(Σ_n G_n)] / [lim_{n→∞} (h_n/n) tr(Σ_n G_n′ G_n)] ≠ 0,
regardless of the magnitude of h_n. Proceeding similarly as before, as long as h_n = o(n), one can write r_{1n} − E(r_{1n}) = O_P(√(n/h_n)), d_{1n} − E(d_{1n}) = O_P(√(n/h_n)), and E(d_{1n}) = O(n/h_n) (see Lemma A6 in Appendix A). Then, by a Nagar-type (Nagar (1959)) expansion,
√(n/h_n) ( λ̂_n − λ_0 − E(r_{1n})/d_{1n} ) = √(n/h_n) [r_{1n} − E(r_{1n})] / { E(d_{1n}) + [d_{1n} − E(d_{1n})] } = √(n/h_n) [r_{1n} − E(r_{1n})] / E(d_{1n}) + o_P(1).
Assumption 6 needs to be modified accordingly to ensure that the asymptotic variance of the properly recentered λ̂_n exists and is positive. Now let D_n = Dg(G_n) and E_n = G_n − D_n.
Assumption 8.
(i)
v = lim_{n→∞} n { tr[Σ_n^{(4)} (G_n ⊙ G_n)] + tr[Σ_n G_n Σ_n (G_n + G_n′)] } / ( h_n [tr(Σ_n G_n′ G_n)]^2 )
exists and is positive; (ii)
η = lim_{n→∞} n tr[Σ_n E_n Σ_n (E_n + E_n′)] / ( h_n [tr(Σ_n G_n′ G_n)]^2 )
exists and is positive.
Corollary 1.
Under Assumptions 1–4 and 8, the OLS estimator λ̂_n of λ in the pure SAR model (14) has the following asymptotic distribution:
√(n/h_n) ( λ̂_n − λ_0 − tr(Σ_n G_n) / (y_n′ W_n′ W_n y_n) ) →_d N(0, v),
where v is defined as in Assumption 8(i).
Corollary 2.
Under Assumptions 1–4 and 8, the OLS estimator λ̂_n of λ in the pure SAR model (14) has the following asymptotic distribution:
√(n/h_n) ( λ̂_n − λ_0 − y_n′ S_n′ D_n S_n y_n / (y_n′ W_n′ W_n y_n) ) →_d N(0, η),
where η is defined as in Assumption 8(ii).
Let the sample binding function be
b_n(λ) = λ + y_n′ S_n′(λ) D_n(λ) S_n(λ) y_n / (y_n′ W_n′ W_n y_n).  (18)
Accordingly, Assumption 7 is modified as follows.
Assumption 9.
For all λ ∈ Λ, the binding function (18) is monotonic in λ with probability 1 and b_n′(λ_0) →_{a.s.} b_0 ≠ 0, where
b_n′(λ_0) = 1 + [y_n′ S_n′ Dg(G_n^2) S_n y_n − 2 y_n′ W_n′ D_n S_n y_n] / (y_n′ W_n′ W_n y_n).  (19)
Corollary 3.
For the pure SAR model (14), under Assumptions 1–4, 8, and 9, the II estimator λ̂_n^{II} of λ that is based on the binding function (18) has the following asymptotic distribution:
√(n/h_n) (λ̂_n^{II} − λ_0) →_d N(0, η/b_0^2),  (20)
where η is defined as in Assumption 8(ii).
Remark 10.
If Σ_n = σ_0^2 I_n (namely, under homoscedasticity) and one uses E(r_{1n})/E(d_{1n}) = tr(G_n)/tr(G_n′ G_n) as the recentering term, which fortunately does not involve the unknown variance parameter σ_0^2, then the binding function, its derivative, and the asymptotic distribution of the resulting II estimator can be modified accordingly as in Kyriacou et al. (2017). In contrast to the SAR model with X_n, the recentered OLS estimator and the II estimator in the pure SAR model have convergence rate √(n/h_n).
Remark 11.
Note that b_n′(λ_0) does not converge almost surely to 1 when h_n is divergent, unlike the case when X_n is present. This is because y_n′ W_n′ W_n y_n = O_P(n/h_n) in the pure case whereas y_n′ W_n′ M_n W_n y_n = O_P(n) when X_n is present, while the numerators of the second terms on the right-hand sides of (19) and (11) are both O_P(n/h_n).
Remark 12.
Admittedly, the convergence rate in (20) depends on h_n. However, this does not prevent one from using the II estimator to estimate the pure SAR model, since the binding function (18) does not involve h_n. For inference purposes, since η defined as in Assumption 8(ii) carries the scaling factor n/h_n, once λ̂_n^{II} and the sample residuals û_{i,n} are available, the standard error of λ̂_n^{II} can be calculated as √{ tr[Σ̂_n Ê_n Σ̂_n (Ê_n + Ê_n′)] / [b_n′(λ̂_n^{II}) tr(Σ̂_n Ĝ_n′ Ĝ_n)]^2 }, where Ĝ_n = G_n(λ̂_n^{II}), Ê_n = E_n(λ̂_n^{II}), and Σ̂_n = Dg(û_{1,n}^2, …, û_{n,n}^2).
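The full pure-SAR recipe of this section (OLS, inversion of (18), and the plug-in standard error of Remark 12) can be sketched as follows. The group design, parameter values, and seed are illustrative assumptions, and b_n′ is approximated here by a central difference instead of the closed form (19).

```python
import numpy as np

# Sketch: II estimation and plug-in standard error in the pure SAR model.
rng = np.random.default_rng(4)
sizes = rng.integers(3, 21, size=25)                 # assumed group sizes in {3,...,20}
n = int(sizes.sum())
W = np.zeros((n, n))
sig2, start = [], 0
for m in sizes:                                      # group-interaction weights
    W[start:start + m, start:start + m] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
    sig2.extend([1.0 / m] * m)                       # variance structure V2
    start += m

lam0 = 0.6
u = rng.normal(size=n) * np.sqrt(np.array(sig2))
y = np.linalg.solve(np.eye(n) - lam0 * W, u)

lam_hat = (y @ W.T @ y) / (y @ W.T @ W @ y)          # OLS in the pure SAR model

def binding(lam):                                    # b_n(lambda) as in (18)
    S_l = np.eye(n) - lam * W
    D_l = np.diag(np.diag(W @ np.linalg.inv(S_l)))   # D_n(lambda) = Dg(G_n(lambda))
    return lam + (y @ S_l.T @ D_l @ S_l @ y) / (y @ W.T @ W @ y)

lo, hi = -0.95, 0.95                                 # assumed bracket; bisection
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if binding(mid) < lam_hat else (lo, mid)
lam_ii = 0.5 * (lo + hi)

# Plug-in standard error in the spirit of Remark 12
S_hat = np.eye(n) - lam_ii * W
G_hat = W @ np.linalg.inv(S_hat)
E_hat = G_hat - np.diag(np.diag(G_hat))
Sig_hat = np.diag((S_hat @ y) ** 2)                  # Dg of squared residuals
bprime = (binding(lam_ii + 1e-6) - binding(lam_ii - 1e-6)) / 2e-6
num = np.trace(Sig_hat @ E_hat @ Sig_hat @ (E_hat + E_hat.T))
den = (bprime * np.trace(Sig_hat @ G_hat.T @ G_hat)) ** 2
se = np.sqrt(num / den)
```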

4. Monte Carlo Evidence

In this section, Monte Carlo simulations demonstrate the performance of λ̂_n^{II} (as well as β̂_n^{II} in (13)) in finite samples, in comparison with consistent estimators that are also robust under heteroscedasticity: the optimal robust GMM estimator of Lin and Lee (2010) and the modified QML (MQML) estimator of Liu and Yang (2015). For the optimal robust GMM estimator of Lin and Lee (2010), u_n′ Σ_n^{-1} [G_n(λ) − Dg(G_n(λ))] u_n and u_n′ Σ_n^{-1} (G_n(λ) X_n β, X_n) are used as the optimal moment conditions; see Debarsy et al. (2015). They involve λ (appearing in G_n(λ)) and β, as well as the covariance matrix Σ_n. For λ and β, an initial estimate is constructed from the simple 2SLS with W_n X_n and X_n as IVs. One may assume a model for Σ_n and then estimate it so as to construct the moment conditions. In this section, two choices are made in this regard: one is to use u_n′ [G_n − Dg(G_n)] u_n and u_n′ (G_n X_n β, X_n) as the moment conditions, and the other is to use u_n′ Σ_n^{-1} [G_n − Dg(G_n)] u_n and u_n′ Σ_n^{-1} (G_n X_n β, X_n) with the true Σ_n (known in simulations) plugged in. The two resulting estimators are denoted by GMM and GMM(Σ_n), respectively, in Table 1, Table 2, Table 3 and Table 4. One would expect that in practice the performance of the optimal robust GMM estimator with an estimated Σ_n appearing in the moment conditions would most likely lie between the two.
In each experiment, for each estimator, the bias and root mean squared error (RMSE) from 1000 Monte Carlo replications are reported. The empirical rejection probabilities of the relevant t tests for testing each parameter equal to its true value at the 5% level are also reported, denoted by P(5%), where the asymptotic variances for λ̂_n^{II} and β̂_n^{II}, as discussed in the last paragraph of Section 2, are estimated with the unknowns replaced by their estimates based on the II procedure.7
For the purpose of comparison, the experimental design in Lin and Lee (2010) is followed closely. The spatial scenario under a group interaction weight matrix of Case (1991) is considered. The exogenous variables include a constant term and two independently distributed random variables following N ( 3 , 1 ) and U ( 1 , 2 ) , respectively. The size of each group is determined by a U ( 3 , 20 ) random variable. The error terms follow a zero-mean normal distribution with variances varying across groups. Two variance structures (V1 and V2) are considered. V1: for each group, if the group size is greater than 10, then the error variance is the same as the group size; otherwise, the variance is the inverse of the square of the group size. V2: for each group, the error variance is the inverse of the group size. Two sets of parameter configurations are used: θ 0 = ( λ 0 , 0.8 , 0.2 , 1.5 ) and θ 0 = ( λ 0 , 0.2 , 0.2 , 0.1 ) , named P1 and P2, respectively. Different degrees of spatial autocorrelation are considered: λ 0 = 0.2 , 0.6 , 0.9 . Results are reported in Table 1, Table 2, Table 3 and Table 4.
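The group-interaction design described above can be sketched in code. The number of groups and the seed below are illustrative assumptions; the weights, group sizes, and variance structures V1 and V2 follow the description in the text.

```python
import numpy as np

# Sketch of the group-interaction design (Case 1991): group sizes drawn from
# U(3, 20), equal weights 1/(m-1) within a group, zero diagonal, and the two
# group-level error variance structures V1 and V2 described in the text.
rng = np.random.default_rng(5)
sizes = rng.integers(3, 21, size=30)          # integer group sizes in {3,...,20}
n = int(sizes.sum())

W = np.zeros((n, n))
v1, v2, start = [], [], 0
for m in sizes:
    W[start:start + m, start:start + m] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
    v1.extend([float(m) if m > 10 else m ** -2.0] * m)   # V1: size, or 1/size^2
    v2.extend([1.0 / m] * m)                             # V2: inverse group size
    start += m
v1, v2 = np.array(v1), np.array(v2)
```

Here h_n corresponds to the group sizes, which stay bounded between 3 and 20, so this design sits in the regime where OLS is inconsistent and the II correction matters.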
Some interesting observations arise. Firstly, all the consistent estimators deliver almost unbiased results across all the experimental configurations, though the estimated intercept term associated with $X_n$ is relatively more biased. Secondly, among the consistent estimators, the optimal robust GMM using the true $\Sigma_n$ usually achieves the smallest RMSE. The other three estimators have very similar performance in terms of RMSE. Thirdly, for the purpose of hypothesis testing, the II-based procedure appears as good as the one based on (the infeasible) GMM($\Sigma_n$), with empirical rejection rates matching the nominal size very closely. The MQML of Liu and Yang (2015) tends to deliver under-sized t tests for the spatial autoregression parameter $\lambda$ when its value is relatively high. For example, in Table 4, one sees rejection rates of 0.4% and 0.1%, under $R = 100$ and $R = 200$, respectively, for testing $\lambda$ equal to its true value when $\lambda_0 = 0.9$ from a 5% t test based on Liu and Yang (2015). This under-size problem, when the degree of spatial correlation is high, also carries over to the t test associated with the intercept parameter. The GMM estimator, when one is unsure of the error variance structure and uses $u_n'(G_n-\mathrm{Dg}(G_n))u_n$ and $u_n'(G_nX_n\beta, X_n)$ as the moment conditions, delivers very disappointing size performance in Table 4 when testing either the spatial autoregression parameter $\lambda$ or the intercept parameter: the rejection rates reach around 20% at the 5% nominal size.
Given the simulated data, it is worthwhile to look at a plot of the binding function to check whether it is monotonic, as required by Assumption 7. Figure 1 is drawn for 1000 simulated data sets under variance structure V1 and parameter configuration P1 with $R = 100$.8 Recall that for a given $\theta_0 = (\lambda_0, \beta_0')'$ and the exogenous $X_n$, the data generating process generates the observable data $y_n$, and the binding function $b_n(\lambda) = \lambda + y_n'S_n'(\lambda)M_nD_n(\lambda)M_nS_n(\lambda)y_n/y_n'W_n'M_nW_ny_n$ is then a function of $\lambda$. Figure 1 clearly illustrates that the binding function is monotonic in $\lambda$, and thus the monotonicity condition in Assumption 7 is numerically supported. Figures drawn for the simulated data under other configurations display similar patterns and are omitted.
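Numerically, the binding function can be evaluated on a grid of $\lambda$ values and inspected for monotonicity, as the figure does. The sketch below assumes $D_n(\lambda) = \mathrm{Dg}(M_nG_n(\lambda))$, which is consistent with the derivative expression in the appendix but should be checked against the definition in the main text; all names are illustrative.

```python
import numpy as np

def binding_function(lam_grid, y, X, W):
    """Evaluate b_n(lambda) = lambda
         + y'S(lambda)' M D(lambda) M S(lambda) y / (y'W'M W y)
    on a grid, taking D(lambda) = Dg(M G(lambda)) (an assumption made
    for this sketch)."""
    n = len(y)
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)  # annihilator of X
    Wy = W @ y
    denom = Wy @ M @ Wy                                # y'W'M W y
    vals = []
    for lam in lam_grid:
        S = np.eye(n) - lam * W
        G = W @ np.linalg.inv(S)                       # G(lambda) = W S(lambda)^{-1}
        D = np.diag(np.diag(M @ G))                    # Dg(M G(lambda))
        Sy = S @ y
        vals.append(lam + (Sy @ M @ D @ M @ Sy) / denom)
    return np.array(vals)
```

Plotting the returned values against the grid reproduces the kind of curve shown in Figure 1, and `np.all(np.diff(vals) > 0)` gives a crude numerical monotonicity check for one data set.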
One may wonder about the performance of the proposed II estimator under homoscedasticity, relative to the QML estimator of Lee (2004) and the best GMM estimator of Lee (2007).9 Table 5 and Table 6 report the Monte Carlo results under parameter configurations P1 and P2, but now the error term is simulated as a standard normal random variable. The exogenous variables are simulated in the same way as before. From Table 5 and Table 6, one observes that the II estimator is slightly better than the best GMM estimator of Lee (2007), usually delivering smaller finite-sample bias and lower RMSE. Both methods have good finite-sample size performance in terms of the 5% t test. The finite-sample performance of the QML, on the other hand, is quite different from what the asymptotic theory predicts: its bias is more severe than that of the other two, its RMSE is higher, and its size performance is very poor.
The simulation results suggest that the II estimator could be used at least as a complement to other consistent estimators proposed in the literature that are robust to unknown heteroscedasticity. The II method may be favored for the purpose of hypothesis testing, especially for testing the spatial autoregression parameter. It also has very good finite-sample performance under homoscedasticity.

5. Concluding Remarks

Lee (2002) challenged the traditional wisdom that the OLS estimator is biased and inconsistent in spatial autoregressions and showed that it may be consistent under some special circumstances if there are exogenous regressors included. This paper thoroughly examines the asymptotic behavior of the OLS estimator under different specifications of the degree of aggregate influence on each unit from other units and provides a unified asymptotic distribution result of the recentered OLS estimator. Based on this, an indirect inference estimator, which is consistent and asymptotically normal, is introduced. The new estimator is relatively easy to calculate, does not rely on distributional assumptions on the data, and is robust to heteroscedasticity. Monte Carlo experiments in this paper show the good finite-sample performance of the II estimator in comparison with other consistent estimators that are robust to unknown heteroscedasticity.
In this paper, no attempt is made to compare the asymptotic variances of the GMM and II estimators. The II estimator in this paper may be interpreted as an estimator that uses one moment condition, namely, matching the OLS estimator $\hat{\lambda}_n$ with its approximate analytical expectation. In contrast, the GMM estimator in Lin and Lee (2010) is based on a set of exact expectations of bilinear and quadratic forms in $u_n$. The OLS estimator itself is based on an incorrect moment condition, namely, exogeneity of $W_ny_n$. It is not clear whether correcting an incorrect moment condition is as efficient as using a set of correct moment conditions. A fruitful strategy is perhaps to design a combined estimator.10 Another possible extension is to consider the more general higher-order SARAR model (spatial autoregressive model with spatial autoregressive disturbances) with heteroscedastic innovations, as in Badinger and Egger (2011) and Jin and Lee (2019). In this more general setup, the II procedure may be implemented as follows. One can first derive the approximate analytical expectation of the OLS estimator of the parameters in the SAR part, taking the parameters in the disturbance part as given, and thus design a "corrected" SAR estimator. Then, based on the residuals that arise from the "corrected" SAR estimator, one can derive the approximate analytical expectation of the OLS estimator of the parameters in the disturbance part. Finally, one can jointly estimate all the parameters in the SAR and disturbance parts by using the two sets of approximate analytical expectations. Some preliminary simulations show very promising results from this approach. Rigorous treatments of these extensions are left for future studies.

Author Contributions

The three authors of the paper have contributed equally, via joint efforts, regarding ideas, research, and writing. Conceptualization, Y.B. and X.L.; methodology, Y.B.; software, Y.B. and X.L.; validation, Y.B., X.L. and L.Y.; formal analysis, Y.B., X.L. and L.Y.; investigation, Y.B.; resources, not applicable; writing–original draft preparation, Y.B.; writing–review and editing, Y.B., X.L. and L.Y.; visualization, Y.B.; supervision, not applicable; project administration, Y.B., X.L. and L.Y.; funding acquisition, L.Y. All authors have read and agreed to the published version of the manuscript.

Funding

Lihong Yang’s research was partially supported by the National Natural Science Foundation of China under Grant No. 71573269.

Acknowledgments

The authors are grateful to the three anonymous referees, seminar participants at Huazhong Agricultural University, Huazhong University of Science and Technology, Nanjing Audit University, Shandong University, South China Normal University, and University of California, Riverside, and conference participants at the 2019 CES conference (Dalian) for their helpful comments. Jeff Ello from the Krannert Computing Center at Purdue University kindly created a virtual machine from a computer cluster to facilitate the simulations conducted in an early version of this paper. The authors are responsible for all remaining errors.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Lemmas

This appendix collects several lemmas that are useful for deriving the main results. Some of these results (without proofs) were either derived or presented in different ways, see Kelejian and Prucha (1999, 2001) and Lee (2001, 2002, 2004).
Lemma A1.
If { A n } and { B n } are uniformly bounded in row and column sums, then so are { A n + B n } and { A n B n } .
Lemma A2.
Suppose $\{A_n\}$ has elements of order $O(h_n^{-1})$. If $\{B_n\}$ is uniformly bounded in column sums, then the elements of $A_nB_n$ are $O(h_n^{-1})$; if $\{B_n\}$ is uniformly bounded in row sums, then the elements of $B_nA_n$ are $O(h_n^{-1})$. In either case, $\mathrm{tr}(A_nB_n) = O(n/h_n)$.
Lemma A3.
For a product involving (powers of) $G_n$ and $G_n'$, $\prod_{l=1}^{m}(G_n^{i_1}G_n'^{i_2})^{j_l}$, where $i_1, i_2 \ge 0$, $i_1 + i_2 > 0$, $j_l \ge 0$, $\sum_{l=1}^{m}j_l > 0$, with $i_1$, $i_2$, and $j_l$ all being integers, under Assumptions 1 and 2, its elements are of order $O(h_n^{-1})$ and its trace is of order $O(n/h_n)$.
Proof. 
Under Assumptions 1 and 2, from Lemma A2, the elements of $G_n = W_nS_n^{-1}$ are $O(h_n^{-1})$ and $\mathrm{tr}(G_n) = O(n/h_n)$. From Lemma A1, $G_n = W_nS_n^{-1}$ is uniformly bounded in row and column sums. By successively using Lemma A1, $G_n^{i_1-1}$ is uniformly bounded in row and column sums, and through Lemma A2, the elements of $G_n^{i_1} = G_nG_n^{i_1-1}$ are $O(h_n^{-1})$ and $\mathrm{tr}(G_n^{i_1}) = O(n/h_n)$. The same claim applies to $G_n'^{i_2}$, which is also uniformly bounded in row and column sums. Then $G_n^{i_1}G_n'^{i_2}$ is uniformly bounded in row and column sums with its elements being $O(h_n^{-1})$ and $\mathrm{tr}(G_n^{i_1}G_n'^{i_2}) = O(n/h_n)$. Proceeding similarly, one can see that the product $\prod_{l=1}^{m}(G_n^{i_1}G_n'^{i_2})^{j_l}$ shares these properties too. □
Lemma A4.
For the sequence $\{u_n\}$ with elements following Assumption 3, let $A_n$ and $B_n$ be nonrandom. Then
$$E(u_n'A_nu_n) = \mathrm{tr}(\Sigma_nA_n), \qquad E(u_nu_n'A_nu_n) = \mathrm{dg}(\Sigma_n^{(3)}A_n), \qquad (A.1)$$
$$E(u_n'A_nu_n\,u_n'B_nu_n) = \mathrm{tr}(\Sigma_n^{(4)}\mathrm{Dg}(A_n)\mathrm{Dg}(B_n)) + \mathrm{tr}(\Sigma_nA_n)\,\mathrm{tr}(\Sigma_nB_n) + \mathrm{tr}[\Sigma_nA_n\Sigma_n(B_n+B_n')].$$
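As a quick check of the first two identities, write $\Sigma_n = E(u_nu_n') = \mathrm{diag}(\sigma_{1,n}^2,\ldots,\sigma_{n,n}^2)$ and use the independence of the errors under Assumption 3:

```latex
E(u_n' A_n u_n)
  = \sum_{i}\sum_{j} a_{ij,n}\, E(u_{i,n} u_{j,n})
  = \sum_{i} a_{ii,n}\, \sigma_{i,n}^2
  = \operatorname{tr}(\Sigma_n A_n),
```

and the $i$-th element of $E(u_nu_n'A_nu_n)$ is $\sum_{j,k}a_{jk,n}E(u_{i,n}u_{j,n}u_{k,n}) = a_{ii,n}\mu_{3,i,n}$, which is the $i$-th diagonal element of $\Sigma_n^{(3)}A_n$.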
Lemma A5.
Under Assumptions 1–5,
$$u_n'G_nu_n = O_P(n/h_n), \quad u_n'M_nG_nu_n = O_P(n/h_n), \quad u_n'G_n'M_nG_nu_n = O_P(n/h_n),$$
$$\beta_0'X_n'G_n'M_nu_n = O_P(\sqrt{n}), \quad \beta_0'X_n'G_n'M_nG_nu_n = O_P(\sqrt{n}), \quad \beta_0'X_n'G_n'M_nG_nX_n\beta_0 = O(n).$$
Proof. 
From (A.1), $E(u_n'G_nu_n) = \mathrm{tr}(\Sigma_nG_n)$. Under Assumption 3, $\Sigma_n$ is uniformly bounded, so $\mathrm{tr}(\Sigma_nG_n) \le K\,\mathrm{tr}(G_n) = O(n/h_n)$ from Lemma A3. Lee (2004) shows that, under Assumption 5, $\lim_{n\to\infty} n^{-1}(X_n, G_nX_n\beta_0)'(X_n, G_nX_n\beta_0)$ is nonsingular if and only if both the limits of $n^{-1}X_n'X_n$ and $n^{-1}\beta_0'X_n'G_n'M_nG_nX_n\beta_0$ are nonsingular, indicating that $X_n'X_n = O(n)$ and $\beta_0'X_n'G_n'M_nG_nX_n\beta_0 = O(n)$. Also from Lee (2004), $M_n$ is uniformly bounded in row and column sums, and then $E(u_n'M_nG_nu_n) = \mathrm{tr}(\Sigma_nM_nG_n) \le K\,\mathrm{tr}(M_nG_n) = O(n/h_n)$ from Lemmas A2 and A3. Similarly, one can show $E(u_n'G_n'M_nG_nu_n) = O(n/h_n)$. As for $\beta_0'X_n'G_n'M_nu_n$, its expectation is zero and its variance is $\beta_0'X_n'G_n'M_n\Sigma_nM_nG_nX_n\beta_0$, which is bounded by $K\beta_0'X_n'G_n'M_nG_nX_n\beta_0 = O(n)$; it then follows that $\beta_0'X_n'G_n'M_nu_n = O_P(\sqrt{n})$. As for $\beta_0'X_n'G_n'M_nG_nu_n$, note that $G_n'M_nG_nG_n'M_nG_n$ is uniformly bounded in row and column sums through Lemmas A1 and A2. Given that the elements of $X_n$ are uniformly bounded, one has $\beta_0'X_n'G_n'M_nG_n\Sigma_nG_n'M_nG_nX_n\beta_0 = O(n)$, and it follows that $\beta_0'X_n'G_n'M_nG_nu_n = O_P(\sqrt{n})$. □
Lemma A6.
Under Assumptions 1–5, $E(r_n) = O(n/h_n)$, $\mathrm{Var}(r_n) = O(n)$, $E(d_n) = O(n)$, and $\mathrm{Var}(d_n) = O(n)$. When there is no $X_n$, $E(r_{1n}) = O(n/h_n)$, $\mathrm{Var}(r_{1n}) = O(n/h_n)$, $E(d_{1n}) = O(n/h_n)$, and $\mathrm{Var}(d_{1n}) = O(n/h_n)$.
Proof. 
Given Lemmas A4 and A5,
$$E(r_n) = \mathrm{tr}(\Sigma_nM_nG_n) = O(n/h_n)$$
and
$$E(d_n) = \mathrm{tr}(\Sigma_nG_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0 = O(n)$$
are obvious. Using Lemma A4,
$$\begin{aligned}
\mathrm{Var}(r_n) &= \mathrm{Var}(u_n'M_nG_nu_n) + \mathrm{Var}(\beta_0'X_n'G_n'M_nu_n) + 2\,\mathrm{Cov}(u_n'M_nG_nu_n,\ \beta_0'X_n'G_n'M_nu_n)\\
&= E[(u_n'M_nG_nu_n)^2] - [E(u_n'M_nG_nu_n)]^2 + E(u_n'M_nG_nX_n\beta_0\beta_0'X_n'G_n'M_nu_n) + 2\beta_0'X_n'G_n'M_nE(u_nu_n'M_nG_nu_n)\\
&= \mathrm{tr}(\Sigma_n^{(4)}\mathrm{Dg}(M_nG_n)\mathrm{Dg}(M_nG_n)) + \mathrm{tr}[\Sigma_nM_nG_n\Sigma_n(M_nG_n + G_n'M_n)] + \beta_0'X_n'G_n'M_n\Sigma_nM_nG_nX_n\beta_0 + 2\beta_0'X_n'G_n'M_n\mathrm{dg}(\Sigma_n^{(3)}M_nG_n),
\end{aligned}$$
where, in view of Lemmas A3 and A5, $\beta_0'X_n'G_n'M_n\Sigma_nM_nG_nX_n\beta_0 = O(n)$ is the leading term. Similarly,
$$\begin{aligned}
\mathrm{Var}(d_n) &= E[(u_n'G_n'M_nG_nu_n)^2] - [E(u_n'G_n'M_nG_nu_n)]^2 + 4E(u_n'G_n'M_nG_nX_n\beta_0\beta_0'X_n'G_n'M_nG_nu_n) + 4\beta_0'X_n'G_n'M_nG_nE(u_nu_n'G_n'M_nG_nu_n)\\
&= \mathrm{tr}(\Sigma_n^{(4)}\mathrm{Dg}(G_n'M_nG_n)\mathrm{Dg}(G_n'M_nG_n)) + 2\,\mathrm{tr}[\Sigma_nG_n'M_nG_n\Sigma_nG_n'M_nG_n] + 4\beta_0'X_n'G_n'M_nG_n\Sigma_nG_n'M_nG_nX_n\beta_0 + 4\beta_0'X_n'G_n'M_nG_n\mathrm{dg}(\Sigma_n^{(3)}G_n'M_nG_n),
\end{aligned}$$
in which $\beta_0'X_n'G_n'M_nG_n\Sigma_nG_n'M_nG_nX_n\beta_0 = O(n)$ is the leading term. For the case when there is no $X_n$, the results are obvious. □
Lemma A7.
Suppose $\{A_n\}$ is a sequence of matrices with uniformly bounded row and column sums. Let $\{b_n\}$ be a sequence of constant vectors with uniformly bounded elements and $\sup_n n^{-1}\sum_{i=1}^{n}|b_{i,n}|^{2+\eta_1} < \infty$ for some $\eta_1 > 0$. For the sequence $\{u_n\}$ that satisfies Assumption 3, let $Q_n = b_n'u_n + u_n'A_nu_n$. Then
$$\frac{Q_n - E(Q_n)}{\sqrt{\mathrm{Var}(Q_n)}} \xrightarrow{d} N(0, 1).$$

Appendix B. Proofs

The proofs of Theorems 1–4 in Section 2 and Corollary 2 in Section 3 are provided in this appendix and those of Corollaries 1 and 3, which follow similarly, are skipped.

Appendix B.1. Proof of Theorem 1

Proof. 
By a Nagar-type (Nagar (1959)) expansion,
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{\mathrm{tr}(\Sigma_nM_nG_n)}{y_n'W_n'M_nW_ny_n}\right) = \sqrt{n}\,\frac{r_{1n} - E(r_{1n}) + r_{2n}}{d_n} = \sqrt{n}\,\frac{r_{1n} - E(r_{1n}) + r_{2n}}{E(d_n) + d_n - E(d_n)} = \sqrt{n}\,\frac{r_{1n} - E(r_{1n}) + r_{2n}}{E(d_n)}\left(1 + \frac{d_n - E(d_n)}{E(d_n)}\right)^{-1} = \sqrt{n}\,\frac{r_{1n} - E(r_{1n}) + r_{2n}}{E(d_n)} + o_P(1),$$
where, in light of the proof of Lemma A6, $r_{1n} - E(r_{1n}) = O_P(\sqrt{n/h_n})$, $r_{2n} = O_P(\sqrt{n})$, and $E(d_n) = O(n)$. So when $h_n$ diverges,
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{\mathrm{tr}(\Sigma_nM_nG_n)}{y_n'W_n'M_nW_ny_n}\right) = \sqrt{n}\,\frac{r_{2n}}{E(d_n)} + o_P(1),$$
and one can apply Lemma A7 to $r_{2n} = \beta_0'X_n'G_n'M_nu_n$; when $h_n$ is bounded, one can apply Lemma A7 to $r_n = r_{1n} + r_{2n} = u_n'M_nG_nu_n + \beta_0'X_n'G_n'M_nu_n$. From Lemma A6, one sees that $\mathrm{Var}(r_{2n}) = \mathrm{Var}(r_n) + o(n)$ when $h_n$ diverges. Therefore, regardless of $h_n$,
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{\mathrm{tr}(\Sigma_nM_nG_n)}{y_n'W_n'M_nW_ny_n}\right) \xrightarrow{d} N\left(0,\ \lim_{n\to\infty}\frac{n\,\mathrm{Var}(r_n)}{[E(d_n)]^2}\right).$$
 □

Appendix B.2. Proof of Theorem 2

Proof. 
Note that
$$\tilde u_n'D_n\tilde u_n = u_n'D_nu_n + (\tilde\beta_n - \beta_0)'X_n'D_nX_n(\tilde\beta_n - \beta_0) - 2(\tilde\beta_n - \beta_0)'X_n'D_nu_n = u_n'D_nu_n + O_P(1) + O_P(n^{-1/2})O_P(\sqrt{n/h_n^2}),$$
in view of $X_n'X_n = O(n)$, $\tilde\beta_n - \beta_0 = O_P(n^{-1/2})$, and $\mathrm{Var}(X_n'D_nu_n) = X_n'D_n\Sigma_nD_nX_n = O(n/h_n^2)$. Then
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{y_n'S_n'M_nD_nM_nS_ny_n}{y_n'W_n'M_nW_ny_n}\right) = \sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{u_n'D_nu_n}{d_n}\right) + \sqrt{n}\,\frac{u_n'D_nu_n - \tilde u_n'D_n\tilde u_n}{d_n} = \sqrt{n}\,\frac{r_n - u_n'D_nu_n}{d_n} + o_P(1) = \sqrt{n}\,\frac{r_n - u_n'D_nu_n}{E(d_n)} + o_P(1).$$
It follows from Lemma A7 that
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{y_n'S_n'M_nD_nM_nS_ny_n}{y_n'W_n'M_nW_ny_n}\right) \xrightarrow{d} N(0, \eta),$$
where
$$\eta = \lim_{n\to\infty}\frac{n\,\mathrm{Var}(r_n - u_n'D_nu_n)}{[E(d_n)]^2} = \lim_{n\to\infty}\frac{n\,\mathrm{Var}(u_n'E_nu_n + \beta_0'X_n'G_n'M_nu_n)}{[E(d_n)]^2} = \lim_{n\to\infty}\frac{n\big\{\mathrm{tr}[\Sigma_nE_n\Sigma_n(E_n + E_n')] + \beta_0'X_n'G_n'M_n\Sigma_nM_nG_nX_n\beta_0\big\}}{[E(d_n)]^2}$$
by using Lemma A4 and the fact that $\mathrm{dg}(E_n) = 0$. □

Appendix B.3. Proof of Theorem 3

Proof. 
One can apply the extended delta method as in Phillips (2012) to derive the asymptotic distribution of $\sqrt{n}(\hat\lambda_n^{II} - \lambda_0)$. For this purpose, one needs to check a technical condition, namely, that $\partial b_n^{-1}(\lambda)/\partial\lambda$ is asymptotically locally equicontinuous at $\lambda_0$: for given $\zeta > 0$, if $s_n \to \infty$ and $s_n/\sqrt{n} \to 0$,
$$\sup_{s_n|\lambda-\lambda_0|<\zeta}\left|\frac{\partial b_n^{-1}(\lambda)/\partial\lambda - \partial b_n^{-1}(\lambda_0)/\partial\lambda}{\partial b_n^{-1}(\lambda_0)/\partial\lambda}\right| = \sup_{s_n|\lambda-\lambda_0|<\zeta}\left|\frac{b_n'(\lambda_0) - b_n'(\lambda)}{b_n'(\lambda)}\right| \xrightarrow{a.s.} 0.$$
The first derivative of the binding function is
$$b_n'(\lambda) = 1 + \frac{y_n'S_n'(\lambda)M_n\mathrm{Dg}(M_nG_n^2(\lambda))M_nS_n(\lambda)y_n - 2y_n'W_n'M_n\mathrm{Dg}(M_nG_n(\lambda))M_nS_n(\lambda)y_n}{y_n'W_n'M_nW_ny_n}. \qquad (A.10)$$
By substituting $y_n = S_n^{-1}X_n\beta_0 + S_n^{-1}u_n$ and using Lemmas A2–A4, one can see that the second term in (A.10) converges almost surely to a bounded constant for all $\lambda \in \Lambda$. In a similar way, one can show that
$$b_n''(\lambda) = \frac{2y_n'S_n'(\lambda)M_n\mathrm{Dg}(M_nG_n^3(\lambda))M_nS_n(\lambda)y_n - 4y_n'W_n'M_n\mathrm{Dg}(M_nG_n^2(\lambda))M_nS_n(\lambda)y_n + 2y_n'W_n'M_n\mathrm{Dg}(M_nG_n(\lambda))M_nW_ny_n}{y_n'W_n'M_nW_ny_n}$$
also converges almost surely to a bounded constant. Thus, for some $\lambda^*$ that lies between $\lambda_0$ and $\lambda$,
$$\left|\frac{b_n'(\lambda_0) - b_n'(\lambda)}{b_n'(\lambda)}\right| = |\lambda - \lambda_0|\left|\frac{b_n''(\lambda^*)}{b_n'(\lambda)}\right| < \frac{\zeta}{s_n}\left|\frac{b_n''(\lambda^*)}{b_n'(\lambda)}\right| \xrightarrow{a.s.} 0.$$
With all these results, (12) follows immediately from Theorem 1 of Phillips (2012) and Theorem 2 in this paper. □

Appendix B.4. Proof of Theorem 4

Proof. 
Upon substitution,
$$\begin{aligned}
\hat\beta_n^{II} &= (X_n'X_n)^{-1}X_n'\hat S_nS_n^{-1}X_n\beta_0 + (X_n'X_n)^{-1}X_n'\hat S_nS_n^{-1}u_n\\
&= (X_n'X_n)^{-1}X_n'[S_n - (\hat\lambda_n^{II} - \lambda_0)W_n]S_n^{-1}X_n\beta_0 + (X_n'X_n)^{-1}X_n'[S_n - (\hat\lambda_n^{II} - \lambda_0)W_n]S_n^{-1}u_n\\
&= \beta_0 + (X_n'X_n)^{-1}X_n'u_n - (\hat\lambda_n^{II} - \lambda_0)\big[(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 + (X_n'X_n)^{-1}X_n'G_nu_n\big],
\end{aligned}$$
where $\hat\lambda_n^{II} - \lambda_0 = O_P(n^{-1/2})$ (from Theorem 3). Further, $(X_n'X_n)^{-1}X_n'u_n = O_P(n^{-1/2})$, $(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 = O(1)$, and $(X_n'X_n)^{-1}X_n'G_nu_n = O_P(n^{-1/2})$ (from Lemma A5). Note that if one expands $\hat\lambda_n^{II} - \lambda_0 = b_n^{-1}(\hat\lambda_n) - b_n^{-1}(b_n(\lambda_0))$,
$$\sqrt{n}(\hat\lambda_n^{II} - \lambda_0) = \sqrt{n}\,\frac{\hat\lambda_n - b_n(\lambda_0)}{b_n'(\lambda_0)} + O_P(n^{-1/2}). \qquad (A.11)$$
From the proof of Theorem 2,
$$\hat\lambda_n - b_n(\lambda_0) = \frac{r_n - u_n'D_nu_n}{E(d_n)} + o_P(n^{-1/2}) = \frac{u_n'E_nu_n + \beta_0'X_n'G_n'M_nu_n}{E(d_n)} + o_P(n^{-1/2}). \qquad (A.12)$$
Given the above results, one has
$$\begin{aligned}
\sqrt{n}(\hat\beta_n^{II} - \beta_0) &= \sqrt{n}(X_n'X_n)^{-1}X_n'u_n - \sqrt{n}(\hat\lambda_n^{II} - \lambda_0)(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 + o_P(1)\\
&= \sqrt{n}(X_n'X_n)^{-1}X_n'u_n - \sqrt{n}\,\frac{\hat\lambda_n - b_n(\lambda_0)}{b_n'(\lambda_0)}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 + o_P(1)\\
&= \sqrt{n}(X_n'X_n)^{-1}X_n'u_n - \frac{\sqrt{n}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0}{b_n'(\lambda_0)}\cdot\frac{u_n'E_nu_n + \beta_0'X_n'G_n'M_nu_n}{E(d_n)} + o_P(1). \qquad (A.13)
\end{aligned}$$
One can check that Lemma A7 can be applied to each element or any nonstochastic linear combination of elements of $\sqrt{n}(\hat\beta_n^{II} - \beta_0)$ under such a representation. So $\sqrt{n}(\hat\beta_n^{II} - \beta_0)$ converges to a normal distribution, with the asymptotic covariance matrix
$$\begin{aligned}
V = \lim_{n\to\infty}\Big\{&\, n(X_n'X_n)^{-1}X_n'\Sigma_nX_n(X_n'X_n)^{-1} + \frac{\eta}{b_0^2}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\beta_0'X_n'G_n'X_n(X_n'X_n)^{-1}\\
&- \frac{n(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\beta_0'X_n'G_n'M_n\Sigma_nX_n(X_n'X_n)^{-1}}{b_0[\mathrm{tr}(\Sigma_nG_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0]}\\
&- \frac{n(X_n'X_n)^{-1}X_n'\Sigma_nM_nG_nX_n\beta_0\beta_0'X_n'G_n'X_n(X_n'X_n)^{-1}}{b_0[\mathrm{tr}(\Sigma_nG_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0]}\Big\}.
\end{aligned}$$
The asymptotic covariance between $\hat\beta_n^{II}$ and $\hat\lambda_n^{II}$ follows from the expansion of $\sqrt{n}(\hat\lambda_n^{II} - \lambda_0)$ (see (A.11) and (A.12)) and that of $\sqrt{n}(\hat\beta_n^{II} - \beta_0)$ (see (A.13)), given by
$$\gamma = \lim_{n\to\infty}\left\{\frac{n}{b_0E(d_n)}(X_n'X_n)^{-1}X_n'\Sigma_nM_nG_nX_n\beta_0 - \frac{\eta}{b_0^2}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\right\}.$$
 □

Appendix B.5. Proof of Corollary 2

Proof. 
By substitution and using the Nagar (1959) expansion,
$$\sqrt{n/h_n}\left(\hat\lambda_n - \lambda_0 - \frac{y_n'S_n'D_nS_ny_n}{y_n'W_n'W_ny_n}\right) = \sqrt{n/h_n}\,\frac{r_{1n} - u_n'D_nu_n}{d_{1n}} = \sqrt{n/h_n}\,\frac{u_n'E_nu_n}{d_{1n}} = \sqrt{n/h_n}\,\frac{u_n'E_nu_n}{E(d_{1n})} + o_P(1),$$
where $E(u_n'E_nu_n) = 0$ and $\mathrm{Var}(u_n'E_nu_n) = \mathrm{tr}[\Sigma_nE_n\Sigma_n(E_n + E_n')] = O(n/h_n)$, and the asymptotic distribution follows when one applies Lemma A7 to the quadratic form $u_n'E_nu_n$. □

Appendix B.6. The Case of Homoscedastic Error Term

If $\Sigma_n = \sigma_0^2I_n$ (and further $\Sigma_n^{(j)} = \mu_jI_n$, $\mu_j = E(u_{i,n}^j)$, $j = 3, 4$), then the recentering term in (7) becomes $\sigma_0^2\,\mathrm{tr}(M_nG_n)/y_n'W_n'M_nW_ny_n$. To make the II procedure feasible, one may replace $\sigma_0^2\,\mathrm{tr}(M_nG_n)/y_n'W_n'M_nW_ny_n$ by $n^{-1}y_n'S_n'M_nS_ny_n\,\mathrm{tr}(M_nG_n)/y_n'W_n'M_nW_ny_n$. Similar to the proof of Theorem 2, one can put
$$\sqrt{n}\left(\hat\lambda_n - \lambda_0 - \frac{n^{-1}y_n'S_n'M_nS_ny_n\,\mathrm{tr}(M_nG_n)}{y_n'W_n'M_nW_ny_n}\right) = \sqrt{n}\,\frac{r_n^* - E(r_n^*)}{E(d_n)} + o_P(1) \xrightarrow{d} N(0, \omega),$$
where $\omega = \lim_{n\to\infty} n\,\mathrm{Var}(r_n^*)/[E(d_n)]^2$, $r_n^* = u_n'M_nG_n^*u_n + \beta_0'X_n'G_n'M_nu_n$, and $G_n^* = G_n - n^{-1}\mathrm{tr}(M_nG_n)I_n$. In particular,
Var ( r n * ) = μ 4 tr ( M n G n * M n G n * ) + σ 0 2 tr [ M n G n * ( M n G n * + G n * M n ) ] + σ 0 2 β 0 X n G n M n M n G n X n β 0 + 2 μ 3 β 0 X n G n M n dg ( M n G n * ) .
Define the binding function as $b_n(\lambda) = \lambda + n^{-1}y_n'S_n'(\lambda)M_nS_n(\lambda)y_n\,\mathrm{tr}(M_nG_n(\lambda))/y_n'W_n'M_nW_ny_n$, so that $b_n'(\lambda_0) = 1 + (n\,y_n'W_n'M_nW_ny_n)^{-1}[y_n'S_n'M_nS_ny_n\,\mathrm{tr}(M_nG_n^2) - 2y_n'W_n'M_nS_ny_n\,\mathrm{tr}(M_nG_n)]$. Assume $b_n'(\lambda_0) \xrightarrow{a.s.} b_0 \ne 0$. The asymptotic distribution of $(\hat\lambda_n^{II}, \hat\beta_n^{II\prime})'$ resulting from this binding function follows similarly from the proofs of Theorems 3 and 4, given by
$$\sqrt{n}\begin{pmatrix}\hat\lambda_n^{II} - \lambda_0\\ \hat\beta_n^{II} - \beta_0\end{pmatrix} \xrightarrow{d} N\left(0,\ \begin{pmatrix}\omega/b_0^2 & \gamma'\\ \gamma & V\end{pmatrix}\right),$$
where
$$\begin{aligned}
V = \lim_{n\to\infty}\Big\{&\, n\sigma_0^2(X_n'X_n)^{-1} + \frac{\omega}{b_0^2}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\beta_0'X_n'G_n'X_n(X_n'X_n)^{-1}\\
&- \frac{n(X_n'X_n)^{-1}X_n'G_nX_n\beta_0[\sigma_0^2\beta_0'X_n'G_n'M_n + \mu_3\,\mathrm{dg}(M_nG_n^*)']X_n(X_n'X_n)^{-1}}{b_0[\sigma_0^2\mathrm{tr}(G_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0]}\\
&- \frac{n(X_n'X_n)^{-1}X_n'[\sigma_0^2M_nG_nX_n\beta_0 + \mu_3\,\mathrm{dg}(M_nG_n^*)]\beta_0'X_n'G_n'X_n(X_n'X_n)^{-1}}{b_0[\sigma_0^2\mathrm{tr}(G_n'M_nG_n) + \beta_0'X_n'G_n'M_nG_nX_n\beta_0]}\Big\}
\end{aligned}$$
and
$$\gamma = \lim_{n\to\infty}\left\{\frac{n}{b_0E(d_n)}(X_n'X_n)^{-1}X_n'[\sigma_0^2M_nG_nX_n\beta_0 + \mu_3\,\mathrm{dg}(M_nG_n^*)] - \frac{\omega}{b_0^2}(X_n'X_n)^{-1}X_n'G_nX_n\beta_0\right\}.$$

References

  1. Badinger, Harald, and Peter Egger. 2011. Estimation of higher-order spatial autoregressive cross-section models with heteroscedastic disturbances. Papers in Regional Science 90: 213–35. [Google Scholar] [CrossRef]
  2. Case, Anne C. 1991. Spatial patterns in household demand. Econometrica 59: 953–65. [Google Scholar] [CrossRef] [Green Version]
  3. Cheng, Xu, Zhipeng Liao, and Ruoyao Shi. 2019. On uniform asymptotic risk of averaging GMM estimators. Quantitative Economics 3: 931–97. [Google Scholar] [CrossRef]
  4. Cliff, Andrew David, and J. Keith Ord. 1981. Spatial Processes: Models and Applications. London: Pion Ltd. [Google Scholar]
  5. Debarsy, Nicolas, Fei Jin, and Lung-Fei Lee. 2015. Large sample properties of the matrix exponential spatial specification with an application to FDI. Journal of Econometrics 188: 1–21. [Google Scholar] [CrossRef] [Green Version]
  6. Gospodinov, Nikolay, Ivana Komunjer, and Serena Ng. 2017. Simulated minimum distance estimation of dynamic models with errors-in-variables. Journal of Econometrics 200: 181–93. [Google Scholar] [CrossRef]
  7. Gouriéroux, Christian, Alain Monfort, and Eric Renault. 1993. Indirect inference. Journal of Applied Econometrics 8: S85–S118. [Google Scholar] [CrossRef]
  8. Jin, Fei, and Lung-Fei Lee. 2019. GEL estimation and tests of spatial autoregressive models. Journal of Econometrics 208: 585–612. [Google Scholar] [CrossRef]
  9. Kelejian, Harry H., and Ingmar R. Prucha. 1999. A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review 40: 509–33. [Google Scholar] [CrossRef] [Green Version]
  10. Kelejian, Harry H., and Ingmar R. Prucha. 2001. On the asymptotic distribution of the Moran I test statistic with applications. Journal of Econometrics 104: 219–57. [Google Scholar] [CrossRef] [Green Version]
  11. Kelejian, Harry H., and Ingmar R. Prucha. 2010. Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. Journal of Econometrics 157: 53–67. [Google Scholar] [CrossRef] [Green Version]
  12. Kyriacou, Maria, Peter C. B. Phillips, and Francesca Rossi. 2017. Indirect inference in spatial autoregression. The Econometrics Journal 20: 168–89. [Google Scholar] [CrossRef]
  13. Lam, Clifford, and Pedro C. L. Souza. 2019. Estimation and selection of spatial weight matrix in a spatial lag model. Journal of Business and Economic Statistics 3: 693–710. [Google Scholar] [CrossRef]
  14. Lee, Lung-Fei. 2001. Asymptotic Distribution of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive Models I: Spatial Autoregressive Processes. Working Paper. Columbus: Department of Economics, Ohio State University. [Google Scholar]
  15. Lee, Lung-Fei. 2002. Consistency and efficiency of least squares estimation for mixed regressive, spatial autoregressive models. Econometric Theory 18: 252–77. [Google Scholar] [CrossRef]
  16. Lee, Lung-Fei. 2004. Asymptotic distribution of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72: 1899–925. [Google Scholar] [CrossRef]
  17. Lee, Lung-Fei. 2007. GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. Journal of Econometrics 137: 489–514. [Google Scholar] [CrossRef]
  18. Lin, Xu, and Lung-Fei Lee. 2010. GMM estimation of spatial autoregressive models with unknown heteroskedasticity. Journal of Econometrics 157: 34–52. [Google Scholar] [CrossRef]
  19. Liu, Shew Fan, and Zhenlin Yang. 2015. Modified QML estimation of spatial autoregressive models with unknown heteroskedasticity and nonnormality. Regional Science and Urban Economics 52: 50–70. [Google Scholar] [CrossRef]
  20. Nagar, Anirudh L. 1959. The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations. Econometrica 27: 575–95. [Google Scholar] [CrossRef]
  21. Phillips, Peter C. B. 2012. Folklore theorems, implicit maps, and indirect inference. Econometrica 80: 425–54. [Google Scholar]
  22. Robinson, Peter M. 2008. Correlation testing in time series, spatial and cross-sectional data. Journal of Econometrics 147: 5–16. [Google Scholar] [CrossRef] [Green Version]
  23. Smith, Anthony A., Jr. 1993. Estimating nonlinear time-series models using simulated vector autoregressions. Journal of Applied Econometrics 8: S63–S84. [Google Scholar] [CrossRef] [Green Version]
  24. Zhang, Xinyu, and Jihai Yu. 2018. Spatial weights matrix selection and model averaging for spatial autoregressive models. Journal of Econometrics 203: 1–18. [Google Scholar] [CrossRef]
1.
Recent literature on dealing with heteroscedasticity in the spatial framework includes Kelejian and Prucha (2010), Badinger and Egger (2011), Liu and Yang (2015), Jin and Lee (2019), among others. An essential idea in this strand of literature is to use some moment conditions that are robust to unknown heteroscedasticity.
2.
It should be pointed out that when $\lim_{n\to\infty}\sqrt{n}/h_n = 0$ (and $\lim_{n\to\infty}h_n/n = 0$, where $h_n^{-1}$ is the order of magnitude of the elements of $W_n$), $\hat\theta_n$ is consistent, as shown in Lee (2002), and thus one may not need to seek a consistent estimator of $\lambda_0$ separately and then use it to construct a consistent estimator of $\beta_0$. In practice, one may not know a priori the rate of $h_n$, but the II estimator to be introduced is always consistent regardless of the rate of $h_n$.
3.
Multicollinearities can happen, for example, when X n = 1 n and W n is row-normalized. Lee (2004) showed that under homoscedasticity, however, the QML estimator can still be consistent in spite of violation of this condition. Since the II estimator to be discussed in this paper is to correct the possible inconsistency of the OLS estimator, Assumption 5(ii) is maintained.
4.
The asymptotic variances are given by lim n n 1 Var ( r n ) / [ plim ( d n / n ) ] 2 = lim n [ Var ( r n ) / n ] / [ E ( d n ) / n ] 2 and lim n n 1 Var ( u n E n u n + β 0 X n G n M n u n ) / [ plim ( d n / n ) ] 2 = lim n [ Var ( u n E n u n + β 0 X n G n M n u n ) / n ] / [ E ( d n ) / n ] 2 , respectively, for the (properly recentered) OLS estimator and the resulting II estimator. Their explicit expressions are given respectively in Theorems 1 and 2 to be introduced. Assumption 5(ii) implies that plim n ( d n / n ) = plim n ( y n W n M n W n y n / n ) exists and is nonzero. It can be shown (see Appendix A) that Var ( r n ) = Var ( u n M n G n u n ) + β 0 X n G n M n Σ n M n G n X n β 0 + 2 Cov ( β 0 X n G n M n Σ n M n G n X n β 0 , u n M n G n u n ) , where the covariance term disappears under normality, and Var ( u n E n u n + β 0 X n G n M n u n ) = Var ( u n E n u n ) + β 0 X n G n M n Σ n M n G n X n β 0 . When h n diverges, β 0 X n G n M n Σ n M n G n X n β 0 is the dominating term in Var ( r n ) as well as Var ( u n E n u n + β 0 X n G n M n u n ) . Then the usual condition that plim n n 1 Z n Σ n Z n exists and is nonsingular is sufficient for Assumption 6 to hold. When h n is bounded, a more precise characterization of a sufficient condition is not immediately obvious. Essentially, it requires, in addition to the existence and nonsingularity of plim n n 1 Z n Σ n Z n , the existence of lim n n 1 Var ( u n E n u n ) and lim n n 1 Var ( u n M n G n u n ) , where Var ( u n E n u n ) = O ( n / h n ) and Var ( u n M n G n u n ) = O ( n / h n ) .
5.
The use of observed, endogenous but non-simulated, variables within the binding function does not appear to be common. An interesting example is Gospodinov et al. (2017), where the authors used observed data within the binding function to hedge against misspecification bias. In their set-up of the autoregressive distributed lag model with a latent scalar predictor under the presence of measurement error, a similar technical difficulty exists regarding the invertibility condition of their binding function; they resorted to simulations to approximate the binding function, and the invertibility condition was then numerically verified based on the approximated binding function.
6.
This follows similarly from the proof of Proposition 2 in Lin and Lee (2010).
7.
Neither Lin and Lee (2010) nor Liu and Yang (2015) reported how the inference procedures based on their estimators would perform in finite samples.
8.
Each sub-figure contains 1000 lines, one for each simulated data set.
9.
The authors thank a referee for suggesting this comparison. Since one needs to concentrate out the scalar error variance instead of the nuisance matrix Σ n , the II procedure needs to be modified, see Appendix B.6.
10.
Very recently, Zhang and Yu (2018) and Lam and Souza (2019) proposed combining spatial weight matrices in recognition of possible misspecification of the weight matrix and Cheng et al. (2019) suggested combining a conservative GMM estimator based on valid moment conditions and an aggressive GMM estimator based on both valid and possibly misspecified moment conditions.
Figure 1. b n ( λ ) under variance structure V1 and parameter configuration P1, R = 100 .
Table 1. Estimation of spatial autoregressions (SAR) with variance structure V1 and parameter configuration P1.
           MQML                  GMM                   GMM(Σ_n)              II
R    θ_0   Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2   −0.009 0.067 3.6%    −0.015 0.073 5.8%      0.000 0.010 4.9%    −0.009 0.067 3.8%
     0.8    0.034 0.367 4.5%     0.050 0.377 5.4%      0.001 0.036 5.6%     0.033 0.367 4.7%
     0.2   −0.002 0.099 5.0%    −0.002 0.099 5.1%      0.000 0.008 5.5%    −0.002 0.099 5.0%
     1.5   −0.004 0.088 4.9%    −0.005 0.088 5.1%      0.000 0.007 4.1%    −0.004 0.088 4.9%
     0.6   −0.005 0.035 3.0%    −0.008 0.039 7.2%      0.000 0.005 5.3%    −0.005 0.035 4.1%
     0.8    0.032 0.366 3.8%     0.047 0.378 5.9%      0.001 0.037 5.6%     0.031 0.367 4.7%
     0.2   −0.001 0.102 5.5%    −0.001 0.102 5.5%      0.000 0.008 5.7%    −0.001 0.102 5.5%
     1.5    0.004 0.089 4.9%     0.004 0.089 4.9%      0.000 0.007 4.6%     0.004 0.089 4.9%
     0.9   −0.001 0.009 0.3%    −0.002 0.010 7.7%      0.000 0.001 6.1%    −0.001 0.009 4.3%
     0.8    0.022 0.367 4.0%     0.037 0.378 6.0%      0.003 0.040 6.5%     0.020 0.367 5.6%
     0.2   −0.002 0.102 4.4%    −0.002 0.102 4.4%      0.000 0.008 5.3%    −0.002 0.102 4.4%
     1.5    0.000 0.093 5.9%     0.000 0.094 6.0%      0.000 0.007 4.7%     0.000 0.094 6.1%
200  0.2    0.000 0.047 5.6%    −0.003 0.051 8.0%      0.000 0.007 5.3%     0.000 0.047 6.1%
     0.8   −0.005 0.255 5.1%     0.003 0.261 5.8%      0.001 0.026 6.9%    −0.005 0.255 5.2%
     0.2    0.002 0.071 5.8%     0.001 0.071 5.7%      0.000 0.006 5.4%     0.002 0.071 5.8%
     1.5    0.003 0.061 3.3%     0.002 0.061 3.4%      0.000 0.005 5.4%     0.003 0.061 3.3%
     0.6   −0.003 0.025 2.2%    −0.005 0.028 8.6%      0.000 0.004 5.1%    −0.003 0.025 5.0%
     0.8    0.021 0.256 4.1%     0.030 0.264 6.0%      0.000 0.026 5.2%     0.020 0.256 4.7%
     0.2   −0.001 0.069 4.5%    −0.001 0.069 4.6%      0.000 0.006 5.7%    −0.001 0.069 4.5%
     1.5    0.000 0.062 4.2%    −0.001 0.062 4.3%      0.000 0.005 4.9%     0.000 0.062 4.2%
     0.9   −0.001 0.006 0.7%    −0.001 0.007 8.3%      0.000 0.001 6.0%    −0.001 0.006 4.7%
     0.8    0.022 0.262 4.2%     0.030 0.269 6.7%      0.000 0.026 4.5%     0.021 0.262 5.7%
     0.2   −0.001 0.072 5.5%    −0.001 0.072 5.3%      0.000 0.005 4.8%    −0.001 0.072 5.5%
     1.5    0.000 0.063 3.8%     0.000 0.063 3.9%      0.000 0.005 4.4%     0.000 0.063 3.8%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, root mean squared error (RMSE), and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms are independent normal with mean zero. The variance structure is such that the error variance in the j-th group is m_j if m_j > 10 and 1/m_j^2 otherwise, j = 1, …, R.
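As a sketch of this simulation design (the function name and seed below are ours, not from the paper), the block-diagonal social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)] can be built as follows, treating m_j as a draw from the discrete uniform distribution on {3, …, 20}:

```python
import numpy as np

def social_interaction_W(group_sizes):
    """Block-diagonal W_n: within each group of size m, every unit
    places weight 1/(m-1) on each of its m-1 peers and 0 on itself."""
    n = sum(group_sizes)
    W = np.zeros((n, n))
    start = 0
    for m in group_sizes:
        W[start:start + m, start:start + m] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
        start += m
    return W

rng = np.random.default_rng(0)               # seed is arbitrary
group_sizes = rng.integers(3, 21, size=100)  # m_j ~ IID discrete U(3, 20), R = 100
W = social_interaction_W(group_sizes)
# W is row-normalized with a zero diagonal, as required of a SAR weight matrix
```

Each block is symmetric and row-stochastic, which is what makes the OLS estimator's behavior depend on the group sizes in the way the tables illustrate.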
Table 2. Estimation of SAR with variance structure V1 and parameter configuration P2.
           MQML                  GMM                   GMM(Σ_n)              II
R    θ_0   Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2   −0.010 0.076 5.5%    −0.014 0.078 5.9%     −0.003 0.049 5.4%    −0.009 0.076 5.7%
     0.2    0.007 0.335 5.7%     0.013 0.336 6.0%      0.005 0.056 5.8%     0.007 0.335 5.8%
     0.2    0.001 0.102 5.7%     0.001 0.102 5.7%     −0.001 0.008 6.3%     0.001 0.102 5.7%
     0.1    0.000 0.095 5.9%     0.000 0.096 5.9%      0.000 0.007 6.0%     0.000 0.095 5.9%
     0.6   −0.005 0.039 2.8%    −0.007 0.040 5.2%     −0.002 0.026 4.8%    −0.004 0.039 4.6%
     0.2   −0.004 0.330 4.1%     0.002 0.331 4.3%      0.005 0.058 5.3%    −0.004 0.330 4.2%
     0.2    0.005 0.100 4.3%     0.005 0.100 4.4%      0.000 0.008 6.4%     0.005 0.100 4.3%
     0.1    0.000 0.091 5.6%     0.000 0.091 5.7%      0.000 0.007 5.3%     0.000 0.091 5.6%
     0.9   −0.001 0.010 0.9%    −0.002 0.010 4.6%      0.000 0.006 3.9%    −0.001 0.010 5.0%
     0.2    0.013 0.335 5.2%     0.018 0.336 6.1%      0.001 0.055 4.1%     0.012 0.335 5.7%
     0.2   −0.001 0.102 5.2%    −0.001 0.102 5.4%      0.000 0.008 7.3%    −0.001 0.102 5.2%
     0.1    0.002 0.089 5.1%     0.002 0.089 5.1%      0.000 0.007 5.9%     0.002 0.089 5.1%
200  0.2   −0.007 0.054 5.4%    −0.009 0.055 5.9%     −0.005 0.034 5.6%    −0.007 0.054 5.9%
     0.2    0.013 0.231 4.7%     0.016 0.232 4.7%      0.006 0.040 6.0%     0.013 0.231 4.7%
     0.2   −0.002 0.071 4.3%    −0.002 0.071 4.4%      0.000 0.005 5.1%    −0.002 0.071 4.3%
     0.1    0.000 0.064 4.7%     0.000 0.064 4.7%      0.000 0.005 5.1%     0.000 0.064 4.7%
     0.6   −0.003 0.027 2.8%    −0.004 0.027 5.3%     −0.001 0.017 4.1%    −0.003 0.027 4.9%
     0.2   −0.001 0.233 5.2%     0.001 0.233 5.4%      0.002 0.039 4.4%    −0.002 0.233 5.2%
     0.2    0.003 0.071 4.7%     0.003 0.071 4.7%      0.000 0.005 5.1%     0.003 0.071 4.7%
     0.1    0.000 0.063 5.5%     0.000 0.063 5.5%      0.000 0.005 5.8%     0.000 0.063 5.5%
     0.9   −0.001 0.007 1.3%    −0.001 0.007 5.7%     −0.001 0.005 4.9%    −0.001 0.007 5.3%
     0.2    0.012 0.232 3.4%     0.015 0.233 4.2%      0.004 0.039 5.0%     0.012 0.232 4.1%
     0.2   −0.002 0.072 5.0%    −0.002 0.072 5.1%      0.000 0.006 5.5%    −0.002 0.072 5.0%
     0.1    0.002 0.063 5.0%     0.002 0.063 5.0%      0.000 0.005 6.7%     0.002 0.063 5.0%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms are independent normal with mean zero. The variance structure is such that the error variance in the j-th group is m_j if m_j > 10 and 1/m_j^2 otherwise, j = 1, …, R.
Table 3. Estimation of SAR with variance structure V2 and parameter configuration P1.
           MQML                  GMM                   GMM(Σ_n)              II
R    θ_0   Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2    0.000 0.014 4.6%     0.000 0.014 5.2%     −0.001 0.013 4.9%     0.000 0.014 5.0%
     0.8    0.001 0.047 3.6%     0.000 0.047 3.7%      0.003 0.042 3.7%     0.001 0.046 3.7%
     0.2    0.000 0.008 4.1%     0.000 0.008 4.1%      0.000 0.008 5.1%     0.000 0.008 4.2%
     1.5    0.000 0.008 5.3%     0.000 0.008 5.4%      0.000 0.007 4.6%     0.000 0.008 5.3%
     0.6    0.000 0.008 4.3%     0.000 0.008 4.9%     −0.001 0.007 4.3%     0.000 0.008 4.6%
     0.8    0.001 0.049 4.5%     0.002 0.049 5.7%      0.002 0.043 4.3%     0.001 0.049 5.5%
     0.2    0.000 0.008 5.3%     0.000 0.008 5.1%      0.000 0.008 4.9%     0.000 0.008 5.2%
     1.5    0.000 0.008 5.3%     0.000 0.008 5.4%      0.000 0.007 4.7%     0.000 0.008 5.3%
     0.9    0.000 0.002 5.1%     0.000 0.002 7.8%      0.000 0.002 5.9%     0.000 0.002 6.8%
     0.8    0.002 0.050 4.1%     0.002 0.050 5.8%      0.002 0.045 4.9%     0.002 0.049 4.6%
     0.2    0.000 0.008 5.4%     0.000 0.008 5.4%      0.000 0.008 5.6%     0.000 0.008 5.4%
     1.5    0.000 0.008 5.2%     0.000 0.008 5.2%      0.000 0.007 4.7%     0.000 0.008 5.0%
200  0.2    0.000 0.010 3.8%     0.000 0.010 4.7%      0.000 0.009 3.8%     0.000 0.010 4.4%
     0.8    0.001 0.033 4.6%     0.001 0.033 4.1%      0.001 0.030 2.6%     0.001 0.033 4.6%
     0.2    0.000 0.006 4.6%     0.000 0.006 4.6%      0.000 0.005 4.7%     0.000 0.006 4.6%
     1.5    0.000 0.005 4.9%     0.000 0.005 4.6%      0.000 0.005 5.5%     0.000 0.005 4.9%
     0.6    0.000 0.005 3.9%     0.000 0.006 5.4%      0.000 0.005 4.5%     0.000 0.005 4.6%
     0.8    0.000 0.034 3.7%    −0.001 0.034 5.0%      0.000 0.030 3.8%     0.000 0.034 3.9%
     0.2    0.000 0.006 5.9%     0.000 0.006 6.1%      0.000 0.006 5.1%     0.000 0.006 5.9%
     1.5    0.000 0.006 5.5%     0.000 0.006 5.4%      0.000 0.005 4.8%     0.000 0.006 5.6%
     0.9    0.000 0.001 3.9%     0.000 0.001 5.9%      0.000 0.001 5.4%     0.000 0.001 5.0%
     0.8   −0.001 0.035 4.3%    −0.002 0.035 5.4%      0.000 0.031 5.8%    −0.001 0.034 5.0%
     0.2    0.000 0.006 3.8%     0.000 0.006 3.9%      0.000 0.005 4.4%     0.000 0.006 3.8%
     1.5    0.000 0.006 5.1%     0.000 0.006 5.2%      0.000 0.005 4.6%     0.000 0.006 5.5%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms are independent normal with mean zero. The variance structure is such that the error variance in the j-th group is 1/m_j, j = 1, …, R.
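The two group-level heteroscedasticity rules used across Tables 1–4 can be sketched as follows (function names are ours; each returns the per-observation variance vector implied by the group sizes):

```python
import numpy as np

def v1_variances(group_sizes):
    """V1: error variance in group j is m_j when m_j > 10, else 1/m_j^2."""
    return np.concatenate([np.full(m, float(m) if m > 10 else 1.0 / m**2)
                           for m in group_sizes])

def v2_variances(group_sizes):
    """V2: error variance in group j is 1/m_j."""
    return np.concatenate([np.full(m, 1.0 / m) for m in group_sizes])

sigma2_v1 = v1_variances([4, 12])  # four entries of 1/16, then twelve entries of 12.0
sigma2_v2 = v2_variances([4, 12])  # four entries of 0.25, then twelve entries of 1/12
```

V1 makes the variance jump discontinuously with group size, while V2 shrinks it smoothly, which is why the two designs stress the heteroscedasticity-robust estimators differently.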
Table 4. Estimation of SAR with variance structure V2 and parameter configuration P2.
           MQML                  GMM                   GMM(Σ_n)              II
R    θ_0   Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2   −0.005 0.058 5.2%    −0.004 0.080 19.0%    −0.005 0.051 6.4%    −0.005 0.058 5.9%
     0.2    0.006 0.066 5.0%     0.006 0.086 15.4%     0.006 0.058 6.4%     0.006 0.066 6.7%
     0.2    0.000 0.008 4.8%     0.000 0.008 4.9%      0.000 0.008 5.2%     0.000 0.008 4.8%
     0.1    0.000 0.008 4.6%     0.000 0.008 4.6%      0.000 0.007 4.8%     0.000 0.008 4.6%
     0.6   −0.003 0.029 1.4%    −0.001 0.042 19.7%    −0.003 0.027 6.5%    −0.003 0.029 5.5%
     0.2    0.007 0.067 1.5%     0.003 0.091 17.9%     0.007 0.061 6.2%     0.006 0.067 5.2%
     0.2    0.000 0.008 3.7%     0.000 0.008 3.6%      0.000 0.008 4.2%     0.000 0.008 3.8%
     0.1    0.000 0.007 4.4%     0.000 0.007 4.0%      0.000 0.007 5.6%     0.000 0.007 4.4%
     0.9   −0.001 0.008 0.4%    −0.001 0.011 19.2%    −0.001 0.007 5.4%    −0.001 0.007 5.8%
     0.2    0.009 0.067 0.4%     0.008 0.093 17.8%     0.009 0.059 4.4%     0.007 0.065 5.0%
     0.2    0.000 0.009 4.8%     0.000 0.009 5.7%      0.000 0.008 6.1%     0.000 0.009 5.1%
     0.1    0.000 0.008 5.5%     0.000 0.008 5.7%      0.000 0.007 5.6%     0.000 0.008 5.6%
200  0.2   −0.003 0.039 5.4%    −0.004 0.058 20.1%    −0.004 0.035 5.9%    −0.003 0.039 6.0%
     0.2    0.005 0.045 4.9%     0.006 0.064 17.1%     0.005 0.041 6.4%     0.004 0.045 5.7%
     0.2   −0.001 0.006 5.3%    −0.001 0.006 5.6%      0.000 0.006 5.4%    −0.001 0.006 5.3%
     0.1    0.000 0.005 5.8%     0.000 0.005 5.7%      0.000 0.005 5.3%     0.000 0.005 5.8%
     0.6   −0.002 0.020 0.7%    −0.001 0.029 16.8%    −0.003 0.018 4.8%    −0.001 0.020 4.6%
     0.2    0.003 0.045 1.1%     0.002 0.062 15.6%     0.005 0.040 4.3%     0.002 0.044 4.5%
     0.2    0.000 0.006 5.0%     0.000 0.006 5.0%      0.000 0.005 4.9%     0.000 0.006 5.0%
     0.1    0.000 0.005 5.3%     0.000 0.005 5.2%      0.000 0.005 5.2%     0.000 0.005 5.1%
     0.9    0.000 0.005 0.1%     0.000 0.008 17.3%     0.000 0.004 5.4%     0.000 0.005 4.7%
     0.2    0.002 0.045 0.2%     0.002 0.064 15.9%     0.003 0.039 3.7%     0.002 0.044 3.9%
     0.2    0.000 0.006 4.4%     0.000 0.006 5.3%      0.000 0.006 4.6%     0.000 0.006 4.7%
     0.1    0.000 0.005 5.8%     0.000 0.005 5.8%      0.000 0.005 5.1%     0.000 0.005 5.9%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms are independent normal with mean zero. The variance structure is such that the error variance in the j-th group is 1/m_j, j = 1, …, R.
Table 5. Estimation of SAR under homoscedasticity with parameter configuration P1.
           QML                    GMM                   II
R    θ_0   Bias   RMSE  P(5%)     Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2    0.014 0.043 19.0%    −0.007 0.040 7.5%     −0.003 0.037 5.6%
     0.8   −0.032 0.146 18.5%     0.025 0.144 6.7%      0.013 0.138 6.3%
     0.2   −0.002 0.029 15.9%    −0.002 0.029 6.1%     −0.002 0.029 6.1%
     1.5   −0.001 0.026 17.7%    −0.002 0.026 6.1%     −0.001 0.026 5.9%
     0.6    0.026 0.032 34.1%    −0.004 0.020 5.4%     −0.002 0.019 5.2%
     0.8   −0.137 0.188 26.5%     0.022 0.137 5.4%      0.010 0.131 4.7%
     0.2   −0.001 0.029 17.3%    −0.001 0.029 5.1%      0.000 0.029 5.1%
     1.5   −0.005 0.026 15.3%     0.000 0.026 4.5%      0.001 0.026 4.4%
     0.9    0.011 0.011 51.3%    −0.001 0.005 5.2%      0.000 0.005 4.1%
     0.8   −0.213 0.248 39.9%     0.022 0.140 5.5%      0.009 0.134 5.0%
     0.2   −0.002 0.028 14.5%    −0.001 0.028 4.3%     −0.001 0.028 3.9%
     1.5   −0.011 0.029 19.8%     0.001 0.027 5.1%      0.001 0.027 4.7%
200  0.2    0.017 0.032 23.0%    −0.003 0.026 5.6%     −0.001 0.025 4.3%
     0.8   −0.044 0.103 18.6%     0.009 0.094 5.0%      0.003 0.090 3.6%
     0.2   −0.001 0.020 15.7%    −0.001 0.020 4.1%     −0.001 0.020 4.2%
     1.5    0.001 0.017 14.1%     0.001 0.017 4.6%      0.001 0.017 4.9%
     0.6    0.028 0.030 62.1%    −0.002 0.014 5.3%     −0.001 0.013 4.2%
     0.8   −0.145 0.172 47.1%     0.009 0.096 4.8%      0.003 0.093 3.8%
     0.2    0.000 0.021 18.4%     0.000 0.021 4.5%      0.000 0.021 4.5%
     1.5   −0.005 0.019 18.1%     0.000 0.018 5.3%      0.001 0.018 4.8%
     0.9    0.011 0.011 85.1%     0.000 0.004 6.3%      0.000 0.003 4.1%
     0.8   −0.220 0.237 68.7%     0.009 0.097 5.9%      0.004 0.093 5.0%
     0.2   −0.002 0.021 17.7%     0.000 0.021 4.5%      0.000 0.021 4.4%
     1.5   −0.014 0.023 25.0%    −0.002 0.019 4.6%     −0.001 0.019 4.5%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms follow a standard normal distribution.
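One draw from the homoscedastic design in Tables 5 and 6 comes from the SAR reduced form y = (I_n − λW_n)^{-1}(X_nβ + ε_n). The sketch below (our notation; a single group is used for brevity, and the seed is arbitrary) illustrates this with the middle block of configuration P1:

```python
import numpy as np

rng = np.random.default_rng(1)

m = 10                                # one group of size m for illustration
W = (np.ones((m, m)) - np.eye(m)) / (m - 1)

lam = 0.6                             # spatial parameter (middle block of P1)
beta = np.array([0.8, 0.2, 1.5])      # coefficients on (1, x_i1, x_i2)
X = np.column_stack([np.ones(m),
                     rng.normal(3.0, 1.0, m),    # x_i1 ~ IID N(3, 1)
                     rng.uniform(1.0, 2.0, m)])  # x_i2 ~ IID U(1, 2)
eps = rng.standard_normal(m)          # homoscedastic standard normal errors

# reduced form: y solves (I - lam * W) y = X beta + eps
y = np.linalg.solve(np.eye(m) - lam * W, X @ beta + eps)
```

Solving the linear system directly, rather than inverting I − λW, is the standard numerically stable way to generate y in such experiments.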
Table 6. Estimation of SAR under homoscedasticity with parameter configuration P2.
           QML                    GMM                   II
R    θ_0   Bias   RMSE  P(5%)     Bias   RMSE  P(5%)    Bias   RMSE  P(5%)
100  0.2    0.041 0.078 13.3%    −0.007 0.056 5.7%     −0.004 0.055 5.6%
     0.2   −0.037 0.121 14.7%     0.017 0.112 5.3%      0.011 0.109 5.1%
     0.2   −0.002 0.029 16.2%    −0.004 0.029 5.9%     −0.002 0.029 6.0%
     0.1    0.000 0.026 18.0%    −0.001 0.026 6.6%      0.000 0.026 6.4%
     0.6    0.072 0.076 29.7%    −0.004 0.029 4.6%     −0.002 0.028 4.3%
     0.2   −0.149 0.181 25.9%     0.007 0.108 5.3%      0.001 0.107 5.6%
     0.2   −0.001 0.029 16.4%    −0.001 0.029 5.2%      0.001 0.029 5.2%
     0.1    0.000 0.026 18.2%     0.000 0.027 6.6%      0.001 0.027 6.2%
     0.9    0.026 0.027 48.5%    −0.001 0.008 5.3%     −0.001 0.008 5.3%
     0.2   −0.214 0.235 38.4%     0.010 0.110 5.3%      0.004 0.109 4.9%
     0.2   −0.003 0.028 15.8%     0.000 0.028 4.0%      0.001 0.028 4.2%
     0.1   −0.002 0.025 15.2%     0.000 0.026 5.1%      0.001 0.026 4.9%
200  0.2    0.046 0.064 23.3%    −0.004 0.038 4.6%     −0.003 0.038 4.5%
     0.2   −0.046 0.093 18.7%     0.009 0.079 5.3%      0.005 0.078 4.7%
     0.2   −0.001 0.021 17.5%    −0.002 0.021 6.0%     −0.001 0.021 5.6%
     0.1    0.000 0.018 15.8%     0.000 0.018 4.3%      0.000 0.018 4.0%
     0.6    0.074 0.076 70.7%    −0.002 0.020 4.8%     −0.001 0.020 4.6%
     0.2   −0.152 0.170 49.3%     0.004 0.081 6.3%      0.000 0.079 5.9%
     0.2   −0.002 0.021 17.6%    −0.001 0.021 5.0%      0.001 0.021 5.0%
     0.1    0.000 0.019 17.8%     0.000 0.019 6.1%      0.001 0.019 6.4%
     0.9    0.027 0.027 95.2%    −0.001 0.005 4.6%      0.000 0.005 4.1%
     0.2   −0.218 0.229 72.6%     0.005 0.081 6.0%      0.001 0.079 5.2%
     0.2   −0.003 0.020 15.3%     0.000 0.020 4.8%      0.001 0.020 4.8%
     0.1   −0.001 0.018 15.0%     0.000 0.018 4.2%      0.001 0.018 3.9%
The weight matrix is a social interaction matrix W_n = I_R ⊗ [(1_{m_j} 1'_{m_j} − I_{m_j})/(m_j − 1)], with group sizes m_j ~ IID U(3, 20), j = 1, …, R. Reported for each estimator are the bias, RMSE, and the empirical size of the 5% t test of each parameter being equal to its true value, from 1000 simulations. The exogenous regressors are (1, x_i1, x_i2), where x_i1 ~ IID N(3, 1), x_i2 ~ IID U(1, 2), and x_i1 is independent of x_i2. The error terms follow a standard normal distribution.

Share and Cite

Bao, Y.; Liu, X.; Yang, L. Indirect Inference Estimation of Spatial Autoregressions. Econometrics 2020, 8, 34. https://0-doi-org.brum.beds.ac.uk/10.3390/econometrics8030034