Next Article in Journal
Time-Decay Estimates for Linearized Two-Phase Navier–Stokes Equations with Surface Tension and Gravity
Next Article in Special Issue
Volatility Modeling: An Overview of Equity Markets in the Euro Area during COVID-19 Pandemic
Previous Article in Journal
Study of Reversible Platelet Aggregation Model by Nonlinear Dynamics
Previous Article in Special Issue
Functional Modeling of High-Dimensional Data: A Manifold Learning Approach
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

New Robust Cross-Variogram Estimators and Approximations of Their Distributions Based on Saddlepoint Techniques

by
Alfonso García-Pérez
Departamento de Estadística, I.O. y C.N., Universidad Nacional de Educación a Distancia (UNED), Paseo Senda del Rey 9, 28040 Madrid, Spain
Submission received: 20 January 2021 / Revised: 24 March 2021 / Accepted: 28 March 2021 / Published: 1 April 2021

Abstract

:
Let Z ( s ) = ( Z 1 ( s ) , , Z p ( s ) ) t be an isotropic second-order stationary multivariate spatial process. We measure the statistical association between the p random components of Z with the correlation coefficients and measure the spatial dependence with variograms. If two of the Z components are correlated, the spatial information provided by one of them can improve the information of the other. To capture this association, both within components of Z ( s ) and across s , we use a cross-variogram. Only two robust cross-variogram estimators have been proposed in the literature, both by Lark, and their sample distributions were not obtained. In this paper, we propose new robust cross-variogram estimators, following the location estimation method instead of the scale estimation one considered by Lark, thus extending the results obtained by García-Pérez to the multivariate case. We also obtain accurate approximations for their sample distributions using saddlepoint techniques and assuming a multivariate-scale contaminated normal model. The question of the independence of the transformed variables to avoid the usual dependence of spatial observations is also considered in the paper, linking it with the acceptance of linear variograms and cross-variograms.

1. Introduction and Notation

Spatial dependence is described by a variogram in the univariate case. If there is another variable correlated with the variable of interest and we want to use its spatial information, we have to use a cross-variogram, thus extending the univariate analysis to the multivariate case.
Formally, let Z ( s ) = ( Z 1 ( s ) , , Z p ( s ) ) t , s D be an isotropic second-order stationary multivariate spatial process, with D being a fixed subset of R d , assuming that each component Z i , i = 1 , , p , has an expectation and variance constant, i.e., they do not depend on the location s . We also assume that the covariance between two observations depends only on the distance that separates them and not on the spatial locations.
In addition, we admit that each component possesses a variogram
2 γ i i ( h ) = v a r ( Z i ( s + h ) Z i ( s ) ) , s , s + h D
where v a r is the variance.
We measure the statistical association between the random components of Z with the correlation coefficients and the spatial dependence in each component with the variograms. To capture the association both within components of Z ( s ) and across s , the cross-variogram is defined as ([1], p. 67, or [2], p. 229)
2 γ i j ( h ) = c o v Z i ( s + h ) Z i ( s ) , Z j ( s + h ) Z j ( s ) = E ( Z i ( s + h ) Z i ( s ) ) · ( Z j ( s + h ) Z j ( s ) )
s , s + h D , where c o v means covariance and E means mathematical expectation.
This definition is for collocated data, i.e., assuming that each location (site) has all variables Z i measured, a situation that we assume all over the paper.
The results refer to the pair ( i , j ) , i.e., to a generic pair of components Z i , Z j of the vector Z ( s ) = ( Z 1 ( s ) , , Z p ( s ) ) t .
Let us also assume that we have a sample of Z ( s ) at m locations s 1 , , s m , obtaining m (p-dimensional) observations Z ( s 1 ) , , Z ( s m ) . Hence, the data matrix is a m × p matrix where the ( l , j ) th element is the observation of component Z j at location s l .
The definition of new robust estimators against outliers of the cross-variogram and their sample distributions are the aims of this paper.
Until now, there were only two robust estimators previously defined by [3]. This author considered the covariance estimation method to obtaining two somewhat weird and difficult to apply estimators. Here, we consider the location estimation method, extending the idea considered first in [4] and followed in [5] to the multivariate case.
To do this, we start with the classical (non-robust) method-of-moments estimator, defined as
2 γ i j ^ ( h ) = 1 N h l = 1 N h ( Z i ( s l + h ) Z i ( s l ) ) · ( Z j ( s l + h ) Z j ( s l ) )
with the sample size being n = N h and where the cardinality of N ( h ) = { ( s l 1 , s l 2 ) : s l 1 s l 2 = h } .
It is usually assumed that spatial data follow a normal distribution, but this is unrealistic because, in practice, they are contaminated by occasional outliers. For this reason, we assume in the paper a model close to the normal, i.e., a normal-like model in the central region but with heavier tails than the normal, namely, a multivariate-scale-contaminated normal distribution with joint probability density function (pdf):
f M ( z ) = f M ( z 1 , , z p ) = ( 1 ϵ ) f N ( z ; μ , Σ ) + ϵ f N ( z ; μ , g 2 Σ )
where ϵ ( 0 , 1 ) ; g > 1 ; f N ( z ; μ , Σ ) denotes the pdf of a p-variate normal random vector with mean vector μ = ( μ 1 , , μ p ) and covariance matrix Σ , a matrix with values σ i 2 in its diagonal, i = 1 , , p .
In this framework, ϵ represents the small proportion of outliers in the sample (e.g., the proportion of extreme weather events affecting Z at all locations) and g represents the extent of the contamination. If ϵ = 0 or g = 1 , this model reduces to the multivariate normal distribution and, if ϵ > 0 and g > 1 , it resembles the normal in the central part but with heavier tails.
This is the usual way in which robust statistics handles the nonnormality of the data: establishing a neighborhood of the standard model distribution, the contamination neighborhood, inside which the underlying model is located (e.g., [6,7,8], p. 12, or [9], p. 870).
From this joint distribution, the marginal distributions of the Z i are the univariate scale contaminated normal models:
( 1 ϵ ) N ( μ i , σ i 2 ) + ϵ N ( μ i , g 2 σ i 2 ) .
The paper is organized as follows. In Section 2, we consider a consecutive pair of transformations of the initial observations to avoid their dependence. With these, we can use standard techniques for independent and identically distributed (iid) random variables. We also obtain in that section the distribution of these new variables. Here, we have a remarkable difference with respect to the paper by [5]: there, the transformed variables were the square of standard normal variables, i.e., χ 2 distributed random variables, but here, we have the product of two different normal variables.
In Section 3, cross-variogram M-estimators based on the new variables are defined. The von Mises plus saddlepoint (VOM+SAD) approximations for their distributions are also obtained, approximations that are applied to the classical method-of-moment estimator in Section 4. This is the first time that a closed form approximation of its distribution is obtained. Simulations of approximation accuracy and lack of robustness of its distribution are included.
In Section 5, we define the α -trimmed cross-variogram estimator and we obtain the VOM+SAD approximation for its distribution. We do the same for the Huber’s cross-variogram estimator in Section 6. We include here a simulation study to compare the robustness of the three estimators as we increase the degree of contamination.
Section 7 is devoted to analyzing the dependence of the transformed variables on the linearized cross-variogram models. We conclude the paper with two examples of real data.
Finally, in Section 8, we give some conclusions, ending the paper with an Appendix, which contains the technical details obtained in the paper.

2. Preliminary Transformation

The usual dependence between spatial observations Z does not allow for the use of techniques for iid variables. Nevertheless, it is possible to skip this restriction by transforming the initial observations Z .
Namely, let us define the gap or lag variable W s i as
W s i = W s i ( h ) = Z i ( s + h ) Z i ( s ) .
The cross-variogram is now
2 γ i j ( h ) = E W s i · W s j
the mean of the product, and its classical estimator, the method-of-moments estimator,
2 γ i j ^ ( h ) = 1 N h l = 1 N h W s l i · W s l j
The sample mean of the variables X l = W s l i · W s l j , l = 1 , , n , is non-robust then.
This is the reason why we say that we use the location estimation way: the parameter is the mean, and the classical estimator is the sample mean. In this manner, instead of considering a weird estimator for a strange parameter of the initial distribution, we propose to transform the original (and usually dependent) observations Z l into new data X l (independent under some conditions) obtaining a natural parameter of the new variable (its mean) for which a manageable estimator (the sample mean) should be feasible. Then, standard techniques of robustification can be applied.
This idea has been successfully applied, first, in [4] and in [5].
An important problem is determining the distribution of this new variable X l from the original normal (or contaminated normal) distribution of Z l to later obtain the distribution of the robust estimators obtained, where X l is now the product of two different normal variables.

2.1. Correlation between W t i and W s j

First, let us define two new functions that are natural extensions of the similar ones associated with the variogram.
Let us call cross-covariogram between Z i and Z j to the function (provided it is well defined)
C C i j ( | a b | ) = c o v ( Z i ( a ) , Z j ( b ) )
that will be equal to E [ Z i ( a ) · Z j ( b ) ] μ i · μ j .
Here, a will be t or t + h and b will be s or s + h, and thus, μ i = E [ Z i ( t + h ) ] = E [ Z i ( t ) ] and μ j = E [ Z j ( s + h ) ] = E [ Z j ( s ) ] , where the equality between the expectations is obtained because of the intrinsic stationary property of the components of Z.
Analogously, we assume the equality of the variances in locations that are distanced by a lag h, σ i 2 = V [ Z i ( t + h ) ] = V [ Z i ( t ) ] , and σ j 2 = V [ Z j ( s + h ) ] = V [ Z j ( s ) ] .
Let us also define the cross-correlogram as
ρ i j ( | h | ) = C C i j ( | h | ) σ i · σ j
Now, the covariance between W t i and W s j will be (see the Appendix A for details)
c o v W t i , W s j = σ i σ j 2 ρ i j ( | t s | ) ρ i j ( | t s + h | ) ρ i j ( | t s h | ) .
Thus, the correlation between W t i and W s j will be zero if
2 ρ i j ( | t s | ) ρ i j ( | t s + h | ) ρ i j ( | t s h | ) = 0 .
Because locations are fixed in advance (for instance, they could be sample stations) we assume that they are equally spaced on a transect, for instance, in Figure 2.1 of [1], i.e., they are data on a regular grid. Hence, we can match two contiguous Z i (for which the dependence is supposed to be the strongest), so that it is t + h = s.
Now, the previous condition of correlation equal to zero is obtained if
2 ρ i j ( h ) ρ i j ( 0 ) ρ i j ( 2 h ) = 0
or, in terms of the cross-covariogram, when
2 C C i j ( h ) C C i j ( 0 ) C C i j ( 2 h ) = 0 .
On the other hand, with a little of algebra, the cross-variogram can be expressed as (see Appendix A for details)
2 γ i j ( h ) = 2 C C i j ( 0 ) C C i j ( h )
i.e.,
C C i j ( h ) = C C i j ( 0 ) γ i j ( h )
and then, it will be
C C i j ( 2 h ) = C C i j ( 0 ) γ i j ( 2 h ) .
Replacing these values of C C i j ( h ) and C C i j ( 2 h ) in (2), we obtain
2 C C i j ( 0 ) γ i j ( h ) C C i j ( 0 ) C C i j ( 0 ) γ i j ( 2 h ) = 0
i.e., the correlation between W t i and W s j will be 0 when
γ i j ( 2 h ) = 2 γ i j ( h )
i.e., if a linear cross-variogram can be accepted as model (because, theoretically, the nugget is 0).
Remark 1.
The increments W t i and W s j have as joint cumulative distribution function, if they are uncorrelated,
P ( 1 ϵ ) f N 1 + ϵ f N 2 W t i x , W s j y = ( 1 ϵ ) P f N 1 W t i x , W s j y + ϵ P f N 2 W t i x , W s j y = ( 1 ϵ ) P f N 1 W t i x P f N 1 W s j y + ϵ P f N 2 W t i x P f N 2 W s j y
Hence, if W t i and W s j are uncorrelated, with probability 1 ϵ , they are independent under model f N 1 and, with probability ϵ , they are independent under model f N 2 , being a mixture of independent variables. For this reason, these variables are considered in the paper as independent if they are uncorrelated, following the idea of [4].

2.2. Independence of the Observations X s

The method-of-moments estimator 2 γ i j ^ ( h ) was expressed as the sample mean of the variables X l = W s l i · W s l j , l = 1 , , n . Considering only two of them, X 1 = W s 1 i · W s 1 j and X 2 = W s 2 i · W s 2 j , if we can accept a linear variogram for the variable Z i and a linear variogram for the variable Z j , it was proved in [5] that W s 1 i will be independent of W s 2 i and that W s 1 j will be independent of W s 2 j , l = 1 , , n .
If, additionally, we can accept a linear cross-variogram for the couple ( Z i , Z j ) , the variables W s 1 i and W s 2 j , and W s 1 j and W s 2 i will be independent.
As a conclusion, if we could accept a linear variogram for the variable Z i , a linear variogram for the variable Z j , and a linear cross-variogram for this pair, the variables X l = W s l i · W s l j , l = 1 , , n , could be considered independent, a situation that we assume in the paper and to which we shall return later.

2.3. Distribution of the Transformed Variables

Therefore, the initial observations Z i , Z j , normal or contaminated normal distributed, are transformed into the lag variables W s i , W s j and, finally, into their product X s = W s i · W s j . The reason for this transformation is to express the classical estimator as a sample mean of independent variables (if linear variograms and cross-variogram can be accepted), obtaining a nice mathematical expression for the estimator, very useful in the definition of new robust estimators of of location and in the determination of its sample distribution, thanks to this location estimation way.
The problem is that, although, initially, the Z i are contaminated normal variables, after two transformations, we do not have normality in X s . In what follows, we obtain their distributions.
Proposition 1.
(a) If Z i N ( μ i , σ i 2 ) , then W s i N 0 , 2 γ i i ( h ) .
(b) If Z i ( 1 ϵ ) N ( μ i , σ i 2 ) + ϵ N ( μ i , g 2 σ i 2 ) , then W s i ( 1 ϵ ) N 0 , 2 γ i i ( h ) + ϵ N 0 , g 2 2 γ i i ( h ) .
(Proof in the Appendix A).
To obtain the distribution of 2 γ i j ^ ( h ) we use two results from Nadarajah and Pongány (2016).
Proposition 2.
([10], p. 202, Theorems 2.1 and 2.2)
(a) Let ( V 1 , V 2 ) denote a bivariate normal random vector with zero means, unit variances, and correlation coefficient ρ. Then, the pdf of X = V 1 · V 2 is
p X ( x ) = 1 π 1 ρ 2 exp ρ 1 ρ 2 x K 0 1 1 ρ 2 | x |
< x < , where K 0 is the modified Bessel function of the second-order zero.
(b) If X 1 , , X n ( n 2 ) is a random sample of X = V 1 · V 2 , the pdf of their sample mean X ¯ is
p X ¯ ( x ) = n ( n + 1 ) / 2 2 ( 1 n ) / 2 | x | ( n 1 ) / 2 π 1 ρ 2 Γ ( n / 2 ) exp a 1 a 2 2 x K 1 n 2 a 1 + a 2 2 | x |
< x < , where a 1 = n / ( 1 ρ ) , a 2 = n / ( 1 + ρ ) , and K b is the modified Bessel function of the second-order b.
Thus, if ( Z i , Z j ) is a bivariate scale contaminated normal variable with distribution
( 1 ϵ ) N ( μ , Σ ) + ϵ N ( μ , g 2 Σ ) = ( 1 ϵ ) N 1 + ϵ N 2
the variable ( W s i , W s j ) will be a bivariate scale contaminated normal variable with distribution
( 1 ϵ ) N ( 0 , Σ c ) + ϵ N ( 0 , g 2 Σ c )
where, in Σ c , the two elements of the diagonal are V W s i = 2 γ i i ( h ) and V W s j = 2 γ j j ( h ) and the correlation coefficient between W s i and W s j is
ρ i j ( h ) = 2 γ i j ( h ) 2 γ i i ( h ) 2 γ j j ( h )
equal to the correlation coefficient between W s i / 2 γ i i ( h ) and W s j / 2 γ j j ( h ) , usually shortened as ρ in the rest of the paper. Hence, it will be
2 γ i i ( h ) 2 γ j j ( h ) · ρ = 2 γ i j ( h ) .
The distribution of X s = W s i · W s j is
F ( x ) = P X s x = ( 1 ϵ ) P N 1 X s x + ϵ P N 2 X s x = ( 1 ϵ ) P N 1 X s 2 γ i i ( h ) 2 γ j j ( h ) x 2 γ i i ( h ) 2 γ j j ( h ) + ϵ P N 2 X s g 2 2 γ i i ( h ) 2 γ j j ( h ) x g 2 2 γ i i ( h ) 2 γ j j ( h ) = ( 1 ϵ ) P X x / ( 2 γ i i ( h ) 2 γ j j ( h ) ) + ϵ P X x / ( g 2 2 γ i i ( h ) 2 γ j j ( h ) ) = ( 1 ϵ ) G ( x ) + ϵ H ( x )
where P X is the cumulative distribution function for which the pdf is given by (3). The last equality, (5), is used as a notation.

3. Cross-Variogram M -Estimators

Because the method-of-moments estimator is the sample mean of the transformed variables X s , this estimator is robustified as it is the sample mean, but here, the model distribution of the observations is somewhat peculiar, with the computations being more elaborated.
Firstly, we define a large class of cross-variogram estimators for which their robustness can be controlled. We call cross-variogram M-estimators, with score function ψ : X × Θ I R , to the solution of the equation:
s = 1 n ψ ( X s , T n ) = 0
where X s are the variables previously considered and we assume that ψ ( x , θ ) is monotonically decreasing in θ for all x. In fact, T n is an estimator for a location problem, with ψ ( x , θ ) being of the form ψ ( x θ ) , with ψ ( u ) monotonically increasing in u, [11].
We can control the robustness of the cross-variogram M-estimators, choosing a bounded score function. Other robustness properties, such us the breakdown point, can also be applied to this class of estimators.

3.1. Von Mises Approximation for their Distributions

If T n ( X 1 , , X n ) is an estimator where F is the underlying model distribution of the observations, the tail probability P F { T n > t } can be expressed at another model G using the von Mises expansion as [12,13,14]:
P F { T n > t } = P G { T n > t } + TAIF x ; t ; T n , G d F ( x ) + O | | F G | | 2
where TAIF x ; t ; T n , G is Hampel’s influence function of the tail probability functional, called tail area influence function (15] and defined as
TAIF x ; t ; T n , G = ϵ P G ϵ , x { T n > t } ϵ = 0
for all x R where the right-hand side exists.
This influence function is calculated by changing the underlying model G using a contaminated model ( 1 ϵ ) G + ϵ δ x before computing the first derivative at ϵ = 0 , with δ x being the distribution that assigns mass 1 at x.
If distributions F and G are close enough, we can use the von Mises approximation (VOM)
P F { T n > t } P G { T n > t } + TAIF x ; t ; T n , G d F ( x )
to compute the distribution of T n under the underlying model F using model G.
In particular, if F is a mixture F = ( 1 ϵ ) G + ϵ H the von Mises expansion is
P F { T n > t } = P G { T n > t } + ϵ TAIF x ; t ; T n , G d H ( x ) + O ( ϵ 2 )
because TAIF x ; t ; T n , G d G ( x ) = 0 . The von Mises approximation (7) will be then
P F { T n > t } P G { T n > t } + ϵ TAIF x ; t ; T n , G d H ( x ) .
Distribution G plays an important role in the VOM approximation because we can choose it such that we know the tail probability of the leading term, P G { T n > t } . Distribution G is called the pivotal distribution, and let us observe that TAIF is also computed for this pivotal distribution.

3.2. Saddlepoint Approximation of the TAIF

In order to use von Mises approximation (8) for location M-estimators, we compute a saddlepoint approximation (SAD) of the TAIF x ; t ; T n , G , using Lugannani and Rice’s formula, [16] ([17], p. 77, or better, [8], p. 314). We use the approximation given in [11] for M-estimators and, following the same computations as that in [18], pp. 402–404, we have that
TAIF x ; t ; T n , G = ϕ ( s ) r 1 n 1 / 2 e z 0 ψ ( x , t ) e z 0 ψ ( y , t ) d G ( y ) 1 + O ( n 1 / 2 )
where ϕ is the density function of the standard normal distribution, and s and r 1 are the functionals
s = 2 n K ( z 0 , t )
r 1 = z 0 K ( z 0 , t )
with
K ( λ , t ) = log e λ ψ ( y , t ) d G ( y )
being the cumulant generating function of distribution G; K ( λ , t ) being the second partial derivative of K ( λ , t ) with respect to the first variable λ ; and z 0 being the saddlepoint, i.e., the solution of the saddlepoint equation
K ( z 0 , t ) = e z 0 ψ ( y , t ) ψ ( y , t ) d G ( y ) = 0 .
Replacing the SAD approximation (9) in the VOM approximation (8), we obtain the VOM+SAD approximation for the distribution of the M-estimator T n ( X 1 , , X n ) , assuming that X i F = ( 1 ϵ ) G + ϵ H ,
P F { T n > t } P G { T n > t } + ϵ ϕ ( s ) r 1 n e z 0 ψ ( x , t ) d H ( x ) e z 0 ψ ( y , t ) d G ( y ) 1
which is the approximation that we use in what follows and where G and H are the distributions that appear in (5).
The VOM+SAD approximation will be accurate if distributions F and G are close. Nevertheless, if this is not the case, we can use an iterative procedure, as in [19,20,21], considering intermediate distributions between F and G.

4. Sample Distribution of the Method-of-Moments Estimator

Not all the cross-variogram M-estimators are robust. For instance, the classical method-of-moment estimator 2 γ i j ^ ( h ) is not robust because its score function ψ ( u ) = u is not bounded. Nevertheless, we compute its VOM+SAD approximation to show its lack of robustness next and because its distribution will be useful in the determination of the distribution of some robust versions of it.
Due to 2 γ i j ^ ( h ) being an M-estimator with score function ψ ( x θ ) = x t , we can use approximation (10). Its leading term is computed with respect to distribution G ( x ) = P X x / ( 2 γ i i ( h ) 2 γ j j ( h ) ) , where P X is the cumulative distribution function for which the pdf is p X , given by (3) in Proposition 2.
Thus, the leading term in (10) is
P G { 2 γ i j ^ ( h ) > t } = P 1 N h s = 1 N h X s > t = P G 1 N h s = 1 N h W s i W s j > t = P G 1 N h s = 1 N h W s i 2 γ i i ( h ) W s j 2 γ j j ( h ) > t 2 γ i i ( h ) 2 γ j j ( h ) = d p X ¯ ( x ) d x
where d = t / ( 2 γ i i ( h ) 2 γ j j ( h ) ) and where p X ¯ is the pdf given by (4) because, now, the previous tail probability is the tail probability of the sample mean of the product of two standard normal distributions.
The rest of the elements in approximation (10) essentially depend on the cumulant generating function of distribution G and are described in the Appendix. All of them are very easy to program with R. They are computed in the Supplementary Materials available on the website. https://www2.uned.es/pea-metodos-estadisticos-aplicados/cross-variogram.htm (accessed on 22 February 2021).

4.1. Performance of the Theoretical Results with Simulations

We can see how accurate the VOM+SAD approximation is for the method-of-moments estimator with a simulation study, considering a sample size as small as n = 3 . We considered a bivariate normal distribution with mean vector ( 0 , 0 ) and covariance matrix such that 0.5 2 and 0.7 2 are the marginal variances and 0.3 the covariance for ( W s i , W s j ) . We consider four different situations: no contamination, contamination ϵ = 0.05 , contamination ϵ = 0.1 , and contamination ϵ = 0.2 .
Under these conditions, we obtain Figure 1 in which we appreciate that the approximations are very good, especially in the tails, which are the areas of interest for tests and confidence intervals.
We include in Table 1 some values of Figure 1 (see the Supplementary Materials, p. 6): values of the VOM+SAD approximation and exact ones obtained with the simulation.
If we compute from this table the relative errors of the approximation, in %, defined as usual (see, for instance, [22]) as
100 × | E x a c t A p p r o x i m a t i o n | 1 E x a c t
we obtain Table 2, showing extremely low relative errors in the approximations. This is one of the advantages of saddlepoint approximations, [14].

4.2. Robustness of the Method-of-Moments-Estimator

We can observe the lack of robustness of the distribution of the method-of-moments-estimator in Figure 2 as we increase ϵ or g.
The programs in R, used to obtain this figure, are in the Supplementary Materials.
Remark 2.
The sample size N h , considered in each estimation, depends on the value of the lagh, that is fixed in advance. Ifhis small, the number of lags will be large and N h will be small. The VOM+SAD approximations obtained in the paper are very accurate, even in this case.
Nevertheless, ifhis large, the number of lags will be small and the sample size N h will be large. In this case, it is easier to compute the leading term as
P G { 2 γ i j ^ ( h ) > t } = P 1 N h s = 1 N h X s > t = P G 1 N h s = 1 N h W s i 2 γ i i ( h ) W s j 2 γ j j ( h ) > t 2 γ i i ( h ) 2 γ j j ( h )
using the central limit theorem because
W s i 2 γ i i ( h ) W s j 2 γ j j ( h )
is the product of two standard normal variables with correlation coefficient ρ . The characteristic function of this product is (expression (4) in [10])
φ ( u ) = [ 1 i ( 1 + ρ ) u ] 1 / 2 [ 1 + i ( 1 ρ ) u ] 1 / 2
and then, the mean of this product variable X s is φ ( 0 ) / i = ρ and the second moment about the origin is φ ( 0 ) / i 2 = 1 + 2 ρ 2 . Hence, the variance will be 1 + ρ 2 and the leading term can be computed if N h is large, as
P G { 2 γ i j ^ ( h ) > t } 1 Φ t 2 γ i i ( h ) 2 γ j j ( h ) ρ N h 1 + ρ 2 .
Since, if ϵ = 0 or g = 1 , the scale contaminated normal distribution is just a normal distribution, this last expression is an approximation for the distribution of the classical method-of-moments estimator under the usual underlying normal distribution model.

5. α -Trimmed Cross-Variogram Estimator

Another robust estimator for the cross-variogram, which is not an M-estimator, can be obtained by trimming the X s observations as follows:
Considering the initial pair of variables Z i and Z j , and transforming them to the couple W s i = Z i ( s + h ) Z i ( s ) and W s j = Z j ( s + h ) Z j ( s ) and finally to the product X s = W s i · W s j , if we trim the 100 · α % of the smallest and the 100 · α % of the largest ordered data X ( i ) , the (symmetrically) sample α-trimmed cross-variogram estimator is defined as
2 γ i j ^ α ( h ) = 1 N h 2 r X ( r + 1 ) + + X ( N h r ) = X ¯ α
where r = [ N h α ] if [ . ] stands for the integer part.
To obtain an approximation for its sample distribution, we use an accurate VOM+SAD approximation obtained in [21]. From Corollary 1 therein, we can approximate the small sample distribution of the sample α -trimmed cross-variogram 2 γ i j ^ α ( h ) when the observations X i come from F = ( 1 ϵ ) G + ϵ H , with k iterations (k large), by the VOM+SAD approximation to the distribution of the method-of-moments-estimator 2 γ i j ^ ( h ) , obtained in the previous section, as
P F 2 γ i j ^ α ( h ) > t ( 1 + N h c 1 ) k + 1 ( 1 + N h c 2 ) k + 1 P F 2 γ i j ^ ( h ) > t
where c 1 = ( 1 2 α ) 1 / ( k + 1 ) 1 and c 2 = 1 / ( 1 2 α ) 1 / ( k + 1 ) 1 .
In the bottom row of Figure 3, we plot the tail probability of the 0.2 -trimmed cross-variogram estimator 2 γ i j ^ α ( h ) with no contamination ( ϵ = 0 ) and with two percentages of contamination: ϵ = 0.15 and ϵ = 0.3 , with the sample size being N h = 10 .
We observe in this figure that, as we increase the contamination percentage, i.e., as we increase ϵ , the tail probabilities obtained with the trimmed cross-variogram estimators are affected but by less than those obtained with the classical method-of-moments estimator. We see this by comparing the first row of figures (non-trimmed cross-variogram estimators) with the second row of figures (trimmed cross-variogram estimators).

6. Huber’s Cross-Variogram Estimator

If the ψ function, ψ ( x , t ) = ψ ( x t ) , used to obtain the M-estimator in Equation (6) is the Huber’s function ψ b ( u ) = min { b , max { u , b } } , the M-estimator obtained is called the Huber’s cross-variogram estimator, 2 γ i j ^ H ( h ) . Since its score function is bounded, this estimator will be robust.
An approximation for its distribution can be obtained from (10). Nevertheless, the leading term P G 2 γ i j ^ H ( h ) > t is not easy to compute. For this reason, in this case, we use the Lugannani and Rice formula to approximate this leading term, the VOM+SAD approximation for the distribution of the Huber’s cross-variogram estimator being the following:
P X i F 2 γ i j ^ H ( h ) > t 1 Φ ( s ) + ϕ ( s ) 1 r 1 s + ϵ ϕ ( s ) r 1 n e z 0 ψ b ( x t ) d H ( x ) e z 0 ψ b ( y t ) d G ( y ) 1
where the saddlepoint z 0 is such that
e z 0 ψ b ( y t ) ψ b ( y t ) d G ( y ) = 0
with G and H being the distributions that appear in (5), and where all the functionals r , r 1 , and s are computed with respect to model G.
This approximation may seem complicated but it is easy to compute using the huber function of the MASS library, [23].
Example 1.
In order to analyze the behaviour of the robust estimators defined in the paper, we compare them with the classical method-of-moments estimator, carrying out a simulation study in which we compare the 0.1 -trimmed and Huber’s variogram estimators with the classical one.
The study consists of a simulation of two spatial and statistical correlated variables Z 1 and Z 2 , both with a normal distribution, in different situations, with some of them considered, for instance, in [4]:
(A)
No contamination, Z 1 N ( 0 , 1 ) ;
(B)
Z 1 0.95 · N ( 0 , 1 ) + 0.05 · N ( 0 , 5 2 ) ;
(C)
Z 1 0.90 · N ( 0 , 1 ) + 0.10 · N ( 0 , 5 2 ) ;
(D)
Z 1 0.80 · N ( 0 , 1 ) + 0.20 · N ( 0 , 5 2 ) ;
(E)
Z 1 0.95 · N ( 0 , 1 ) + 0.05 · N ( 0 , 20 2 ) ;
(F)
Z 1 0.90 · N ( 0 , 1 ) + 0.10 · N ( 0 , 20 2 ) ;
(G)
Z 1 0.80 · N ( 0 , 1 ) + 0.20 · N ( 0 , 20 2 ) .
The details of the simulations are in the Supplementary Materials. In these simulations, we observe less sensitivity in the robust estimators than in the classical one, as we increase the contamination in the model. We appreciate this in Figure 4 and Figure 5. In the first one, we observe that the classical variogram model can be accepted for the three estimations in case (A), where there is no contamination. Nevertheless, as we increase the contamination (Figure 5), this variogram model does not represent the classic variogram estimations; only in some cases it represents the 0.1-trimmed variogram estimations, and it can be accepted when we consider Huber’s variogram estimations except, perhaps, in the last case, where it is doubtful.

7. Linearized Version of the Cross-Variogram Model

We saw at the end of Section 2.2 that, if linear models can be accepted as variograms and cross-variograms, the variables X s can be considered independent. These linearized versions of the model (classical and robust) were introduced in Section 9 of [5] and can be applied to model the cross-variogram. They essentially consists of replacing, before the range, the increasing part of the traditional variogram, or cross-variogram, using the regression line and, after the range, using the sill (or the robust sample mean in the robust linearized version).
Additionally, the test defined in Section 10.1 of [5] can be used to check if these models can be accepted, using saddlepoint approximations for the robust (and classical) estimators of the variograms and cross-variograms.
Namely, we test the null hypothesis of a particular variogram or cross-variogram model M 0 ( h ) from which we obtain the theoretical variogram values 2 γ i i ( h ) (or cross-variogram values 2 γ i j ( h ) )) using as test statistic
S n = sup h 2 γ ^ i i ( h ) 2 γ i i ( h ) = max 1 | | h | | K 2 γ ^ i i ( h ) 2 γ i i ( h )
or
S n = sup h 2 γ ^ i j ( h ) 2 γ i j ( h ) = max 1 | | h | | K 2 γ ^ i j ( h ) 2 γ i j ( h )
assuming that we consider K lags.
If we unify both as
S n = sup h 2 γ ^ ( h ) 2 γ ( h ) = max 1 | | h | | K 2 γ ^ ( h ) 2 γ ( h )
the cumulative distribution function of S n is (see [5])
F S n ( v ) = | | h | | = 1 K P 2 γ ( h ) { 2 γ ^ ( h ) > v + 2 γ ( h ) } P 2 γ ( h ) { 2 γ ^ ( h ) > v + 2 γ ( h ) }
probabilities that are computed with the VOM+SAD approximations.
We remark that the number K of lags (and hence the value of h ) can be modified to obtain the desired linearity.
Example 2.
Let us consider prediction data, included in the jura data set from Pierre Goovaerts’ book that contains geolocated information of several variables. This data set is calledprediction.datin the R library,gstat.
Two correlated variables, with a distribution similar to a scale contaminated normal model, are ln(Pb) (natural logarithm of Lead) and Ni (Nickel).
The values of the classical method-of-moments estimator, the 0.1-trimmed cross-variogram estimator, and the Huber’s cross-variogram estimator (with tuning constant b = 1.5 ) are easily obtained for these variables, as can be seen in the Supplementary Materials. The lag distant chosen was h = 0.2 . These values are shown in Figure 6.
To use their distributions, obtained in the paper, it is necessary to check if we can accept linear variograms for these two variables and a linear cross-variogram for the pair, as it was pointed out in Section 2.2. If this is the case, the variables X s , s = 1 , ,   n can be considered independent.
Assuming as underlying model, a scale contaminated normal with ϵ = 0.01 and g = 1.1 , the linearized versions of the variograms for the logarithm of Lead are shown in Figure 7. The linearized versions of the variograms for Nickel are shown in Figure 8.
Finally, the linearized versions for the cross-variograms models are shown in Figure 9.
From a visual point of view, all these linearized versions can be accepted using the test considered in Section 7. The values of the test statistics S n = sup h 2 γ ^ ( h ) 2 γ ( h ) and the p-values are given in Table 3 (see the Supplementary Materials). Thus, the independence of the X s can be accepted.
We conclude the paper with a real-data example in which we observe how robust cross-variogram estimations provide models less sensitive to outliers, which will lead us to a more robust cokriging.
Example 3.
Let us consider the geolocated pollution data, included in the Supplementary Materials, that are the 2017 average concentrations of four air pollutants in the Community of Madrid (Spain): nitrogen monoxide (NO), nitrogen dioxide (NO2), suspended particles with a size less than 10 microns (PM10), and ozone (O3). These data are obtained from 22 monitoring stations [24,25,26].
Two of these 4 variables are strongly correlated and have a distribution similar to a scale contaminated normal model; they are NO and NO2.
The variogram-crossvariogram matrix of the classical variogram and cross-variogram estimators along with classical least squares model (Mather’s model in this case) are shown in Figure 10.
The values of the classical method-of-moments estimator, the 0.1-trimmed cross-variogram estimator, and the Huber’s cross-variogram estimator (with tuning constant b = 1.5 ) for these variables are obtained in the Supplementary Materials. These values are shown in Figure 11, along with the linearized cross-variogram models.
We observe that, at first lag, the three estimations agree. In the others, we can see the soft effect of the 0.1-trimmed cross-variogram and Huber’s cross-variogram estimators.
The linearized versions of the variograms and cross-variogram can be accepted, and therefore, the independence of the transformed variables X s , s = 1 , , n .
Moreover, we appreciate the influence of the outliers in the estimation of the (linearized) cross-variogram in Figure 11 and, therefore, on the cokringing obtained with classical cross-variogram models. Thus, the use of robust estimators of the cross-varogram will be more reasonable in order to obtain a robust cokriging.

8. Conclusions

In this paper, we introduced new robust cross-variogram estimators and we obtained saddlepoint approximations for their distributions when the underlying model is a scale-contaminated normal distribution. We also obtained an approximation for the distribution of the method-of-moments estimator.
These approximations are especially useful when the sample size is small, a situation that we have when the size of the lag h is small.
We also proposed a suitable transformation of the initial observations to avoid the traditional dependence of the spatial observations. We see that is that linear variograms and a linear cross-variogram can be accepted as models to obtain this.

Supplementary Materials

Funding

This work was partially supported by grant PGC2018-095194-B-I00 from the Ministerio de Ciencia, Innovación y Universidades (Spain).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The author is very grateful to the referees and to the assistant editor for their kind and professional remarks.

Conflicts of Interest

The author declare no conflict of interest.

Appendix A

Expression for the covariance between W t i and W s j . Section 2.1
c o v W t i , W s j = c o v Z i ( t + h ) Z i ( t ) , Z j ( s + h ) Z j ( s )
= E ( Z i ( t + h ) Z i ( t ) ) · ( Z j ( s + h ) Z j ( s ) )
= E ( Z i ( t + h ) · Z j ( s + h ) E ( Z i ( t + h ) · Z j ( s )
E ( Z i ( t ) · Z j ( s + h ) + E ( Z i ( t ) · Z j ( s )
= C C i j ( | t + h s h | ) + μ i · μ j C C i j ( | t + h s | ) μ i · μ j
C C i j ( | t s h | ) μ i · μ j + C C i j ( | t s | ) + μ i · μ j
= ρ i j ( | t s | ) σ i σ j ρ i j ( | t s + h | ) σ i σ j ρ i j ( | t s h | ) σ i σ j + ρ i j ( | t s | ) σ i σ j
= σ i σ j 2 ρ i j ( | t s | ) ρ i j ( | t s + h | ) ρ i j ( | t s h | ) .
Expression for the cross-variogram. Section 2.1
2 γ i j ( h ) = E ( Z i ( s + h ) Z i ( s ) ) · ( Z j ( s + h ) Z j ( s ) )
= E ( Z i ( s + h ) · Z j ( s + h ) E ( Z i ( s + h ) · Z j ( s )
E ( Z i ( s ) · Z j ( s + h ) + E ( Z i ( s ) · Z j ( s )
= C C i j ( 0 ) + μ i · μ j C C i j ( h ) μ i · μ j
C C i j ( h ) μ i · μ j + C C i j ( 0 ) + μ i · μ j
= 2 C C i j ( 0 ) C C i j ( h )
Proof of Proposition 1.
If Z i is a variable with normal distribution, Z i N ( μ i , σ i 2 ) , where X H stands for “X is distributed as H”; then, it is W s i = ( Z i ( s + h ) Z i ( s ) ) N 0 , 2 γ i i ( h ) because of the intrinsic stationary property of Z i .
If Z i has a distribution ( 1 ϵ ) N ( μ i , σ i 2 ) + ϵ N ( μ i , g 2 σ i 2 ) = ( 1 ϵ ) N 1 + ϵ N 2 , the cumulative distribution function of W s i will be
P W s i y = ( 1 ϵ ) P N 1 W s i y + ϵ P N 2 W s i y = ( 1 ϵ ) P N 1 Z i ( s + h ) Z i ( s ) 2 γ i i ( h ) y 2 γ i i ( h ) + ϵ P N 2 Z i ( s + h ) Z i ( s ) g 2 γ i i ( h ) y g 2 γ i i ( h ) = ( 1 ϵ ) Φ y / 2 γ i i ( h ) + ϵ Φ y / ( g 2 γ i i ( h ) )
where Φ is the cumulative distribution function of the standard normal distribution. □
Elements of Approximation (10) for the Method-of-Moments estimator
K ( λ , t ) = log e λ ( y t ) d G ( y ) = log e λ ( y t ) p X y 2 γ i i ( h ) 2 γ j j ( h ) d y 2 γ i i ( h ) 2 γ j j ( h ) = λ t + log 1 λ 2 γ i i ( h ) 2 γ j j ( h ) ( 1 + ρ ) 1 / 2 · 1 + λ 2 γ i i ( h ) 2 γ j j ( h ) ( 1 ρ ) 1 / 2
using expression (4) in Nadarajah and Pongány (2016) and with ρ being the correlation coefficient between W s i / 2 γ i i ( h ) and W s j / 2 γ j j ( h ) , mentioned above.
Hence, the saddlepoint equation K ( z 0 , t ) = 0 from which we obtain the saddlepoint z 0 is
2 γ i i ( h ) 2 γ j j ( h ) ( 1 + ρ ) 1 z 0 2 γ i i ( h ) 2 γ j j ( h ) ( 1 + ρ ) 2 γ i i ( h ) 2 γ j j ( h ) ( 1 ρ ) 1 + z 0 2 γ i i ( h ) 2 γ j j ( h ) ( 1 ρ ) = 2 t .
The other elements in (10) are
K ( z 0 , t ) = z 0 t + log 1 z 0 2 γ i i ( h ) 2 γ j j ( h ) ( 1 + ρ ) 1 / 2 · 1 + z 0 2 γ i i ( h ) 2 γ j j ( h ) ( 1 ρ ) 1 / 2
K ( z 0 , t ) = 1 2 2 γ i i ( h ) 2 γ j j ( h ) ( 1 + ρ ) 2 1 z 0 2 γ i i ( h ) 2 γ j j ( h ) ( 1 + ρ ) 2 + 1 2 2 γ i i ( h ) 2 γ j j ( h ) ( 1 ρ ) 2 1 + z 0 2 γ i i ( h ) 2 γ j j ( h ) ( 1 ρ ) 2
s = 2 n K ( z 0 , t )
r 1 = z 0 K ( z 0 , t )
and the integrals are
e z 0 ψ ( y , t ) d G ( y ) = e z 0 t 1 z 0 2 γ i i ( h ) 2 γ j j ( h ) ( 1 + ρ ) 1 / 2 · 1 + z 0 2 γ i i ( h ) 2 γ j j ( h ) ( 1 ρ ) 1 / 2
e z 0 ψ ( x , t ) d H ( x ) = e z 0 t 1 z 0 g 2 2 γ i i ( h ) 2 γ j j ( h ) ( 1 + ρ ) 1 / 2 · 1 + z 0 g 2 2 γ i i ( h ) 2 γ j j ( h ) ( 1 ρ ) 1 / 2

References

  1. Cressie NAC. Statistics for Spatial Data; John Wiley & Sons: New York, NY, USA, 1993. [Google Scholar]
  2. Bivand, R.S.; Pebesma, E.J.; Gómez-Rubio, V. Applied Spatial Data Analysis with R, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
  3. Lark, R.M. Two robust estimators of the cross-variogram for multivariate geostatistical analysis of soil properties. Eur. J. Soil Sci. 2003, 54, 187–201. [Google Scholar] [CrossRef]
  4. Cressie, N.; Hawkins, D.M. Robust estimation of the variogram: I. J. Int. Assoc. Math. Geol. 1980, 12, 115–125. [Google Scholar] [CrossRef]
  5. García-Pérez, A. Saddlepoint approximations for the distribution of some robust estimators of the variogram. Metrika 2020, 83, 69–91. [Google Scholar] [CrossRef]
  6. Huber, P.J. Robust estimation of a locaion parameter. Ann. Math. Stat. 1964, 35, 73–101. [Google Scholar] [CrossRef]
  7. Tukey, J.W. A survey of sampling from contaminated distributions. In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, Stanford Studies in Mathematics and Statistics; Oklin, I., Ed.; Stanford University Press: Palo Alto, CA, USA, 1960; Chaper 39; pp. 448–485. [Google Scholar]
  8. Huber, P.J.; Ronchetti, E.M. Robust Statistics, 2nd ed.; John Wiley & Sons: New York, NY, USA, 2009. [Google Scholar]
  9. Ebner, B.; Henze, N. Tests for multivariate normality–a critical review with emphasis on weighted L2-statistics. Test 2020, 29, 845–892. [Google Scholar] [CrossRef]
  10. Nadarajah, S.; Pongány, T.K. On the distribution of the product of correlated normal random variables. Comptes Rendus Math. 2016, 354, 201–204. [Google Scholar] [CrossRef]
  11. Daniels, H.E. Saddlepoint approximations for estimating equations. Biometrika 1983, 70, 89–96. [Google Scholar] [CrossRef]
  12. Withers, C.S. Expansions for the distribution and quantiles of a regular functional of the empirical distribution with applications to nonparametric confidence intervals. Ann. Stat. 1983, 11, 577–587. [Google Scholar] [CrossRef]
  13. Serfling, R.J. Approximation Theorems of Mathematical Statistics; John Wiley & Sons: New York, NY, USA, 1980. [Google Scholar]
  14. Ronchetti, E. Accurate and robust inference. Econom. Stat. 2020, 14, 74–88. [Google Scholar] [CrossRef]
  15. Field, C.A.; Ronchetti, E. A tail area influence function and its application to testing. Sequential Anal. 1985, 4, 19–41. [Google Scholar] [CrossRef]
  16. Lugannani, R.; Rice, S. Saddle point approximation for the distribution of the sum of independent random variables. Adv. Appl. Probab. 1980, 12, 475–490. [Google Scholar] [CrossRef]
  17. Jensen, J.L. Saddlepoint Approximations; Clarendon Press: Oxford, UK, 1995. [Google Scholar]
  18. Von García-Pérez, A. Mises approximation of the critical value of a test. Test 2003, 12, 385–411. [Google Scholar] [CrossRef]
  19. García-Pérez, A. Another look at the Tail Area Influence Function. Metrika 2011, 73, 77–92. [Google Scholar] [CrossRef]
  20. García-Pérez, A. A linear approximation to the power function of a test. Metrika 2012, 75, 855–875. [Google Scholar] [CrossRef]
  21. García-Pérez, A. A Von Mises approximation to the small sample distribution of the trimmed mean. Metrika 2016, 79, 369–388. [Google Scholar] [CrossRef]
  22. Field, C.A.; Ronchetti, E. Small Sample Asymptotics; Lecture Notes-Monograph Series; Institute of Mathematical Statistics: Hayward, CA, USA, 1990; Volume 13. [Google Scholar]
  23. R Development Core Team. A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Viena, Austria, 2020; Available online: http://www.R-project.org (accessed on 22 February 2021).
  24. Council-Enviroment. Department of the Madrid Environment: Air Quality Monitoring System. 2018. Available online: http://www.mambiente.munimadrid.es/opencms/opencms/calaire/consulta/descarga_\opendata.html?__locale=es (accessed on 22 February 2021).
  25. Council-Enviroment. Environmental Management, Council of the Environment, Local Administration and Territorial Planning, Atmospheric Quality Area-Air Quality Network. 2018. Available online: http://gestiona.madrid.org/azul_internet/html/web/2.htm?ESTADO_MENU=2 (accessed on 22 February 2021).
  26. Núñez-Alonso, D.; Pérez-Arribas, L.V.; Manzoor, S.; Cáceres, J.O. Statistical Tools for Air Pollution Assessment: Multivariate and Spatial Analysis Studies in the Madrid Region. J. Anal. Methods Chem. 2019, 2019, 9753927. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Approximate tail probabilities (in black) and simulated (in red) for the method-of-moments estimator 2 γ i j ^ ( h ) with sample size N h = 3 , with no contamination, and with three different degrees of contamination ϵ .
Figure 1. Approximate tail probabilities (in black) and simulated (in red) for the method-of-moments estimator 2 γ i j ^ ( h ) with sample size N h = 3 , with no contamination, and with three different degrees of contamination ϵ .
Mathematics 09 00762 g001
Figure 2. Tail distribution of the method-of-moments estimator 2 γ i j ^ ( h ) with sample size N h = 3 and two underlying models: ( 1 ϵ ) N ( 0 , 1 ) + ϵ N ( 0 , 1.1 2 ) and ( 1 ϵ ) N ( 0 , 1 ) + ϵ N ( 0 , 1.2 2 ) , for three different degrees of contamination ϵ .
Figure 2. Tail distribution of the method-of-moments estimator 2 γ i j ^ ( h ) with sample size N h = 3 and two underlying models: ( 1 ϵ ) N ( 0 , 1 ) + ϵ N ( 0 , 1.1 2 ) and ( 1 ϵ ) N ( 0 , 1 ) + ϵ N ( 0 , 1.2 2 ) , for three different degrees of contamination ϵ .
Mathematics 09 00762 g002
Figure 3. Tail probabilities of the classical method-of-moments cross-variogram estimator 2 γ i j ^ ( h ) (top row of figures) and 0.2 -trimmed cross-variogram estimator 2 γ i j ^ α ( h ) (bottom row of figures), with no contamination, ϵ = 0 , and contaminations ϵ = 0.15 and ϵ = 0.3 .
Figure 3. Tail probabilities of the classical method-of-moments cross-variogram estimator 2 γ i j ^ ( h ) (top row of figures) and 0.2 -trimmed cross-variogram estimator 2 γ i j ^ α ( h ) (bottom row of figures), with no contamination, ϵ = 0 , and contaminations ϵ = 0.15 and ϵ = 0.3 .
Mathematics 09 00762 g003
Figure 4. Variogram estimations of Example 1: classical (black), 0.1-trimmed (green) and Huber’s (red), and the variogram model with no contamination.
Figure 4. Variogram estimations of Example 1: classical (black), 0.1-trimmed (green) and Huber’s (red), and the variogram model with no contamination.
Mathematics 09 00762 g004
Figure 5. Variogram estimations of Example 1: classical (black), 0.1-trimmed (green) and Huber’s (red), and the variogram model with no contamination.
Figure 5. Variogram estimations of Example 1: classical (black), 0.1-trimmed (green) and Huber’s (red), and the variogram model with no contamination.
Mathematics 09 00762 g005
Figure 6. Classical (black) and robust (green and red) cross-variogram estimations of Example 2.
Figure 6. Classical (black) and robust (green and red) cross-variogram estimations of Example 2.
Mathematics 09 00762 g006
Figure 7. Classical (black) and robust (green and red) variogram estimations for the logarithm of lead and their linearized variograms of Example 2.
Figure 7. Classical (black) and robust (green and red) variogram estimations for the logarithm of lead and their linearized variograms of Example 2.
Mathematics 09 00762 g007
Figure 8. Classical (black) and robust (green and red) variogram estimations for nickel and their linearized variograms of Example 2.
Figure 8. Classical (black) and robust (green and red) variogram estimations for nickel and their linearized variograms of Example 2.
Mathematics 09 00762 g008
Figure 9. Classical (black) and robust (green and red) cross-variogram estimations, linearized versions, and the classical model (blue) of Example 2.
Figure 9. Classical (black) and robust (green and red) cross-variogram estimations, linearized versions, and the classical model (blue) of Example 2.
Mathematics 09 00762 g009
Figure 10. Variogram-crossvariogram matrix of the classical variogram and cross-variogram estimations with the classical model of Example 3.
Figure 10. Variogram-crossvariogram matrix of the classical variogram and cross-variogram estimations with the classical model of Example 3.
Mathematics 09 00762 g010
Figure 11. Classical (black) and robust (green and red) cross-variogram estimations of Example 3, with the linearized cross-variogram models.
Figure 11. Classical (black) and robust (green and red) cross-variogram estimations of Example 3, with the linearized cross-variogram models.
Mathematics 09 00762 g011
Table 1. Tail probabilities of the VOM+SAD approximation and the exact (simulated) values for the method-of-moments estimator of Figure 1.
Table 1. Tail probabilities of the VOM+SAD approximation and the exact (simulated) values for the method-of-moments estimator of Figure 1.
No Contamination ϵ = 0 ϵ = 0.05 ϵ = 0.2
ApproximationExactApproximationExactApproximationExact
t = 0.4 0.266270.279030.271010.281990.285240.29635
t = 0.6 0.119440.125120.123190.129040.134430.13795
t = 0.8 0.050520.056650.053020.058950.060530.06312
t = 0.9 0.031890.036710.033860.037270.039780.04304
t = 1.0 0.019500.024510.021020.024490.025600.02811
Table 2. Relative errors of the VOM+SAD approximation, in %.
Table 2. Relative errors of the VOM+SAD approximation, in %.
No Contamination ϵ = 0 ϵ = 0.05 ϵ = 0.2
t = 0.4 1.76981.52921.5789
t = 0.6 0.64920.67170.4083
t = 0.8 0.64980.63010.2764
t = 0.9 0.50040.35420.3407
t = 1.0 0.51360.35570.2583
Table 3. Values of S n = sup h 2 γ ^ ( h ) 2 γ ( h ) and its p-value considering a scale contaminated normal with ϵ = 0.01 and g = 1.1 of Example 2.
Table 3. Values of S n = sup h 2 γ ^ ( h ) 2 γ ( h ) and its p-value considering a scale contaminated normal with ϵ = 0.01 and g = 1.1 of Example 2.
Log LeadNickelCross-Variogram
S n p-Value S n p-Value S n p-Value
Classical0.07040760.05208727.82550.1120651.1608420.9775347
0.1-trimmed mean0.0312044158.390810.8783460.7257757
Huber0.0634437129.093010.8941931
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

García-Pérez, A. New Robust Cross-Variogram Estimators and Approximations of Their Distributions Based on Saddlepoint Techniques. Mathematics 2021, 9, 762. https://0-doi-org.brum.beds.ac.uk/10.3390/math9070762

AMA Style

García-Pérez A. New Robust Cross-Variogram Estimators and Approximations of Their Distributions Based on Saddlepoint Techniques. Mathematics. 2021; 9(7):762. https://0-doi-org.brum.beds.ac.uk/10.3390/math9070762

Chicago/Turabian Style

García-Pérez, Alfonso. 2021. "New Robust Cross-Variogram Estimators and Approximations of Their Distributions Based on Saddlepoint Techniques" Mathematics 9, no. 7: 762. https://0-doi-org.brum.beds.ac.uk/10.3390/math9070762

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop