Article

Empirical Likelihood-Based ANOVA for Trimmed Means

1 Department of Mathematics, Faculty of Physics and Mathematics, University of Latvia, Riga LV-1002, Latvia
2 Department of Law, Economics, Management and Quantitative Methods, University of Sannio, Benevento 82100, Italy
3 Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC 20057, USA
* Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2016, 13(10), 953; https://0-doi-org.brum.beds.ac.uk/10.3390/ijerph13100953
Submission received: 17 May 2016 / Revised: 15 September 2016 / Accepted: 20 September 2016 / Published: 27 September 2016
(This article belongs to the Special Issue Methodological Innovations and Reflections-1)

Abstract

In this paper, we introduce an alternative to Yuen’s test for the comparison of several population trimmed means. This nonparametric ANOVA type test is based on the empirical likelihood (EL) approach and extends the results for one population trimmed mean from Qin and Tsao (2002). The results of our simulation study indicate that for skewed distributions, with and without variance heterogeneity, Yuen’s test performs better than the new EL ANOVA test for trimmed means with respect to control over the probability of a type I error. This finding is in contrast with our simulation results for the comparison of means, where the EL ANOVA test for means performs better than Welch’s heteroscedastic F test. The analysis of a real data example illustrates the use of Yuen’s test and the new EL ANOVA test for trimmed means for different trimming levels. Based on the results of our study, we recommend the use of Yuen’s test for situations involving the comparison of population trimmed means between groups of interest.

1. Introduction

The comparison of the means of several populations is frequently encountered in the statistical analysis of data from environmental research and public health studies. Typically, ANOVA is used to compare these means, for example, to compare mean blood lead levels between groups of children receiving different interventions. Practical situations may involve complications such as unbalanced designs (i.e., unequal sample sizes for the groups), variance heterogeneity, and departures from normality. It may be the case, for instance, that the distributions underlying the data from each group are truly heavy tailed or skewed, but it is also possible that such departures from normality are due to a few observations located away from the bulk of the data in the tails of the distribution. It is well known that the classical ANOVA F test cannot handle such violations of its assumptions and, as a consequence, has problems controlling the probability of a type I error at the specified nominal level. Heteroscedasticity and/or outliers can completely break down the results of the ANOVA F test when not properly taken into account (see, for example, [1]). Given this limitation of the ANOVA F test, there is a need for ANOVA type tests that are robust to both heteroscedasticity and outliers.
A statistical test that satisfies these requirements is the test developed by Yuen [2], who proposed a modified version of Welch’s heteroscedastic F test [3]. The latter test is designed to deal with heteroscedasticity for normally distributed data, and it uses the sample means and sample variances to estimate their population counterparts. Since the sample mean and the sample variance are not robust to outliers, Yuen [2] proposed to replace them with a pair of robust estimators consisting of the trimmed mean and the Winsorized variance. Such an approach provides better control of the probability of a type I error for one-way ANOVA situations involving unbalanced designs and skewed distributions (see [4]). Two comments are important here. First, the construction of Yuen’s test has a somewhat ad hoc nature, since it simply replaces the least squares estimators with robust versions. Second, Yuen’s test is no longer a test for the comparison of population means; rather, it is a test for the comparison of population trimmed means. It may be preferable to make inferences regarding the population trimmed means rather than the population means when the underlying distributions for the groups are skewed, since the trimmed means are more representative of the bulk of the data in those situations [5].
In this paper, we present an alternative to Yuen’s test, a new nonparametric test that can be used to compare several trimmed means based on the empirical likelihood (EL) approach to statistical inference [6,7,8]. The EL method (see [9] for a detailed overview) is a popular nonparametric approach that does not require normality (or other distributional assumptions) and can be regarded as a data adaptive method. We develop an EL-based ANOVA test for the comparison of trimmed means that takes advantage of the nonparametric nature of the EL approach, by extending the results of Qin and Tsao [8], who introduced the EL method for a trimmed mean (see also the results from [10]). All technical details regarding the tests considered in this paper (including the asymptotic results for the new EL-based ANOVA for trimmed means) are provided in Appendix A, Appendix B, Appendix C, and Appendix D.
The paper is organized as follows. In Section 2, we present and interpret the results of a simulation study that compares the performance of the EL-based ANOVA for trimmed means and means with alternative methods under several scenarios involving skewed distributions. In Section 3, we analyze a real data set using different types of tests for the comparison of population trimmed means and population means. We end the paper by presenting conclusions in Section 4.

2. Simulation Study

For simplicity, we present only situations where we are interested in the comparison of three population trimmed means or three population means ($k = 3$), with samples of equal sizes. We consider scenarios involving skewed distributions, with and without variance heterogeneity. For the EL ANOVA for trimmed means, we consider only symmetric trimming, where all samples are trimmed symmetrically. Although we are primarily interested in the performance of the tests for the comparison of trimmed means, namely the EL ANOVA for trimmed means (panel ELT) and Yuen’s test (panel Yuen), for completeness we also include the results for the tests for the comparison of means: the classical ANOVA F test (panel F test), Welch’s heteroscedastic F test for means (panel Welch), and the EL ANOVA for means (panel EL). For Welch’s test and Yuen’s test, we have used the R function t1way (see Wilcox [11]). The R functions that implement the EL ANOVA methods for trimmed means and means are available from the corresponding author upon request.
For the simulation study, we investigate the potential effect of the shape of the distributions on the estimated probability of type I errors. We consider several skewed distributions with and without variance heterogeneity. We use a simulation design similar to that from [5], where (trimmed) means of only two independent skewed populations are compared. For the scenario with homogeneous variances (scenario 1), we simulate data from three independent skewed distributions. We consider the $\chi^2_3$ distribution, the lognormal distribution with normal mean $\mu = 0$ and normal scale $\sigma = 1$, the gamma distribution with shape parameter $\alpha = 2$ and scale parameter $\sigma = 1$, and the skew-normal distribution with location parameter $\xi = 0$, scale parameter $\omega = 1$, and slant parameter $\alpha = 1$ (see [12]). For the scenario with heterogeneous variances (scenario 2), we further transform the data simulated from the three independent skewed distributions so that the ratios between variances are either 1:4:9 or 1:1:36. To ensure that the relevant null hypothesis, $H_0^T$ of equal trimmed means or $H_0$ of equal means, is true, we center the data before altering the variances, using the theoretically determined trimmed means (when using tests for the comparison of trimmed means) or means (when using tests for the comparison of means). We use 10,000 Monte Carlo simulations to calculate the empirical probability of type I errors for the tests performed at the nominal 0.05 significance level.
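The centering and rescaling steps of this design can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the R code used for the study: the function names are hypothetical, only the $\chi^2_3$ case with variance ratios 1:4:9 is shown, the number of replications is reduced, and the test applied is the leading term $\sum_i w_i (\bar{Y}_{i\cdot} - \bar{Y})^2$ of the statistic in Appendix B compared with a $\chi^2_2$ critical value, rather than the full EL computation.

```python
import math
import random

def first_order_statistic(groups):
    """Leading term sum_i w_i (mean_i - pooled)^2 with w_i = n_i / s_i^2."""
    means = [sum(g) / len(g) for g in groups]
    variances = [sum((y - m) ** 2 for y in g) / (len(g) - 1)
                 for g, m in zip(groups, means)]
    weights = [len(g) / v for g, v in zip(groups, variances)]
    pooled = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    return sum(w * (m - pooled) ** 2 for w, m in zip(weights, means))

def simulate_type1_error(n, sd_ratios=(1.0, 2.0, 3.0), reps=2000, seed=1):
    """Estimate the rejection rate under H0 of equal means (hypothetical helper)."""
    rng = random.Random(seed)
    # 0.95 quantile of chi-square with 2 df: survival function is exp(-x / 2)
    critical = -2.0 * math.log(0.05)
    rejections = 0
    for _ in range(reps):
        groups = []
        for sd in sd_ratios:
            # chi-square(3) is gamma(shape=1.5, scale=2); its mean is 3
            raw = [rng.gammavariate(1.5, 2.0) for _ in range(n)]
            # center at the theoretical mean first, then alter the variance
            groups.append([sd * (y - 3.0) for y in raw])
        if first_order_statistic(groups) > critical:
            rejections += 1
    return rejections / reps

if __name__ == "__main__":
    print(simulate_type1_error(20))
```

Standard deviation multipliers of 1, 2, and 3 give variance ratios of 1:4:9; replacing them with 1, 1, and 6 would give the 1:1:36 setting.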
Table 1 presents the empirical probability of type I errors for the different tests for the situation involving skewed distributions with homogeneous variances (scenario 1). Regarding the comparison of trimmed means, the results for Yuen’s test are closer to the nominal significance level than those for the EL ANOVA test for trimmed means. By contrast, among the tests that compare means, the results of the EL ANOVA test for means are closest to the nominal significance level. Table 2 and Table 3 present the corresponding results for the same tests for situations involving skewed distributions with heterogeneous variances (scenario 2). We note that it is more difficult to control the probability of a type I error when the ratios between variances are 1:1:36 than when they are 1:4:9. Similar to the homogeneous variances scenario, the results for the heterogeneous variances scenario suggest that Yuen’s test performs best among the tests for the comparison of trimmed means, while the EL ANOVA test performs best among the tests for the comparison of means.

3. Real Data Example

To illustrate the use of the EL ANOVA for trimmed means and means, we use the Oslo Transect data set [13]. This real data set includes 360 observations corresponding to different plants collected along a 120 km transect running through the city of Oslo, Norway. The concentrations of 25 chemical elements found in these plants were recorded together with factors that may influence the mineral concentration. Except for two chemical elements, Au and Na, which are not included, these data are available in the R package rrcov [14] as the OsloTransect dataset. We analyze this dataset, and, thus, only 23 chemical elements are included in Table 4. To preserve the skewness of the data, we have used the raw data, as opposed to the log transformed data (as done in [13]). After removing the observations with missing values, we are left with 332 observations. We consider the 23 concentrations of chemical elements as the response variables, and the lithology as a group variable with four levels.
As for the simulation study, even though our main interest is in tests that compare population trimmed means, for completeness purposes, we also provide the results from the tests that compare population means. We consider three symmetric trimming strategies similar to those used in the simulation study. The entries from Table 4 provide the p-values from the tests for the comparison of population means and population trimmed means. We note that, for each trimming strategy, the p-values from the EL ANOVA for trimmed means (panel ELT) and Yuen’s test (panel Yuen) are very similar. In addition, the p-values from the EL ANOVA for means (panel EL) and Welch’s heteroscedastic F test (panel Welch) are also very similar.

4. Conclusions

In this paper, we introduce a new nonparametric ANOVA type test for the comparison of population trimmed means. Although the new method is derived from the general principles of the empirical likelihood approach, in contrast to the somewhat ad hoc derivation of Yuen’s test from Welch’s heteroscedastic F test, the results of our simulation study in situations involving skewed distributions indicate that, unless the sample sizes per group are very large, the new EL ANOVA method for trimmed means performs worse than Yuen’s test with respect to control over the probability of a type I error. This is in contrast with our simulation results for the comparison of means, where the EL ANOVA for means performs better than Welch’s heteroscedastic F test. The analysis of the real data example provides similar p-values for the new EL ANOVA method for trimmed means and Yuen’s test for different trimming levels, and also similar p-values for the EL ANOVA for means and Welch’s heteroscedastic F test.
Based on these results, we recommend the use of Yuen’s test for situations where the research question involves the comparison of population trimmed means between groups of interest. The choice of the specific trimming strategy is an important and complex issue, since different trimming strategies imply different null hypotheses being tested. As such, the selection of the trimming strategy should be based on subject matter reasons that take into account what is known by the experts about the data under investigation. Alternatively, in the absence of expert knowledge, different trimming strategies could be used to evaluate the sensitivity of the results to the choice of the trimming strategy.

Acknowledgments

The authors would like to thank the academic editor and the reviewers for their thoughtful and constructive suggestions.

Author Contributions

Janis Valeinis and George Luta proposed the new EL ANOVA for trimmed means; Mara Velina proved the asymptotic results; Mara Velina and Luca Greco performed the simulation study; and Mara Velina performed the data analysis. All authors wrote and edited the paper and all authors read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Statistical Tests Not Based on EL

Let $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})$, $i = 1, 2, \ldots, k$, be independent random samples from $k$ different distributions with population means $\mu_i$. We are interested in testing the null hypothesis of equal population means
$$H_0: \mu_1 = \cdots = \mu_k = \mu. \tag{A1}$$
Under the assumption of equal variances (homoscedasticity) and normally distributed data in each group, i.e., $Y_{ij} \sim N(\mu_i, \sigma)$, one can use the classical ANOVA F test
$$F = \frac{\sum_{i=1}^{k} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 / (k - 1)}{\sum_{i=1}^{k} (n_i - 1) s_i^2 / (N - k)},$$
where
$$\bar{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij} \quad \text{and} \quad s_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2$$
are the sample mean and sample variance of the i-th group, respectively, and
$$\bar{Y}_{\cdot\cdot} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} Y_{ij} / N$$
is the pooled sample mean, with $N = \sum_{i=1}^{k} n_i$ denoting the total sample size. The null hypothesis in (A1) is rejected at level $c$ if $F > F_{c, k-1, N-k}$, where $F_{c, k-1, N-k}$ is the critical value based on the F distribution with $k - 1$ and $N - k$ degrees of freedom.
Let us suppose now that $Y_{ij} \sim N(\mu_i, \sigma_i)$ for $i = 1, \ldots, k$. Welch’s heteroscedastic F test [3] is designed to be robust to the violation of the assumption of equal group variances. The main difference from the classical ANOVA F test is that the following weights are used:
$$w_i = \frac{n_i}{s_i^2}.$$
Welch’s heteroscedastic F test statistic is defined by
$$F_W = A \, \frac{\sum_{i=1}^{k} w_i (\bar{Y}_{i\cdot} - \bar{Y})^2}{k - 1},$$
where
$$\bar{Y} = \frac{\sum_{i=1}^{k} w_i \bar{Y}_{i\cdot}}{\sum_{i=1}^{k} w_i},$$
$$A = \left[ 1 + \frac{2 (k - 2)}{k^2 - 1} \sum_{i=1}^{k} \frac{1}{n_i - 1} \left( 1 - \frac{w_i}{\sum_{i=1}^{k} w_i} \right)^2 \right]^{-1}.$$
The null hypothesis (A1) is rejected at level $c$ if $F_W \geq F_{c, k-1, \nu_W}$, where
$$\nu_W = \left[ \frac{3}{k^2 - 1} \sum_{i=1}^{k} \frac{1}{n_i - 1} \left( 1 - \frac{w_i}{\sum_{i=1}^{k} w_i} \right)^2 \right]^{-1}.$$
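As a minimal sketch of these formulas (not the t1way function used in the paper), the statistic $F_W$ and the degrees of freedom $\nu_W$ can be computed as follows. The function name `welch_anova` is hypothetical, and the p-value is left out because it requires an F distribution function that the Python standard library does not provide.

```python
def welch_anova(groups):
    """Return (F_W, k - 1, nu_W) for a list of samples, following Appendix A."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [sum(g) / len(g) for g in groups]
    variances = [sum((y - m) ** 2 for y in g) / (len(g) - 1)
                 for g, m in zip(groups, means)]
    # weights w_i = n_i / s_i^2
    weights = [n / v for n, v in zip(sizes, variances)]
    total_weight = sum(weights)
    pooled = sum(w * m for w, m in zip(weights, means)) / total_weight
    # shared sum: sum_i (1 - w_i / sum w)^2 / (n_i - 1)
    h = sum((1.0 - w / total_weight) ** 2 / (n - 1)
            for w, n in zip(weights, sizes))
    correction = 1.0 / (1.0 + 2.0 * (k - 2) / (k ** 2 - 1) * h)   # A
    between = sum(w * (m - pooled) ** 2 for w, m in zip(weights, means))
    f_w = correction * between / (k - 1)
    nu_w = 1.0 / (3.0 / (k ** 2 - 1) * h)
    return f_w, k - 1, nu_w

if __name__ == "__main__":
    print(welch_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

The returned statistic would then be compared with the critical value $F_{c, k-1, \nu_W}$ from any F distribution routine.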
Yuen’s test, i.e., the robust modification of Welch’s heteroscedastic F test, is designed to be robust to departures from normality. The test is obtained by using the sample trimmed means and Winsorized variances instead of the sample means and variances. Let $Y_{i(1)}, Y_{i(2)}, \ldots, Y_{i(n_i)}$ denote the order statistics for the i-th sample. Let $q_i = [n_i \alpha_i] + 1$ and $r_i = n_i - [n_i \beta_i]$, where $0 < \alpha_i < 1/2$ and $0 < \beta_i < 1/2$ represent the proportion of observations trimmed from the left and from the right tail of the distribution, respectively, and $[x]$ denotes the largest integer less than or equal to $x$. Then, $m_i = n_i - [n_i \alpha_i] - [n_i \beta_i]$ represents the effective sample size after trimming, and the sample trimmed mean of the i-th group is defined as
$$\bar{Y}_{\alpha\beta i} = \frac{1}{m_i} \sum_{j=q_i}^{r_i} Y_{i(j)}.$$
Let $W_{ij}$ represent the new observations after replacing the trimmed observations in the lower and upper tails with the lowest and highest untrimmed values of the sample, i.e.,
$$W_{ij} = \begin{cases} Y_{i(q_i)}, & Y_{ij} \leq Y_{i(q_i)}, \\ Y_{ij}, & Y_{i(q_i)} < Y_{ij} < Y_{i(r_i)}, \\ Y_{i(r_i)}, & Y_{ij} \geq Y_{i(r_i)}. \end{cases}$$
The sample Winsorized variance for the i-th group is computed as
$$s_{wi}^2 = \frac{\sum_{j=1}^{n_i} (W_{ij} - \bar{Y}_{wi})^2}{n_i - 1},$$
where
$$\bar{Y}_{wi} = \frac{1}{n_i} \sum_{j=1}^{n_i} W_{ij}.$$
Yuen’s test statistic is given by
$$F_{YT} = A_T \, \frac{\sum_{i=1}^{k} w_i^T (\bar{Y}_{\alpha\beta i} - \tilde{Y})^2}{k - 1},$$
where
$$w_i^T = \frac{m_i (m_i - 1)}{(n_i - 1) s_{wi}^2},$$
$$\tilde{Y} = \frac{\sum_{i=1}^{k} w_i^T \bar{Y}_{\alpha\beta i}}{\sum_{i=1}^{k} w_i^T},$$
and
$$A_T = \left[ 1 + \frac{2 (k - 2)}{k^2 - 1} \sum_{i=1}^{k} \frac{1}{m_i - 1} \left( 1 - \frac{w_i^T}{\sum_{i=1}^{k} w_i^T} \right)^2 \right]^{-1}.$$
The null hypothesis of equal (trimmed) means is rejected at level $c$ if $F_{YT} \geq F_{c, k-1, \nu_{YT}}$, where
$$\nu_{YT} = \left[ \frac{3}{k^2 - 1} \sum_{i=1}^{k} \frac{1}{m_i - 1} \left( 1 - \frac{w_i^T}{\sum_{i=1}^{k} w_i^T} \right)^2 \right]^{-1}.$$
Note that this test reduces to Welch’s heteroscedastic F test when there is no trimming.
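The trimmed mean, the Winsorized variance, and Yuen’s statistic defined above can be sketched as follows. The function names are hypothetical, the same trimming proportions are assumed for every group, and a small tolerance is added before taking the integer part so that products such as $n_i \alpha_i$ are not rounded down by floating point error. This is an illustration of the formulas, not the t1way implementation.

```python
import math

def trim_and_winsorize(sample, alpha, beta):
    """Return (trimmed mean, Winsorized variance, m) for one sample."""
    y = sorted(sample)
    n = len(y)
    g_lo = math.floor(n * alpha + 1e-9)   # [n * alpha], trimmed on the left
    g_hi = math.floor(n * beta + 1e-9)    # [n * beta], trimmed on the right
    kept = y[g_lo:n - g_hi]               # order statistics q..r
    m = len(kept)
    trimmed_mean = sum(kept) / m
    # replace trimmed values by the lowest and highest untrimmed values
    winsorized = [kept[0]] * g_lo + kept + [kept[-1]] * g_hi
    w_mean = sum(winsorized) / n
    w_var = sum((w - w_mean) ** 2 for w in winsorized) / (n - 1)
    return trimmed_mean, w_var, m

def yuen_anova(groups, alpha, beta=None):
    """Return (F_YT, k - 1, nu_YT); beta defaults to alpha (symmetric trimming)."""
    beta = alpha if beta is None else beta
    k = len(groups)
    parts = [trim_and_winsorize(g, alpha, beta) for g in groups]
    # weights w_i^T = m_i (m_i - 1) / ((n_i - 1) s_wi^2)
    weights = [m * (m - 1) / ((len(g) - 1) * v)
               for g, (_, v, m) in zip(groups, parts)]
    total = sum(weights)
    pooled = sum(w * t for w, (t, _, _) in zip(weights, parts)) / total
    h = sum((1.0 - w / total) ** 2 / (m - 1)
            for w, (_, _, m) in zip(weights, parts))
    correction = 1.0 / (1.0 + 2.0 * (k - 2) / (k ** 2 - 1) * h)   # A_T
    between = sum(w * (t - pooled) ** 2 for w, (t, _, _) in zip(weights, parts))
    return correction * between / (k - 1), k - 1, 1.0 / (3.0 / (k ** 2 - 1) * h)

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    print(trim_and_winsorize(data, 0.1, 0.1))
```

Passing a trimming proportion small enough that no observation is trimmed reproduces the Welch statistic, which gives a quick consistency check.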

Appendix B. EL-Based ANOVA for Means

Let $F_i$ denote a candidate for the true unknown distribution $F_{i0}$ and $v_{ij} = F_i\{Y_{ij}\}$ denote the jump of $F_i$ at $Y_{ij}$. The EL for the i-th sample is $L(F_i) = \prod_{j=1}^{n_i} v_{ij}$ and corresponds to a multinomial distribution defined on the i-th sample by attaching a weight $v_{ij}$ to each $Y_{ij}$. The weights $v_{ij} = v_{ij}(\mu_i)$ satisfy the conditions
  • $v_{ij} \geq 0$;
  • $\sum_{j=1}^{n_i} v_{ij} = 1$;
  • $\sum_{j=1}^{n_i} v_{ij} Y_{ij} = \mu_i$.
The function $L(F_i)$ attains its maximum value when $v_{ij} = n_i^{-1}$. Similar to the classical approach based on the parametric likelihood, the profile EL ratio function is defined as
$$R(\mu_i) = \sup_{v_{ij}} \left\{ \prod_{j=1}^{n_i} n_i v_{ij} : \sum_{j=1}^{n_i} v_{ij} = 1, \ \sum_{j=1}^{n_i} v_{ij} (Y_{ij} - \mu_i) = 0 \right\}.$$
For an ANOVA model, Owen [15] defined the k-sample EL as the product of k group specific empirical likelihoods. Therefore, given the k samples, the profile EL ratio function can be defined as follows:
$$R(\mu) = \prod_{i=1}^{k} R_i(\mu) = \sup_{v_{ij}} \left\{ \prod_{i=1}^{k} \prod_{j=1}^{n_i} n_i v_{ij} : \sum_{j=1}^{n_i} v_{ij} = 1, \ \sum_{j=1}^{n_i} v_{ij} (Y_{ij} - \mu) = 0 \right\}, \tag{B1}$$
where $v_{ij} = v_{ij}(\mu)$. Under the null hypothesis (A1), the $k - 1$ contrasts between means are constrained to be zero, and if $\mu = \mu_0 + O(n_0^{-1/2})$, where $\mu_0$ is the true unknown common mean and $n_0 = \min_{1 \leq i \leq k} n_i$, then
$$-2 \log \max_{\mu} R(\mu) = \sum_{i=1}^{k} w_i (\bar{Y}_{i\cdot} - \bar{Y})^2 + O_p(n_0^{-1/2}) \xrightarrow{d} \chi^2_{(k-1)} \tag{B2}$$
as $n_0 \to \infty$, where $\bar{Y}_{i\cdot}$ is the sample mean for the i-th sample, $\bar{Y}$ is the common mean estimator
$$\bar{Y} = \frac{\sum_{i=1}^{k} \bar{Y}_{i\cdot} w_i}{\sum_{i=1}^{k} w_i},$$
and the weights $w_i$ are inversely proportional to the sample variances $S_i^2$, i.e.,
$$w_i = \frac{n_i}{S_i^2}.$$
It is important to note that EL-based ANOVA is robust to heteroscedasticity (see [16]).
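A small numerical sketch may help to see how $-2 \log \max_\mu R(\mu)$ can be evaluated in practice. It is not the authors’ R implementation, and the function names are hypothetical. For each group, the Lagrange multiplier $\lambda_i$ solving $\sum_j (Y_{ij} - \mu) / (1 + \lambda_i (Y_{ij} - \mu)) = 0$ is found by bisection, which gives $-2 \log R_i(\mu) = 2 \sum_j \log(1 + \lambda_i (Y_{ij} - \mu))$; the sum over groups is then minimized over $\mu$ by a ternary search, which relies on this function being convex in $\mu$ on the interval where every group has observations on both sides of $\mu$.

```python
import math

def neg2_log_el_ratio(sample, mu, iterations=100):
    """-2 log R_i(mu) for one sample; infinite if mu is outside (min, max)."""
    z = [y - mu for y in sample]
    z_min, z_max = min(z), max(z)
    if z_min >= 0.0 or z_max <= 0.0:
        return math.inf
    # lambda must keep every 1 + lambda * z_j positive
    lo = (-1.0 / z_max) * (1.0 - 1e-10)
    hi = (-1.0 / z_min) * (1.0 - 1e-10)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # the score below is strictly decreasing in lambda
        score = sum(zj / (1.0 + mid * zj) for zj in z)
        if score > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zj) for zj in z)

def el_anova_means(groups, iterations=100):
    """Return (-2 log max_mu R(mu), maximizing mu) for k samples."""
    lo = max(min(g) for g in groups)
    hi = min(max(g) for g in groups)
    if lo >= hi:
        return math.inf, None   # no common mean is compatible with all groups

    def objective(mu):
        return sum(neg2_log_el_ratio(g, mu) for g in groups)

    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    mu_hat = 0.5 * (lo + hi)
    return objective(mu_hat), mu_hat

if __name__ == "__main__":
    stat, mu_hat = el_anova_means([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [4, 5, 6, 7, 8]])
    # for k = 3 the chi-square reference distribution has 2 df,
    # whose survival function is exp(-x / 2)
    print(stat, mu_hat, math.exp(-stat / 2.0))
```

The closed-form p-value in the usage lines applies only to $k = 3$; other values of $k$ need a general chi-square distribution function.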

Appendix C. EL for the Trimmed Mean

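In a one sample situation, let $Y_1, Y_2, \ldots, Y_n$ be independent identically distributed with $Y_1 \sim F_0$ and let $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$ be the respective order statistics. Let $q = [n\alpha] + 1$ and $r = n - [n\beta]$, where $0 < \alpha < 1/2$ and $0 < \beta < 1/2$ represent the proportion of observations trimmed from the left and right tails, respectively, and $[x]$ denotes the largest integer less than or equal to $x$. Then, $m = n - [n\alpha] - [n\beta]$ is the effective sample size after trimming. Let weights $p_j = 0$ for $j < q$ and $j > r$, $p_j \geq 0$ for $q \leq j \leq r$, and $\sum_{j=q}^{r} p_j = 1$. Then, the profile EL ratio function for the trimmed sample is defined as
$$R(\mu_{\alpha\beta}) = \sup_{p_j} \left\{ \prod_{j=q}^{r} m p_j : p_j \geq 0, \ \sum_{j=q}^{r} p_j = 1, \ \sum_{j=q}^{r} p_j Y_{(j)} = \mu_{\alpha\beta} \right\}.$$
Theorem C1 (Qin and Tsao [8]). Let $\mu_{\alpha\beta 0}$ be the true value of $\mu_{\alpha\beta}$. Assume $F_0$ is continuous, $F_0(\xi_\alpha) > 0$ and $F_0(\xi_\beta) > 0$. Then,
$$-2 a \log R(\mu_{\alpha\beta}) \xrightarrow{d} \chi^2_1,$$
where
$$a = \sigma_{\alpha\beta}^2 / \left( (1 - \alpha - \beta) \tau_{\alpha\beta}^2 \right),$$
$$\sigma_{\alpha\beta}^2 = \frac{1}{1 - \alpha - \beta} \int_{\xi_\alpha}^{\xi_{1-\beta}} y^2 \, dF_0(y) - \mu_{\alpha\beta}^2,$$
$$\tau_{\alpha\beta}^2 = \frac{1}{(1 - \alpha - \beta)^2} \Big( (1 - \alpha - \beta) \sigma_{\alpha\beta}^2 + \beta (1 - \beta) (\xi_{1-\beta} - \mu_{\alpha\beta})^2 - 2 \alpha \beta (\xi_\alpha - \mu_{\alpha\beta}) (\xi_{1-\beta} - \mu_{\alpha\beta}) + \alpha (1 - \alpha) (\xi_\alpha - \mu_{\alpha\beta})^2 \Big).$$
The unknown scaling constant $a$ can be estimated consistently via $\hat{a} = \hat{\sigma}_{\alpha\beta}^2 / \left( (1 - \alpha - \beta) \hat{\tau}_{\alpha\beta}^2 \right)$, where
$$\hat{\sigma}_{\alpha\beta}^2 = \frac{1}{1 - \alpha - \beta} \int_{\hat{\xi}_\alpha}^{\hat{\xi}_{1-\beta}} y^2 \, dF_n(y) - \bar{Y}_{\alpha\beta}^2,$$
$$\hat{\tau}_{\alpha\beta}^2 = \frac{1}{(1 - \alpha - \beta)^2} \Big( (1 - \alpha - \beta) \hat{\sigma}_{\alpha\beta}^2 + \beta (1 - \beta) (\hat{\xi}_{1-\beta} - \bar{Y}_{\alpha\beta})^2 - 2 \alpha \beta (\hat{\xi}_\alpha - \bar{Y}_{\alpha\beta}) (\hat{\xi}_{1-\beta} - \bar{Y}_{\alpha\beta}) + \alpha (1 - \alpha) (\hat{\xi}_\alpha - \bar{Y}_{\alpha\beta})^2 \Big),$$
where $\hat{\xi}_p = \inf \{ y : F_n(y) \geq p \}$ for any $0 < p < 1$, and $F_n(y)$ is the empirical distribution function.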
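The estimator $\hat{a}$ can be sketched numerically as follows. Two choices in this sketch are assumptions rather than statements from the paper: the integral $\int y^2 \, dF_n(y)$ is taken as the sum of $Y_{(j)}^2 / n$ over the retained order statistics $j = q, \ldots, r$, and the example assumes that $n\alpha$ and $n\beta$ are integers, so that $m / n = 1 - \alpha - \beta$. The function name `el_scale_factor` is hypothetical.

```python
import math

def el_scale_factor(sample, alpha, beta):
    """Estimate a = sigma^2 / ((1 - alpha - beta) tau^2) from one sample."""
    y = sorted(sample)
    n = len(y)
    g_lo = math.floor(n * alpha + 1e-9)      # [n * alpha]
    g_hi = math.floor(n * beta + 1e-9)       # [n * beta]
    kept = y[g_lo:n - g_hi]                  # order statistics q..r
    trimmed_mean = sum(kept) / len(kept)
    keep = 1.0 - alpha - beta
    # assumption: the integral of y^2 dF_n runs over the retained observations
    sigma2 = sum(v * v for v in kept) / (n * keep) - trimmed_mean ** 2
    # empirical quantiles inf{y : F_n(y) >= p}, i.e. the ceil(n p)-th order statistic
    xi_lo = y[max(math.ceil(n * alpha - 1e-9), 1) - 1]
    xi_hi = y[math.ceil(n * (1.0 - beta) - 1e-9) - 1]
    d_lo = xi_lo - trimmed_mean
    d_hi = xi_hi - trimmed_mean
    tau2 = (keep * sigma2
            + beta * (1.0 - beta) * d_hi ** 2
            - 2.0 * alpha * beta * d_lo * d_hi
            + alpha * (1.0 - alpha) * d_lo ** 2) / keep ** 2
    return sigma2 / (keep * tau2)

if __name__ == "__main__":
    print(el_scale_factor(range(1, 11), 0.1, 0.1))
```

Multiplying the unscaled statistic $-2 \log R(\mu_{\alpha\beta})$ by this factor gives the quantity that Theorem C1 compares with the $\chi^2_1$ distribution.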

Appendix D. EL-Based ANOVA for Trimmed Means

The ELRT (B2) is still sensitive to outliers. Following the approach from [8], one could develop a version of (B2) with trimmed means. The main result stated in [8] is given in Appendix C.
Let $Y_{i(1)}, Y_{i(2)}, \ldots, Y_{i(n_i)}$ denote the ordered i-th sample, and set $q_i = [n_i \alpha_i] + 1$ and $r_i = n_i - [n_i \beta_i]$, where $0 < \alpha_i < 1/2$ and $0 < \beta_i < 1/2$ represent the proportion of observations trimmed from the left and the right tails, respectively, and $[x]$ denotes the largest integer less than or equal to $x$. Then, $m_i = n_i - [n_i \alpha_i] - [n_i \beta_i]$ is the effective sample size after trimming in each group. The group specific trimmed means and variances are
$$\bar{Y}_{\alpha\beta i} = \frac{1}{m_i} \sum_{j=q_i}^{r_i} Y_{i(j)}, \qquad S_{\alpha\beta i}^2 = m_i^{-1} \sum_{j=q_i}^{r_i} (Y_{i(j)} - \bar{Y}_{\alpha\beta i})^2.$$
Similar to (B1), define the profile EL ratio function over the trimmed samples, that is, as if the $m_i$ observations in each sample are independent, i.e.,
$$R(\mu_{\alpha\beta}) = \prod_{i=1}^{k} R_i(\mu_{\alpha\beta i}) = \sup_{v_{ij}} \left\{ \prod_{i=1}^{k} \prod_{j=q_i}^{r_i} m_i v_{ij} : \sum_{j=q_i}^{r_i} v_{ij} = 1, \ \sum_{j=q_i}^{r_i} v_{ij} (Y_{i(j)} - \mu_{\alpha\beta}) = 0 \right\}. \tag{D1}$$
It is important to note that the weights are no longer a function of the common population mean but of the common population trimmed mean, that is, $v_{ij} = v_{ij}(\mu_{\alpha\beta})$. As a consequence, we obtain an ELRT for a different null hypothesis, which claims the equality of population trimmed means (see [8,10]), that is,
$$H_0^T: \mu_{\alpha\beta 1} = \mu_{\alpha\beta 2} = \cdots = \mu_{\alpha\beta k} = \mu_{\alpha\beta}. \tag{D2}$$
When the underlying distribution of the data in each group is symmetric, the two hypotheses (A1) and (D2) are equivalent if symmetric trimming is performed. This equivalence does not hold for skewed distributions, for which it may be preferable to compare trimmed means rather than means [5]. The following result holds.
Theorem D1. Let $\mu_{\alpha\beta 0}$ be the common population trimmed mean. Assume that $F_{i0}$ is continuous, $F_{i0}(\xi_\alpha) > 0$ and $F_{i0}(\xi_\beta) > 0$ for each $i = 1, \ldots, k$. If $\mu_{\alpha\beta i} = \mu_{\alpha\beta 0} + O_p(m_0^{-1/2})$, $i = 1, 2, \ldots, k$, where $m_0 = \min_{1 \leq i \leq k} m_i$, then
$$-2 \sum_{i=1}^{k} a_i \max_{\mu} \log R_i(\mu_{\alpha\beta}) = \sum_{i=1}^{k} a_i w_{\alpha\beta i} (\bar{Y}_{\alpha\beta i} - \bar{Y}_{\alpha\beta})^2 + O_p(m_0^{-1/2}) \xrightarrow{d} \chi^2_{(k-1)}, \tag{D3}$$
where
$$\bar{Y}_{\alpha\beta} = \frac{\sum_{i=1}^{k} \bar{Y}_{\alpha\beta i} w_{\alpha\beta i}}{\sum_{i=1}^{k} w_{\alpha\beta i}} + o_p(m_0^{-1/2}), \qquad w_{\alpha\beta i} = \frac{m_i}{S_{\alpha\beta i}^2},$$
and the scaling factors are given by
$$a_i = \sigma_{\alpha\beta i}^2 / \left( (1 - \alpha_i - \beta_i) \tau_{\alpha\beta i}^2 \right).$$
The quantities $\sigma_{\alpha\beta i}^2$ and $\tau_{\alpha\beta i}^2$ for the i-th trimmed sample are defined in Appendix C.
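Proof.
By a Lagrange multiplier argument, it can be shown (see, for example, [9]) that the $v_{ij}$, $i = 1, 2, \ldots, k$, that maximize $R(\mu_{\alpha\beta})$ are given by
$$v_{ij} = \frac{1}{m_i \left( 1 + \lambda_i (Y_{i(j)} - \mu_{\alpha\beta}) \right)}, \quad j = q_i, \ldots, r_i, \tag{D5}$$
where the Lagrange multiplier $\lambda_i$ is the solution to
$$\sum_{j=q_i}^{r_i} \frac{Y_{i(j)} - \mu_{\alpha\beta}}{1 + \lambda_i (Y_{i(j)} - \mu_{\alpha\beta})} = 0,$$
and
$$\lambda_i = \frac{\bar{Y}_{\alpha\beta i} - \mu_{\alpha\beta}}{\frac{1}{m_i} \sum_{j=q_i}^{r_i} (Y_{i(j)} - \mu_{\alpha\beta})^2} + o_p(m_i^{-1/2}).$$
Then, by substituting $v_{ij}$ from (D5) in the expression (D1), we obtain
$$-2 \log R(\mu_{\alpha\beta}) = 2 \sum_{i=1}^{k} \sum_{j=q_i}^{r_i} \log \left( 1 + \lambda_i (Y_{i(j)} - \mu_{\alpha\beta}) \right).$$
The maximum empirical likelihood estimator $\bar{Y}_{\alpha\beta}$ is the solution to $\partial \log R(\mu_{\alpha\beta}) / \partial \mu_{\alpha\beta} = 0$, where
$$\frac{\partial \log R(\mu_{\alpha\beta})}{\partial \mu_{\alpha\beta}} = -\sum_{i=1}^{k} \sum_{j=q_i}^{r_i} \left\{ \frac{\partial \lambda_i}{\partial \mu_{\alpha\beta}} (Y_{i(j)} - \mu_{\alpha\beta}) - \lambda_i \right\} \left\{ 1 + \lambda_i (Y_{i(j)} - \mu_{\alpha\beta}) \right\}^{-1} = \sum_{i=1}^{k} m_i \lambda_i.$$
The last equality uses the equation that defines $\lambda_i$ together with $\sum_{j=q_i}^{r_i} v_{ij} = 1$. It follows that $\bar{Y}_{\alpha\beta}$ satisfies
$$\sum_{i=1}^{k} \frac{m_i (\bar{Y}_{\alpha\beta i} - \mu_{\alpha\beta})}{m_i^{-1} \sum_{j=q_i}^{r_i} (Y_{i(j)} - \mu_{\alpha\beta})^2} = o_p(m_0),$$
and expression (D3) follows. Since, according to Theorem C1, for each of the trimmed samples $i = 1, \ldots, k$ and true value $\mu_{\alpha\beta 0}$,
$$-2 a_i \log R_i(\mu_{\alpha\beta 0}) \xrightarrow{d} \chi^2_1,$$
then, summing over the groups, we prove the result stated in (D3), by using the same arguments leading to the result stated in (B2). ☐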
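The leading term on the right-hand side of (D3) is straightforward to evaluate, and the following sketch does so for symmetric trimming at a common level $c$. It is an illustration only: it computes the approximation $\sum_i \hat{a}_i w_{\alpha\beta i} (\bar{Y}_{\alpha\beta i} - \bar{Y}_{\alpha\beta})^2$ rather than the full EL maximization, it assumes that $n_i c$ is an integer, and it takes the integral in $\hat{\sigma}^2_{\alpha\beta i}$ as a sum over the retained order statistics. All function names are hypothetical.

```python
import math

def trimmed_summary(sample, c):
    """Return (m, trimmed mean, S^2, a_hat) for symmetric trimming level c."""
    y = sorted(sample)
    n = len(y)
    g = math.floor(n * c + 1e-9)             # observations trimmed per tail
    kept = y[g:n - g]                        # order statistics q..r
    m = len(kept)
    t_mean = sum(kept) / m
    s2 = sum((v - t_mean) ** 2 for v in kept) / m      # divisor m, as in Appendix D
    keep = 1.0 - 2.0 * c
    # assumption: integral of y^2 dF_n taken over the retained observations
    sigma2 = sum(v * v for v in kept) / (n * keep) - t_mean ** 2
    xi_lo = y[max(math.ceil(n * c - 1e-9), 1) - 1]
    xi_hi = y[math.ceil(n * (1.0 - c) - 1e-9) - 1]
    d_lo, d_hi = xi_lo - t_mean, xi_hi - t_mean
    tau2 = (keep * sigma2 + c * (1.0 - c) * d_hi ** 2
            - 2.0 * c * c * d_lo * d_hi
            + c * (1.0 - c) * d_lo ** 2) / keep ** 2
    return m, t_mean, s2, sigma2 / (keep * tau2)

def elt_leading_term(groups, c):
    """Leading term of (D3): sum_i a_i w_i (trimmed mean_i - pooled)^2."""
    stats = [trimmed_summary(g, c) for g in groups]
    weights = [m / s2 for m, _, s2, _ in stats]        # w_i = m_i / S_i^2
    pooled = (sum(w * t for w, (_, t, _, _) in zip(weights, stats))
              / sum(weights))
    return sum(a * w * (t - pooled) ** 2
               for w, (_, t, _, a) in zip(weights, stats))

if __name__ == "__main__":
    base = list(range(1, 11))
    groups = [base, [v + 1 for v in base], base]
    stat = elt_leading_term(groups, 0.1)
    # with k = 3 groups the reference chi-square distribution has 2 df
    print(stat, math.exp(-stat / 2.0))
```

As in the case of means, the closed-form p-value in the usage lines is specific to $k = 3$.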

References

  1. Wilcox, R. Comparing the means of two independent groups. Biom. J. 1990, 32, 771–780.
  2. Yuen, K. The two-sample trimmed t for unequal population variances. Biometrika 1974, 61, 165–170.
  3. Welch, B. On the comparison of several mean values: An alternative approach. Biometrika 1951, 38, 330–336.
  4. Lix, L.; Keselman, H. To trim or not to trim: Tests of location equality under heteroscedasticity and nonnormality. Educ. Psychol. Meas. 1998, 58, 409–429.
  5. Keselman, H.; Wilcox, R.; Kowalchuk, R.; Olejnik, S. Comparing trimmed or least squares means of two independent skewed populations. Biom. J. 2002, 44, 478–489.
  6. Owen, A. Empirical likelihood ratio confidence intervals for a single functional. Biometrika 1988, 75, 237–249.
  7. Owen, A. Empirical likelihood ratio confidence regions. Ann. Stat. 1990, 18, 90–120.
  8. Qin, G.; Tsao, M. Empirical likelihood ratio confidence interval for the trimmed mean. Commun. Stat. Theory Methods 2002, 31, 2197–2208.
  9. Owen, A. Empirical Likelihood; Chapman & Hall/CRC: New York, NY, USA, 2001.
  10. Tsao, M.; Zhou, J. A nonparametric confidence interval for the trimmed mean. J. Nonparametr. Stat. 2002, 14, 665–673.
  11. Wilcox, R. Introduction to Robust Estimation and Hypothesis Testing, 2nd ed.; Elsevier Academic Press: Burlington, MA, USA, 2005.
  12. Azzalini, A. A class of distributions which includes the normal ones. Scand. J. Stat. 1985, 12, 171–178.
  13. Reimann, C.; Arnoldussen, A.; Boyd, R.; Finne, T.; Koller, F.; Nordulgen, O. Element contents in leaves of four plant species (birch, mountain ash, fern and spruce) along anthropogenic and geogenic concentration gradients. Sci. Total Environ. 2007, 377, 416–433.
  14. Todorov, V.; Filzmoser, P. An object-oriented framework for robust multivariate analysis. J. Stat. Softw. 2009, 32, 1–47.
  15. Owen, A. Empirical likelihood for linear models. Ann. Stat. 1991, 19, 1725–1747.
  16. Tsao, W.; Wu, C. Empirical likelihood inference for a common mean in presence of heteroscedasticity. Can. J. Stat. 2006, 34, 45–59.
Table 1. Empirical probability of type I error for various tests for the equality of means and trimmed means of three independent skewed distributions with homogeneous variances. For methods involving trimmed means, symmetric trimming at level $\alpha_i = \beta_i = c$, $i = 1, 2, 3$, is used.

$\chi^2_3$

| n | F test | Welch | EL | Yuen (c = 5%) | ELT (c = 5%) | Yuen (c = 10%) | ELT (c = 10%) | Yuen (c = 20%) | ELT (c = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 0.047 | 0.076 | 0.052 | 0.050 | 0.079 | 0.050 | 0.080 | 0.049 | 0.090 |
| 30 | 0.048 | 0.070 | 0.054 | 0.054 | 0.055 | 0.053 | 0.075 | 0.054 | 0.079 |
| 40 | 0.044 | 0.062 | 0.052 | 0.049 | 0.063 | 0.048 | 0.063 | 0.050 | 0.069 |
| 50 | 0.048 | 0.058 | 0.049 | 0.046 | 0.047 | 0.047 | 0.060 | 0.049 | 0.067 |
| 100 | 0.049 | 0.055 | 0.051 | 0.050 | 0.056 | 0.050 | 0.056 | 0.049 | 0.056 |
| 200 | 0.051 | 0.055 | 0.053 | 0.050 | 0.053 | 0.051 | 0.053 | 0.051 | 0.056 |
| 500 | 0.051 | 0.049 | 0.049 | 0.048 | 0.050 | 0.050 | 0.051 | 0.051 | 0.052 |

Lognormal ($\mu = 0$, $\sigma = 1$)

| n | F test | Welch | EL | Yuen (c = 5%) | ELT (c = 5%) | Yuen (c = 10%) | ELT (c = 10%) | Yuen (c = 20%) | ELT (c = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 0.044 | 0.073 | 0.047 | 0.040 | 0.069 | 0.040 | 0.070 | 0.040 | 0.081 |
| 30 | 0.045 | 0.069 | 0.050 | 0.048 | 0.049 | 0.046 | 0.068 | 0.046 | 0.072 |
| 40 | 0.044 | 0.063 | 0.049 | 0.046 | 0.062 | 0.046 | 0.062 | 0.045 | 0.066 |
| 50 | 0.045 | 0.065 | 0.054 | 0.049 | 0.048 | 0.047 | 0.059 | 0.046 | 0.062 |
| 100 | 0.049 | 0.059 | 0.055 | 0.049 | 0.055 | 0.049 | 0.057 | 0.049 | 0.057 |
| 200 | 0.050 | 0.053 | 0.050 | 0.045 | 0.048 | 0.046 | 0.049 | 0.048 | 0.052 |
| 500 | 0.051 | 0.053 | 0.052 | 0.051 | 0.052 | 0.050 | 0.050 | 0.052 | 0.054 |

Gamma ($\alpha = 2$, $\sigma = 1$)

| n | F test | Welch | EL | Yuen (c = 5%) | ELT (c = 5%) | Yuen (c = 10%) | ELT (c = 10%) | Yuen (c = 20%) | ELT (c = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 0.052 | 0.078 | 0.050 | 0.050 | 0.077 | 0.052 | 0.079 | 0.052 | 0.096 |
| 30 | 0.049 | 0.069 | 0.053 | 0.052 | 0.053 | 0.050 | 0.070 | 0.052 | 0.080 |
| 40 | 0.050 | 0.062 | 0.052 | 0.052 | 0.064 | 0.052 | 0.067 | 0.053 | 0.074 |
| 50 | 0.050 | 0.060 | 0.051 | 0.048 | 0.048 | 0.050 | 0.062 | 0.052 | 0.069 |
| 100 | 0.052 | 0.057 | 0.052 | 0.051 | 0.057 | 0.050 | 0.056 | 0.048 | 0.056 |
| 200 | 0.052 | 0.056 | 0.053 | 0.052 | 0.055 | 0.053 | 0.055 | 0.052 | 0.055 |
| 500 | 0.049 | 0.052 | 0.051 | 0.050 | 0.051 | 0.049 | 0.051 | 0.051 | 0.052 |

Skew-normal ($\xi = 0$, $\omega = 1$, $\alpha = 1$)

| n | F test | Welch | EL | Yuen (c = 5%) | ELT (c = 5%) | Yuen (c = 10%) | ELT (c = 10%) | Yuen (c = 20%) | ELT (c = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 0.055 | 0.077 | 0.049 | 0.049 | 0.077 | 0.051 | 0.083 | 0.054 | 0.099 |
| 30 | 0.048 | 0.065 | 0.050 | 0.051 | 0.051 | 0.050 | 0.068 | 0.051 | 0.078 |
| 40 | 0.049 | 0.061 | 0.049 | 0.049 | 0.062 | 0.051 | 0.065 | 0.051 | 0.073 |
| 50 | 0.051 | 0.058 | 0.049 | 0.049 | 0.050 | 0.050 | 0.061 | 0.052 | 0.071 |
| 100 | 0.055 | 0.055 | 0.052 | 0.051 | 0.056 | 0.050 | 0.056 | 0.052 | 0.060 |
| 200 | 0.052 | 0.051 | 0.049 | 0.050 | 0.052 | 0.051 | 0.054 | 0.052 | 0.056 |
| 500 | 0.046 | 0.048 | 0.047 | 0.047 | 0.048 | 0.048 | 0.049 | 0.048 | 0.049 |
Table 2. Empirical probability of type I error for various tests for the equality of means and trimmed means of three independent skewed distributions with the ratios between variances being 1:4:9. For methods involving trimmed means, symmetric trimming at level $\alpha_i = \beta_i = c$, $i = 1, 2, 3$, is used.

$\chi^2_3$

| n | F test | Welch | EL | Yuen (c = 5%) | ELT (c = 5%) | Yuen (c = 10%) | ELT (c = 10%) | Yuen (c = 20%) | ELT (c = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 0.086 | 0.101 | 0.071 | 0.064 | 0.096 | 0.061 | 0.094 | 0.062 | 0.109 |
| 30 | 0.083 | 0.088 | 0.071 | 0.060 | 0.062 | 0.064 | 0.084 | 0.064 | 0.091 |
| 40 | 0.080 | 0.075 | 0.062 | 0.060 | 0.072 | 0.055 | 0.072 | 0.054 | 0.078 |
| 50 | 0.079 | 0.067 | 0.057 | 0.050 | 0.051 | 0.051 | 0.066 | 0.052 | 0.071 |
| 100 | 0.079 | 0.063 | 0.058 | 0.055 | 0.060 | 0.055 | 0.060 | 0.054 | 0.062 |
| 200 | 0.085 | 0.060 | 0.057 | 0.057 | 0.059 | 0.054 | 0.058 | 0.054 | 0.058 |
| 500 | 0.073 | 0.052 | 0.051 | 0.053 | 0.054 | 0.052 | 0.054 | 0.051 | 0.053 |

Lognormal ($\mu = 0$, $\sigma = 1$)

| n | F test | Welch | EL | Yuen (c = 5%) | ELT (c = 5%) | Yuen (c = 10%) | ELT (c = 10%) | Yuen (c = 20%) | ELT (c = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 0.110 | 0.146 | 0.113 | 0.078 | 0.106 | 0.070 | 0.106 | 0.066 | 0.114 |
| 30 | 0.109 | 0.131 | 0.111 | 0.063 | 0.064 | 0.065 | 0.088 | 0.063 | 0.090 |
| 40 | 0.100 | 0.115 | 0.100 | 0.066 | 0.081 | 0.062 | 0.083 | 0.059 | 0.084 |
| 50 | 0.098 | 0.110 | 0.099 | 0.060 | 0.060 | 0.060 | 0.074 | 0.059 | 0.077 |
| 100 | 0.097 | 0.090 | 0.083 | 0.058 | 0.063 | 0.055 | 0.062 | 0.055 | 0.063 |
| 200 | 0.082 | 0.071 | 0.069 | 0.051 | 0.055 | 0.052 | 0.055 | 0.054 | 0.058 |
| 500 | 0.077 | 0.061 | 0.060 | 0.051 | 0.053 | 0.052 | 0.053 | 0.051 | 0.054 |

Gamma ($\alpha = 2$, $\sigma = 1$)

| n | F test | Welch | EL | Yuen (c = 5%) | ELT (c = 5%) | Yuen (c = 10%) | ELT (c = 10%) | Yuen (c = 20%) | ELT (c = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 0.086 | 0.099 | 0.070 | 0.061 | 0.093 | 0.062 | 0.097 | 0.065 | 0.111 |
| 30 | 0.083 | 0.080 | 0.063 | 0.058 | 0.062 | 0.056 | 0.079 | 0.060 | 0.090 |
| 40 | 0.079 | 0.074 | 0.061 | 0.058 | 0.073 | 0.056 | 0.073 | 0.059 | 0.083 |
| 50 | 0.079 | 0.068 | 0.057 | 0.049 | 0.050 | 0.053 | 0.066 | 0.058 | 0.075 |
| 100 | 0.078 | 0.059 | 0.055 | 0.053 | 0.060 | 0.053 | 0.060 | 0.053 | 0.061 |
| 200 | 0.078 | 0.057 | 0.054 | 0.053 | 0.055 | 0.053 | 0.057 | 0.053 | 0.057 |
| 500 | 0.079 | 0.052 | 0.050 | 0.049 | 0.051 | 0.050 | 0.050 | 0.052 | 0.053 |

Skew-normal ($\xi = 0$, $\omega = 1$, $\alpha = 1$)

| n | F test | Welch | EL | Yuen (c = 5%) | ELT (c = 5%) | Yuen (c = 10%) | ELT (c = 10%) | Yuen (c = 20%) | ELT (c = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 0.080 | 0.079 | 0.048 | 0.048 | 0.081 | 0.052 | 0.085 | 0.054 | 0.106 |
| 30 | 0.075 | 0.069 | 0.050 | 0.051 | 0.054 | 0.053 | 0.073 | 0.057 | 0.085 |
| 40 | 0.080 | 0.060 | 0.049 | 0.050 | 0.064 | 0.050 | 0.069 | 0.053 | 0.075 |
| 50 | 0.076 | 0.060 | 0.050 | 0.050 | 0.051 | 0.051 | 0.064 | 0.053 | 0.073 |
| 100 | 0.079 | 0.053 | 0.049 | 0.048 | 0.054 | 0.048 | 0.055 | 0.051 | 0.059 |
| 200 | 0.078 | 0.054 | 0.051 | 0.051 | 0.054 | 0.051 | 0.054 | 0.052 | 0.056 |
| 500 | 0.075 | 0.049 | 0.047 | 0.048 | 0.049 | 0.048 | 0.049 | 0.048 | 0.049 |
Table 3. Empirical probability of type I error for various tests for the equality of means and trimmed means of three independent skewed distributions with the ratios between variances being 1:1:36. For methods involving trimmed means, symmetric trimming at level α i = β i = c , i = 1 , 2 , 3 is used.
Table 3. Empirical probability of type I error for various tests for the equality of means and trimmed means of three independent skewed distributions with the ratios between variances being 1:1:36. For methods involving trimmed means, symmetric trimming at level α i = β i = c , i = 1 , 2 , 3 is used.
$\chi^2_3$
Trimming level                c = 5%          c = 10%         c = 20%
n     F test  Welch   EL      Yuen    ELT     Yuen    ELT     Yuen    ELT
20    0.124   0.090   0.067   0.059   0.087   0.056   0.088   0.056   0.102
30    0.119   0.080   0.064   0.056   0.058   0.058   0.078   0.058   0.086
40    0.116   0.070   0.058   0.055   0.067   0.053   0.067   0.052   0.072
50    0.112   0.063   0.052   0.048   0.049   0.051   0.063   0.052   0.069
100   0.111   0.061   0.055   0.053   0.059   0.051   0.058   0.050   0.058
200   0.113   0.059   0.056   0.054   0.057   0.053   0.056   0.052   0.056
500   0.102   0.053   0.053   0.053   0.054   0.050   0.052   0.050   0.052
Lognormal ($\mu = 0$, $\sigma = 1$)
Trimming level                c = 5%          c = 10%         c = 20%
n     F test  Welch   EL      Yuen    ELT     Yuen    ELT     Yuen    ELT
20    0.168   0.126   0.101   0.071   0.095   0.062   0.095   0.058   0.103
30    0.166   0.118   0.098   0.055   0.056   0.060   0.081   0.057   0.084
40    0.153   0.104   0.089   0.060   0.076   0.061   0.077   0.057   0.080
50    0.148   0.095   0.086   0.052   0.053   0.056   0.068   0.055   0.072
100   0.136   0.080   0.075   0.054   0.061   0.053   0.062   0.054   0.061
200   0.119   0.064   0.062   0.050   0.054   0.049   0.052   0.051   0.055
500   0.112   0.056   0.055   0.052   0.053   0.050   0.051   0.052   0.054
Gamma ($\alpha = 2$, $\sigma = 1$)
Trimming level                c = 5%          c = 10%         c = 20%
n     F test  Welch   EL      Yuen    ELT     Yuen    ELT     Yuen    ELT
20    0.123   0.089   0.064   0.058   0.089   0.059   0.093   0.060   0.107
30    0.122   0.079   0.061   0.057   0.059   0.055   0.077   0.058   0.086
40    0.116   0.071   0.057   0.056   0.069   0.054   0.070   0.055   0.077
50    0.113   0.066   0.055   0.051   0.052   0.051   0.064   0.053   0.070
100   0.110   0.059   0.053   0.053   0.058   0.052   0.057   0.051   0.060
200   0.109   0.054   0.051   0.051   0.054   0.050   0.054   0.051   0.055
500   0.108   0.052   0.051   0.049   0.050   0.050   0.051   0.050   0.051
Skew-normal ($\xi = 0$, $\omega = 1$, $\alpha = 1$)
Trimming level                c = 5%          c = 10%         c = 20%
n     F test  Welch   EL      Yuen    ELT     Yuen    ELT     Yuen    ELT
20    0.113   0.077   0.050   0.047   0.080   0.049   0.083   0.054   0.103
30    0.107   0.067   0.051   0.050   0.052   0.052   0.074   0.054   0.083
40    0.112   0.060   0.049   0.049   0.063   0.051   0.067   0.054   0.074
50    0.107   0.058   0.048   0.049   0.050   0.051   0.064   0.055   0.074
100   0.107   0.054   0.048   0.048   0.054   0.049   0.056   0.052   0.060
200   0.106   0.053   0.051   0.052   0.054   0.051   0.054   0.053   0.057
500   0.106   0.048   0.048   0.050   0.051   0.050   0.051   0.049   0.051
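The empirical type I error rates reported in Table 3 are Monte Carlo rejection frequencies under the null hypothesis. The following is a minimal sketch of how such a frequency might be estimated for the classical F test only; it is not the authors' simulation code. The data-generating scheme (lognormal draws centred at their true mean and scaled by standard deviations 1, 1, 6 to give variance ratios 1:1:36), the function names, the number of replicates, and the seed are all assumptions made for illustration. For three groups the numerator degrees of freedom equal 2, so the F survival function has a closed form and no external library is needed.

```python
import math
import random


def f_test_pvalue_three_groups(groups):
    """p-value of the classical one-way ANOVA F test for exactly three groups."""
    assert len(groups) == 3
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df2 = n_total - 3
    f_stat = (ss_between / 2.0) / (ss_within / df2)
    # With numerator df = 2 the F survival function is
    # P(F > x) = (1 + 2x/df2) ** (-df2/2).
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)


def simulate_type1_error(n=20, sds=(1.0, 1.0, 6.0), reps=2000, level=0.05, seed=1):
    """Monte Carlo rejection rate of the F test when all population means are equal."""
    rng = random.Random(seed)
    # Assumed scheme: centre lognormal(0, 1) draws at their true mean exp(1/2),
    # then scale, so every group has mean 0 and the variances are in ratio 1:1:36.
    lognormal_mean = math.exp(0.5)
    rejections = 0
    for _ in range(reps):
        groups = [
            [sd * (rng.lognormvariate(0.0, 1.0) - lognormal_mean) for _ in range(n)]
            for sd in sds
        ]
        if f_test_pvalue_three_groups(groups) < level:
            rejections += 1
    return rejections / reps
```

Calling `simulate_type1_error(n=20)` returns an estimated rejection rate that can be set against the nominal 0.05 level in the same way as the F test column of the table; because the generating scheme here is assumed, the estimate should not be expected to reproduce the tabulated values.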
Table 4. p-values from tests of equality of means and trimmed means of 23 chemical element concentrations in plants collected along the Oslo Transect [13]. Symmetric trimming, $\alpha_i = \beta_i = c$, $i = 1, \ldots, 4$.
Trimming level                    c = 5%          c = 10%         c = 20%
Element   F test  Welch   EL      Yuen    ELT     Yuen    ELT     Yuen    ELT
Ag        0.26    0.10    0.09    0.22    0.23    0.42    0.41    0.74    0.73
B         0.08    0.09    0.07    0.10    0.09    0.12    0.11    0.18    0.16
Ba        0.01    0.01    0.01    0.03    0.03    0.02    0.02    <0.01   <0.01
Ca        0.15    0.19    0.18    0.22    0.22    0.31    0.31    0.42    0.41
Cd        0.08    0.05    0.04    0.09    0.09    0.05    0.05    0.03    0.02
Co        <0.01   0.01    0.01    <0.01   <0.01   <0.01   <0.01   <0.01   <0.01
Cr        0.17    <0.01   <0.01   <0.01   <0.01   <0.01   <0.01   <0.01   <0.01
Cu        0.44    0.26    0.24    0.66    0.67    0.77    0.76    0.76    0.75
Fe        0.03    0.02    0.01    0.04    0.04    0.02    0.02    0.04    0.03
Hg        0.31    0.29    0.27    0.35    0.37    0.19    0.18    0.40    0.38
K         0.47    0.28    0.26    0.50    0.52    0.53    0.52    0.58    0.57
La        0.28    <0.01   <0.01   0.01    0.01    0.13    0.10    0.01    0.01
Mg        0.24    0.23    0.21    0.28    0.28    0.38    0.37    0.57    0.56
Mn        <0.01   <0.01   <0.01   <0.01   <0.01   <0.01   <0.01   <0.01   <0.01
Mo        0.02    <0.01   <0.01   0.02    0.02    0.04    0.03    0.17    0.15
Ni        <0.01   <0.01   <0.01   <0.01   <0.01   <0.01   <0.01   0.02    0.01
P         0.28    0.25    0.24    0.39    0.40    0.43    0.43    0.58    0.57
Pb        0.52    <0.01   <0.01   0.01    0.01    0.01    0.01    <0.01   <0.01
S         0.58    0.55    0.54    0.70    0.72    0.78    0.78    0.81    0.81
Sb        0.16    0.01    <0.01   0.21    0.22    0.19    0.19    0.25    0.20
Sr        0.14    0.07    0.06    0.18    0.19    0.22    0.22    0.10    0.09
Ti        0.01    0.01    <0.01   0.06    0.06    0.09    0.08    0.08    0.07
Zn        0.88    0.80    0.79    0.97    0.97    0.97    0.97    0.97    0.96
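Table 4 compares groups on symmetrically trimmed means at levels c = 5%, 10%, and 20%. As a minimal sketch of the quantity being compared, the function below computes a sample symmetric trimmed mean by discarding the same number of smallest and largest observations. The convention of dropping floor(c·n) observations from each tail, the function name, and the example values are assumptions for illustration; they are not taken from the Oslo Transect data or from the authors' code.

```python
import math


def symmetric_trimmed_mean(sample, c):
    """Mean of the order statistics left after dropping floor(c * n) values per tail."""
    if not 0.0 <= c < 0.5:
        raise ValueError("trimming level c must satisfy 0 <= c < 0.5")
    xs = sorted(sample)
    n = len(xs)
    # Small offset guards against products such as 0.29 * 100 evaluating
    # just below an integer in floating point.
    g = math.floor(c * n + 1e-9)
    kept = xs[g:n - g]
    return sum(kept) / len(kept)


# Hypothetical concentrations with one extreme value (illustrative only).
values = [2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 3.6, 40.0]
trimmed = {c: symmetric_trimmed_mean(values, c) for c in (0.05, 0.10, 0.20)}
```

With ten observations, c = 5% removes nothing under this convention, whereas c = 10% and c = 20% drop one and two observations per tail respectively, so the single extreme value stops influencing the estimate from c = 10% onward.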

Velina, M.; Valeinis, J.; Greco, L.; Luta, G. Empirical Likelihood-Based ANOVA for Trimmed Means. Int. J. Environ. Res. Public Health 2016, 13, 953. https://0-doi-org.brum.beds.ac.uk/10.3390/ijerph13100953
