
Nonparametric Limits of Agreement for Small to Moderate Sample Sizes: A Simulation Study

1
Department of Toxicology, Charles River Laboratories Copenhagen A/S, 4623 Lille Skensved, Denmark
2
Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense M, Denmark
3
Department of Nuclear Medicine, Odense University Hospital, 5000 Odense C, Denmark
4
Department of Clinical Research, University of Southern Denmark, 5000 Odense C, Denmark
*
Author to whom correspondence should be addressed.
Submission received: 7 August 2020 / Revised: 24 August 2020 / Accepted: 25 August 2020 / Published: 28 August 2020
(This article belongs to the Special Issue Robust Statistics in Action)

Abstract

The assessment of agreement in method comparison and observer variability analysis of quantitative measurements is usually done by the Bland–Altman Limits of Agreement, where the paired differences are implicitly assumed to follow a normal distribution. Whenever this assumption does not hold, the 2.5% and 97.5% percentiles are obtained by quantile estimation. In the literature, empirical quantiles have been used for this purpose. In this simulation study, we applied sample, subsampling, and kernel quantile estimators, as well as other methods for quantile estimation, to sample sizes between 30 and 150 and different distributions of the paired differences. The performance of 15 estimators in generating prediction intervals was measured by their respective coverage probability for one newly generated observation. Our results indicated that sample quantile estimators based on one or two order statistics outperformed all of the other estimators and can be used for deriving nonparametric Limits of Agreement. For sample sizes exceeding 80 observations, more advanced quantile estimators, such as the Harrell–Davis estimator and estimators of Sfakianakis–Verginis type, which use all of the observed differences, performed comparably well, and they may be considered intuitively more appealing than simple sample quantile estimators that are based on only two observations per quantile.

1. Introduction

The classical Bland–Altman Limits of Agreement (BA LoA) define a range within which approximately 95% of normally distributed differences between paired measurements are expected to lie [1,2,3]. In cases of non-normally distributed differences, the use of empirical quantiles has been proposed as a robust alternative [2,4,5]; moreover, extensive past research on nonparametric quantile estimation suggests applying more refined estimators of the 2.5% and 97.5% percentiles as nonparametric LoA. We performed a simulation study on 15 nonparametric quantile estimators to derive nonparametric prediction intervals and assessed their performance by means of the coverage probability for one newly generated observation. The aim of this study was to suggest a nonparametric and robust alternative to the classical BA LoA when the normality assumption does not hold and/or the sample sizes are small to moderate. Our findings are illustrated by an application to data from a previously published clinical study on coronary artery calcification measured by the Agatston score [6].
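For reference, the classical BA LoA used as the benchmark in this study can be sketched in a few lines. This is a minimal Python illustration (the study's own code, provided as supplementary material, is written in R); the function name `ba_loa` is ours, not from the paper:

```python
import statistics

def ba_loa(diffs):
    """Classical Bland-Altman Limits of Agreement: mean +/- 1.96 * SD of the
    paired differences (valid under approximate normality)."""
    m = statistics.mean(diffs)
    s = statistics.stdev(diffs)  # sample standard deviation
    return m - 1.96 * s, m + 1.96 * s
```

Under normality, roughly 95% of future paired differences are expected to fall between these two limits.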

2. Methods

Let the differences of paired observations be independent observations of a random variable X with a cumulative distribution function (CDF) $F: \mathbb{R} \to [0,1]$. If F is continuous from the right, then the quantile function of X is continuous from the left: $Q(u) := \inf\{x : F(x) \ge u\}$, $u \in (0,1)$; hence, at least $100u$ percent of the values of X are less than or equal to $Q(u)$ [7,8]. In the following, we seek to estimate the 2.5% and 97.5% percentiles of F, corresponding to $u = 0.025, 0.975$, by different methods of nonparametric quantile estimation.
Databases, such as JSTOR (Journal Storage), ScienceDirect, the online journal platform “Taylor & Francis Online”, and those maintained by PubMed/Medline in the NCBI (National Center for Biotechnology Information) were searched for quantile estimator, nonparametric quantile estimator, nonparametric kernel quantile estimator, subsampling quantile estimator, and new quantile estimator. Fifteen nonparametric quantile estimators were chosen, three of which are sample quantile estimators, four are subsampling quantile estimators, two are kernel quantile estimators, and six are other quantile estimators. Hence, we chose very different types of nonparametric quantile estimators, which use one, two, or all the observations in a sample.

2.1. Sample Quantile Estimators

The simplest way to estimate quantiles nonparametrically is by using sample quantile estimators. A random sample $X_1, \dots, X_n$ of size n is sorted in increasing order $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$, where $X_{(i)}$ denotes the i-th order statistic of the random sample. The CDF F can then be estimated by the step function
$$\tilde{F}(x) = \frac{1}{n}\sum_{i=1}^{n} I_{(-\infty,x]}(X_i), \qquad (1)$$
where $I_A(x)$ is the indicator function that takes the value 1 if $x \in A$ and 0 if $x \notin A$; the sample quantile function is defined as $\tilde{Q}(u) := \inf\{x : \tilde{F}(x) \ge u\}$, $u \in (0,1)$, by Cheng [7]. The quantile estimator
$$SQ_p^1 = \begin{cases} X_{(np)} & \text{if } np = \lfloor np \rfloor,\\ X_{(\lfloor np \rfloor + 1)} & \text{if } np > \lfloor np \rfloor \end{cases}$$
is based on a single order statistic, where $\lfloor x \rfloor$ is the greatest integer that is less than or equal to x [9,10]. $SQ_p^1$ is the smallest observation for which at least p percent of the observed values in the sample are smaller than or equal to $SQ_p^1$.
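The definition above can be made concrete with a short Python sketch (the paper's supplementary code is in R; the function name `sq1` is ours):

```python
import math

def sq1(x, p):
    """SQ1_p: sample quantile based on a single order statistic."""
    xs = sorted(x)
    n = len(xs)
    np_ = n * p
    k = math.floor(np_)
    if np_ == k:          # n*p is an integer: return X_(np)
        return xs[k - 1]
    return xs[k]          # otherwise return X_(floor(np)+1)
```

Note that for $p = 0.025$ and $n \le 40$, $np \le 1$, so the estimator returns the sample minimum; this is the behavior referred to in the Results section.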
The second sample quantile estimator $SQ_p^2$ is a weighted average of the two order statistics that are closest to including p percent of all the observations in the sample:
$$SQ_p^2 = (1-\alpha)\,X_{(r)} + \alpha\,X_{(r+1)}$$
with $\alpha = p(n+1) - r$ and $r = \lfloor p(n+1) \rfloor$ [9,11].
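A Python sketch of this interpolating estimator (again, the paper's own implementation is in R; `sq2` is our name):

```python
import math

def sq2(x, p):
    """SQ2_p: weighted average of the two order statistics around p(n+1)."""
    xs = sorted(x)
    n = len(xs)
    r = math.floor(p * (n + 1))
    alpha = p * (n + 1) - r
    # defined only when 1 <= r and r + 1 <= n; for p = 0.025, 0.975 this
    # requires n >= 40, consistent with the Results section
    return (1 - alpha) * xs[r - 1] + alpha * xs[r]
```

For $p = 0.5$ this coincides with the median-type interpolation used, e.g., by R's `quantile(, type = 6)`.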
Finally, we considered a weighted average of $X_{(\lfloor np+0.5 \rfloor)}$ and $X_{(\lfloor np+0.5 \rfloor + 1)}$:
$$SQI_p = (i + 0.5 - np)\,X_{(i)} + (np + 0.5 - i)\,X_{(i+1)}$$
with $i = \lfloor np + 0.5 \rfloor$ and $0.5 \le np \le (n - 0.5)$ [9,12].

2.2. Subsampling Quantile Estimators

The abovementioned sample quantile estimators are based on only one or two order statistics, whereas subsampling, kernel, and other quantile estimators employ linear combinations of all the available order statistics, weighting them according to their relative closeness to the target percentile. Based on the sample quantile function $\tilde{Q}$, linear smooth nonparametric estimators of $Q(u)$ can be written as
$$Q(u) = \int_0^1 \tilde{Q}(t)\, dG(u;t), \qquad (5)$$
where $G(u;\cdot)$ is a CDF with support on the unit interval. Many distributions $G(u;\cdot)$ have been proposed, the choice of which depends on the sample size and, typically, a smoothing parameter. Depending on the choice of $G(u;\cdot)$, there are two major classes of quantile estimators according to Cheng [7]: subsampling quantile estimators and kernel quantile estimators. Both can be given as L-statistics, that is, $\sum_{j=1}^{n} W_j \, X_{(j)}$, where $W_j$ is the weight for the j-th order statistic $X_{(j)}$ [13]. For subsampling quantile estimators, a discrete distribution is chosen for $G(u;\cdot)$, interpreted as a resampling distribution from the set of the observed order statistics $X_{(j)}$, $j = 1, \dots, n$ [7].
The Harrell–Davis estimator is given by
$$HD_p = \sum_{i=1}^{n} W_i \, X_{(i)} \qquad (6)$$
with weight function
$$W_i = \frac{1}{\beta\big((n+1)p,\,(n+1)(1-p)\big)} \int_{(i-1)/n}^{i/n} y^{(n+1)p-1} (1-y)^{(n+1)(1-p)-1}\, dy = I_{i/n}\big(p(n+1),\,(1-p)(n+1)\big) - I_{(i-1)/n}\big(p(n+1),\,(1-p)(n+1)\big),$$
where $I_{i/n}(a,b)$ is the incomplete beta function [10,14,15].
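As an illustration, the Harrell–Davis weights can be computed without any incomplete-beta library routine by integrating the beta density numerically over each interval $((i-1)/n, i/n]$ and normalizing. This is a dependency-free Python sketch (in practice one would use an incomplete-beta function, e.g. from R or SciPy); the quadrature step count `m` and the function name are our choices:

```python
def harrell_davis(x, p, m=400):
    """Harrell-Davis estimator HD_p (Equation (6)): beta(a, b) weights over
    the order statistics, a = (n+1)p, b = (n+1)(1-p); each weight is obtained
    by midpoint quadrature of the beta density and then normalized."""
    xs = sorted(float(v) for v in x)
    n = len(xs)
    a, b = (n + 1) * p, (n + 1) * (1 - p)

    def segment(lo, hi):
        h = (hi - lo) / m
        return h * sum((lo + (j + 0.5) * h) ** (a - 1)
                       * (1 - lo - (j + 0.5) * h) ** (b - 1) for j in range(m))

    w = [segment(i / n, (i + 1) / n) for i in range(n)]
    total = sum(w)  # equals beta(a, b) up to quadrature error
    return sum(wi / total * xi for wi, xi in zip(w, xs))
```

Normalizing by the sum of the raw weights makes the weights sum to exactly one, so the beta function itself never needs to be evaluated.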
The quantile estimator of Kaigh and Lachenbruch
$$KL_p = \sum_{j=r}^{n-k+r} \frac{\binom{j-1}{r-1}\binom{n-j}{k-r}}{\binom{n}{k}}\, X_{(j)}, \qquad (7)$$
where $r = \lfloor (k+1)p \rfloor$, is obtained by averaging a subsample quantile estimate over all $\binom{n}{k}$ subsamples of size k, $1 \le k \le n$, which are sampled without replacement. The subsample size k is an arbitrary smoothing (or reduction) parameter, and Kaigh and Lachenbruch proposed choosing k so as to minimize the mean squared error (MSE), that is, $MSE = E(KL_p - \varepsilon_p)^2$, where $\varepsilon_p$ is the true value of the p-th quantile [7,16,17].
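The closed-form hypergeometric weights make this average over all subsamples cheap to compute. A Python sketch (the function name is ours; k would be tuned by MSE minimization, as the paper describes):

```python
from math import comb, floor

def kl(x, p, k):
    """Kaigh-Lachenbruch estimator KL_p (Equation (7)): the average of the
    r-th order statistic over all C(n, k) subsamples of size k, computed
    via closed-form hypergeometric weights rather than enumeration."""
    xs = sorted(x)
    n = len(xs)
    r = floor((k + 1) * p)
    if not 1 <= r <= k:
        raise ValueError("p too extreme for this subsample size k")
    total = comb(n, k)
    return sum(comb(j - 1, r - 1) * comb(n - j, k - r) / total * xs[j - 1]
               for j in range(r, n - k + r + 1))
```

For k = n the single subsample is the full sample, and the estimator reduces to the sample order statistic $X_{(r)}$.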
Kaigh and Cheng [18] proposed the quantile estimator
$$KC_p = \sum_{j=1}^{n} \frac{\binom{r+j-2}{r-1}\binom{n-j+k-r}{k-r}}{\binom{n+k-1}{k}}\, X_{(j)} \qquad (8)$$
with $r = \lceil kp \rceil$, where $\lceil x \rceil$ denotes the smallest integer greater than or equal to x [7]. The value of k is again determined by minimizing the MSE of the estimator.
Finally, the Bernstein polynomial quantile estimator is given by
$$BP_p = \sum_{j=1}^{n} \binom{n-1}{j-1}\, p^{j-1} (1-p)^{n-j}\, X_{(j)}$$
according to Cheng [7,19].
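The Bernstein weights are simply the binomial(n−1, p) probabilities, which sum to one, so the estimator is a direct convex combination of the order statistics. A Python sketch (`bp` is our name for it):

```python
from math import comb

def bp(x, p):
    """Bernstein polynomial quantile estimator BP_p: binomial(n-1, p)
    weights over the order statistics (the weights sum to 1)."""
    xs = sorted(x)
    n = len(xs)
    return sum(comb(n - 1, j - 1) * p ** (j - 1) * (1 - p) ** (n - j) * xs[j - 1]
               for j in range(1, n + 1))
```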

2.3. Kernel Quantile Estimators

Like subsampling quantile estimators, kernel quantile estimators can be written in the form of Equation (5). Here, $G(u;\cdot)$ is a location-scale family CDF with density function K, location parameter u, and scale parameter $h(n)$:
$$dG(u;t) = \frac{1}{h(n)}\, K\!\left(\frac{t-u}{h(n)}\right) dt.$$
Subsequently, Equation (5) becomes the kernel quantile estimator introduced by Parzen [8]:
$$\hat{Q}(u) = \int_0^1 \tilde{Q}(t)\, \frac{1}{h(n)}\, K\!\left(\frac{t-u}{h(n)}\right) dt$$
with its L-statistic representation being
$$\hat{Q}(u) = \sum_{j=1}^{n} \left[\int_{(j-1)/n}^{j/n} \frac{1}{h(n)}\, K\!\left(\frac{t-u}{h(n)}\right) dt\right] X_{(j)}.$$
The density K, called the kernel, is symmetric around zero and has a chosen bandwidth $h(n)$, which satisfies $h(n) \to 0$ as $n \to \infty$ [7,20]. Yang [21] proposed a discretized version, which we used as the first of the two kernel quantile estimators in our study due to its closed form:
$$KQ_p^1 = \frac{1}{n} \sum_{j=1}^{n} \frac{1}{h(n)}\, K\!\left(\frac{j/n - p}{h(n)}\right) X_{(j)}.$$
As the values $K\!\left(\frac{1/n - p}{h(n)}\right), \dots, K\!\left(\frac{n/n - p}{h(n)}\right)$ do not generally form a (discrete) probability distribution on $[0,1]$, monotonicity, translation and scale equivariance, and the symmetry relation do not hold. Translation equivariance, for instance, would imply that $KQ_p^1$ applied to $(X_1 + c, \dots, X_n + c)$ is equal to c plus $KQ_p^1$ applied to $(X_1, \dots, X_n)$; the violated symmetry relation means that $KQ_p^1$ applied to $(-X_1, \dots, -X_n)$ is not generally equal to $-KQ_{1-p}^1$ applied to $(X_1, \dots, X_n)$. The Nadaraya–Watson type estimator
$$KQ_p^2 = \sum_{j=1}^{n} \frac{K\!\left(\frac{(j-0.5)/n - p}{h(n)}\right)}{\sum_{i=1}^{n} K\!\left(\frac{(i-0.5)/n - p}{h(n)}\right)}\, X_{(j)}$$
overcomes these drawbacks and was used as the second of the two kernel quantile estimators in this study [7,22,23]. Its name originates from the Nadaraya–Watson estimator in kernel regression [24,25].
In the following, we chose the standard Gaussian kernel for K and selected the bandwidth $h(n)$ by minimizing the MSE, in the same way as the subsample size k was found for both $KL_p$ in Equation (7) and $KC_p$ in Equation (8).
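The normalized Nadaraya–Watson form with a Gaussian kernel can be sketched as follows in Python. The default bandwidth $1/\sqrt{n}$ is purely illustrative, an assumption of this sketch; the study tunes $h(n)$ by minimizing the MSE:

```python
import math

def kq2(x, p, h=None):
    """Nadaraya-Watson-type kernel quantile estimator KQ2_p with a standard
    Gaussian kernel. h defaults to 1/sqrt(n) for illustration only."""
    xs = sorted(x)
    n = len(xs)
    h = h if h is not None else n ** -0.5
    k = [math.exp(-0.5 * (((j - 0.5) / n - p) / h) ** 2) for j in range(1, n + 1)]
    s = sum(k)  # normalization makes the weights a probability distribution
    return sum(kj / s * xj for kj, xj in zip(k, xs))
```

Because the weights depend only on the ranks, this estimator is translation and scale equivariant, unlike $KQ_p^1$.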

2.4. Other Quantile Estimators

The kernel quantile estimators that are presented in the previous section are all based on the usual empirical distribution (1) with equal weights $1/n$ assigned to each observation. To improve the performance of quantile estimators, Huang and Brill [26] proposed using a weighted empirical distribution, for instance, the level crossing empirical distribution function
$$F_{lc}(x) = \sum_{i=1}^{n} w_{i,n}\, I_{(-\infty,x]}(X_{(i)}) \qquad (12)$$
with weight function
$$w_{i,n} = \begin{cases} \dfrac{1}{2}\left(1 - \dfrac{n-2}{\sqrt{n(n-1)}}\right) & \text{if } i = 1, n,\\[6pt] \dfrac{1}{\sqrt{n(n-1)}} & \text{if } i = 2, 3, \dots, n-1. \end{cases} \qquad (13)$$
One such kernel quantile estimator using level crossing empirical distributions is
$$KQ_p^{lc} = \sum_{i=1}^{n} \frac{1}{n}\, \frac{1}{h(n)}\, K\!\left(\frac{\sum_{j=1}^{i} w_{j,n} - p}{h(n)}\right) X_{(i)}.$$
Huang [27] modified the Harrell–Davis estimator (6) by applying a weighted empirical distribution function instead of the empirical distribution with equal weights $1/n$. The Harrell–Davis estimator using a level crossing empirical distribution function can be written as
$$HD_p^{lc} = \frac{1}{\beta\big((n+1)p,(n+1)q\big)} \int_0^1 F_{lc}^{-1}(y)\, y^{(n+1)p-1}(1-y)^{(n+1)q-1}\, dy = \sum_{i=1}^{n} \left[\int_{q_{i-1,n}}^{q_{i,n}} \frac{y^{(n+1)p-1}(1-y)^{(n+1)q-1}}{\beta\big((n+1)p,(n+1)q\big)}\, dy\right] X_{(i)},$$
where $\beta(\cdot,\cdot)$ is the beta function, $q = 1-p$, $F_{lc}(\cdot)$ is given by (12), $q_{i,n} = \sum_{j=1}^{i} w_{j,n}$, $i = 1, \dots, n$, with $w_{j,n}$ as defined in (13), and $q_{0,n} \equiv 0$.
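The level crossing weights of Equation (13) sum to one and put slightly less mass than $1/n$ on the two extreme order statistics, which is what distinguishes $F_{lc}$ from the usual empirical distribution. A small Python sketch of the weight vector (the function name is ours):

```python
import math

def lc_weights(n):
    """Level crossing weights w_{i,n} from Equation (13); they sum to 1,
    with reduced weight on the two extreme order statistics."""
    edge = 0.5 * (1 - (n - 2) / math.sqrt(n * (n - 1)))
    inner = 1 / math.sqrt(n * (n - 1))
    return [edge] + [inner] * (n - 2) + [edge]
```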
Sfakianakis and Verginis [28] proposed a group of estimators, motivated by the fact that nonparametric quantile estimation of extreme quantiles close to 0 and 1 requires large samples for sufficient accuracy. These three quantile estimators are supposed to better estimate quantiles in the tails of a distribution when using small samples, and they employ the binomial probability of observing exactly i out of n events with an event probability of p, $B(i;n,p)$:
$$\begin{aligned} SV_p^1 ={} & \frac{2B(0;n,p)+B(1;n,p)}{2}\,X_{(1)} + \frac{B(0;n,p)}{2}\,X_{(2)} - \frac{B(0;n,p)}{2}\,X_{(3)} + \sum_{i=2}^{n-1} \frac{B(i;n,p)+B(i-1;n,p)}{2}\,X_{(i)} \\ & - \frac{B(n;n,p)}{2}\,X_{(n-2)} + \frac{B(n;n,p)}{2}\,X_{(n-1)} + \frac{2B(n;n,p)+B(n-1;n,p)}{2}\,X_{(n)}, \end{aligned}$$
$$SV_p^2 = \sum_{i=0}^{n-1} B(i;n,p)\,X_{(i+1)} + \big(2X_{(n)} - X_{(n-1)}\big)\,B(n;n,p),$$
$$SV_p^3 = \sum_{i=1}^{n} B(i;n,p)\,X_{(i)} + \big(2X_{(1)} - X_{(2)}\big)\,B(0;n,p).$$
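The two simpler members of the family, $SV_p^2$ and $SV_p^3$, can be sketched directly from their definitions; the linear terms $2X_{(n)} - X_{(n-1)}$ and $2X_{(1)} - X_{(2)}$ extrapolate slightly beyond the sample range, which is what helps in the tails. A Python sketch (function names are ours):

```python
from math import comb

def binom_pmf(i, n, p):
    """Binomial probability B(i; n, p)."""
    return comb(n, i) * p ** i * (1 - p) ** (n - i)

def sv2(x, p):
    """Sfakianakis-Verginis estimator SV2_p: binomial weights plus a linear
    extrapolation term beyond the sample maximum."""
    xs = sorted(x)
    n = len(xs)
    return (sum(binom_pmf(i, n, p) * xs[i] for i in range(n))
            + binom_pmf(n, n, p) * (2 * xs[-1] - xs[-2]))

def sv3(x, p):
    """SV3_p: the mirror image of SV2_p, extrapolating below the minimum."""
    xs = sorted(x)
    n = len(xs)
    return (sum(binom_pmf(i, n, p) * xs[i - 1] for i in range(1, n + 1))
            + binom_pmf(0, n, p) * (2 * xs[0] - xs[1]))
```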
Finally, Navruz and Özdemir [29] introduced a new quantile estimator, which is a weighted average of all order statistics:
$$\begin{aligned} NO_p ={} & \big(B(0;n,p)\,2p + B(1;n,p)\,p\big)X_{(1)} + B(0;n,p)(2-3p)\,X_{(2)} - B(0;n,p)(1-p)\,X_{(3)} \\ & + \sum_{i=1}^{n-2} \big(B(i;n,p)(1-p) + B(i+1;n,p)\,p\big)X_{(i+1)} \\ & - B(n;n,p)\,p\,X_{(n-2)} + B(n;n,p)(3p-1)\,X_{(n-1)} + \big(B(n-1;n,p)(1-p) + B(n;n,p)(2-2p)\big)X_{(n)}. \end{aligned}$$

2.5. Simulation Setup

We contrasted nonparametric LoA as constructed with the 15 abovementioned quantile estimators by comparing their coverage probabilities for the next paired difference under the given distributional assumption. Here, we employed the standard normal distribution (ND), a standard normal distribution with 1%, 2%, and 5% outliers (ND 1%, ND 2%, and ND 5%, respectively), an exponential distribution (ED) with a rate of 1, and a lognormal distribution (LND) with meanlog = 0 and sdlog = 1. For the normal distributions containing outliers, simulated data were replaced with a probability of 1%, 2%, and 5%, respectively, by data sampled from a normal distribution with a mean of 0 and a standard deviation of 3. To examine small to moderate sample sizes, the sample size was set to 30, 50, 80, 100, and 150. For each combination of distribution, sample size, and nonparametric quantile estimator, 20,000 simulated trials of size $(n+1)$ were generated with R (the code is available as Supplemental Material S1). A seed was set in order to use the same simulated data for each combination of distribution and sample size across nonparametric quantile estimators. The first n observations in each simulated trial were used to derive nonparametric LoA, against which the last observation was compared. The coverage probability was then the proportion of cases out of the 20,000 trials in which the nonparametric LoA included the last observation. All of the figures were generated with Stata/MP 16.1 (StataCorp, College Station, TX, USA).
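The simulation loop just described can be sketched compactly. The study's own implementation is in R with 20,000 iterations; this is a reduced Python illustration, and the min/max rule `minmax` used below is a hypothetical stand-in mimicking $SQ_p^1$ for $n \le 40$:

```python
import random

def coverage(estimator, sampler, n, trials=2000, seed=1):
    """Estimate the coverage probability of nonparametric LoA: the fraction of
    trials in which [Q_hat(0.025), Q_hat(0.975)], computed from the first n
    simulated differences, contains one further, newly generated difference."""
    rng = random.Random(seed)  # fixed seed, as in the study's setup
    hits = 0
    for _ in range(trials):
        data = [sampler(rng) for _ in range(n + 1)]
        lower = estimator(data[:n], 0.025)
        upper = estimator(data[:n], 0.975)
        hits += lower <= data[n] <= upper
    return hits / trials

# minimum/maximum as the 2.5%/97.5% estimates (mimics SQ1_p for n <= 40)
def minmax(d, p):
    return min(d) if p < 0.5 else max(d)
```

For the min/max rule and a sample of size n, the expected coverage of the next observation is $(n-1)/(n+1)$, e.g. about 0.935 for $n = 30$, matching the below-nominal coverage reported in Table 1.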

3. Results

For $n = 30$ (Table 1), none of the estimators reached the nominal coverage probability of 0.95. The coverage probability of $SQ_p^1$ was closest to 0.95, ranging from 0.934 to 0.938. Note that, for sample sizes of up to $n = 40$ observations, the smallest and largest difference are used as nonparametric quantile estimates for the 2.5% and 97.5% percentiles, respectively. For $SQI_p$, $HD_p$, $HD_p^{lc}$, $SV_p^1$, $SV_p^2$, and $SV_p^3$, the coverage probabilities were at least 0.921, 0.911, 0.910, 0.920, 0.914, and 0.923, respectively. Neither $SQ_p^2$ nor $KL_p$ is defined for $n < 40$.
$SQ_p^2$ was the only estimator with coverage probabilities oscillating closely around 0.95 for all of the investigated sample sizes $n \ge 50$ (Table 2, Table 3, Table 4 and Table 5); for $n = 50$, it was the only estimator to achieve this at all.
For $n = 80$ (Table 3), the coverage probabilities of $HD_p$, $SV_p^1$, $SV_p^2$, and $SV_p^3$ fluctuated closely around the nominal level, except for the simulations with an ED (0.94). For $n \ge 100$ (Table 4 and Table 5), these estimators performed close to the 0.95 nominal level for all of the investigated distributions.
The coverage probabilities of the recently proposed $NO_p$ estimator varied between 0.945 and 0.950 for $n = 200$ and were very close to 0.95 for $n = 250$ (results not shown here).

4. Example

Diederichsen et al. [6] compared coronary artery calcification, measured by the Agatston score, with the Framingham Heart Score in Danes of 50 and 60 years of age. Of 1825 randomly sampled citizens, 1257 consented to participate in the study, and 1156 of them were eligible. Agatston scores were independently reanalyzed for 129 randomly chosen study participants; the agreement measures were the proportions of agreement and the kappa statistics for dichotomized calcification status (absence vs. presence) to assess intra- and inter-rater agreement. In the following, the intra-rater differences are used for exemplification purposes.
Approximately half of the 129 participants had an Agatston score of 0. The paired intra-rater differences ranged from −683 to 130, with a first, second, and third quartile being equal to 0; the 5th, 10th, 90th, and 95th percentiles were −23, −12, 1.1, and 5, respectively. The empirical distribution of the paired differences was, therefore, characterized by its denseness around 0 and a single, comparatively extreme outlier, clearly indicating the inappropriateness of the normality assumption in this setting (see also a histogram including an approximating normal distribution as Supplemental Material S2).
Using $SQ_p^2$, $HD_p$, and $SV_p^1$, the nonparametric, asymmetric, and robust LoA are (−61.5, 12.8), (−96.2, 26.7), and (−122.1, 30.6), respectively, whereas the symmetric BA LoA of (−129.8, 116.1) are equidistant from the estimated mean difference of −6.9 (Figure 1). The upper LoA for $HD_p$ and $SV_p^1$ are similar, but the respective lower LoA are differently affected by the single extreme outlier of −683. $SQ_p^2$ appears to be most robust to few outliers due to its definition. The R source code for the derivation of these nonparametric LoA as well as the example data can be found as Supplemental Material S3 and S4, respectively.
The sensitivity of the classical BA LoA to outliers becomes evident when excluding the single outlier here: the symmetric BA LoA then become −37.4 and 34.3, and the estimated mean difference reduces to −1.6 (results not shown here). In practice, however, outliers are kept in the analysis dataset if there is no reasonable explanation for their exclusion. This underlines the importance of robust alternatives to the BA LoA.

5. Discussion

5.1. Statement of Principal Findings

The simple sample quantile estimators based on one and two order statistics performed closest to the nominal level in terms of the coverage probability for the next observation across the six distributional scenarios, for $n = 30$ and for $n = 50, 80, 100, 150$, respectively. The Harrell–Davis subsampling estimator and the estimators of Sfakianakis–Verginis type followed closely for sample sizes of at least $n = 80$ and may be considered intuitively more appealing, as they use the entire sample, whereas the simpler and outlier-robust sample quantile estimators are based on only a few observations from the sample.

5.2. Strengths and Limitations of the Study

The choice of distributions for the simulation study was motivated by our own experience with agreement assessments in clinical studies, especially roughly normal distributions with a few percent outliers. We investigated a wide range of quantile estimators, comprising sample, subsampling, and kernel quantile estimators as well as other methods for quantile estimation with sample sizes between 30 and 150. As a measure of performance, we considered the coverage probability of nonparametric LoA for the next observation, interpreting the nonparametric LoA as a prediction interval, as the lower and upper LoA need to be simultaneously assessed. Therefore, we did not pursue evaluations using, for instance, mean squared errors.

5.3. Strengths and Limitations in Relation to Other Studies

A peculiarity of LoA is the sole focus on two extreme quantiles, the 2.5% and 97.5% percentiles. Dielman, Lowry and Pfaffenberger [9] investigated 0.02 and 0.98, but only for small samples ($n = 10, 15, 25, 30$). Others examined 0.05 and 0.95 [11,14,16,21,22,29], whereas Kaigh and Cheng [17,18] assessed 0.1 and 0.9. Only Huang and Brill [26,27] also targeted 0.025 and 0.975, but only for samples of maximum size $n = 30$. Sfakianakis and Verginis [28] analyzed 0.01 and 0.99 as well as 0.05 and 0.95 in various sample sizes.
Compared to the usual number of 2000 iterations, the chosen number of 20,000 iteration runs reduced, for a given nonparametric estimator and sample size, the range of the coverage probabilities across distributions by approximately 0.005 and is deemed appropriately accurate. However, the increased number of iterations did imply considerably longer running times for creating the data of one table (12 as opposed to 2 h). The abovementioned studies employed between 1000 and 10,000 iterations [16,22].
Harrell and Davis [14] did not recommend $HD_p$ for small n and extreme p, and Dielman, Lowry and Pfaffenberger [9] concluded that there was not one best estimator across scenarios, based on maximum sample sizes of 30 and 60; however, they suggested that $HD_p$ performs well in a wide range of cases, except when $p = 0.02, 0.98$. Our findings for $HD_p$ are in line with these former conclusions, but extend to larger sample sizes of $n = 80, 100, 150$, for which $HD_p$ appears to be a preferable choice for estimating extreme quantiles.

5.4. Meaning of the Findings: Possible Mechanisms and Implications

Our findings suggest using $SQ_p^1$ in small samples of approximately $n = 30$, but $SQI_p$, $SV_p^1$, or $SV_p^3$ may be preferable alternatives, as $SQ_p^1$ simply reduces to the smallest and largest observations as estimates for the 2.5% and 97.5% quantiles, respectively. The latter is, in turn, unfortunate in the case of outliers due to their unabated impact on the estimates. $SQ_p^2$ performed closest to the nominal level for all samples with $n \ge 50$ and appeared to be less prone to the single outlier in our clinical example than $HD_p$ and $SV_p^1$. However, the latter two estimators do involve all of the observations. $SQ_p^2$ can, therefore, be considered the first choice for samples of approximately $n = 50$, but, for larger n, both $HD_p$ and Sfakianakis–Verginis type estimators are equally applicable and actually preferable if the researcher seeks to include the entire dataset in quantile estimation and not only pairs of order statistics.
The normality assumption for the paired differences may often be considered reasonable in the planning stage; however, alternative quantile estimators should equally be specified at that stage, as empirical distributions may deviate notably from ideal assumptions. Moreover, our investigation suggests several beneficial nonparametric alternatives to BA LoA instead of the simple percentile estimators that currently seem to prevail.

5.5. Unanswered Questions and Future Research

In the case of normally distributed paired differences, Bland and Altman [1,2] have already proposed approximate confidence intervals for the BA LoA. Recently, Vock [30] emphasized that only a tolerance interval or the outer confidence limits for BA LoA can provide a range that will contain a specified percentage of future differences with a known certainty. Carkeet and Goh [31,32] proposed exact confidence intervals for BA LoA, while using two-sided tolerance factors for a normal distribution.
In the case of any given distribution of the paired differences, several approaches for the construction of nonparametric confidence intervals for quantiles have been proposed over the past half-century [33,34,35,36,37,38,39]. For both $HD_p$ and $KL_p$, confidence intervals for quantiles have been proposed [10,14,16]. In the context of nonparametric LoA, future research will naturally lie in the proposal and evaluation of confidence intervals for the 2.5% and 97.5% quantiles in small to moderate samples, especially with regard to $SQ_p^2$, $SQI_p$, $HD_p$, and Sfakianakis–Verginis type estimators.
Regression procedures for method comparison analysis have not been considered here [40,41,42,43]. Robust methods designed for data configurations with outliers, such as S- or MM-estimation, Least Trimmed Squares, or the Forward Search, are of interest in this context [44,45,46,47,48].

Supplementary Materials

The following are available online at https://0-www-mdpi-com.brum.beds.ac.uk/2571-905X/3/3/22/s1, Code S1: R source code for generating Table 1, Table 2, Table 3, Table 4 and Table 5. Figure S2: Histogram for the data of the clinical example in Section 4, including an approximating normal distribution. Code S3: R source code for generating Limits of Agreement for the clinical example in Section 4. Data S4: Dataset of the clinical example in Section 4.

Author Contributions

Conceptualization, O.G.; methodology, H.C.P. and O.G.; software, M.E.F.; validation, M.E.F. and O.G.; formal analysis, M.E.F., H.C.P. and O.G.; writing—original draft preparation, M.E.F. and O.G.; writing—review and editing, M.E.F., H.C.P. and O.G.; visualization, O.G.; supervision, H.C.P. and O.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

The authors would like to thank Axel Diederichsen (Odense University Hospital, Denmark) for the permission to reanalyze fully anonymized data from the DanRisk study and three anonymous reviewers for very helpful comments on earlier versions of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bland, J.M.; Altman, D.G. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986, 1, 307–310. [Google Scholar] [CrossRef]
  2. Bland, J.M.; Altman, D.G. Measuring agreement in method comparison studies. Stat. Methods Med. Res. 1999, 8, 135–160. [Google Scholar] [CrossRef] [PubMed]
  3. Rosner, B. Fundamentals of Biostatistics, 8th ed.; Cengage Learning: Boston, MA, USA, 2015. [Google Scholar]
  4. Schmitz, S.; Krummenauer, F.; Henn, S.; Dick, H.B. Comparison of three different technologies for pupil diameter measurement. Graefe’s Arch. Clin. Exp. Ophthalmol. 2003, 241, 472–477. [Google Scholar] [CrossRef] [PubMed]
  5. Twomey, P.J. How to use difference plots in quantitative method comparison. Ann. Clin. Biochem. 2006, 43, 124–129. [Google Scholar] [CrossRef]
  6. Diederichsen, A.C.; Sand, N.P.; Nørgaard, B.; Lambrechtsen, J.; Jensen, J.M.; Munkholm, H.; Aziz, A.; Gerke, O.; Egstrup, K.; Larsen, M.L.; et al. Discrepancy between coronary artery calcium score and HeartScore in middle-aged Danes: The DanRisk study. Eur. J. Prev. Cardiol. 2012, 19, 558–564. [Google Scholar] [CrossRef]
  7. Cheng, C. On Estimation of Quantiles and Quantile Density Functions. Ph.D. Thesis, Texas A & M University, College Station, TX, USA, 1993. [Google Scholar]
  8. Parzen, E. Nonparametric statistical data modeling. J. Am. Stat. Assoc. 1979, 74, 105–121. [Google Scholar] [CrossRef]
  9. Dielman, T.; Lowry, C.; Pfaffenberger, R. A comparison of quantile estimators. Commun. Stat. Simul. Comput. 1994, 23, 355–371. [Google Scholar] [CrossRef]
  10. Steinberg, S.M. Confidence Intervals for Functions of Quantiles Using Linear Combinations of Order Statistics. Ph.D. Thesis, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA, 1983. [Google Scholar]
  11. Parrish, R.S. Comparison of quantile estimators in normal sampling. Biometrics 1990, 46, 247–257. [Google Scholar] [CrossRef]
  12. Hyndman, R.J.; Fan, Y. Sample quantiles in statistical packages. Am. Stat. 1996, 50, 361–365. [Google Scholar] [CrossRef]
  13. Serfling, R.J. Approximation Theorems of Mathematical Statistics; John Wiley & Sons: Hoboken, NJ, USA, 1980. [Google Scholar]
  14. Harrell, F.E.; Davis, C.E. A new distribution-free quantile estimator. Biometrika 1982, 69, 635–640. [Google Scholar] [CrossRef]
  15. Steinberg, S.M.; Davis, C.E. Comparison of nonparametric point estimators for interquantile differences in moderate sized samples. Commun. Stat. Theory Methods 1987, 16, 1607–1616. [Google Scholar] [CrossRef]
  16. Kaigh, W.D.; Lachenbruch, P.A. A generalized quantile estimator. Commun. Stat. Theory Methods 1982, 11, 2217–2238. [Google Scholar] [CrossRef]
  17. Kaigh, W.D. Quantile interval estimation. Commun. Stat. Theory Methods 1983, 12, 2427–2443. [Google Scholar] [CrossRef]
  18. Kaigh, W.D.; Cheng, C. Subsampling quantile estimators and uniformity criteria. Commun. Stat. Theory Methods 1991, 20, 539–560. [Google Scholar] [CrossRef]
  19. Cheng, C. The Bernstein polynomial estimator of a smooth quantile function. Stat. Probab. Lett. 1995, 24, 321–330. [Google Scholar] [CrossRef]
  20. Delampady, M.; Ghosh, J.K.; Samanta, T. An Introduction to Bayesian Analysis Theory and Methods; Springer: New York, NY, USA, 2006. [Google Scholar]
  21. Yang, S.S. A smooth nonparametric estimator of a quantile function. J. Am. Stat. Assoc. 1985, 80, 1004–1111. [Google Scholar] [CrossRef]
  22. Sheather, S.J.; Marron, J.S. Kernel quantile estimators. J. Am. Stat. Assoc. 1990, 85, 410–416. [Google Scholar] [CrossRef]
  23. Zelterman, D. Smooth nonparametric estimation of the quantile function. J. Stat. Plan. Inference 1990, 26, 339–352. [Google Scholar] [CrossRef]
  24. Nadaraya, E.A. Smooth regression analysis. Sankhyā Indian J. Stat. 1964, 26, 359–372. [Google Scholar]
  25. Watson, G.S. On estimating regression. Theory Probab. Appl. 1964, 9, 141–142. [Google Scholar]
  26. Huang, M.L.; Brill, P. A level crossing quantile estimation method. Stat. Probab. Lett. 1999, 45, 111–119.
  27. Huang, M.L. On a distribution-free quantile estimator. Comput. Stat. Data Anal. 2001, 37, 477–486.
  28. Sfakianakis, M.E.; Verginis, D.G. A new family of nonparametric quantile estimators. Commun. Stat. Simul. Comput. 2008, 37, 337–345.
  29. Navruz, G.; Özdemir, A.F. A new quantile estimator with weights based on a subsampling approach. Br. J. Math. Stat. Psychol. 2020, 73.
  30. Vock, M. Intervals for the assessment of measurement agreement: Similarities, differences, and consequences of incorrect interpretations. Biom. J. 2016, 58, 489–501.
  31. Carkeet, A. Exact parametric confidence intervals for Bland-Altman limits of agreement. Optom. Vis. Sci. 2015, 92, e71–e80.
  32. Carkeet, A.; Goh, Y.T. Confidence and coverage for Bland-Altman limits of agreement and their approximate confidence intervals. Stat. Methods Med. Res. 2018, 27, 1559–1574.
  33. Chu, J.T. Some uses of quasi-ranges. Ann. Math. Stat. 1957, 28, 173–180.
  34. Campbell, M.J.; Gardner, M.J. Calculating confidence intervals for some non-parametric analyses. Br. Med. J. 1988, 296, 1454–1456.
  35. Beran, R.; Hall, P. Interpolated nonparametric prediction intervals and confidence intervals. J. R. Stat. Soc. Ser. B 1993, 55, 643–652.
  36. Hutson, A.D. Calculating nonparametric confidence intervals for quantiles using fractional order statistics. J. Appl. Stat. 1999, 26, 343–353.
  37. Hutson, A.D. ‘Exact’ bootstrap confidence bands for the quantile function via Steck’s determinant. J. Comput. Graph. Stat. 2002, 11, 471–482.
  38. Zielinski, R.; Zielinski, W. Best exact nonparametric confidence intervals for quantiles. Statistics 2005, 39, 67–71.
  39. Balakrishnan, N.; Li, T. Confidence intervals for quantiles and tolerance intervals based on ordered ranked set samples. Ann. Inst. Stat. Math. 2006, 58, 757–777.
  40. Cornbleet, P.J.; Gochman, N. Incorrect least-squares regression coefficients in method-comparison analysis. Clin. Chem. 1979, 25, 432–438.
  41. Passing, H.; Bablok, W. A new biometrical method for testing the equality of measurements from two different analytical methods. Clin. Chem. Lab. Med. 1983, 21, 709–720.
  42. Passing, H.; Bablok, W. Comparison of several regression procedures for method comparison studies and determination of sample size. Clin. Chem. Lab. Med. 1984, 22, 431–445.
  43. Payne, R.B. Method comparison: Evaluation of least squares, Deming and Passing/Bablok regression procedures using computer simulation. Ann. Clin. Biochem. 1997, 34, 319–320.
  44. Rousseeuw, P.J. Least median of squares regression. J. Am. Stat. Assoc. 1984, 79, 871–880.
  45. Yohai, V.J.; Zamar, R. High breakdown-point estimates of regression by means of the minimization of an efficient scale. J. Am. Stat. Assoc. 1988, 83, 406–413.
  46. Riani, M.; Cerioli, A.; Atkinson, A.C.; Perrotta, D. Monitoring robust regression. Electron. J. Stat. 2014, 8, 646–677.
  47. Rousseeuw, P.; Perrotta, D.; Riani, M.; Hubert, M. Robust monitoring of time series with application to fraud detection. Econom. Stat. 2019, 9, 108–121.
  48. Riani, M.; Atkinson, A.C.; Corbellini, A.; Perrotta, D. Robust regression with density power divergence: Theory, comparisons, and data analysis. Entropy 2020, 22, 399.
Figure 1. SQ_p2 (magenta, long dashes), HD_p (red, short dashes), and SV_p1 (black, solid lines) contrasted with classical BA LoA (shaded area).
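The contrast shown in Figure 1 can be reproduced from first principles. The following sketch is illustrative only (not the authors' simulation code; `diffs` is assumed to hold the paired differences): it computes the classical Bland–Altman LoA and a simple order-statistic LoA. The default `exclusive` method of Python's `statistics.quantiles` interpolates at h = (n+1)p, which corresponds to the usual sample quantile definition underlying simple estimators such as SQ_p1.

```python
import statistics

def ba_loa(diffs):
    """Classical Bland-Altman LoA: mean difference +/- 1.96 * SD."""
    m = statistics.fmean(diffs)
    s = statistics.stdev(diffs)
    return m - 1.96 * s, m + 1.96 * s

def sq_loa(diffs):
    """Nonparametric LoA: interpolated 2.5% and 97.5% sample quantiles.

    statistics.quantiles with n=40 returns 39 cut points at k/40 for
    k = 1..39, so the first and last cut points are the 2.5% and 97.5%
    quantiles; the default 'exclusive' method interpolates at h = (n+1)p.
    """
    cuts = statistics.quantiles(diffs, n=40)
    return cuts[0], cuts[-1]
```

Note that for fewer than 39 observations, h = (n+1) * 0.025 falls below 1, so `statistics.quantiles` clamps the lower cut point to the sample minimum; this mirrors the fact that some estimators in Table 1 below are undefined for small samples.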
Table 1. Coverage probabilities for nonparametric Limits of Agreement (n = 30). Neither SQ_p2 nor KL_p is defined for n < 40.

| Estimator | ND    | ND 1% | ND 2% | ND 5% | ED    | LND   |
|-----------|-------|-------|-------|-------|-------|-------|
| SQ_p1     | 0.937 | 0.938 | 0.937 | 0.937 | 0.934 | 0.937 |
| SQ_p2     | -     | -     | -     | -     | -     | -     |
| SQI_p     | 0.926 | 0.927 | 0.927 | 0.928 | 0.921 | 0.926 |
| HD_p      | 0.911 | 0.923 | 0.923 | 0.924 | 0.916 | 0.920 |
| KL_p      | -     | -     | -     | -     | -     | -     |
| KC_p      | 0.900 | 0.890 | 0.877 | 0.851 | 0.885 | 0.880 |
| BP_p      | 0.905 | 0.908 | 0.908 | 0.909 | 0.897 | 0.904 |
| KQ_p1     | 0.904 | 0.882 | 0.865 | 0.832 | 0.936 | 0.922 |
| KQ_p2     | 0.912 | 0.893 | 0.880 | 0.857 | 0.916 | 0.906 |
| KQ_p^lc   | 0.915 | 0.893 | 0.874 | 0.838 | 0.923 | 0.919 |
| HD_p^lc   | 0.916 | 0.917 | 0.917 | 0.919 | 0.910 | 0.915 |
| SV_p1     | 0.924 | 0.925 | 0.925 | 0.926 | 0.920 | 0.924 |
| SV_p2     | 0.925 | 0.927 | 0.926 | 0.927 | 0.914 | 0.919 |
| SV_p3     | 0.925 | 0.926 | 0.925 | 0.926 | 0.923 | 0.929 |
| NO_p      | 0.813 | 0.814 | 0.817 | 0.821 | 0.808 | 0.834 |
Table 2. Coverage probabilities for nonparametric Limits of Agreement (n = 50). Asterisks mark coverage probabilities exceeding the nominal level of 0.95.

| Estimator | ND     | ND 1%  | ND 2%  | ND 5%  | ED     | LND    |
|-----------|--------|--------|--------|--------|--------|--------|
| SQ_p1     | 0.919  | 0.918  | 0.920  | 0.919  | 0.919  | 0.919  |
| SQ_p2     | 0.951* | 0.952* | 0.953* | 0.955* | 0.951* | 0.951* |
| SQI_p     | 0.934  | 0.935  | 0.937  | 0.938  | 0.931  | 0.934  |
| HD_p      | 0.939  | 0.940  | 0.942  | 0.945  | 0.935  | 0.938  |
| KL_p      | 0.938  | 0.932  | 0.930  | 0.917  | 0.937  | 0.929  |
| KC_p      | 0.924  | 0.906  | 0.898  | 0.879  | 0.915  | 0.912  |
| BP_p      | 0.922  | 0.925  | 0.927  | 0.931  | 0.919  | 0.922  |
| KQ_p1     | 0.919  | 0.902  | 0.896  | 0.861  | 0.931  | 0.933  |
| KQ_p2     | 0.924  | 0.909  | 0.900  | 0.881  | 0.924  | 0.916  |
| KQ_p^lc   | 0.924  | 0.912  | 0.901  | 0.867  | 0.944  | 0.930  |
| HD_p^lc   | 0.933  | 0.935  | 0.936  | 0.940  | 0.928  | 0.931  |
| SV_p1     | 0.940  | 0.941  | 0.942  | 0.945  | 0.935  | 0.939  |
| SV_p2     | 0.937  | 0.939  | 0.939  | 0.943  | 0.933  | 0.934  |
| SV_p3     | 0.938  | 0.941  | 0.942  | 0.944  | 0.938  | 0.941  |
| NO_p      | 0.839  | 0.843  | 0.845  | 0.855  | 0.845  | 0.870  |
Table 3. Coverage probabilities for nonparametric Limits of Agreement (n = 80). Asterisks mark coverage probabilities exceeding the nominal level of 0.95.

| Estimator | ND     | ND 1%  | ND 2%  | ND 5%  | ED     | LND    |
|-----------|--------|--------|--------|--------|--------|--------|
| SQ_p1     | 0.939  | 0.939  | 0.940  | 0.938  | 0.934  | 0.939  |
| SQ_p2     | 0.950  | 0.951* | 0.951* | 0.949  | 0.945  | 0.950  |
| SQI_p     | 0.941  | 0.941  | 0.942  | 0.941  | 0.935  | 0.941  |
| HD_p      | 0.950  | 0.952* | 0.954* | 0.955* | 0.940  | 0.949  |
| KL_p      | 0.943  | 0.939  | 0.935  | 0.933  | 0.938  | 0.939  |
| KC_p      | 0.936  | 0.923  | 0.916  | 0.895  | 0.925  | 0.929  |
| BP_p      | 0.937  | 0.939  | 0.941  | 0.942  | 0.929  | 0.937  |
| KQ_p1     | 0.934  | 0.926  | 0.921  | 0.887  | 0.934  | 0.940  |
| KQ_p2     | 0.936  | 0.925  | 0.917  | 0.897  | 0.930  | 0.931  |
| KQ_p^lc   | 0.933  | 0.928  | 0.922  | 0.890  | 0.942  | 0.939  |
| HD_p^lc   | 0.943  | 0.945  | 0.947  | 0.948  | 0.934  | 0.943  |
| SV_p1     | 0.951* | 0.952* | 0.954* | 0.955* | 0.940  | 0.949  |
| SV_p2     | 0.950  | 0.951* | 0.953* | 0.953* | 0.940  | 0.948  |
| SV_p3     | 0.949  | 0.951* | 0.952* | 0.954* | 0.940  | 0.950  |
| NO_p      | 0.888  | 0.893  | 0.896  | 0.901  | 0.891  | 0.910  |
Table 4. Coverage probabilities for nonparametric Limits of Agreement (n = 100). Asterisks mark coverage probabilities exceeding the nominal level of 0.95.

| Estimator | ND     | ND 1%  | ND 2%  | ND 5%  | ED     | LND    |
|-----------|--------|--------|--------|--------|--------|--------|
| SQ_p1     | 0.941  | 0.941  | 0.941  | 0.942  | 0.941  | 0.941  |
| SQ_p2     | 0.952* | 0.952* | 0.952* | 0.954* | 0.953* | 0.952* |
| SQI_p     | 0.941  | 0.941  | 0.941  | 0.942  | 0.941  | 0.941  |
| HD_p      | 0.950  | 0.951* | 0.953* | 0.957* | 0.948  | 0.950  |
| KL_p      | 0.947  | 0.941  | 0.939  | 0.938  | 0.946  | 0.941  |
| KC_p      | 0.940  | 0.929  | 0.920  | 0.902  | 0.935  | 0.935  |
| BP_p      | 0.940  | 0.942  | 0.944  | 0.948  | 0.937  | 0.940  |
| KQ_p1     | 0.937  | 0.935  | 0.921  | 0.896  | 0.942  | 0.943  |
| KQ_p2     | 0.939  | 0.930  | 0.920  | 0.903  | 0.940  | 0.935  |
| KQ_p^lc   | 0.938  | 0.935  | 0.932  | 0.905  | 0.940  | 0.941  |
| HD_p^lc   | 0.944  | 0.946  | 0.948  | 0.951* | 0.942  | 0.944  |
| SV_p1     | 0.950  | 0.952* | 0.954* | 0.959* | 0.948  | 0.950  |
| SV_p2     | 0.950  | 0.952* | 0.953* | 0.957* | 0.948  | 0.949  |
| SV_p3     | 0.950  | 0.951* | 0.953* | 0.959* | 0.947  | 0.950  |
| NO_p      | 0.911  | 0.914  | 0.917  | 0.924  | 0.913  | 0.924  |
Table 5. Coverage probabilities for nonparametric Limits of Agreement (n = 150). Asterisks mark coverage probabilities exceeding the nominal level of 0.95.

| Estimator | ND     | ND 1%  | ND 2%  | ND 5%  | ED     | LND    |
|-----------|--------|--------|--------|--------|--------|--------|
| SQ_p1     | 0.945  | 0.945  | 0.944  | 0.946  | 0.946  | 0.945  |
| SQ_p2     | 0.949  | 0.949  | 0.948  | 0.951* | 0.950  | 0.949  |
| SQI_p     | 0.942  | 0.942  | 0.941  | 0.943  | 0.944  | 0.942  |
| HD_p      | 0.948  | 0.949  | 0.950  | 0.954* | 0.948  | 0.948  |
| KL_p      | 0.945  | 0.941  | 0.939  | 0.936  | 0.947  | 0.945  |
| KC_p      | 0.941  | 0.934  | 0.930  | 0.906  | 0.939  | 0.940  |
| BP_p      | 0.942  | 0.944  | 0.943  | 0.947  | 0.941  | 0.942  |
| KQ_p1     | 0.940  | 0.936  | 0.928  | 0.903  | 0.943  | 0.943  |
| KQ_p2     | 0.940  | 0.932  | 0.923  | 0.908  | 0.941  | 0.937  |
| KQ_p^lc   | 0.939  | 0.935  | 0.928  | 0.904  | 0.941  | 0.943  |
| HD_p^lc   | 0.944  | 0.946  | 0.945  | 0.949  | 0.944  | 0.944  |
| SV_p1     | 0.949  | 0.951* | 0.952* | 0.957* | 0.949  | 0.949  |
| SV_p2     | 0.949  | 0.952* | 0.952* | 0.956* | 0.948  | 0.950  |
| SV_p3     | 0.949  | 0.950  | 0.951* | 0.954* | 0.947  | 0.949  |
| NO_p      | 0.934  | 0.937  | 0.937  | 0.943  | 0.936  | 0.940  |
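The Harrell–Davis estimator HD_p appearing throughout the tables averages all order statistics, with weights given by increments of the Beta((n+1)p, (n+1)(1-p)) distribution function. The following self-contained sketch evaluates the regularized incomplete beta function with the standard continued-fraction algorithm; it is an illustration under those definitions, not the code used for the simulations.

```python
import math

def _betacf(a, b, x):
    """Continued fraction for the regularized incomplete beta (modified Lentz)."""
    EPS, FPMIN = 3e-14, 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) >= FPMIN else FPMIN)
    h = d
    for m in range(1, 201):
        m2 = 2 * m
        # odd and even continued-fraction coefficients for step m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) >= FPMIN else FPMIN)
            c = 1.0 + aa / c
            if abs(c) < FPMIN:
                c = FPMIN
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < EPS:
            break
    return h

def beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b), i.e. the Beta(a, b) CDF."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(ln_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def harrell_davis(sample, p):
    """HD_p: weight each order statistic by the Beta(a, b) mass on [(i-1)/n, i/n],
    with a = (n+1)p and b = (n+1)(1-p)."""
    xs = sorted(sample)
    n = len(xs)
    a, b = (n + 1) * p, (n + 1) * (1 - p)
    cdf = [beta_cdf(i / n, a, b) for i in range(n + 1)]
    return sum((cdf[i + 1] - cdf[i]) * xs[i] for i in range(n))

def hd_loa(diffs):
    """Nonparametric LoA via Harrell-Davis estimates of the 2.5% and 97.5% percentiles."""
    return harrell_davis(diffs, 0.025), harrell_davis(diffs, 0.975)
```

Because every observation receives positive weight, the resulting limits vary smoothly with the data, which is one reason HD_p may be considered more appealing than estimators based on only one or two order statistics per quantile.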

Share and Cite

Frey, M.E.; Petersen, H.C.; Gerke, O. Nonparametric Limits of Agreement for Small to Moderate Sample Sizes: A Simulation Study. Stats 2020, 3, 343-355. https://0-doi-org.brum.beds.ac.uk/10.3390/stats3030022