A New One-Parameter Distribution for Right Censored Bayesian and Non-Bayesian Distributional Validation under Various Estimation Methods

Emam, Walid; Tashkandy, Yusra; Goual, Hafida; Hamida, Talhi; Hiba, Aiachi; Ali, M. Masoom; Yousof, Haitham M.; Ibrahim, Mohamed

doi:10.3390/math11040897


¹ Department of Statistics and Operations Research, Faculty of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia

² Department of Mathematics, Laboratory of Probability and Statistics LaPS, Badji Mokhtar Annaba University, Annaba 23000, Algeria

³ Department of Mathematical Sciences, Ball State University, Muncie, IN 47306, USA

⁴ Department of Statistics, Mathematics and Insurance, Faculty of Commerce, Benha University, Benha 13518, Egypt

⁵ Department of Applied, Mathematical and Actuarial Statistics, Faculty of Commerce, Damietta University, Damietta 34517, Egypt

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(4), 897; https://doi.org/10.3390/math11040897

Submission received: 2 January 2023 / Revised: 1 February 2023 / Accepted: 3 February 2023 / Published: 10 February 2023

(This article belongs to the Special Issue Distribution Theory and Application)


Abstract

We propose a new extension of the exponential distribution for right censored Bayesian and non-Bayesian distributional validation. The parameter of the new distribution is estimated using several conventional methods, including the Bayesian method. The likelihood estimates and the Bayesian estimates are compared using Pitman’s closeness criteria. The Bayesian estimators are derived using three loss functions: the extended quadratic, the Linex, and the entropy functions. Through simulated experiments, all the estimating approaches offered have been assessed. The censored maximum likelihood method and the Bayesian approach are compared using the BB algorithm. The development of the Nikulin–Rao–Robson statistic for the new model in the uncensored situation is thoroughly discussed with the aid of two applications and a simulation exercise. For the novel model under the censored condition, two applications and the derivation of the Bagdonavičius and Nikulin statistic are also described.

Keywords: Bagdonavičius and Nikulin statistic; Bayesian estimation; BB method; censored applications; Lomax model; Nikulin–Rao–Robson; Pitman’s proximity

MSC: 62N01; 62N02; 62E10

1. Introduction

The exponential distribution is one of the most frequently encountered continuous probability distributions and is commonly used to model the time between events. It is the probability distribution of the interval between events in a Poisson point process, that is, a process in which events occur continuously and independently at a constant average rate. It is a special case of the gamma distribution, it is the continuous counterpart of the geometric distribution, and it possesses the key property of being memoryless. In addition to the analysis of Poisson point processes, it is used in numerous other contexts. The exponential distribution should not be confused with the class of exponential families of distributions, a major class of probability distributions that also includes the normal, binomial, gamma, and Poisson distributions, among many others. The exponential distribution is used in many disciplines, such as physics, reliability theory, and queuing theory. Examples include describing the height of molecules in a gas at a stable temperature and pressure in a uniform gravitational field, and modeling the monthly and annual maximum values of rainfall and river outflow volumes. In statistical inference and reliability, selecting a suitable baseline model for the analysis of new data is becoming increasingly crucial, because even a small departure from the baseline model can significantly affect the results. Censoring makes this task more difficult.
The chi-square type test is the one used most frequently to assess goodness-of-fit, and numerous modifications of chi-square tests have been proposed by various researchers.

One of the aims of this study is to present a goodness-of-fit test for our parametric model, a type of model often used in survival analysis, the social sciences, engineering, and reliability, both for complete data and in the presence of right censoring. For the one-parameter Poisson-exponential (OPPE) model, we present the explicit forms of the quadratic test statistics (the Nikulin–Rao–Robson test and the Bagdonavičius and Nikulin test statistic; see Bagdonavičius et al. [1], Bagdonavičius and Nikulin [2], Bagdonavičius and Nikulin [3], Nikulin [4], Nikulin [5], Nikulin [6], and Rao and Robson [7]). We then apply both tests to real data. This study also demonstrates how the Bayesian technique may improve on the maximum likelihood estimate of the parameter and of the mean time between failures of the OPPE distribution, and we illustrate the utility of this adjustment using suitable loss functions. The OPPE distribution is distinguished by its tractability in statistical and mathematical modeling. Unlike many studies of new probability distributions, we do not follow the usual approach in this work: we place less emphasis on the conventional study of the new distribution, not because it is unimportant, but because we are more concerned with the practical applications of the model and with validating the distribution using censored data. For this reason, we omit several theoretical features, many algebraic derivations, and related theory, and focus on demonstrating the flexibility of the new distribution, its range of applications in statistical and mathematical modeling, and the handling of censored data. 
To assist scholars in the field in developing similar and possibly more flexible distributions, we briefly describe how the new distribution is constructed. The cumulative distribution function (CDF) of the OPPE distribution can be written as

\Pr (X_{δ} \leq x) = F_{δ} (x) = \frac{1}{1 - \exp (- 1)} \{1 - \exp [- {\bar{Δ}}_{δ} (x)]\},

where x > 0 and δ > 0,

{\bar{Δ}}_{δ} (x) = 1 - Δ_{δ} (x)

and

Δ_{δ} (x) = {[\exp (- δ x)]}^{[1 - \exp (- δ x)]} .

The corresponding probability density function (PDF) can then be expressed as

f_{δ} (x) = \frac{δ}{1 - \exp (- 1)} \frac{Δ_{δ} (x)}{\exp [δ x + {\bar{Δ}}_{δ} (x)]} [δ x + \exp (δ x) - 1],

for all x > 0 and δ > 0.

Many scholars have been interested in the exponential distribution, in providing new flexible extensions of it, and in applying these extensions in various fields, such as engineering, insurance, medicine, reliability, and actuarial science. In this work, we develop a modified chi-square type goodness-of-fit test for the OPPE model when the parameter is unknown and the data are complete. This test is based on the Nikulin–Rao–Robson (NKRR) statistic, which was proposed separately by Nikulin [3], Nikulin [4], Nikulin [5], Nikulin [6], and Rao and Robson [7]. This Y_{n}^{2} statistic is a natural refinement of the conventional chi-square test, based on the maximum likelihood estimator (MLE) computed from the original data. First, the statistic uses the original data in full to estimate the parameter under the null hypothesis. Second, it fully exploits the Fisher information, which measures the information about the parameter contained in the chosen model, evaluated at the MLE. This provides a more objective assessment of the fit of the chosen model. Next, for the case in which the parameter is unknown and the data are right-censored, we construct a new goodness-of-fit test for this model. This modified chi-square Y_{n}^{2} test adapts the NKRR statistic to account for both censoring and the unknown parameter. We conduct an extensive numerical simulation study to demonstrate the invariance of the distribution of the test statistic on the one hand, and to test the null hypothesis H_{0} that a sample originates from an OPPE model on the other. To this end, we compute the NKRR statistic Y_{n}^{2} and the Bagdonavičius and Nikulin statistic for various simulated samples. The results indicate that the proposed tests are suitable for complete and censored data from the OPPE distribution, respectively. We conclude with applications to real data, with the aim of demonstrating the applicability of the OPPE model across many scientific disciplines.
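As an illustration of the CDF and PDF given above, the following minimal Python sketch evaluates both functions and inverts the CDF numerically. The function names (`oppe_cdf`, `oppe_pdf`, `oppe_quantile`) and the bisection-based inversion are assumptions made here for illustration and are not part of the original study; the sketch relies only on the identity [exp(-δx)]^{1 - exp(-δx)} = exp(-δx(1 - exp(-δx))).

```python
import math

NORM = 1.0 - math.exp(-1.0)  # normalising constant 1 - exp(-1)

def oppe_cdf(x, delta):
    """F_delta(x) for x > 0 and delta > 0."""
    u = delta * x
    big_delta = math.exp(-u * (1.0 - math.exp(-u)))  # Delta_delta(x)
    return (1.0 - math.exp(-(1.0 - big_delta))) / NORM

def oppe_pdf(x, delta):
    """f_delta(x), obtained by differentiating the CDF with respect to x."""
    u = delta * x
    big_delta = math.exp(-u * (1.0 - math.exp(-u)))
    # exp(-u) * (u + exp(u) - 1) rewritten as 1 - exp(-u) + u * exp(-u)
    # so that exp(u) never overflows for large u.
    bracket = 1.0 - math.exp(-u) + u * math.exp(-u)
    return delta / NORM * big_delta * math.exp(-(1.0 - big_delta)) * bracket

def oppe_quantile(q, delta, tol=1e-12):
    """Numerical inverse of the CDF for 0 < q < 1 (hypothetical helper, bisection)."""
    lo, hi = 0.0, 1.0
    while oppe_cdf(hi, delta) < q:      # expand the bracket until it contains q
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if oppe_cdf(mid, delta) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because the CDF depends on x only through the product δx, a quantile for any δ could equally be obtained by dividing the δ = 1 quantile by δ.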

It is worth noting that the basic NKRR test, which is designed for complete data, has been among the most popular tests over the last ten years. This is because complete data arise in most practical applications across various fields. The test is also distinguished by the availability of ready-made statistical packages in R, both for simulation and for applications to real data. However, in many fields (such as medicine, chemistry, and engineering) researchers must deal with experiments that produce censored data. Such data require dedicated tests in the problem of distributional validation, and the basic NKRR test is not the best choice in these cases. This dilemma was the primary motive that prompted many researchers to introduce a statistical test suited to censored data. This modified NKRR test is derived from the original NKRR test. By the nature of their procedures, each of the two tests is suited to only one type of data. The original test applies only to complete data and cannot handle censored data; it is therefore a strong candidate for statistical hypothesis testing when the data are complete. Conversely, the modified test is designed only for censored data and not for complete data; it is a strong candidate for statistical hypothesis testing when the data are censored.

2. Construction of NKRR Statistic for the OPPE Model

The NKRR statistic is a well-known variant of the traditional chi-squared tests for complete data (for more information, see Nikulin [3], Nikulin [4], Nikulin [5], Nikulin [6], and Rao and Robson [7]). The most common test for checking whether a mathematical model is suitable for observed data is Pearson's chi-square statistic. However, this test cannot be used when the model's parameters are unknown or the data are censored. Natural adaptations of the Pearson statistic for complete data were reported by Nikulin [3], Nikulin [4], Nikulin [5], Nikulin [6], and Rao and Robson [7] and are known as the NKRR statistic. This statistic is a natural extension of the Pearson statistic and uses the chi-square distribution as its reference distribution. When censoring is present in addition to the unknown parameter, the classical test is insufficient for testing the null hypothesis. Bagdonavičius et al. [1], Bagdonavičius and Nikulin [2], Bagdonavičius and Nikulin [3], Nikulin [4], Nikulin [5], Nikulin [6], and Rao and Robson [7] suggested modifying the NKRR statistic to take random right censoring into account. For the OPPE model in the current study, we construct a modified chi-square test. The following NKRR statistic Y^{2} was developed by Nikulin [3], Nikulin [4], Nikulin [5], Nikulin [6], and Rao and Robson [7] to test the hypothesis

H_{0} : \Pr \{X_{δ} \leq x\} = F_{δ} (x) |_{x \in R},

where, in our case, F_{δ} (x) is the OPPE CDF defined for x > 0, x_{1}, x_{2}, \dots, x_{n} is a sample belonging to the parametric family F_{δ} (x), and where

Y^{2} ({\hat{\underline{δ}}}_{n}) = X_{n}^{2} ({\hat{\underline{δ}}}_{n}) + n^{- 1} L^{T} ({\hat{\underline{δ}}}_{n}) {(I ({\hat{\underline{δ}}}_{n}) - J ({\hat{\underline{δ}}}_{n}))}^{- 1} L ({\hat{\underline{δ}}}_{n}),

where X_{n}^{2} ({\hat{\underline{δ}}}_{n}) = X_{n}^{T} ({\hat{\underline{δ}}}_{n}) X_{n} ({\hat{\underline{δ}}}_{n}) is the Pearson sum built from the vector

X_{n} ({\hat{\underline{δ}}}_{n}) = {(\frac{ς_{1} - n p_{1} ({\hat{\underline{δ}}}_{n})}{\sqrt{n p_{1} ({\hat{\underline{δ}}}_{n})}}, \frac{ς_{2} - n p_{2} ({\hat{\underline{δ}}}_{n})}{\sqrt{n p_{2} ({\hat{\underline{δ}}}_{n})}}, \dots, \frac{ς_{b} - n p_{b} ({\hat{\underline{δ}}}_{n})}{\sqrt{n p_{b} ({\hat{\underline{δ}}}_{n})}})}^{T}

and J ({\hat{\underline{δ}}}_{n}) is the information matrix (Inf-Mx) for the grouped data,

J ({\hat{\underline{δ}}}_{n}) = ℬ {({\hat{\underline{δ}}}_{n})}^{T} ℬ ({\hat{\underline{δ}}}_{n}),

with

ℬ ({\hat{\underline{δ}}}_{n}) = {[\frac{1}{\sqrt{p_{i}}} \frac{\partial p_{i} ({\hat{\underline{δ}}}_{n})}{\partial δ_{k}}]}_{b \times s} |_{(i = 1, 2, \dots, b and k = 1, 2, \dots, s)},

and then

L (δ) = {(L_{1} (δ), \dots, L_{s} (δ))}^{T} with L_{k} (δ) = \sum_{i = 1}^{b} \frac{ς_{i}}{p_{i}} \frac{\partial}{\partial δ_{k}} p_{i} (δ),

where I (\hat{δ}) is the estimated Fisher Inf-Mx and \hat{δ} stands for the maximum likelihood estimator of the parameter vector. Under H_{0}, the Y^{2} statistic has (b - 1) degrees of freedom and follows the χ_{b - 1}^{2} distribution in the limit.

Consider a set of observations x_{1}, x_{2}, \dots, x_{n} that are grouped into b mutually disjoint subintervals I_{1}, I_{2}, \dots, I_{b}, with

I_{j} = (a_{j - 1, b} (x) ; a_{j, b} (x)] .

The limits a_{j, b} (x) of the intervals I_{j} are determined as follows

p_{j} (δ) = \int_{a_{j - 1, b} (x)}^{a_{j, b} (x)} f_{δ} (x) d x |_{(j = 1, 2, \dots, b)},

and

a_{j, b} (x) = F^{- 1} (\frac{j}{b}) |_{(j = 1, \dots, b - 1)} .

The vector of frequencies ς = {(ς_{1}, ς_{2}, \dots, ς_{b})}^{T} is generated by grouping the data into the intervals I_{j}, where

ς_{j} = \sum_{i = 1}^{n} 1_{\{x_{i} \in I_{j}\}} |_{(j = 1, \dots, b)} .

To determine whether the data are distributed in accordance with the OPPE model when the parameter δ is unknown, we construct an NKRR test statistic in this study as a modified goodness-of-fit test. After computing the maximum likelihood estimator \hat{δ} of the unknown parameter of the OPPE distribution from the data set, we use the estimated Fisher Inf-Mx to obtain all the components of the Y^{2} statistic for our model. For further applications to other data sets, see Ibrahim et al. [8] and Yadav et al. [9].
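For a model with a single parameter (s = 1), the matrices in the statistic above reduce to scalars, so that Y^2 = X_n^2 + L^2 / (n (I - J)). The sketch below shows this scalar case only. The function name `nkrr_statistic` and its argument layout are hypothetical; the cell probabilities, their derivatives with respect to δ, and the per-observation Fisher information estimate are assumed to have been computed beforehand at the MLE.

```python
def nkrr_statistic(counts, probs, dprobs, fisher_info):
    """Scalar-parameter NKRR statistic Y^2 (illustrative sketch).

    counts      : observed cell frequencies (one entry per interval I_j)
    probs       : cell probabilities p_j evaluated at the MLE
    dprobs      : derivatives d p_j / d delta evaluated at the MLE
    fisher_info : estimated Fisher information of one observation, I(delta_hat)
    """
    n = sum(counts)
    # Pearson part: sum of (observed - expected)^2 / expected
    pearson = sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))
    # Information of the grouped data, J = B^T B
    info_grouped = sum(dp ** 2 / p for p, dp in zip(probs, dprobs))
    # Score-type term L = sum_j (count_j / p_j) * d p_j / d delta
    score = sum(c * dp / p for c, p, dp in zip(counts, probs, dprobs))
    return pearson + score ** 2 / (n * (fisher_info - info_grouped))


# Toy numbers chosen only to exercise the formula (not taken from the paper).
y2 = nkrr_statistic([6, 4], [0.5, 0.5], [0.1, -0.1], 1.0)
```

The resulting value would then be compared with a quantile of the chi-square distribution with b - 1 degrees of freedom.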

3. Estimation and Inference

This section discusses non-Bayesian and Bayesian estimation methods. Six non-Bayesian estimation methods are considered: maximum likelihood estimation (MLE), Cramér–von Mises estimation (CVME), ordinary least squares estimation (OLSQE), weighted least squares estimation (WLSQE), the method of moments, and Kolmogorov estimation (KE).

3.1. Classical Estimation Methods

3.1.1. Maximum Likelihood Method

Maximum likelihood estimation (MLE) is a statistical approach for estimating the unknown parameter of a probability distribution from observed data. Consider an n-sample (x_{1}, x_{2}, \dots, x_{n}) and a fixed constant m, and assume that the m-sample (x_{1}, x_{2}, \dots, x_{m}) of the m smallest observations is generated from the OPPE distribution (type II censoring). The likelihood function of this sample is

L_{δ} (x) = N \prod_{i = 1}^{m} f_{δ} (x_{i}) {[1 - F_{δ} (x_{m})]}^{n - m},

where

N = \frac{n!}{(n - m)!} .

Using the CDF (1) and the PDF (2), we have

L_{δ} (x) = N δ^{m} {(1 - \exp (- 1))}^{- n} C_{δ} {(x_{m})}^{n - m} \prod_{i = 1}^{m} A_{δ} (x_{i}) ℬ_{δ} (x_{i}),

in which

A_{δ} (x_{i}) = Δ_{δ} (x_{i}) [δ x_{i} - 1 + \exp (δ x_{i})],

ℬ_{δ} (x_{i}) = \exp (- δ x_{i} - (1 - Δ_{δ} (x_{i}))),

and

C_{δ} (x_{m}) = \exp (- (1 - Δ_{δ} (x_{m}))) - \exp (- 1),

so that 1 - F_{δ} (x_{m}) = C_{δ} (x_{m}) / (1 - \exp (- 1)). To obtain the maximum likelihood estimate (MLE) of δ, we use the log-likelihood function. For a complete sample, it is

ℓ (δ) = \log [\prod_{i = 1}^{n} f_{δ} (x_{i, n})],

and for the censored sample above it is

l_{δ} (x) = \ln N + m \ln δ - n \ln (1 - \exp (- 1)) + (n - m) \ln C_{δ} (x_{m}) + \sum_{i = 1}^{m} \ln A_{δ} (x_{i}) + \sum_{i = 1}^{m} \ln ℬ_{δ} (x_{i}) .

The maximum likelihood estimator δ_{M L E} of the parameter δ is the solution of the following non-linear equation

\frac{\partial l_{δ} (x)}{\partial δ} = \frac{m}{δ} + (n - m) \frac{1}{C_{δ} (x_{m})} \frac{\partial C_{δ} (x_{m})}{\partial δ} + \sum_{i = 1}^{m} \frac{1}{A_{δ} (x_{i})} \frac{\partial A_{δ} (x_{i})}{\partial δ} + \sum_{i = 1}^{m} \frac{1}{ℬ_{δ} (x_{i})} \frac{\partial ℬ_{δ} (x_{i})}{\partial δ} = 0

where

\frac{\partial}{\partial δ} Δ_{δ} (x_{i}) = - [x_{i} - x_{i} \exp (- δ x_{i}) (1 - δ x_{i})] \exp (- δ x_{i} (1 - \exp (- δ x_{i}))),

\frac{\partial}{\partial δ} A_{δ} (x_{i}) = (\frac{\partial}{\partial δ} Δ_{δ} (x_{i})) (δ x_{i} - 1 + \exp (δ x_{i})) + Δ_{δ} (x_{i}) (x_{i} + x_{i} \exp (δ x_{i})),

\frac{\partial}{\partial δ} ℬ_{δ} (x_{i}) = [(\frac{\partial}{\partial δ} Δ_{δ} (x_{i})) - x_{i}] \exp (- δ x_{i} - (1 - Δ_{δ} (x_{i}))),

and

\frac{\partial}{\partial δ} C_{δ} (x_{m}) = (\frac{\partial}{\partial δ} Δ_{δ} (x_{m})) \exp (- (1 - Δ_{δ} (x_{m}))) .
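Rather than solving the score equation directly, the censored log-likelihood can be maximised numerically. The sketch below is one possible way to do this and is not the authors' implementation: it evaluates the type II censored log-likelihood (dropping the constant ln N), computes the censoring term directly as ln[1 - F_δ(x_m)], and uses a golden-section search, which assumes the log-likelihood is unimodal on the chosen bracket. All function names and the sample values are hypothetical.

```python
import math

NORM = 1.0 - math.exp(-1.0)

def big_delta(x, delta):
    """Delta_delta(x) = exp(-delta*x*(1 - exp(-delta*x)))."""
    u = delta * x
    return math.exp(u * math.expm1(-u))

def log_pdf(x, delta):
    u = delta * x
    log_d = u * math.expm1(-u)                     # ln Delta_delta(x)
    bracket = -math.expm1(-u) + u * math.exp(-u)   # exp(-u) * (u + exp(u) - 1)
    return math.log(delta / NORM) + log_d - (1.0 - math.exp(log_d)) + math.log(bracket)

def log_sf(x, delta):
    """ln[1 - F_delta(x)], using 1 - F = exp(-1) * (exp(Delta) - 1) / (1 - exp(-1))."""
    return -1.0 + math.log(math.expm1(big_delta(x, delta))) - math.log(NORM)

def censored_loglik(delta, failures, n):
    """Type II censored log-likelihood, up to the additive constant ln N."""
    m = len(failures)
    return sum(log_pdf(x, delta) for x in failures) + (n - m) * log_sf(max(failures), delta)

def golden_max(func, lo, hi, tol=1e-8):
    """Golden-section search for the maximiser of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) > func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Illustrative values only (not data from the paper): m = 5 failures out of n = 8 units.
failures = [0.2, 0.5, 0.7, 1.1, 1.6]
delta_hat = golden_max(lambda d: censored_loglik(d, failures, n=8), 0.01, 10.0)
```

A derivative-based root finder applied to the score equation would be an alternative when the derivatives listed above are coded explicitly.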

3.1.2. The CVME Method

The CVME of the parameter δ is obtained by minimizing the following expression with respect to δ:

{CVME}_{(δ)} = \frac{1}{12 n} + \sum_{i = 1}^{n} {[F_{δ} (x_{i, n}) - l_{(i, n)}^{[1]}]}^{2},

where x_{i, n} denotes the i-th order statistic of the sample and

l_{(i, n)}^{[1]} = \frac{1}{2 n} (2 i - 1),

so that

{CVME}_{(δ)} = \frac{1}{12 n} + \sum_{i = 1}^{n} {[\frac{1 - \exp [- {\bar{Δ}}_{δ} (x_{i, n})]}{1 - \exp (- 1)} - l_{(i, n)}^{[1]}]}^{2} .

The CVME of the parameter δ is then obtained by solving the following non-linear equation

0 = \sum_{i = 1}^{n} (\frac{1 - \exp [- {\bar{Δ}}_{δ} (x_{i, n})]}{1 - \exp (- 1)} - l_{(i, n)}^{[1]}) ς_{(δ)} (x_{i, n}, δ),

where ς_{(δ)} (x_{i, n}, δ) = \partial F_{δ} (x_{i, n}) / \partial δ is the first derivative of the CDF of the OPPE distribution with respect to δ.
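The Cramér–von Mises objective can also be minimised directly instead of solving the estimating equation. The following sketch does so with a golden-section search; it assumes the objective is unimodal on the search bracket, and the names `cvme_objective` and `golden_min` as well as the sample values are hypothetical.

```python
import math

NORM = 1.0 - math.exp(-1.0)

def oppe_cdf(x, delta):
    u = delta * x
    big_delta = math.exp(u * math.expm1(-u))       # Delta_delta(x)
    return -math.expm1(-(1.0 - big_delta)) / NORM  # (1 - exp(-(1 - Delta))) / (1 - exp(-1))

def cvme_objective(delta, sample):
    """1/(12n) + sum_i [F(x_(i)) - (2i - 1)/(2n)]^2 over the ordered sample."""
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12.0 * n) + sum(
        (oppe_cdf(x, delta) - (2 * i - 1) / (2.0 * n)) ** 2
        for i, x in enumerate(xs, start=1)
    )

def golden_min(func, lo, hi, tol=1e-8):
    """Golden-section search for the minimiser of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Illustrative values only (not data from the paper).
sample = [0.2, 0.5, 0.7, 1.1, 1.6, 2.0, 2.4, 3.1]
delta_hat = golden_min(lambda d: cvme_objective(d, sample), 0.01, 10.0)
```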

3.1.3. The OLSQ Method

Let F_{δ} (x_{i, n}) denote the CDF of the OPPE model and let x_{1} < x_{2} < \dots < x_{n} be the n ordered values of a random sample. The OLSQE is obtained by minimizing

O L S_{(δ)} = \sum_{i = 1}^{n} {[F_{δ} (x_{i, n}) - l_{(i, n)}^{[2]}]}^{2},

where l_{(i, n)}^{[2]} = \frac{i}{n + 1}. Then, we have

O L S_{(δ)} = \sum_{i = 1}^{n} {[\frac{1 - \exp [- {\bar{Δ}}_{δ} (x_{i, n})]}{1 - \exp (- 1)} - l_{(i, n)}^{[2]}]}^{2} .

The OLSQE is obtained by solving the following non-linear equation

0 = \sum_{i = 1}^{n} [\frac{1 - \exp [- {\bar{Δ}}_{δ} (x_{i, n})]}{1 - \exp (- 1)} - l_{(i, n)}^{[2]}] ς_{(δ)} (x_{i, n}, δ),

where ς_{(δ)} (x_{i, n}, δ) is defined above.

3.1.4. The WLSQE Method

The WLSQE is obtained by minimizing the function W L S_{(δ)} with respect to δ,

W L S_{(δ)} = \sum_{i = 1}^{n} l_{(i, n)}^{[3]} {[F_{δ} (x_{i, n}) - l_{(i, n)}^{[2]}]}^{2},

where

l_{(i, n)}^{[3]} = [{(1 + n)}^{2} (2 + n)] / [i (1 + n - i)] .

The WLSQE is obtained by solving

0 = \sum_{i = 1}^{n} l_{(i, n)}^{[3]} [\frac{1 - \exp [- {\bar{Δ}}_{δ} (x_{i, n})]}{1 - \exp (- 1)} - l_{(i, n)}^{[2]}] ς_{(δ)} (x_{i, n}, δ),

where ς_{(δ)} (x_{i, n}, δ) is defined above.
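The ordinary and weighted least squares objectives differ only in the weights attached to each squared residual. The sketch below codes both objectives and locates their minimisers with a crude grid search, which is used here only for simplicity and is not the procedure of the original study; the function names, the grid limits, and the sample values are assumptions.

```python
import math

NORM = 1.0 - math.exp(-1.0)

def oppe_cdf(x, delta):
    u = delta * x
    big_delta = math.exp(u * math.expm1(-u))       # Delta_delta(x)
    return -math.expm1(-(1.0 - big_delta)) / NORM

def ols_objective(delta, sample):
    """sum_i [F(x_(i)) - i/(n + 1)]^2"""
    xs = sorted(sample)
    n = len(xs)
    return sum((oppe_cdf(x, delta) - i / (n + 1.0)) ** 2
               for i, x in enumerate(xs, start=1))

def wls_objective(delta, sample):
    """sum_i w_i [F(x_(i)) - i/(n + 1)]^2 with w_i = (n+1)^2 (n+2) / (i (n+1-i))"""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = (n + 1.0) ** 2 * (n + 2.0) / (i * (n + 1.0 - i))
        total += weight * (oppe_cdf(x, delta) - i / (n + 1.0)) ** 2
    return total

def grid_argmin(func, lo, hi, steps=2000):
    """Return the grid point in [lo, hi] with the smallest objective value."""
    grid = [lo + k * (hi - lo) / steps for k in range(steps + 1)]
    return min(grid, key=func)

# Illustrative values only (not data from the paper).
sample = [0.2, 0.5, 0.7, 1.1, 1.6, 2.0, 2.4, 3.1]
delta_ols = grid_argmin(lambda d: ols_objective(d, sample), 0.05, 5.0)
delta_wls = grid_argmin(lambda d: wls_objective(d, sample), 0.05, 5.0)
```

The weights grow towards both ends of the ordered sample, so the weighted version gives more influence to the smallest and largest observations.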

3.1.5. Method of Moments

The method of moments is a technique used in statistics to estimate population parameters. It begins by expressing the population moments (i.e., the expected values of powers of the random variable under consideration) as functions of the parameters of interest. These expressions are then set equal to the corresponding sample moments, using exactly as many equations as there are parameters to be estimated, and the equations are solved for the parameters. The solutions are the estimates of those parameters. Higher-order quantities such as skewness and kurtosis can be treated in the same way. The moment estimate of the single parameter of the OPPE distribution is obtained by equating the first theoretical moment of (2) with the corresponding sample moment as follows

μ_{1}^{'} = E (X_{i, n}) = \frac{1}{n} \sum_{i = 1}^{n} x_{i, n}

where μ_{1}^{'} can be evaluated from (2).
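Because the OPPE CDF depends on x only through the product δx, δ acts as a rate parameter and the first moment has the form E(X) = c / δ, where c is the mean for δ = 1. Under this observation, the moment equation can be solved explicitly as δ = c / (sample mean). The sketch below evaluates c numerically by integrating the survival function with Simpson's rule; the function names, the truncation point, and the number of subintervals are assumptions made for illustration.

```python
import math

NORM = 1.0 - math.exp(-1.0)

def survival_unit(u):
    """1 - F_1(u): survival function of the OPPE model with delta = 1."""
    big_delta = math.exp(u * math.expm1(-u))
    # 1 - F = [exp(-(1 - Delta)) - exp(-1)] / (1 - exp(-1))
    return math.exp(-1.0) * math.expm1(big_delta) / NORM

def unit_mean(upper=60.0, n_sub=6000):
    """E[X] for delta = 1 as the integral of the survival function (Simpson's rule).

    The integral is truncated at `upper`; the neglected tail decays roughly
    like exp(-u). `n_sub` must be even.
    """
    h = upper / n_sub
    total = survival_unit(0.0) + survival_unit(upper)
    for k in range(1, n_sub):
        total += (4 if k % 2 else 2) * survival_unit(k * h)
    return total * h / 3.0

def moment_estimate(sample):
    """Solve c / delta = sample mean for delta."""
    return unit_mean() / (sum(sample) / len(sample))
```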

3.1.6. KE Method

The Kolmogorov estimate (KE) \hat{δ} of δ is obtained by minimizing the function

K E = K E (δ) = \max_{1 \leq i \leq n} \{\frac{1}{n} i - F_{δ} (x_{i, n}), F_{δ} (x_{i, n}) - \frac{1}{n} (i - 1)\} .

For each i with 1 \leq i \leq n, the quantities [\frac{1}{n} i - F_{δ} (x_{i, n})] and [F_{δ} (x_{i, n}) - \frac{1}{n} (i - 1)] are compared and the larger one is selected; the KE \hat{δ} of δ is then obtained by minimizing the resulting function K E (δ) over δ.

For more detail about the KE method.
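The Kolmogorov objective is the largest distance between the fitted CDF and the empirical distribution function over the ordered sample. The sketch below evaluates this objective and minimises it over a grid of candidate values; the grid search is a simplification chosen for illustration, and the function names and sample values are hypothetical.

```python
import math

NORM = 1.0 - math.exp(-1.0)

def oppe_cdf(x, delta):
    u = delta * x
    big_delta = math.exp(u * math.expm1(-u))       # Delta_delta(x)
    return -math.expm1(-(1.0 - big_delta)) / NORM

def ke_objective(delta, sample):
    """max_i max{ i/n - F(x_(i)), F(x_(i)) - (i - 1)/n } over the ordered sample."""
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs, start=1):
        f = oppe_cdf(x, delta)
        worst = max(worst, i / n - f, f - (i - 1) / n)
    return worst

def grid_argmin(func, lo, hi, steps=2000):
    """Return the grid point in [lo, hi] with the smallest objective value."""
    grid = [lo + k * (hi - lo) / steps for k in range(steps + 1)]
    return min(grid, key=func)

# Illustrative values only (not data from the paper).
sample = [0.2, 0.5, 0.7, 1.1, 1.6, 2.0, 2.4, 3.1]
delta_ke = grid_argmin(lambda d: ke_objective(d, sample), 0.05, 5.0)
```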

3.2. Simulations and Assessment

To compare the classical methods, some MCMC simulation studies are performed. The results are presented in Table 1 (δ = 0.5; n = 50, 100, 200, 300, and 500), Table 2 (δ = 0.9; n = 50, 100, 200, 300, and 500), and Table 3 (δ = 1.5; n = 50, 100, 200, 300, and 500). The numerical assessments are based on the mean squared errors (MSEs). First, we generate N = 1000 samples from the OPPE model. Based on Table 1, Table 2 and Table 3, the performance of all estimation methods improves as n \to + \infty. Despite the variety of the other classical methods, the MLE approach remains the most efficient and reliable of the classical methods considered, since its MSE is the smallest for all n = 50, 100, 200, 300, and 500. The MLE approach is therefore generally recommended for statistical modeling and applications, a recommendation that rests mainly on the simulation results displayed in Table 1, Table 2 and Table 3. The primary purpose of the simulation studies in this section is to assess each estimation procedure rather than to compare them, although simulation can also be used for comparison. Real data are commonly used to compare different estimation approaches; for this reason, we discuss two examples specifically for this purpose. There are two more applications to compare the competing models.
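A simulation of this kind can be organised as a loop that draws samples from the model, applies an estimator, and averages the squared errors. The sketch below illustrates the idea for the complete-sample MLE only; it is not the code used to produce Tables 1 to 3. Sampling by numerical inversion of the CDF, the golden-section maximiser (which assumes a unimodal log-likelihood on the bracket), and all function names are assumptions.

```python
import math
import random

NORM = 1.0 - math.exp(-1.0)

def oppe_cdf(x, delta):
    u = delta * x
    big_delta = math.exp(u * math.expm1(-u))       # Delta_delta(x)
    return -math.expm1(-(1.0 - big_delta)) / NORM

def oppe_quantile(q, delta, tol=1e-10):
    """Invert the CDF by bisection (used for inverse-transform sampling)."""
    lo, hi = 0.0, 1.0
    while oppe_cdf(hi, delta) < q:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if oppe_cdf(mid, delta) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def loglik(delta, sample):
    """Complete-sample log-likelihood of the OPPE model."""
    total = 0.0
    for x in sample:
        u = delta * x
        log_d = u * math.expm1(-u)                     # ln Delta_delta(x)
        bracket = -math.expm1(-u) + u * math.exp(-u)   # exp(-u) * (u + exp(u) - 1)
        total += (math.log(delta / NORM) + log_d
                  - (1.0 - math.exp(log_d)) + math.log(bracket))
    return total

def mle(sample, lo=0.01, hi=10.0, tol=1e-6):
    """Golden-section maximisation of the log-likelihood on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if loglik(c, sample) > loglik(d, sample):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def mc_mse(estimator, delta, n, reps, seed=0):
    """Average squared error of `estimator` over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [oppe_quantile(rng.random(), delta) for _ in range(n)]
        total += (estimator(sample) - delta) ** 2
    return total / reps
```

For example, `mc_mse(mle, 1.5, 50, 1000)` would mimic one cell of such a study; any of the other estimators could be passed in place of `mle`.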

3.3. Applications for Comparing Methods

Here, we compare the different estimation methods using real data. The first data set, called the failure time data, consists of the relief times (in minutes) of patients receiving an analgesic (see Gross and Clark [10]). The second data set consists of the survival times (in days) of 72 guinea pigs infected with virulent tubercle bacilli (see Bjerkedal [11]).

In these applications, we use the skewness–kurtosis plot (or the Cullen and Frey plot) to examine initial fits of theoretical distributions such as the normal, uniform, exponential, logistic, beta, lognormal, and Weibull distributions. Both plotting and bootstrapping are applied for greater accuracy. We also present the scattergrams, the nonparametric kernel density estimation (NKDE) for examining the initial shape of the data density, the quantile-quantile (Q-Q) plot for examining the normality of the data, the total time in test (TTT) plot for examining the initial shape of the empirical hazard rate function (HRF), and the box plot for identifying extreme observations.

Figure 1 gives the box plot for the failure time data (first row, the left panel), Q-Q plot for the failure time data (first row, the right panel), TTT plot for the failure time data (second row, the left panel), nonparametric Kernel density estimation plot for the failure time data (second row, the right panel), the Cullen and Frey plot for the failure time data (third row, the left panel), and scattergrams (third row, the right panel) for the failure time data.

Based on Figure 1, the relief data have only one extreme observation (first row), the HRF of the relief times is monotonically increasing (second row, the left panel), the nonparametric kernel density estimate is bimodal and right-skewed (second row, the right panel), and the relief times data do not follow any of the theoretical distributions considered, such as the normal, uniform, exponential, logistic, beta, lognormal, and Weibull (third row, the left panel). Figure 2 gives the box plot for the survival times (first row, the left panel), Q-Q plot for the survival times (first row, the right panel), TTT plot for the survival times (second row, the left panel), nonparametric kernel density estimation plot (second row, the right panel), the Cullen and Frey plot (third row, the left panel), and scattergrams (third row, the right panel) for the survival times data. Based on Figure 2, the survival data have four extreme observations (first row), the HRF of the survival times is monotonically increasing, the nonparametric kernel density estimate is bimodal and right-skewed, and the survival times data do not follow any of the theoretical distributions considered, such as the normal or gamma distributions (third row, the left panel).

A statistical test called the Anderson–Darling test (ADT) is used to determine if a sample of data is taken from a particular probability distribution. The test is based on the assumption that the distribution under the test has no parameters that need to be estimated, in which case the test and its set of critical values are distribution-free. The test is most frequently applied when a family of distributions is being examined, in which case it is necessary to estimate the family’s parameters and take this into consideration when modifying the test statistic or its critical values. It is one of the most effective statistical strategies for detecting most deviations from normal distribution when used to determine whether a normal distribution adequately describes a collection of data.

The Cramér–von Mises criterion (CVMC) is used to compare two empirical distributions or to determine how well a cumulative distribution function fits an empirical distribution function. Here, as in minimum distance techniques, the CVMC and ADT are used to compare the estimation methods. Table 4 gives the results (CVMC and ADT) for comparing methods under the relief data. Table 5 lists the CVMC and ADT for comparing methods under the survival data. Based on Table 4, the moment method is the best with CVMC = 0.08439 and ADT = 0.499304, followed by the ML method with CVMC = 0.085876 and ADT = 0.508074. Based on Table 5, the moment method is the best with CVMC = 0.070712 and ADT = 0.463826, followed by the ML method with CVMC = 0.071247 and ADT = 0.465521.
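Both criteria can be computed from the fitted CDF evaluated at the ordered observations. The sketch below uses the standard computing formulas for the Cramér–von Mises and Anderson–Darling statistics; the function names are hypothetical, and the input is assumed to be the sorted values u_(i) = F(x_(i)) lying strictly between 0 and 1.

```python
import math

def cvm_statistic(u_sorted):
    """W^2 = 1/(12n) + sum_i [u_(i) - (2i - 1)/(2n)]^2"""
    n = len(u_sorted)
    return 1.0 / (12.0 * n) + sum(
        (u - (2 * i - 1) / (2.0 * n)) ** 2
        for i, u in enumerate(u_sorted, start=1)
    )

def ad_statistic(u_sorted):
    """A^2 = -n - (1/n) sum_i (2i - 1) [ln u_(i) + ln(1 - u_(n+1-i))]"""
    n = len(u_sorted)
    total = 0.0
    for i in range(1, n + 1):
        # u_sorted[n - i] is u_(n+1-i) in one-based notation
        total += (2 * i - 1) * (math.log(u_sorted[i - 1]) + math.log(1.0 - u_sorted[n - i]))
    return -n - total / n
```

Smaller values of either statistic indicate a closer agreement between the fitted and empirical distribution functions, which is how the estimation methods are ranked in Tables 4 and 5.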

3.4. Bayesian Analysis under Different Loss Functions

In this section, we review some statistical aspects of Bayesian estimation. Three Bayesian loss functions are used, and we carry out several comparisons using the new distribution. Following Muñoz-Gil et al. [12], many Bayesian statistical algorithms can be used which, within their particular context, can aid researchers in mathematical and statistical modeling tasks, particularly when applying Bayes' theorem. For this purpose, Muñoz-Gil et al. [12] presented four complete algorithms with their codes and implementations, allowing the reader to follow them and to develop further algorithms based on them.

3.4.1. Prior and Posterior Distributions

As the prior distribution of the parameter δ, we assume an informative gamma distribution:

π (δ) = \frac{a^{b}}{Γ (b)} δ^{b - 1} \exp (- a δ), a, b > 0 .

The posterior distribution of δ is

π (δ | x) = K δ^{m + b - 1} [1 - \exp (- 1)]^{- m} \exp (- a δ) \prod_{i = 1}^{m} P_{δ} (x_{i}) Δ_{δ} (x_{i}) Q_{δ} {(x_{m})}^{n - m},

where

K^{- 1} = \int_{0}^{+ \infty} δ^{m + b - 1} [1 - \exp (- 1)]^{- m} \exp (- a δ) \prod_{i = 1}^{m} P_{δ} (x_{i}) Δ_{δ} (x_{i}) Q_{δ} {(x_{m})}^{n - m} d δ,

so that K is the normalizing constant. In the following, we use three loss functions, namely the generalized quadratic (GQ), the entropy, and the Linex loss functions, to obtain the Bayesian estimators. In the following formulas, we consider γ, p, and r as integers.

3.4.2. Bayesian Estimators and Their Posterior Risk

The Bayesian estimator under the generalized quadratic loss function (G Q) is

δ_{G Q} = \frac{I_{0}^{+ \infty} (m + b - 1 + γ, x_{m})}{I_{0}^{+ \infty} (m + b - 2 + γ, x_{m})},

where

I_{0}^{+ \infty} (m + b - 1 + γ, x_{m}) = \int_{0}^{+ \infty} δ^{m + b - 1 + γ} [1 - \exp (- 1)]^{- m} \exp (- a δ) \prod_{i = 1}^{m} Δ_{δ} (x_{i}) P_{δ} (x_{i}) Q_{δ} {(x_{m})}^{n - m} d δ

and

I_{0}^{+ \infty} (m + b - 2 + γ, x_{m}) = \int_{0}^{+ \infty} δ^{m + b - 2 + γ} [1 - \exp (- 1)]^{- m} \exp (- a δ) \prod_{i = 1}^{m} Δ_{δ} (x_{i}) P_{δ} (x_{i}) Q_{δ} {(x_{m})}^{n - m} d δ .

Under the entropy loss function, we obtain the following estimator

δ_{E} = {[K \int_{0}^{+ \infty} δ^{m + b - 1 - p} [1 - \exp (- 1)]^{- m} \exp (- a δ) \prod_{i = 1}^{m} Δ_{δ} (x_{i}) P_{δ} (x_{i}) Q_{δ} {(x_{m})}^{n - m} d δ]}^{- \frac{1}{p}} .

The corresponding posterior risk is

P R (δ_{E}) = p E_{π} (\ln (δ) - \ln (δ_{E})) .

Finally, under the Linex loss function, the Bayesian estimator is

δ_{L} = - \frac{1}{r} \ln [K \int_{0}^{+ \infty} δ^{m + b - 1} [1 - \exp (- 1)]^{- m} \exp [- δ (a + r)] \prod_{i = 1}^{m} P_{δ} (x_{i}) Δ_{δ} (x_{i}) Q_{δ} {(x_{m})}^{n - m} d δ],

where K multiplies the integral so that the argument of the logarithm is the posterior expectation of \exp (- r δ), and the corresponding posterior risk is

P R (δ_{L}) = r (δ_{G Q} - δ_{L}) .

Since these estimators are unlikely to be available in closed form, we suggest the use of MCMC procedures to evaluate them. Several Bayesian algorithms have recently been developed for the analysis of tracer-diffusion single-particle tracking data. Following Muñoz-Gil et al. [12], many Bayesian statistical algorithms can be used which, within their specific framework, can help researchers in mathematical and statistical modeling, especially when applying Bayes' theorem. For this purpose, Muñoz-Gil et al. [12] presented four complete algorithms with their codes and applications, which the reader can follow and on which further algorithms can be based.
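Once posterior draws (or a weighted grid approximation of the posterior) are available, the three estimators reduce to posterior expectations: the ratio E(δ^γ) / E(δ^{γ-1}) for the generalized quadratic loss, [E(δ^{-p})]^{-1/p} for the entropy loss, and -(1/r) ln E[exp(-rδ)] for the Linex loss. The sketch below evaluates them from weighted values of δ. The function names are hypothetical, and how the draws or weights are produced (for example by an MCMC sampler) is left open.

```python
import math

def posterior_expectation(values, weights, func):
    """Weighted average of func(delta) over posterior draws or grid points."""
    total = sum(weights)
    return sum(w * func(v) for v, w in zip(values, weights)) / total

def bayes_gq(values, weights, gamma):
    """Generalized quadratic loss: E[delta^gamma] / E[delta^(gamma - 1)]."""
    num = posterior_expectation(values, weights, lambda d: d ** gamma)
    den = posterior_expectation(values, weights, lambda d: d ** (gamma - 1))
    return num / den

def bayes_entropy(values, weights, p):
    """Entropy loss: (E[delta^(-p)])^(-1/p)."""
    return posterior_expectation(values, weights, lambda d: d ** (-p)) ** (-1.0 / p)

def bayes_linex(values, weights, r):
    """Linex loss: -(1/r) * ln E[exp(-r * delta)]."""
    return -math.log(posterior_expectation(values, weights, lambda d: math.exp(-r * d))) / r
```

With unweighted MCMC draws, the weights are simply all equal to one.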

3.4.3. Comparing the Likelihood Estimation and the Bayesian Estimation Using Pitman’s Closeness Criterion

To compare the performance of the proposed Bayesian estimators with the MLEs, we perform an MCMC simulation study with δ = 1.5, a = 3, and b = 2. We generate N = 1000 type II censored samples from the OPPE model, using sample sizes n = 30, 100, 200 with m = 10, 40, 160, respectively. Table 6 contains the values of the estimators produced by the BB algorithm. We observe that the estimated values are fairly close to the true value of the parameter, especially as the sample size n increases. Table 7, Table 8 and Table 9 present the Bayesian estimators and their posterior risks (PR, in brackets) under the generalized quadratic loss function, the entropy loss function, and the Linex loss function, respectively. Table 10 lists the Bayesian estimators and PR (in brackets) for each of the three loss functions.

In Table 7, which reports the estimation under the generalized quadratic loss function, the value $γ = - 1$ gives the best posterior risk. Additionally, the posterior risk is smallest when n is large. Under the entropy loss function (Table 8), the value $p = - 0.5$ with $n = 200$ provides the best posterior risk. Under the Linex loss function (Table 9), the value $r = - 1.5$ provides the best PR. In conclusion, a brief comparison of the three loss functions shows that the quadratic loss function yields the best results; this is further illustrated in Table 10. We next compare the best Bayesian estimators with the maximum likelihood estimators. We employ the Pitman closeness criterion for this purpose (more information can be found in Pitman [13], Fuller [14], and Jozani et al. [15]).

Definition 1.

An estimator $ϑ_{1}$ of a parameter $ϑ$ dominates another estimator $ϑ_{2}$ in the sense of Pitman’s closeness criterion if, for all $ϑ \in Θ$,

P_{ϑ} [| ϑ_{1} - ϑ | < | ϑ_{2} - ϑ |] > 0.5 .
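In a simulation, the probability in Definition 1 can be approximated by the share of replications in which the first estimator lands strictly closer to the true value than the second. The sketch below shows that tally; the function name and the four pairs of estimates are hypothetical and are not taken from the paper's simulation.

```python
def pitman_probability(estimates_1, estimates_2, theta):
    """Share of replications in which estimator 1 is strictly closer to theta."""
    if len(estimates_1) != len(estimates_2):
        raise ValueError("both estimators need one estimate per replication")
    closer = sum(abs(e1 - theta) < abs(e2 - theta)
                 for e1, e2 in zip(estimates_1, estimates_2))
    return closer / len(estimates_1)


# Hypothetical estimates from four replications with true value theta = 1.5.
bayes = [1.48, 1.55, 1.40, 1.52]
mle = [1.40, 1.62, 1.45, 1.60]
print(pitman_probability(bayes, mle, theta=1.5))  # 0.75
```

A value above 0.5 indicates that the first estimator dominates the second at that parameter value in the sense of the definition.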

Table 11 shows the Pitman probabilities, which allow us to compare the Bayesian estimators with the MLE under the three loss functions when γ = −1, p = −0.5, and r = −1.5. According to this criterion, the Bayesian estimators of the parameter are better than the MLE. Additionally, the Linex loss function attains the best value among the three loss functions, with probability 0.734 for n = 200 and m = 160.

4. Distributional Validation

To check the adequacy of the OPPE model when the parameter is unknown and the data are censored, we use a test based on a version of the NKRR statistic suggested by Bagdonavičius et al. [1], Bagdonavičius and Nikulin [2], Bagdonavičius and Nikulin [3], Nikulin [4], Nikulin [5], Nikulin [6], and Rao and Robson [7]. The failure times $x_{i}$ are assumed to follow an OPPE distribution, so we adapt this test to the OPPE model. Let us consider the null hypothesis

H_{0} : F (x) \in F_{0} = \{F_{0, δ} (x), x \in R\} .

The survival function (SrF) and the cumulative hazard function of the OPPE distribution are

S_{δ} (x) = 1 - F_{δ} (x) = 1 - \frac{1}{1 - \exp (- 1)} (1 - \exp \{- [1 - Δ_{δ} (x)]\})

and

V_{δ} (x) = - \ln [S_{δ} (x)],

where

Δ_{δ} (x) = {[\exp (- δ x)]}^{[1 - \exp (- δ x)]} .
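The three functions above translate directly into code. The following is a minimal sketch that transcribes them as written; the function names are ours, and the literal form of the cumulative hazard is only usable while the survival function stays numerically positive (very large values of δx would need a more careful formulation).

```python
import math


def big_delta(x, delta):
    """Delta_delta(x) = [exp(-delta x)]^(1 - exp(-delta x))."""
    return math.exp(-delta * x) ** (1.0 - math.exp(-delta * x))


def survival(x, delta):
    """S_delta(x) = 1 - F_delta(x)."""
    cdf = (1.0 - math.exp(-(1.0 - big_delta(x, delta)))) / (1.0 - math.exp(-1.0))
    return 1.0 - cdf


def cumulative_hazard(x, delta):
    """V_delta(x) = -ln S_delta(x)."""
    return -math.log(survival(x, delta))


for x in (0.0, 0.5, 1.0, 2.0):
    print(x, survival(x, 1.5), cumulative_hazard(x, 1.5))
```

The printed values start at S = 1 for x = 0 and decrease toward 0, while the cumulative hazard increases from 0, as expected for a lifetime distribution.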

Let us divide a finite time period $[0, τ]$ into $k > s$ smaller periods $I_{j} = (a_{j - 1, b} (x), a_{j, b} (x)]$ with

0 = a_{0, b} < a_{1, b} < \dots < a_{k - 1, b} < a_{k, b} = + \infty .

If $x_{(i)}$ is the ith element in the ordered statistics $(x_{(1)}, \dots, x_{(n)})$, and if ${[V_{δ} (x)]}^{- 1}$ is the inverse of the cumulative hazard function $V_{δ} (x)$, then the estimated limits ${\hat{a}}_{j, b} (x)$ are obtained from ${[V_{δ} (x)]}^{- 1}$ and

E_{k} = E_{k} (x) = \sum_{l = 1}^{i - 1} V_{δ} (x_{(l)}),

where $a_{j, b} (x)$ are random data functions such that the $k$ selected intervals have equal expected numbers of failures $e_{j, X}$. Under this choice of intervals, for all $j$ we have the constant value

e_{j, X} = E_{k} / k .

Since the inverse of the cumulative hazard function of the OPPE distribution lacks an explicit form, the intervals can be estimated iteratively.

The test for hypothesis $H_{0}$ can be based on the statistic

Y_{n, r - 1, ε}^{2} (\hat{δ}) = Z^{T} {\hat{S}}^{-} Z,

where

Z = {(Z_{1}, Z_{2}, \dots, Z_{k})}^{T},

Z_{j} = \frac{1}{\sqrt{n}} (O_{j, X} - e_{j, X}), \quad j = 1, 2, \dots, k,

and $O_{j, X}$ is the number of failures observed in the interval $I_{j}$. Following Bagdonavičius et al. [1], Bagdonavičius and Nikulin [2], Bagdonavičius and Nikulin [3], Nikulin [4], Nikulin [5], Nikulin [6], and Rao and Robson [7], the test statistic can be expressed as

Y_{n, r - 1, ε}^{2} (\hat{δ}) = \sum_{j = 1}^{k} \frac{1}{O_{j, X}} {(O_{j, X} - e_{j, X})}^{2} + V_{W, G},

where $V_{W, G}$ and many other details are given in the same references. We compute each component of the $Y_{n, r - 1, ε}^{2} (\hat{δ})$ statistic for the OPPE model; in the applications below, these components include the estimated matrix $\hat{P_{l j}}$ and the estimated Fisher information $I (\hat{δ})$. The statistic $Y_{n, r - 1, ε}^{2} (\hat{δ})$ has a chi-square limit distribution, and its degree of freedom is

d f = \mathrm{rank} (S) = \mathrm{trace} (S^{- 1} S),

where $d f = k$. The hypothesis $H_{0}$ is rejected at the approximate significance level $ε$ if

Y_{n, r - 1, ε}^{2} (\hat{δ}) > χ_{ε}^{2} (d f),

where $χ_{ε}^{2} (d f)$ is the quantile of the chi-square distribution with $r = d f = \mathrm{rank} (S)$ degrees of freedom.
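Because the cumulative hazard function has no closed-form inverse, any interval limit that is defined through its inverse has to be found numerically. The sketch below inverts $V_{δ}$ by bisection, which is one simple iterative option and not necessarily the procedure used by the authors. The function names and tolerance are ours, and the survival function is written in the algebraically equivalent form (exp(Δ) − 1)/(e − 1) for numerical stability; the routine is meant for moderate target values, since Δ underflows for very large δx.

```python
import math


def cumulative_hazard(x, delta):
    """V_delta(x) = -ln S_delta(x), with S written as (exp(Delta) - 1) / (e - 1)."""
    u = delta * x
    big_delta = math.exp(-u * (1.0 - math.exp(-u)))
    return math.log(math.e - 1.0) - math.log(math.expm1(big_delta))


def inverse_cumulative_hazard(v, delta, tol=1e-10):
    """Find x >= 0 with V_delta(x) = v by bisection (V is increasing in x)."""
    lo, hi = 0.0, 1.0
    while cumulative_hazard(hi, delta) < v:   # widen the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative_hazard(mid, delta) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x = inverse_cumulative_hazard(2.0, delta=1.5)
print(x, cumulative_hazard(x, 1.5))
```

Bisection is slower than Newton-type iterations but needs only the monotonicity of $V_{δ}$, which makes it a safe default when the derivative is inconvenient to code.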

4.1. Uncensored Simulation Study under the NKRR Statistics $Y^{2} (ε; \hat{δ})$

We conducted an extensive numerical simulation study to support the arguments of this work. We computed the statistic for N = 15,000 simulated samples of sizes n = 25, n = 50, n = 150, n = 350, and n = 600 in order to test the null hypothesis $H_{0}$ that the sample belongs to the OPPE model. We determine the proportion of non-rejections of the null hypothesis, that is, of samples with

Y^{2} (ε; \hat{δ}) \leq χ_{ε}^{2} (b - 1),

for various theoretical levels (ε = 0.01, 0.02, 0.05, 0.1). The corresponding empirical and theoretical levels are presented in Table 12. The empirical levels are very close to the corresponding theoretical levels. Therefore, we conclude that the recommended test is well suited to the OPPE distribution.
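The empirical level reported for each sample size is simply the fraction of simulated statistics that stay at or below the chi-square critical value. A minimal sketch of that tally follows; the function name and the five statistic values are hypothetical, and the critical value 12.59159 is the one quoted for 6 degrees of freedom in Section 4.2.1.

```python
def empirical_level(statistics, critical_value):
    """Proportion of simulated statistics that do not exceed the critical value."""
    kept = sum(1 for y in statistics if y <= critical_value)
    return kept / len(statistics)


# Hypothetical statistics from five simulated samples.
simulated = [3.2, 7.9, 13.4, 5.1, 10.8]
print(empirical_level(simulated, critical_value=12.59159))  # 0.8
```

With N = 15,000 replications, this proportion would be compared with 1 − ε for each theoretical level ε.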

4.2. Uncensored Applications under the NKRR Statistics $Y^{2} (ε; \hat{δ})$

4.2.1. Strengths of Glass Fibers

This data set, given by Nicholase and Padgett [16], contains 100 carbon fiber fracture stresses (in GPa). Assuming that the OPPE model can fit the strength data of 1.5 cm glass fiber, the BB algorithm gives the MLE of the parameter δ: $\hat{δ} = 2.14068$. Using this value, the Fisher Inf-Mx is

I (\hat{δ}) = 1.648771 .

The value of the NKRR test statistic is $Y^{2} (ε; \hat{δ}) = 11.855642$ and the critical value is $χ_{0.05}^{2} (6) = 12.59159$. Since the statistic is smaller than the critical value, the OPPE distribution can adequately model the 1.5 cm glass fiber data.

4.2.2. Heat Exchanger Tube Crack

The crack data, taken from the book by Meeker and Escobar [17], consist of inspections carried out at eight chosen intervals until cracks appeared in 167 similar turbine parts.

Time of inspection	186	606	902	1077	1209	1377	1592	1932
Number of fans found to have cracks	5	16	12	18	18	2	6	17

We test the null hypothesis that these data are modeled by the OPPE distribution using the NKRR statistic obtained previously. Using R and the BB method (see Ravi (2009)), we calculate the MLE

\hat{δ} = 2.475019 .

The corresponding estimated Fisher Inf-Mx is

I (\hat{δ}) = 1.592466 .

We thus obtain $Y^{2} (ε; \hat{δ}) = 19.84927$. For the significance level $ε = 0.05$, the critical value is $χ_{0.05}^{2} (12) = 21.02607$. The NKRR statistic $Y^{2} (ε; \hat{δ})$ of this model is less than the critical value, allowing us to conclude that the data fit the OPPE model.

4.3. Censored Simulation Study under the NKRR Statistics $Y_{n, r - 1, ε}^{2} (\hat{δ})$

In the censored simulation study under the NKRR statistic $Y^{2}$, the generated samples $(N = 14000)$ are assumed to be censored at $25 \%$, with $d f = 5$. We determine the proportion of non-rejections of the null hypothesis, that is, of samples with $Y_{n, r - 1, ε}^{2} (\hat{δ}) \leq χ_{ε}^{2} (r - 1)$, for various theoretical levels $(ε = 0.01, 0.02, 0.05, 0.1)$. The corresponding theoretical and empirical levels are presented in Table 13, which shows how closely the empirical levels match the theoretical levels. We therefore conclude that the modified test is well suited to the OPPE model: the empirical significance level of the $Y_{n, r - 1, ε}^{2} (\hat{δ})$ statistic agrees with the theoretical level of the chi-square distribution with the corresponding degrees of freedom. It may be inferred that the suggested test can reliably validate censored data from the OPPE distribution.

Returning to the limitations of the basic NKRR test, the statistical literature contains many hypothesis tests for complete data, so the original test has many competitors and alternatives. The modified NKRR test, by contrast, is considered distinctive among statistical tests because of its importance and the nature of the data it handles, namely censored data.

4.4. Censored Applications under the NKRR Statistics $Y_{n, r - 1, ε}^{2} (\hat{δ})$

4.4.1. Censored Lung Cancer Data Set

The lung cancer data provided by Loprinzi et al. [18] from the North Central Cancer Treatment Group concern the survival of patients with advanced lung cancer (n = 228 and censored items = 63); the performance scores rate how well the patient can perform typical daily activities. Assuming that the data follow the OPPE distribution, the maximum likelihood estimate of the parameter δ is $\hat{δ} = 3.004756$. As the number of classes, we use $d f = 8$. The components of the test statistic $Y_{n, r - 1, ε}^{2} (\hat{δ})$ are as follows:

$\hat{a_{j, b} (X)}$	92.138	171.694	216.037	283.012	355.086	456.277	685.261	1022.3174
$\hat{O_{j, X}}$	29	30	35	31	32	25	28	18
$e_{j, X}$	6.55438	6.55438	6.55438	6.55438	6.55438	6.55438	6.55438	6.55438

The estimated matrix $\hat{P_{l j}} (X)$ and the estimated Fisher information (E-Inf) $I (\hat{δ})$ are as follows:

$\hat{P_{l j}} (X)$	−0.6827	−0.4368	0.7015	0.7684	0.5094	0.3007	0.8769	−0.2884

and

I (\hat{δ}) = 3.0134022 .

The chi-squared test has a critical value of

χ_{0.05}^{2} (d f = 8) = 15.50731 .

Using the earlier results, the value of the test statistic is

Y_{n, r - 1, ε}^{2} (\hat{δ}) = 14.616535 .

Since the computed statistic is smaller than the critical value, the hypothesis $H_{0}$ is not rejected at the 5% significance level. We conclude that the lung cancer data are consistent with the OPPE distribution.

4.4.2. Censored Capacitor Data Reliability Data Set

This data set contains basic reliability assessment data on glass capacitor lifetimes as a function of voltage and operating temperature from a factorial experiment (see Meeker and Escobar [17]). Each combination of temperature and voltage has eight capacitors. Testing was stopped after the fourth failure at each combination, so n = 64 and censored items = 32. Assuming that the data follow the OPPE distribution, the maximum likelihood estimate of the parameter δ is $\hat{δ} = 1.680225$. We choose $d f = 8$ classes. The components of the test statistic $Y_{n, r - 1, ε}^{2} (\hat{δ})$ are as follows:

$\hat{a_{j, b} (X)}$	346.1573	469.502	587.113	679.017	1078.834	1089.109	1102.167	1106.444
$\hat{O_{j, X}}$	11	15	6	10	6	5	6	5
$e_{j, X}$	6.91042	6.91042	6.91042	6.91042	6.91042	6.91042	6.91042	6.91042

The estimated matrix $\hat{P_{l j}} (X)$ and Fisher’s E-Inf $I (\hat{δ})$ are:

$\hat{P_{l j}} (X)$	0.48965	−0.73465	−0.40012	0.29784	−0.29467	−0.95113	0.60378	0.28564

and

I (\hat{δ}) = 4.1675894 .

The value of the test statistic is

Y_{n, r - 1, ε}^{2} (\hat{δ}) = 13.84577 .

The critical value is

χ_{0.05}^{2} (8) = 15.50731 > Y_{n, r - 1, ε}^{2} (\hat{δ}) .

We may therefore conclude that the OPPE model can be used to model the lifetime data for glass capacitors.
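The decision rule used in all four applications is the same comparison of the computed statistic with the chi-square critical value. The short sketch below collects the statistics and critical values quoted in Sections 4.2 and 4.4 and applies that rule; the dictionary labels and the helper name are ours.

```python
# (test statistic, critical value) as quoted in the text for each data set
results = {
    "glass fibers": (11.855642, 12.59159),
    "heat exchanger tube crack": (19.84927, 21.02607),
    "lung cancer (censored)": (14.616535, 15.50731),
    "glass capacitors (censored)": (13.84577, 15.50731),
}


def decide(statistic, critical_value):
    """Reject H0 only when the statistic exceeds the critical value."""
    return "reject H0" if statistic > critical_value else "do not reject H0"


decisions = {name: decide(y, c) for name, (y, c) in results.items()}
for name, decision in decisions.items():
    print(f"{name}: {decision}")
```

In every case the statistic lies below its critical value, so the OPPE hypothesis is not rejected for any of the four data sets.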

5. Conclusions

The one-parameter Poisson-exponential (OPPE) model, a new flexible variation of the exponentiated exponential model, is introduced and studied in this article. Six well-known estimation techniques are investigated, discussed, and used: Cramér–von Mises, ordinary least squares, L-moments, maximum likelihood, Kolmogorov, and weighted least squares. The Bayesian estimation under the squared error loss function was also derived and investigated. Two real data sets and two simulated data sets allow for a complete evaluation of all the techniques. The utility and adaptability of the OPPE distribution are illustrated using two real data sets. The new model outperforms many other competing models in modeling relief times and survival times, according to the Akaike Information Criterion, Consistent Akaike Information Criterion, Hannan–Quinn Information Criterion, Bayesian Information Criterion, Cramér–von Mises, and Anderson–Darling statistics. Further results are given in the paper and its applications; the following can be specifically highlighted:

The maximum likelihood method is still the most effective and trustworthy of the classical approaches. Both the Bayesian technique and the maximum likelihood method are recommended for statistical modeling and applications.
The Bayesian estimation is provided under different loss functions. Three loss functions, the generalized quadratic, the Linex, and the entropy, are used to produce the Bayesian estimators, and many useful details are provided.
All of the estimation methods have been evaluated through simulation studies with particular parameter values and settings (these simulation studies are all described in the paper at the appropriate places).
The BB algorithm for estimation under censored samples is used to compare the Bayesian technique and the censored maximum likelihood method.
It is shown in detail how the NKRR statistic is constructed for the OPPE model in the uncensored case. The results of a simulation study show that the NKRR test is well suited to the OPPE model.
A simulation study for evaluating the statistic is described, along with the construction of the Bagdonavičius and Nikulin statistic for the new model in the censored case. Two real data applications are examined in the censored case; the first data set is reliability information on capacitors, and the second is information about lung cancer (medical data). We deduce from these applications that the suggested technique can successfully validate censored data from the OPPE distribution.
For the uncensored strengths of glass fibers data, the NKRR test statistic and the critical value were $Y^{2} (ε; \hat{δ}) = 11.855642$ and $χ_{0.05}^{2} (6) = 12.59159$, so the OPPE distribution can adequately model the uncensored 1.5 cm glass fiber data.
For the uncensored heat exchanger tube crack data, the NKRR test statistic and the critical value were $Y^{2} (ε; \hat{δ}) = 19.84927$ and $χ_{0.05}^{2} (12) = 21.02607$, so the OPPE distribution can adequately model the uncensored heat exchanger tube crack data.
For the censored lung cancer data set, the value of the test statistic is $Y_{n, r - 1, ε}^{2} (\hat{δ}) = 14.616535$, where $χ_{0.05}^{2} (8) = 15.50731 > Y_{n, r - 1, ε}^{2} (\hat{δ})$. We therefore conclude that the OPPE model can be used for modeling the censored lung cancer data set.
For the censored capacitor reliability data set, the value of the test statistic is $Y_{n, r - 1, ε}^{2} (\hat{δ}) = 13.84577$, where $χ_{0.05}^{2} (8) = 15.50731 > Y_{n, r - 1, ε}^{2} (\hat{δ})$. We therefore conclude that the OPPE model can be used for modeling the censored capacitor reliability data set.
It is worth noting that the basic NKRR test, which is designed for complete data, has been the most popular test over the last ten years. This is because it applies to complete data, which is the situation in most practical and applied cases in various fields. This test is also distinguished by the availability of ready-made statistical packages in R, whether for simulation or for applications to real data.
However, practice in many fields (such as medicine, chemistry, and engineering) requires researchers to deal with experiments that produce censored data. This type of data needs tests dedicated to the problem of distributional validation under censoring, and the basic NKRR test is not the optimal choice in these cases. This was the primary motive that prompted many researchers to introduce a new statistical test for censored data. This new NKRR test is a modification of the original NKRR test.
Owing to the nature of their procedures, each of the two tests is unsuitable for one type of data. For example, one drawback of the original test is that it applies only to complete data and cannot be applied to censored data. Therefore, if we are dealing with complete data, this test may be a strong candidate for statistical hypothesis testing.
Moreover, a disadvantage of the modified test is that it applies only to censored data and is not intended for complete data. Therefore, if we are dealing with censored data, this test is a strong candidate for statistical hypothesis testing, because it is intended for this type of data.
Returning to the limitations of the basic NKRR test, the statistical literature contains many hypothesis tests for complete data, so the original test has many competitors and alternatives.
The modified NKRR test, by contrast, is considered distinctive among statistical tests because of its importance and the nature of the data it handles, namely censored data.

Author Contributions

W.E.: validation, writing the original draft preparation, conceptualization, data curation, formal analysis, software. Y.T.: funding acquisition, methodology, formal analysis, conceptualization, software. H.G.: validation, formal analysis, data curation, conceptualization, software. T.H.: validation, formal analysis, data curation, conceptualization, software. A.H.: validation, data curation, conceptualization, software. M.M.A.: review and editing, conceptualization, formal analysis, supervision. H.M.Y.: review and editing, software, validation, writing the original draft preparation, conceptualization, supervision. M.I.: review and editing, formal analysis, software, validation, writing the original draft preparation, conceptualization. All authors have read and agreed to the published version of the manuscript.

Funding

The study was funded by Researchers Supporting Project number (RSP2023R488), King Saud University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset can be provided upon request.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

CDF	Cumulative distribution function
OPPE	One parameter Poisson-exponential
SrF	Survival function
ADT	Anderson-Darling test
CVME	Cramér–von Mises estimation
MLEs	Maximum likelihood estimations
HRF	Hazard rate function
Inf-Mx	Information matrix
E-Inf	Estimated information
OLSQE	Ordinary least squares estimation
KE	Kolmogorov estimation
WLSQE	Weighted least squares estimation
MSE	Mean square error
$ℓ (δ)$	Log-likelihood
CVMC	Cramér–von Mises criterion
df	Degrees of freedom
P–P	Probability-probability
TTT	Total time in test
Q-Q	Quantile-quantile
NKRR	Nikulin–Rao–Robson

References

1. Bagdonavičius, V.B.; Levuliene, R.J.; Nikulin, M.S. Chi-Squared Goodness-of-Fit Tests for Parametric Accelerated Failure Time Models. Commun. Stat.-Theory Methods 2013, 42, 2768–2785.
2. Bagdonavičius, V.; Nikulin, M. Chi-squared goodness-of-fit test for right censored data. Int. J. Appl. Math. Stat. 2011, 24, 30–50.
3. Bagdonavičius, V.; Nikulin, M. Chi-squared tests for general composite hypotheses from censored samples. Comptes Rendus Math. 2011, 349, 219–223.
4. Nikulin, M.S. Chi-squared test for normality. In Proceedings of the International Vilnius Conference on Probability Theory and Mathematical Statistics; Vilnius University: Vilnius, Lithuania, 1973; Volume 2, pp. 119–122.
5. Nikulin, M.S. Chi-squared test for continuous distributions with shift and scale parameter. Theory Probab. Its Appl. 1973, 18, 559–568.
6. Nikulin, M.S. On a Chi-squared test for continuous distributions. Theory Probab. Its Appl. 1973, 19, 638–639.
7. Rao, K.C.; Robson, D.S. A Chi-Square Statistic for Goodness-of-Fit Tests Within the Exponential Family. Commun. Stat.-Simul. Comput. 1974, 3, 1139–1153.
8. Ibrahim, M.; Hamedani, G.G.; Butt, N.S.; Yousof, H.M. Expanding the Nadarajah Haghighi Model: Copula, Censored and Uncensored Validation, Characterizations and Applications. Pak. J. Stat. Oper. Res. 2022, 18, 537–553.
9. Yadav, A.S.; Shukla, S.; Goual, H.; Saha, M.; Yousof, H.M. Validation of xgamma exponential model via Nikulin-Rao-Robson goodness-of-fit test under complete and censored sample with different methods of estimation. Stat. Optim. Inf. Comput. 2022, 10, 457–483.
10. Gross, J.; Clark, V.A. Survival Distributions: Reliability Applications in the Biometrical Sciences; John Wiley: New York, NY, USA, 1975.
11. Bjerkedal, T. Acquisition of resistance in guinea pigs infected with different doses of virulent tubercle bacilli. Am. J. Epidemiol. 1960, 72, 130–148.
12. Muñoz-Gil, G.; Volpe, G.; Garcia-March, M.A.; Aghion, E.; Argun, A.; Hong, C.B.; Manzo, C. Objective comparison of methods to decode anomalous diffusion. Nat. Commun. 2021, 12, 6253.
13. Pitman, E. The closest estimates of statistical parameter. In Mathematical Proceedings of the Cambridge Philosophical Society; Cambridge University Press: Cambridge, UK, 1937; Volume 33.
14. Fuller, W.A. Closeness of estimators. In Encyclopedia of Statistical Sciences; Wiley: New York, NY, USA, 1982; Volume 2.
15. Jozani, M.J.; Davies, K.F.; Balakrishnan, N. Pitman closeness results concerning ranked set sampling. Stat. Probab. Lett. 2012, 82, 2260–2269.
16. Nicholase, M.D.; Padgett, W.J. A bootstrap control chart for Weibull percentiles. Qual. Reliab. Eng. Int. 2006, 22, 141–151.
17. Meeker, W.Q.; Escobar, L.A.; Lu, C.J. Accelerated degradation tests: Modeling and analysis. Technometrics 1998, 40, 89–99.
18. Loprinzi, C.L.; Laurie, J.A.; Wieand, H.S.; Krook, J.E.; Novotny, P.J.; Kugler, J.W.; Bartel, J.; Law, M.; Bateman, M.; Klatt, N.E. Prospective evaluation of prognostic variables from patient-completed questionnaires. North Central Cancer Treatment Group. J. Clin. Oncol. 1994, 12, 601–607.

Figure 1. Box plot, Q-Q plot, TTT plot, nonparametric Kernel density estimation, Cullen and Frey, and scattergrams for the relief times.

Figure 2. Box plot, Q-Q plot, TTT plot, nonparametric Kernel density estimation, Cullen and Frey, and scattergrams for the survival times.

Table 1. Simulation results for δ = 0.5.

n	MLE	OLSQ	WLSQ	CVM	Moment	KE
50	0.00305	0.00334	0.00314	0.00355	0.00329	0.00366
100	0.00148	0.00163	0.00163	0.00161	0.00163	0.00179
200	0.00073	0.00081	0.00081	0.00080	0.00082	0.00088
300	0.00047	0.00054	0.00051	0.00054	0.00049	0.00055
500	0.00027	0.00031	0.00031	0.00031	0.00030	0.00032

Table 2. Simulation results for δ = 0.9.

n	MLE	OLSQ	WLSQ	CVM	Moment	KE
50	0.00981	0.01073	0.01035	0.00939	0.01195	0.01187
100	0.00489	0.00533	0.00511	0.00556	0.00539	0.00616
200	0.00248	0.00286	0.00278	0.00282	0.00239	0.00285
300	0.00155	0.00177	0.00177	0.00178	0.00174	0.00180
500	0.00092	0.00098	0.000987	0.00099	0.00095	0.00108

Table 3. Simulation results for δ = 1.5.

n	MLE	OLSQ	WLSQ	CVM	Moment	KE
50	0.02580	0.03189	0.02847	0.02888	0.03039	0.03448
100	0.01296	0.01591	0.01414	0.01509	0.01408	0.01582
200	0.00633	0.00749	0.00732	0.00744	0.00740	0.00785
300	0.00460	0.00466	0.00498	0.00480	0.00494	0.00504
500	0.00271	0.00299	0.00291	0.00298	0.00302	0.00301

Table 4. p-values for comparing methods under the relief data.

Method	δ	CVMC	ADT
ML	0.503284	0.085876	0.508074
LS	0.470917	0.086923	0.514258
WLSQ	0.428757	0.088466	0.523383
CVM	0.470955	0.086921	0.514251
Moment	0.557676	0.08439	0.499304
KE	0.467910	0.087026	0.514869

Table 5. p-values for comparing methods under the survival data.

Method	δ	CVMC	ADT
ML	0.567783	0.071247	0.465521
LS	0.546370	0.071690	0.467139
WLSQ	0.551350	0.071581	0.466728
CVM	0.546576	0.071685	0.467122
Moment	0.599247	0.070712	0.463826
KE	0.527965	0.072124	0.468851

Table 6. The MLE of the parameter with quadratic error (in brackets).

N = 5000	n = 30	n = 100	n = 200
m →	10	40	160
δ →	1.4905 (0.0125)	1.46543 (0.0321)	1.5104 (0.0023)

Table 7. Bayesian estimators and PR (in brackets) under GQ loss function.

γ ↓ (N = 5000)	n = 30, m = 10	n = 100, m = 40	n = 200, m = 160
−2	1.6490 (0.0076)	1.6825 (0.0053)	1.6432 (0.0023)
−1.5	1.7990 (0.0087)	1.0825 (0.0061)	1.2127 (0.0016)
−1	1.6182 (0.0005)	0.9739 (0.0001)	2.0018 (0.0001)
−0.5	1.4994 (0.0071)	1.4888 (0.0060)	1.5138 (0.0011)
0.5	1.9760 (0.0085)	1.7926 (0.0077)	1.3439 (0.0013)
1	1.7516 (0.0087)	1.2972 (0.0058)	1.7208 (0.0035)
1.5	1.6743 (0.0045)	1.5632 (0.0065)	1.3275 (0.0032)
2	1.4768 (0.1241)	1.4191 (0.1181)	1.7158 (0.0033)

Table 8. Bayesian estimators and PR (in brackets) under the entropy loss function.

N = 5000	n = 30	n = 100	n = 200
p ↓ m →	10	40	160
−2	1.6037 (0.0009)	0.7757 (0.3190)	1.8493 (0.0308)
−1.5	1.3990 (0.1644)	1.2144 (0.0019)	1.0942 (0.0080)
−1	1.0886 (0.0031)	0.7654 (0.1173)	1.7697 (0.0099)
−0.5	1.6001 (0.0065)	1.542 (0.00020)	1.5638 (0.0071)
0.5	1.483 (0.00413)	1.4899 (0.0023)	1.5121 (0.0012)
1	1.4579 (0.0997)	1.4354 (0.0944)	1.4571 (0.0014)
1.5	1.6981 (0.0038)	0.4830 (0.0733)	1.2148 (0.0009)
2	1.356 (0.16440)	1.0942 (0.0080)	1.2144 (0.0019)

Table 9. Bayesian estimators and PR (in brackets) under Linex loss function.

	N = 5000	n = 30	n = 100	n = 200
r↓	m →	10	40	160
−2		1.3815 (0.0182)	1.5434 (0.0183)	1.3815 (0.0001)
−1.5		1.2080 (0.0014)	1.2609 (0.0199)	1.4011 (0.0004)
−1		1.6234 (0.0108)	0.9900 (0.0103)	1.2819 (0.0065)
−0.5		1.1915 (0.0111)	1.3187 (0.0195)	1.4091 (0.0032)
0.5		1.3909 (0.0231)	1.4560 (0.0183)	1.3324 (0.0032)
1		1.3815 (0.0183)	1.3815 (0.0183)	1.3815 (0.0183)
1.5		0.5155 (0.0519)	0.5315 (0.0183)	1.5109 (0.0057)
2		1.4193 (0.0131)	0.9547 (0.1041)	1.4045 (0.0004)

Table 10. Bayesian estimators and PR (in brackets) under the three loss functions.

N = 5000	n₁ = 10	n₂ = 50	n₃ = 200
m →	10	40	160
$\mathrm{GQ} |_{γ = 1}$	1.4994 (0.00710)	1.4888 (0.0060)	1.5138 (0.0011)
$\mathrm{Entropy} |_{p = - 0.5}$	1.4830 (0.00413)	1.4899 (0.0023)	1.5121 (0.0012)
$\mathrm{Linex} |_{r = - 1.5}$	0.5155 (0.05190)	0.5315 (0.0183)	1.5109 (0.0057)

Table 11. Pitman comparison of the estimators.

N = 5000	n = 10	n = 50	n = 200
m →	8	40	160
$\mathrm{GQ} |_{γ = 1}$	0.678	0.654	0.674
$\mathrm{Entropy} |_{p = - 0.5}$	0.534	0.587	0.632
$\mathrm{Linex} |_{r = - 1.5}$	0.587	0.5789	0.734

Table 12. Empirical levels and corresponding theoretical levels (ε = 0.01, 0.02, 0.05, 0.1) and N = 15,000.

n↓ & ε→	ε₁ = 0.01	ε₂ = 0.02	ε₃ = 0.05	ε₄ = 0.1
n₁ = 25	0.9941	0.9829	0.9522	0.9031
n₂ = 50	0.9936	0.9820	0.9513	0.9024
n₃ = 150	0.9922	0.9812	0.9510	0.9012
n₄ = 300	0.9909	0.9807	0.9507	0.9008
n₅ = 600	0.9905	0.9804	0.9503	0.9004

Table 13. Empirical levels and corresponding theoretical levels (ε = 0.01, 0.02, 0.05, 0.1) and N = 14,000.

n ↓&ε→	ε₁ = 0.01	ε₂ = 0.02	ε₃ = 0.05	ε₄ = 0.1
$n_{1} = 25$	0.9930	0.9829	0.9533	0.9025
$n_{2} = 50$	0.9925	0.9818	0.9524	0.9014
$n_{3} = 150$	0.9912	0.9812	0.9516	0.9009
$n_{4} = 300$	0.9906	0.9807	0.9508	0.9004
$n_{5} = 600$	0.9904	0.9803	0.9502	0.9001

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
