Article

Properties and Maximum Likelihood Estimation of the Novel Mixture of Fréchet Distribution

by Wikanda Phaphan 1,2,*, Ibrahim Abdullahi 3 and Wirawan Puttamat 4,*

1 Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand
2 Research Group in Statistical Learning and Inference, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand
3 Department of Mathematics and Statistics, Faculty of Science, Yobe State University, Damaturu 500501, Nigeria
4 Department of Mathematics, Faculty of Education, Chaiyaphum Rajabhat University, Chaiyaphum 36000, Thailand
* Authors to whom correspondence should be addressed.
Submission received: 21 June 2023 / Revised: 5 July 2023 / Accepted: 5 July 2023 / Published: 7 July 2023
(This article belongs to the Special Issue Symmetry in Statistics and Data Science)

Abstract

In recent decades, there have been numerous efforts to develop new classes of survival distributions with enhanced flexibility by extending existing distributions. This article proposes a novel mixture of the Fréchet distribution as an alternative distribution for analyzing survival data and derives its statistical properties: the probability density function (PDF), cumulative distribution function (CDF), $r$th ordinary moment, skewness, kurtosis, moment-generating function, mean, variance, mode, survival function, hazard function, and asymptotic behavior. Estimators of the unknown parameters are constructed using the expectation-maximization (EM) algorithm and simulated annealing. The performance of the proposed estimators is compared in terms of bias, mean squared error (MSE), and simulated variance, and an illustrative application to a survival data set shows that the proposed distribution is appropriate for right-skewed data. These results should be useful in survival analysis.

1. Introduction

Survival analysis, a branch of statistics pertaining to death or failure, encompasses various types of statistical methods to draw conclusions. These methods include (1) nonparametric statistics, such as the Kaplan–Meier estimator and the log-rank test; (2) semi-parametric statistics, exemplified by the Cox proportional hazards model; and (3) parametric statistics, which focus on simulating survival time probabilities. Analysts may deduce that the survival function has a parametric distribution. For instance, if the survival time adheres to an exponential distribution, the hazard rate will be constant. Conversely, if the survival time conforms to a log-normal distribution, the hazard rate varies with time. Consequently, estimation of the survival function, calculation of the confidence interval, and assessment of the relative risk ensue. The utilization of a parametric survival function proves highly effective when appropriate distributions and parameter values are selected. The parametric survival distribution serves as a comprehensive representation of various types of survival data.
Hundreds of univariate continuous distributions exist. Mixture models play a crucial role in numerous applications, including survival analysis; see, for example, Farewell [1], Hunsberger et al. [2], and Joudaki et al. [3]. These models combine two or more statistical distributions to create a new distribution, thereby addressing various challenges encountered in the field. Recognizing the need for mixture distributions, extensive efforts have been devoted to integrating well-established distributions and using them to tackle relevant problems. In the context of complete samples, Niyomdecha and Srisuradetchai [4] introduced a continuous three-parameter survival distribution referred to as the Complementary Gamma Zero-Truncated Poisson distribution. This distribution combines the properties of the maximum of a series of independent and identically distributed gamma random variables with those of zero-truncated Poisson random variables. Abdullahi and Phaphan [5] presented a mixture of Nakagami distribution, together with its statistical properties and a comparison of the efficacy of estimators based on the quasi-Newton method and simulated annealing. Nanuwong et al. [6] proposed the mixture Pareto distribution by combining a Pareto distribution and a length-biased Pareto distribution, based on the concept of a weighted two-component distribution. Further work on mixture models can be found in references [7,8].
The Fréchet distribution, alternatively referred to as the inverse Weibull distribution, holds extensive application in the field of survival modeling. Fréchet [9] initially introduced the Fréchet distribution, which subsequently underwent further exploration by Fisher and Tippett [10] as well as Gumbel [11]. Furthermore, Abbas and Yincai [12] conducted a comparative analysis of the scale parameter estimation for the Fréchet distribution, employing maximum likelihood, probability-weighted moments, and Bayes estimations. Nasir and Aslam [13] utilized a Bayesian technique to estimate the parameter of the Fréchet distribution. Reyad et al. [14] established QE-Bayes and E-Bayes estimates for the scale parameters associated with the Fréchet distribution. Recent developments have introduced various extensions to the Fréchet distribution. Notably, Mead et al. [15] proposed the beta exponential Fréchet distribution.
Consequently, this article develops a new survival distribution with a time-varying hazard rate by applying the notion of a mixture distribution to the Fréchet distribution. It investigates the statistical properties of the new distribution, namely the probability density function, cumulative distribution function, $r$th ordinary moment, skewness, kurtosis, moment-generating function, mean, variance, mode, survival function, hazard function, and asymptotic behavior. It also compares estimators obtained by several methods and illustrates the distribution on real data, which should be useful in survival analysis.

2. The Fréchet Distribution

The Fréchet distribution, being a specific case of the generalized extreme value distribution [16], finds extensive application in the field of hydrology [17]. This distribution is commonly employed to model extreme events, including daily rainfall [18] and river discharges [19]. Moreover, the Fréchet distribution holds considerable significance in survival analysis utilizing experimental data from clinical research. Given its status as the inverse Weibull distribution, the Fréchet distribution exhibits properties akin to the Weibull distribution, such as time-varying hazard rates. As a result, the Fréchet distribution has been a subject of widespread discussion in the field of survival analysis.
Afify et al. [20] provide the probability density function (PDF), cumulative distribution function (CDF), and mean of the Fréchet distribution. The PDF of the Fréchet distribution is given by
$$g(x) = \delta \lambda^{\delta} x^{-(\delta+1)} e^{-\left(\frac{\lambda}{x}\right)^{\delta}}, \quad x > 0, \tag{1}$$
where $\lambda > 0$ is a scale parameter and $\delta > 0$ is a shape parameter. The corresponding CDF is
$$G(x) = e^{-\left(\frac{\lambda}{x}\right)^{\delta}}, \quad x > 0. \tag{2}$$
Furthermore, the mean of the distribution is
$$E(X) = \begin{cases} \lambda\, \Gamma\!\left(1 - \frac{1}{\delta}\right), & \delta > 1; \\ \infty, & \text{otherwise}, \end{cases} \tag{3}$$
where $\Gamma\!\left(1 - \frac{1}{\delta}\right)$ represents a gamma function: $\Gamma\!\left(1 - \frac{1}{\delta}\right) = \int_{0}^{\infty} y^{\left(1 - \frac{1}{\delta}\right) - 1} e^{-y}\, dy$.
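The three Fréchet quantities above translate directly into code. The following is a minimal Python sketch using only the standard library; the function names and the parameter values are illustrative assumptions, not part of the article.

```python
import math

def frechet_pdf(x, lam, delta):
    """Frechet density g(x) for x > 0, scale lam > 0, shape delta > 0."""
    return delta * lam**delta * x ** (-(delta + 1.0)) * math.exp(-((lam / x) ** delta))

def frechet_cdf(x, lam, delta):
    """Frechet distribution function G(x) = exp(-(lam/x)^delta)."""
    return math.exp(-((lam / x) ** delta))

def frechet_mean(lam, delta):
    """Mean lam * Gamma(1 - 1/delta); infinite when delta <= 1."""
    if delta <= 1.0:
        return math.inf
    return lam * math.gamma(1.0 - 1.0 / delta)

# Illustrative parameter values (assumed, not taken from the article's tables).
lam, delta = 1.5, 3.0
print(frechet_cdf(lam, lam, delta))  # at x = lam the CDF equals exp(-1) for any delta
print(frechet_mean(lam, delta))
```

A quick sanity check is that a central finite difference of `frechet_cdf` reproduces `frechet_pdf`.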

3. The Length-Biased Fréchet Distribution

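Hesham et al. [21] introduced the length-biased Fréchet distribution together with its CDF, PDF, and mean. The CDF is given in Equation (4):
$$G_L(x) = \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\, \gamma\!\left(1 - \frac{1}{\delta}, \left(\frac{\lambda}{x}\right)^{\delta}\right), \quad x > 0, \tag{4}$$
where $\lambda > 0$, $\delta > 1$, $\Gamma$ represents a gamma function, and $\gamma$ represents an incomplete gamma function. For $G_L$ to increase from 0 to 1 in $x$, $\gamma$ must be read as the upper incomplete gamma function, $\gamma(s, z) = \int_{z}^{\infty} u^{s-1} e^{-u}\, du$. The associated PDF is
$$g_L(x) = \frac{\delta \lambda^{\delta - 1}}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\, x^{-\delta} e^{-\left(\frac{\lambda}{x}\right)^{\delta}}. \tag{5}$$
Additionally, the mean of the distribution is
$$E_L(X) = \frac{\lambda\, \Gamma\!\left(1 - \frac{2}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}, \quad \delta > 2. \tag{6}$$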
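As a hedged illustration, the length-biased density can be checked against the usual length-biasing identity $g_L(x) = x\, g(x) / E(X)$, which is consistent with the Fréchet PDF and mean given earlier. The function names and parameter values below are assumptions made for this sketch.

```python
import math

def frechet_pdf(x, lam, delta):
    """Frechet density g(x)."""
    return delta * lam**delta * x ** (-(delta + 1.0)) * math.exp(-((lam / x) ** delta))

def lb_frechet_pdf(x, lam, delta):
    """Length-biased Frechet density g_L(x); requires delta > 1."""
    norm = math.gamma(1.0 - 1.0 / delta)
    return delta * lam ** (delta - 1.0) / norm * x ** (-delta) * math.exp(-((lam / x) ** delta))

def lb_frechet_mean(lam, delta):
    """Mean of the length-biased law; finite only for delta > 2."""
    if delta <= 2.0:
        return math.inf
    return lam * math.gamma(1.0 - 2.0 / delta) / math.gamma(1.0 - 1.0 / delta)

# Illustrative values (assumed). Check g_L(x) = x * g(x) / E(X) at a few points.
lam, delta = 1.5, 4.0
frechet_mean = lam * math.gamma(1.0 - 1.0 / delta)
for x in (0.5, 1.0, 2.0, 5.0):
    lhs = lb_frechet_pdf(x, lam, delta)
    rhs = x * frechet_pdf(x, lam, delta) / frechet_mean
    assert math.isclose(lhs, rhs, rel_tol=1e-12)
```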

4. Theoretical Result

4.1. The Probability Density Function of the Novel Mixture Fréchet (NMF) Distribution

This subsection constructs a novel distribution by employing the notion of a mixture distribution. The proposed distribution combines two distinct distributions, namely the Fréchet distribution and the length-biased Fréchet distribution, with probability weights between the two. However, including a separate weight parameter $p$ with $0 \le p \le 1$ would lead to a more complex PDF that is harder to work with. Consequently, this article uses a function of the parameter $\lambda$ as the weight, so that the PDF of the new distribution has only two parameters, which makes it more convenient in practical application. Hence, the PDF of the novel mixture Fréchet (NMF) distribution is defined as follows:
$$f_{NMF}(x) = \frac{1}{\lambda + 1}\, g(x) + \frac{\lambda}{\lambda + 1}\, g_L(x), \quad x > 0, \tag{7}$$
where $\lambda > 0$ and $\frac{1}{\lambda + 1} + \frac{\lambda}{\lambda + 1} = 1$. Substituting Equations (1) and (5) into Equation (7) gives
$$f_{NMF}(x) = \frac{\delta \lambda^{\delta}}{\lambda + 1}\, x^{-\delta} e^{-\left(\frac{\lambda}{x}\right)^{\delta}} \left[\frac{1}{x} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right], \quad x > 0,\ \lambda > 0,\ \delta > 1. \tag{8}$$
Therefore, Equation (8) represents the PDF of the NMF distribution.
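The step from the mixture form to the closed form can be checked numerically. The sketch below evaluates the NMF density both ways and compares them; the function names and test points are assumptions chosen for illustration.

```python
import math

def nmf_pdf(x, lam, delta):
    """Closed form of the NMF density (Equation (8)); needs lam > 0, delta > 1."""
    gam = math.gamma(1.0 - 1.0 / delta)
    core = delta * lam**delta / (lam + 1.0) * x ** (-delta) * math.exp(-((lam / x) ** delta))
    return core * (1.0 / x + 1.0 / gam)

def nmf_pdf_mixture(x, lam, delta):
    """Same density written as the two-component mixture (Equation (7))."""
    gam = math.gamma(1.0 - 1.0 / delta)
    kernel = math.exp(-((lam / x) ** delta))
    g = delta * lam**delta * x ** (-(delta + 1.0)) * kernel          # Frechet part
    g_l = delta * lam ** (delta - 1.0) / gam * x ** (-delta) * kernel  # length-biased part
    return g / (lam + 1.0) + lam * g_l / (lam + 1.0)

# Illustrative parameter values (assumed).
for x in (0.8, 1.5, 3.0, 10.0):
    assert math.isclose(nmf_pdf(x, 1.5, 3.0), nmf_pdf_mixture(x, 1.5, 3.0), rel_tol=1e-12)
```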

4.2. Validity Check of the NMF Distribution for a Proper Density Function

A non-negative function $f$ is a valid PDF if it satisfies the following condition:
$$\int_{-\infty}^{\infty} f(x)\, dx = 1. \tag{9}$$
To demonstrate that the proposed NMF density is a valid PDF, it must be shown that
$$\int_{0}^{\infty} \frac{\delta \lambda^{\delta}}{\lambda + 1}\, x^{-\delta} e^{-\left(\frac{\lambda}{x}\right)^{\delta}} \left[\frac{1}{x} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right] dx = 1. \tag{10}$$
Let
$$u = \left(\frac{\lambda}{x}\right)^{\delta} \;\Rightarrow\; x = \lambda u^{-\frac{1}{\delta}}, \tag{11}$$
and
$$dx = -\frac{\lambda}{\delta}\, u^{-\frac{1}{\delta} - 1}\, du. \tag{12}$$
Substituting Equations (11) and (12) into Equation (10), and noting that $x$ running from $0$ to $\infty$ corresponds to $u$ running from $\infty$ to $0$, gives
$$\int_{0}^{\infty} f_{NMF}(x)\, dx = \frac{1}{\lambda + 1} \int_{0}^{\infty} e^{-u} \left[1 + \frac{\lambda\, u^{-\frac{1}{\delta}}}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right] du = \frac{1}{\lambda + 1} \left[1 + \frac{\lambda\, \Gamma\!\left(1 - \frac{1}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right] = \frac{1}{\lambda + 1} + \frac{\lambda}{\lambda + 1} = 1. \tag{13}$$
This demonstrates that the PDF defined in Equation (8) is a valid probability density function. Figure 1 depicts the PDF of the novel mixture of Fréchet distribution for different parameter values. The variety of shapes displayed demonstrates the right-skewed nature of the NMF distribution. Additionally, being a family of asymmetric distributions, the NMF distribution is useful for analyzing skewed data, particularly right-skewed data such as survival data.
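The normalization can also be checked numerically. The sketch below integrates the NMF density with a composite Simpson rule after the change of variable $x = e^{s}$; the integration limits, grid size, and parameter values are assumptions that suit scale parameters of order one, not settings from the article.

```python
import math

def nmf_pdf(x, lam, delta):
    """NMF density in the closed form of Equation (8)."""
    gam = math.gamma(1.0 - 1.0 / delta)
    core = delta * lam**delta / (lam + 1.0) * x ** (-delta) * math.exp(-((lam / x) ** delta))
    return core * (1.0 / x + 1.0 / gam)

def integrate_pdf(lam, delta, s_lo=-4.0, s_hi=30.0, n=8000):
    """Composite Simpson rule for the integral of f(x) dx after substituting x = exp(s)."""
    h = (s_hi - s_lo) / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        s = s_lo + i * h
        x = math.exp(s)
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * nmf_pdf(x, lam, delta) * x  # dx = x ds
    return total * h / 3.0

print(integrate_pdf(1.5, 3.0))  # should be very close to 1
```

Working on the log scale keeps the grid fine where the density rises sharply near zero and coarse in the slowly decaying right tail.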

4.3. The Cumulative Distribution Function of the NMF Distribution

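Let $G(x)$ and $G_L(x)$ represent the cumulative distribution function (CDF) of the Fréchet distribution and the length-biased Fréchet distribution, respectively. Consider a random variable $X$ following the novel mixture Fréchet (NMF) distribution. The CDF of $X$ can then be written as follows:
$$F_{NMF}(x) = \int_{0}^{x} f(t)\, dt = \int_{0}^{x} \left[\frac{1}{\lambda + 1}\, g(t) + \frac{\lambda}{\lambda + 1}\, g_L(t)\right] dt = \frac{1}{\lambda + 1}\, G(x) + \frac{\lambda}{\lambda + 1}\, G_L(x). \tag{14}$$
From Equation (14), the CDF of the novel mixture of Fréchet distribution can be expressed as
$$F(x) = \frac{1}{\lambda + 1}\, e^{-\left(\frac{\lambda}{x}\right)^{\delta}} + \frac{\lambda}{\lambda + 1} \cdot \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\, \gamma\!\left(1 - \frac{1}{\delta}, \left(\frac{\lambda}{x}\right)^{\delta}\right) = \frac{1}{\lambda + 1} \left[e^{-\left(\frac{\lambda}{x}\right)^{\delta}} + \frac{\lambda}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\, \gamma\!\left(1 - \frac{1}{\delta}, \left(\frac{\lambda}{x}\right)^{\delta}\right)\right]. \tag{15}$$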
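A possible way to evaluate this CDF without external libraries is sketched below. It assumes that the incomplete gamma term is the upper incomplete gamma function $\int_{z}^{\infty} u^{s-1} e^{-u}\, du$ (the reading under which $F$ increases from 0 to 1) and approximates it by Simpson quadrature; the helper names, grid size, and cut-off are assumptions of this sketch.

```python
import math

def upper_inc_gamma(s, z, w_hi=5.0, n=20000):
    """Approximate int_z^inf u^(s-1) e^(-u) du by Simpson's rule after u = exp(w); needs z > 0."""
    w_lo = math.log(z)
    if w_lo >= w_hi:
        return 0.0  # the integrand beyond u = exp(5) is of order exp(-148)
    h = (w_hi - w_lo) / n
    total = 0.0
    for i in range(n + 1):
        w = w_lo + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.exp(s * w - math.exp(w))  # u^(s-1) e^(-u) du with du = u dw
    return total * h / 3.0

def nmf_cdf(x, lam, delta):
    """NMF distribution function as the weighted sum of G(x) and G_L(x)."""
    s = 1.0 - 1.0 / delta
    z = (lam / x) ** delta
    g_cdf = math.exp(-z)
    gl_cdf = upper_inc_gamma(s, z) / math.gamma(s)
    return (g_cdf + lam * gl_cdf) / (lam + 1.0)

def nmf_pdf(x, lam, delta):
    """NMF density, used only to cross-check the CDF."""
    gam = math.gamma(1.0 - 1.0 / delta)
    core = delta * lam**delta / (lam + 1.0) * x ** (-delta) * math.exp(-((lam / x) ** delta))
    return core * (1.0 / x + 1.0 / gam)

# Illustrative check (assumed parameter values): the CDF increases with x.
values = [nmf_cdf(x, 1.5, 3.0) for x in (0.5, 1.0, 2.0, 5.0)]
assert all(a < b for a, b in zip(values, values[1:]))
```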

4.4. The $r$th Ordinary Moment of the NMF Distribution

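The $r$th ordinary moment of the NMF distribution is defined as follows:
$$\mu_r = E_M(X^r) = \int_{0}^{\infty} x^{r} f(x)\, dx. \tag{16}$$
Inserting Equation (8) into Equation (16) and integrating with respect to $x$ gives the explicit expression for the $r$th ordinary moment of the NMF distribution in Equation (19):
$$\mu_r = \int_{0}^{\infty} \frac{\delta \lambda^{\delta}}{\lambda + 1}\, x^{-\delta + r} e^{-\left(\frac{\lambda}{x}\right)^{\delta}} \left[\frac{1}{x} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right] dx, \tag{17}$$
$$\mu_r = \frac{1}{\lambda + 1}\, E(X^r) + \frac{\lambda}{\lambda + 1}\, E_L(X^r), \tag{18}$$
where
$$E(X^r) = \lambda^{r}\, \Gamma\!\left(1 - \frac{r}{\delta}\right),$$
and
$$E_L(X^r) = \lambda^{r}\, \frac{\Gamma\!\left(1 - \frac{r + 1}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}.$$
Hence, for $\delta > r + 1$,
$$\mu_r = \frac{1}{\lambda + 1}\, \lambda^{r}\, \Gamma\!\left(1 - \frac{r}{\delta}\right) + \frac{1}{\lambda + 1}\, \lambda^{r + 1}\, \frac{\Gamma\!\left(1 - \frac{r + 1}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}, \quad r = 1, 2, 3, \ldots \tag{19}$$
Setting $r = 1$ in Equation (19) gives the mean of the NMF distribution:
$$E_{NMF}(X) = \frac{\lambda}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{1}{\delta}\right) + \frac{\lambda\, \Gamma\!\left(1 - \frac{2}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right], \quad \delta > 2. \tag{20}$$
The second moment of the NMF distribution, $E(X^2)$, follows from Equation (19) by setting $r = 2$:
$$E_{NMF}(X^2) = \frac{\lambda^{2}}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{2}{\delta}\right) + \frac{\lambda\, \Gamma\!\left(1 - \frac{3}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right], \quad \delta > 3. \tag{21}$$
The third moment of the NMF distribution, $E(X^3)$, follows from Equation (19) by setting $r = 3$:
$$E_{NMF}(X^3) = \frac{\lambda^{3}}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{3}{\delta}\right) + \frac{\lambda\, \Gamma\!\left(1 - \frac{4}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right], \quad \delta > 4. \tag{22}$$
The fourth moment of the NMF distribution, $E(X^4)$, follows from Equation (19) by setting $r = 4$:
$$E_{NMF}(X^4) = \frac{\lambda^{4}}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{4}{\delta}\right) + \frac{\lambda\, \Gamma\!\left(1 - \frac{5}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right], \quad \delta > 5. \tag{23}$$
Evaluating Equation (19) at $r = 1$ and $r = 2$ and substituting into Equation (24) yields the variance of the NMF distribution:
$$Var_{NMF}(X) = E_M(X^2) - \left[E_M(X)\right]^{2}, \tag{24}$$
$$Var_{NMF}(X) = \frac{\lambda^{2}}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{2}{\delta}\right) + \frac{\lambda\, \Gamma\!\left(1 - \frac{3}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right] - \left\{\frac{\lambda}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{1}{\delta}\right) + \frac{\lambda\, \Gamma\!\left(1 - \frac{2}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right]\right\}^{2}, \quad \delta > 3. \tag{25}$$
Hence, the standard deviation of the NMF distribution is
$$SD_{NMF}(X) = \sqrt{Var_{NMF}(X)}. \tag{26}$$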
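The moment formulas lend themselves to a short implementation. The following sketch computes the $r$th raw moment as the weighted sum of the Fréchet and length-biased Fréchet moments, and derives the mean, variance, and standard deviation from it; the function names and the example parameters are assumptions.

```python
import math

def nmf_raw_moment(r, lam, delta):
    """E(X^r) under the NMF law (Equation (19)); finite only when delta > r + 1."""
    if delta <= r + 1:
        return math.inf
    g1 = math.gamma(1.0 - 1.0 / delta)
    frechet_part = lam**r * math.gamma(1.0 - r / delta)              # E(X^r)
    lb_part = lam**r * math.gamma(1.0 - (r + 1.0) / delta) / g1      # E_L(X^r)
    return (frechet_part + lam * lb_part) / (lam + 1.0)

def nmf_mean(lam, delta):
    return nmf_raw_moment(1, lam, delta)

def nmf_variance(lam, delta):
    """Second raw moment minus squared mean; needs delta > 3."""
    return nmf_raw_moment(2, lam, delta) - nmf_mean(lam, delta) ** 2

def nmf_sd(lam, delta):
    return math.sqrt(nmf_variance(lam, delta))

# Illustrative parameter values (assumed).
lam, delta = 1.5, 4.0
print(nmf_mean(lam, delta), nmf_variance(lam, delta), nmf_sd(lam, delta))
```

Setting `r = 0` returns 1, which is a convenient consistency check on the mixture weights.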

4.5. The Skewness and Kurtosis of the NMF Distribution

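The skewness and kurtosis coefficients of the novel mixture Fréchet (NMF) distribution are given, respectively, by
$$\Phi_1 = \frac{E_{NMF}(X^3)}{\left[E_{NMF}(X^2)\right]^{\frac{3}{2}}} = \frac{\dfrac{\lambda^{3}}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{3}{\delta}\right) + \dfrac{\lambda\, \Gamma\!\left(1 - \frac{4}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right]}{\left\{\dfrac{\lambda^{2}}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{2}{\delta}\right) + \dfrac{\lambda\, \Gamma\!\left(1 - \frac{3}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right]\right\}^{\frac{3}{2}}}, \tag{27}$$
and
$$\Phi_2 = \frac{E_{NMF}(X^4)}{\left[E_{NMF}(X^2)\right]^{2}} = \frac{\dfrac{\lambda^{4}}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{4}{\delta}\right) + \dfrac{\lambda\, \Gamma\!\left(1 - \frac{5}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right]}{\left\{\dfrac{\lambda^{2}}{\lambda + 1} \left[\Gamma\!\left(1 - \frac{2}{\delta}\right) + \dfrac{\lambda\, \Gamma\!\left(1 - \frac{3}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right]\right\}^{2}}. \tag{28}$$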

4.6. The Moment Generating Function of the NMF Distribution

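The moment-generating function of the NMF distribution is given by
$$E_{NMF}\!\left(e^{Xt}\right) = M_X(t) = \sum_{r=0}^{\infty} \frac{t^{r}\, E_{NMF}(X^{r})}{r!}. \tag{29}$$
Substituting Equation (19) into Equation (29) gives the moment-generating function of the NMF distribution in Equation (30):
$$M_X(t) = \sum_{r=0}^{\infty} \frac{t^{r}}{r!} \left[\frac{1}{\lambda + 1}\, \lambda^{r}\, \Gamma\!\left(1 - \frac{r}{\delta}\right) + \frac{1}{\lambda + 1}\, \lambda^{r + 1}\, \frac{\Gamma\!\left(1 - \frac{r + 1}{\delta}\right)}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right]. \tag{30}$$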

4.7. The Mode of the NMF Distribution

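The mode of the NMF distribution is found by differentiating the natural logarithm of Equation (8) with respect to $x$, setting the derivative equal to zero, and solving for $x$. This leads to the nonlinear Equation (32):
$$\log f(x) = \log\!\left(\frac{\delta \lambda^{\delta}}{\lambda + 1}\right) - \delta \log(x) - \left(\frac{\lambda}{x}\right)^{\delta} + \log\!\left(\frac{1}{x} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right), \tag{31}$$
$$\left(\frac{\lambda}{x}\right)^{\delta} \frac{\delta}{x} - \frac{\delta}{x} - \frac{\dfrac{1}{x^{2}}}{\dfrac{1}{x} + \dfrac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}} = 0. \tag{32}$$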
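Because the mode equation has no closed-form solution, it has to be solved numerically. The sketch below uses bisection, a technique chosen here for illustration rather than one named in the article. It relies on the observation that the left-hand side of the equation is positive at $x = \lambda/10$ and negative at $x = \lambda$ whenever $\delta > 1$; the function names are assumptions.

```python
import math

def nmf_pdf(x, lam, delta):
    """NMF density (Equation (8))."""
    gam = math.gamma(1.0 - 1.0 / delta)
    core = delta * lam**delta / (lam + 1.0) * x ** (-delta) * math.exp(-((lam / x) ** delta))
    return core * (1.0 / x + 1.0 / gam)

def log_pdf_slope(x, lam, delta):
    """Left-hand side of Equation (32), i.e. d/dx log f(x)."""
    gam = math.gamma(1.0 - 1.0 / delta)
    return ((lam / x) ** delta) * delta / x - delta / x - (1.0 / x**2) / (1.0 / x + 1.0 / gam)

def nmf_mode(lam, delta, tol=1e-12):
    """Bisection on (lam/10, lam): the slope is positive at lam/10 and negative at lam."""
    lo, hi = lam / 10.0, lam
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_pdf_slope(mid, lam, delta) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameter values (assumed).
mode = nmf_mode(1.5, 3.0)
print(mode, nmf_pdf(mode, 1.5, 3.0))
```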

4.8. The Survival Function and the Hazard Rate Function of the NMF Distribution

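Consider a continuous random variable $X$ whose cumulative distribution function $F(x)$ is defined on the range $[0, \infty)$. The survival function of $X$ is
$$S(x) = 1 - F(x). \tag{33}$$
The survival function of the NMF distribution is obtained by inserting Equation (15) into Equation (33):
$$S(x) = 1 - \frac{1}{\lambda + 1} \left[e^{-\left(\frac{\lambda}{x}\right)^{\delta}} + \frac{\lambda}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\, \gamma\!\left(1 - \frac{1}{\delta}, \left(\frac{\lambda}{x}\right)^{\delta}\right)\right]. \tag{34}$$
The hazard rate function of $X$ is defined as
$$hrf(x) = \frac{f(x)}{S(x)}. \tag{35}$$
Consequently, the hazard rate function of the NMF distribution is given by
$$hrf(x) = \frac{\dfrac{\delta \lambda^{\delta}}{\lambda + 1}\, x^{-\delta} e^{-\left(\frac{\lambda}{x}\right)^{\delta}} \left[\dfrac{1}{x} + \dfrac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right]}{1 - \dfrac{1}{\lambda + 1} \left[e^{-\left(\frac{\lambda}{x}\right)^{\delta}} + \dfrac{\lambda}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\, \gamma\!\left(1 - \frac{1}{\delta}, \left(\frac{\lambda}{x}\right)^{\delta}\right)\right]}. \tag{36}$$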
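The survival and hazard functions can be evaluated along the same lines as the CDF. This sketch again assumes that the incomplete gamma term is the upper incomplete gamma function and approximates it by Simpson quadrature; helper names, grid size, and parameter values are illustrative assumptions.

```python
import math

def upper_inc_gamma(s, z, w_hi=5.0, n=20000):
    """Approximate int_z^inf u^(s-1) e^(-u) du by Simpson's rule after u = exp(w); needs z > 0."""
    w_lo = math.log(z)
    if w_lo >= w_hi:
        return 0.0  # negligible tail
    h = (w_hi - w_lo) / n
    total = 0.0
    for i in range(n + 1):
        w = w_lo + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.exp(s * w - math.exp(w))
    return total * h / 3.0

def nmf_pdf(x, lam, delta):
    gam = math.gamma(1.0 - 1.0 / delta)
    core = delta * lam**delta / (lam + 1.0) * x ** (-delta) * math.exp(-((lam / x) ** delta))
    return core * (1.0 / x + 1.0 / gam)

def nmf_survival(x, lam, delta):
    """S(x) = 1 - F(x), with F the weighted sum of the two component CDFs."""
    s = 1.0 - 1.0 / delta
    z = (lam / x) ** delta
    cdf = (math.exp(-z) + lam * upper_inc_gamma(s, z) / math.gamma(s)) / (lam + 1.0)
    return 1.0 - cdf

def nmf_hazard(x, lam, delta):
    """Hazard rate f(x) / S(x)."""
    return nmf_pdf(x, lam, delta) / nmf_survival(x, lam, delta)

# Illustrative parameter values (assumed): hazard at a few time points.
for x in (1.0, 2.0, 4.0, 8.0):
    print(x, nmf_hazard(x, 1.5, 3.0))
```

Printing the hazard over a grid of time points is a simple way to inspect how the rate varies with time for a given parameter pair.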

4.9. Asymptotic Behavior of the NMF Distribution

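The density of the NMF distribution tends to zero as $x$ approaches infinity:
$$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} \frac{\delta \lambda^{\delta}}{\lambda + 1}\, x^{-\delta} e^{-\left(\frac{\lambda}{x}\right)^{\delta}} \left[\frac{1}{x} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right] = 0.$$
As $x$ approaches $\lambda$:
$$\lim_{x \to \lambda} f(x) = \lim_{x \to \lambda} \frac{\delta \lambda^{\delta}}{\lambda + 1}\, x^{-\delta} e^{-\left(\frac{\lambda}{x}\right)^{\delta}} \left[\frac{1}{x} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right] = \frac{\delta}{e\,(\lambda + 1)} \left[\frac{1}{\lambda} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right].$$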

4.10. Maximum Likelihood Estimation of the NMF Distribution

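This subsection uses maximum likelihood to estimate the parameters of the NMF distribution. If $x_1, \ldots, x_n$ is a random sample of size $n$ from the NMF distribution, the likelihood function is
$$L(x) = \prod_{i=1}^{n} \frac{\delta \lambda^{\delta}}{\lambda + 1}\, x_i^{-\delta} e^{-\left(\frac{\lambda}{x_i}\right)^{\delta}} \left[\frac{1}{x_i} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right], \tag{38}$$
$$\ell(x) = \log \prod_{i=1}^{n} \frac{\delta \lambda^{\delta}}{\lambda + 1}\, x_i^{-\delta} e^{-\left(\frac{\lambda}{x_i}\right)^{\delta}} \left[\frac{1}{x_i} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right]. \tag{39}$$
Taking the natural logarithm of Equation (38) gives the log-likelihood function in Equation (40):
$$\ell(x) = n \log(\delta) + n \delta \log(\lambda) - n \log(\lambda + 1) - \delta \sum_{i=1}^{n} \log(x_i) - \sum_{i=1}^{n} \left(\frac{\lambda}{x_i}\right)^{\delta} + \sum_{i=1}^{n} \log\!\left(\frac{1}{x_i} + \frac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}\right). \tag{40}$$
The maximum likelihood estimators (MLEs) are obtained by differentiating Equation (40) with respect to $\lambda$ and $\delta$, setting the derivatives to zero, and solving for the two parameters:
$$\frac{\partial \ell(x)}{\partial \lambda} = \frac{n \delta}{\lambda} - \frac{n}{\lambda + 1} - \sum_{i=1}^{n} \left(\frac{\lambda}{x_i}\right)^{\delta} \frac{\delta}{\lambda}, \tag{41}$$
$$\frac{\partial \ell(x)}{\partial \delta} = \frac{n}{\delta} + n \log(\lambda) - \sum_{i=1}^{n} \log(x_i) - \sum_{i=1}^{n} \left(\frac{\lambda}{x_i}\right)^{\delta} \log\!\left(\frac{\lambda}{x_i}\right) - \sum_{i=1}^{n} \frac{\dfrac{\Psi\!\left(1 - \frac{1}{\delta}\right)}{\delta^{2}\, \Gamma\!\left(1 - \frac{1}{\delta}\right)}}{\dfrac{1}{x_i} + \dfrac{1}{\Gamma\!\left(1 - \frac{1}{\delta}\right)}}, \tag{42}$$
where
$$\Psi(\delta) = \frac{\partial \ln(\Gamma(\delta))}{\partial \delta}.$$
Because these equations are nonlinear, they have no analytical solution and must be solved numerically with iterative methods. This article proposes the expectation-maximization (EM) algorithm and simulated annealing to construct the MLEs for the NMF distribution.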
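A short numerical check of the log-likelihood and of its derivative with respect to $\lambda$ is sketched below. The sample is the set of remission times listed later in the article; the parameter values at which the functions are evaluated are arbitrary assumptions, as are the function names. The derivative with respect to $\delta$ is omitted because the digamma function is not in the Python standard library.

```python
import math

def nmf_loglik(sample, lam, delta):
    """Log-likelihood in the expanded form of Equation (40)."""
    n = len(sample)
    gam = math.gamma(1.0 - 1.0 / delta)
    return (n * math.log(delta) + n * delta * math.log(lam) - n * math.log(lam + 1.0)
            - delta * sum(math.log(x) for x in sample)
            - sum((lam / x) ** delta for x in sample)
            + sum(math.log(1.0 / x + 1.0 / gam) for x in sample))

def score_lambda(sample, lam, delta):
    """Partial derivative of the log-likelihood with respect to lam (Equation (41))."""
    n = len(sample)
    return n * delta / lam - n / (lam + 1.0) - sum((lam / x) ** delta for x in sample) * delta / lam

# Remission times (weeks) of the placebo group as listed in the article.
remission = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23]

# Arbitrary evaluation point (assumed, not a fitted value).
lam, delta = 1.5, 3.0
h = 1e-6
finite_diff = (nmf_loglik(remission, lam + h, delta) - nmf_loglik(remission, lam - h, delta)) / (2 * h)
print(score_lambda(remission, lam, delta), finite_diff)  # the two values should agree closely
```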

4.10.1. Maximum Likelihood Estimation Employing the Simulated Annealing Algorithm

This article examines the MLEs for the unknown parameters of the NMF distribution. As noted in Section 4.10, the MLEs cannot be obtained analytically. Therefore, in this part, the R optimization function “optim” is employed for maximum likelihood estimation (MLE) using simulated annealing. The steps of the simulated annealing algorithm are as follows:
Step 1: Choose an initial value $x^{(k=0)}$, a temperature $T$, a number of iterations $n$, and a desired accuracy $\varepsilon$.
Step 2: Pick a random value $x^{(k+1)}$ in the vicinity of $x^{(k)}$.
Step 3: If $\Delta E < 0$, where $\Delta E = f(x^{(k+1)}) - f(x^{(k)})$ and $f(x)$ is the objective function, accept $x^{(k+1)}$. Otherwise, generate a random number $\alpha \in (0, 1)$. If $\alpha \le \exp(-\Delta E / (K T))$, where $K$ is the Boltzmann constant, accept $x^{(k+1)}$. Otherwise, return to Step 2.
Step 4: If $|x^{(k+1)} - x^{(k)}| < \varepsilon$ and $T$ is sufficiently small, terminate the iterations. Otherwise, if the number of random number generations has reached $n$, decrease the value of $T$, set $k = k + 1$, and go to Step 2. Otherwise, set $k = k + 1$ and go to Step 2.
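The steps above can be sketched in Python as follows. This is not the R “optim” routine used in the article but a generic illustration with several assumptions: Gaussian proposals scaled by $\sqrt{T}$, a geometric cooling schedule, $K = 1$, a strict inequality in the acceptance test, and termination once the temperature falls below a threshold. The objective is the negative NMF log-likelihood on the remission times listed later in the article, and the starting point is arbitrary.

```python
import math
import random

def simulated_annealing(f, x0, t0=1.0, cooling=0.95, n_per_temp=100,
                        t_min=1e-4, step=0.5, seed=0):
    """Minimise f with Metropolis acceptance (assumed schedule: geometric cooling, K = 1)."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best, fbest = list(x), fx
    t = t0
    while t > t_min:
        for _ in range(n_per_temp):
            # Step 2: propose a point near the current one.
            cand = [xi + rng.gauss(0.0, step * math.sqrt(t)) for xi in x]
            fc = f(cand)
            d_e = fc - fx
            # Step 3: always accept downhill moves; accept uphill moves with prob. exp(-dE/T).
            if d_e < 0.0 or rng.random() < math.exp(-d_e / t):
                x, fx = cand, fc
                if fx < fbest:
                    best, fbest = list(x), fx
        t *= cooling  # Step 4: lower the temperature after n_per_temp proposals
    return best, fbest

def nmf_negloglik(theta, sample):
    """Negative NMF log-likelihood; returns inf outside lam > 0, delta > 1."""
    lam, delta = theta
    if lam <= 0.0 or delta <= 1.0:
        return math.inf
    try:
        gam = math.gamma(1.0 - 1.0 / delta)
        return -sum(math.log(delta) + delta * math.log(lam) - math.log(lam + 1.0)
                    - delta * math.log(x) - (lam / x) ** delta
                    + math.log(1.0 / x + 1.0 / gam) for x in sample)
    except (OverflowError, ValueError):
        return math.inf

remission = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23]
start = [1.5, 2.0]  # arbitrary starting point (assumed)
estimate, value = simulated_annealing(lambda th: nmf_negloglik(th, remission), start)
print(estimate, value)
```

Tracking the best point visited, rather than returning the final state, is a design choice of this sketch that makes the result no worse than the starting value.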

4.10.2. Maximum Likelihood Estimation Employing the EM-Algorithm

An EM algorithm is an iterative method employed to estimate unknown parameters in incomplete statistical models. The application of the EM algorithm encompasses two primary scenarios. The first arises when the data is incomplete due to observational process issues or limitations. The second arises when optimizing the likelihood function becomes challenging. The procedure for implementing the EM algorithm for the NMF distribution is outlined as follows:
Steps involved in the Expectation (E)-Step
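  • Calculate the log-likelihood function for an NMF distribution.
    $$\ln L(x) = \sum_{i=1}^{n} \ln\!\left[\frac{1}{\lambda + 1}\, g(x_i) + \frac{\lambda}{\lambda + 1}\, g_L(x_i)\right].$$
  • Compute a complete log-likelihood function by introducing a missing value $\kappa_i$ into $\ln L(x)$. Each $\kappa_i$ takes the value 0 or 1; in Equation (44), $\kappa_i = 1$ marks the Fréchet component and $\kappa_i = 0$ the length-biased Fréchet component. Thus, the complete random variable is denoted as $Y = (X; K)$, where $y_1, y_2, \ldots, y_n$ represent the observations with $y_i = (t_i, \kappa_i)$ for $i = 1, 2, \ldots, n$, and $t_i$ denotes the observed value $x_i$. Consequently, the complete log-likelihood function is:
    $$l_{\text{complete}}\!\left(\Theta \mid y_1, y_2, \ldots, y_n\right) = \sum_{i=1}^{n} \kappa_i \ln\!\left[\frac{1}{\lambda + 1}\, g(x_i)\right] + \sum_{i=1}^{n} (1 - \kappa_i) \ln\!\left[\frac{\lambda}{\lambda + 1}\, g_L(x_i)\right] = \sum_{i=1}^{n} \kappa_i \ln\!\left[\frac{1}{\lambda + 1}\, g(x_i)\right] + \sum_{i=1}^{n} (1 - \kappa_i) \ln\!\left[\frac{x_i}{(\lambda + 1)\, \Gamma\!\left(1 - \frac{1}{\delta}\right)}\, g(x_i)\right], \tag{44}$$
    where $\Theta = \{\lambda, \delta\}$. Equation (44) can be simplified by substituting Equation (1), which gives the complete log-likelihood function in the following form:
    $$l_{\text{complete}}\!\left(\Theta \mid y_1, y_2, \ldots, y_n\right) = \sum_{i=1}^{n} \ln(x_i) - n \ln(\lambda + 1) - n \ln \Gamma\!\left(1 - \frac{1}{\delta}\right) - \sum_{i=1}^{n} \kappa_i \ln(x_i) + n \bar{\kappa} \ln \Gamma\!\left(1 - \frac{1}{\delta}\right) + n \ln(\delta) + n \delta \ln(\lambda) - (\delta + 1) \sum_{i=1}^{n} \ln(x_i) - \sum_{i=1}^{n} \left(\frac{\lambda}{x_i}\right)^{\delta}, \tag{45}$$
    where $\bar{\kappa} = \frac{1}{n} \sum_{i=1}^{n} \kappa_i$.
  • Formulate the new complete log-likelihood function by eliminating the terms that do not depend on $\Theta$, resulting in the following expression:
    $$l_{\text{complete}}\!\left(\Theta \mid y_1, y_2, \ldots, y_n\right) = -n \ln(\lambda + 1) - n \ln \Gamma\!\left(1 - \frac{1}{\delta}\right) + n \bar{\kappa} \ln \Gamma\!\left(1 - \frac{1}{\delta}\right) + n \ln(\delta) + n \delta \ln(\lambda) - (\delta + 1) \sum_{i=1}^{n} \ln(x_i) - \sum_{i=1}^{n} \left(\frac{\lambda}{x_i}\right)^{\delta}. \tag{46}$$
In the E-step of an EM algorithm, a pseudo-log-likelihood function is obtained by replacing the missing values with their expectations. Hence, the pseudo-log-likelihood function at the $k$th stage can be expressed as follows:
$$l_{\text{complete}}\!\left(\Theta \mid y_1, y_2, \ldots, y_n\right) = -n \ln(\lambda + 1) - n \ln \Gamma\!\left(1 - \frac{1}{\delta}\right) + n a^{(k)} \ln \Gamma\!\left(1 - \frac{1}{\delta}\right) + n \ln(\delta) + n \delta \ln(\lambda) - (\delta + 1) \sum_{i=1}^{n} \ln(x_i) - \sum_{i=1}^{n} \left(\frac{\lambda}{x_i}\right)^{\delta}, \tag{47}$$
where $a^{(k)} = \frac{1}{n} \sum_{i=1}^{n} a_i^{(k)}$, and $a_i^{(k)}$ is given by
$$a_i^{(k)} = \frac{\dfrac{1}{\lambda^{(k)} + 1}\, g\!\left(x_i; \lambda^{(k)}, \delta^{(k)}\right)}{\dfrac{1}{\lambda^{(k)} + 1}\, g\!\left(x_i; \lambda^{(k)}, \delta^{(k)}\right) + \dfrac{\lambda^{(k)}}{\lambda^{(k)} + 1}\, g_L\!\left(x_i; \lambda^{(k)}, \delta^{(k)}\right)}. \tag{48}$$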
Steps involved in the Maximization (M)-Step
The M-step maximizes the pseudo-log-likelihood function with respect to $\lambda$ and $\delta$. With each iteration, the value of $a^{(k)}$ and the estimated parameters $\lambda^{(k+1)}$ and $\delta^{(k+1)}$ are updated. The process continues until the estimated values remain unchanged. Consequently, the MLEs for $\lambda$ and $\delta$ obtained via the EM algorithm are $\lambda^{(k+1)}$ and $\delta^{(k+1)}$, respectively, achieved by maximizing Equation (53). The initial values suggested in this article for the EM algorithm are $\lambda^{(0)}$ and $\delta^{(0)}$, which are as follows:
$$\lambda^{(0)} = \left(\frac{n}{t}\right)^{\frac{1}{\delta}}, \quad \text{where } t = \sum_{i=1}^{n} \left(\frac{1}{t_i}\right)^{\delta}, \tag{49}$$
$$\delta^{(0)} = 2 \quad \text{when the sample size is small, and} \tag{50}$$
$$\delta^{(0)} = 1.5 \quad \text{when the sample size is large.} \tag{51}$$
Steps of EM-Algorithm:
Step 1: Generate a random sample $t_1, t_2, \ldots, t_n$ according to the NMF distribution.
Step 2: Set $k = 0$ and compute the initial values $\lambda^{(0)}$ and $\delta^{(0)}$ as specified in Equations (49)–(51).
Step 3: Calculate $a^{(k)} = \frac{1}{n} \sum_{i=1}^{n} a_i^{(k)}$ for $i = 1, 2, \ldots, n$, where $a_i^{(k)}$ is given by Equation (52). For example, when $k = 0$, we obtain the following:
$$a_i^{(0)} = \frac{\dfrac{1}{\lambda^{(0)} + 1}\, g\!\left(x_i; \lambda^{(0)}, \delta^{(0)}\right)}{\dfrac{1}{\lambda^{(0)} + 1}\, g\!\left(x_i; \lambda^{(0)}, \delta^{(0)}\right) + \dfrac{\lambda^{(0)}}{\lambda^{(0)} + 1}\, g_L\!\left(x_i; \lambda^{(0)}, \delta^{(0)}\right)}. \tag{52}$$
Step 4: Obtain the values of $\lambda^{(k+1)}$ and $\delta^{(k+1)}$ by maximizing Equation (53). For instance, when $k = 0$, the function to be maximized is:
$$l_{\text{complete}}\!\left(\Theta \mid y_1, y_2, \ldots, y_n\right) = -n \ln(\lambda + 1) - n \ln \Gamma\!\left(1 - \frac{1}{\delta}\right) + n a^{(0)} \ln \Gamma\!\left(1 - \frac{1}{\delta}\right) + n \ln(\delta) + n \delta \ln(\lambda) - (\delta + 1) \sum_{i=1}^{n} \ln(x_i) - \sum_{i=1}^{n} \left(\frac{\lambda}{x_i}\right)^{\delta}. \tag{53}$$
Step 5: If $\lambda^{(k+1)} = \lambda^{(k)}$ and $\delta^{(k+1)} = \delta^{(k)}$ (to within a chosen numerical tolerance), the algorithm stops. Otherwise, set $k = k + 1$ and repeat Step 3 and Step 4.
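Two ingredients of the procedure above, the starting value for $\lambda$ and the E-step weights, can be sketched as follows. The M-step is not implemented here, since it requires a numerical optimizer. The function names are assumptions, the sample is the set of remission times listed later in the article, and the choice $\delta^{(0)} = 2$ is used purely for illustration.

```python
import math

def frechet_pdf(x, lam, delta):
    return delta * lam**delta * x ** (-(delta + 1.0)) * math.exp(-((lam / x) ** delta))

def lb_frechet_pdf(x, lam, delta):
    gam = math.gamma(1.0 - 1.0 / delta)
    return delta * lam ** (delta - 1.0) / gam * x ** (-delta) * math.exp(-((lam / x) ** delta))

def initial_lambda(sample, delta0):
    """Starting value lambda^(0) = (n / sum(t_i^-delta0))^(1/delta0)."""
    n = len(sample)
    return (n / sum(t ** (-delta0) for t in sample)) ** (1.0 / delta0)

def e_step_weights(sample, lam, delta):
    """Posterior probability that each observation came from the Frechet component."""
    weights = []
    for x in sample:
        a = frechet_pdf(x, lam, delta) / (lam + 1.0)
        b = lam * lb_frechet_pdf(x, lam, delta) / (lam + 1.0)
        weights.append(a / (a + b))
    return weights

remission = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23]
delta0 = 2.0  # illustrative choice of starting shape (assumed)
lam0 = initial_lambda(remission, delta0)
a_bar = sum(e_step_weights(remission, lam0, delta0)) / len(remission)
print(lam0, a_bar)
```

Because $\lambda\, g_L(x) = x\, g(x) / \Gamma(1 - 1/\delta)$, each weight reduces algebraically to $\Gamma(1 - 1/\delta) / \left(\Gamma(1 - 1/\delta) + x_i\right)$, which gives a cheap cross-check of the E-step.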

4.10.3. Assessment of the Efficacy of the Parameter Estimation

In this subsection, a series of simulations was performed to compare the maximum likelihood estimators obtained using the EM algorithm and simulated annealing. Equations (49) and (50) are also used as the initial values for simulated annealing via the “optim” function. Samples from the NMF distribution were generated with an acceptance-rejection algorithm based on a Fréchet distribution from the VGAM package in R version 4.3.0. Each model was subjected to 500 repetitions. Sample sizes of $n = 5, 10, 30, 50$ were generated for the NMF distribution with parameters $\lambda = 1.5, 2.5$ and $\delta = 2, 3, 4$. The resulting computations yielded six models for each method and sample size, as presented in Tables 1–8.
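For readers who want to reproduce such a simulation without R, the sketch below draws NMF samples by composition sampling instead of the acceptance-rejection algorithm used in the article; this is a swapped-in technique, not the authors' method. It relies on two facts that follow from the densities given earlier: for the Fréchet component $(\lambda/X)^{\delta}$ is standard exponential, and for the length-biased component $(\lambda/X)^{\delta}$ is gamma distributed with shape $1 - 1/\delta$ and unit scale. Function names, seed, and parameter values are assumptions.

```python
import math
import random

def sample_nmf(n, lam, delta, seed=0):
    """Draw n values from the NMF law by composition (not the article's acceptance-rejection)."""
    rng = random.Random(seed)
    shape = 1.0 - 1.0 / delta
    out = []
    for _ in range(n):
        if rng.random() < 1.0 / (lam + 1.0):
            # Frechet component: (lam / X)^delta ~ Exponential(1).
            u = rng.expovariate(1.0)
        else:
            # Length-biased component: (lam / X)^delta ~ Gamma(shape = 1 - 1/delta, scale = 1).
            u = rng.gammavariate(shape, 1.0)
        u = max(u, 1e-300)  # guard against an exact zero draw
        out.append(lam * u ** (-1.0 / delta))
    return out

def nmf_mean(lam, delta):
    """Theoretical mean of the NMF law, valid for delta > 2."""
    g1 = math.gamma(1.0 - 1.0 / delta)
    return lam / (lam + 1.0) * (g1 + lam * math.gamma(1.0 - 2.0 / delta) / g1)

# Illustrative settings (assumed): compare the sample mean with the theoretical mean.
draws = sample_nmf(100000, 1.5, 4.0, seed=1)
print(sum(draws) / len(draws), nmf_mean(1.5, 4.0))
```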
Across the results summarized in Figure 2, the EM algorithm performed well, with estimates for most parameters close to the true values. Moreover, the proposed EM estimators were more precise than the maximum likelihood estimates obtained through simulated annealing, as shown by their smaller bias, lower mean squared error (MSE), and smaller simulated variance.

5. Illustrative Example

The proposed distribution is applied to an actual dataset in this part. The dataset used in this analysis was collected from a clinical trial conducted by Freireich et al. [22], where patients received a placebo to evaluate the efficacy of 6-mercaptopurine (6-MP) in maintaining remission. Following the completion of the trial after a year, the following remission times were recorded and are expressed in weeks: 1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23.
Based on the results shown in Figure 3, the remission times of patients who received a placebo had a right-skewed distribution. In order to compare the goodness of fit, four right-skewed distributions—the Fréchet distribution, the length-biased Fréchet distribution, the mixture of Nakagami distribution [5], and the proposed mixture Fréchet distribution—are chosen.
While the parameters of the other candidate distributions are determined using maximum likelihood estimation utilizing simulated annealing, the parameters of the novel mixture Fréchet (NMF) distribution are estimated using the EM algorithm. The best model is the one that provides the smallest Akaike information criterion (AIC) value, which is used as the evaluation criterion.
Based on the findings presented in Table 9, it is evident that the NMF distribution yields the lowest value of the AIC. This indicates that the NMF distribution outperforms the other potential distributions when using an AIC statistic as a measure of goodness-of-fit for this example data. Therefore, as indicated by Equations (20) and (26), the mean and standard deviation of the remission times observed in a group of 21 patients who received a placebo are 3.091147 weeks and 2.792774 weeks, respectively.

6. Conclusions and Discussion

This article presents the introduction of a novel survival distribution known as the novel mixture Fréchet (NMF) distribution. This distribution is characterized by its right-skewed distribution. The study explores various statistical properties of this newly proposed distribution and estimates its two parameters using both EM algorithms and simulated annealing. To assess the performance of both methods, a simulation study is conducted, involving twenty-four different combination scenarios. The illustrative examples of the proposed distribution are implemented using patient remission times data. The results reveal that the EM estimators exhibit greater efficiency compared to the simulated annealing estimators. Additionally, the NMF distribution demonstrates a better fit when compared to other candidate distributions, as indicated by the Akaike information criterion (AIC). Consequently, this article presents a novel right-skewed distribution that holds potential application in diverse areas, including extreme value analysis, survival analysis, and reliability analysis.
In future research, it is advisable to investigate interval estimation using different methods, such as [23,24], to further enhance the accuracy of the estimations.

Author Contributions

Conceptualization, W.P. (Wikanda Phaphan); methodology, W.P. (Wikanda Phaphan) and I.A.; validation, W.P. (Wikanda Phaphan), I.A. and W.P. (Wirawan Puttamat); formal analysis, W.P. (Wikanda Phaphan), I.A. and W.P. (Wirawan Puttamat); investigation, I.A.; writing—original draft preparation, W.P. (Wikanda Phaphan), I.A. and W.P. (Wirawan Puttamat); writing—review and editing, W.P. (Wikanda Phaphan), I.A. and W.P. (Wirawan Puttamat); visualization, W.P. (Wikanda Phaphan); funding acquisition, W.P. (Wikanda Phaphan). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by King Mongkut’s University of Technology North Bangkok, Thailand, contract no. KMUTNB-66-BASIC-04.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The real-world data set used in this study was collected by Freireich et al. [22] and is available in the book by Lee and Wang [25].

Acknowledgments

The authors express their gratitude to the reviewers for their invaluable insights and constructive feedback. This research was financially supported by King Mongkut’s University of Technology North Bangkok, Thailand, under contract number KMUTNB-66-BASIC-04.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Farewell, T. The Use of Mixture Models for the Analysis of Survival Data with Long-Term Survivors. Biometrics 1982, 38, 1041–1046.
  2. Hunsberger, S.; Albert, S.; London, B. A finite mixture survival model to characterize risk groups of neuroblastoma. Stat Med. 2009, 28, 1301–1314.
  3. Joudaki, H.; Hashemi, R.; Khazaei, S. Survival analysis using Dirichlet process mixture model with three-parameter Burr XII distribution as kernel. Commun. Stat. Simul. Comput. 2022, 1–19.
  4. Niyomdecha, A.; Srisuradetchai, P. Complementary Gamma Zero-Truncated Poisson Distribution and Its Application. Mathematics 2023, 11, 2584.
  5. Abdullahi, I.; Phaphan, W. Some Properties of the New Mixture of Nakagami Distribution. Thail. Stat. 2022, 20, 731–743.
  6. Nanuwong, N.; Bodhisuwan, W.; Pudprommarat, C. A New Mixture Pareto Distribution and Its Application. Thail. Stat. 2015, 13, 191–207.
  7. Aryuyuen, S.; Bodhisuwan, W.; Volodin, A. Discrete Generalized Odd Lindley–Weibull Distribution with Applications. Lobachevskii J. Math. 2020, 41, 945–955.
  8. Tonggumnead, U.; Klinjan, K.; Tanprayoon, E.; Aryuyuen, S. A four-parameter negative binomial-Lindley regression model to analyze factors influencing the number of cancer deaths using Bayesian inference. Commun. Math. Biol. Neurosci. 2023, 2023, 1–20.
  9. Fréchet, M. Sur la loi de probabilité de l’écart maximum. Ann. Soc. Polon. Math. 1927, 6, 93.
  10. Fisher, R.A.; Tippett, L.H.C. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Math. Proc. Camb. Philos. Soc. 1928, 24, 180–190.
  11. Gumbel, E.J. Statistics of Extremes; Columbia University Press: New York, NY, USA, 1958.
  12. Abbas, K.; Yincai, T. Comparison of estimation methods for Fréchet distribution with known shape. Casp. J. Appl. Sci. Res. 2012, 1, 58–64.
  13. Nasir, W.; Aslam, M. Bayes approach to study shape parameter of Fréchet distribution. Int. J. Basic. Appl. Sci. 2015, 4, 246–254.
  14. Reyad, H.M.; Younis, A.M.; Ahmed, S.O. QE-Bayesian and E-Bayesian estimation of the Fréchet model. BJMCS 2016, 19, 62–74.
  15. Mead, M.E. On five-parameter Lomax distribution: Properties and applications. Pak. J. Stat. Oper. Res. 2016, 1, 185–199.
  16. Kotz, S.; Nadarajah, S. Extreme Value Distributions: Theory and Applications; Imperial College Press: London, UK, 2000.
  17. Adlouni, S.; Bobée, B.; Ouarda, T. On the tails of extreme event distributions in hydrology. J. Hydrol. 2008, 355, 16–33.
  18. Moccia, B.; Mineo, C.; Ridolfi, E.; Russo, F.; Napolitano, F. Probability distributions of daily rainfall extremes in Lazio and Sicily, Italy, and design rainfall inferences. J. Hydrol. Reg. Stud. 2021, 23, 100771.
  19. Ramos, L.; Louzada, F.; Ramos, E.; Dey, S. The Fréchet distribution: Estimation and application—An overview. J. Stat. Manag. Syst. 2020, 23, 549–578.
  20. Afify, A.Z.; Yousof, H.M.; Cordeiro, G.M.; Ortega, E.M.M.; Nofal, Z.M. The Weibull Fréchet distribution and its applications. J. Appl. Stat. 2016, 43, 2608–2626.
  21. Hesham, M.R.; Ahmed, M.H.; Soha, A.O.; Suzanne, A.A. The length-biased weighted Fréchet distribution: Properties and estimation. Int. J. Appl. Math. Stat. 2017, 3, 189–200.
  22. Acute Leukemia Group B; Freireich, E.J.; Gehan, E.A.; Frei, E.; Schroeder, L.R.; Wolman, I.J.; Anbari, R.; Burgert, E.O.; Mills, S.D.; Pinkel, D.; et al. The Effect of 6-Mercaptopurine on the Duration of Steroid-Induced Remissions in Acute Leukemia: A Model for Evaluation of Other Potential Useful Therapy. Blood 1963, 21, 699–716.
  23. Srisuradetchai, P.; Dangsupa, K. On Interval Estimation of the Geometric Parameter in a Zero–inflated Geometric Distribution. Thail. Stat. 2023, 21, 93–109.
  24. Srisuradetchai, P.; Tonprasongrat, K. On Interval Estimation of the Poisson Parameter in a Zero-inflated Poisson Distribution. Thail. Stat. 2022, 20, 357–371.
  25. Lee, E.T.; Wang, J.W. Statistical Methods for Survival Data Analysis, 3rd ed.; Wiley: Hoboken, NJ, USA, 2003; Volume 29.
Figure 1. Probability density functions for the novel mixture of Fréchet distribution at various values of λ and δ.
Figure 2. Box plots of the biases, MSEs, and simulated variances of the EM estimators and the simulated annealing estimators.
Figure 3. Times in remission of the 21 patients who received a placebo.
Table 1. Average estimates, biases, MSEs, and simulated variances of the EM estimators λ̂ and δ̂ for a sample size of n = 5.

| λ | δ | λ̂ | δ̂ | Bias(λ̂) | Bias(δ̂) | MSE(λ̂) | MSE(δ̂) | VarSim(λ̂) | VarSim(δ̂) |
|---|---|---|---|---|---|---|---|---|---|
| 1.5 | 2 | 1.7470 | 2.7426 | 0.2470 | 0.7426 | 0.4008 | 2.1382 | 0.3398 | 1.5867 |
| 1.5 | 3 | 1.5637 | 4.0928 | 0.0637 | 1.0928 | 0.0814 | 4.3311 | 0.0773 | 3.1369 |
| 1.5 | 4 | 1.5621 | 5.3009 | 0.0621 | 1.3009 | 0.0455 | 7.8207 | 0.0416 | 6.1283 |
| 2.5 | 2 | 2.9805 | 2.7295 | 0.4805 | 0.7295 | 1.2316 | 1.9732 | 1.0007 | 1.4410 |
| 2.5 | 3 | 2.6579 | 3.9894 | 0.1579 | 0.9894 | 0.2846 | 3.6459 | 0.2597 | 2.6669 |
| 2.5 | 4 | 2.6156 | 5.7329 | 0.1156 | 1.7329 | 0.1347 | 13.7042 | 0.1214 | 10.7012 |
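The three error columns of Table 1 are linked by the standard identity MSE = variance + bias². The sketch below is an illustrative consistency check, not the authors' simulation code. It assumes the simulated variance is taken around the mean of the replicated estimates. The names `TABLE1`, `mse_from_parts`, and `max_gap` are hypothetical helpers introduced here; the numbers are the rounded entries of Table 1.

```python
# Rounded values copied from Table 1 (EM estimators, n = 5).
# Each entry: (bias, reported MSE, simulated variance).
TABLE1 = [
    (0.2470, 0.4008, 0.3398),    # lambda-hat, (lambda, delta) = (1.5, 2)
    (0.0637, 0.0814, 0.0773),    # lambda-hat, (1.5, 3)
    (0.0621, 0.0455, 0.0416),    # lambda-hat, (1.5, 4)
    (0.4805, 1.2316, 1.0007),    # lambda-hat, (2.5, 2)
    (0.1579, 0.2846, 0.2597),    # lambda-hat, (2.5, 3)
    (0.1156, 0.1347, 0.1214),    # lambda-hat, (2.5, 4)
    (0.7426, 2.1382, 1.5867),    # delta-hat, (1.5, 2)
    (1.0928, 4.3311, 3.1369),    # delta-hat, (1.5, 3)
    (1.3009, 7.8207, 6.1283),    # delta-hat, (1.5, 4)
    (0.7295, 1.9732, 1.4410),    # delta-hat, (2.5, 2)
    (0.9894, 3.6459, 2.6669),    # delta-hat, (2.5, 3)
    (1.7329, 13.7042, 10.7012),  # delta-hat, (2.5, 4)
]


def mse_from_parts(bias, variance):
    """MSE implied by the identity MSE = variance + bias^2."""
    return variance + bias ** 2


def max_gap(rows):
    """Largest absolute difference between the reported and implied MSE."""
    return max(abs(mse - mse_from_parts(bias, var)) for bias, mse, var in rows)


if __name__ == "__main__":
    print(f"largest gap between reported and implied MSE: {max_gap(TABLE1):.5f}")
```

For these rows the gap stays within what four-decimal rounding allows, so the reported MSE, bias, and simulated variance columns are mutually consistent under the stated assumption.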
Table 2. Average estimates, biases, MSEs, and simulated variances of the simulated annealing estimators λ̃ and δ̃ for a sample size of n = 5.

| λ | δ | λ̃ | δ̃ | Bias(λ̃) | Bias(δ̃) | MSE(λ̃) | MSE(δ̃) | VarSim(λ̃) | VarSim(δ̃) |
|---|---|---|---|---|---|---|---|---|---|
| 1.5 | 2 | 3.1557 | 14.6394 | 1.6557 | 12.6394 | 16.4554 | 231.1034 | 13.7139 | 71.3496 |
| 1.5 | 3 | 2.0213 | 15.5284 | 0.5213 | 12.5284 | 2.1113 | 225.2154 | 1.8395 | 68.2533 |
| 1.5 | 4 | 1.9652 | 15.0794 | 0.4652 | 11.0794 | 1.4879 | 188.2035 | 1.2715 | 65.4498 |
| 2.5 | 2 | 5.6882 | 14.0020 | 3.1882 | 12.0020 | 36.7693 | 213.4220 | 26.6046 | 69.3742 |
| 2.5 | 3 | 3.9382 | 14.6874 | 1.4382 | 11.6874 | 8.9652 | 220.4475 | 6.8969 | 83.8532 |
| 2.5 | 4 | 3.7100 | 13.9510 | 1.2100 | 9.9510 | 5.6971 | 180.6055 | 4.2331 | 81.5837 |
Table 3. Average estimates, biases, MSEs, and simulated variances of the EM estimators λ̂ and δ̂ for a sample size of n = 10.

| λ | δ | λ̂ | δ̂ | Bias(λ̂) | Bias(δ̂) | MSE(λ̂) | MSE(δ̂) | VarSim(λ̂) | VarSim(δ̂) |
|---|---|---|---|---|---|---|---|---|---|
| 1.5 | 2 | 1.5660 | 2.2744 | 0.0660 | 0.2744 | 0.0973 | 0.2963 | 0.0929 | 0.2210 |
| 1.5 | 3 | 1.5444 | 3.4037 | 0.0444 | 0.4037 | 0.0381 | 0.9369 | 0.0361 | 0.7740 |
| 1.5 | 4 | 1.5279 | 4.6493 | 0.0279 | 0.6493 | 0.0197 | 2.3606 | 0.0190 | 1.9390 |
| 2.5 | 2 | 2.6520 | 2.3669 | 0.1520 | 0.3669 | 0.3017 | 0.4242 | 0.2786 | 0.2896 |
| 2.5 | 3 | 2.5847 | 3.3550 | 0.0847 | 0.3550 | 0.1212 | 0.9211 | 0.1141 | 0.7950 |
| 2.5 | 4 | 2.5446 | 4.5805 | 0.0446 | 0.5805 | 0.0539 | 1.6688 | 0.0519 | 1.3318 |
Table 4. Average estimates, biases, MSEs, and simulated variances of the simulated annealing estimators λ̃ and δ̃ for a sample size of n = 10.

| λ | δ | λ̃ | δ̃ | Bias(λ̃) | Bias(δ̃) | MSE(λ̃) | MSE(δ̃) | VarSim(λ̃) | VarSim(δ̃) |
|---|---|---|---|---|---|---|---|---|---|
| 1.5 | 2 | 3.2963 | 16.4428 | 1.7963 | 14.4428 | 18.4601 | 273.3670 | 15.2334 | 64.7724 |
| 1.5 | 3 | 2.2299 | 16.4617 | 0.7299 | 13.4617 | 5.0049 | 240.0290 | 4.4722 | 58.8125 |
| 1.5 | 4 | 1.9610 | 17.9110 | 0.4610 | 13.9110 | 2.0165 | 248.4057 | 1.8039 | 54.8905 |
| 2.5 | 2 | 7.4968 | 15.4376 | 4.9968 | 13.4376 | 75.0960 | 254.3734 | 50.1281 | 73.8031 |
| 2.5 | 3 | 5.3225 | 16.0267 | 2.8225 | 13.0267 | 28.4780 | 252.5156 | 20.5112 | 82.8207 |
| 2.5 | 4 | 4.3101 | 15.7719 | 1.8101 | 11.7719 | 12.9169 | 227.9040 | 9.6403 | 89.3260 |
Table 5. Average estimates, biases, MSEs, and simulated variances of the EM estimators λ̂ and δ̂ for a sample size of n = 30.

| λ | δ | λ̂ | δ̂ | Bias(λ̂) | Bias(δ̂) | MSE(λ̂) | MSE(δ̂) | VarSim(λ̂) | VarSim(δ̂) |
|---|---|---|---|---|---|---|---|---|---|
| 1.5 | 2 | 1.5302 | 2.1075 | 0.0302 | 0.1075 | 0.0302 | 0.0595 | 0.0293 | 0.0479 |
| 1.5 | 3 | 1.5117 | 3.1312 | 0.0117 | 0.1312 | 0.0108 | 0.1637 | 0.0106 | 0.1465 |
| 1.5 | 4 | 1.5042 | 4.1778 | 0.0042 | 0.1778 | 0.0058 | 0.3782 | 0.0057 | 0.3466 |
| 2.5 | 2 | 2.5583 | 2.1745 | 0.0583 | 0.1745 | 0.0856 | 0.0743 | 0.0822 | 0.0439 |
| 2.5 | 3 | 2.5240 | 3.1078 | 0.0240 | 0.1078 | 0.0309 | 0.1500 | 0.0303 | 0.1384 |
| 2.5 | 4 | 2.5174 | 4.1438 | 0.0174 | 0.1438 | 0.0158 | 0.3076 | 0.0155 | 0.2869 |
Table 6. Average estimates, biases, MSEs, and simulated variances of the simulated annealing estimators λ̃ and δ̃ for a sample size of n = 30.

| λ | δ | λ̃ | δ̃ | Bias(λ̃) | Bias(δ̃) | MSE(λ̃) | MSE(δ̃) | VarSim(λ̃) | VarSim(δ̃) |
|---|---|---|---|---|---|---|---|---|---|
| 1.5 | 2 | 5.0867 | 16.3140 | 3.5867 | 14.3140 | 45.6847 | 263.9548 | 32.8206 | 59.0629 |
| 1.5 | 3 | 3.6042 | 17.4744 | 2.1042 | 14.4744 | 22.9393 | 272.4161 | 18.5115 | 62.9077 |
| 1.5 | 4 | 2.6454 | 18.8461 | 1.1454 | 14.8461 | 8.7214 | 286.8080 | 7.4095 | 66.4026 |
| 2.5 | 2 | 13.3191 | 16.1026 | 10.8191 | 14.1026 | 213.7684 | 278.4768 | 96.7157 | 79.5939 |
| 2.5 | 3 | 9.2326 | 15.6701 | 6.7326 | 12.6701 | 105.1613 | 244.4895 | 59.8331 | 83.9577 |
| 2.5 | 4 | 7.6995 | 14.3182 | 5.1995 | 10.3182 | 64.7645 | 201.8541 | 37.7293 | 95.3889 |
Table 7. Average estimates, biases, MSEs, and simulated variances of the EM estimators λ̂ and δ̂ for a sample size of n = 50.

| λ | δ | λ̂ | δ̂ | Bias(λ̂) | Bias(δ̂) | MSE(λ̂) | MSE(δ̂) | VarSim(λ̂) | VarSim(δ̂) |
|---|---|---|---|---|---|---|---|---|---|
| 1.5 | 2 | 1.5185 | 2.0694 | 0.0185 | 0.0694 | 0.0167 | 0.0293 | 0.0164 | 0.0245 |
| 1.5 | 3 | 1.5071 | 3.0634 | 0.0071 | 0.0634 | 0.0066 | 0.0885 | 0.0065 | 0.0845 |
| 1.5 | 4 | 1.5038 | 4.0746 | 0.0038 | 0.0746 | 0.0033 | 0.1818 | 0.0033 | 0.1763 |
| 2.5 | 2 | 2.6443 | 2.1385 | 0.1443 | 0.1385 | 2.6075 | 0.0505 | 2.5867 | 0.0313 |
| 2.5 | 3 | 2.5367 | 3.0881 | 0.0367 | 0.0881 | 0.0511 | 0.0968 | 0.0497 | 0.0890 |
| 2.5 | 4 | 2.5067 | 4.0758 | 0.0067 | 0.0758 | 0.0087 | 0.1513 | 0.0087 | 0.1456 |
Table 8. Average estimates, biases, MSEs, and simulated variances of the simulated annealing estimators λ̃ and δ̃ for a sample size of n = 50.

| λ | δ | λ̃ | δ̃ | Bias(λ̃) | Bias(δ̃) | MSE(λ̃) | MSE(δ̃) | VarSim(λ̃) | VarSim(δ̃) |
|---|---|---|---|---|---|---|---|---|---|
| 1.5 | 2 | 7.9441 | 16.3235 | 6.4441 | 14.3235 | 109.6222 | 273.1470 | 68.0953 | 67.9839 |
| 1.5 | 3 | 5.0498 | 17.6010 | 3.5498 | 14.6010 | 44.2769 | 290.7988 | 31.6761 | 77.6082 |
| 1.5 | 4 | 4.2309 | 17.1473 | 2.7309 | 13.1473 | 28.2579 | 254.7019 | 20.8003 | 81.8496 |
| 2.5 | 2 | 19.1666 | 15.4000 | 16.6666 | 13.4000 | 397.1666 | 255.1433 | 119.3912 | 75.5836 |
| 2.5 | 3 | 14.1514 | 13.6071 | 11.6514 | 10.6071 | 224.4617 | 202.7200 | 88.7069 | 90.2098 |
| 2.5 | 4 | 12.3039 | 12.5652 | 9.8039 | 8.5652 | 168.4808 | 178.4291 | 72.3640 | 105.0656 |
Table 9. Maximum likelihood estimates of the model parameters and AIC statistics for the remission times of patients who received a placebo.

| Fitted distribution | λ | δ | p | AIC |
|---|---|---|---|---|
| Fréchet distribution | 15.50508 | 12.18451 | - | 5.58502 |
| Length-biased Fréchet distribution | 30.18082 | 1.5 | - | 11.4393 |
| NMF distribution | 2.050587 | 3.5 | - | 2.951101 |
| Mixture of Nakagami distribution | 1.773775 | 1.452312 | 0.7 | 4.524097 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Phaphan, W.; Abdullahi, I.; Puttamat, W. Properties and Maximum Likelihood Estimation of the Novel Mixture of Fréchet Distribution. Symmetry 2023, 15, 1380. https://0-doi-org.brum.beds.ac.uk/10.3390/sym15071380
