Article

Truncated Power-Normal Distribution with Application to Non-Negative Measurements

by Nabor O. Castillo 1, Diego I. Gallardo 2, Heleno Bolfarine 3 and Héctor W. Gómez 4,*

1 Departamento de Matemáticas, Facultad de Ciencias, Universidad de La Serena, La Serena 1700000, Chile
2 Departamento de Matemática, Facultad de Ingeniería, Universidad de Atacama, Copiapó 1530000, Chile
3 Departamento de Estatística, Instituto de Matemática e Estatística (IME), Universidade de São Paulo, São Paulo 01000-000, Brazil
4 Departamento de Matemáticas, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
* Author to whom correspondence should be addressed.
Submission received: 23 March 2018 / Revised: 18 May 2018 / Accepted: 28 May 2018 / Published: 5 June 2018

Abstract

This paper focuses on studying a truncated positive version of the power-normal (PN) model considered in Durrans (1992). The truncation point is considered to be zero so that the resulting model is an extension of the half normal distribution. Some probabilistic properties are studied for the proposed model along with maximum likelihood and moments estimation. The model is fitted to two real datasets and compared with alternative models for positive data. Results indicate good performance of the proposed model.

1. Introduction

Lehmann [1] proposed a class of asymmetric distributions. The cumulative distribution function (cdf) for such class is given by:
$$F_F(z;\alpha) = \{F(z)\}^{\alpha}, \quad z \in \mathbb{R}, \tag{1}$$
where $F$ is itself a cumulative distribution function and $\alpha \in \mathbb{Q}$, with $\mathbb{Q}$ the set of rational numbers. In the special case where $\alpha$ is a positive integer, the above cdf corresponds to the distribution of the maximum in a sample of size $\alpha$.
Durrans [2] gives an interpretation of (1) in the more general case $\alpha \in \mathbb{R}^{+}$, based on fractional order statistics. Assume that $F$ is absolutely continuous and that $f$ denotes its probability density function (pdf), i.e., $f = dF$. The pdf related to (1) is:
$$f_F(z;\alpha) = \alpha f(z)\{F(z)\}^{\alpha-1}, \quad z \in \mathbb{R},\ \alpha \in \mathbb{R}^{+}. \tag{2}$$
Henceforth, we refer to a random variable with pdf as in (2) as the power distribution ($PF$), and we use the notation $Z \sim PF(\alpha)$. The particular case where $F = \Phi(\cdot)$, the cdf of the standard normal model, was approached in [2]. In this case, the pdf of the model reduces to:
$$f_{\Phi}(z;\alpha) = \alpha \phi(z)\{\Phi(z)\}^{\alpha-1}, \quad z \in \mathbb{R},\ \alpha \in \mathbb{R}^{+}, \tag{3}$$
where $\phi(\cdot)$ is the standard normal pdf. The author used the term generalized Gaussian distribution to refer to the model in Equation (3). This model was also studied in more detail in [3]. Pewsey et al. [4] call Model (3) the power-normal (PN) model, denoted $Z \sim PN(\alpha)$, and show that the Fisher information matrix (FIM) of its location-scale extension is nonsingular for $\alpha = 1$ (i.e., the symmetric case).
The generalization of the normal distribution in (3) is also a particular case of the Beta-normal model discussed in [5].
On the other hand, the random variable $X$ follows a half-normal distribution with scale parameter $\sigma$ if its pdf is given by:
$$f_{HN}(x;\sigma) = \frac{2}{\sigma}\,\phi\!\left(\frac{x}{\sigma}\right) I\{x > 0\},$$
for $\sigma > 0$. We denote $X \sim HN(\sigma)$. Cooray and Ananda [6] extended the half-normal (HN) model by introducing the generalized half-normal (GHN) model; that is, $X$ is a random variable with the GHN distribution with scale parameter $\sigma$ and shape parameter $\alpha$ if its pdf is given by:
$$f_{GHN}(x;\sigma,\alpha) = \sqrt{\frac{2}{\pi}}\,\frac{\alpha}{x}\left(\frac{x}{\sigma}\right)^{\alpha} \exp\left\{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^{2\alpha}\right\} I\{x > 0\}, \quad \sigma > 0,\ \alpha > 0.$$
We use the notation $X \sim GHN(\sigma,\alpha)$. Observe that $GHN(\sigma,\alpha = 1) \equiv HN(\sigma)$; that is, one obtains the half-normal model with scale parameter $\sigma > 0$.
Some properties of the GHN distribution are:
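  • $H(x;\sigma,\alpha) = 2\Phi\!\left(\left(\frac{x}{\sigma}\right)^{\alpha}\right) - 1$
  • $E(X) = \sqrt{\frac{2^{1/\alpha}}{\pi}}\;\Gamma\!\left(\frac{1+\alpha}{2\alpha}\right)\sigma$
  • $Var(X) = \frac{2^{1/\alpha}}{\pi}\left[\sqrt{\pi}\,\Gamma\!\left(\frac{2+\alpha}{2\alpha}\right) - \Gamma^{2}\!\left(\frac{1+\alpha}{2\alpha}\right)\right]\sigma^{2}$
  • $E(X^{r}) = \sqrt{\frac{2^{r/\alpha}}{\pi}}\;\Gamma\!\left(\frac{r+\alpha}{2\alpha}\right)\sigma^{r}$, for $r = 1, 2, \ldots$,
where $H(\cdot)$ is the cdf of $X$ and $\Gamma(\cdot)$ is the gamma function. The proofs of these properties are presented in [6]. Recent extensions of the HN model are considered in [7,8], among others.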
The recent literature has experienced a growth in the theory and applications of the continuous truncated models. Among others, we refer the reader to [9,10,11,12,13,14,15,16].
The main focus of this paper is to study the positive truncation for the model considered in (3), where the normalizing constant for the pdf (3) is to be determined, and the resulting model is an extension of the half-normal distribution. That is, we generate a more flexible extension of the half-normal distribution that we call the truncated positive power-normal (TPN) distribution, where the asymmetry parameter α is a shape parameter. Given its flexibility, the model is quite useful for fitting positive data related to survival analysis and reliability.
The paper is organized as follows. In Section 2, we present the TPN distribution. Some basic properties such as the quantile function, the risk function and some moments are considered, and Shannon entropy is studied. In Section 3, we discuss some inferential aspects such as the log-likelihood function and its maximization, the corresponding Fisher information matrix (FIM) and the method of moments estimation. Section 4 deals with an extension of the TPN model and presents results for a small-scale simulation study, indicating good parameter recovery. Results of using the proposed model in two real applications are reported in Section 5. The main conclusion is that the TPN model can be a viable alternative for adjusting positive data.

2. The Truncated Positive PN Distribution

In this section, we present the pdf of the TPN model, some of its basic properties, moments and asymmetry and kurtosis coefficients.

2.1. The Probability Density Function

Proposition 1.
A random variable $Z$ has a TPN distribution with parameters $\sigma$ and $\alpha$, denoted $Z \sim TPN(\sigma,\alpha)$, if its pdf is given by:
$$f_Z(z;\sigma,\alpha) = \frac{2^{\alpha}\alpha}{(2^{\alpha}-1)\,\sigma}\,\phi\!\left(\frac{z}{\sigma}\right)\left\{\Phi\!\left(\frac{z}{\sigma}\right)\right\}^{\alpha-1} I\{z > 0\}, \quad \sigma, \alpha \in \mathbb{R}^{+}.$$
Proof. 
Under the assumption that $X \sim PN(\sigma,\alpha)$, the pdf of the TPN model follows after computing the conditional distribution of $Z = X \mid X > 0$, concluding the proof. ☐
Remark 1.
For $\alpha \in \mathbb{N} = \{1, 2, \ldots\}$, the TPN model admits the following stochastic representation. If $X_1, \ldots, X_{\alpha}$ are independent and identically distributed (iid) random variables with common distribution $N(0,\sigma^{2})$, then:
$$Z = \max(0, X_1, \ldots, X_{\alpha}) \sim TPN(\sigma,\alpha).$$
Its distribution function is given by:
$$F_Z(z;\sigma,\alpha) = \frac{2^{\alpha}}{2^{\alpha}-1}\left[\Phi^{\alpha}\!\left(\frac{z}{\sigma}\right) - \frac{1}{2^{\alpha}}\right], \quad z > 0.$$
For $\sigma = 1$ and varying $\alpha$, Figure 1 depicts examples of the pdf for the TPN model.
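As a quick numerical check of the density and distribution function above, the following Python sketch evaluates both using only the standard library. The function names (`tpn_pdf`, `tpn_cdf`, `half_normal_pdf`) are illustrative choices and do not come from the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf, cdf and inv_cdf


def tpn_pdf(z, sigma, alpha):
    """Density of TPN(sigma, alpha); zero outside z > 0."""
    if z <= 0:
        return 0.0
    norm_const = alpha * 2.0**alpha / ((2.0**alpha - 1.0) * sigma)
    t = z / sigma
    return norm_const * STD_NORMAL.pdf(t) * STD_NORMAL.cdf(t) ** (alpha - 1.0)


def tpn_cdf(z, sigma, alpha):
    """Distribution function of TPN(sigma, alpha)."""
    if z <= 0:
        return 0.0
    scale = 2.0**alpha / (2.0**alpha - 1.0)
    return scale * (STD_NORMAL.cdf(z / sigma) ** alpha - 2.0 ** (-alpha))


def half_normal_pdf(z, sigma):
    """Half-normal density, used only to check the alpha = 1 case."""
    return 2.0 / sigma * STD_NORMAL.pdf(z / sigma) if z > 0 else 0.0
```

With $\alpha = 1$, `tpn_pdf` should coincide with `half_normal_pdf`, and a finite-difference derivative of `tpn_cdf` should reproduce `tpn_pdf`.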

2.2. Properties

2.2.1. Quantile Function

Simple algebraic manipulations yield:
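$$Q(p) = \sigma\,\Phi^{-1}\!\left(\left[\frac{p\,(2^{\alpha}-1) + 1}{2^{\alpha}}\right]^{1/\alpha}\right),$$
for a probability $0 < p < 1$. The quartiles are, consequently:
  • First quartile $= \sigma\,\Phi^{-1}\!\left(\left[\frac{2^{\alpha}+3}{2^{\alpha+2}}\right]^{1/\alpha}\right)$
  • Median($Z$) $= \sigma\,\Phi^{-1}\!\left(\left[\frac{2^{\alpha}+1}{2^{\alpha+1}}\right]^{1/\alpha}\right)$
  • Third quartile $= \sigma\,\Phi^{-1}\!\left(\left[\frac{3(2^{\alpha}-1)+4}{2^{\alpha+2}}\right]^{1/\alpha}\right)$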
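The quantile function can be checked by confirming that it inverts the distribution function. The sketch below is one way to do so in Python; `tpn_quantile` and `tpn_cdf` are hypothetical helper names, and `statistics.NormalDist.inv_cdf` stands in for $\Phi^{-1}$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def tpn_cdf(z, sigma, alpha):
    """Distribution function of TPN(sigma, alpha)."""
    if z <= 0:
        return 0.0
    scale = 2.0**alpha / (2.0**alpha - 1.0)
    return scale * (STD_NORMAL.cdf(z / sigma) ** alpha - 2.0 ** (-alpha))


def tpn_quantile(p, sigma, alpha):
    """Quantile Q(p) of TPN(sigma, alpha) for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Argument of the inverse normal cdf; it always lies in (1/2, 1).
    u = ((p * (2.0**alpha - 1.0) + 1.0) / 2.0**alpha) ** (1.0 / alpha)
    return sigma * STD_NORMAL.inv_cdf(u)


# Quartiles for an arbitrary, purely illustrative parameter choice.
quartiles = [tpn_quantile(p, 2.0, 3.5) for p in (0.25, 0.5, 0.75)]
```

For $\alpha = 1$ and $\sigma = 1$ the median reduces to $\Phi^{-1}(0.75)$, the half-normal median.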

2.2.2. Hazard Rate Function

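The hazard rate function for the random variable $Z \sim TPN(\sigma,\alpha)$ is given by:
$$h(z) = \frac{f_Z(z)}{1 - F_Z(z)} = \frac{\alpha\,\phi\!\left(\frac{z}{\sigma}\right)\Phi^{\alpha-1}\!\left(\frac{z}{\sigma}\right)}{\sigma\left[1 - \Phi^{\alpha}\!\left(\frac{z}{\sigma}\right)\right]}.$$
Remark 2.
(i) If $\alpha = 1$, then $h(z)$ is the hazard function of the half-normal model, $z \in \mathbb{R}^{+}$.
(ii) $\forall\, \sigma, \alpha, z \in \mathbb{R}^{+}$, $h(z)$ is monotonically increasing, with $h(0) = \sqrt{\frac{2}{\pi}}\,\frac{\alpha}{\sigma(2^{\alpha}-1)}$.
(iii) $\forall\, \sigma, \alpha$, $h(z) \to \infty$ as $z \to \infty$.
For $\sigma = 1$ and varying $\alpha$, Figure 2 depicts examples of the hazard rate function for the TPN model.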
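The closed-form hazard rate and the value $h(0)$ in Remark 2 can be compared numerically. The following Python sketch does this under the formulas above; `tpn_hazard` and `hazard_at_zero` are illustrative names, not functions from the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def tpn_hazard(z, sigma, alpha):
    """Hazard rate of TPN(sigma, alpha) at z >= 0.

    Note: 1 - Phi**alpha loses precision far in the right tail, so this
    direct formula is only meant for moderate values of z / sigma.
    """
    t = z / sigma
    big_phi = STD_NORMAL.cdf(t)
    numerator = alpha * STD_NORMAL.pdf(t) * big_phi ** (alpha - 1.0)
    return numerator / (sigma * (1.0 - big_phi**alpha))


def hazard_at_zero(sigma, alpha):
    """Closed-form h(0) from Remark 2 (ii)."""
    return math.sqrt(2.0 / math.pi) * alpha / (sigma * (2.0**alpha - 1.0))


# Hazard evaluated on a coarse grid for one illustrative parameter choice.
grid_values = [tpn_hazard(z, 1.0, 2.0) for z in (0.0, 1.0, 2.0, 3.0)]
```

On this grid the values increase with $z$, in line with Remark 2 (ii).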

2.3. Moments

Proposition 2.
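If $Z \sim TPN(\sigma,\alpha)$, then the $r$-th moment of $Z$ is:
$$\mu_r = E(Z^{r}) = \frac{\alpha\,2^{\alpha}\sigma^{r}}{2^{\alpha}-1}\,d_r(\alpha), \quad r = 1, 2, \ldots$$
where $d_r(\alpha) = \int_{1/2}^{1} \left[\Phi^{-1}(u)\right]^{r} u^{\alpha-1}\,du$ has to be computed numerically.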
Proof. 
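Making the change of variable $u = \Phi\!\left(\frac{z}{\sigma}\right)$, we obtain:
$$E(Z^{r}) = \int_{0}^{\infty} \frac{2^{\alpha}\alpha\,z^{r}}{(2^{\alpha}-1)\,\sigma}\,\phi\!\left(\frac{z}{\sigma}\right)\left\{\Phi\!\left(\frac{z}{\sigma}\right)\right\}^{\alpha-1} dz = \int_{1/2}^{1} \frac{2^{\alpha}\alpha\,\sigma^{r}}{2^{\alpha}-1}\left[\Phi^{-1}(u)\right]^{r} u^{\alpha-1}\,du.$$
 ☐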
Corollary 1.
Therefore, the first four moments are given by:
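(a) $\mu_1 = E(Z) = \frac{\alpha\,2^{\alpha}\sigma}{2^{\alpha}-1}\,d_1(\alpha)$.
(b) $\mu_2 = E(Z^{2}) = \frac{\alpha\,2^{\alpha}\sigma^{2}}{2^{\alpha}-1}\,d_2(\alpha)$.
(c) $\mu_3 = E(Z^{3}) = \frac{\alpha\,2^{\alpha}\sigma^{3}}{2^{\alpha}-1}\,d_3(\alpha)$.
(d) $\mu_4 = E(Z^{4}) = \frac{\alpha\,2^{\alpha}\sigma^{4}}{2^{\alpha}-1}\,d_4(\alpha)$.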
Corollary 2.
Asymmetry and kurtosis coefficients are given, respectively, by:
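$$\beta_1 = \frac{(2^{\alpha}-1)^{2} d_3(\alpha) - 3\alpha(2^{\alpha}-1)2^{\alpha} d_1(\alpha) d_2(\alpha) + \alpha^{2} 2^{2\alpha+1} d_1(\alpha)^{3}}{\sqrt{\alpha 2^{\alpha}}\left[(2^{\alpha}-1) d_2(\alpha) - \alpha 2^{\alpha} d_1(\alpha)^{2}\right]^{3/2}}$$
and:
$$\beta_2 = \frac{(2^{\alpha}-1)^{3} d_4(\alpha) - \alpha(2^{\alpha}-1)^{2} 2^{\alpha+2} d_1(\alpha) d_3(\alpha) + 3\alpha^{2} 2^{2\alpha+1}(2^{\alpha}-1) d_1(\alpha)^{2} d_2(\alpha) - 3\alpha^{3} 2^{3\alpha} d_1(\alpha)^{4}}{\alpha 2^{\alpha}\left[(2^{\alpha}-1) d_2(\alpha) - \alpha 2^{\alpha} d_1(\alpha)^{2}\right]^{2}}.$$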
Remark 3.
If $\alpha = 1$, the asymmetry and kurtosis coefficients take the values 0.99527 and 3.86918, respectively, which correspond to those of the classical HN distribution. Figure 3 depicts plots of the asymmetry and kurtosis coefficients of the HN and TPN distributions.
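Because $d_r(\alpha)$ has no closed form, the moments must be evaluated numerically. The sketch below is one possible approach in Python: substituting $u = \Phi(z)$ rewrites $d_r(\alpha)$ as $\int_0^{\infty} z^{r}\phi(z)\Phi(z)^{\alpha-1}dz$, which is then approximated with composite Simpson's rule. The truncation point `upper`, the grid size `n`, and all function names are assumptions made for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def d_r(r, alpha, upper=12.0, n=2400):
    """Approximate d_r(alpha) with composite Simpson's rule (n even).

    Uses the equivalent integral of z**r * phi(z) * Phi(z)**(alpha - 1)
    over (0, upper); phi is negligible beyond upper = 12.
    """
    def integrand(z):
        return z**r * STD_NORMAL.pdf(z) * STD_NORMAL.cdf(z) ** (alpha - 1.0)

    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def tpn_moment(r, sigma, alpha):
    """r-th raw moment of TPN(sigma, alpha) from Proposition 2."""
    return alpha * 2.0**alpha * sigma**r / (2.0**alpha - 1.0) * d_r(r, alpha)


def tpn_skewness_kurtosis(alpha):
    """Asymmetry and kurtosis coefficients (they do not depend on sigma)."""
    m1, m2, m3, m4 = (tpn_moment(r, 1.0, alpha) for r in (1, 2, 3, 4))
    var = m2 - m1**2
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1**3) / var**1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m1**2 * m2 - 3.0 * m1**4) / var**2
    return skew, kurt


skew_hn, kurt_hn = tpn_skewness_kurtosis(1.0)
```

For $\alpha = 1$ the two coefficients should match the half-normal values quoted in Remark 3.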

2.4. Shannon Entropy

Shannon entropy (see [17]) measures the amount of uncertainty for a random variable Z. It is defined as:
$$S(Z) = -E\left(\log f_Z(Z)\right).$$
Therefore, it can be verified that the Shannon entropy for the TPN model is:
$$S(Z) = 1 - \frac{1}{\alpha} + \log\frac{\sigma}{\alpha} + \log(2^{\alpha}-1) + \frac{1}{2}\log 2\pi + \frac{\alpha\,2^{\alpha-1} d_2(\alpha)}{2^{\alpha}-1} - \left(\alpha + \frac{\alpha-1}{2^{\alpha}-1}\right)\log(2).$$
Figure 4 shows the Shannon entropy for the TPN model with $\sigma = 1$ fixed. Note that for a fixed $\sigma$, $S(Z)$ is maximized at $\alpha \approx 5.4962$.
Remark 4.
(i) From Figure 4 and for a fixed $\sigma$, we conclude that $S(Z) \leq S_N(Z)$, $\forall\, \alpha > 0$, where $S_N(Z)$ denotes the Shannon entropy for the $N(0,\sigma^{2})$ distribution.
(ii) For $\alpha = 1$, it follows that $d_2(1) = 1/2$, and $S(Z)$ agrees with the entropy of the half-normal distribution (see [18]), which is given by:
$$S_{HN}(Z) = \frac{1}{2} + \frac{1}{2}\log\frac{\pi\sigma^{2}}{2}.$$
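The closed-form entropy can be cross-checked against a direct numerical evaluation of $-\int f_Z \log f_Z$. The Python sketch below does this with Simpson's rule; the helper names, the integration limit of 12 standard units and the grid size are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simpson(func, lower, upper, n=2400):
    """Composite Simpson's rule on [lower, upper] with n (even) intervals."""
    h = (upper - lower) / n
    total = func(lower) + func(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lower + i * h)
    return total * h / 3.0


def d_2(alpha):
    """d_2(alpha), rewritten as an integral over z via u = Phi(z)."""
    return simpson(
        lambda z: z * z * STD_NORMAL.pdf(z) * STD_NORMAL.cdf(z) ** (alpha - 1.0),
        0.0, 12.0)


def tpn_entropy(sigma, alpha):
    """Closed-form Shannon entropy of TPN(sigma, alpha)."""
    pow2 = 2.0**alpha
    return (1.0 - 1.0 / alpha + math.log(sigma / alpha) + math.log(pow2 - 1.0)
            + 0.5 * math.log(2.0 * math.pi)
            + alpha * (pow2 / 2.0) * d_2(alpha) / (pow2 - 1.0)
            - (alpha + (alpha - 1.0) / (pow2 - 1.0)) * math.log(2.0))


def tpn_entropy_numeric(sigma, alpha):
    """Entropy computed directly as minus the integral of f * log f."""
    const = alpha * 2.0**alpha / ((2.0**alpha - 1.0) * sigma)

    def integrand(z):
        t = z / sigma
        f = const * STD_NORMAL.pdf(t) * STD_NORMAL.cdf(t) ** (alpha - 1.0)
        return -f * math.log(f) if f > 0.0 else 0.0

    return simpson(integrand, 0.0, 12.0 * sigma)
```

For $\alpha = 1$, `tpn_entropy` should reduce to the half-normal entropy $\frac{1}{2} + \frac{1}{2}\log(\pi\sigma^{2}/2)$.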

2.5. Rényi Entropy

A generalization of the Shannon entropy is the Rényi entropy, which is defined as:
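$$R_p(Z) = \frac{1}{1-p}\log\int_{0}^{\infty} [f(z)]^{p}\,dz.$$
Routine calculations show that for the TPN model:
$$R_p(Z) = \log\left(\sqrt{2\pi}\,\sigma\right) + \frac{p}{1-p}\left[\alpha\log(2) + \log\alpha - \log(2^{\alpha}-1)\right] - \frac{1}{2(1-p)}\log(p) + \frac{1}{1-p}\int_{0}^{\infty} \phi(w)\left\{\Phi\!\left(\sqrt{p}\,w\right)\right\}^{p(\alpha-1)} dw.$$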
Remark 5.
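For $m = p(\alpha-1) \in \mathbb{N} = \{1, 2, \ldots\}$, $R_p(Z)$ reduces to:
$$R_p(Z) = \log\left(\sqrt{2\pi}\,\sigma\right) + \frac{p}{1-p}\left[\alpha\log(2) + \log\alpha - \log(2^{\alpha}-1)\right] - \frac{1}{2(1-p)}\log(p) + \frac{1}{1-p}\,c_m\!\left(\sqrt{p}\right),$$
where $c_m(\sqrt{p})$ is the normalization constant in the Balakrishnan skew-normal distribution ([19,20]). In the last two references, the following facts are shown:
(a) $c_1(\sqrt{p})^{-1} = \frac{1}{2}$.
(b) $c_2(\sqrt{p})^{-1} = \frac{1}{4} + \frac{1}{2\pi}\sin^{-1}\!\left(\frac{p}{1+p}\right)$.
(c) $c_3(\sqrt{p})^{-1} = \frac{1}{8} + \frac{3}{4\pi}\sin^{-1}\!\left(\frac{p}{1+p}\right)$.
(d) $\forall\, m \in \mathbb{N}$, $c_m(\sqrt{p})^{-1} \to \frac{1}{2}$, for $p \to \infty$.
(e) For $m \geq 4$, there is no closed-form expression for $c_m(\sqrt{p})^{-1}$. However, approximate values are provided in Table 1 of [21].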

2.6. Kullback–Leibler Divergence for HN and TPN Models

The Kullback–Leibler divergence ($D_{KL}(f_1, f_2)$) is a measure of how one pdf (say $f_1$) diverges from a second pdf (say $f_2$). For this reason, it can be used as a measure to decide between two alternative models for a particular dataset. As the HN model is a particular case of the TPN model (for $\alpha = 1$), we compute the Kullback–Leibler divergence between the $HN(\sigma_1)$ and $TPN(\sigma_2,\alpha)$ models, which can be shown to be given by:
$$\begin{aligned} D_{KL}(TPN, HN) &= -\int_{0}^{\infty} \log\left[\frac{f_{HN}(z;\sigma_1)}{f_{TPN}(z;\sigma_2,\alpha)}\right] f_{TPN}(z;\sigma_2,\alpha)\,dz \\ &= -\frac{1}{2}\log\frac{2}{\pi} + \log\sigma_1 + \frac{1}{2\sigma_1^{2}} E_{\sigma_2,\alpha}(Z^{2}) - S(Z) \\ &= \frac{1}{\alpha} - 1 + \log\frac{\sigma_1\alpha}{\sigma_2} + \frac{\alpha\,2^{\alpha-1} d_2(\alpha)}{2^{\alpha}-1}\left(\frac{\sigma_2^{2}}{\sigma_1^{2}} - 1\right) - \log(2^{\alpha}-1) + (\alpha-1)\left(1 + \frac{1}{2^{\alpha}-1}\right)\log(2). \end{aligned}$$
Remark 6.
As expected, $D_{KL}(TPN, HN) = 0$ if $\sigma_1 = \sigma_2$ and $\alpha = 1$.
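The closed-form divergence can be compared with a direct numerical integration of $f_{TPN}\log(f_{TPN}/f_{HN})$. The Python sketch below performs that comparison; the function names and the Simpson grid are illustrative assumptions, and `sigma1` denotes the HN scale while `sigma2` and `alpha` are the TPN parameters.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
HALF_LOG_2PI = 0.5 * math.log(2.0 * math.pi)


def simpson(func, lower, upper, n=2400):
    """Composite Simpson's rule on [lower, upper] with n (even) intervals."""
    h = (upper - lower) / n
    total = func(lower) + func(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lower + i * h)
    return total * h / 3.0


def d_2(alpha):
    """d_2(alpha), rewritten as an integral over z via u = Phi(z)."""
    return simpson(
        lambda z: z * z * STD_NORMAL.pdf(z) * STD_NORMAL.cdf(z) ** (alpha - 1.0),
        0.0, 12.0)


def kl_tpn_hn(sigma1, sigma2, alpha):
    """Closed-form D_KL(TPN(sigma2, alpha), HN(sigma1))."""
    pow2 = 2.0**alpha
    return (1.0 / alpha - 1.0 + math.log(sigma1 * alpha / sigma2)
            + alpha * (pow2 / 2.0) * d_2(alpha) / (pow2 - 1.0)
            * (sigma2**2 / sigma1**2 - 1.0)
            - math.log(pow2 - 1.0)
            + (alpha - 1.0) * (1.0 + 1.0 / (pow2 - 1.0)) * math.log(2.0))


def kl_tpn_hn_numeric(sigma1, sigma2, alpha):
    """Same divergence by integrating f_TPN * (log f_TPN - log f_HN)."""
    log_const = math.log(alpha * 2.0**alpha / ((2.0**alpha - 1.0) * sigma2))

    def integrand(z):
        t = z / sigma2
        log_tpn = (log_const - 0.5 * t * t - HALF_LOG_2PI
                   + (alpha - 1.0) * math.log(STD_NORMAL.cdf(t)))
        log_hn = (math.log(2.0 / sigma1) - HALF_LOG_2PI
                  - 0.5 * (z / sigma1) ** 2)
        return math.exp(log_tpn) * (log_tpn - log_hn)

    return simpson(integrand, 0.0, 12.0 * sigma2)
```

Working with log densities inside the integrand avoids taking the logarithm of values that underflow in the far tail.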

3. Inference

In this section, we discuss moments and maximum likelihood estimation (MLE) and FIM and present a simulation study to investigate parameter recovery.

3.1. Moments Estimation

Solving for $\sigma$ in Equation (a) from Corollary 1 and replacing $E(Z)$ by $\bar{Z}$, it follows that:
$$\sigma = \frac{(2^{\alpha}-1)\,\bar{Z}}{\alpha\,2^{\alpha} d_1(\alpha)}. \tag{8}$$
Thus, substituting $\sigma$, given in Equation (8), and the second sample moment into Equation (b) from Corollary 1, it follows that:
$$\overline{Z^{2}}\,\alpha\,2^{\alpha} d_1(\alpha)^{2} - (2^{\alpha}-1)\,\bar{Z}^{2} d_2(\alpha) = 0. \tag{9}$$
Solving Equation (9) for $\alpha$, we obtain $\hat{\alpha}_M$, and hence, replacing $\alpha$ by $\hat{\alpha}_M$ in Equation (8), one obtains $\hat{\sigma}_M$. This leads to the moment estimators $(\hat{\sigma}_M, \hat{\alpha}_M)$ of $(\sigma, \alpha)$. Equation (9) is solved numerically using the function solve available in the software MAPLE.
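The paper solves Equation (9) with MAPLE; as an alternative illustration, the sketch below solves the same equation in Python by bisection. Dividing (9) through gives $\overline{Z^{2}}/\bar{Z}^{2} = (2^{\alpha}-1)d_2(\alpha)/(\alpha 2^{\alpha} d_1(\alpha)^{2})$, and the sketch assumes the right-hand side is monotone in $\alpha$ on the search bracket (an assumption, not a result from the paper). All names and the bracket $[0.05, 100]$ are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def d_r(r, alpha, upper=12.0, n=1200):
    """Simpson approximation of d_r(alpha) after substituting u = Phi(z)."""
    def integrand(z):
        return z**r * STD_NORMAL.pdf(z) * STD_NORMAL.cdf(z) ** (alpha - 1.0)

    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def moment_ratio(alpha):
    """Population value of E(Z^2) / E(Z)^2, which is free of sigma."""
    pow2 = 2.0**alpha
    return (pow2 - 1.0) * d_r(2, alpha) / (alpha * pow2 * d_r(1, alpha) ** 2)


def moment_estimates(mean_z, mean_z2, lo=0.05, hi=100.0, iters=60):
    """Solve Equation (9) for alpha by bisection, then sigma from (8)."""
    target = mean_z2 / mean_z**2
    g_lo = moment_ratio(lo) - target
    g_hi = moment_ratio(hi) - target
    if g_lo * g_hi > 0.0:
        raise ValueError("sample moment ratio is outside the search bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = moment_ratio(mid) - target
        if g_lo * g_mid <= 0.0:
            hi = mid            # root lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # root lies in [mid, hi]
    alpha_hat = 0.5 * (lo + hi)
    pow2 = 2.0**alpha_hat
    sigma_hat = (pow2 - 1.0) * mean_z / (alpha_hat * pow2 * d_r(1, alpha_hat))
    return sigma_hat, alpha_hat
```

Feeding in the exact population moments of a known $TPN(\sigma,\alpha)$ should return those parameters, which gives a simple sanity check before applying the routine to sample moments.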

3.2. The Log-Likelihood Function

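For a random sample $Z_1, \ldots, Z_n$ from the distribution $TPN(\sigma,\alpha)$, the log-likelihood function can be written as:
$$l(\sigma,\alpha) = n\left[\log\frac{\alpha}{\sqrt{2\pi}\,\sigma} - \log\left(1 - 2^{-\alpha}\right)\right] - \frac{1}{2\sigma^{2}}\sum_{i=1}^{n} z_i^{2} + (\alpha-1)\sum_{i=1}^{n}\log\Phi\!\left(\frac{z_i}{\sigma}\right),$$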
so that the likelihood equations are given by:
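$$\frac{1}{\sigma^{3}}\sum_{i=1}^{n} z_i^{2} - \frac{\alpha-1}{\sigma^{2}}\sum_{i=1}^{n} z_i\,\frac{\phi\!\left(\frac{z_i}{\sigma}\right)}{\Phi\!\left(\frac{z_i}{\sigma}\right)} = \frac{n}{\sigma} \tag{10}$$
$$-\frac{n\log(2)}{2^{\alpha}-1} + \sum_{i=1}^{n}\log\Phi\!\left(\frac{z_i}{\sigma}\right) = -\frac{n}{\alpha} \tag{11}$$
The solution of Equations (10) and (11) can be obtained with the function optim available in R [22], using the L-BFGS-B method developed in [23], a limited-memory modification of the quasi-Newton method that allows bound-constrained optimization.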
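The likelihood equations can be checked against the log-likelihood itself. The Python sketch below evaluates $l(\sigma,\alpha)$ and its analytic score, whose two components vanish exactly when Equations (10) and (11) hold. It does not reproduce the optimization done with optim in the paper; the function names and the small data vector are made up for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def tpn_loglik(sigma, alpha, data):
    """Log-likelihood of TPN(sigma, alpha) for positive observations."""
    n = len(data)
    const = n * (math.log(alpha) - 0.5 * math.log(2.0 * math.pi)
                 - math.log(sigma) - math.log(1.0 - 2.0 ** (-alpha)))
    quad = sum(z * z for z in data) / (2.0 * sigma**2)
    log_cdf = sum(math.log(STD_NORMAL.cdf(z / sigma)) for z in data)
    return const - quad + (alpha - 1.0) * log_cdf


def tpn_score(sigma, alpha, data):
    """Partial derivatives of the log-likelihood w.r.t. sigma and alpha."""
    n = len(data)
    sum_sq = sum(z * z for z in data)
    ratio = sum(z * STD_NORMAL.pdf(z / sigma) / STD_NORMAL.cdf(z / sigma)
                for z in data)
    log_cdf = sum(math.log(STD_NORMAL.cdf(z / sigma)) for z in data)
    d_sigma = -n / sigma + sum_sq / sigma**3 - (alpha - 1.0) / sigma**2 * ratio
    d_alpha = n / alpha - n * math.log(2.0) / (2.0**alpha - 1.0) + log_cdf
    return d_sigma, d_alpha


# Hypothetical positive observations, used only to exercise the functions.
sample = [0.3, 1.1, 0.8, 2.4, 1.7, 0.5]
score_sigma, score_alpha = tpn_score(1.2, 2.5, sample)
```

A maximizer of `tpn_loglik`, found with any bound-constrained optimizer, should drive both components of `tpn_score` to zero.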

Fisher Information Matrix

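Let the random variable $Z \sim TPN(\sigma,\alpha)$. For a single observation $z$, the log-likelihood function for $\theta = (\sigma,\alpha)$ is:
$$\log f_Z(\theta;z) = \log(\alpha) - \log(\sigma) + \alpha\log(2) - \log(2^{\alpha}-1) - \log\left(\sqrt{2\pi}\right) - \frac{z^{2}}{2\sigma^{2}} + (\alpha-1)\log[\Phi(z/\sigma)].$$
The first derivatives of $\log f_Z(\theta;z)$ are:
$$\frac{\partial \log f_Z(\theta;z)}{\partial\sigma} = -\frac{1}{\sigma} + \frac{z^{2}}{\sigma^{3}} - (\alpha-1)\frac{z}{\sigma^{2}}\frac{\phi(z/\sigma)}{\Phi(z/\sigma)},$$
$$\frac{\partial \log f_Z(\theta;z)}{\partial\alpha} = \frac{1}{\alpha} + \log(2) - \frac{2^{\alpha}\log(2)}{2^{\alpha}-1} + \log[\Phi(z/\sigma)].$$
The second derivatives of $\log f_Z(\theta;z)$ are:
$$\frac{\partial^{2} \log f_Z(\theta;z)}{\partial\sigma^{2}} = \frac{1}{\sigma^{2}} - \frac{3z^{2}}{\sigma^{4}} - \frac{z(\alpha-1)}{\sigma^{3}}\left[\frac{z^{2}}{\sigma^{2}}\frac{\phi(z/\sigma)}{\Phi(z/\sigma)} + \frac{z}{\sigma}\left(\frac{\phi(z/\sigma)}{\Phi(z/\sigma)}\right)^{2} - 2\frac{\phi(z/\sigma)}{\Phi(z/\sigma)}\right],$$
$$\frac{\partial^{2} \log f_Z(\theta;z)}{\partial\alpha^{2}} = -\frac{1}{\alpha^{2}} + \frac{2^{\alpha}(\log(2))^{2}}{(2^{\alpha}-1)^{2}}, \qquad \frac{\partial^{2} \log f_Z(\theta;z)}{\partial\sigma\,\partial\alpha} = -\frac{z}{\sigma^{2}}\frac{\phi(z/\sigma)}{\Phi(z/\sigma)}.$$
It can be shown that the FIM for the TPN distribution is given by:
$$I_F(\sigma,\alpha) = \begin{pmatrix} I_{\sigma\sigma} & I_{\sigma\alpha} \\ I_{\sigma\alpha} & I_{\alpha\alpha} \end{pmatrix}$$
with the following elements:
$$I_{\sigma\sigma} = -\frac{1}{\sigma^{2}} + \frac{3}{\sigma^{4}}a_{20} + \frac{\alpha-1}{\sigma^{3}}\left[\frac{1}{\sigma^{2}}a_{31} + \frac{1}{\sigma}a_{22} - 2a_{11}\right], \qquad I_{\sigma\alpha} = \frac{1}{\sigma^{2}}a_{11}, \qquad I_{\alpha\alpha} = \frac{1}{\alpha^{2}} - \frac{2^{\alpha}(\log(2))^{2}}{(2^{\alpha}-1)^{2}},$$
where $a_{ij} = E\left[Z^{i}\left(\frac{\phi(Z/\sigma)}{\Phi(Z/\sigma)}\right)^{j}\right]$, for $i \in \{1, 2, 3\}$ and $j \in \{0, 1, 2\}$, must be computed numerically.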

3.3. Truncation at c

As the following result indicates, the truncation point for the TPN distribution can be located at any $c \geq 0$. We denote this extension by $Z \sim TPN_c$.
Proposition 3.
A random variable $Z$ follows the $TPN_c$ distribution if its pdf is given by:
$$f_Z(z;\sigma,\alpha,c) = \frac{\alpha}{\left[1 - \Phi^{\alpha}(c/\sigma)\right]\sigma}\,\phi\!\left(\frac{z}{\sigma}\right)\Phi^{\alpha-1}\!\left(\frac{z}{\sigma}\right) I\{z > c\}, \quad \sigma, \alpha \in \mathbb{R}^{+},$$
where $\phi(\cdot)$ and $\Phi(\cdot)$ denote the pdf and cdf of the standard normal distribution, respectively. We use the notation $Z \sim TPN_c(\sigma,\alpha)$.
Proof. 
Under the assumption that $X \sim PN(\sigma,\alpha)$, the pdf of the $TPN_c$ model arises after computing the conditional distribution of $Z = X \mid X > c$, concluding the proof. ☐

4. Simulation Study

In this section, we present a brief simulation study to assess the performance of the MLEs of the TPN model in finite samples. To simulate from the $TPN_c$ distribution, it is sufficient to simulate from the PN distribution, accepting only those values greater than $c$. The simulation algorithm is then:
  • Simulate $U \sim U(0,1)$, and compute $Y = \sigma\,\Phi^{-1}(U^{1/\alpha})$.
  • If $Y \geq c$, set $Z = Y$. Otherwise, go to the previous step.
The acceptance ratio is then $1 - \Phi^{\alpha}(c/\sigma)$. Hereafter, $c$ is considered known, taking the values 0, 0.5 and 1.0. Likewise, three values were chosen for each of $\alpha$ and $\sigma$, and the generated samples were of sizes $n = 30$, $n = 50$, $n = 100$ and $n = 200$. For each combination of sample size and parameter values, 1000 samples were generated and the MLEs were computed. Table 1 and Table 2 summarize the mean of the estimated parameters (mean), the mean of the estimated standard deviations (s.d.) and the root of the mean squared error ($\sqrt{MSE}$). Note that for small sample sizes (say $n = 30$ and $n = 50$) both estimators show a moderate bias, which decreases as $n$ increases. Additionally, the s.d.'s are close to the $\sqrt{MSE}$, especially as $n$ increases, suggesting that the standard deviations are well estimated even for small sample sizes.

Examples

Figure 5 depicts the model fitting for some simulated samples of size n = 200 and truncations at c = 0.5 and at c = 1 .

5. Real Data Illustration

In this section, we present two applications to illustrate the performance of the TPN model compared with other usual distributions in the literature, such as the Weibull, gamma, GHN, Birnbaum–Saunders (BS, [24,25]), β-Birnbaum–Saunders (β-BS, [26]), epsilon half-normal (EHN, [7]), power half-normal (PHN, [27]) and truncated positive normal (TN, [28]) models. Model comparison is implemented by using the AIC ([29]).

5.1. Australian Athletes

This dataset consists of several variables recorded on 202 Australian athletes and reported in [30]. Concretely, we analyze here measurements of the body mass index (BMI). Table 3 presents basic descriptive statistics for the dataset. We use the notation $b_1$ and $b_2$ to represent the sample asymmetry and kurtosis coefficients, respectively.
Using results from Section 3.1, moment estimators were computed, leading to the values $\hat{\sigma}_M = 7.644$ and $\hat{\alpha}_M = 447.867$, which were used as initial estimates for the maximum likelihood (ML) approach. In this case, we fixed two values of $c$ for the TPN model, namely $c = 0$ and $c = 16$ (a value close to the sample minimum).
Table 4 shows the parameter estimates obtained by maximum likelihood using the bbmle function in [22]. The standard errors of the MLEs are calculated using the information matrix of each model. For each model, we report the estimated log-likelihood and the corresponding AIC. The AIC scores indicate a better fit of the TPN model. On the other hand, results for $c = 0$ and $c = 16$ are similar; therefore, we chose the standard model with $c = 0$. In Figure 6, the estimated densities of the models using the ML estimates are shown with the data histogram. This also indicates a good fit for the TPN model. Figure 7 shows the q-q plots for the TPN model and the other considered models. Note that TPN is a more appropriate model than Weibull, gamma, GHN and TN for this dataset because the sample quantiles are closer to the respective theoretical quantiles. Except for the TPN distribution, all the models have serious difficulties in accommodating the right tail of the data. Finally, the estimated skewness and kurtosis coefficients for the TPN model, evaluated at the MLEs, are 0.694 and 3.731, respectively. The 95% confidence intervals (CI) for those coefficients, estimated via bootstrap (based on 10,000 bootstrap samples), are (0.167; 1.420) and (2.411; 6.841), respectively. Note that the sample versions of both coefficients are contained in the estimated CIs.

5.2. Breaking Stress of Carbon Fibers

This dataset is considered in [31] and corresponds to breaking stress of carbon fibers (BSFC), measured in Gba. Cordeiro and Lemonte [26] already analyzed these data, comparing the BS and β-BS models. We additionally compared those models with the EHN and PHN distributions. Table 5 presents basic descriptive statistics for the dataset. Note that for this dataset the sample minimum is close to zero; therefore, it seems reasonable to consider $c = 0$.
We also computed the moment estimators, resulting in $\hat{\sigma}_M = 1.604$ and $\hat{\alpha}_M = 14.505$, which were used as initial estimates for the maximum likelihood approach.
Table 6 shows the MLEs. The AIC indicates a better fit of the TPN model. In Figure 8, the densities fitted by ML are shown with the data histogram. Finally, the q-q plots in Figure 9 show that the TPN model fits the data better than the other models considered.

6. Discussion

The main focus of this paper is the study of a truncated positive version of the PN model, which yields a new extension of the HN model. The model involves two parameters and is an alternative to other models for positive data. Parameters are estimated by maximum likelihood, and a simulation study, implemented with the acceptance-rejection method for several truncation values, indicates that the estimators have good properties for small and moderate sample sizes. Applications to real data indicate that the model can outperform competing distributions.

Author Contributions

Nabor Castillo and Héctor W. Gómez conceived the model and studied some properties; Diego I. Gallardo wrote the computational programs used in the simulation studies and applications; Heleno Bolfarine included some properties and translated the paper into English. All authors have read and approved the final manuscript.

Acknowledgments

We thank the four anonymous reviewers for their comments and suggestions helping us to improve the article. The research of N.O. Castillo was supported by DIULS REGULAR PR12151 (Chile). The research of H. Bolfarine was supported by CNPq-Brasil. The research of H.W. Gómez was supported by SEMILLERO UA-2015 (Chile).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lehmann, E.L. The power of rank tests. Ann. Math. Stat. 1953, 24, 23–43.
  2. Durrans, S.R. Distributions of fractional order statistics in hydrology. Water Resour. Res. 1992, 28, 1649–1655.
  3. Gupta, D.; Gupta, R.C. Analyzing skewed data by power normal model. Test 2008, 17, 197–210.
  4. Pewsey, A.; Gómez, H.W.; Bolfarine, H. Likelihood-based inference for power distributions. Test 2012, 21, 775–789.
  5. Eugene, N.; Lee, C.; Famoye, F. Beta-normal distribution and its applications. Commun. Stat. Theory Methods 2002, 31, 497–512.
  6. Cooray, K.; Ananda, M.M.A. A Generalization of the Half-Normal Distribution with Applications to Lifetime Data. Commun. Stat. Theory Methods 2008, 37, 1323–1337.
  7. Castro, L.M.; Gómez, H.W.; Valenzuela, M. Epsilon half-normal model: Properties and inference. Comput. Stat. Data Anal. 2012, 56, 4338–4347.
  8. Olmos, N.M.; Varela, H.; Gómez, H.W.; Bolfarine, H. An extension of the half-normal distribution. Stat. Pap. 2012, 53, 875–886.
  9. Arnold, B.C. Flexible univariate and multivariate models based on hidden truncation. J. Stat. Plan. Inference 2009, 139, 3741–3749.
  10. Barranco-Chamorro, I.; Moreno-Rebollo, J.L.; Pascual-Acosta, A.; Enguix-Gonzalez, A. An overview of asymptotic properties of estimator in truncated distributions. Commun. Stat. Theory Methods 2007, 36, 2351–2366.
  11. Barr, D.R.; Sherrill, E.T. Mean and variance of truncated normal distributions. Am. Stat. 1999, 53, 357–361.
  12. Bebu, I.; Mathew, T. Confidence intervals for limited moments and truncated moments in normal and lognormal models. Stat. Probab. Lett. 2009, 79, 375–380.
  13. Chopin, N. Fast simulation of truncated Gaussian distributions. Stat. Comput. 2011, 21, 275–288.
  14. Damien, P.; Walker, S.G. Sampling truncated normal, beta, and gamma densities. J. Comput. Graph. Stat. 2001, 10, 206–215.
  15. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions, 2nd ed.; Wiley: New York, NY, USA, 1995; Volume 1.
  16. Robert, C.P. Simulation of truncated normal variables. Stat. Comput. 1995, 5, 121–125.
  17. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
  18. Ahsanullah, M.; Golam Kibria, B.M.; Shakil, M. Normal and Student’s t Distributions and Their Applications; Atlantis Press: Paris, France, 2014.
  19. Arnold, B.C.; Beaver, R.J. Skewed multivariate models related to hidden truncation and/or selective reporting (with discussion). Test 2002, 11, 7–54.
  20. Sharafi, M.; Behboodian, J. The Balakrishnan skew-normal pdf. Stat. Pap. 2008, 49, 769–778.
  21. Steck, G.P. Orthant probability for the equicorrelated multivariate normal distribution. Biometrika 1962, 49, 433–445.
  22. R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2017; ISBN 3-900051-07-0.
  23. Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208.
  24. Birnbaum, Z.W.; Saunders, S.C. A new family of life distributions. J. Appl. Probab. 1969, 6, 319–327.
  25. Birnbaum, Z.W.; Saunders, S.C. Estimation for a family of life distributions with applications to fatigue. J. Appl. Probab. 1969, 6, 328–377.
  26. Cordeiro, G.M.; Lemonte, A.J. The β-Birnbaum-Saunders distribution: An improved distribution for fatigue life modeling. Comput. Stat. Data Anal. 2011, 55, 1445–1461.
  27. Gómez, Y.M.; Bolfarine, H. Likelihood-based inference for the power half-normal distribution. J. Stat. Theory Appl. 2015, 14, 383–398.
  28. Gómez, H.J.; Olmos, N.M.; Varela, H.; Bolfarine, H. Truncated positive normal distribution. Appl. Math. J. Chin. Univ. 2018, 33, 163–176.
  29. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
  30. Cook, R.D.; Weisberg, S. An Introduction to Regression Graphics; John Wiley & Sons Inc.: New York, NY, USA, 1994.
  31. Nichols, M.D.; Padgett, W.J. A bootstrap control chart for Weibull percentiles. Qual. Reliab. Eng. Int. 2006, 22, 141–151.
Figure 1. Probability density function of $TPN(\sigma = 1, \alpha)$ for different values of $\alpha$.
Figure 2. Hazard rate function of the $TPN(\sigma = 1, \alpha)$ model for different values of $\alpha$.
Figure 3. Asymmetry (left) and kurtosis (right) coefficients for $TPN(\sigma,\alpha)$ (solid line) and half-normal (HN) ($\alpha = 1$, dotted line).
Figure 4. Shannon entropy of $TPN(\sigma = 1, \alpha)$ for different values of $\alpha$. The dashed line corresponds to the Shannon entropy for the standard normal model.
Figure 5. Examples of the estimation of the $TPN_c(\sigma = 1.0, \alpha = 1.5)$ model with the corresponding estimates. Left panel: $c = 0.5$, $\hat{\alpha} = 1.695$ and $\hat{\sigma} = 1.032$. Right panel: $c = 1.0$, $\hat{\alpha} = 1.578$ and $\hat{\sigma} = 0.959$.
Figure 6. Histogram for the BMI dataset, with lines representing the distributions fitted by MLE for different models.
Figure 7. q-q plots: TPN model (left), gamma model (center) and TN model (right).
Figure 8. Histogram for the BSFC dataset, with lines representing the distributions fitted by MLE for different models.
Figure 9. q-q plots: TPN model (left), PHN model (center) and β-BS model (right).
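Table 1. Mean of the estimated parameters (mean), mean of the estimated standard deviations (s.d.) and root of the mean squared error ($\sqrt{MSE}$) for MLEs of the $TPN_c(\sigma,\alpha)$ model (cases $n = 30$ and $n = 50$).
| c | α | σ | α̂ mean (n=30) | α̂ s.d. (n=30) | α̂ √MSE (n=30) | σ̂ mean (n=30) | σ̂ s.d. (n=30) | σ̂ √MSE (n=30) | α̂ mean (n=50) | α̂ s.d. (n=50) | α̂ √MSE (n=50) | σ̂ mean (n=50) | σ̂ s.d. (n=50) | σ̂ √MSE (n=50) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.8 | 1 | 1.235 | 1.099 | 1.221 | 0.959 | 0.164 | 0.166 | 1.304 | 1.050 | 0.987 | 0.953 | 0.136 | 0.133 |
| 0.0 | 0.8 | 2 | 1.238 | 1.097 | 1.221 | 1.923 | 0.329 | 0.328 | 1.229 | 1.018 | 0.968 | 1.923 | 0.274 | 0.268 |
| 0.0 | 0.8 | 3 | 1.244 | 1.095 | 1.234 | 2.876 | 0.491 | 0.499 | 1.137 | 0.935 | 0.964 | 2.922 | 0.406 | 0.41 |
| 0.0 | 1.0 | 1 | 1.414 | 1.145 | 1.269 | 0.965 | 0.164 | 0.169 | 1.399 | 1.052 | 0.951 | 0.964 | 0.136 | 0.13 |
| 0.0 | 1.0 | 2 | 1.420 | 1.159 | 1.262 | 1.932 | 0.329 | 0.330 | 1.352 | 1.023 | 0.971 | 1.943 | 0.273 | 0.262 |
| 0.0 | 1.0 | 3 | 1.412 | 1.156 | 1.280 | 2.901 | 0.494 | 0.502 | 1.286 | 0.960 | 0.980 | 2.934 | 0.404 | 0.405 |
| 0.0 | 1.5 | 1 | 1.857 | 1.265 | 1.353 | 0.975 | 0.163 | 0.162 | 1.770 | 1.056 | 0.996 | 0.980 | 0.132 | 0.126 |
| 0.0 | 1.5 | 2 | 1.835 | 1.261 | 1.336 | 1.955 | 0.326 | 0.328 | 1.721 | 1.043 | 0.992 | 1.972 | 0.265 | 0.255 |
| 0.0 | 1.5 | 3 | 1.838 | 1.259 | 1.348 | 2.936 | 0.490 | 0.495 | 1.695 | 1.016 | 1.030 | 2.966 | 0.396 | 0.39 |
| 0.5 | 0.8 | 1 | 2.256 | 2.383 | 3.004 | 0.945 | 0.149 | 0.157 | 2.423 | 2.421 | 2.441 | 0.932 | 0.129 | 0.131 |
| 0.5 | 0.8 | 2 | 1.595 | 1.568 | 1.866 | 1.907 | 0.314 | 0.324 | 1.687 | 1.538 | 1.497 | 1.891 | 0.265 | 0.264 |
| 0.5 | 0.8 | 3 | 1.468 | 1.394 | 1.605 | 2.863 | 0.477 | 0.490 | 1.355 | 1.219 | 1.241 | 2.897 | 0.398 | 0.406 |
| 0.5 | 1.0 | 1 | 2.329 | 2.415 | 2.996 | 0.953 | 0.151 | 0.155 | 2.537 | 2.418 | 2.451 | 0.936 | 0.128 | 0.131 |
| 0.5 | 1.0 | 2 | 1.730 | 1.607 | 1.888 | 1.921 | 0.315 | 0.325 | 1.801 | 1.544 | 1.498 | 1.901 | 0.265 | 0.262 |
| 0.5 | 1.0 | 3 | 1.615 | 1.454 | 1.634 | 2.884 | 0.482 | 0.497 | 1.507 | 1.249 | 1.261 | 2.911 | 0.398 | 0.402 |
| 0.5 | 1.5 | 1 | 2.665 | 2.553 | 2.989 | 0.965 | 0.153 | 0.154 | 2.764 | 2.388 | 2.313 | 0.953 | 0.128 | 0.122 |
| 0.5 | 1.5 | 2 | 2.163 | 1.770 | 1.965 | 1.939 | 0.318 | 0.320 | 2.088 | 1.542 | 1.468 | 1.937 | 0.262 | 0.251 |
| 0.5 | 1.5 | 3 | 2.020 | 1.570 | 1.714 | 2.922 | 0.483 | 0.480 | 1.858 | 1.290 | 1.294 | 2.949 | 0.394 | 0.385 |
| 1.0 | 0.8 | 1 | 8.602 | 10.109 | 12.768 | 0.912 | 0.146 | 0.171 | 3.867 | 4.601 | 6.288 | 0.953 | 0.112 | 0.117 |
| 1.0 | 0.8 | 2 | 2.287 | 2.389 | 3.112 | 1.891 | 0.298 | 0.317 | 2.378 | 2.381 | 2.455 | 1.868 | 0.257 | 0.263 |
| 1.0 | 0.8 | 3 | 1.733 | 1.777 | 2.139 | 2.858 | 0.465 | 0.479 | 1.846 | 1.740 | 1.765 | 2.830 | 0.392 | 0.4 |
| 1.0 | 1.0 | 1 | 8.637 | 10.050 | 12.698 | 0.914 | 0.146 | 0.169 | 4.061 | 4.658 | 6.454 | 0.956 | 0.112 | 0.118 |
| 1.0 | 1.0 | 2 | 2.364 | 2.433 | 3.008 | 1.901 | 0.301 | 0.315 | 2.449 | 2.373 | 2.369 | 1.881 | 0.257 | 0.257 |
| 1.0 | 1.0 | 3 | 1.963 | 1.866 | 2.227 | 2.866 | 0.466 | 0.481 | 1.942 | 1.743 | 1.734 | 2.855 | 0.393 | 0.393 |
| 1.0 | 1.5 | 1 | 8.611 | 9.785 | 11.740 | 0.922 | 0.147 | 0.165 | 4.376 | 4.830 | 6.260 | 0.961 | 0.114 | 0.116 |
| 1.0 | 1.5 | 2 | 2.664 | 2.565 | 2.986 | 1.932 | 0.307 | 0.313 | 2.754 | 2.374 | 2.354 | 1.907 | 0.256 | 0.247 |
| 1.0 | 1.5 | 3 | 2.300 | 1.987 | 2.252 | 2.903 | 0.471 | 0.471 | 2.245 | 1.757 | 1.689 | 2.892 | 0.390 | 0.373 |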
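Table 2. Mean of the estimated parameters (mean), mean of the estimated standard deviations (s.d.) and root of the mean squared error ($\sqrt{MSE}$) for MLEs of the $TPN_c(\sigma,\alpha)$ model (cases $n = 100$ and $n = 200$).
| c | α | σ | α̂ mean (n=100) | α̂ s.d. (n=100) | α̂ √MSE (n=100) | σ̂ mean (n=100) | σ̂ s.d. (n=100) | σ̂ √MSE (n=100) | α̂ mean (n=200) | α̂ s.d. (n=200) | α̂ √MSE (n=200) | σ̂ mean (n=200) | σ̂ s.d. (n=200) | σ̂ √MSE (n=200) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.8 | 1 | 0.967 | 0.707 | 0.678 | 0.986 | 0.102 | 0.100 | 0.857 | 0.508 | 0.511 | 0.996 | 0.074 | 0.074 |
| 0.0 | 0.8 | 2 | 0.923 | 0.675 | 0.690 | 1.979 | 0.202 | 0.199 | 0.856 | 0.508 | 0.511 | 1.99 | 0.149 | 0.147 |
| 0.0 | 0.8 | 3 | 0.930 | 0.672 | 0.687 | 2.964 | 0.301 | 0.297 | 0.850 | 0.509 | 0.505 | 2.989 | 0.223 | 0.221 |
| 0.0 | 1.0 | 1 | 1.129 | 0.719 | 0.701 | 0.991 | 0.101 | 0.100 | 1.046 | 0.519 | 0.524 | 0.998 | 0.073 | 0.073 |
| 0.0 | 1.0 | 2 | 1.107 | 0.701 | 0.712 | 1.985 | 0.200 | 0.197 | 1.040 | 0.519 | 0.522 | 1.996 | 0.146 | 0.145 |
| 0.0 | 1.0 | 3 | 1.104 | 0.705 | 0.706 | 2.976 | 0.301 | 0.297 | 1.042 | 0.518 | 0.521 | 2.994 | 0.22 | 0.218 |
| 0.0 | 1.5 | 1 | 1.591 | 0.739 | 0.735 | 0.995 | 0.096 | 0.096 | 1.535 | 0.525 | 0.529 | 0.999 | 0.069 | 0.069 |
| 0.0 | 1.5 | 2 | 1.584 | 0.735 | 0.746 | 1.990 | 0.192 | 0.191 | 1.545 | 0.525 | 0.525 | 1.994 | 0.137 | 0.138 |
| 0.0 | 1.5 | 3 | 1.566 | 0.735 | 0.738 | 2.990 | 0.289 | 0.284 | 1.532 | 0.525 | 0.53 | 2.998 | 0.206 | 0.209 |
| 0.5 | 0.8 | 1 | 1.287 | 1.252 | 1.402 | 0.979 | 0.092 | 0.092 | 1.058 | 0.931 | 0.992 | 0.989 | 0.069 | 0.068 |
| 0.5 | 0.8 | 2 | 1.057 | 0.907 | 0.961 | 1.968 | 0.193 | 0.192 | 0.915 | 0.681 | 0.705 | 1.988 | 0.144 | 0.143 |
| 0.5 | 0.8 | 3 | 1.003 | 0.816 | 0.853 | 2.957 | 0.293 | 0.290 | 0.882 | 0.622 | 0.621 | 2.984 | 0.22 | 0.215 |
| 0.5 | 1.0 | 1 | 1.435 | 1.304 | 1.424 | 0.983 | 0.093 | 0.091 | 1.202 | 0.967 | 1.020 | 0.993 | 0.069 | 0.069 |
| 0.5 | 1.0 | 2 | 1.195 | 0.944 | 0.974 | 1.980 | 0.195 | 0.190 | 1.095 | 0.714 | 0.724 | 1.993 | 0.145 | 0.142 |
| 0.5 | 1.0 | 3 | 1.178 | 0.860 | 0.883 | 2.969 | 0.295 | 0.293 | 1.068 | 0.644 | 0.641 | 2.992 | 0.219 | 0.216 |
| 0.5 | 1.5 | 1 | 1.850 | 1.432 | 1.513 | 0.991 | 0.094 | 0.093 | 1.656 | 1.058 | 1.095 | 0.997 | 0.07 | 0.069 |
| 0.5 | 1.5 | 2 | 1.639 | 1.022 | 1.041 | 1.991 | 0.194 | 0.192 | 1.560 | 0.748 | 0.755 | 1.998 | 0.141 | 0.141 |
| 0.5 | 1.5 | 3 | 1.637 | 0.921 | 0.930 | 2.984 | 0.290 | 0.287 | 1.563 | 0.666 | 0.673 | 2.994 | 0.209 | 0.209 |
| 1.0 | 0.8 | 1 | 2.469 | 2.839 | 3.670 | 0.972 | 0.084 | 0.086 | 2.445 | 2.674 | 2.689 | 0.967 | 0.068 | 0.066 |
| 1.0 | 0.8 | 2 | 1.287 | 1.265 | 1.403 | 1.960 | 0.185 | 0.183 | 1.044 | 0.938 | 0.981 | 1.980 | 0.138 | 0.134 |
| 1.0 | 0.8 | 3 | 1.102 | 1.001 | 1.067 | 2.953 | 0.285 | 0.282 | 0.963 | 0.759 | 0.778 | 2.975 | 0.213 | 0.207 |
| 1.0 | 1.0 | 1 | 2.586 | 2.954 | 3.590 | 0.974 | 0.085 | 0.086 | 2.542 | 2.683 | 2.666 | 0.971 | 0.068 | 0.066 |
| 1.0 | 1.0 | 2 | 1.439 | 1.323 | 1.434 | 1.967 | 0.186 | 0.186 | 1.224 | 0.982 | 1.015 | 1.984 | 0.139 | 0.138 |
| 1.0 | 1.0 | 3 | 1.290 | 1.064 | 1.121 | 2.959 | 0.288 | 0.283 | 1.145 | 0.796 | 0.824 | 2.979 | 0.214 | 0.213 |
| 1.0 | 1.5 | 1 | 2.962 | 3.118 | 3.723 | 0.979 | 0.086 | 0.086 | 2.810 | 2.701 | 2.610 | 0.976 | 0.069 | 0.064 |
| 1.0 | 1.5 | 2 | 1.844 | 1.429 | 1.509 | 1.982 | 0.188 | 0.186 | 1.658 | 1.063 | 1.069 | 1.993 | 0.14 | 0.136 |
| 1.0 | 1.5 | 3 | 1.704 | 1.147 | 1.178 | 2.980 | 0.289 | 0.284 | 1.597 | 0.841 | 0.848 | 2.994 | 0.211 | 0.211 |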
Table 3. Descriptive statistics.
| Dataset | n | $\bar{X}$ | S | $b_1$ | $b_2$ | min(x) | max(x) |
|---|---|---|---|---|---|---|---|
| BMI | 202 | 22.96 | 8.20 | 0.95 | 5.18 | 16.75 | 34.42 |
Table 4. Parameter estimates (with their respective standard deviations in parentheses) and AIC values for Weibull, Gamma, generalized half-normal (GHN) and TPN models.
| Estimates | Weibull | Gamma | GHN | TPN (c = 0) | TPN (c = 16) | TN |
|---|---|---|---|---|---|---|
| σ | 24.259 (0.249) | 0.339 (0.034) | 24.954 (0.283) | 7.667 (0.225) | 7.676 (0.229) | 1.050 (0.142) |
| α | 7.281 (0.340) | 67.804 (6.730) | 4.949 (0.070) | 439.234 (109.204) | 433.809 (109.878) | - |
| λ | - | - | - | - | - | 8.035 (0.406) |
| c | - | - | - | 0 (-) | 16 (-) | - |
| log-likelihood | −524.52 | −492.73 | −545.91 | −488.98 | −488.93 | −498.67 |
| AIC | 1053.04 | 989.45 | 1095.81 | 981.97 | 981.85 | 1001.33 |
Table 5. Descriptive statistics. BSFC, breaking stress of carbon fiber.
| Dataset | n | $\bar{X}$ | S | $b_1$ | $b_2$ | min(x) | max(x) |
|---|---|---|---|---|---|---|---|
| BSFC | 66 | 2.760 | 0.891 | 0.13 | 3.22 | 0.39 | 4.90 |
Table 6. Parameter estimates (with their respective standard deviations in parentheses) and AIC values for the Birnbaum–Saunders (BS), β-BS, epsilon half-normal (EHN) and TPN models.
| Estimates | BS | β-BS | EHN | TPN | PHN |
|---|---|---|---|---|---|
| σ | - | - | 2.898 (0.252) | 1.679 (0.101) | 0.570 (0.118) |
| α | 0.437 (0.038) | 1.045 (0.004) | 0.003 (0.068) | 12.470 (2.252) | 1.581 (0.913) |
| β | 2.515 (0.132) | 57.600 (0.331) | - | - | - |
| a | - | 0.193 (0.026) | - | - | - |
| b | - | 1876.732 (605.050) | - | - | - |
| log-likelihood | −100.19 | −91.35 | −118.13 | −87.21 | −89.18 |
| AIC | 204.38 | 190.71 | 240.25 | 178.42 | 182.37 |

Share and Cite

MDPI and ACS Style

Castillo, N.O.; Gallardo, D.I.; Bolfarine, H.; Gómez, H.W. Truncated Power-Normal Distribution with Application to Non-Negative Measurements. Entropy 2018, 20, 433. https://0-doi-org.brum.beds.ac.uk/10.3390/e20060433
