Article

A New Generalized t Distribution Based on a Distribution Construction Method

Faculty of Science, Beijing University of Technology, Beijing 100124, China
*
Author to whom correspondence should be addressed.
Submission received: 15 July 2021 / Revised: 19 September 2021 / Accepted: 20 September 2021 / Published: 28 September 2021
(This article belongs to the Special Issue Computational Statistics and Data Analysis)

Abstract:
In this paper, a new generalized t (new Gt) distribution based on a distribution construction approach is proposed and proved to be suitable for fitting both data with high kurtosis and data with heavy tails. The main innovation of this article consists of four parts. First of all, the main characteristics and properties of this new distribution are outlined. Secondly, we derive the explicit expression for the moments of order statistics as well as its corresponding variance–covariance matrix. Thirdly, we focus on the parameter estimation of this new Gt distribution and introduce several estimation methods, such as a modified method of moments (MMOM), a maximum likelihood estimation (MLE) using the EM algorithm, a novel iterative algorithm to acquire the MLE, and improved probability weighted moments (IPWM). Through simulation studies, it can be concluded that the IPWM estimation generally performs better than the MLE using the EM algorithm and the MMOM. The newly-proposed iterative algorithm performs better than the EM algorithm when the sample kurtosis is greater than 2.7. For the four-parameter case of the new Gt distribution, a profile maximum likelihood approach using the EM algorithm is developed to deal with the estimation problem and obtains acceptable results.

1. Introduction

The generalized t (Gt) distribution was first proposed by McDonald and Newey (1988) [1] to implement partially adaptive regression models, with the following probability density function (pdf)
$$f(x;p,q)=\frac{p}{2q^{1/p}B(1/p,q)\left[1+|x|^{p}/q\right]^{q+1/p}},\qquad p,q>0,$$
where $B(a,b)$ denotes the beta function. With the addition of the two shape parameters p and q, the shape of the distribution becomes more flexible, so that it can fit a wider range of data. Nadarajah (2008) [2] obtained the analytic expression of the cumulative distribution function (cdf) through the incomplete beta function, presented below
$$F(x;p,q)=\frac{1}{2}\left\{1+\operatorname{sign}(x)\,I_{1-\left(1+|x|^{p}/q\right)^{-1}}\left(\frac{1}{p},q\right)\right\},$$
where $I_x(a,b)=\frac{1}{B(a,b)}\int_{0}^{x}t^{a-1}(1-t)^{b-1}\,dt$ denotes the incomplete beta function.
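These two expressions are easy to sanity-check numerically. The following Python sketch (the function names are ours; SciPy is assumed) evaluates the Gt pdf and cdf; `scipy.special.betainc` is exactly the regularized incomplete beta $I_x(a,b)$, so the cdf formula translates directly.

```python
import math

from scipy.special import beta as beta_fn, betainc


def gt_pdf(x, p, q):
    # Gt density of McDonald and Newey (1988)
    return p / (2.0 * q ** (1.0 / p) * beta_fn(1.0 / p, q)
                * (1.0 + abs(x) ** p / q) ** (q + 1.0 / p))


def gt_cdf(x, p, q):
    # Gt cdf via the regularized incomplete beta (Nadarajah, 2008)
    z = 1.0 - 1.0 / (1.0 + abs(x) ** p / q)
    return 0.5 * (1.0 + math.copysign(1.0, x) * betainc(1.0 / p, q, z))
```

As an independent check, for p = 2 the Gt distribution coincides with a Student t with 2q degrees of freedom evaluated at $x\sqrt{2}$.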
Over time, the generalized t distribution has attracted special attention from scholars and has been widely applied in several research areas. Galbraith and Zhu (2010) [3] introduced a new class of asymmetric Student t distributions and illustrated their applications in financial econometrics. Harvey and Lange (2017) [4] applied the generalized t distribution in autoregressive conditional heteroscedasticity models. Furthermore, the generalized t distribution and its extensions have been involved in extensive research in the fields of robust estimation and robust statistical models, such as [5,6,7,8].
As the types of data become more and more widespread, many different generalized t distributions have been proposed and the related properties have been studied in depth. Arslan and Genç (2009) [9] obtained the skew-generalized t distribution through the scale mixture of the skew-exponential power and generalized gamma distributions. Venegas et al. (2012) [10] constructed a more flexible distribution by introducing a skew coefficient ε to fit a wider range of data. Papastathopoulos and Tawn (2013) [11] proposed an extension of the Student's t distribution that allows for a negative shape parameter. Acitas et al. (2015) [12] introduced an alpha-skew-generalized t distribution based on a new skewing procedure, and presented the corresponding properties as well as the estimation methods. Lak et al. (2019) [13] proposed a distribution with a variety of shape variations, namely, the alpha-beta-skew-generalized t distribution, which contains several well-known distributions, and illustrated its usefulness by means of a real data analysis. For the Student's t Birnbaum-Saunders distribution, we recommend [14,15].
Although there is plenty of literature on the generalized t distribution, two aspects still need improvement: (1) the range of data that previous distributions can fit is not wide enough; (2) to achieve a flexible shape, the distribution function is often very complex, imposing limitations on parameter estimation. To solve these two problems, the new Gt distribution we propose fits both data with heavier tails and data with higher kurtosis, depending on the shape parameters, and thus has a more extensive application scope than the generalized t distributions introduced before. In addition, our new Gt distribution makes many classical estimation methods suitable for its parameter estimation problem.
In this paper, we propose a new Gt distribution based on the distribution construction approach and denote it by $NGt(\mu,\sigma,\alpha,\beta)$. The main innovation of this work consists of four parts. Firstly, we investigate the main properties of the new Gt distribution, including moments, skewness coefficients, kurtosis coefficients and random number generation. Secondly, we derive explicit expressions for the moments of order statistics as well as the corresponding variance–covariance matrix through recurrence relations and distribution transformation techniques when the shape parameter $1/\beta$ is either a real non-integer or an integer. Thirdly, we establish an EM-type algorithm and a novel iterative algorithm to acquire the maximum likelihood estimates (MLEs) of the parameters of $NGt(0,\sigma,\alpha,\beta)$. Explicit analytical expressions for the asymptotic variance–covariance matrix of the MLE are also presented based on the EM algorithm. Finally, we introduce improved probability-weighted moments (IPWM) based on an order statistics transformation and obtain the corresponding algorithm to implement this novel method. For the four-parameter $NGt(\mu,\sigma,\alpha,\beta)$, we propose a profile maximum likelihood approach (PLA) using the EM algorithm to deal with the estimation problem.
The rest of this article is organized as follows. In Section 2, we investigate the basic properties of this new distribution and make a comparison with the Gt distribution introduced by McDonald and Newey (1988) [1]. In Section 3, we derive explicit expressions for moments of order statistics from this distribution. In Section 4, we focus on the variance–covariance matrix of order statistics. In Section 5, several estimation methods are considered. In Section 6, we conduct several simulation studies to illustrate the performance of the proposed estimation methods and compare the simulation results under different parameter values and different sample sizes. In Section 7, a real data analysis is performed to illustrate the feasibility of the proposed distribution. Discussions and conclusions are presented in Section 8 and proofs are given in Appendix A.

2. Basic Theorems and Properties

In this section, some basic definitions, theorems and properties for the new Gt distribution are elucidated.
The new Gt distribution contains many common distributions, such as the normal distribution, the Laplace distribution, and the Pareto distribution. With the addition of two shape parameters, the shape of the distribution becomes more flexible, which can fit data having both heavy tails and high kurtosis.
Definition 1.
Let the random variable X have the following pdf
$$f_X(x;\mu,\sigma,\beta)=\frac{\beta}{2^{1+1/\beta}\,\Gamma(1/\beta)\,\sigma}\exp\left\{-\frac{|x-\mu|^{\beta}}{2\sigma^{\beta}}\right\},\qquad x\in\mathbb{R},$$
where $\Gamma(\cdot)$ denotes the gamma function, $\mu\in\mathbb{R}$ is a location parameter, $\sigma>0$ is a scale parameter, and $\beta>0$ is a shape parameter. Then the random variable X is said to have a generalized normal distribution, denoted by $X\sim GN(\mu,\sigma,\beta)$.
The definition of the new Gt distribution (denoted by N Gt ( μ , σ , α , β ) ) is stated as follows:
Definition 2.
Let the random variable $X-\mu\sim GN(0,\sigma,\beta)$ and $Y\sim Gamma(\alpha,\beta)$. Supposing that X and Y are independent, we define $T=\mu+(X-\mu)/Y^{1/\beta}\sim NGt(\mu,\sigma,\alpha,\beta)$, where $\mu\in\mathbb{R}$ is a location parameter, $\sigma>0$ is a scale parameter, and $\alpha,\beta>0$ are two shape parameters.
Theorem 1.
The pdf of N Gt ( μ , σ , α , β ) can be expressed as follows
$$g_T(t;\mu,\sigma,\alpha,\beta)=\frac{1}{(2/\beta)^{1+1/\beta}\,B(\alpha,1/\beta)\,\sigma}\left[1+\frac{\beta|t-\mu|^{\beta}}{2\sigma^{\beta}}\right]^{-(\alpha+1/\beta)},\qquad t\in\mathbb{R}.$$
Proof. 
See Appendix A.1 (Part I).   □
As shown in the following two figures (Figure 1 and Figure 2), the two shape parameters (α and β) have a decisive effect on the shape of the density function. The density function has a heavier tail for smaller values of α and β, and a lighter tail when α and β are larger. Furthermore, the shape parameter β plays a decisive role in the overall shape of the distribution, while changing the parameter α controls the tail behavior.
Figure 3 and Figure 4 illustrate, from two perspectives, that over an appropriate parameter range the new Gt distribution is more suitable than the Gt distribution of McDonald and Newey (1988) [1] for fitting both data with high kurtosis and data with heavier tails.
It is well-known that a pdf with a steeper shape is suitable for fitting data with high kurtosis, while a pdf with a smoother shape can be used for fitting heavy-tailed data. Since the two distributions we compare are symmetric, we use the maximum value of the pdf to characterize the steepness of its shape. Let $K_1(\sigma,\alpha,\beta)$ and $K_2(\sigma,p,q)$ denote the maximum values of the pdfs of the new Gt distribution and the Gt distribution, respectively.
Since the value of σ has little effect on the comparison between $K_1(\sigma,\alpha,\beta)$ and $K_2(\sigma,p,q)$, we may assume that σ = 1. On the one hand, it is not difficult to prove that, for a fixed β, $K_1(\sigma,\alpha,\beta)$ increases monotonically with respect to α, with $\lim_{\alpha\to\infty}K_1(\sigma,\alpha,\beta)=+\infty$ and $\lim_{\alpha\to 0^{+}}K_1(\sigma,\alpha,\beta)=0$. For $K_2(\sigma,p,q)$, the intervals of the shape parameters p and q are (1, 10) and (0.1, 50), respectively; these two ranges are chosen because they include the parameter values that arise in practical applications. From Figure 3, we can see that the value of $K_2(\sigma,p,q)$ is never greater than 1. This means that only the parameter σ can be adjusted to make the pdf of the Gt distribution steeper. On the other hand, for fixed β and p, when $\alpha,q\to 0^{+}$, both $K_1(\sigma,\alpha,\beta)$ and $K_2(\sigma,p,q)$ tend to 0, but $K_1(\sigma,\alpha,\beta)$ converges faster. This means that the pdf of the new Gt distribution has a heavier tail when the parameters are within a reasonable range. The conclusion stated in the preceding paragraph then follows.
The pdfs generated under different shape parameters can be divided into four categories according to shape and tail behavior: steep in shape with a light tail, flat in shape with a light tail, steep in shape with a heavy tail, and flat in shape with a heavy tail. As depicted in Figure 4, the solid line denotes the new Gt distribution, and the dashed line represents the previous Gt distribution defined by McDonald and Newey (1988) [1]. Subfigures Figure 4a–d depict the pdfs of the Gt distribution and the new Gt distribution under the above four categories with different shape parameters, respectively. It can be seen from the figure that, if the parameter values are in a reasonable range, the shape of the new Gt distribution is more flexible and more suitable for fitting a wider range of data.
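The steepness comparison above can be checked numerically. In the sketch below (our own function names; SciPy assumed), the peak heights follow from evaluating each density at its center: $K_1=1/[(2/\beta)^{1+1/\beta}B(\alpha,1/\beta)\sigma]$ for the new Gt distribution and $K_2=p/[2q^{1/p}B(1/p,q)\sigma]$ for the Gt distribution; scanning the rectangle $p\in(1,10)$, $q\in(0.1,50)$ confirms that $K_2$ stays below 1.

```python
import numpy as np
from scipy.special import beta as beta_fn


def k_new_gt(alpha, beta, sigma=1.0):
    # peak of the new Gt pdf (Theorem 1), attained at t = mu
    return 1.0 / ((2.0 / beta) ** (1.0 + 1.0 / beta)
                  * beta_fn(alpha, 1.0 / beta) * sigma)


def k_gt(p, q, sigma=1.0):
    # peak of the McDonald-Newey Gt pdf, attained at x = 0
    return p / (2.0 * q ** (1.0 / p) * beta_fn(1.0 / p, q) * sigma)


# scan the parameter rectangle used in the text: p in (1, 10), q in (0.1, 50)
p_grid = np.linspace(1.01, 10.0, 80)
q_grid = np.linspace(0.1, 50.0, 80)
k_gt_max = max(k_gt(p, q) for p in p_grid for q in q_grid)
```

On this grid `k_gt_max` stays below 1 (it peaks around 0.56 near p = 2 with large q), whereas the new Gt peak is unbounded in α, e.g. `k_new_gt(50.0, 2.0)` is already about 4.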
Some important properties of the new Gt distribution are stated as follows:
Proposition 1.
If $Y\mid U=u\sim GN(\mu,\sigma u^{-1/\beta},\beta)$ and $U\sim Gamma(\alpha,\beta)$, then $Y\sim NGt(\mu,\sigma,\alpha,\beta)$ and $U\mid Y=y\sim Gamma\left(\alpha+\frac{1}{\beta},\ \frac{1}{\beta}+\frac{|y-\mu|^{\beta}}{2\sigma^{\beta}}\right)$.
Proof. 
See Appendix A.1 (Part I).   □
Proposition 2.
If $Y\mid U=u\sim GN\left(\mu,\sigma\left(\frac{\beta u}{2}\right)^{-1/\beta},\beta\right)$ and $U\sim\chi^{2}(2\alpha)$, then $Y\sim NGt(\mu,\sigma,\alpha,\beta)$ and $U\mid Y=y\sim Gamma\left(\alpha+\frac{1}{\beta},\ \frac{1}{2}+\frac{\beta|y-\mu|^{\beta}}{4\sigma^{\beta}}\right)$.
Proposition 3.
If $X\sim Beta\left(\frac{1}{\beta},\alpha\right)$, let $Z=\frac{X}{\beta(1-X)}$ and $Y=\mu+\sigma\,W\,(2Z)^{1/\beta}$; then $Y\sim NGt(\mu,\sigma,\alpha,\beta)$, where W is a random variable that only takes the values ±1 and satisfies $P(W=1)=P(W=-1)=\frac{1}{2}$.
From the above proposition, we can follow the following steps to generate random samples from N Gt ( μ , σ , α , β ) :
1.
Generate $X\sim Beta(1/\beta,\alpha)$ and set $Z=\frac{X}{\beta(1-X)}$.
2.
Generate $U\sim U(0,1)$; if $U\le\frac{1}{2}$, set $W=-1$, else set $W=1$.
3.
Set $Y=\mu+\sigma W(2Z)^{1/\beta}$ and return Y.
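The three steps above translate directly into code. A minimal NumPy sketch (the function name is ours):

```python
import numpy as np


def rnew_gt(n, mu, sigma, alpha, beta, rng=None):
    """Draw n samples from NGt(mu, sigma, alpha, beta) via Proposition 3."""
    rng = np.random.default_rng(rng)
    x = rng.beta(1.0 / beta, alpha, size=n)              # step 1: X ~ Beta(1/beta, alpha)
    z = x / (beta * (1.0 - x))                           # step 1: Z = X / (beta (1 - X))
    w = np.where(rng.uniform(size=n) <= 0.5, -1.0, 1.0)  # step 2: random sign W
    return mu + sigma * w * (2.0 * z) ** (1.0 / beta)    # step 3
```

With α = 3, β = 2, σ = 1 and μ = 0, the sample second moment should be close to the value 1/4 given by the even-moment formula in Proposition 4 below, which provides a quick check of the sampler.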
Proposition 4.
If $T\sim NGt(0,\sigma,\alpha,\beta)$, we have
  • the $2k$-th order moment of T is given by
$$E(T^{2k})=\left(\frac{2}{\beta}\right)^{2k/\beta}\frac{\Gamma(\alpha-2k/\beta)\,\Gamma[(2k+1)/\beta]}{\Gamma(\alpha)\,\Gamma(1/\beta)}\,\sigma^{2k},\qquad k=1,2,\ldots,$$
    and the above equation exists for $\alpha\beta>2k$. Furthermore, from the symmetry of the distribution, we can conclude that $E(T^{2k-1})=0$, $k=1,2,\ldots$;
  • the kurtosis coefficient of T is given by
$$G_2(\alpha,\beta)=\frac{\Gamma(\alpha)\,\Gamma(\alpha-4/\beta)\,\Gamma(1/\beta)\,\Gamma(5/\beta)}{\Gamma^{2}(\alpha-2/\beta)\,\Gamma^{2}(3/\beta)}-3,\qquad \alpha\beta>4;$$
  • the mean deviation of T is given by
$$\delta=\frac{2\sigma\,B(2/\beta,\ \alpha-1/\beta)}{(2/\beta)^{1-1/\beta}\,B(\alpha,1/\beta)\,\beta},\qquad \alpha\beta>1;$$
  • the Shannon entropy is given by
$$-E[\log g_T(T)]=\left(1+\frac{1}{\beta}\right)\log\frac{2}{\beta}+\log B\left(\alpha,\frac{1}{\beta}\right)+\log\sigma+\left(\alpha+\frac{1}{\beta}\right)\left[\varphi\left(\alpha+\frac{1}{\beta}\right)-\varphi(\alpha)\right];$$
  • the Rényi entropy is given by
$$\Upsilon_R(\eta)=\frac{1}{1-\eta}\log\left\{\frac{2\,B\left(1/\beta,\ \eta\alpha+(\eta-1)/\beta\right)}{(2/\beta)^{\eta+(\eta-1)/\beta}\left[B(\alpha,1/\beta)\right]^{\eta}\,\sigma^{\eta-1}\,\beta}\right\},$$
    where $\varphi(\cdot)$ denotes the digamma function.
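The expressions in Proposition 4 can be checked against numerical integration of the density in Theorem 1. A short SciPy sketch (μ = 0; all names ours):

```python
import math

from scipy.integrate import quad
from scipy.special import beta as beta_fn, gamma as gamma_fn


def new_gt_pdf(t, mu, sigma, alpha, beta):
    # density of NGt(mu, sigma, alpha, beta) from Theorem 1
    c = (2.0 / beta) ** (1.0 + 1.0 / beta) * beta_fn(alpha, 1.0 / beta) * sigma
    return (1.0 + beta * abs(t - mu) ** beta
            / (2.0 * sigma ** beta)) ** (-(alpha + 1.0 / beta)) / c


def even_moment(k, sigma, alpha, beta):
    # Proposition 4: E(T^{2k}), valid for alpha * beta > 2k
    return ((2.0 / beta) ** (2.0 * k / beta)
            * gamma_fn(alpha - 2.0 * k / beta) * gamma_fn((2.0 * k + 1.0) / beta)
            / (gamma_fn(alpha) * gamma_fn(1.0 / beta)) * sigma ** (2 * k))
```

For example, `even_moment(1, 1.0, 3.0, 2.0)` gives 1/4, and numerically integrating $t^2$ against `new_gt_pdf` reproduces it.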

3. Moments of Order Statistics

In this section, we derive explicit expressions for the moments of order statistics from this distribution under the independent identically-distributed (IID) case and the independent not identically-distributed (INID) case, when the shape parameters $1/\beta$ and α are both real non-integers or $1/\beta$ is an integer.
It is well-known that order statistics have always been a significant field in statistical research, especially in the study of distribution theory. In addition, it is necessary to study the related properties of order statistics, since many classical parameter estimation methods are based on them.

3.1. The IID Case

In this subsection, we obtain explicit expressions for the moments of order statistics from $NGt(\mu,\sigma,\alpha,\beta)$ under the IID case, based on a distribution transformation approach and recurrence relations, when the shape parameter $1/\beta$ is either an integer or a real non-integer.

3.1.1. Shape Parameter $1/\beta$ Integer Case

Govindarajulu (1963) [16] pointed out that the k-th order moment of order statistics from a symmetric distribution can be linearly expressed by the moments of the order statistics from the corresponding folded distribution
$$2^{n}\,\delta_{r:n}^{(k)}=\sum_{m=0}^{r-1}\binom{n}{m}\gamma_{r-m:n-m}^{(k)}+(-1)^{k}\sum_{m=r}^{n}\binom{n}{m}\gamma_{m-r+1:m}^{(k)},$$
where $\delta_{r:n}^{(k)}$ denotes the k-th moment of order statistics from the symmetric distribution, while $\gamma_{r:n}^{(k)}$ denotes the k-th moment of order statistics from the corresponding folded distribution.
Without loss of generality, let μ = 0, σ = 1. Making the transformation $Z=|T|^{\beta}/2$, we can derive the pdf of the random variable Z as follows
$$f_Z(z;\alpha,\beta)=\frac{1}{(1/\beta)^{1/\beta}\,B(\alpha,1/\beta)}\,z^{1/\beta-1}(1+\beta z)^{-(\alpha+1/\beta)},\qquad z>0.$$
Under the above transformation, the recurrence relation introduced by Govindarajulu (1963) [16] becomes:
Lemma 1.
$$2^{\,n-k/\beta}\,\mu_{r:n}^{(k)}=\sum_{m=0}^{r-1}\binom{n}{m}\nu_{r-m:n-m}^{(k)}+(-1)^{k}\sum_{m=r}^{n}\binom{n}{m}\nu_{m-r+1:m}^{(k)},$$
where $\mu_{r:n}^{(k)}$ denotes the k-th moment of order statistics from $NGt(0,1,\alpha,\beta)$ and $\nu_{r:n}^{(k)}$ denotes the k-th moment of order statistics from the corresponding folded distribution $F_Z(z)$.
Proof. 
See Appendix A.2 (Part II).   □
Let $\alpha_{r:n}^{(k)}$ denote $E(Z_{r:n}^{k})$; then $\nu_{r:n}^{(k)}=\alpha_{r:n}^{(k/\beta)}$. The cdf of the random variable Z can be written as
$$F_Z(z;\alpha,\beta)=\frac{1}{(1/\beta)^{1/\beta}\,B\left(\alpha,\frac{1}{\beta}\right)}\int_{0}^{z}y^{1/\beta-1}(1+\beta y)^{-(\alpha+1/\beta)}\,dy=\frac{1}{B\left(\alpha,\frac{1}{\beta}\right)}\int_{0}^{1-\frac{1}{1+\beta z}}y^{1/\beta-1}(1-y)^{\alpha-1}\,dy;$$
by the binomial expansion we have
$$1-F_Z(z)=\frac{1}{B\left(\alpha,\frac{1}{\beta}\right)}\sum_{i=0}^{1/\beta-1}C_i\,\frac{1}{(1+\beta z)^{\alpha+i}},$$
where $C_i=(-1)^{i}\binom{1/\beta-1}{i}\frac{1}{\alpha+i}$.
Let $H_v\left(\frac{1}{\beta},p\right)$ denote the coefficient of $(1+\beta y)^{-v}$ in the expansion of $\left[\sum_{i=0}^{1/\beta-1}C_i(1+\beta y)^{-i}\right]^{p}$. Obviously, these coefficients satisfy the following recursive relation
$$H_v\left(\frac{1}{\beta},p\right)=\sum_{i=0}^{v}C_i\,H_{v-i}\left(\frac{1}{\beta},p-1\right),$$
with the constraints
$$H_v\left(\frac{1}{\beta},1\right)=C_v,\qquad H_0\left(\frac{1}{\beta},0\right)=1,$$
so we can calculate every coefficient straightforwardly. Then $[1-F_Z(z)]^{p}$ can be expressed as
$$[1-F_Z(z)]^{p}=\left[\frac{1}{B\left(\alpha,\frac{1}{\beta}\right)}\right]^{p}\sum_{v=0}^{(1/\beta-1)p}H_v\left(\frac{1}{\beta},p\right)\frac{1}{(1+\beta z)^{\alpha p+v}}.$$
It is noted that Equations (5) and (6) have similar forms, so the distribution $F_Z(z)$ is said to have a minimum domain of attraction. With this conclusion, $E(Z_{1:n}^{k})$ is derived first. In the usual way, we obtain
$$E(Z_{1:n}^{k})=n\int_{0}^{+\infty}z^{k}[1-F_Z(z)]^{n-1}f(z)\,dz=\frac{n\,(1/\beta)^{k}}{\left[B\left(\alpha,\frac{1}{\beta}\right)\right]^{n}}\sum_{v=0}^{(1/\beta-1)(n-1)}H_v\left(\frac{1}{\beta},n-1\right)B\left(\frac{1}{\beta}+k,\ \alpha n+v-k\right),$$
and the above expression exists for $k<\alpha n$.
After that, using the well-known recurrence relation from David and Nagaraja (2003) [17],
$$(n-r)\,\nu_{r:n}^{(k)}+r\,\nu_{r+1:n}^{(k)}=n\,\nu_{r:n-1}^{(k)},\qquad r=1,\ldots,n-1,\ k=1,2,\ldots,$$
we acquire the k-th order moments of order statistics from the distribution $F_Z(z;\alpha,\beta)$. When the sample size is n, the method of obtaining all the k-th moments of order statistics from $NGt(0,1,\alpha,\beta)$ when $1/\beta$ is an integer can be briefly described as follows
1.
Derive $\nu_{i:j}^{(k)}$ ($1\le i\le n$, $1\le j\le n$) by using the above approach.
2.
For $1\le r\le\left[\frac{n}{2}\right]$, derive $\mu_{r:n}^{(k)}$ by using the recurrence relations in Equation (4).
3.
By using the symmetry of the new Gt distribution, the k-th moments of the remaining order statistics can be calculated through $\mu_{n-r+1:n}^{(k)}=(-1)^{k}\mu_{r:n}^{(k)}$.
Since this method makes full use of the characteristics of the distribution function and recursive relations, it is very efficient and can avoid a lot of unnecessary calculations.
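The recursion for the $H_v$ coefficients is just repeated polynomial multiplication, so the whole pipeline for the minimum (r = 1) fits in a few lines; the higher ranks then follow from the recurrence in the steps above. A Python sketch under our own names (SciPy assumed), checked against simulation:

```python
import numpy as np
from scipy.special import beta as beta_fn, comb


def c_coeffs(inv_beta, alpha):
    # C_i = (-1)^i binom(1/beta - 1, i) / (alpha + i), i = 0, ..., 1/beta - 1
    return [(-1) ** i * comb(inv_beta - 1, i) / (alpha + i)
            for i in range(inv_beta)]


def h_coeffs(inv_beta, alpha, p):
    # H_v(1/beta, p) via H_v(., p) = sum_i C_i H_{v-i}(., p-1), H_0(., 0) = 1;
    # each application of the recursion is a polynomial convolution
    h = np.array([1.0])
    c = np.array(c_coeffs(inv_beta, alpha))
    for _ in range(p):
        h = np.convolve(h, c)
    return h


def min_moment(n, k, inv_beta, alpha):
    # E(Z_{1:n}^k) = n (1/beta)^k B(alpha, 1/beta)^{-n}
    #              * sum_v H_v(1/beta, n-1) B(1/beta + k, alpha n + v - k)
    h = h_coeffs(inv_beta, alpha, n - 1)
    s = sum(hv * beta_fn(inv_beta + k, alpha * n + v - k)
            for v, hv in enumerate(h))
    return n * inv_beta ** k / beta_fn(alpha, inv_beta) ** n * s
```

For $1/\beta=2$ and α = 5, `min_moment(1, 1, 2, 5.0)` reduces to the plain first moment of Z, and `min_moment(3, 1, 2, 5.0)` agrees with a Monte Carlo estimate of $E(Z_{1:3})$ built from Beta draws and the transformation $Z=X/(\beta(1-X))$.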

3.1.2. Shape Parameter $1/\beta$ or α Non-Integer Case

The generalized Kampé de Fériet function from Exton (1978) [18] is defined by
$$F_{C:D}^{A:B}\left((a):(b_1);\ldots;(b_n);(c):(d_1);\ldots;(d_n);x_1,\ldots,x_n\right)=\sum_{m_1=0}^{\infty}\cdots\sum_{m_n=0}^{\infty}\frac{((a))_{m_1+\cdots+m_n}\,((b_1))_{m_1}\cdots((b_n))_{m_n}}{((c))_{m_1+\cdots+m_n}\,((d_1))_{m_1}\cdots((d_n))_{m_n}}\cdot\frac{x_1^{m_1}\cdots x_n^{m_n}}{m_1!\cdots m_n!},$$
where $a=(a_1,a_2,\ldots,a_A)$, $b_i=(b_{i,1},b_{i,2},\ldots,b_{i,B})$ for $i=1,2,\ldots,n$, $c=(c_1,\ldots,c_C)$, $d_i=(d_{i,1},\ldots,d_{i,D})$ for $i=1,2,\ldots,n$, and $((f))_k=((f_1,f_2,\ldots,f_p))_k=(f_1)_k\cdots(f_p)_k$ with $(f_i)_k=f_i(f_i+1)\cdots(f_i+k-1)$.
By using the above special function, we derive the following theorem:
Theorem 2.
If the shape parameter α is a real non-integer, the k-th order moment of the order statistic $X_{r:n}$ from $NGt(0,1,\alpha,\beta)$ can be calculated by the following convergent expression
$$E(X_{r:n}^{k})=C_{r:n}\left[I_1(k,r-1,n-r)+I_2(k,r-1,n-r)\right],$$
where
$$I_1(k,r-1,n-r)=\frac{2^{k/\beta}\,n\,(1/\beta)^{k/\beta}}{B\left(\frac{1}{\beta},\alpha\right)}\sum_{i=0}^{r-1}\sum_{j=0}^{n-r}(-1)^{j}\binom{r-1}{i}\binom{n-r}{j}A\left(\frac{k+1}{\beta},\ \alpha-\frac{k}{\beta}-1,\ i+j\right),$$
$$I_2(k,r-1,n-r)=(-1)^{k}\,\frac{2^{k/\beta}\,n\,(1/\beta)^{k/\beta}}{B\left(\frac{1}{\beta},\alpha\right)}\sum_{i=0}^{n-r}\sum_{j=0}^{r-1}(-1)^{j}\binom{n-r}{i}\binom{r-1}{j}A\left(\frac{k+1}{\beta},\ \alpha-\frac{k}{\beta}-1,\ i+j\right),$$
$$A\left(\frac{k+1}{\beta},\ \alpha-\frac{k}{\beta}-1,\ i+j\right)=\frac{\beta^{i+j}\,B\left(\frac{k+i+j+1}{\beta},\ \alpha-\frac{k}{\beta}\right)}{\left[B\left(\alpha,\frac{1}{\beta}\right)\right]^{i+j}}\times F_{1:1}^{1:2}\left(\left(\frac{k+i+j+1}{\beta}\right):\left(1-\alpha,\frac{1}{\beta}\right);\ldots;\left(1-\alpha,\frac{1}{\beta}\right);\left(\alpha+\frac{i+j+1}{\beta}\right):\left(\frac{1}{\beta}+1\right);\ldots;\left(\frac{1}{\beta}+1\right);1;\ldots;1\right),$$
$$C_{r:n}=\frac{n!}{(r-1)!\,(n-r)!}.$$
Proof. 
See Appendix A.2 (Part II).   □
Theorem 3.
If the shape parameter $1/\beta$ is a real non-integer, $A\left(\frac{k+1}{\beta},\ \alpha-\frac{k}{\beta}-1,\ i+j\right)$ in Theorem 2 can be calculated by
$$A\left(\frac{k+1}{\beta},\ \alpha-\frac{k}{\beta}-1,\ i+j\right)=\frac{(1/\beta)^{\frac{k+1}{\beta}-\alpha(i+j)}}{\left[B\left(\alpha,\frac{1}{\beta}\right)\right]^{i+j}}\,B\left(\frac{k+1}{\beta},\ \alpha(i+j+1)-\frac{k}{\beta}\right)\times F_{1:1}^{1:2}\left(\left(\alpha(i+j+1)-\frac{k}{\beta}\right):\left(1-\frac{1}{\beta},\alpha\right);\ldots;\left(1-\frac{1}{\beta},\alpha\right);\left(\alpha(i+j+1)+\frac{1}{\beta}\right):\left(\alpha+1\right);\ldots;\left(\alpha+1\right);1;\ldots;1\right).$$
Proof. 
See Appendix A.2 (Part II).   □

3.2. The INID Case

Suppose that $X_1,\ldots,X_n$ are independent but not identically distributed random samples from the following pdfs, sharing the same shape parameter β while the remaining parameters differ:
$$g_i(x)=\frac{1}{(2/\beta)^{1+1/\beta}\,B\left(\alpha_i,\frac{1}{\beta}\right)\sigma_i}\left[1+\frac{\beta|x-\mu_i|^{\beta}}{2\sigma_i^{\beta}}\right]^{-(\alpha_i+1/\beta)}.$$
Theorem 4.
If the shape parameter $1/\beta$ is a real non-integer, the m-th order moment of the order statistic $X_{r:n}$ from $NGt(0,1,\alpha_i,\beta)$ can be calculated by the following convergent expression
$$\mu_{r:n}^{(m)}=\sum_{j=n-r+1}^{n}(-1)^{j-(n-r+1)}\binom{j-1}{n-r}I_j^{+}(m)+\sum_{j=r}^{n}(-1)^{m+j-r}\binom{j-1}{r-1}I_j^{+}(m),$$
where
$$I_j^{+}(m)=m\,2^{\frac{m}{\beta}-1}\prod_{t=1}^{j}\frac{\alpha_{i_t}}{B\left(\alpha_{i_t},\frac{1}{\beta}\right)}\left(\frac{1}{\beta}\right)^{\frac{m}{\beta}-1}B\left(\frac{m}{\beta},\ \sum_{t=1}^{j}\alpha_{i_t}-\frac{m}{\beta}\right)\times F_{1:1}^{1:2}\left(\left(\sum_{t=1}^{j}\alpha_{i_t}-\frac{m}{\beta}\right):\left(1-\frac{1}{\beta},\alpha_{i_1}\right);\ldots;\left(1-\frac{1}{\beta},\alpha_{i_j}\right);\left(\sum_{t=1}^{j}\alpha_{i_t}\right):\left(\alpha_{i_1}+1\right);\ldots;\left(\alpha_{i_j}+1\right);1;\ldots;1\right),\qquad \alpha_{i_t}>\frac{m}{\beta}.$$
Proof. 
See Appendix A.2 (Part II).   □

4. Product Moments and Variance–Covariance Matrix of Order Statistics

In this section, we obtain explicit expressions for the product moments and the corresponding variance–covariance matrix of order statistics from this distribution based on recurrence relations and distribution transformation when the shape parameter $1/\beta$ is an integer.
The covariance matrix of order statistics plays a big role in a number of estimation methods based on order statistics, such as the least square estimation and the BLUE for location-scale parameters. In addition, obtaining a covariance matrix of order statistics is usually not an easy task and requires some related skills.
Under the transformation $Z=|T|^{\beta}/2$ introduced in the previous section, the recursive relation between the product moments of order statistics from a symmetric distribution and those from its corresponding folded distribution becomes
$$2^{\,n-\frac{k_1+k_2}{\beta}}\,\mu_{r,s:n}^{(k_1,k_2)}=\sum_{m=0}^{r-1}\binom{n}{m}\nu_{r-m,s-m:n-m}^{(k_1,k_2)}+(-1)^{k_1}\sum_{m=r}^{s-1}\binom{n}{m}\nu_{m-r+1:m}^{(k_1)}\nu_{s-m:n-m}^{(k_2)}+(-1)^{k_1+k_2}\sum_{m=s}^{n}\binom{n}{m}\nu_{m+1-s,m+1-r:m}^{(k_1,k_2)}.$$
The cdf of the random variable Z is denoted by $F_Z(z)$; suppose $U,V\sim F_Z(z)$ and let $\alpha_{r,s:n}^{(k_1,k_2)}$ denote $E(U_{r:n}^{k_1}V_{s:n}^{k_2})$ $(r\le s)$. Obviously, we can write $\nu_{r,s:n}^{(k_1,k_2)}=\alpha_{r,s:n}^{(k_1/\beta,\,k_2/\beta)}$.
First of all, we derive $\alpha_{r,r+1:s}^{(k_1,k_2)}$. Taking the well-known expression for the product moment of order statistics and then using the binomial expansion, we obtain
$$\alpha_{r,r+1:s}^{(k_1,k_2)}=C_{r,r+1:s}\iint_{0<x<y<+\infty}x^{k_1}y^{k_2}[F(x)]^{r-1}[1-F(y)]^{s-r-1}f(x)f(y)\,dx\,dy=C_{r,r+1:s}\sum_{i=0}^{r-1}(-1)^{i}\binom{r-1}{i}\int_{0}^{+\infty}x^{k_1}[1-F(x)]^{i}f(x)\int_{x}^{+\infty}y^{k_2}[1-F(y)]^{s-r-1}f(y)\,dy\,dx=C_{r,r+1:s}\sum_{i=0}^{r-1}(-1)^{i}\binom{r-1}{i}I(k_1,k_2,i,s-r-1),$$
where
$$C_{r,s:n}=\frac{s!}{(r-1)!\,(s-r-1)!}.$$
After that, we attempt to derive the double integral $I(k_1,k_2,i,s-r-1)$. By the multinomial theorem we have
$$[1-F(y)]^{s-r-1}=\left[\frac{1}{B\left(\alpha,\frac{1}{\beta}\right)}\right]^{s-r-1}\sum_{v=0}^{(s-r-1)(1/\beta-1)}H_v\left(\frac{1}{\beta},s-r-1\right)(1+\beta y)^{-[\alpha(s-r-1)+v]}.$$
We denote the inner integral $\int_{x}^{+\infty}y^{k_2}[1-F(y)]^{s-r-1}f(y)\,dy$ of the above double integral $I(k_1,k_2,i,s-r-1)$ by $T_x(k_2,s-r-1)$; it can be calculated as follows
$$T_x(k_2,s-r-1)=\frac{1}{(1/\beta)^{1/\beta}\left[B\left(\alpha,\frac{1}{\beta}\right)\right]^{s-r}}\sum_{v=0}^{(s-r-1)(1/\beta-1)}H_v\left(\frac{1}{\beta},s-r-1\right)\int_{x}^{+\infty}y^{k_2+\frac{1}{\beta}-1}(1+\beta y)^{-\left[\alpha(s-r)+v+\frac{1}{\beta}\right]}\,dy=\frac{(1/\beta)^{k_2}}{\left[B\left(\alpha,\frac{1}{\beta}\right)\right]^{s-r}}\sum_{v=0}^{(s-r-1)(1/\beta-1)}H_v\left(\frac{1}{\beta},s-r-1\right)\int_{\frac{\beta x}{1+\beta x}}^{1}w^{k_2+\frac{1}{\beta}-1}(1-w)^{\alpha(s-r)+v-k_2-1}\,dw.$$
Noticing that $k_2+\frac{1}{\beta}-1$ is still a positive integer, and expanding the integrand on the right-hand side of the above expression in terms of $(1-w)$, we acquire
$$T_x(k_2,s-r-1)=\frac{(1/\beta)^{k_2}}{\left[B\left(\alpha,\frac{1}{\beta}\right)\right]^{s-r}}\sum_{v=0}^{(s-r-1)(1/\beta-1)}H_v\left(\frac{1}{\beta},s-r-1\right)\sum_{m=0}^{k_2+\frac{1}{\beta}-1}(-1)^{m}\binom{k_2+\frac{1}{\beta}-1}{m}\frac{(1+\beta x)^{-[\alpha(s-r)+m+v-k_2]}}{\alpha(s-r)+m+v-k_2}.$$
Using the fact that
$$[1-F(x)]^{i}=\left[\frac{1}{B\left(\alpha,\frac{1}{\beta}\right)}\right]^{i}\sum_{u=0}^{i(1/\beta-1)}H_u\left(\frac{1}{\beta},i\right)(1+\beta x)^{-(\alpha i+u)},$$
and putting Equations (11) and (12) into $I(k_1,k_2,i,s-r-1)$, we obtain
$$I(k_1,k_2,i,s-r-1)=\frac{(1/\beta)^{k_1+k_2}}{\left[B\left(\alpha,\frac{1}{\beta}\right)\right]^{s-r+i+1}}\sum_{u=0}^{i(1/\beta-1)}\sum_{v=0}^{(s-r-1)(1/\beta-1)}H_u\left(\frac{1}{\beta},i\right)H_v\left(\frac{1}{\beta},s-r-1\right)\sum_{m=0}^{k_2+\frac{1}{\beta}-1}\binom{k_2+\frac{1}{\beta}-1}{m}\frac{(-1)^{m}}{\alpha(s-r)+m+v-k_2}\,B\left(k_1+\frac{1}{\beta},\ \alpha(s-r+i+1)+m+u+v-k_1-k_2\right).$$
The above expression exists for $\alpha>(k_1+k_2)/2$. Finally, using the recurrence relation from Arnold et al. (2008) [19], valid for any arbitrary distribution and $1\le p<q\le n$,
$$\alpha_{p,q:n}^{(k_1,k_2)}=\sum_{r=p}^{q-1}\sum_{s=n-q+r+1}^{n}(-1)^{s+n-p-q+1}\binom{r-1}{p-1}\binom{s-r-1}{n-q}\binom{n}{s}\alpha_{r,r+1:s}^{(k_1,k_2)},$$
we can derive a closed-form expression for every product moment and the corresponding variance–covariance matrix of order statistics from $F_Z(z)$.
When the sample size is n, the method of obtaining all the product moments and variance–covariance matrix for order statistics from N Gt ( μ , σ , α , β ) can be briefly described as:
1.
Derive $\nu_{i,j:n}^{(k_1,k_2)}$ ($1\le i\le j\le n$) by using the above approach.
2.
For $1\le r\le\left[\frac{n}{2}\right]$ and $r\le s\le n-r+1$, obtain $\mu_{r,s:n}^{(k_1,k_2)}$ by using the recurrence relations in Equation (9), and obtain the covariances via $\sigma_{r,s:n}=\mu_{r,s:n}^{(1,1)}-\mu_{r:n}^{(1)}\mu_{s:n}^{(1)}$.
3.
By using the symmetry of the new Gt distribution and of the variance–covariance matrix, the remaining covariances of order statistics can be calculated by $\sigma_{r,s:n}=\sigma_{n-s+1,n-r+1:n}$ and $\sigma_{r,s:n}=\sigma_{s,r:n}$.

5. Parameter Estimation

In this section, we propose several parameter estimation methods for the new Gt distribution.

5.1. Modified Method of Moments

In this subsection, a modified method of moments (MMOM) for the parameters of N Gt ( 0 , σ , α , β ) is proposed.
Let $X_1,\ldots,X_n$ be a random sample from $NGt(0,\sigma,\alpha,\beta)$. Making the transformation $Y=X^{2}$, the k-th theoretical moment of Y is given by
$$E(Y^{k})=\left(\frac{2}{\beta}\right)^{2k/\beta}\frac{\Gamma(\alpha-2k/\beta)\,\Gamma[(2k+1)/\beta]}{\Gamma(\alpha)\,\Gamma(1/\beta)}\,\sigma^{2k},\qquad k=1,2,\ldots.$$
The MMEs of the new distribution after transformation can be obtained by equating the first three theoretical moments of Y with the corresponding sample moments. Eliminating the scale parameter σ, we acquire
$$\frac{\Gamma(\alpha)\,\Gamma(\alpha-4/\beta)}{[\Gamma(\alpha-2/\beta)]^{2}}\times\frac{\Gamma(1/\beta)\,\Gamma(5/\beta)}{[\Gamma(3/\beta)]^{2}}=\frac{n\sum_{i=1}^{n}Y_i^{2}}{\left(\sum_{i=1}^{n}Y_i\right)^{2}},$$
$$\frac{[\Gamma(\alpha)]^{2}\,\Gamma(\alpha-6/\beta)}{[\Gamma(\alpha-2/\beta)]^{3}}\times\frac{[\Gamma(1/\beta)]^{2}\,\Gamma(7/\beta)}{[\Gamma(3/\beta)]^{3}}=\frac{n^{2}\sum_{i=1}^{n}Y_i^{3}}{\left(\sum_{i=1}^{n}Y_i\right)^{3}}.$$
In order to establish an effective algorithm to obtain the numerical solution of the above equations, we propose the following theorem:
Theorem 5.
Let
$$G_k(\alpha,\beta)=\frac{[\Gamma(\alpha)]^{k-1}\,\Gamma(\alpha-2k/\beta)}{[\Gamma(\alpha-2/\beta)]^{k}}\times\frac{[\Gamma(1/\beta)]^{k-1}\,\Gamma[(2k+1)/\beta]}{[\Gamma(3/\beta)]^{k}},\qquad k\ge 2;$$
then we assert that $G_k(\alpha,\beta)$ is a monotonically-decreasing function of the parameters α and β. In addition, the function has the following limiting properties.
For fixed α:
$$\lim_{\beta\to(2k/\alpha)^{+}}G_k(\alpha,\beta)=\infty,\qquad \lim_{\beta\to+\infty}G_k(\alpha,\beta)=\frac{3^{k/2}}{\sqrt{2k+1}}\times\frac{\prod_{i=1}^{2k}\Gamma[i/(2k+1)]}{[\Gamma(1/3)]^{k}\,[\Gamma(2/3)]^{k}}.$$
For fixed β:
$$\lim_{\alpha\to(2k/\beta)^{+}}G_k(\alpha,\beta)=\infty,\qquad \lim_{\alpha\to+\infty}G_k(\alpha,\beta)=\frac{[\Gamma(1/\beta)]^{k-1}\,\Gamma[(2k+1)/\beta]}{[\Gamma(3/\beta)]^{k}}.$$
Proof. 
See Appendix A.3 (Part III).   □
With the above theorem, the steps of the algorithm are presented by the following forms
1.
Assign an initial value $\alpha^{(1)}$, and substitute $\alpha^{(1)}$ into Equation (15). Since $G_k(\alpha,\beta)$ is monotonically decreasing with respect to the parameter β, uniroot in R can be used to derive $\hat{\beta}^{(1)}$.
2.
Substitute $\hat{\beta}^{(1)}$ into Equation (16); since $G_k(\alpha,\beta)$ is monotonically decreasing with respect to the parameter α, using uniroot in R we then obtain $\hat{\alpha}^{(2)}$.
3.
If the norm of the vector $(\hat{\alpha}^{(i+1)}-\hat{\alpha}^{(i)},\ \hat{\beta}^{(i+1)}-\hat{\beta}^{(i)})$ exceeds a prescribed value $\varepsilon>0$, steps 1 and 2 are repeated; otherwise, the iteration stops and the values generated in the last iteration are taken as the estimates of the shape parameters α and β.
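A Python stand-in for the uniroot-based procedure above (all function names are ours; log-gamma is used for numerical stability). The root brackets are justified by the monotonicity and limits of Theorem 5: each $G_k$ decreases from $+\infty$ at the lower end of its admissible range.

```python
import math

from scipy.optimize import brentq


def log_G(k, alpha, beta):
    # log of G_k(alpha, beta) from Theorem 5; requires alpha * beta > 2k
    lg = math.lgamma
    return ((k - 1) * lg(alpha) + lg(alpha - 2 * k / beta) - k * lg(alpha - 2 / beta)
            + (k - 1) * lg(1 / beta) + lg((2 * k + 1) / beta) - k * lg(3 / beta))


def solve_beta(alpha, g2, hi=100.0):
    # Equation (15): G_2(alpha, beta) = g2, monotone decreasing in beta
    lo = 4.0 / alpha * (1.0 + 1e-9) + 1e-9
    return brentq(lambda b: log_G(2, alpha, b) - math.log(g2), lo, hi)


def solve_alpha(beta, g3, hi=100.0):
    # Equation (16): G_3(alpha, beta) = g3, monotone decreasing in alpha
    lo = 6.0 / beta * (1.0 + 1e-9) + 1e-9
    return brentq(lambda a: log_G(3, a, beta) - math.log(g3), lo, hi)


def mmom(g2, g3, alpha0=2.0, tol=1e-8, max_iter=200):
    # alternate steps 1-2 until the change in alpha falls below tol (step 3)
    alpha = alpha0
    for _ in range(max_iter):
        beta = solve_beta(alpha, g2)
        alpha_new = solve_alpha(beta, g3)
        if abs(alpha_new - alpha) < tol:
            return alpha_new, solve_beta(alpha_new, g2)
        alpha = alpha_new
    return alpha, beta
```

Feeding the exact population ratios for (α, β) = (4, 2) back into `mmom` recovers the true parameters, which is a convenient correctness check for the alternating scheme.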

5.2. MLE Using the EM Algorithm

In this subsection, an EM-type algorithm to determine the MLEs for the parameters of N Gt ( 0 , σ , α , β ) is established.
Through the method of distribution construction, the pdf of the new Gt distribution is treated as a marginal pdf of a bivariate distribution and this bivariate distribution function can be expressed as the product of two common distributions. After that, we treat one random variable in the bivariate distribution as an observed variable, while the other as an unobserved variable, so that the EM algorithm can be applied to obtain the MLE. At the same time, this idea is also widely used to derive the MLE of other distributions. See [15,20,21] for more details about recent works in this topic.
First proposed by Dempster et al. (1977) [22], the EM algorithm is a powerful tool for estimation problems with missing data. In this subsection, we use the EM algorithm to acquire the MLEs of $NGt(0,\sigma,\alpha,\beta)$ based on Proposition 2. Let $Y=(y_1,\ldots,y_n)^{T}$ denote the observed data and $U=(u_1,\ldots,u_n)^{T}$ the unobserved data. Combining Y and U, we obtain the complete data, denoted by $Z=(Y,U)$. Besides that, let $\theta=(\sigma,\alpha,\beta)^{T}$ represent the vector of parameters. Invoking Proposition 2, the complete log-likelihood function can be expressed as follows
$$\ln L(\theta\mid Z)=n\left[\left(1+\frac{1}{\beta}\right)\ln\beta-\left(1+\frac{2}{\beta}+\alpha\right)\ln 2-\ln\Gamma\left(\frac{1}{\beta}\right)-\ln\Gamma(\alpha)-\ln\sigma\right]+\left(\alpha+\frac{1}{\beta}-1\right)\sum_{i=1}^{n}\ln u_i-\frac{\beta}{4\sigma^{\beta}}\sum_{i=1}^{n}|y_i|^{\beta}u_i-\frac{1}{2}\sum_{i=1}^{n}u_i.$$
It is well-known that the EM framework is an iterative algorithm consisting of two steps; namely, the expectation step (E-step) and the maximization step (M-step).
E-step: Given the observed data set Y = ( y 1 , , y n ) and parameter estimation values θ ( h ) for the h-th iteration, the conditional expectation of the complete log-likelihood function can be calculated as follows
$$Q(\theta\mid\theta^{(h)})=E\left[\ln L(\theta\mid Z)\mid Y,\theta^{(h)}\right]=n\left[\left(1+\frac{1}{\beta}\right)\ln\beta-\left(1+\frac{2}{\beta}+\alpha\right)\ln 2-\ln\Gamma\left(\frac{1}{\beta}\right)-\ln\Gamma(\alpha)-\ln\sigma\right]+\left(\alpha+\frac{1}{\beta}-1\right)\sum_{i=1}^{n}E[\ln u_i\mid y_i,\theta^{(h)}]-\frac{\beta}{4\sigma^{\beta}}\sum_{i=1}^{n}|y_i|^{\beta}E[u_i\mid y_i,\theta^{(h)}]-\frac{1}{2}\sum_{i=1}^{n}E[u_i\mid y_i,\theta^{(h)}],$$
where
$$E[u_i\mid y_i,\theta^{(h)}]=\frac{4\left(\sigma^{(h)}\right)^{\beta^{(h)}}\left(\alpha^{(h)}+1/\beta^{(h)}\right)}{2\left(\sigma^{(h)}\right)^{\beta^{(h)}}+\beta^{(h)}|y_i|^{\beta^{(h)}}},\qquad E[\ln u_i\mid y_i,\theta^{(h)}]=\varphi\left(\alpha^{(h)}+\frac{1}{\beta^{(h)}}\right)-\ln\left(\frac{1}{2}+\frac{\beta^{(h)}|y_i|^{\beta^{(h)}}}{4\left(\sigma^{(h)}\right)^{\beta^{(h)}}}\right).$$
In the EM algorithm, the M-step needs to maximize the conditional expectation obtained by E-step and the suggested framework can be briefly described as follows
1.
Update $\alpha^{(h+1)}$ by solving the following nonlinear equation; the numerical procedure uniroot in R is recommended:
$$\frac{\partial Q(\theta\mid\theta^{(h)})}{\partial\alpha}=-n\left[\varphi(\alpha)+\ln 2\right]+\sum_{i=1}^{n}E[\ln u_i\mid y_i,\theta^{(h)}]=0.$$
2.
Update σ ( h + 1 ) through the following explicit expression
$$\sigma^{(h+1)}=\left\{\frac{\left(\beta^{(h)}\right)^{2}}{4n}\sum_{i=1}^{n}|y_i|^{\beta^{(h)}}E\left[u_i\mid y_i,\sigma^{(h)},\alpha^{(h+1)},\beta^{(h)}\right]\right\}^{1/\beta^{(h)}}.$$
3.
Update $\beta^{(h+1)}$ by maximizing the conditional expectation with the updated values of σ and α:
$$\beta^{(h+1)}=\arg\max_{\beta}\,Q\left(\sigma^{(h+1)},\alpha^{(h+1)},\beta\mid\theta^{(h)}\right).$$
For the initial values $\theta_0=(\sigma_0,\alpha_0,\beta_0)$, we recommend using numerical procedures such as optim in R to maximize the corresponding likelihood function, or using the estimates from the method of moments.
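Putting the E-step expectations and the three updates together gives the following sketch (a Python stand-in for the uniroot/optim workflow; the grid search over β and all names are ours, and the E-step quantities are held fixed within each sweep). Because each sweep is a generalized EM (GEM) step, the observed-data log-likelihood cannot decrease, which is a convenient correctness check.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import betaln, digamma, gammaln


def loglik(y, sigma, alpha, beta):
    # observed-data log-likelihood of NGt(0, sigma, alpha, beta) (Theorem 1)
    c = (1 + 1 / beta) * np.log(2 / beta) + betaln(alpha, 1 / beta) + np.log(sigma)
    return np.sum(-c - (alpha + 1 / beta)
                  * np.log1p(beta * np.abs(y) ** beta / (2 * sigma ** beta)))


def e_step(y, sigma, alpha, beta):
    # Proposition 2: U | Y=y ~ Gamma(alpha + 1/beta, rate lam)
    lam = 0.5 + beta * np.abs(y) ** beta / (4 * sigma ** beta)
    return (alpha + 1 / beta) / lam, digamma(alpha + 1 / beta) - np.log(lam)


def q_func(y, eu, elog, sigma, alpha, beta):
    # conditional expectation Q with E[u] and E[ln u] held fixed
    n = len(y)
    return (n * ((1 + 1 / beta) * np.log(beta) - (1 + 2 / beta + alpha) * np.log(2)
                 - gammaln(1 / beta) - gammaln(alpha) - np.log(sigma))
            + (alpha + 1 / beta - 1) * elog.sum()
            - beta / (4 * sigma ** beta) * np.sum(np.abs(y) ** beta * eu)
            - 0.5 * eu.sum())


def em_step(y, sigma, alpha, beta, beta_grid):
    n = len(y)
    eu, elog = e_step(y, sigma, alpha, beta)
    # step 1: alpha solves -n[digamma(alpha) + ln 2] + sum E[ln u_i] = 0
    alpha = brentq(lambda a: -n * (digamma(a) + np.log(2)) + elog.sum(), 1e-6, 1e3)
    # step 2: closed-form sigma update
    sigma = (beta ** 2 / (4 * n) * np.sum(np.abs(y) ** beta * eu)) ** (1 / beta)
    # step 3: beta maximizes Q over a grid that includes the current value (GEM)
    beta = max(list(beta_grid) + [beta],
               key=lambda b: q_func(y, eu, elog, sigma, alpha, b))
    return sigma, alpha, beta


# demo: a few sweeps on data simulated via Proposition 3, truth (1, 3, 2)
rng = np.random.default_rng(0)
x = rng.beta(0.5, 3.0, size=500)
z = x / (2.0 * (1.0 - x))
y = np.where(rng.uniform(size=500) <= 0.5, -1.0, 1.0) * np.sqrt(2.0 * z)

params = (0.7, 2.0, 1.5)                    # deliberately poor start
lls = [loglik(y, *params)]
for _ in range(8):
    params = em_step(y, *params, np.linspace(0.5, 4.0, 36))
    lls.append(loglik(y, *params))          # monotone by the GEM property
```

Replacing the grid search with a proper one-dimensional optimizer (as optim would do in R) refines the β update without affecting the monotonicity argument, as long as the new β is never worse than the current one.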

5.3. Explicit Expressions for Asymptotic Variances and Covariances of the Estimates under the EM Algorithm

In this subsection, we obtain closed-form expressions for the asymptotic variances and covariances of the estimates under the EM algorithm. Louis (1982) [23] introduced the missing information principle to acquire the information matrix in situations containing latent variables. This principle is given by
Observed information = Complete information − Missing information .
In what follows, we denote the complete, observed and missing information by I Z ( θ ) , I Y ( θ ) and I ( U | Y ) ( θ ) , respectively. The observed information I Y ( θ ) is then calculated by:
I Y ( θ ) = I Z ( θ ) − I ( U | Y ) ( θ ) .

5.3.1. Complete Information Matrix I Z ( θ )

Taking the second-order partial derivatives of the complete log-likelihood function in Equation (17) and applying the definition of the Fisher information, after some calculation we obtain
I 11 Z = n β σ 2 , I 22 Z = n φ ( α ) , I 12 Z = I 21 Z = I 23 Z = I 32 Z = 0 ,
I 13 Z = I 31 Z = n σ β + β 2 4 σ n T 1 ( α , β ) + n β 1 β 2 1 β B ( α , 1 / β ) ,
I 33 Z = 2 n β 3 ln β + ( 1 + β ) + 2 ln 2 + φ ( 1 β ) + n β 2 1 β 1 + φ ( 1 β ) β 2 2 β 3 n φ ( α ) + ln 2 2 α + n 2 T 1 ( α , β ) + β n 4 T 2 ( α , β ) ,
where
T 1 ( α , β ) = 4 β 3 φ ( 1 + 1 β ) ln β 2 φ ( α ) ,
T 2 ( α , β ) = 2 β 3 φ 1 + 1 β 2 + 1 β 2 φ 1 + 1 β + 8 β 3 φ 1 + 1 β ln β 2 + φ 1 + 1 β φ ( α ) } + 4 β 3 ( β 2 ) 2 + [ φ ( α ) ] 2 + φ ( α ) + 2 φ ( α ) ln β 2 .
The second-order partial derivatives of the complete log-likelihood function and the calculation of the complete information are shown in Appendix A.3 (Part III).

5.3.2. Missing Information Matrix I ( U | Y ) ( θ )

From Proposition 2, the logarithm of the joint density function of the latent variable U given Y can be written as
ln f U | Y ( u | y ) = ∑ i = 1 n ( α + 1 / β ) ln [ 1 / 2 + β | y i |^β / ( 4 σ^β ) ] − n ln Γ ( α + 1 / β ) + ( α + 1 / β − 1 ) ∑ i = 1 n ln U i − ∑ i = 1 n [ 1 / 2 + β | y i |^β / ( 4 σ^β ) ] U i .
After taking the second-order partial derivatives with respect to σ , α and β , and utilizing the results in Section 5.2, the elements of the missing information matrix I ( U | Y ) ( θ ) are presented as follows
I 11 ( U | Y ) = β ( α + 1 ) i = 1 n | y i | [ 2 ( β + 1 ) σ β + β | y i | 2 ] 2 σ β + 1 + β σ | y i | 2 | y i | ( β + 1 ) σ 2 ( 2 σ β + β | y i | β ) , I 12 ( U | Y ) = I 21 ( U | Y ) = i = 1 n β 2 | y i | 2 σ β + 1 + β σ | y i | , I 13 ( U | Y ) = I 31 ( U | Y ) = i = 1 n | y i | 2 σ β + 1 + β σ | y i | + ( α + 1 β ) i = 1 n 2 β | y i | ( 2 σ β + 1 + β σ | y i | ) β 2 | y i | ( 2 σ β + 1 ln σ + σ | y i | ) ( 2 σ β + 1 + β σ | y i | ) 2 + 2 β | y i | β 2 σ | y i | l n σ 4 σ β + 1 , I 22 ( U | Y ) = n 2 Γ ( α + 1 / β ) α 2 , I 23 ( U | Y ) = I 32 ( U | Y ) = | y i | ( 1 β ln σ ) 2 σ β + β | y i | + n β 4 φ ( α + 1 / β ) , I 33 ( U | Y ) = 2 β 2 i = 1 n ln 1 2 + β | y i | 4 σ β + i = 1 n | y i | 2 σ β + 1 + β σ | y i | + 1 β 2 i = 1 n | y i | ( 1 β ln σ ) 2 σ β + β | y i | + 2 n β 3 + n β 4 Γ ( α + 1 / β ) β + α + 1 β i = 1 n | y i | [ ( 2 σ β + β | y i | ) ln σ + ( 2 σ β ln σ + | y i | ) | y i | ( 1 β ln σ ) ] ( 2 σ β + β | y i | ) 2 + | y i | ( 2 β ln σ ) 2 σ β + β | y i | β 2 n β 3 φ α + 1 β ln 1 2 + β | y i | 4 σ β .

5.4. MLE via a New Iterative Algorithm

In this subsection, we propose a new iterative algorithm to obtain the MLEs for the parameters of N Gt ( 0 , σ , α , β ) , owing to the difficulty of solving the estimation equations directly.
Let X 1 , , X n be a random sample from N Gt ( 0 , σ , α , β ) . The log-likelihood function is given by
ln L ( X ; σ , α , β ) = − n ( 1 + 1 / β ) ln ( 2 / β ) − n ln B ( α , 1 / β ) − n ln σ − ( α + 1 / β ) ∑ i = 1 n ln ( 1 + β | x i |^β / ( 2 σ^β ) ) .
Taking partial derivatives with respect to ( σ , α , β ) , we obtain
ln L ( X ; σ , α , β ) σ = n ( α β + 1 ) i = 1 n β | x i | β 2 σ β + β | x i | β , ln L ( X ; σ , α , β ) α = n φ ( α + 1 / β ) φ ( α ) i = 1 n ln 1 + β | x i | β 2 σ β , ln L ( X ; σ , α , β ) β = n ln ( 2 / β ) + n β ( 1 + 1 / β ) n φ ( α + 1 / β ) φ ( 1 / β ) + i = 1 n ln 1 + β | x i | β 2 σ β β ( α β + 1 ) i = 1 n | x i | β + β | x i | β ln ( | x i | / σ ) 2 σ β + β | x i | β . ( 18 )
Instead of solving all three preceding equations simultaneously, we propose a new algorithm because of the complexity of the third equation in Equation (18). In order to establish the new approach, the following lemma is necessary
Lemma 2.
For any fixed β, the following equations
n − ( α β + 1 ) ∑ i = 1 n β | x i |^β / ( 2 σ^β + β | x i |^β ) = 0 , n [ φ ( α + 1 / β ) − φ ( α ) ] − ∑ i = 1 n ln ( 1 + β | x i |^β / ( 2 σ^β ) ) = 0 , ( 19 )
always possess a convergent solution in ( σ , α ) that is independent of the initial values ( σ 0 , α 0 ) .
Proof. 
See Appendix A.3 (Part III).   □
Moreover, the algorithm for obtaining the solution of the equations in Lemma 2 is shown below
1.
Choose arbitrary initial values ( σ ( 0 ) , α ( 0 ) ) with σ ( 0 ) > 0 and α ( 0 ) > 0 .
2.
Substitute α ( 0 ) into the first equation of Equation (19), and we obtain σ ( 1 ) .
3.
Substitute σ ( 1 ) into the second equation of Equation (19), and we obtain α ( 1 ) .
4.
If the norm of the vector ( σ ( h + 1 ) − σ ( h ) , α ( h + 1 ) − α ( h ) ) exceeds the prescribed tolerance ε > 0 , steps 2–3 are repeated. Otherwise, the iteration stops, and the values generated in the last iteration are taken as the solution of the equations.
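Steps 1–4 can be sketched as the following Python fixed-point routine (the function name, root-bracketing intervals and tolerance are illustrative assumptions of ours); each sub-step solves one estimating equation of Equation (19) by root finding.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

def solve_sigma_alpha(x, beta, sigma0=1.0, alpha0=1.0, tol=1e-8, max_iter=500):
    """Alternate between the two estimating equations in (19) for fixed beta."""
    n = len(x)
    ax = np.abs(x) ** beta
    sigma, alpha = sigma0, alpha0
    for _ in range(max_iter):
        # sigma-step: root of n - (alpha*beta + 1) sum(beta|x|^b / (2 s^b + beta|x|^b)) = 0
        f_sig = lambda s: n - (alpha * beta + 1.0) * np.sum(beta * ax / (2.0 * s**beta + beta * ax))
        sigma_new = brentq(f_sig, 1e-8, 1e8)
        # alpha-step: root of n[psi(a + 1/b) - psi(a)] - sum ln(1 + beta|x|^b / (2 s^b)) = 0
        rhs = np.sum(np.log1p(beta * ax / (2.0 * sigma_new**beta)))
        f_alp = lambda a: n * (digamma(a + 1.0 / beta) - digamma(a)) - rhs
        alpha_new = brentq(f_alp, 1e-8, 1e8)
        done = max(abs(sigma_new - sigma), abs(alpha_new - alpha)) < tol
        sigma, alpha = sigma_new, alpha_new
        if done:
            break
    return sigma, alpha
```

The two bracketing intervals are valid because the first equation is monotone in σ and the second is monotone in α over the positive half-line.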
In what follows, the new algorithm to acquire the solution of Equation (18) can be briefly described as
1.
Choose an appropriate interval for the shape parameter β , generate a set of values of this parameter at appropriate spacing within the interval, and denote the j-th element of this vector by β ( j ) .
2.
Substituting β ( j ) into the first two equations of Equation (18) and applying Lemma 2, we obtain their convergent solution, denoted by ( σ ( j ) , α ( j ) ) .
3.
Repeat this procedure and observe the sign change in the left-hand side of the third equation of Equation (18). Then, by interpolating on the β values, we derive the final estimate β ^ M L E , which makes the third equation equal to 0 or sufficiently close to 0. Finally, substituting β ^ M L E into the first two equations of Equation (18), we derive the final estimates, denoted by ( σ ^ M L E , α ^ M L E ) .
It is noteworthy that this algorithm is applicable when the sample kurtosis is greater than 2.7. Compared with other numerical methods, our new algorithm is less sensitive to the precision of the initial values ( σ 0 , α 0 ) and reliably obtains convergent solutions.
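Because the first two equations are kept exactly satisfied along the scan, a sign change of the third score equation corresponds to a stationary point of the profile log-likelihood in β, so the scan can be coded, approximately equivalently, as a profile-likelihood grid search. The Python sketch below is an illustration with names of our own; it uses a general-purpose optimizer for (σ, α) at each grid value of β in place of the fixed-point scheme, and the sign convention of the log-likelihood was checked against the β = 2 Student-t special case.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln

def ngt_loglik(x, sigma, alpha, beta):
    """Observed-data log-likelihood of NGt(0, sigma, alpha, beta)."""
    n = len(x)
    return (-n * (1.0 + 1.0 / beta) * np.log(2.0 / beta)
            - n * betaln(alpha, 1.0 / beta) - n * np.log(sigma)
            - (alpha + 1.0 / beta) * np.sum(np.log1p(beta * np.abs(x)**beta / (2.0 * sigma**beta))))

def profile_beta(x, beta_grid):
    """Maximize over (sigma, alpha) at each beta; return the best beta and its fit."""
    best = None
    for b in beta_grid:
        # optimize on the log scale to keep sigma and alpha positive
        obj = lambda p: -ngt_loglik(x, np.exp(p[0]), np.exp(p[1]), b)
        res = minimize(obj, x0=[0.0, 0.0], method="Nelder-Mead")
        if best is None or -res.fun > best[0]:
            best = (-res.fun, b, np.exp(res.x[0]), np.exp(res.x[1]))
    ll, beta_hat, sigma_hat, alpha_hat = best
    return beta_hat, sigma_hat, alpha_hat, ll
```

In practice one would refine the grid around the returned β by interpolation, as described in step 3.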

5.5. An Improved PWM Estimation

In this subsection, we propose an improved PWM (IPWM) estimation based on order statistics transformation for N Gt ( 0 , σ , α , β ) and develop the corresponding algorithm to implement this novel method.
PWM estimation is a very effective and commonly used method for the parameter estimation of heavy-tailed distributions. Among the many parameter estimation methods proposed so far, PWM and MLE are the two most commonly used and are of comparable standing.
Let X 1 , , X n be a random sample from N Gt ( 0 , σ , α , β ) and let X 1 : n X n : n denote the corresponding order statistics. The traditional theoretical PWM was introduced by Greenwood et al. [24] as an alternative to the method of moments and is given by the following equation
M p , r , s ( x ) = E { X p [ G ( x ) ] r [ 1 G ( x ) ] s } , r = 0 , 1 , , s = 0 , 1 , .
Taking p = 1 , s = 0 , the moments become M 1 , r , 0 ( x ) , denoted by β r . Wang (1990) [25] obtained an unbiased estimator of β r through a linear combination of order statistics as follows
b r = 1 n n 1 r 1 i = 1 n i 1 r x i : n , r = 0 , 1 , .
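The unbiased estimator b r can be computed directly from the order statistics; the following Python helper (name ours) is a minimal sketch.

```python
import numpy as np
from math import comb

def sample_pwm(x, r):
    """Unbiased sample PWM: b_r = (1/n) * C(n-1, r)^(-1) * sum_i C(i-1, r) * x_{i:n}."""
    xs = np.sort(np.asarray(x, dtype=float))  # order statistics x_{1:n} <= ... <= x_{n:n}
    n = xs.size
    return sum(comb(i - 1, r) * xs[i - 1] for i in range(1, n + 1)) / (n * comb(n - 1, r))
```

For r = 0 this reduces to the sample mean, which gives a quick sanity check.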
However, applying the traditional PWM directly causes many problems due to the complexity of the distribution function. For example, the intricate expressions of the theoretical PWM make the estimation equations tedious and difficult to optimize numerically, thus impeding the use of this estimation method. As a result, we apply the monotone transformation Y = σ X . After this transformation, the estimation equations become simple and easy to solve by a numerical approach. The improved theoretical PWM (denoted by M 1 , r , 0 Y ) under this transformation is given by
β r Y = M 1 , r , 0 Y = E { Y [ G ( y ) ] r } , r = 0 , 1 , .
The corresponding improved sample PWM (denoted by b r Y ) is defined as follows
b r Y = 1 n n 1 r 1 i = 1 n i 1 r y i : n , r = 0 , 1 , ,
where y i : n = σ ˜ x i : n and σ ˜ is an initial guess of the scale parameter σ .
From Equation (20), we can derive the 2nd–5th improved theoretical PWMs as follows
β 1 Y = ( 2 β ) 1 β 2 B ( α , 1 β ) 0 1 z 2 β 1 ( 1 z ) α 1 β 1 I z 1 β , α d z , β 2 Y = β 1 Y , β 3 Y = 2 1 / β 3 ( 1 β ) 1 β B ( α , 1 β ) 0 1 z 2 β 1 ( 1 z ) α 1 β 1 I z 1 β , α d z + 3 0 1 z 2 β 1 ( 1 z ) α 1 β 1 I z 1 β , α 3 d z ,
β 4 Y = 2 1 / β 2 ( 1 β ) 1 β B ( α , 1 β ) 0 1 z 2 β 1 ( 1 z ) α 1 β 1 I z 1 β , α d z + 0 1 z 2 β 1 ( 1 z ) α 1 β 1 I z 1 β , α 3 d z .
Following the traditional idea of the method of moments, we set the improved theoretical PWMs β 3 Y , β 4 Y equal to the corresponding improved sample PWMs. This yields an equation set whose solution is the estimate of the unknown parameters. Due to the complexity of these equations, we transform the search for the solution of this equation set into an optimization problem and acquire the simultaneous estimates of the shape parameters, denoted by α ^ P W M , β ^ P W M , by minimizing the following expression to avoid failures during optimization
( α ^ P W M , β ^ P W M ) = arg min ( α , β ) β 3 Y + β 4 Y b 3 Y b 4 Y 2 .
Numerical techniques are recommended, such as the optim function in R with the method chosen as L-BFGS-B.
Finally, the estimate of the scale parameter σ ^ P W M can be obtained through the first original sample PWM of the random variable X, with the analytical expression given by the following equation
σ ^ P W M = 2 b 1 B ( α ^ P W M , 1 / β ^ P W M ) ( 2 β ^ P W M ) 1 / β ^ P W M 0 1 u 2 / β ^ P W M 1 ( 1 u ) α ^ P W M β ^ P W M 1 I u ( 1 / β ^ P W M , α ^ P W M ) d u .
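For σ = 1, the cdf admits the closed form G(y) = 1/2 + (1/2) I_z(1/β, α) for y ≥ 0, with z = β y^β / (2 + β y^β) and G(−y) = 1 − G(y), consistent with the expression for G[(2u)^{1/β}] used in Appendix A.2, so the improved theoretical PWMs and the objective in Equation (22) can be evaluated by one-dimensional numerical integration. The Python sketch below is our own illustration (all names hypothetical); the normalizing constant is written as (2/β)^{−(1+1/β)} / B(α, 1/β).

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import betainc, beta as beta_fn

def ngt_pdf(y, alpha, beta):
    """Density of NGt(0, 1, alpha, beta)."""
    c = (2.0 / beta) ** (-(1.0 + 1.0 / beta)) / beta_fn(alpha, 1.0 / beta)
    return c * (1.0 + beta * np.abs(y)**beta / 2.0) ** (-(alpha + 1.0 / beta))

def ngt_cdf(y, alpha, beta):
    """Cdf via the regularized incomplete beta function (sigma = 1)."""
    z = beta * np.abs(y)**beta / (2.0 + beta * np.abs(y)**beta)
    half = 0.5 * betainc(1.0 / beta, alpha, z)
    return 0.5 + half if y >= 0 else 0.5 - half

def theo_pwm(r, alpha, beta):
    """Improved theoretical PWM beta_r^Y = E{ Y [G(Y)]^r } for sigma = 1."""
    f = lambda y: y * ngt_cdf(y, alpha, beta)**r * ngt_pdf(y, alpha, beta)
    return quad(f, -np.inf, 0.0)[0] + quad(f, 0.0, np.inf)[0]

def ipwm_objective(alpha, beta, b3, b4):
    """Squared distance minimized in Equation (22)."""
    return (theo_pwm(3, alpha, beta) + theo_pwm(4, alpha, beta) - b3 - b4) ** 2
```

A useful check of the normalizing constant is that the density integrates to one; by symmetry, the r = 0 moment is zero whenever α β > 1.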
In this study, we found that the estimation results are sensitive to the choice of the initial guess σ ˜ . Therefore, we introduce the following two methods for choosing this parameter.
Method 1 (A two step estimation method)
1.
Take a consistent estimate of the parameter σ , such as the MLE, as its initial guess.
2.
Substitute this estimate σ ^ into Equation (22) to derive the improved PWM estimates of the two shape parameters. The estimate of the scale parameter σ can then be obtained through Equation (23).
With this method, the value of σ ˜ is relatively fixed and the final estimates depend largely on it, so it is sometimes inevitable that a fixed σ ˜ does not suit all samples. As a result, we establish the following algorithm, which selects σ ˜ flexibly; the suggested framework can be briefly described as:
Method 2 (A selection algorithm)
1.
Choose an appropriate interval for the parameter σ ˜ , generate a set of σ ˜ values at appropriate spacing within the interval, and denote the h-th element of the vector by σ ˜ ( h ) .
2.
Substitute each σ ˜ ( h ) into Equation (22) to acquire simultaneous estimates of the two shape parameters, denoted by α ^ P W M ( h ) and β ^ P W M ( h ) . After that, calculate the corresponding original 6th theoretical PWM kurtosis through the following equations
T k ( α ^ P W M ( h ) , β ^ P W M ( h ) ) = 0 1 u 2 β ^ P W M ( h ) 1 ( 1 u ) α ^ P W M ( h ) 1 β ^ P W M ( h ) 1 I u 1 β ^ P W M ( h ) , α ^ P W M ( h ) k d u ,
τ 6 ( h ) P W M = T 1 ( α ^ P W M ( h ) , β ^ P W M ( h ) ) + 2 T 3 ( α ^ P W M ( h ) , β ^ P W M ( h ) ) + T 5 ( α ^ P W M ( h ) , β ^ P W M ( h ) ) T 1 ( α ^ P W M ( h ) , β ^ P W M ( h ) ) .
3.
Repeat step 2 and calculate q P W M ( σ ˜ ) ( h ) = | τ 6 ( h ) P W M − M ^ 1 , 0 , 5 X / M ^ 1 , 0 , 1 X | , where M ^ 1 , 0 , 5 X and M ^ 1 , 0 , 1 X denote the original 6th and 2nd sample PWMs of the random variable X, respectively. Then, interpolate on the σ ˜ values and derive the σ ˘ that minimizes q P W M ( σ ˜ ) .
4.
Substituting σ ˘ into Equation (22), we derive the final estimates of the two shape parameters; the final estimate of σ can be obtained from Equation (23).

5.6. Parameter Estimation of N Gt ( μ , σ , α , β )   ( μ 0 )

In this subsection, we propose a profile maximum likelihood approach (PLA) using the EM algorithm to cope with parameter estimation for the new Gt distribution with four parameters.
When fitting distributions to data with good symmetry, three factors are usually scrutinized comprehensively, namely location, scale and shape parameters. Therefore, with the addition of the location parameter, we focus on the parameter estimation problem of N Gt ( μ , σ , α , β ) .
Balakrishnan (2019) [20] applied this approach to determine the MLEs for the parameters of the Student's t Birnbaum-Saunders distribution. As Balakrishnan points out, this method is an example of an exhaustive search. Due to the addition of the location parameter μ , the estimation equations become very complicated. To overcome this difficulty, μ can be fixed at a given value through the profile likelihood approach, reducing the four-parameter problem to a three-parameter one. The EM algorithm for the three-parameter case described in the preceding subsection then comes in handy.
Let Y 1 , , Y n be a random sample from N Gt ( μ , σ , α , β ) . In this case, the log-likelihood function is given by
ln L ( y | μ , σ , α , β ) = − n ( 1 + 1 / β ) ln ( 2 / β ) − n ln B ( α , 1 / β ) − n ln σ − ( α + 1 / β ) ∑ i = 1 n ln ( 1 + β | y i − μ |^β / ( 2 σ^β ) ) .
Since β | y i − μ |^β / ( 2 σ^β + β | y i − μ |^β ) < ln ( 1 + β | y i − μ |^β / ( 2 σ^β ) ) < β | y i − μ |^β / ( 2 σ^β ) , we can assert that the likelihood function L ( μ ) is bounded when the parameters ( σ , α , β ) are fixed.
A suggested framework for the PLA using the EM algorithm is briefly described as follows:
1.
Choose an appropriate interval for the location parameter μ . Then generate a set of μ values at appropriate spacing within the interval and denote the j-th element of the vector by μ ( j ) .
2.
Substitute each μ ( j ) into the log-likelihood function and implement the EM algorithm to derive the MLEs of ( σ , α , β ) .
3.
Repeating this procedure and then interpolating on the μ values, we obtain a curve of log-likelihood values and choose the value of μ that maximizes the curve as μ ^ M L E .
4.
Substitute μ ^ M L E into the log-likelihood function and implement the EM algorithm to determine the final MLEs, denoted by ( σ ^ M L E , α ^ M L E , β ^ M L E ) .
The implementation of the EM algorithm is similar to that in Section 5.2; we simply replace every | y i | in the partial derivatives with respect to σ , α , β and in the integrals with | y i − μ ^ M L E | . The trace of the log-likelihood function is depicted in Figure 5. Through simulation research, we find that this algorithm performs well for α > β and is more accurate and stable than directly using the sample mean as the estimate of the location parameter μ under small sample sizes.
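The grid-plus-profile procedure above can be sketched as follows; for brevity, this Python fragment maximizes the observed log-likelihood at each μ with a general-purpose optimizer instead of the EM algorithm (a substitution of ours), and all names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln

def ngt_loglik4(y, mu, sigma, alpha, beta):
    """Observed-data log-likelihood of NGt(mu, sigma, alpha, beta)."""
    n = len(y)
    r = np.abs(y - mu)
    return (-n * (1.0 + 1.0 / beta) * np.log(2.0 / beta)
            - n * betaln(alpha, 1.0 / beta) - n * np.log(sigma)
            - (alpha + 1.0 / beta) * np.sum(np.log1p(beta * r**beta / (2.0 * sigma**beta))))

def profile_mu(y, mu_grid):
    """Profile the log-likelihood over a mu grid; return mu_hat and its fit."""
    prof = []
    for mu in mu_grid:
        # maximize over (log sigma, log alpha, log beta) with mu fixed
        obj = lambda p: -ngt_loglik4(y, mu, np.exp(p[0]), np.exp(p[1]), np.exp(p[2]))
        res = minimize(obj, x0=[0.0, 0.0, 0.5], method="Nelder-Mead")
        prof.append((-res.fun, mu, res.x))
    ll, mu_hat, p = max(prof, key=lambda t: t[0])
    return mu_hat, np.exp(p), ll
```

The resulting curve of profile log-likelihood values against μ is the analogue of the trace in Figure 5, and interpolating on it refines μ ^ M L E beyond the grid resolution.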

6. Simulation Study

In this section, we conduct a simulation study to illustrate the performance of the proposed estimation methods and compare these simulation results under different parameter values and different sample sizes.
In line with practical needs, the sample sizes are chosen as 25, 50, 100 and 200. We perform Monte Carlo simulations and evaluate the performance of the considered methods through the relative bias (RB) and the relative mean-square error (RMSE). These two indicators are computed over 5000 replications and are defined as follows
R B ( θ ^ ) = ( 1 / N ) ∑ i = 1 N ( θ ^ i − θ ) / θ ,
R M S E ( θ ^ ) = ( 1 / N ) ∑ i = 1 N [ ( θ ^ i − θ ) / θ ]^2 .
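Assuming the standard definitions (relative bias as the mean relative error and relative MSE as the mean squared relative error), the two indicators can be computed as:

```python
import numpy as np

def rb(estimates, theta):
    """Relative bias: mean of (theta_hat_i - theta) / theta."""
    e = np.asarray(estimates, dtype=float)
    return np.mean((e - theta) / theta)

def rmse(estimates, theta):
    """Relative mean-square error: mean of ((theta_hat_i - theta) / theta)^2."""
    e = np.asarray(estimates, dtype=float)
    return np.mean(((e - theta) / theta) ** 2)
```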
This section is detailed in three parts. First, we conduct a comparative study of three different estimation methods (MMOM, MLE using the EM algorithm and IPWM) for N Gt ( 0 , σ , α , β ) . We mainly study the influence of the shape parameter β on the performance of these methods. Accordingly, we fix σ = 1 , α = 3 , and β is chosen as 2 , 3 , 4 , respectively. To show the comparison more vividly, we draw the following figures (Figure 6, Figure 7 and Figure 8); the estimated values of these three parameters are shown in Appendix A.4 (Part IV).
It can be seen that the two indicators, RB and RMSE, of these three estimation methods gradually decrease as the sample size increases. As the value of the shape parameter β increases, the estimation accuracy of the IPWM and the MLE using the EM algorithm for the shape parameter α gradually improves. Meanwhile, we can conclude that the IPWM performs best in both accuracy and stability, especially when the distribution has a heavier tail, while the MLE using the EM algorithm is second best overall. The performance of these two methods is comparable under large sample sizes or high theoretical kurtosis. Furthermore, the IPWM is suitable for distributions with heavy tails, and the MLE using the EM algorithm can serve as an alternative estimation method for steep-shaped distributions or large sample sizes.
Second, a comparative study of the MLE using the EM algorithm and the MLE via the new iterative algorithm with sample kurtosis greater than 2.7 is considered. The numerical results are presented in Table 1 for fixed μ = 0 , σ = 1 , α = 3 , β = 3 .
Table 1 suggests that the new iterative algorithm generally performs better than the EM algorithm when the sample kurtosis exceeds 2.7, especially in the estimation accuracy and stability of the shape parameter β . For the shape parameter α , the iterative algorithm provides higher estimation accuracy, while the EM algorithm is more stable.
Finally, a simulation study of the PLA using the EM algorithm for N Gt ( μ , σ , α , β ) ( μ ≠ 0 ) , considered in Section 5.6, is performed. We mainly focus on the influence of the shape parameter α and the sample size on the estimation performance of this method. Therefore, we fix μ = 1 , σ = 1 , β = 2 , and α is chosen as 2 , 2.5 , 3 , respectively. Line graphs of the relationship between RB, RMSE and sample size for different values of the shape parameter α are shown below (Figure 9), and the estimated values of these parameters are given in Table A4.
Figure 9 reveals that as the value of the shape parameter α increases, the estimation results improve. It is concluded that with the increase of the theoretical kurtosis, the accuracy and stability of the MLEs are gradually improved. In addition, the estimates of the location parameter μ and the shape parameter β are less affected by changes in the shape parameter α , while the two indicators for the scale parameter σ and the shape parameter α are more sensitive to such changes. This means that it is not very difficult to estimate the location parameter of a distribution with good symmetry, while the shape parameters of a heavy-tailed distribution are more difficult to estimate.

7. Real Data Analysis

In this section, a real data analysis is performed to illustrate the feasibility of the new Gt distribution and a comparison with the Gt distribution is also presented.
These data concern the U.S. stock market and were introduced by Shen et al. (2020) [26]. The basic descriptive statistics of the data set are shown in Table 2, which contains the sample size n, the mean Y ¯ , the standard deviation S, the skewness coefficient b 1 , the kurtosis coefficient b 2 , the minimum min ( Y ) , and the maximum max ( Y ) . It can be seen from the table that the data set has good symmetry and high kurtosis.
To obtain the MLEs of the unknown parameters, the PLA depicted in Figure 10 is recommended to avoid possible numerical optimization problems. The fitting results of the two distributions and the estimated parameter values are presented in Table 3. Based on the AIC criterion, we can infer that the new Gt distribution is slightly more suitable for fitting this data set than the Gt distribution. The histogram of the considered data with the two fitted density functions is drawn in Figure 11. It is clearly seen that the pdf of the new Gt distribution is steeper in shape with a lighter tail, while that of the Gt distribution is the opposite.

8. Conclusions

In this paper, we propose a new Gt distribution, based on a construction approach, that is suitable for fitting data with both high kurtosis and heavy tails. Firstly, we investigated the main properties of the new distribution, including moments, skewness coefficients, kurtosis coefficients and random number generation. Secondly, we derived the explicit expression for the moments of order statistics, as well as the corresponding variance–covariance matrix, through recurrence relations and a distribution transformation technique. The method thus obtained is highly efficient and greatly reduces computation time. We then focused on the parameter estimation of the new Gt distribution. Several estimation methods, including the MMOM, MLE using the EM algorithm, MLE using a new iterative algorithm and the IPWM, were introduced. Among these, the IPWM performs best overall, and this novel method ensures that parameter estimation for this distribution is no longer limited to the MLE and MMOM. Furthermore, the new iterative algorithm for acquiring the MLE is more suitable than the EM algorithm when the sample kurtosis is greater than 2.7. For the four-parameter new Gt distribution, we established an EM-type algorithm through the profile maximum likelihood approach and discovered that variation in the shape parameter α has a significant effect on the estimation performance of the scale parameter σ and the shape parameter α itself. However, there are still some limitations and areas to be improved in our research. The parameter estimation method is limited to the PLA using the EM algorithm when the distribution has a heavier tail. Therefore, proposing more efficient and accurate estimation methods will be the focus of our future research. In addition, the new Gt distribution is suitable for fitting data with good symmetry; in future work, it can be extended to asymmetric situations by adding a skewness parameter.

Author Contributions

R.G.: Conceptualization, Formal analysis, Methodology, Software, Writing—original draft, Data curation; W.C.: Conceptualization, Formal analysis, Methodology, Investigation, Supervision, Funding acquisition, Writing—review and editing; X.Z.: Methodology, Formal analysis, Supervision, Writing—review and editing, Data curation, Funding acquisition, Validation; Y.R.: Funding acquisition, Writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by the National Natural Science Foundation of China (Grant No. 11701021), the National Natural Science Foundation of China (Grant No. 11801019), the Beijing Natural Science Foundation (Grant No. Z190021), the Science and Technology Program of Beijing Education Commission (Grant No. KM202110005013) and the Science and Technology Project of Beijing Municipal Education Commission(Grant No. KM202110005012).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available in a publicly accessible repository; the data presented in this study are openly available in Shen et al., (2020).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Part I

Proof. Proof of Theorem 1
 
According to the pdf of the quotient of two random variables, we have
g T ( t ; σ , α , β ) = β 2 1 + 1 / β Γ ( 1 / β ) σ × 1 Γ ( α ) β α 1 0 + e | t μ | β z β / ( 2 σ β ) z α β e z β / β d z = 1 2 1 + 1 β Γ ( α ) Γ ( 1 β ) σ 0 + z α β e [ | t μ | β z β / ( 2 σ β ) + 1 / β ] z β d z .
Let u = [ | t μ | β z β / ( 2 σ β ) + 1 / β ] z β , we obtain
0 + z α β e [ | t μ | β z β / ( 2 σ β ) + 1 / β ] z β d z = 1 β + | t μ | β 2 σ β ( α + 1 β ) 1 β 0 + u α + 1 β 1 e u d u = β α + 1 β 1 Γ α + 1 β 1 + | t μ | β 2 σ β ( α + 1 β ) .
By substituting the above expression into (A1), we can derive the pdf of the random variable T in Theorem 1.   □
Proof. Proof of Proposition 1
 
f Y ( y ) = 0 + g ( y | u ) g ( u ) d u = β 1 α 2 1 + 1 β Γ ( 1 β ) Γ ( α ) σ 0 + u α + 1 β 1 e x p u | y μ | β 2 σ β u β d u = 1 ( 2 β ) 1 + 1 β B ( α , 1 β ) σ 1 + β | y μ | β 2 σ β ( α + 1 β ) .

Appendix A.2. Part II

Proof. Proof of Lemma 1
 
Notice that
F Z ( z ) = 2 G ( 2 1 β z ) 1 and f Z ( z ) = 2 1 + 1 β g ( 2 1 β z ) ,
the following steps are similar to those in Govindarajulu (1963) [16].   □
Proof. Proof of Theorem 2
 
Through the traditional way of calculating the k-th moments of order statistics, we have
E ( X r : n k ) = C r , n + x k [ G ( x ) ] r 1 [ 1 G ( x ) ] n r g ( x ) d x = C r , n 0 + x k [ G ( x ) ] r 1 [ 1 G ( x ) ] n r g ( x ) d x + 0 x k [ G ( x ) ] r 1 [ 1 G ( x ) ] n r g ( x ) d x = C r , n I 1 ( k , r 1 , n r ) + I 2 ( k , r 1 , n r ) .
From the symmetric properties of this distribution, letting u = − x , we have
I 2 ( k , r 1 , n r ) = C r , n ( 1 ) k 0 + u k [ 1 G ( u ) ] r 1 [ G ( u ) ] n r g ( u ) d u .
We notice that
G [ ( 2 u ) 1 β ] = 1 2 1 + I β u / ( 1 + β u ) 1 β , α u > 0 ,
where I x ( a , b ) = [ 1 / B ( a , b ) ] ∫ 0 x t^{ a − 1 } ( 1 − t )^{ b − 1 } d t denotes the incomplete beta function.
I 1 ( k , r 1 , n r ) = C r , n 2 ( 1 β ) 1 β B ( 1 β , α ) 0 + u k + 1 β 1 ( 1 + β u ) ( α + 1 β ) { G [ ( 2 u ) 1 β ] } r 1 { 1 G [ ( 2 u ) 1 β ] } n r d u let z = β u 1 + β u = 2 k β 1 ( 1 β ) k β B ( 1 β , α ) 0 1 z k + 1 β 1 ( 1 z ) α + k + 2 β 1 1 + I z ( 1 β , α ) r 1 1 I z 1 β , α n r d z = 2 k β n ( 1 β ) k β B ( 1 β , α ) i = 0 r 1 j = 0 n r ( 1 ) j r 1 i n r j 0 1 z k + 1 β 1 ( 1 z ) α k β 1 I z 1 β , α i + j d z = 2 k β n ( 1 β ) k β B ( 1 β , α ) i = 0 r 1 j = 0 n r ( 1 ) j r 1 i n r j A k + 1 β , α k β 1 , i + j .
When α is a real non-integer, by using the generalized multinomial theorem, we obtain
I z 1 β , α = z 1 β B ( 1 β , α ) k = 0 ( 1 α ) k z k ( 1 β + k ) k ! ,
I z 1 β , α i + j = z i + j β [ B ( 1 β , α ) ] i + j m 1 + + m i + j = 0 ( 1 α ) m 1 ( 1 α ) m i + j ( 1 β + m 1 ) ( 1 β + m i + j ) × z m 1 + + m i + j m 1 ! m i + j !
With the above equations A ( k + 1 β , α k β 1 , i + j ) can be rewritten as
A k + 1 β , α k β 1 , i + j = B 1 β , α ( i + j ) m 1 + + m i + j = 0 ( 1 α ) m 1 ( 1 α ) m i + j ( 1 β + m 1 ) ( 1 β + m i + j ) = β i + j B ( k + i + j + 1 β , α k β ) [ B ( α , 1 β ) ] i + j × F 1 : 1 1 : 2 ( ( k + i + j + 1 β ) : ( 1 α , 1 β ) ; ; ( 1 α , 1 β ) : ( α + i + j + 1 β ) : ( 1 β + 1 ) ; ; ( 1 β + 1 ) : 1 ; ; 1 ) ) ,
and the above expression exists for α β > k .
For a sufficiently large N, we can obtain the bound
B ( k + i + j + 1 β , α k β ) [ B ( 1 β , α ) ] i + j m a x ( m 1 , , m n i + j ) > N | ( 1 α ) m 1 ( 1 α ) m i + j ( 1 β + m 1 ) ( b + m i + j ) × 1 ( m 1 ! m i + j ! ) | < Γ ( k + i + j + 1 β ) [ B ( 1 β , α ) ] i + j m a x ( m 1 , , m i + j ) > N | ( 1 α ) m 1 ( 1 α ) m n i + j ( 1 β + m 1 ) ( 1 β + m i + j ) × 1 m 1 ! m i + j ! | = Γ ( k + i + j + 1 β ) [ B ( 1 β , α ) ] i + j [ m 1 = 0 m i + j = 0 | ( 1 α ) m 1 ( 1 a ) m i + j | ( 1 β + m 1 ) ( 1 β + m i + j ) m 1 ! m i + j ! m 1 = 0 N m n i + j = 0 N | ( 1 α ) m 1 ( 1 α ) m i + j | ( 1 β + m 1 ) ( 1 β + m i + j ) m 1 ! m i + j ! ] = Γ ( k + i + j + 1 β ) [ B ( 1 β , α ) ] i + j m = 0 | ( 1 α ) m | ( b + m ) m ! i + j m = 0 N | ( 1 α ) m | ( b + m ) m ! i + j < .
Proof. Proof of Theorem 3
 
By using the generalized multinomial theorem, we acquire
I β u / ( 1 + β u ) 1 β , α = 1 B ( α , 1 β ) k = 0 ( 1 1 β ) k ( 1 + β u ) ( α + k ) ( α + k ) k ! ,
and
I β u / ( 1 + β u ) 1 β , α i + j = B α , 1 β ( i + j ) m 1 m i + j = 0 ( 1 1 β ) m 1 ( 1 1 β ) m i + j ( α + m 1 ) ( α + m i + j ) ( 1 + β u ) [ α ( i + j ) + l = 1 i + j m l ] .
By substituting the above two expressions into A ( k + 1 β 1 , α + 1 β , i + j ) , we obtain
A ( k + 1 β , α + 1 β , i + j ) = ( 1 β ) k + 1 β α ( i + j ) B ( α , 1 β ) i + j B k + 1 β , α ( i + j + 1 ) k β × F 1 : 1 1 : 2 ( ( α ( i + j + 1 ) k β ) : ( 1 1 β , α ) ; ; ( 1 1 β , α ) : ( α ( i + j + 1 ) + 1 β ) : ( α + 1 ) ; ; ( α + 1 ) : 1 ; 1 ) ,
this expression exists for α > k β .   □
Proof. Proof of Theorem 4
 
Let G i ( x ) denote the cdf with parameters ( μ i , σ i , β , α i ) . Barakat and Abdelkader (2004) [27] proposed the following expressions to calculate the moments of order statistics in the INID case
μ r : n ( m ) + = j = n r + 1 n ( 1 ) j ( n r + 1 ) j 1 n r I j + ( m ) ,
μ r : n ( m ) = j = r n ( 1 ) j r j 1 r 1 I j ( m ) ,
where
I j + ( m ) = 1 i 1 < i 2 < i j n m 0 + x m 1 t = 1 j [ 1 G i t ( x ) ] d x , I j ( m ) = 1 i 1 < i 2 < i j n m 0 x m 1 t = 1 j G i t ( x ) d x .
Letting u = − x , from the symmetric properties of this distribution, we can re-express I j − ( m ) as
I j ( m ) = 1 i 1 < i 2 < i j n m ( 1 ) m 0 + x m 1 t = 1 j [ 1 G i t ( x ) ] d x .
From Section 2, we know that G i t [ ( 2 u ) 1 β ] = 1 2 [ 1 + I β u / ( 1 + β u ) ( 1 β , α i t ) ] . Using a series expansion, we immediately acquire the following expressions
1 G i t ( ( 2 u ) 1 β ) = 1 2 B ( α i t , 1 β ) k = 0 ( 1 1 β ) k ( α i t + k ) k ! ( 1 + β u ) ( α i t + k ) ,
t = 1 j 1 G i t ( ( 2 u ) 1 β ) = 2 j t = 1 j B ( α i t , 1 β ) m 1 = 0 m j = 0 ( 1 1 β ) m 1 ( 1 1 β ) m j ( α i 1 + m 1 ) ( α i j + m j ) × ( m 1 ! m j ! ) 1 × ( 1 + β u ) ( t = 1 j α i t + m 1 + m j ) .
By substituting the above equation into the expression for I j + ( m ) , we obtain
I j + ( m ) = 1 i 1 < i 2 < i j n m β 2 j t = 1 j B ( α i t , 1 β ) 1 0 + ( 2 u ) m β 1 m 1 = 0 m j = 0 ( 1 1 β ) m 1 ( 1 1 β ) m j ( α i 1 + m 1 ) ( α i j + m j ) × ( 1 + β u ) ( t = 1 j α i t + m 1 + m j ) m 1 ! m j ! d u = 1 i 1 < i 2 < i j n m 2 m β 1 j t = 1 j B ( α i t , 1 β ) ( 1 β ) m β 1 m 1 = 0 m j = 0 ( 1 1 β ) m 1 ( 1 1 β ) m j ( α i 1 + m 1 ) ( α i j + m j ) × 1 m 1 ! m j ! B m β , t = 1 j α i t + m 1 + m j m β .
By using the definition of the generalized Kampé de Fériet function, the above equations can be re-expressed as
I j + ( m ) = m 2 m β 1 j t = 1 j α i t B ( α i t , 1 β ) 1 β m β 1 B m β , t = 1 j α i t m β × F 1 : 1 1 : 2 ( ( t = 1 j α i t m β ) : ( 1 1 β , α i 1 ) ; ; ( 1 1 β , α i j ) : ( t = 1 j α i t ) : ( α i 1 + 1 ) ; ; ( α i j + 1 ) : 1 ; ; 1 ) ,
and the above expression exists for α i t > m β .   □

Appendix A.3. Part III

Proof. Proof of Theorem 5
 
Let
h k ( β ) = [ Γ ( 1 / β ) ] k 1 Γ [ ( 2 k + 1 ) / β ] [ Γ ( 3 / β ) ] k , I k ( α , β ) = [ Γ ( α ) ] k 1 Γ ( α 2 k / β ) [ Γ ( α 2 / β ) ] k ( k > 2 ) .
In order to prove the monotonicity, we take the logarithms of the above two functions, respectively. Differentiating with respect to the parameter β , we obtain
ln h k ( β ) β = 1 β 2 ( k + 1 ) φ 1 β + ( 2 k + 1 ) φ 2 k + 1 β 3 k φ 3 β using the fact that φ ( z ) = k = 0 1 k + z = 4 k 2 4 k β 3 × n ( n + 1 / β ) ( n + ( 2 k + 1 ) / β ) ( n + 3 / β ) < 0 ,
As a result, h k ( β ) is monotonically decreasing with respect to the parameter β .
For I k ( α , β ) , taking logarithm, we have
ln I k ( α , β ) = ( k 1 ) ln Γ ( α ) + ln Γ ( α 2 k / β ) k ln Γ ( α 2 / β ) .
Taking partial derivative with respect to parameter α , we have
∂ ln I k ( α , β ) / ∂ α = ( k − 1 ) φ ( α ) + φ ( α − 2 k / β ) − k φ ( α − 2 / β ) = − ( 4 k ( k − 1 ) / β^2 ) ∑ n = 0 ∞ 1 / [ ( n + α ) ( n + α − 2 k / β ) ( n + α − 2 / β ) ] < 0 .
Taking partial derivative with respect to parameter β , we obtain
∂ ln I k ( α , β ) / ∂ β = ( 2 k / β^2 ) [ φ ( α − 2 k / β ) − φ ( α − 2 / β ) ] < 0 .
From
Γ ( n t ) = n^{ n t − 1/2 } ( 2 π )^{ ( 1 − n ) / 2 } Γ ( t ) Γ ( t + 1 / n ) ⋯ Γ [ t + ( n − 1 ) / n ] ,
we have
lim β 0 + h k ( β ) = lim t + h k ( t ) let t = 1 β = lim t + [ Γ ( t ) ] k 1 Γ ( t ) Γ [ t + 1 / ( 2 k + 1 ) ] Γ [ t + 2 k / ( 2 k + 1 ) ] [ Γ ( t ) Γ ( t + 1 / 3 ) Γ ( t + 2 / 3 ) ] k × ( 2 k + 1 ) ( 2 k + 1 ) t / ( ( 2 π ) k 2 k + 1 ) 3 3 t / ( 2 π 3 ) since lim t + Γ ( t + a ) Γ ( t ) t a = 1 , the above equation can be rewritten as = lim t + ( 2 k + 1 ) ( 2 k + 1 ) t / ( ( 2 π ) k 2 k + 1 ) 3 3 t / ( 2 π 3 ) × [ Γ ( t ) ] k 1 [ Γ ( t ) ] 2 k + 1 t i = 1 2 k i / ( 2 k + 1 ) [ Γ ( t ) ] k [ Γ ( t ) t 1 / 3 ] k [ Γ ( t ) t 2 / 3 ] k = lim t + ( 2 k + 1 ) ( 2 k + 1 ) t 2 k + 1 × ( 3 3 t ) k = + .
In the same way,
$$\lim_{\beta\to+\infty}h_k(\beta)\stackrel{t=1/\beta}{=}\lim_{t\to 0^{+}}\frac{[\Gamma(t)]^{k-1}\,\Gamma[(2k+1)t]}{[\Gamma(3t)]^{k}}=\frac{3^{k/2}}{\sqrt{2k+1}}\cdot\frac{\prod_{i=1}^{2k}\Gamma\!\left(\frac{i}{2k+1}\right)}{\left[\Gamma\!\left(\frac{1}{3}\right)\Gamma\!\left(\frac{2}{3}\right)\right]^{k}}.$$
Besides that, it is easy to prove that
$$\lim_{\beta\to+\infty}I_k(\alpha,\beta)=1,\qquad\lim_{\alpha\to+\infty}I_k(\alpha,\beta)=1,\qquad\lim_{\alpha\to(2k/\beta)^{+}}I_k(\alpha,\beta)=+\infty.$$
With the above conclusions, the result of the theorem is obvious.   □
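The Gauss multiplication formula used above can be checked numerically on the log scale; the helper name `lgamma_mult` and the test grid are illustrative assumptions.

```python
import math

def lgamma_mult(n: int, t: float) -> float:
    """Log of the right-hand side of the Gauss multiplication formula:
    Gamma(n t) = n^{n t} / ((2 pi)^{(n-1)/2} sqrt(n)) * prod_{i=0}^{n-1} Gamma(t + i/n)."""
    return (n * t * math.log(n)
            - 0.5 * (n - 1) * math.log(2 * math.pi)
            - 0.5 * math.log(n)
            + sum(math.lgamma(t + i / n) for i in range(n)))

# compare against lgamma(n*t) for several orders n and arguments t
for n in (2, 3, 5):
    for t in (0.7, 1.3, 4.2):
        assert abs(lgamma_mult(n, t) - math.lgamma(n * t)) < 1e-9
print("Gauss multiplication formula verified")
```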
Calculation of the asymptotic variances and covariances of the estimates under the EM algorithm.
The second partial derivatives of the complete log-likelihood function are given by
$$\frac{\partial^{2}\ln L(\theta\,|\,Z)}{\partial\sigma^{2}}=\frac{n}{\sigma^{2}}-\frac{\beta^{2}(\beta+1)}{4\sigma^{2}}\sum_{i=1}^{n}\left|\frac{y_i}{\sigma}\right|^{\beta}U_i,\qquad\frac{\partial^{2}\ln L(\theta\,|\,Z)}{\partial\sigma\,\partial\alpha}=0,$$
$$\frac{\partial^{2}\ln L(\theta\,|\,Z)}{\partial\beta\,\partial\sigma}=\frac{\beta}{2\sigma}\sum_{i=1}^{n}\left|\frac{y_i}{\sigma}\right|^{\beta}U_i+\frac{\beta^{2}}{4\sigma}\sum_{i=1}^{n}\left|\frac{y_i}{\sigma}\right|^{\beta}\ln\left|\frac{y_i}{\sigma}\right|U_i,$$
$$\frac{\partial^{2}\ln L(\theta\,|\,Z)}{\partial\alpha^{2}}=-n\,\varphi'(\alpha),\qquad\frac{\partial^{2}\ln L(\theta\,|\,Z)}{\partial\alpha\,\partial\beta}=0,$$
$$\frac{\partial^{2}\ln L(\theta\,|\,Z)}{\partial\beta^{2}}=\frac{2n}{\beta^{3}}\left[\ln\beta-\frac{3}{2}-2\ln 2-\varphi\!\left(\frac{1}{\beta}\right)\right]-\frac{n}{\beta^{2}}-\frac{n}{\beta^{4}}\varphi'\!\left(\frac{1}{\beta}\right)+\frac{2}{\beta^{3}}\sum_{i=1}^{n}\ln U_i-\frac{1}{2}\sum_{i=1}^{n}\left|\frac{y_i}{\sigma}\right|^{\beta}\ln\left|\frac{y_i}{\sigma}\right|U_i-\frac{\beta}{4}\sum_{i=1}^{n}\left|\frac{y_i}{\sigma}\right|^{\beta}\left(\ln\left|\frac{y_i}{\sigma}\right|\right)^{2}U_i.$$
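These second derivatives can be cross-checked against finite differences, assuming the complete-data log-likelihood implied by the joint density of Proposition 2 (with $v=|y|/\sigma$); the function name `loglik`, the small synthetic arrays `y`, `u`, and the evaluation point are illustrative assumptions, not values from the paper.

```python
import math

def loglik(sigma, alpha, beta, y, u):
    # Hypothetical complete-data log-likelihood consistent with the joint
    # density h_{U,V}(u, v); constants free of (sigma, alpha, beta) are
    # irrelevant for the derivatives checked below.
    n = len(y)
    const = n * ((1 + 1 / beta) * math.log(beta)
                 - (alpha + 2 / beta) * math.log(2)
                 - math.lgamma(1 / beta) - math.lgamma(alpha)
                 - math.log(sigma))
    s = sum((alpha + 1 / beta - 1) * math.log(ui)
            - beta / 4 * ui * (abs(yi) / sigma) ** beta - ui / 2
            for yi, ui in zip(y, u))
    return const + s

y = [0.3, -1.2, 0.8, -0.5, 2.1]
u = [0.9, 1.4, 0.7, 1.1, 0.6]
sigma, alpha, beta = 1.3, 2.5, 1.8
n, h = len(y), 1e-4

# central finite difference for d^2 lnL / d sigma^2
d2s = (loglik(sigma + h, alpha, beta, y, u)
       - 2 * loglik(sigma, alpha, beta, y, u)
       + loglik(sigma - h, alpha, beta, y, u)) / h ** 2
analytic = (n / sigma ** 2 - beta ** 2 * (beta + 1) / (4 * sigma ** 2)
            * sum(ui * (abs(yi) / sigma) ** beta for yi, ui in zip(y, u)))
assert abs(d2s - analytic) < 1e-3

# the mixed derivative with respect to (sigma, alpha) should vanish
dsa = (loglik(sigma + h, alpha + h, beta, y, u)
       - loglik(sigma + h, alpha - h, beta, y, u)
       - loglik(sigma - h, alpha + h, beta, y, u)
       + loglik(sigma - h, alpha - h, beta, y, u)) / (4 * h * h)
assert abs(dsa) < 1e-4
print("finite differences agree with the analytic second derivatives")
```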
From Proposition 2, letting $V=|Y|/\sigma$, the joint density function of $(U,V)$ is given by
$$h_{U,V}(u,v)=\frac{\beta^{1+\frac{1}{\beta}}}{2^{\alpha+\frac{2}{\beta}}\,\Gamma\!\left(\frac{1}{\beta}\right)\Gamma(\alpha)}\,u^{\alpha+\frac{1}{\beta}-1}\exp\!\left(-\frac{\beta u v^{\beta}}{4}-\frac{u}{2}\right),\qquad u,v>0.$$
Then, through the above density function, the expectations can be calculated as
$$E(V^{\beta}U)=\frac{\beta^{1+\frac{1}{\beta}}}{2^{\alpha+\frac{2}{\beta}}\,\Gamma\!\left(\frac{1}{\beta}\right)\Gamma(\alpha)}\int_{0}^{+\infty}u^{\alpha+\frac{1}{\beta}}e^{-\frac{u}{2}}\int_{0}^{+\infty}v^{\beta}\exp\!\left(-\frac{\beta u v^{\beta}}{4}\right)dv\,du\stackrel{t=\beta u v^{\beta}/4}{=}\frac{4}{\beta^{2}},$$
$$E(V^{\beta-1}U)=\frac{\beta^{1+\frac{1}{\beta}}}{2^{\alpha+\frac{2}{\beta}}\,\Gamma\!\left(\frac{1}{\beta}\right)\Gamma(\alpha)}\int_{0}^{+\infty}u^{\alpha+\frac{1}{\beta}}e^{-\frac{u}{2}}\int_{0}^{+\infty}v^{\beta-1}\exp\!\left(-\frac{\beta u v^{\beta}}{4}\right)dv\,du\stackrel{t=\beta u v^{\beta}/4}{=}\frac{2^{2-\frac{1}{\beta}}\,\beta^{\frac{1}{\beta}-1}}{B(\alpha,1/\beta)},$$
$$E[(\ln V)V^{\beta}U]=\frac{\beta^{1+\frac{1}{\beta}}}{2^{\alpha+\frac{2}{\beta}}\,\Gamma\!\left(\frac{1}{\beta}\right)\Gamma(\alpha)}\int_{0}^{+\infty}u^{\alpha+\frac{1}{\beta}}e^{-\frac{u}{2}}\int_{0}^{+\infty}(\ln v)\,v^{\beta}\exp\!\left(-\frac{\beta u v^{\beta}}{4}\right)dv\,du.$$
The inner integral can be calculated as
$$\int_{0}^{+\infty}(\ln v)\,v^{\beta}\exp\!\left(-\frac{\beta u v^{\beta}}{4}\right)dv\stackrel{t=\beta u v^{\beta}/4}{=}\frac{1}{\beta^{2}}\left(\frac{4}{\beta u}\right)^{1+1/\beta}\Gamma\!\left(1+\frac{1}{\beta}\right)\left[\ln 4-\ln\beta-\ln u+\varphi\!\left(1+\frac{1}{\beta}\right)\right].$$
By substituting the above expression into the double integral, we have
$$E[(\ln V)V^{\beta}U]=\frac{4}{\beta^{3}}\left[\ln\frac{2}{\beta}-\varphi(\alpha)+\varphi\!\left(1+\frac{1}{\beta}\right)\right].$$
In the same way, we obtain
$$E[(\ln V)^{2}V^{\beta}U]=\frac{4}{\beta^{4}}\left\{\left[\ln\frac{2}{\beta}-\varphi(\alpha)+\varphi\!\left(1+\frac{1}{\beta}\right)\right]^{2}+\varphi'(\alpha)+\varphi'\!\left(1+\frac{1}{\beta}\right)\right\}.$$
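The closed form $E(V^{\beta}U)=4/\beta^{2}$ can be verified by Monte Carlo. The joint density of $(U,V)$ factorizes: marginally $U\sim\mathrm{Gamma}(\alpha,\text{scale }2)$ and, given $U=u$, $W=V^{\beta}\sim\mathrm{Gamma}(1/\beta,\text{scale }4/(\beta u))$. The sample size, seed, and parameter values below are arbitrary illustrative choices.

```python
import math
import random

random.seed(7)
alpha, beta = 3.0, 2.0
N = 200_000

# Sample from the joint density h_{U,V}(u, v) via its factorization:
# U ~ Gamma(alpha, scale 2); W = V**beta | U=u ~ Gamma(1/beta, scale 4/(beta*u)).
acc = 0.0
for _ in range(N):
    u = random.gammavariate(alpha, 2.0)
    w = random.gammavariate(1.0 / beta, 4.0 / (beta * u))  # w = v**beta
    acc += w * u                                           # v**beta * u
mc = acc / N
assert abs(mc - 4.0 / beta ** 2) < 0.05  # closed form: 4/beta^2 = 1.0 here
print(f"Monte Carlo E(V^beta U) = {mc:.3f}, closed form = {4 / beta**2:.3f}")
```

Note that $WU$ is in fact exactly $\mathrm{Gamma}(1/\beta,\text{scale }4/\beta)$ distributed, so the estimator has small variance and the tolerance above is conservative.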
Proof of Lemma 2.
First of all, we prove that the first and the second equations each have a unique solution for fixed $(\beta,\alpha)$ and $(\sigma,\beta)$, respectively.
Let
$$L_{1}(\sigma)=n-(\alpha\beta+1)\sum_{i=1}^{n}\frac{\beta|x_i|^{\beta}}{2\sigma^{\beta}+\beta|x_i|^{\beta}}.$$
Obviously, this function is monotonically increasing with respect to $\sigma$, with $\lim_{\sigma\to 0^{+}}L_{1}(\sigma)=-n\alpha\beta<0$ and $\lim_{\sigma\to+\infty}L_{1}(\sigma)=n>0$. Therefore, there must exist a root of $L_{1}(\sigma)=0$ in $(0,+\infty)$.
In the same way, let
$$L_{2}(\alpha)=n\left[\varphi\!\left(\alpha+\frac{1}{\beta}\right)-\varphi(\alpha)\right]-\sum_{i=1}^{n}\ln\!\left(1+\frac{\beta|x_i|^{\beta}}{2\sigma^{\beta}}\right),$$
and it is easy to prove that this function is monotonically decreasing with respect to $\alpha$.
From Taylor’s theorem we obtain
$$L_{2}(\alpha)\approx\frac{n}{\beta}\,\varphi'(\alpha)-\sum_{i=1}^{n}\ln\!\left(1+\frac{\beta|x_i|^{\beta}}{2\sigma^{\beta}}\right);$$
obviously, for fixed $\beta$ and $\sigma$, $\lim_{\alpha\to 0^{+}}L_{2}(\alpha)>0$ and $\lim_{\alpha\to+\infty}L_{2}(\alpha)<0$. Therefore, there must exist a root of $L_{2}(\alpha)=0$ in $(0,+\infty)$.
Next, we prove the convergence of the iterative algorithm. Let $(\sigma^{*},\alpha^{*})$ denote the solution of the equations and $(\sigma^{(p)},\alpha^{(p)})$ the values at the $p$-th iteration. We have
Case I: $\alpha^{(0)}<\alpha^{(1)}$.
The proof proceeds by mathematical induction; for $p=1$, the conclusion clearly holds.
Assume that for $p=i$ we have $\alpha^{(i)}<\alpha^{(i+1)}$ and $\sigma^{(i)}<\sigma^{(i+1)}$; then $\sigma^{(i+2)}$ and $\sigma^{(i+1)}$ are obtained from the following two equations, respectively:
$$n-(\alpha^{(i+1)}\beta+1)\sum_{i=1}^{n}\frac{\beta|x_i|^{\beta}}{2\left(\sigma^{(i+2)}\right)^{\beta}+\beta|x_i|^{\beta}}=0,$$
$$n-(\alpha^{(i)}\beta+1)\sum_{i=1}^{n}\frac{\beta|x_i|^{\beta}}{2\left(\sigma^{(i+1)}\right)^{\beta}+\beta|x_i|^{\beta}}=0.$$
Obviously, we can conclude that
$$(\alpha^{(i)}\beta+1)\sum_{i=1}^{n}\frac{\beta|x_i|^{\beta}}{2\left(\sigma^{(i+1)}\right)^{\beta}+\beta|x_i|^{\beta}}-(\alpha^{(i)}\beta+1)\sum_{i=1}^{n}\frac{\beta|x_i|^{\beta}}{2\left(\sigma^{(i+2)}\right)^{\beta}+\beta|x_i|^{\beta}}>0,$$
and this is equivalent to σ ( i + 1 ) < σ ( i + 2 ) .
α ( i + 1 ) and α ( i + 2 ) are obtained from the following two equations, respectively
$$n\left[\varphi\!\left(\alpha^{(i+1)}+\frac{1}{\beta}\right)-\varphi\!\left(\alpha^{(i+1)}\right)\right]-\sum_{i=1}^{n}\ln\!\left(1+\frac{\beta|x_i|^{\beta}}{2\left(\sigma^{(i+1)}\right)^{\beta}}\right)=0,$$
$$n\left[\varphi\!\left(\alpha^{(i+2)}+\frac{1}{\beta}\right)-\varphi\!\left(\alpha^{(i+2)}\right)\right]-\sum_{i=1}^{n}\ln\!\left(1+\frac{\beta|x_i|^{\beta}}{2\left(\sigma^{(i+2)}\right)^{\beta}}\right)=0,$$
and this is equivalent to α ( i + 1 ) < α ( i + 2 ) .
Since the two sequences $\{\sigma^{(i)}\}$, $\{\alpha^{(i)}\}$ satisfy $0<\sigma^{(i)}<+\infty$ and $0<\alpha^{(i)}<+\infty$, they are monotone and bounded, and the conclusion holds.
Case II: $\alpha^{(0)}>\alpha^{(1)}$.
The proof in this case is similar to the above case.   □
This conclusion can be summarized as follows:
1. if $\alpha^{(0)}<\alpha^{*}$, the sequences $\{\alpha^{(i)}\}$ and $\{\sigma^{(i)}\}$ are both monotonically increasing and converge to the solution from the left;
2. if $\alpha^{(0)}>\alpha^{*}$, the sequences $\{\alpha^{(i)}\}$ and $\{\sigma^{(i)}\}$ are both monotonically decreasing and converge to the solution from the right.
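The alternating scheme of Lemma 2 can be sketched as follows. The $\sigma$- and $\alpha$-updates are solved by bisection, the digamma function $\varphi$ is approximated by a central difference of `math.lgamma`, and the toy data, the fixed $\beta$, and the bracketing intervals are illustrative assumptions rather than choices made in the paper.

```python
import math

def digamma(x: float, h: float = 1e-6) -> float:
    # central-difference approximation of the digamma function via lgamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def bisect(f, lo: float, hi: float, tol: float = 1e-12) -> float:
    # simple bisection; assumes f changes sign on [lo, hi]
    flo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        fm = f(mid)
        if fm * flo > 0:
            lo, flo = mid, fm
        else:
            hi = mid
        if hi - lo < tol * max(1.0, abs(mid)):
            break
    return 0.5 * (lo + hi)

def iterate(x, beta: float, alpha0: float, n_iter: int = 300):
    """Alternate between solving L1(sigma) = 0 for the current alpha and
    L2(alpha) = 0 for the new sigma (Lemma 2), with beta held fixed."""
    n, alpha = len(x), alpha0
    for _ in range(n_iter):
        def L1(s):  # increasing in s: -n*alpha*beta at 0+, n at +infinity
            return n - (alpha * beta + 1) * sum(
                beta * abs(xi) ** beta / (2 * s ** beta + beta * abs(xi) ** beta)
                for xi in x)
        sigma = bisect(L1, 1e-8, 1e8)
        def L2(a):  # decreasing in a
            return n * (digamma(a + 1 / beta) - digamma(a)) - sum(
                math.log(1 + beta * abs(xi) ** beta / (2 * sigma ** beta))
                for xi in x)
        alpha = bisect(L2, 1e-2, 1e4)
    return sigma, alpha

# heavy-tailed toy data, so that a finite alpha* exists
x = [0.1, -0.2, 0.3, -0.1, 0.05, 4.0, -3.5, 0.15, -0.25, 0.2]
s1, a1 = iterate(x, beta=2.0, alpha0=0.5)    # two different starting points
s2, a2 = iterate(x, beta=2.0, alpha0=50.0)
assert abs(s1 - s2) < 1e-2 and abs(a1 - a2) < 1e-2
print(f"sigma* = {s1:.4f}, alpha* = {a1:.4f}")
```

Running the scheme from starting points on both sides of the solution illustrates the two monotone cases above: both runs settle on the same fixed point $(\sigma^{*},\alpha^{*})$.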

Appendix A.4. Part IV

Table A1. Estimation value under σ = 1, α = 3, β = 2.

Method        Sample Size   $\hat{\sigma}$   $\hat{\alpha}$   $\hat{\beta}$
MMOM          n = 25        1.584            7.83             2.314
              n = 50        1.528            6.615            2.147
              n = 100       1.463            5.331            2.105
              n = 200       1.265            4.584            2.071
MLE via EM    n = 25        1.18             5.174            3.228
              n = 50        1.132            4.421            2.482
              n = 100       1.104            3.814            2.364
              n = 200       1.036            3.341            2.201
IPWM          n = 25        1.109            3.659            3.766
              n = 50        1.062            3.49             2.882
              n = 100       1.033            3.3885           2.528
              n = 200       1.022            3.308            2.35
Table A2. Estimation value under σ = 1, α = 3, β = 3.

Method        Sample Size   $\hat{\sigma}$   $\hat{\alpha}$   $\hat{\beta}$
MMOM          n = 25        1.373            7.830            5.339
              n = 50        1.314            6.615            3.659
              n = 100       1.263            5.331            3.176
              n = 200       1.217            4.584            2.98
MLE via EM    n = 25        1.046            5.073            5.207
              n = 50        1.032            4.119            4.524
              n = 100       1.021            3.531            3.884
              n = 200       0.997            3.181            3.481
IPWM          n = 25        1.059            4.087            2.569
              n = 50        1.044            3.408            2.691
              n = 100       1.037            3.182            2.737
              n = 200       1.025            3.05             2.825
Table A3. Estimation value under σ = 1, α = 3, β = 4.

Method        Sample Size   $\hat{\sigma}$   $\hat{\alpha}$   $\hat{\beta}$
MMOM          n = 25        1.298            5.403            6.281
              n = 50        1.267            4.933            5.124
              n = 100       1.231            4.508            4.619
              n = 200       1.197            3.718            4.358
MLE via EM    n = 25        1.067            5.214            7.179
              n = 50        1.022            4.155            5.794
              n = 100       1.002            3.513            5.039
              n = 200       0.999            3.115            4.402
IPWM          n = 25        0.887            4.056            3.672
              n = 50        0.919            3.224            3.496
              n = 100       0.932            3.075            3.318
              n = 200       0.954            2.976            3.132
Table A4. Estimation value for PLA using the EM algorithm under different α (fixed μ = 1, σ = 1, β = 2).

Value of α   Sample Size   $\hat{\mu}$   $\hat{\sigma}$   $\hat{\alpha}$   $\hat{\beta}$
α = 2        n = 25        1.0082        1.4411           4.3878           4.8281
             n = 50        0.9986        1.3755           3.7841           3.3574
             n = 100       0.9997        1.2325           3.0712           2.6304
             n = 200       0.9999        1.1233           2.5676           2.2731
α = 2.5      n = 25        1.0036        1.3598           4.9423           4.6686
             n = 50        1.0029        1.2715           4.0851           3.4920
             n = 100       0.9994        1.1692           3.4725           2.6501
             n = 200       1.0006        1.0876           2.962            2.3084
α = 3        n = 25        0.9741        1.2494           5.0373           4.6552
             n = 50        0.9896        1.1827           4.3119           3.4574
             n = 100       1.0005        1.1058           3.7419           2.6380
             n = 200       0.954         1.0006           3.3182           2.2532

References

1. McDonald, J.B.; Newey, W.K. Partially Adaptive Estimation of Regression Models via the Generalized t Distribution. Econom. Theory 1988, 4, 428–457.
2. Nadarajah, H.N. On the Generalized t (Gt) Distribution. Statistics 2008, 42, 467–473.
3. Galbraith, J.W.; Zhu, D. A Generalized Asymmetric Student-t Distribution with Application to Financial Econometrics. J. Econom. 2010, 157, 297–305.
4. Harvey, A.; Lange, R.J. Volatility Modeling with a Generalized t Distribution. J. Time Ser. Anal. 2017, 38, 175–190.
5. Peel, D.; McLachlan, G.J. Robust Mixture Modelling Using the t Distribution. Stat. Comput. 2000, 10, 339–348.
6. Arslan, O.; Genç, A.İ. Robust Location and Scale Estimation Based on the Univariate Generalized t (GT) Distribution. Commun. Stat. Theory Methods 2003, 32, 1505–1525.
7. Paula, G.A.; Leiva, V.; Barros, M.; Liu, S. Robust Statistical Modeling Using the Birnbaum–Saunders t Distribution Applied to Insurance. Appl. Stoch. Model. Bus. Ind. 2012, 150, 169–185.
8. Çankaya, M.N.; Arslan, O. On the Robustness Properties for Maximum Likelihood Estimators of Parameters in Exponential Power and Generalized t Distributions. Commun. Stat. Theory Methods 2018, 49, 607–630.
9. Arslan, O.; Genç, A.İ. The Skew Generalized t Distribution as the Scale Mixture of a Skew Exponential Power Distribution and Its Applications in Robust Estimation. Statistics 2009, 43, 481–498.
10. Venegas, O.; Rodríguez, F.; Gómez, H.W.; Olivares-Pacheco, J.F.; Bolfarine, H. Robust Modeling Using the Generalized Epsilon Skew t Distribution. J. Appl. Stat. 2012, 39, 2685–2698.
11. Papastathopoulos, I.; Tawn, J.A. A Generalised Student's t-Distribution. Stat. Probab. Lett. 2013, 83, 70–77.
12. Acitas, S.; Senoglu, B.; Arslan, O. Alpha-Skew Generalized t Distribution. Rev. Colomb. Estad. 2015, 38, 371–384.
13. Lak, F.; Alizadeh, M.; Monfared, M.E.D.; Esmaeili, H. The Alpha-Beta Skew Generalized t Distribution: Properties and Applications. Pak. J. Stat. Oper. Res. 2019, 46, 605–616.
14. Genç, A.İ. The Generalized T Birnbaum–Saunders Family. Statistics 2013, 47, 613–625.
15. Balakrishnan, N.; Kundu, D. Birnbaum–Saunders Distribution: A Review of Models, Analysis, and Applications. Appl. Stoch. Model. Bus. Ind. 2019, 35, 4–49.
16. Govindarajulu, Z. Relationships Among Moments of Order Statistics in Samples from Two Related Populations. Technometrics 1963, 5, 514–518.
17. David, H.A.; Nagaraja, H.N. Order Statistics, 3rd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2003.
18. Exton, H. Handbook of Hypergeometric Integrals: Theory, Applications, Tables, Computer Programs; Ellis Horwood: Chichester, UK, 1978.
19. Arnold, B.C.; Balakrishnan, N.; Nagaraja, H.N. A First Course in Order Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2008.
20. Balakrishnan, N.; Alam, F.M.A. Maximum Likelihood Estimation of the Parameters of Student's t Birnbaum–Saunders Distribution: A Comparative Study. Commun. Stat. Simul. Comput. 2019, 2, 1–30.
21. Poursadeghfard, T.; Jamalizadeh, A.; Nematollahi, A. On the Extended Birnbaum–Saunders Distribution Based on the Skew-t-Normal Distribution. Iran. J. Sci. Technol. Trans. Sci. 2019, 43, 1689–1703.
22. Dempster, A.P. Maximum Likelihood from Incomplete Data via the EM Algorithm. J. R. Stat. Soc. 1977, 39, 1–38.
23. Louis, T.A. Finding the Observed Information Matrix When Using the EM Algorithm. J. R. Stat. Soc. Ser. B Stat. Methodol. 1982, 44, 226–233.
24. Greenwood, J.A.; Landwehr, J.M.; Matalas, N.C.; Wallis, J.R. Probability Weighted Moments: Definition and Relation to Parameters of Several Distributions Expressable in Inverse Form. Water Resour. Res. 1979, 15, 1049–1054.
25. Wang, Q.J. Estimation of the GEV Distribution from Censored Samples by Method of Partial Probability Weighted Moments. J. Hydrol. 1990, 120, 103–114.
26. Shen, Z.; Chen, Y.; Shi, R. Modeling Tail Index with Autoregressive Conditional Pareto Model. J. Bus. Econ. Stat. 2020, 1–9.
27. Barakat, H.M.; Abdelkader, Y.H. Computing the Moments of Order Statistics from Nonidentical Random Variables. Stat. Methods Appl. 2004, 13, 15–26.
Figure 1. The pdf of the new Gt distribution for different values of β with fixed α = 3 .
Figure 2. The pdf of the new Gt distribution for different values of α with fixed β = 3 .
Figure 3. 3D plot for K 2 ( 1 , p , q ) . (a) 1 p 10 , 1 q 50 . (b) 1 p 10 , 0.1 q 1 .
Figure 4. Comparison with the Gt distribution defined by McDonald and Newey.
Figure 5. Curve of values of the log-likelihood function under sample size n = 50 .
Figure 6. RB and RMSE of ( 0 , σ , α , β ) ( σ = 1 , α = 3 , β = 2 ) . (a) RB of scale parameter σ . (b) RMSE of scale parameter σ . (c) RB of shape parameter α . (d) RMSE of shape parameter α . (e) RB of shape parameter β . (f) RMSE of shape parameter β .
Figure 7. RB and RMSE of ( 0 , σ , α , β ) ( σ = 1 , α = 3 , β = 3 ) . (a) RB of scale parameter σ . (b) RMSE of scale parameter σ . (c) RB of shape parameter α . (d) RMSE of shape parameter α . (e) RB of shape parameter β . (f) RMSE of shape parameter β .
Figure 8. RB and RMSE of ( 0 , σ , α , β ) ( σ = 1 , α = 3 , β = 4 ) . (a) RB of scale parameter σ . (b) RMSE of scale parameter σ . (c) RB of shape parameter α . (d) RMSE of shape parameter α . (e) RB of shape parameter β . (f) RMSE of shape parameter β .
Figure 9. RB and RMSE of ( μ , σ , α , β ) under different shape parameter α based on the PLA using the EM algorithm. (a) RB of location parameter μ . (b) RMSE of location parameter μ .(c) RB of scale parameter σ . (d) RMSE of scale parameter σ . (e) RB of shape parameter α . (f) RMSE of shape parameter α . (g) RB of shape parameter β . (h) RMSE of shape parameter β .
Figure 10. Trace of the PLA based on the new Gt distribution.
Figure 11. Histogram of the stock data and the estimated pdf.
Table 1. RB, RMSE and estimated values (in parentheses) of (σ, α, β).

RB
Sample Size   MLE using the EM algorithm                          MLE via a new iterative algorithm
              σ                α                β                 σ                α                β
n = 25        0.8788 (0.879)   1.3800 (4.14)    1.2524 (3.757)    0.7235 (0.724)   0.7337 (2.201)   1.1098 (3.323)
n = 50        0.9371 (0.937)   1.1867 (3.56)    1.2078 (3.624)    0.8101 (0.81)    0.8392 (2.518)   1.0586 (3.176)
n = 100       0.9482 (0.948)   1.1041 (3.312)   1.1861 (3.558)    0.9144 (0.914)   0.9572 (2.872)   0.9989 (2.997)
n = 200       0.9678 (0.968)   0.9555 (2.867)   1.1188 (3.356)    0.9373 (0.937)   1.0052 (3.016)   0.9837 (2.951)

RMSE
Sample Size   σ        α        β         σ        α        β
n = 25        0.2348   1.2119   0.6866    0.2199   0.9791   0.1478
n = 50        0.1632   0.9105   0.8034    0.1783   0.8141   0.09916
n = 100       0.0874   0.7222   0.7311    0.1269   0.6223   0.0509
n = 200       0.0401   0.2789   0.2175    0.0753   0.3971   0.0211
Table 2. Descriptive statistics for stock data.

n      $\bar{Y}$              S         $b_1$     $b_2$     min(Y)    max(Y)
4278   −2.295 × 10⁻³          0.01143   0.09466   5.28822   −0.1039   0.08113
Table 3. Estimation value and the fitting index.

Parameter Estimates    New Gt Distribution    Gt Distribution
σ                      0.02143                0.01025
α                      5.46063                —
β                      1.22367                —
p                      —                      1.92009
q                      —                      1.72641
log-likelihood value   13,563.13              13,543.77
AIC                    −27,118.26             −27,079.54

Guan, R.; Zhao, X.; Cheng, W.; Rong, Y. A New Generalized t Distribution Based on a Distribution Construction Method. Mathematics 2021, 9, 2413. https://0-doi-org.brum.beds.ac.uk/10.3390/math9192413