Article

Jaynes-Gibbs Entropic Convex Duals and Orthogonal Polynomials

Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke 1, 3001, 12 ème Avenue Nord, Sherbrooke, QC J1H 5N4, Canada
Submission received: 8 March 2022 / Revised: 8 May 2022 / Accepted: 10 May 2022 / Published: 16 May 2022

Abstract

The univariate noncentral distributions can be derived by multiplying their central distributions with translation factors. When constructed in terms of translated uniform distributions on unit radius hyperspheres, these translation factors become generating functions for classical families of orthogonal polynomials. The ultraspherical noncentral t, normal N, F, and $\chi^2$ distributions are thus found to be associated with the Gegenbauer, Hermite, Jacobi, and Laguerre polynomial families, respectively, with the corresponding central distributions standing for the polynomial family-defining weights. Obtained through an unconstrained minimization of the Gibbs potential, Jaynes’ maximal entropy priors are formally expressed in terms of the empirical densities’ entropic convex duals. Expanding these duals on orthogonal polynomial bases allows for the expedient determination of the Jaynes–Gibbs priors. Invoking the moment problem and the duality principle, modeling can be reduced to the direct determination of the prior moments in parametric space in terms of the Bayes factor’s orthogonal polynomial expansion coefficients in random variable space. Genomics and geophysics examples are provided.

1. Introduction

We shall argue that the four univariate noncentral t, F, normal N, and $\chi^2$ distributions $\rho(r|r_o)$, with $r$ being the one-dimensional random space variable and $r_o$ the one-dimensional noncentrality parameter of the respective distributions, can all be constructed in a modular fashion by multiplying their central counterparts $\rho(r|r_o=0)$ with a factor $T(r|r_o)$ effecting a translation of the central distribution, that is,
$$\rho(r|r_o) = T(r|r_o)\,\rho(r|0).$$
In statistical parlance, the noncentral distributions are needed to estimate or model effect sizes [1]. With the exception of the normal distribution, these translations are non-shape-preserving. The derivation of the translation factors $T(r|r_o)$ can be carried out in two ways, depending on whether primacy is given to translated uniform density distributions on unit radius hyperspheres, as is done in this manuscript, or to translated normal distributions, as is done classically. We shall review both derivations herein, with emphasis on the hyperspherical distributions.
We choose to place primacy upon the simple uniform density distribution on the unit radius hypersphere $S^{\nu}$, where $\nu$ stands for either the dimension of the hypersphere surface $S^{\nu} \subset \mathbb{R}^{\nu+1}$ or, equivalently, the degrees of freedom (dof) of the specified distribution. It is known that the projection of a unit radius uniform density hyperspherical distribution on $S^{\nu}$ onto any given polar axis readily provides the central t distribution [2], and that such a projection converges to the central normal distribution $N(0,1)$ with null mean when $\nu$ tends to infinity [3] (p. 59). This observation provides the building principle used throughout this manuscript: use the uniform density distribution on the unit radius hypersphere $S^{\nu} \subset \mathbb{R}^{\nu+1}$ to derive modular expressions for the central and noncentral t, F, N, and $\chi^2$ distributions. In order to proceed, one needs only a few simple notions of hypersphere geometry. The projection of a random unit vector $\mathbf{x}$ on the unit radius hypersphere $S^{\nu_2}$ onto any given unit polar axis $\mathbf{p}$ naturally defines a polar angle $\theta$ through the scalar product $\cos\theta = \mathbf{x}\cdot\mathbf{p}$. As such, the central t distribution with $\nu_2$ degrees of freedom can be drawn on the compact support $-1 < \cos\theta < 1$, on which it acquires a simple expression in trigonometric terms: it is simply proportional to $\sin^{\nu_2-1}\theta$ after integration of the azimuthal coordinates. See Saville and Wood [2] for a very extensive discussion of the subject. On the familiar sphere $S^{2} \subset \mathbb{R}^{3}$, the latter provides the well-known spherical surface element $\sin\theta\,d\theta$ after the integration of the azimuthal coordinate. Similarly, the central $F(\nu_1,\nu_2)$ distribution becomes proportional to $\cos^{\nu_1-1}\theta\,\sin^{\nu_2-1}\theta$ on the compact domain $0 \le \cos\theta < 1$, where $\theta$ is the angle between a random vector $\mathbf{x}$ on the hypersphere $S^{\nu_1+\nu_2}$ and a secant hyperplane defining the subspace $S^{\nu_1}$.
In statistical parlance, the $S^{\nu_1}$ and $S^{\nu_2}$ subspaces refer to the between-class and within-class variance spaces in an analysis of variance (ANOVA), as reflected by the redefinition of the F-statistic in trigonometric terms found in Section 2. As it turns out, these two central uniform hyperspherical distributions are all that is needed to derive all the univariate noncentral t, F, N, and $\chi^2$ distributions where, for uniformity of designation, a normal distribution $N(\delta,1)$ with non-vanishing mean $\delta$ and unit variance will simply be referred to as the noncentral normal distribution. Finally, in the theory of orthogonal polynomials, the designation ultraspherical polynomials (also known as Gegenbauer polynomials [4]) has prevailed over that of hyperspherical polynomials, and we shall abide by this nomenclature. In order to distinguish the ultraspherical t and F distributions from the ones derived classically from the normal N and $\chi^2$ distributions, we shall designate the former densities by the Greek letter $\upsilon$ (upsilon), as in υπερσφαίρα (ypersfaíra, or hypersphere in English), and the latter densities by the Greek letter $\rho$.
In Bayesian inference, the determination of the Bayes prior is referred to as an inverse problem, and Jaynes’ data-constrained maximal entropy priors provide a principled solution to this inverse problem [5,6,7]. The maximal entropy is reached by minimizing the Gibbs potential, and the solution to this optimization problem requires the determination of the entropic convex dual of the empirical density. See Le Blanc [8] for an extensive review on the subject, with references. As derived from translated normal distributions, the classical noncentral distributions all have a submodular decomposition for their translation factor of the form
$$T(r|r_o) = E(r|r_o)\,e^{-r_o^2/2},$$
with $E(r|r_o)$ being a generalized hypergeometric function, a decomposition which does not readily allow terms of similar order in the noncentrality parameter $r_o$ to be regrouped. In contrast, the translation factors for the noncentral t, normal N, F, and $\chi^2$ distributions, as derived from translated uniform distributions on unit radius hyperspheres, are found to be generating functions for the Gegenbauer, Hermite, Jacobi, and Laguerre orthogonal polynomial families $\{P_n\}_{n=0}^{\infty}$, respectively, and intrinsic properties of the orthogonal polynomials [9] allow for regrouping all terms of similar order in the noncentrality parameter $r_o$, that is,
$$T(r|r_o) = \sum_{n=0}^{\infty} c_n\,P_n(r)\,r_o^{\,n},$$
with the constants $c_n$ provided hereafter. To the best of our knowledge, the derivation of these translation factors and their identification as generating functions of orthogonal polynomial families has not been carried out before. As a consequence, one can expand the entropic convex duals on a generally small number of low-order orthogonal polynomials, an approach which greatly curtails the computational cost of determining the duals and, thus, the Jaynes–Gibbs priors. To the best of our knowledge, the expansion and discretization of the convex duals over orthogonal polynomial bases has not been proposed before. We adopt in this manuscript the convention that the polynomial family-defining weight functions should be provided by the corresponding normalized central distributions, a convention which results in much simplified expressions for the norm of the orthogonal polynomials.
In parametric Bayesian modeling, the translation factor $T(r|r_o)$ can be weighted by a Bayesian prior $\pi(r_o)$ to obtain the Bayes factor
$$\mathrm{BF}(r) = \int T(r|r_o)\,\pi(r_o)\,dr_o$$
for the generic superposition density
$$\rho(r) = \int \rho(r|r_o)\,\pi(r_o)\,dr_o = \int T(r|r_o)\,\rho(r|0)\,\pi(r_o)\,dr_o = \mathrm{BF}(r)\,\rho(r|0).$$
We will, in this manuscript, identify the normalized central distribution $\rho(r|0)$ with the orthogonal polynomial family-defining weight $w(r) \equiv \rho(r|0)$ [9], so that any generic density $\rho(r)$ can be rewritten as
$$\rho(r) = \mathrm{BF}(r)\,w(r),$$
that is, the Bayes factor $\mathrm{BF}(r)$ can stand as a substitute for the generic density $\rho(r)$. With $p(r) = \int^{r} \rho(r'|0)\,dr'$ standing for the cumulative density function of the respective central distribution $\rho(r|0)$ or, equivalently, the p-value of the null hypothesis statistical testing (NHST) procedure, we have that $\rho(p) = \mathrm{BF}(p)$. That is, the Bayes factor $\mathrm{BF}(p)$ stands for the generally nonuniform p-value distribution $\rho(p)$ of the above generic superposition density (see Le Blanc [10] for details of a proof), to be contrasted with the NHST framework, which only considers the central distribution with its uninformative uniform p-value distribution. Now, if one’s goal is only to model the density $\mathrm{BF}(p)$ or to compute the associated local false discovery rate $\mathrm{fdr}(p) = 1/(1+\mathrm{BF}(p))$ [11], one then only needs the prior moments to carry on with the modeling. Invoking the moment problem [12] and the duality principle [13], prior moments in parametric space will be shown to be readily provided by the orthogonal polynomial expansion coefficients of the Bayes factor $\mathrm{BF}(r)$ in random variable space, that is,
$$\int_{R_o} \pi(r_o)\,r_o^{\,n}\,dr_o = \frac{1}{c_n\,\|P_n\|^2}\int_{R} P_n(r)\,\mathrm{BF}(r)\,w(r)\,dr.$$
The paper is organized as follows. In Section 2, we review, for the sake of completeness, the derivation of the classical univariate noncentral distributions as derived from translated normal distributions. In Section 3, the univariate ultraspherical noncentral t, normal N, F, and $\chi^2$ distributions are derived from translated uniform density distributions on unit radius hyperspheres, and are shown to be expressible as products of their central distribution times specific generating functions for the Gegenbauer, Hermite, Jacobi, and Laguerre orthogonal polynomial families, respectively. We argue, in Section 4, that the determination of the Gibbs priors in terms of empirical densities’ entropic convex duals is much simplified when these duals are expanded on a small number of low-order orthogonal polynomials. We also discuss how prior moments in parametric space are directly provided by the Bayes factor orthogonal polynomial expansion coefficients in random variable space. Sections 5 and 6 are devoted to applications in genomics and geophysics, respectively.

2. The Classical Noncentral Distributions

The four central F, $\chi^2$, t, and normal N distributions can be given by
$$\rho_{F(\nu_1,\nu_2)}(\theta|0) = \frac{2\,\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)}\cos^{\nu_1-1}\theta\,\sin^{\nu_2-1}\theta, \qquad 0 \le \cos\theta < 1, \tag{1a}$$
$$\rho_{\chi^2(\nu_1)}(r|0) = \frac{1}{2^{\nu_1/2}\,\Gamma\!\left(\frac{\nu_1}{2}\right)}\,r^{(\nu_1-2)/2}\,e^{-r/2}, \qquad 0 \le r < \infty, \tag{1b}$$
$$\rho_{t(\nu_2)}(\theta|0) = \frac{\Gamma\!\left(\frac{\nu_2+1}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)}\sin^{\nu_2-1}\theta, \qquad -1 < \cos\theta < 1, \tag{1c}$$
$$\rho_{N}(r|0) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}r^2}, \qquad -\infty < r < \infty, \tag{1d}$$
respectively, where we have redefined the conventional t and F statistics so that
$$t = \sqrt{\nu_2}\,\frac{\cos\theta}{\sin\theta}, \qquad -1 < \cos\theta < 1, \tag{2}$$
and
$$F = \frac{\nu_2}{\nu_1}\,\frac{\cos^2\theta}{\sin^2\theta}, \qquad 0 \le \cos\theta < 1, \tag{3}$$
so as to rescale the corresponding t and F distributions onto the specified finite compact domains. As discussed in the Introduction, the above central t and F distributions are projections of uniform distributions on the unit radius $S^{\nu_2}$ and $S^{\nu_1+\nu_2}$ hyperspheres, respectively. The reasons for such rescaling will become apparent when we link, in Section 3, the ultraspherical noncentral t and F distributions with the Gegenbauer and Jacobi orthogonal polynomial families, respectively, families which both have similar finite compact support domains. It is straightforward to verify that the above central t and F distributions converge to the central normal N and $\chi^2$ distributions, respectively, in the limit $\nu_2 \to \infty$: see the discussions leading to Equations (19) and (36) below. In geometric terms, the limit $\nu_2 \to \infty$ allows one to restrict the study of the central t and F distributions to a narrow domain around the angle $\theta = \pi/2$, where these distributions concentrate their weight, in concordance with the notion that a high-dimensional hypersphere concentrates its surface on a narrow band around its equator [3]. In this limit, with the respective variable substitutions $r = \sqrt{\nu_2-1}\,\cos\theta$ and $r = (\nu_2-2)\cos^2\theta$, the definition domains of the central normal N and $\chi^2$ distributions consequently extend to infinity as $\cos\theta \to 1$, as reflected by the respective distribution domains provided above.
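The change of variable between the conventional t statistic and the polar angle can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names are ours, the Student t density is the usual textbook formula, and the Jacobian $|dt/d\theta| = \sqrt{\nu_2}/\sin^2\theta$ follows from the redefinition $t = \sqrt{\nu_2}\cos\theta/\sin\theta$. It verifies that the trigonometric form of the central t density agrees with the textbook density pushed onto $\theta$, and that it integrates to one over $0 < \theta < \pi$.

```python
import math

def central_t_theta_density(theta, nu2):
    """Central t density on the polar angle theta (trigonometric form, Equation (1c))."""
    norm = math.gamma((nu2 + 1) / 2) / (math.gamma(0.5) * math.gamma(nu2 / 2))
    return norm * math.sin(theta) ** (nu2 - 1)

def t_from_theta(theta, nu2):
    """Conventional t statistic as a function of the polar angle."""
    return math.sqrt(nu2) * math.cos(theta) / math.sin(theta)

def student_t_pdf(t, nu):
    """Textbook Student t density on the real line (assumed standard formula)."""
    norm = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return norm * (1 + t * t / nu) ** (-(nu + 1) / 2)

def theta_density_via_t(theta, nu2):
    """Student t density mapped to theta with the Jacobian sqrt(nu2) / sin^2(theta)."""
    jacobian = math.sqrt(nu2) / math.sin(theta) ** 2
    return student_t_pdf(t_from_theta(theta, nu2), nu2) * jacobian

def integrate(f, lo, hi, n=20000):
    """Plain midpoint rule, sufficient for these smooth integrands."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

if __name__ == "__main__":
    nu2 = 5
    print(central_t_theta_density(1.0, nu2), theta_density_via_t(1.0, nu2))
    print(integrate(lambda th: central_t_theta_density(th, nu2), 0.0, math.pi))
```

The same pattern applies to the F statistic, with $\cos^2\theta$ playing the role of the between-class variance fraction.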
The noncentral t distribution has been classically derived by considering the ratio of a random variable distributed according to a noncentral normal distribution $N(\delta,1)$ with non-vanishing mean $\delta$ over a random variable distributed according to a central $\chi_{\nu_2}$ distribution with $\nu_2$ degrees of freedom. Similarly, the noncentral F distribution is classically derived by considering the ratio of a noncentral $\chi^2$ random variable with noncentrality parameter $\Lambda$ and $\nu_1$ degrees of freedom over a central $\chi^2$ random variable with $\nu_2$ degrees of freedom. See, for example, Walck et al. [14] for explicit steps for such derivations. As a result, as discussed in the Introduction, the classical distributions have a submodular decomposition for their translation factor of the form $T(r|r_o) = E(r|r_o)\,e^{-r_o^2/2}$, with $E(r|r_o)$ being a generalized hypergeometric function [15]. Upon recalling that the noncentrality parameter $\Lambda$ of the F and $\chi^2$ distributions stands for the square of the noncentrality parameter $\delta$ of the noncentral t and normal distributions, we have that
$$\rho_{F(\nu_1,\nu_2)}(\theta|\Lambda) = E_{F(\nu_1,\nu_2)}(\theta|\Lambda)\,e^{-\Lambda/2}\,\rho_{F(\nu_1,\nu_2)}(\theta|0), \qquad 0 \le \cos\theta < 1,\ 0 \le \Lambda < \infty, \tag{4a}$$
$$\rho_{\chi^2(\nu_1)}(r|\Lambda) = E_{\chi^2(\nu_1)}(r|\Lambda)\,e^{-\Lambda/2}\,\rho_{\chi^2(\nu_1)}(r|0), \qquad 0 \le r < \infty,\ 0 \le \Lambda < \infty, \tag{4b}$$
$$\rho_{t(\nu_2)}(\theta|\delta) = E_{t(\nu_2)}(\theta|\delta)\,e^{-\delta^2/2}\,\rho_{t(\nu_2)}(\theta|0), \qquad -1 < \cos\theta < 1,\ -\infty < \delta < \infty, \tag{4c}$$
$$\rho_{N}(r|\delta) = E_{N}(r|\delta)\,e^{-\delta^2/2}\,\rho_{N}(r|0), \qquad -\infty < r < \infty,\ -\infty < \delta < \infty, \tag{4d}$$
with the Maclaurin expansions of the respective factors $E(r|r_o)$, computed following the steps elaborated by Walck et al. [14], given by
$$E_{F(\nu_1,\nu_2)}(\theta|\Lambda) = \frac{\Gamma(\nu_1/2)}{\Gamma(1/2)}\sum_{j\ \mathrm{even}\,\ge 0}^{\infty}\frac{\Gamma((j+1)/2)}{\Gamma((j+\nu_1)/2)}\,\frac{\Gamma((j+\nu_1+\nu_2)/2)}{\Gamma((\nu_1+\nu_2)/2)}\,\frac{(\sqrt{2\Lambda}\,\cos\theta)^j}{j!} \tag{5a}$$
$$= {}_1F_1\!\left(\frac{\nu_1+\nu_2}{2};\frac{\nu_1}{2};\frac{\Lambda\cos^2\theta}{2}\right), \tag{5b}$$
$$E_{\chi^2(\nu_1)}(r|\Lambda) = \frac{\Gamma(\nu_1/2)}{\Gamma(1/2)}\sum_{j\ \mathrm{even}\,\ge 0}^{\infty}\frac{\Gamma((j+1)/2)}{\Gamma((j+\nu_1)/2)}\,\frac{(\Lambda r)^{j/2}}{j!} = I_{(\nu_1-2)/2}\!\left(\sqrt{\Lambda r}\right) \tag{5c}$$
$$= {}_0F_1\!\left(;\frac{\nu_1}{2};\frac{\Lambda r}{4}\right), \tag{5d}$$
$$E_{t(\nu_2)}(\theta|\delta) = \sum_{j=0}^{\infty}\frac{\Gamma((j+1+\nu_2)/2)}{\Gamma((1+\nu_2)/2)}\,\frac{(\sqrt{2}\,\delta\cos\theta)^j}{j!} = {}_1F_1\!\left(\frac{\nu_2+1}{2};\frac{1}{2};\frac{\delta^2\cos^2\theta}{2}\right) + \sqrt{2}\,\delta\cos\theta\,\frac{\Gamma\!\left(\frac{\nu_2+2}{2}\right)}{\Gamma\!\left(\frac{\nu_2+1}{2}\right)}\,{}_1F_1\!\left(\frac{\nu_2+2}{2};\frac{3}{2};\frac{\delta^2\cos^2\theta}{2}\right), \tag{5e}$$
$$E_{N}(r|\delta) = \sum_{j=0}^{\infty}\frac{(\delta r)^j}{j!} = {}_0F_0(;;\delta r) = e^{\delta r} \tag{5f}$$
$$= {}_0F_1\!\left(;\frac{1}{2};\frac{\delta^2 r^2}{4}\right) + \delta r\,{}_0F_1\!\left(;\frac{3}{2};\frac{\delta^2 r^2}{4}\right) = \cosh\delta r + \sinh\delta r = e^{\delta r}, \tag{5g}$$
where
$${}_pF_q(a_1,\dots,a_p;b_1,\dots,b_q;x) \equiv \sum_{n=0}^{\infty}\frac{(a_1)_n\cdots(a_p)_n}{(b_1)_n\cdots(b_q)_n}\,\frac{x^n}{n!} \tag{6}$$
stands for the generalized hypergeometric function, and where $I_{(\nu_1-2)/2}(\sqrt{\Lambda r})$ stands for the normalized modified Bessel function [16]. The respective first expansions for the multiplicative factors $E$ are successively simpler versions of the Maclaurin expansion for the noncentral F distribution multiplicative factor $E_{F(\nu_1,\nu_2)}$, such that
$$E_{\chi^2(\nu_1)} \xleftarrow{\ \nu_2\to\infty\ } E_{F(\nu_1,\nu_2)} \xrightarrow{\ \nu_1=1\ } E_{t(\nu_2)} \xrightarrow{\ \nu_2\to\infty\ } E_{N}, \tag{7}$$
with the understanding that the respective Maclaurin summations involve either only the even integers for the F and $\chi^2$ cases, or both odd and even integers for the t and normal cases. It is straightforward to verify that the classical noncentral t and F distributions converge to the noncentral normal and $\chi^2$ distributions, respectively, in the limit $\nu_2 \to \infty$. The expressions for the classical noncentral distributions do not readily regroup terms of similar order in their noncentrality parameter, in contrast to the ultraspherical noncentral distributions, which do so, as is discussed next.
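The equivalence between the Maclaurin sums and their hypergeometric closed forms can be illustrated numerically. The sketch below is a minimal check, not the author's implementation: the helper names are ours, the even-$j$ sum for the $\chi^2$ factor is evaluated with log-gamma functions to avoid overflow, and ${}_0F_1$ is summed directly from its defining series. It compares the even-$j$ sum with ${}_0F_1(;\nu_1/2;\Lambda r/4)$ and the even-plus-odd split of the normal factor with $e^{\delta r}$.

```python
import math

def hyp0f1(b, x, terms=80):
    """Defining series of 0F1(; b; x): sum of x^n / ((b)_n n!)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / ((b + n) * (n + 1))
    return total

def e_chi2_even_series(r, lam, nu1, jmax=120):
    """Even-j Maclaurin sum for the classical chi-square factor (requires lam * r > 0)."""
    total = 0.0
    for j in range(0, jmax + 1, 2):
        log_term = (math.lgamma((j + 1) / 2) - math.lgamma((j + nu1) / 2)
                    + (j / 2) * math.log(lam * r) - math.lgamma(j + 1))
        total += math.exp(log_term)
    # Prefactor Gamma(nu1/2) / Gamma(1/2)
    return math.exp(math.lgamma(nu1 / 2) - math.lgamma(0.5)) * total

def e_normal_split(r, delta):
    """Even part plus odd part of the normal factor, i.e. cosh + sinh."""
    z = delta * r
    return hyp0f1(0.5, z * z / 4) + z * hyp0f1(1.5, z * z / 4)

if __name__ == "__main__":
    r, lam, nu1 = 3.0, 2.0, 4
    print(e_chi2_even_series(r, lam, nu1), hyp0f1(nu1 / 2, lam * r / 4))
    print(e_normal_split(0.7, 0.5), math.exp(0.7 * 0.5))
```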

3. The Ultraspherical Noncentral Distributions

Using geometrical arguments, the ultraspherical noncentral t-distribution for the t-statistic (2) on the hypersphere $S^{\nu_2}$ was shown by Le Blanc [10] to be given by
$$\upsilon_{t(\nu_2)}(\theta|\delta) = T_{t(\nu_2)}(\theta|\delta)\,\upsilon_{t(\nu_2)}(\theta|0), \tag{8}$$
where $\upsilon_{t(\nu_2)}(\theta|0)$ is identical to the central t-distribution $\rho_{t(\nu_2)}(\theta|0)$ given in Equation (1), where the multiplicative distribution translation factor $T_{t(\nu_2)}(\theta|\delta)$ is given by
$$T_{t(\nu_2)}(\theta|\delta) = \frac{1-\cos\theta\cos\theta_\delta}{\left[1-2\cos\theta\cos\theta_\delta+\cos^2\theta_\delta\right]^{(\nu_2+1)/2}}, \tag{9}$$
and where
$$\cos\theta_\delta = \frac{\delta/\sqrt{\nu_2}}{\sqrt{(\delta/\sqrt{\nu_2})^2+1}} = \frac{\delta}{\sqrt{\delta^2+\nu_2}}, \qquad 0 < \theta_\delta < \pi, \tag{10}$$
in terms of the noncentrality parameter $\delta$, $-\infty < \delta < \infty$. As it turns out, the translation term $T_{t(\nu_2)}(\theta|\delta)$ is a generating function for the ultraspherical or, equivalently, Gegenbauer polynomials: redefining the variables so that $x = \cos\theta$, $z = \cos\theta_\delta$, and $b = (\nu_2-1)/2$, we have that
$$T_{t(b)}(x|z) = \frac{1-xz}{\left[1-2xz+z^2\right]^{b+1}} = \sum_{n=0}^{\infty}\frac{2b+n}{2b}\,C_n^{(b)}(x)\,z^n = (1-xz)^{-(2b+1)}\,{}_2F_1\!\left(\frac{2b+1}{2},\frac{2b+2}{2};b+\frac{1}{2};\frac{-(1-x^2)z^2}{(1-xz)^2}\right), \tag{11}$$
where $C_n^{(b)}(x)$ are Gegenbauer polynomials with the explicit representation [4]
$$C_n^{(b)}(x) = \sum_{k=0}^{\lfloor n/2\rfloor}(-1)^k\,\frac{(b)_{n-k}}{k!\,(n-2k)!}\,(2x)^{n-2k}, \tag{12}$$
where the Pochhammer symbol $(x)_n$ is defined by the equality
$$(x)_n = \frac{\Gamma(x+n)}{\Gamma(x)} = x(x+1)(x+2)\cdots(x+n-1), \tag{13}$$
where $\lfloor y\rfloor$, the floor of $y$, is the integer satisfying $y-1 < \lfloor y\rfloor \le y$, and where the last equality in (11), given in terms of the hypergeometric function ${}_2F_1$, can be deduced from Rainville [17] (Equation (144.8)). The Gegenbauer polynomials are orthogonal:
$$\int_{x=-1}^{1} C_m^{(b)}(x)\,C_n^{(b)}(x)\,w_{t(b)}(x)\,dx = \delta_{m,n}\,\|C_n^{(b)}\|^2 \tag{14}$$
with respect to the weight function
$$w_{t(b)}(x) = \frac{\Gamma(b+1)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma\!\left(b+\frac{1}{2}\right)}\,(1-x^2)^{b-1/2}, \tag{15}$$
which is identical to the central t distribution (1), except for the change of variable $x = \cos\theta$. With this weight function normalization (which, note, differs from the usual weight function for the Gegenbauer polynomials [4]), the norm of the Gegenbauer polynomials simplifies to
$$\|C_n^{(b)}\|^2 = \frac{b}{b+n}\,\frac{(2b)_n}{n!}, \tag{16}$$
with, in particular, $\|C_0^{(b)}\|^2 = 1$. Note also that Equation (11) could be used to define a generalization $T_n^{(b)}(x)$ of the Chebyshev polynomials of the first kind, $T_n^{(b=0)}(x)$, with
$$\frac{1-xz}{\left[1-2xz+z^2\right]^{b+1}} = \sum_{n=0}^{\infty}T_n^{(b)}(x)\,z^n = \sum_{n=0}^{\infty}\frac{2b+n}{2b}\,C_n^{(b)}(x)\,z^n, \tag{17}$$
which encompasses the defining equation
$$\frac{1-xz}{1-2xz+z^2} = \sum_{n=0}^{\infty}T_n^{(0)}(x)\,z^n = \lim_{b\to 0}\sum_{n=0}^{\infty}\frac{2b+n}{2b}\,C_n^{(b)}(x)\,z^n \tag{18}$$
for the Chebyshev polynomials of the first kind. Setting $x = r/(2b)^{1/2}$ and $z = \delta/(2b)^{1/2}$, the central t distribution converges to the normal central distribution
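$$\upsilon_{N}(r|0) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}r^2} \tag{19}$$
in the limit $b \to \infty$. Applying the same limit to the Gegenbauer polynomials and their generating function (11), we have that
$$\lim_{b\to\infty}T_{t(b)}(x|z) = \lim_{b\to\infty}\frac{1-xz}{\left[1-2xz+z^2\right]^{b+1}} = \lim_{b\to\infty}\sum_{n=0}^{\infty}\frac{2b+n}{2b}\,C_n^{(b)}\!\left(\frac{r}{(2b)^{1/2}}\right)\left(\frac{\delta}{(2b)^{1/2}}\right)^{n} = \sum_{n=0}^{\infty}\mathrm{He}_n(r)\,\frac{\delta^n}{n!} = e^{r\delta-\delta^2/2} = T_{N}(r|\delta), \tag{20}$$
where we have used the limit result [4]
$$\lim_{b\to\infty}\frac{1}{(2b)^{n/2}}\,C_n^{(b)}\!\left(\frac{r}{(2b)^{1/2}}\right) = \frac{\mathrm{He}_n(r)}{n!}. \tag{21}$$
The multiplicative factor $T_{N}(r|\delta)$ imparts a non-vanishing mean $\delta$ to the zero mean normal distribution, since
$$T_{N}(r|\delta)\,\upsilon_{N}(r|0) = e^{\delta r-\delta^2/2}\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}r^2} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(r-\delta)^2} = \upsilon_{N}(r|\delta), \tag{22}$$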
where one recognizes the generating function $e^{\delta r-\delta^2/2}$ for the Hermite polynomials $\mathrm{He}_n(r)$, with the explicit representation [4]
$$\mathrm{He}_n(r) = n!\sum_{\ell=0}^{\lfloor n/2\rfloor}\frac{(-1)^{\ell}\,r^{\,n-2\ell}}{2^{\ell}\,\ell!\,(n-2\ell)!}. \tag{23}$$
The latter are orthogonal with respect to their defining weight function
$$w_{N}(r) = \upsilon_{N}(r|0) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}r^2}, \tag{24}$$
with norm
$$\int_{r=-\infty}^{\infty}\mathrm{He}_m(r)\,\mathrm{He}_n(r)\,w_{N}(r)\,dr = \delta_{m,n}\,n!, \tag{25}$$
and with $\|\mathrm{He}_0\|^2 = 1$ in particular.
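The two generating-function identities used so far (the Gegenbauer expansion of the t translation factor and the Hermite expansion of the normal translation factor) lend themselves to a direct numerical check. The sketch below is illustrative only: the function names are ours, the polynomials are evaluated from their explicit finite sums as quoted in the text, and the infinite series are simply truncated at a hypothetical order `nmax`, which is adequate for $|z| < 1$ and moderate $\delta$.

```python
import math

def pochhammer(x, n):
    """(x)_n = x (x+1) ... (x+n-1), with (x)_0 = 1."""
    out = 1.0
    for i in range(n):
        out *= x + i
    return out

def gegenbauer(n, b, x):
    """C_n^(b)(x) from its explicit finite-sum representation."""
    return sum((-1) ** k * pochhammer(b, n - k)
               / (math.factorial(k) * math.factorial(n - 2 * k))
               * (2 * x) ** (n - 2 * k)
               for k in range(n // 2 + 1))

def hermite_e(n, r):
    """He_n(r) from its explicit finite-sum representation."""
    return math.factorial(n) * sum(
        (-1) ** l * r ** (n - 2 * l)
        / (2 ** l * math.factorial(l) * math.factorial(n - 2 * l))
        for l in range(n // 2 + 1))

def t_factor_closed(b, x, z):
    """Closed form (1 - x z) / (1 - 2 x z + z^2)^(b + 1)."""
    return (1 - x * z) / (1 - 2 * x * z + z * z) ** (b + 1)

def t_factor_series(b, x, z, nmax=40):
    """Truncated Gegenbauer expansion with coefficients (2b + n) / (2b)."""
    return sum((2 * b + n) / (2 * b) * gegenbauer(n, b, x) * z ** n
               for n in range(nmax + 1))

def normal_factor_series(r, delta, nmax=40):
    """Truncated Hermite expansion sum_n He_n(r) delta^n / n!."""
    return sum(hermite_e(n, r) * delta ** n / math.factorial(n)
               for n in range(nmax + 1))

if __name__ == "__main__":
    print(t_factor_series(1.5, 0.4, 0.3), t_factor_closed(1.5, 0.4, 0.3))
    print(normal_factor_series(0.7, 0.5), math.exp(0.7 * 0.5 - 0.5 ** 2 / 2))
```

For values of $z$ close to $\pm 1$ the truncated Gegenbauer series converges slowly, so a larger `nmax` (or the closed form itself) would be the safer choice there.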
Using geometrical arguments, the ultraspherical noncentral F-distribution for the F-statistic (3) was shown by Le Blanc [10] to be given by the integral representation
$$\upsilon_{F(\nu_1,\nu_2)}(\theta|\Lambda) = T_{F(\nu_1,\nu_2)}(\theta|\Lambda)\,\upsilon_{F(\nu_1,\nu_2)}(\theta|0), \tag{26}$$
where $\upsilon_{F(\nu_1,\nu_2)}(\theta|0)$ is identical to the central F distribution $\rho_{F(\nu_1,\nu_2)}(\theta|0)$ given in Equation (1), where the multiplicative distribution translation factor $T_{F(\nu_1,\nu_2)}(\theta|\Lambda)$ is given by the integral
$$T_{F(\nu_1,\nu_2)}(\theta|\Lambda) = \int_{\psi=0}^{\pi}T_{F(\nu_1,\nu_2)}(\theta,\psi|\Lambda)\,d\psi \tag{27}$$
with integrand
$$T_{F(\nu_1,\nu_2)}(\theta,\psi|\Lambda) = \frac{1-\cos\theta\cos\psi\cos\theta_\Lambda}{\left[1-2\cos\theta\cos\psi\cos\theta_\Lambda+\cos^2\theta_\Lambda\right]^{(\nu_1+\nu_2)/2}}\,\upsilon_{t(\nu_1-1)}(\psi|0), \tag{28}$$
and where
$$\cos\theta_\Lambda = \sqrt{\Lambda/(\Lambda+\nu_2)}, \qquad 0 \le \Lambda < \infty,\ 0 \le \cos\theta_\Lambda < 1, \tag{29}$$
in terms of the noncentrality parameter $\Lambda$. The special case $\nu_1 = 1$ is given by
$$\upsilon_{F(\nu_1=1,\nu_2)}(\theta|\Lambda) = \sum_{\cos\theta' \in \{-\cos\theta,\ \cos\theta\}}\frac{1-\cos\theta'\cos\theta_\Lambda}{\left[1-2\cos\theta'\cos\theta_\Lambda+\cos^2\theta_\Lambda\right]^{(\nu_2+1)/2}}\,\upsilon_{t(\nu_2)}(\theta|0). \tag{30}$$
Setting $a = (\nu_1-2)/2$, $b = (\nu_2-2)/2$, $x = \cos\theta$, $\xi = \cos\psi$, $z = \cos\theta_\Lambda$, and making use of the Gegenbauer polynomial generating function expansion (11) together with the explicit expression (12) of the corresponding polynomials, the integration in (27) can be carried out: we find
$$\begin{aligned}T_{F(a,b)}(x|z) &= \int_{\xi=-1}^{1}\frac{1-x\xi z}{\left[1-2x\xi z+z^2\right]^{a+b+2}}\,w_{t(a)}(\xi)\,d\xi = \sum_{n=0}^{\infty}\frac{a+b+1+n}{a+b+1}\,F_n^{(a,b)}(x^2)\,z^{2n}\\ &= \sum_{n=0}^{\infty}(-1)^n\,\frac{(a+b+2)_n}{(a+1)_n}\,P_n^{(a,b)}(1-2x^2)\,z^{2n}\\ &= (1+z^2)^{-(a+b+2)}\,{}_2F_1\!\left(\tfrac{1}{2}(a+b+2),\tfrac{1}{2}(a+b+3);a+1;\frac{4x^2z^2}{(1+z^2)^2}\right),\end{aligned} \tag{31}$$
where the polynomials $F_n^{(a,b)}(x^2)$, which are related to the Jacobi polynomials [4] through
$$F_n^{(a,b)}(x^2) = (-1)^n\,\frac{(a+b+1)_n}{(a+1)_n}\,P_n^{(a,b)}(1-2x^2), \tag{32}$$
are provided with the explicit representation
$$F_n^{(a,b)}(x^2) = \sum_{k=0}^{n}\frac{(-1)^k}{k!}\,\frac{(a+b+1)_{2n-k}}{(a+1)_{n-k}}\,\frac{(x^2)^{n-k}}{(n-k)!}, \tag{33}$$
and where the last equality, deduced from Rainville [17] (Equation (132.10)), provides a generating function for the polynomials in terms of the hypergeometric function ${}_2F_1$. The $F_n^{(a,b)}(x^2)$ polynomials are orthogonal with respect to the weight function
$$w_{F(a,b)}(x) = \frac{2\,\Gamma(a+b+2)}{\Gamma(a+1)\,\Gamma(b+1)}\,(x^2)^{a+\frac{1}{2}}\,(1-x^2)^{b}, \qquad 0 \le x \le 1, \tag{34}$$
which is identical to the central F distribution (1), except for the change of variable $x = \cos\theta$, with norm
$$\|F_n^{(a,b)}\|^2 = \frac{a+b+1}{a+b+1+2n}\,\frac{(a+b+1)_n\,(b+1)_n}{(a+1)_n\,n!}, \tag{35}$$
and with $\|F_0^{(a,b)}\|^2 = 1$ in particular. Equation (31) can be verified to be valid for the special case (30) with $a = -1/2$ ($\nu_1 = 1$). In the limit $\nu_2 \to \infty$, the central F distribution converges to the central $\chi^2$ distribution
$$\upsilon_{\chi^2(\nu_1)}(r|0) = \frac{1}{2^{\nu_1/2}\,\Gamma\!\left(\frac{\nu_1}{2}\right)}\,r^{(\nu_1-2)/2}\,e^{-r/2}, \qquad 0 \le r < \infty, \tag{36}$$
which, with $a = (\nu_1-2)/2$, can be used as a normalized defining weight function,
$$w_{\chi^2(a)}(r) = \frac{1}{2^{a+1}}\,\frac{1}{\Gamma(a+1)}\,r^{a}\,e^{-r/2}, \qquad a > -1, \tag{37}$$
for the Laguerre orthogonal polynomial family. The Laguerre polynomials can be given the explicit representation [4]
$$L_n^{(a)}(y) = \sum_{\ell=0}^{n}(-1)^{\ell}\,\frac{(a+\ell+1)_{n-\ell}}{(n-\ell)!}\,\frac{y^{\ell}}{\ell!}, \qquad 0 \le y < \infty. \tag{38}$$
Since Equation (32) provides the noncentral F distribution translation factor $T_{F(a,b)}(x|z)$ with an expansion in terms of the Jacobi polynomials $P_n^{(a,b)}(1-2x^2)$, and since the limit result
$$\lim_{b\to\infty}P_n^{(a,b)}\!\left(1-\frac{2y}{b}\right) = L_n^{(a)}(y) \tag{39}$$
holds [4], one verifies with $x^2 = r/2b$ and $z^2 = \Lambda/2b$ that
$$\lim_{b\to\infty}T_{F(a,b)}(x|z) = T_{\chi^2(a)}(r|\Lambda) = I_{(a)}\!\left(\sqrt{\Lambda r}\right)e^{-\Lambda/2} \equiv {}_0F_1\!\left(;a+1;\frac{\Lambda r}{4}\right)e^{-\Lambda/2} = \sum_{n=0}^{\infty}\frac{(-1)^n}{(a+1)_n}\,L_n^{(a)}(r/2)\,(\Lambda/2)^n, \tag{40}$$
which is a lesser-known expression for the noncentral $\chi^2$ distribution translation factor first derived by Tiku [18]. Under normalization (37) for their defining weight factor, the norm of the Laguerre polynomials $L_n^{(a)}(r/2)$ is given by
$$\int_{r=0}^{\infty}L_m^{(a)}(r/2)\,L_n^{(a)}(r/2)\,w_{\chi^2(a)}(r)\,dr = \delta_{m,n}\,\|L_n^{(a)}\|^2 = \delta_{m,n}\,\frac{(a+1)_n}{n!} \tag{41}$$
with, in particular, $\|L_0^{(a)}\|^2 = 1$.
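The Laguerre expansion of the noncentral $\chi^2$ translation factor can be checked against its ${}_0F_1$ closed form in the same spirit. The following sketch is a hedged illustration: the helper names are ours, the Laguerre polynomials are evaluated from their explicit finite sum, ${}_0F_1$ is summed from its defining series, and the truncation order `nmax` is an arbitrary choice that is ample for the moderate values of $\Lambda$ used here.

```python
import math

def pochhammer(x, n):
    """(x)_n = x (x+1) ... (x+n-1), with (x)_0 = 1."""
    out = 1.0
    for i in range(n):
        out *= x + i
    return out

def laguerre(n, a, y):
    """L_n^(a)(y) from its explicit finite-sum representation."""
    return sum((-1) ** l * pochhammer(a + l + 1, n - l) / math.factorial(n - l)
               * y ** l / math.factorial(l)
               for l in range(n + 1))

def hyp0f1(b, x, terms=80):
    """Defining series of 0F1(; b; x)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / ((b + n) * (n + 1))
    return total

def chi2_factor_closed(a, r, lam):
    """0F1(; a+1; lam r / 4) exp(-lam / 2)."""
    return hyp0f1(a + 1, lam * r / 4) * math.exp(-lam / 2)

def chi2_factor_series(a, r, lam, nmax=40):
    """Truncated expansion sum_n (-1)^n / (a+1)_n L_n^(a)(r/2) (lam/2)^n."""
    return sum((-1) ** n / pochhammer(a + 1, n)
               * laguerre(n, a, r / 2) * (lam / 2) ** n
               for n in range(nmax + 1))

if __name__ == "__main__":
    a, r, lam = 1.0, 3.0, 2.0   # a = 1 corresponds to nu_1 = 4 degrees of freedom
    print(chi2_factor_series(a, r, lam), chi2_factor_closed(a, r, lam))
```

Because the coefficient of $(\Lambda/2)^n$ carries the factor $1/(a+1)_n$, a handful of terms already reproduces the closed form for small $\Lambda$, which is what makes a low-order truncation attractive in practice.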
To summarize, the four ultraspherical noncentral F, $\chi^2$, t, and normal N distributions are given by the product of translation factors $T$, which take the form of generating functions for specific orthogonal polynomial families, times the corresponding normalized central distributions, which also stand for the corresponding polynomial family-defining weight functions. Thus, we have that
$$\upsilon_{F(a,b)}(x|z) = T_{F(a,b)}(x|z)\,w_{F(a,b)}(x), \qquad 0 \le x < 1,\ 0 \le z < 1, \tag{42a}$$
$$\upsilon_{\chi^2(a)}(r|\Lambda) = T_{\chi^2(a)}(r|\Lambda)\,w_{\chi^2(a)}(r), \qquad 0 \le r < \infty,\ 0 \le \Lambda < \infty, \tag{42b}$$
$$\upsilon_{t(b)}(x|z) = T_{t(b)}(x|z)\,w_{t(b)}(x), \qquad -1 < x < 1,\ -1 < z < 1, \tag{42c}$$
$$\upsilon_{N}(r|\delta) = T_{N}(r|\delta)\,w_{N}(r), \qquad -\infty < r < \infty,\ -\infty < \delta < \infty, \tag{42d}$$
with the normalized central distributions
$$w_{F(a,b)}(x) = \frac{2\,\Gamma(a+b+2)}{\Gamma(a+1)\,\Gamma(b+1)}\,(x^2)^{a+\frac{1}{2}}\,(1-x^2)^{b}, \tag{43a}$$
$$w_{\chi^2(a)}(r) = \frac{1}{2^{a+1}}\,\frac{1}{\Gamma(a+1)}\,r^{a}\,e^{-r/2}, \tag{43b}$$
$$w_{t(b)}(x) = \frac{\Gamma(b+1)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma\!\left(b+\frac{1}{2}\right)}\,(1-x^2)^{b-1/2}, \tag{43c}$$
$$w_{N}(r) = \upsilon_{N}(r|0) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}r^2}, \tag{43d}$$
providing the polynomial family-defining weights, and with the corresponding generating functions and orthogonal polynomial expansions given by
$$T_{F(a,b)}(x|z) = (1+z^2)^{-(a+b+2)}\,{}_2F_1\!\left(\tfrac{1}{2}(a+b+2),\tfrac{1}{2}(a+b+3);a+1;\frac{4x^2z^2}{(1+z^2)^2}\right) = \sum_{n=0}^{\infty}(-1)^n\,\frac{(a+b+2)_n}{(a+1)_n}\,P_n^{(a,b)}(1-2x^2)\,z^{2n} = \sum_{n=0}^{\infty}\frac{a+b+1+n}{a+b+1}\,F_n^{(a,b)}(x^2)\,z^{2n}, \tag{44a}$$
$$T_{\chi^2(a)}(r|\Lambda) \equiv {}_0F_1\!\left(;a+1;\frac{\Lambda r}{4}\right)e^{-\Lambda/2} = I_{(a)}\!\left(\sqrt{\Lambda r}\right)e^{-\Lambda/2} = \sum_{n=0}^{\infty}\frac{(-1)^n}{(a+1)_n}\,L_n^{(a)}(r/2)\,(\Lambda/2)^n, \tag{44b}$$
$$T_{t(b)}(x|z) = (1-xz)^{-(2b+1)}\,{}_2F_1\!\left(\frac{2b+1}{2},\frac{2b+2}{2};b+\frac{1}{2};\frac{-(1-x^2)z^2}{(1-xz)^2}\right) = \frac{1-xz}{\left[1-2xz+z^2\right]^{b+1}} = \sum_{n=0}^{\infty}\frac{2b+n}{2b}\,C_n^{(b)}(x)\,z^n, \tag{44c}$$
$$T_{N}(r|\delta) \equiv {}_0F_0(;;\delta r)\,e^{-\delta^2/2} = e^{r\delta-\delta^2/2} = \sum_{n=0}^{\infty}\mathrm{He}_n(r)\,\frac{\delta^n}{n!}. \tag{44d}$$
We conclude this section by stressing the fact that the random variable spaces for the ultraspherical noncentral distributions are translated hyperspheres rather than translated normal distributions, as is assumed for the classical noncentral distributions. The tide model developed in Section 6 provides a concrete example of such a distribution on a translated sphere. As argued more extensively in [15], the ultraspherical and classical noncentral F and t distributions correspond to projections of translated hyperspheres and translated normal distributions, respectively; they are identical to their central distributions when their noncentrality parameters are zero; and they converge in high-dimensional spaces, but diverge in low-dimensional spaces and for large noncentrality parameters. See Figure 1. These properties ultimately stem from the counterintuitive properties of the solid hypersphere, which concentrates its volume on a thin ultraspherical shell in high-dimensional spaces [3], allowing one to use the ultraspherical distributions as surrogates for the classical noncentral t and F distributions in high-dimensional spaces.

4. Entropic Convex Duals Expansion in Terms of Orthogonal Polynomials

Bayes–Jaynes–Gibbs data-constrained maximal entropy priors—simply designated as Gibbs priors in the following—can be objectively computed for dense datasets. See Le Blanc [8] for an extensive review on the subject, from which we recall that the Bayes–Laplace prior and posterior update rules are rooted in the convex geometry of Shannon’s entropy function, with the Kullback–Leibler relative entropy being a Bregman divergence defined in terms of the former. The Gibbs priors can be formally expressed as
$$\pi(r_o) = \frac{1}{Z(\lambda)}\exp\!\left(\int_{R}\lambda(r)\,\upsilon(r|r_o)\,dr\right), \tag{45}$$
where the partition function $Z(\lambda)$ is given by
$$Z(\lambda) = \int_{R_o}\exp\!\left(\int_{R}\lambda(r)\,\upsilon(r|r_o)\,dr\right)dr_o, \tag{46}$$
and where $\lambda(r)$ is the entropic convex dual of the empirical density $\upsilon(r)$. The latter is obtained through the unconstrained minimization of the Gibbs potential
$$\inf_{\lambda}G_{\rho}(\lambda) = \inf_{\lambda}\left[\log Z(\lambda) - \int_{R}\lambda(r)\,\upsilon(r)\,dr\right], \tag{47}$$
which, by convex duality, corresponds to Jaynes’ data-constrained maximal entropy. The corresponding Gibbs–Jaynes model for a generic empirical density $\upsilon(r)$ is given by
$$\upsilon(r) = \int_{R_o}\upsilon(r|r_o)\,\pi(r_o)\,dr_o = \upsilon(r|0)\int_{R_o}T(r|r_o)\,\pi(r_o)\,dr_o = \upsilon(r|0)\,\mathrm{BF}(r). \tag{48}$$
As posed, solving for the entropic convex dual function $\lambda(r)$ in (47) requires one to compute its value across the entire support domain of the empirical distribution $\upsilon(r)$, a task which can be computationally expensive. Now, since the ultraspherical noncentral distribution translation factors $T(r|r_o)$ are generating functions for orthogonal polynomial families, and since the corresponding central distributions $\upsilon(r|0)$ are the family-defining weight functions $w(r) \equiv \upsilon(r|0)$, one can rewrite the unconstrained minimization problem in simpler terms. Indeed, the exponentiated term in the partition function (46) can be rewritten as
$$\begin{aligned}\int_{R}\lambda(r)\,\upsilon(r|r_o)\,dr &= \int_{R}\lambda(r)\,T(r|r_o)\,w(r)\,dr = \int_{R}\lambda(r)\sum_{n=0}^{\infty}c_n\,P_n(r)\,r_o^{\,n}\,w(r)\,dr\\ &= \sum_{n=0}^{\infty}c_n\left[\int_{R}\lambda(r)\,P_n(r)\,w(r)\,dr\right]r_o^{\,n} = \sum_{n=0}^{\infty}c_n\,\lambda_n\,\|P_n\|^2\,r_o^{\,n}\\ &= \sum_{n=0}^{\infty}\tilde{\lambda}_n\,r_o^{\,n} \quad\text{with}\quad \tilde{\lambda}_n = c_n\,\lambda_n\,\|P_n\|^2,\end{aligned} \tag{49}$$
where the polynomials $P_n$, with their respective multiplicative coefficients $c_n$, are listed in (44), and where the $\lambda_n$ are the expansion coefficients of the entropic convex dual on the family-wise orthogonal polynomials, which remain to be determined. Similarly, the additive constraint term in (47) can be rewritten as
$$\int_{R}\lambda(r)\,\upsilon(r)\,dr = \sum_{n=0}^{\infty}\lambda_n\int_{R}P_n(r)\,\upsilon(r)\,dr = \sum_{n=0}^{\infty}\lambda_n\int_{R}P_n(r)\,\mathrm{BF}(r)\,w(r)\,dr = \sum_{n=0}^{\infty}\tilde{\lambda}_n\,\tilde{\beta}_n, \tag{50}$$
where $\upsilon(r)$ is the empirical density to be modeled, and where
$$\tilde{\beta}_n = \frac{1}{c_n\,\|P_n\|^2}\int_{R}P_n(r)\,\mathrm{BF}(r)\,w(r)\,dr, \tag{51}$$
with, in particular, $\tilde{\beta}_0 = 1$. Up to division by the factor $c_n$, the latter equality is the expansion of the Bayes factor $\mathrm{BF}(r)$ on the orthogonal polynomials in random sample space. The unconstrained optimization problem can thus be reformulated as
$$\inf_{\{\tilde{\lambda}_n\}}\left[\log\int_{R_o}\exp\!\left(\sum_{n=0}^{\infty}\tilde{\lambda}_n\,r_o^{\,n}\right)dr_o - \sum_{n=0}^{\infty}\tilde{\lambda}_n\,\tilde{\beta}_n\right] \tag{52}$$
in terms of the orthogonal polynomial expansion coefficient sets $\{\tilde{\lambda}_n\}_{n=0}^{\infty}$ for the continuous entropic convex dual function $\lambda(r)$, sets which can be restricted to a small finite number of coefficients, as can be assessed by the Kullback–Leibler divergence between the empirical density and its model in terms of orthogonal polynomials. At the minimum of the Gibbs potential (52), one has the simple condition
$$\frac{\int_{R_o}\exp\!\left(\sum_{n}\tilde{\lambda}_n\,r_o^{\,n}\right)r_o^{\,n}\,dr_o}{\int_{R_o}\exp\!\left(\sum_{n}\tilde{\lambda}_n\,r_o^{\,n}\right)dr_o} = \int_{R_o}\pi(r_o)\,r_o^{\,n}\,dr_o = \tilde{\beta}_n, \tag{53}$$
which states that the $n$th moment of the noncentrality parameter $r_o$, as weighted by the Gibbs prior
$$\pi(r_o) = \frac{\exp\!\left(\sum_{n}\tilde{\lambda}_n\,r_o^{\,n}\right)}{\int_{R_o}\exp\!\left(\sum_{n}\tilde{\lambda}_n\,r_o^{\,n}\right)dr_o}, \tag{54}$$
is, up to the factor $c_n$, equal to the $n$th coefficient of the Bayes factor expansion (51) on the orthogonal polynomial basis in random sample space. In practical terms, determining the Gibbs prior from a small number of polynomial expansion coefficients $\tilde{\lambda}_n$ of the entropic convex dual substantially reduces the computing time needed to find the minimum of the Gibbs potential.
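To make the reduced optimization problem concrete, the sketch below minimizes the truncated Gibbs potential for the two coefficients $\tilde{\lambda}_1$ and $\tilde{\lambda}_2$ only, on an unbounded noncentrality range approximated by a finite grid. Everything here is an assumption made for illustration and not the procedure used in the paper: the function name, the grid bounds, the starting point, and the choice of a damped Newton iteration. The $n = 0$ term is dropped because $\tilde{\beta}_0 = 1$ makes $\tilde{\lambda}_0$ cancel out of the potential. The gradient is the difference between the prior moments and the targets $\tilde{\beta}_n$, and the Hessian is the covariance matrix of $(r_o, r_o^2)$ under the current prior.

```python
import math

def gibbs_prior_two_terms(beta1, beta2, lo=-12.0, hi=12.0, npts=4001, iters=50):
    """Damped Newton minimization of log Z(lam) - lam1*beta1 - lam2*beta2,
    with Z(lam) the grid sum of exp(lam1*d + lam2*d^2) over the grid points d."""
    h = (hi - lo) / (npts - 1)
    grid = [lo + i * h for i in range(npts)]
    lam1, lam2 = 0.0, -0.5                      # start from a standard normal prior
    for _ in range(iters):
        w = [math.exp(lam1 * d + lam2 * d * d) for d in grid]
        z = sum(w)
        # First four moments of the current prior (the grid spacing cancels out).
        m = [sum(wi * d ** k for wi, d in zip(w, grid)) / z for k in range(1, 5)]
        g1, g2 = m[0] - beta1, m[1] - beta2     # gradient of the potential
        h11 = m[1] - m[0] ** 2                  # Hessian entries: covariances
        h12 = m[2] - m[0] * m[1]
        h22 = m[3] - m[1] ** 2
        det = h11 * h22 - h12 * h12
        s1 = (h22 * g1 - h12 * g2) / det        # Newton step H^{-1} g
        s2 = (-h12 * g1 + h11 * g2) / det
        step = 1.0
        while lam2 - step * s2 >= 0.0:          # keep exp(lam2 * d^2) integrable
            step *= 0.5
        lam1, lam2 = lam1 - step * s1, lam2 - step * s2
        if abs(g1) + abs(g2) < 1e-13:
            break
    return lam1, lam2

if __name__ == "__main__":
    # Targets beta1 = 1 and beta2 = 1.5 are the first two moments of a normal
    # prior with mean 1 and variance 0.5, for which lam1 = 2 and lam2 = -1.
    print(gibbs_prior_two_terms(1.0, 1.5))
```

With two coefficients on the real line the Gibbs prior is necessarily Gaussian, which gives a closed-form answer to compare against; with more coefficients or a bounded range the same iteration applies, but no such closed form is available.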
Once obtained, the Gibbs prior can be used to model the empirical density $\upsilon(r)$ according to Equation (48), in which the central distribution $\upsilon(r|0)$ stands for the weight function $w(r)$ for the ultraspherical distributions listed in (42). A generic density $\upsilon(r)$ can thus be expanded as
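$$
\begin{aligned}
\upsilon(r) &= \int_{R_o} \upsilon(r|r_o)\,\pi(r_o)\,dr_o = w(r) \int_{R_o} T(r|r_o)\,\pi(r_o)\,dr_o \\
&= w(r) \int_{R_o} \sum_{n=0}^{\infty} c_n P_n(r)\,r_o^n\,\pi(r_o)\,dr_o = w(r) \sum_{n=0}^{\infty} c_n \left[\int_{R_o} r_o^n\,\pi(r_o)\,dr_o\right] P_n(r) \\
&= w(r) \sum_{n=0}^{\infty} c_n \tilde{\beta}_n P_n(r) = w(r) \sum_{n=0}^{\infty} \frac{1}{\|P_n\|^2} \left[\int_R P_n(r')\,\mathrm{BF}(r')\,w(r')\,dr'\right] P_n(r),
\end{aligned}
\tag{55}
$$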
where we have invoked the optimality condition (53) in going from the second to the third line. This sequence of equalities provides us with two alternative ways,
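\begin{align}
\mathrm{BF}(r) = \frac{\upsilon(r)}{w(r)} &= \int_{R_o} T(r|r_o)\,\pi(r_o)\,dr_o \tag{56a} \\
&= \sum_{n=0}^{\infty} c_n \left[\int_{R_o} r_o^n\,\pi(r_o)\,dr_o\right] P_n(r) = \sum_{n=0}^{\infty} \frac{1}{\|P_n\|^2} \left[\int_R P_n(r')\,\mathrm{BF}(r')\,w(r')\,dr'\right] P_n(r), \tag{56b}
\end{align}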
to model the Bayes factors and the empirical densities: either one determines the Gibbs prior $\pi(r_o)$ via the optimization problem (52) and uses it to weight the analytical orthogonal polynomial-generating functions $T(r|r_o)$ listed in (44); or one directly determines the prior moments in parametric space in terms of the expansion coefficients of the Bayes factor $\mathrm{BF}(r)$ in random variable space, as per the last equality above. The latter strategy amounts to the construction of a probability distribution in terms of its moments [12]. In that respect, recall the Hausdorff moment problem, whose solution, when it exists, is unique, so that the collection of all moments of a probability distribution on a bounded interval uniquely determines the distribution [19], and the Hamburger moment problem, which considers the uniqueness of solutions for the same problem on the unbounded real line [20].
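The second strategy can be checked numerically in the normal case. The sketch below assumes the probabilists' Hermite polynomials $He_n$ with generating function $\exp(r\delta - \delta^2/2) = \sum_n He_n(r)\,\delta^n/n!$, so that $c_n = 1/n!$ and $\|He_n\|^2 = n!$ under the standard normal weight; listing (44) is not reproduced here and may use a different normalization. Under this convention the coefficient (51) reduces to $\tilde{\beta}_n = \int He_n(r)\,\upsilon(r)\,dr$. The two-point prior is hypothetical and serves only to make the prior moments known in closed form.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from numpy.polynomial.legendre import leggauss

# Hypothetical discrete prior on the noncentrality parameter delta.
deltas = np.array([0.0, 2.0])
probs = np.array([0.8, 0.2])

def normal_pdf(r, mean=0.0):
    """Unit-variance normal density centred at `mean`."""
    return np.exp(-0.5 * (r - mean) ** 2) / np.sqrt(2.0 * np.pi)

def mixture_density(r):
    """upsilon(r) = sum_k probs[k] * N(r | deltas[k], 1)."""
    return sum(p * normal_pdf(r, d) for p, d in zip(probs, deltas))

# Gauss-Legendre quadrature mapped to [-12, 14], wide enough for these tails.
nodes, weights = leggauss(400)
lower, upper = -12.0, 14.0
r = 0.5 * (upper - lower) * nodes + 0.5 * (upper + lower)
w = 0.5 * (upper - lower) * weights

def beta_tilde(n):
    """int He_n(r) upsilon(r) dr, the n-th coefficient under the assumed convention."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                      # selects He_n in hermeval
    return np.sum(w * hermeval(r, coeffs) * mixture_density(r))

orders = np.arange(6)
from_expansion = np.array([beta_tilde(n) for n in orders])
from_prior = np.array([np.sum(probs * deltas ** n) for n in orders])
```

The arrays `from_expansion` and `from_prior` should agree to quadrature accuracy, which illustrates how coefficients computed in random variable space deliver prior moments in parametric space without any optimization.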
Numerical exploration indicates that the convergence of the polynomial expansion of the generating functions of the ultraspherical noncentral t and F distributions is affected by the Gibbs phenomenon at the extremities of their finite noncentrality parameter ranges [21]. For this reason, when considering the noncentral t and F distributions, one can either solve for the Gibbs prior and use it to weight the corresponding analytic generating functions (44) in constructing the models, or use the large $b$ limit, in which these distributions converge to the noncentral normal $N$ and $\chi^2$ distributions, respectively, together with the associated Hermite and Laguerre orthogonal polynomial families, to directly compute the prior moments in parametric space in terms of the Bayes factor expansion coefficients in random variable space, according to (56). Both strategies are found to be unaffected by the Gibbs effect, as exemplified in the next section.

5. Genomics Examples

We first revisit, in this section, the NCBI Gene Expression Omnibus head and neck squamous cell carcinoma microarray dataset produced by Kuriakose et al. [22], pertaining to 22 paired samples ($N = 44$, $\nu_2 = 42$, $b = 41/2$) of normal versus cancerous tissues, and interrogating 11,302 genes via 2-sample t-tests. In Le Blanc [10], a maximal entropy prior equal to the central ultraspherical t distribution (1) was postulated. In Le Blanc [8], the entropic convex dual $\lambda(x)$ was computed on the full random variable range $-1 < x < 1$. Here, the Bayes factor is provided with two models. The first model, $\upsilon(x) = w_t(x) \int T(x|z)\,\pi(z)\,dz$, is given in terms of the Gibbs prior $\pi(z)$ (54) and the analytical Gegenbauer polynomial-generating function $T(x|z)$ (11). We assessed the computational gains in our determination of the entropic convex dual $\lambda(x)$ via Equation (52): with the empirical distribution $\upsilon(x)$ binned in 200 bins, and using the MATLAB® optimization subroutine fminunc in a 64-bit Windows environment on a PC with an Intel(R) Core(TM) i9-9900K CPU @ 3.60 GHz processor and 64 GB of RAM, the determination times for $\lambda(x)$ in terms of a small number of Gegenbauer polynomial coefficients were reduced by a factor of 100 to 1000 compared to previous determinations of $\lambda(x)$ on the full random variable range $-1 < x < 1$, with elapsed times as short as a few tenths of a second. In the second model, according to (56) and with the Hermite polynomials standing for the Gegenbauer polynomials in the large $b$ limit, the expansion of the Bayes factor $\mathrm{BF}(r)$ on the Hermite polynomials in random variable space provides the Gibbs prior moments $\int \pi(\delta)\,\delta^n\,d\delta$ in parametric space. As can be seen in Figure 2, both models converge rapidly to the empirical distribution, with ten coefficients or fewer.
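The structure of the first model, $\upsilon(x) = w_t(x) \int T(x|z)\,\pi(z)\,dz$, can be sketched as follows. Equations (1) and (11) are not reproduced in this section, so the code assumes a central weight proportional to $(1 - x^2)^{b - 1/2}$ and a translation factor $(1 - xz)/(1 - 2xz + z^2)^{b+1}$, extrapolated from the $\nu_2 = 2$ case used in the tide example of Section 6. The discretized prior is hypothetical and is not the prior fitted to the microarray data.

```python
import math
import numpy as np
from numpy.polynomial.legendre import leggauss

B = 41 / 2  # b = (nu_2 - 1) / 2 for nu_2 = 42

def central_weight(x, b=B):
    """Assumed central density, normalised and proportional to (1 - x^2)^(b - 1/2)."""
    log_norm = math.lgamma(b + 1) - math.lgamma(b + 0.5) - 0.5 * math.log(math.pi)
    return np.exp(log_norm + (b - 0.5) * np.log1p(-x ** 2))

def translation_factor(x, z, b=B):
    """Assumed closed form (1 - x z) / (1 - 2 x z + z^2)^(b + 1)."""
    return (1 - x * z) / (1 - 2 * x * z + z ** 2) ** (b + 1)

# Gauss-Legendre nodes lie strictly inside (-1, 1), so log1p(-x^2) stays finite.
x, wq = leggauss(200)

# Hypothetical discretised prior pi(z) on a small grid of noncentrality values.
z_grid = np.linspace(-0.3, 0.3, 13)
prior = np.exp(-0.5 * (z_grid / 0.1) ** 2)
prior /= prior.sum()

# Bayes factor as the prior-weighted translation factor, then the model density.
bayes_factor = sum(p * translation_factor(x, z) for p, z in zip(prior, z_grid))
model = central_weight(x) * bayes_factor

total_mass = np.sum(wq * model)
single_mass = np.sum(wq * central_weight(x) * translation_factor(x, 0.25))
```

Both `total_mass` and `single_mass` should be close to one: under the assumed forms, every translated distribution is normalized, so any prior-weighted mixture is normalized as well.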
Next, we study a genome-wide association study (GWAS) dataset. A GWAS is an observational study assessing a genome-wide set of genetic variants in different individuals, and seeking to identify statistically significant variant-trait associations. Such studies commonly focus on associations between single-nucleotide polymorphisms (SNPs) and traits. We retrieved the GenoMICC EUR vs. UK biobank controls dataset from the GenOMICC (Genetics Of Mortality In Critical Care) GWAS, comparing 2244 critically ill patients with COVID-19 from UK intensive care units with European ancestry-matched control individuals selected from the large population-based cohort of the UK Biobank [23]. A logistic regression model was used for each of the 4,380,209 SNPs individually tested for statistical significance. We computed the empirical density of all the statistical test-associated $\chi^2_{\nu_1 = 1}$ $r$-statistics, and used it to compute the first eight terms of the direct polynomial expansion (56) on Laguerre polynomials for the Bayes factor $\mathrm{BF}(p(r))$. As illustrated in Figure 3, the accrual of the successive polynomial expansion terms allows for an incrementally better fit of the p-value empirical density, which strongly deviates from the NHST null hypothesis $U(0, 1)$ in the low p-value range where associations are detected. In the Bayesian framework, NHST statistical significance is replaced by the strength of Bayesian evidence, as assessed by the magnitude of $\mathrm{BF}(p)$, which, in turn, allows for the computation of a local false discovery rate [11]
$$
\mathrm{fdr}(p) = \frac{1}{1 + \mathrm{BF}(p)},
$$
as is also illustrated in Figure 3.
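The local false discovery rate formula above is simple enough to evaluate directly. The sketch below uses illustrative Bayes factor values, not values from the GWAS dataset, and the helper names are hypothetical.

```python
import numpy as np

def local_fdr(bayes_factor):
    """Local false discovery rate fdr = 1 / (1 + BF)."""
    return 1.0 / (1.0 + np.asarray(bayes_factor, dtype=float))

def bf_threshold(fdr_level):
    """Bayes factor at which the local fdr equals fdr_level."""
    return 1.0 / fdr_level - 1.0

bf_values = np.array([1.0, 9.0, 99.0, 999.0])   # illustrative values only
fdr_values = local_fdr(bf_values)
```

With this definition, an fdr of 1% corresponds to a Bayes factor of 99, and a Bayes factor of one (no evidence either way) gives an fdr of one half.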

6. Geophysics Examples

We begin this section by modeling, in Figure 4, the latitudinal density of Earth's land and ice emerging above sea level [24]. Because an ideal Earth surface can be described by the sphere $S^{\nu_2 = 2}$, the parameter is $b = (\nu_2 - 1)/2 = 1/2$, and the weight factor $w_t^{(1/2)}(x)$ in (15) reduces to the constant $1/2$. As a consequence, Equation (55) is equivalent to a standard expansion on Legendre polynomials (equivalently, Gegenbauer $C_n^{(1/2)}$ polynomials), except that our normalization convention for the central distribution, $w_t^{(1/2)}(x) = 1/2$ rather than the usual weight $w(x) = 1$ on the span $[-1, 1]$, ensures that expansion (55) defines a normalized density.
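With $w(x) = 1/2$ and $\|P_n\|^2 = \int_{-1}^{1} P_n^2(x)\,\tfrac{1}{2}\,dx = 1/(2n+1)$, expansion (55) takes the form $\upsilon(x) = \tfrac{1}{2} \sum_n (2n+1) \left[\int P_n(x')\,\upsilon(x')\,dx'\right] P_n(x)$. The sketch below applies this to a hypothetical smooth density, which merely stands in for the land/ice data, and tracks the Kullback–Leibler divergence as polynomials are accrued.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

x, wq = leggauss(100)                     # quadrature nodes and weights on [-1, 1]

# Hypothetical smooth density on [-1, 1]; it only stands in for real data.
density = np.exp(0.8 * x)
density /= np.sum(wq * density)

def legendre(n, pts):
    """Legendre polynomial P_n evaluated at pts."""
    return legval(pts, np.eye(n + 1)[n])

def truncated_model(n_max):
    """(1/2) sum_{n <= n_max} (2n + 1) <P_n> P_n(x), with <P_n> = int P_n * density."""
    model = np.zeros_like(x)
    for n in range(n_max + 1):
        p_n = legendre(n, x)
        model += 0.5 * (2 * n + 1) * np.sum(wq * p_n * density) * p_n
    return model

def kl_divergence(p, q):
    """Kullback-Leibler divergence int p log(p / q) dx by quadrature."""
    return np.sum(wq * p * np.log(p / q))

kl = np.array([kl_divergence(density, truncated_model(n)) for n in range(8)])
```

The array `kl` decreases as terms are added, in the spirit of the right panel of Figure 4. For a rougher empirical density, a low-order truncation may become negative somewhere, in which case the divergence is undefined until enough terms are included.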
We conclude with an example in which the translation factor (9) for the ultraspherical noncentral t distribution is kept as-is, instead of using its expansion (11) in terms of Gegenbauer polynomials, to argue that the noncentral spherical t distribution (8) on $S^2$ can be used to describe the geometry of the gravity forces of a simple tide model. Consider Figure 5, which describes the gravitational pull of an ideal Moon of mass $m$ on the thin water layer covering an ideal Earth of mass $M$ and radius $R$ at distance $D$ from the Moon. The gravitational tidal force $T$ per unit mass at point $s$ on the Earth's surface is given by
$$
T = \frac{G m}{(D - R\cos\theta)^2 + R^2 \sin^2\theta} = \frac{G m}{D^2 - 2 D R \cos\theta + R^2}.
$$
The horizontal and vertical components of that force are given by
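$$
T_h = +\frac{G m}{D^2 - 2 D R \cos\theta + R^2} \times \cos\varphi = +\frac{G m\,(D - R\cos\theta)}{\left[D^2 - 2 D R \cos\theta + R^2\right]^{3/2}},
$$
$$
T_v = -\frac{G m}{D^2 - 2 D R \cos\theta + R^2} \times \sin\varphi = -\frac{G m\,R \sin\theta}{\left[D^2 - 2 D R \cos\theta + R^2\right]^{3/2}},
$$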
respectively. Setting our physical units so that $G m / D^2 = 1$, and defining $\cos\theta_\delta = R/D$, we have that
$$
T_h = \frac{1 - \cos\theta \cos\theta_\delta}{\left[1 - 2\cos\theta \cos\theta_\delta + \cos^2\theta_\delta\right]^{3/2}} = T_t^{(\nu_2 = 2)}(\theta|\delta),
$$
$$
T_v = -\frac{\cos\theta_\delta\,\sin\theta}{\left[1 - 2\cos\theta \cos\theta_\delta + \cos^2\theta_\delta\right]^{3/2}}.
$$
The horizontal tidal force component $T_h$ is thus simply given by the noncentral spherical t distribution translation factor $T_t^{(\nu_2 = 2)}(\theta|\delta)$, as provided by Equation (4). One can integrate this horizontal tidal force over the entire spherical shell by weighting it with the central spherical distribution $\upsilon_t^{(\nu_2 = 2)}(\theta|0)$ to account for the spherical geometry of the Earth. Since the integrand corresponds to the noncentral spherical t distribution $\upsilon_t^{(\nu_2 = 2)}(\theta|\delta) = T_t^{(\nu_2 = 2)}(\theta|\delta)\,\upsilon_t^{(\nu_2 = 2)}(\theta|0)$ on $S^2$, the tidal force integrates to one in our unit system. For the system to be stationary, a force must oppose this integrated tidal force of one. The inertial centrifugal force originating from the joint rotation of the Moon and Earth around their center of mass plays this role. Its magnitude is $G m / D^2 = 1$ in our unit system [25], and we add this opposing contribution to the normalized horizontal tidal force $T_h$ above, which amounts to subtracting one from $T_h$. We have plotted the resulting force field in Figure 5, with the noncentrality parameter $\cos\theta_\delta$ set to an unrealistic value of 0.2 to enhance the visualization of the geometrical distribution of the tidal forces. Unlike most illustrative tide diagrams with symmetrical bulges found in the literature, our tide diagram, with its asymmetrical equatorial water bulges, provides a more accurate visual depiction of the asymmetry of the tidal forces expected from purely geometrical considerations.
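The claims about the horizontal component can be checked numerically. The sketch below evaluates $T_h$ in the units $G m / D^2 = 1$, compares it with the Legendre series $\sum_n (n+1) P_n(\cos\theta) \cos^n\theta_\delta$ (the standard expansion of this closed form, assumed here to correspond to (11) for $b = 1/2$), and averages it over the sphere with the central weight $1/2$ in $x = \cos\theta$. Function names are illustrative.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

RATIO = 0.2  # cos(theta_delta) = R / D, the exaggerated value used for Figure 5

def horizontal_force(cos_theta, ratio=RATIO):
    """T_h = (1 - x rho) / (1 - 2 x rho + rho^2)^(3/2), with x = cos(theta)."""
    return (1 - ratio * cos_theta) / (1 - 2 * ratio * cos_theta + ratio ** 2) ** 1.5

def legendre_series(cos_theta, ratio=RATIO, terms=40):
    """Truncated series sum_n (n + 1) P_n(cos theta) ratio^n."""
    n = np.arange(terms)
    return legval(cos_theta, (n + 1) * ratio ** n)

x, wq = leggauss(64)

# Average over the sphere: central weight 1/2 in the variable x = cos(theta).
mean_force = np.sum(wq * 0.5 * horizontal_force(x))
series_gap = np.max(np.abs(legendre_series(x) - horizontal_force(x)))

# Net horizontal force after subtracting the unit centrifugal contribution.
near_side = horizontal_force(1.0) - 1.0    # point facing the Moon
far_side = horizontal_force(-1.0) - 1.0    # point opposite the Moon
```

For a ratio of 0.2 the net force is about 0.56 on the near side and about -0.31 on the far side, so the two bulges differ in magnitude, consistent with the asymmetry described above.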

7. Discussion

We have seen that the univariate noncentral distributions can be constructed in a modular fashion by multiplying their central distributions with specific translation factors. Using geometrical arguments, we have found that the translation factors for the ultraspherical noncentral t, normal, F, and $\chi^2$ distributions stand for generating functions for the Gegenbauer, Hermite, Jacobi, and Laguerre polynomial families, respectively, with their central distributions standing for the corresponding polynomial family-defining weights. These developments link four of the most important classical continuous probability distributions with the orthogonal polynomial formalism. To the best of our knowledge, the derivation of these translation factors and their identification as orthogonal polynomial family-generating functions have not been carried out before.
Jaynes’ maximal entropy prior is obtained through the unconstrained minimization of the Gibbs potential. In parametric Bayesian inference, the formal expression for the Gibbs potential comprises an integral of the product of the empirical distribution’s entropic convex dual $\lambda(r)$ and the parametric kernel or, equivalently, the likelihood function $\rho(r|r_o)$. In the case of the ultraspherical noncentral distributions, the latter integral yields discretized expansion coefficients of the entropic convex duals on orthogonal polynomial bases. The determination of the entropic convex duals is thus reduced to the much simpler and computationally economical determination of a few low-order orthogonal polynomial coefficients. By invoking the moment problem and the duality principle, prior moments in parametric space are equated with Bayes factor expansion coefficients over orthogonal polynomial bases in random variable space. To the best of our knowledge, the expansion and discretization of convex duals over orthogonal polynomial bases have not been proposed before. In an approach which bears some similarities to our work, Alibrandi and Ricciardi [26] proposed a moment-based approach, the use of a discretized kernel set $\{\rho(r|r_i)\}_{i=1}^{N}$, and the use of Jaynes’ maximal entropy principle to determine the set’s weighting factors $\{\pi_i \mid \sum_{i=1}^{N} \pi_i = 1\}$. While their kernel discretization is an ad hoc procedure, ours is principled, since it is based on the orthogonal polynomial formalism.
The machine learning community has begun to exploit the classical orthogonal polynomial formalism. A non-exhaustive review identifies the use of orthogonal polynomials in optical character recognition [27], in support vector machine kernel construction [28], and in polynomial-based iteration methods for symmetric linear systems [29,30]. In a similar vein, we hope to contribute the present formalism to both the statistical and the machine learning communities.

Funding

This research received no external funding.

Data Availability Statement

For the head and neck cancer microarray dataset, please refer to [31]. The COVID-19 GWAS dataset is available at https://genomicc.org/data (accessed on 3 January 2022). For Earth’s emerging land/ice latitudinal density, please refer to [32].

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANOVA    analysis of variance
dof      degrees of freedom
FDR      false discovery rate
GWAS     genome-wide association study
NHST     null-hypothesis statistical testing
SNP      single-nucleotide polymorphism

References

  1. Wikipedia Contributors. Effect Size—Wikipedia, The Free Encyclopedia. 2022. Available online: https://en.wikipedia.org/wiki/Effect_size (accessed on 7 March 2022).
  2. Saville, D.J.; Wood, G.R. Statistical Methods: A Geometric Primer; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1996.
  3. Vershynin, R. High-Dimensional Probability: An Introduction with Applications in Data Science; Cambridge University Press: Cambridge, UK, 2018; Volume 47.
  4. Olver, F.W.J.; Olde Daalhuis, A.B.; Lozier, D.W.; Schneider, B.I.; Boisvert, R.F.; Clark, C.W.; Miller, B.R.; Saunders, B.V.; Cohl, H.S.; McClain, M.A. NIST Digital Library of Mathematical Functions. 2021. Available online: http://dlmf.nist.gov/ (accessed on 15 September 2021).
  5. Jaynes, E.T. Information theory and statistical mechanics. Phys. Rev. 1957, 106, 620.
  6. Jaynes, E.T. Information theory and statistical mechanics. II. Phys. Rev. 1957, 108, 171.
  7. Pressé, S.; Ghosh, K.; Lee, J.; Dill, K.A. Principles of maximum entropy and maximum caliber in statistical physics. Rev. Mod. Phys. 2013, 85, 1115.
  8. Le Blanc, R. Entropic convex duality in the determination of data-constrained kernel-based Bayes-Jaynes priors. J. Convex Anal. 2022, 29, 623–647.
  9. Ismail, M.; Ismail, M.E.; van Assche, W. Classical and Quantum Orthogonal Polynomials in One Variable; Cambridge University Press: Cambridge, UK, 2005; Volume 13.
  10. Le Blanc, R. Bayesian Analysis on a Noncentral Fisher–Student’s Hypersphere. Am. Stat. 2019, 73, 126–140.
  11. Stephens, M. False discovery rates: A new deal. Biostatistics 2017, 18, 275–294.
  12. Schmüdgen, K. The Moment Problem; Springer: Berlin/Heidelberg, Germany, 2017; Volume 9.
  13. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 2015.
  14. Walck, C. Hand-book on statistical distributions for experimentalists. Univ. Stockh. 2007, 10, 96-01.
  15. Le Blanc, R. Noncentral univariate distributions. 2022; to be submitted.
  16. András, S.; Baricz, Á. Properties of the probability density function of the non-central chi-squared distribution. J. Math. Anal. Appl. 2008, 346, 395–402.
  17. Rainville, E.D. Special Functions; The Macmillan Company: New York, NY, USA, 1960; Volume 5.
  18. Tiku, M. Laguerre series forms of non-central χ² and F distributions. Biometrika 1965, 52, 415–427.
  19. Wikipedia Contributors. Hausdorff Moment Problem—Wikipedia, The Free Encyclopedia. 2021. Available online: https://en.wikipedia.org/w/index.php?title=Hausdorff_moment_problem&oldid=1016911791 (accessed on 27 February 2022).
  20. Wikipedia Contributors. Hamburger Moment Problem—Wikipedia, The Free Encyclopedia. 2021. Available online: https://en.wikipedia.org/w/index.php?title=Hamburger_moment_problem&oldid=1060022764 (accessed on 27 February 2022).
  21. Shizgal, B.D.; Jung, J.H. Towards the resolution of the Gibbs phenomena. J. Comput. Appl. Math. 2003, 161, 41–65.
  22. Kuriakose, M.; Chen, W.; He, Z.; Sikora, A.; Zhang, P.; Zhang, Z.; Qiu, W.; Hsu, D.; McMunn-Coffran, C.; Brown, S.; et al. Selection and validation of differentially expressed genes in head and neck cancer. Cell. Mol. Life Sci. 2004, 61, 1372–1383.
  23. Pairo-Castineira, E.; Clohisey, S.; Klaric, L.; Bretherick, A.D.; Rawlik, K.; Pasko, D.; Walker, S.; Parkinson, N.; Fourman, M.H.; Russell, C.D.; et al. Genetic mechanisms of critical illness in COVID-19. Nature 2021, 591, 92–98.
  24. Amante, C.; Eakins, B.W. ETOPO1 Global Relief Model Converted to PanMap Layer Format; NOAA-National Geophysical Data Center: Boulder, CO, USA, 2009.
  25. Matsuda, T.; Isaka, H.; Boffin, H.M. Confusion around the tidal force and the centrifugal force. arXiv 2015, arXiv:1506.04085.
  26. Alibrandi, U.; Ricciardi, G. Efficient evaluation of the pdf of a random variable through the kernel density maximum entropy approach. Int. J. Numer. Methods Eng. 2008, 75, 1511–1548.
  27. Abdulhussain, S.H.; Mahmmod, B.M.; Naser, M.A.; Alsabah, M.Q.; Ali, R.; Al-Haddad, S. A robust handwritten numeral recognition using hybrid orthogonal polynomials and moments. Sensors 2021, 21, 1999.
  28. Padierna, L.C.; Carpio, M.; Rojas-Domínguez, A.; Puga, H.; Fraire, H. A novel formulation of orthogonal polynomial kernel functions for SVM classifiers: The Gegenbauer family. Pattern Recognit. 2018, 84, 211–225.
  29. Fischer, B. Polynomial Based Iteration Methods for Symmetric Linear Systems; SIAM: Philadelphia, PA, USA, 2011.
  30. Pedregosa, F.; Scieur, D. Acceleration through spectral density estimation. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 13–18 July 2020; pp. 7553–7562.
  31. Kuriakose, M.; Chen, W.; He, Z.; Sikora, A.; Zhang, P.; Zhang, Z.; Qiu, W.; Hsu, D.; McMunn-Coffran, C.; Brown, S.; et al. Expression Data from Head and Neck Squamous Cell Carcinoma. 2007. Available online: https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/geo/query/acc.cgi?acc=GSE6631 (accessed on 7 March 2022).
  32. National Geophysical Data Center, NESDIS, NOAA, U.S. Department of Commerce. ETOPO1, Global 1 Arc-minute Ocean Depth and Land Elevation from the US National Geophysical Data Center (NGDC). 2011. Available online: https://rda.ucar.edu/datasets/ds759.4/ (accessed on 7 March 2022).
Figure 1. Symmetrized Kullback–Leibler divergence (a) between the hyperspherical $\upsilon_t^{(\nu_2)}(\theta|\delta)$ and the classical $\rho_t^{(\nu_2)}(\theta|\delta)$ noncentral t distributions in the upper-left-corner plot, and (b–f) between the hyperspherical $\upsilon_F^{(\nu_1, \nu_2)}(\theta|\Lambda)$ and the classical $\rho_F^{(\nu_1, \nu_2)}(\theta|\Lambda)$ noncentral F distributions for $\nu_1 = (2, \ldots, 6)$, respectively, in the other plots. The ultraspherical and classical noncentral t and F distributions correspond to projections of translated uniform distributions on unit radius hyperspheres and of translated normal distributions, respectively; they are identical in their central distributions when their noncentrality parameters $\delta$ or $\Lambda$ vanish; they converge in high-dimensional (large degree-of-freedom $\nu_2$) spaces, but diverge in low-dimensional spaces and for large noncentrality parameters $\delta$ or $\Lambda$. The ultraspherical noncentral distributions can be used as surrogates for the classical noncentral t and F distributions in high-dimensional spaces. See the text for details.
Figure 2. Modeling of the empirical random space densities and NHST p-value densities for the head and neck cancer dataset. The upper panels illustrate the Jaynes–Gibbs model $\upsilon(x) = w_t(x) \int T(x|z)\,\pi(z)\,dz$, as provided by the Gibbs prior (54) and the analytical generating function (11) for the Gegenbauer polynomials. (a) Upper left-hand panel: convergence of the Jaynes–Gibbs model to the empirical density $\upsilon(x) = w_t(x)\,\mathrm{BF}(x)$. (b) Upper right-hand panel: corresponding NHST t-test p-value model densities, as modeled by the Bayes factor $\mathrm{BF}(p)$. As can be observed, convergence to the empirical densities is rapidly achieved with the expansion of the entropic convex dual $\lambda(x)$ on a small number $n$ of Gegenbauer polynomials. In the lower panels, with the Hermite polynomials standing for the Gegenbauer polynomials in the large $b$ limit (21), the expansion coefficients of the Bayes factor $\mathrm{BF}(r)$ in random variable space directly provide the Gibbs prior moments $\int \pi(\delta)\,\delta^n\,d\delta$ in parametric space, according to (56). (c) Lower left-hand panel: cumulative orthogonal polynomial expansion of the Bayes factor $\mathrm{BF}(r)$ in random variable space. (d) Lower right-hand panel: corresponding NHST central normal distribution p-value densities, as modeled by the Bayes factor $\mathrm{BF}(p)$. As can be observed, convergence to the empirical densities is rapidly achieved with a small number $n$ of low-order Hermite polynomials.
Figure 3. Bayes factor $\mathrm{BF}(p)$ modeling an NHST p-value distribution from a genome-wide association study (GWAS) dataset. The GWAS compared 2244 critically ill patients with COVID-19 with 3 times as many ancestry-matched control individuals. The dataset comprises 4,380,209 $\chi^2_{\nu_1 = 1}$ $r$-statistics, accounting for all the SNPs in the set, which have been modeled by logistic regression and tested for statistical significance. (a) Left panel: accrual of the successive Laguerre polynomial expansion terms in Equation (56) for the Bayes factor, demonstrating an incrementally better fit of the p-value empirical density, which strongly deviates from the NHST null hypothesis $U(0, 1)$ in the low p-value range. (b) Right panel: local false discovery rate $\mathrm{fdr}(p) = 1/(1 + \mathrm{BF}(p))$. The Bayesian-based fdr crosses the 0.01 threshold (i.e., an fdr of 1%) when the NHST p-value reaches about $10^{-7}$ ($-\log_{10}(p) = 7$), in close concordance with the threshold of significance of $5 \times 10^{-8}$ ($-\log_{10}(p) = 7.3$) chosen by the authors.
Figure 4. Earth’s emerging land/ice latitudinal density. Orthogonal Legendre $P_n$ polynomial (Gegenbauer $C_n^{(1/2)}$ polynomial) modeling of Earth’s emerging land/ice masses’ latitudinal density. (a) Left panel: orthogonal Legendre $P_n$ polynomial modeling (55) of Earth’s emerging land/ice latitudinal density as expanded on the first 30 Legendre polynomials. (b) Right panel: Kullback–Leibler divergence between the empirical density and the model as a function of the number of orthogonal Legendre polynomials accrued.
Figure 5. Tide geometry. Idealized model describing the gravitational tidal pull of an ideal Moon of mass $m$ on a thin water layer covering an ideal Earth of mass $M$ and radius $R$ at distance $D$ from the Moon. The horizontal tidal force is given by the modular translation factor $T_t^{(\nu_2 = 2)}(\theta|\delta)$ (9), defining on $S^2$ the noncentral spherical distribution $\upsilon_t^{(\nu_2 = 2)}(\theta|\delta)$ (8) with $\cos\theta_\delta = R/D$, minus a factor of one, accounting for the centrifugal force induced by the Moon–Earth system revolving around its center of gravity. The equatorial bulges are not symmetric in this purely geometrical model. The noncentrality parameter $\cos\theta_\delta$ is set to an unrealistic value of 0.2 to enhance the visualization of the geometrical distribution of the tidal forces.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Le Blanc, R. Jaynes-Gibbs Entropic Convex Duals and Orthogonal Polynomials. Entropy 2022, 24, 709. https://0-doi-org.brum.beds.ac.uk/10.3390/e24050709
