Article

To Be or to Have Been Lucky, That Is the Question

Antony Lesage and Jean-Marc Victor
1. Physico-Chimie des Électrolytes et Nano-Systèmes Interfaciaux, PHENIX, Sorbonne Université, CNRS, F-75005 Paris, France
2. Physique Théorique de la Matière Condensée, LPTMC, Sorbonne Université, CNRS, F-75005 Paris, France
* Author to whom correspondence should be addressed.
Submission received: 21 April 2021 / Revised: 1 July 2021 / Accepted: 6 July 2021 / Published: 9 July 2021
(This article belongs to the Special Issue Logic and Science)

Abstract
Is it possible to measure the dispersion of ex ante chances (i.e., chances “before the event”) among people, be it gambling, health, or social opportunities? We explore this question and provide some tools, including a statistical test, to evidence the actual dispersion of ex ante chances in various areas, with a focus on chronic diseases. Using the principle of maximum entropy, we derive the distribution of the risk of becoming ill in the global population as well as in the population of affected people. We find that affected people are either at very low risk, like the overwhelming majority of the population, but still were unlucky to become ill, or are at extremely high risk and were bound to become ill.

1. Introduction

“That evening he was lucky”: what do we mean by this? It is even weirder when we say: “the luck turned”. Does this mean that we could be visited by fortune? Or that some people are luckier than others on certain days? Of course, we cannot rule out the fact that some people may bias the chances of success simply by cheating. Yet, is there any way to assess the dispersion of chances among gamblers (or just the fraction of cheaters)?
This kind of question is part of the field of probability calculus, which aims at determining the relative likelihoods of events (for a nice historical introduction to probability theory, see [1]). Probability calculus started during the summer of 1654 with the correspondence between Pascal and Fermat precisely on the elementary problems of gambling [2]. Symmetry arguments are at the heart of this calculus: for example, for an unbiased coin, the two results—heads or tails—are a priori equivalent and therefore, have the same probability of occurrence of 1 / 2 . This is why it is not anecdotal that Pascal wanted to give his treatise the “astonishing” title “Geometry of Chance”. Another illustration of the power of symmetry arguments is the tour de force of Maxwell who managed to calculate the velocity distribution of particles in idealized gases [3]. At the time when he derived what is since called the Maxwell–Boltzmann distribution, there was no possibility to measure this distribution. It was almost 60 years before Otto Stern could achieve the first experimental verification of this distribution [4], around the same time when he confirmed with Walther Gerlach the existence of the electron spin [5], for which he won the Nobel Prize in 1944. The agreement between theoretical and experimental distributions was surprisingly good. Since its invention in the middle of the 17th century, probability calculus has accompanied most if not all new fields of science, especially since the beginning of the 20th century with the burst of genetics and quantum physics up to the most recent developments of quantum cognition [6], not to mention its countless applications in finance and economy.
In probability theory, events are usually associated with measurable random variables. For example, in the heads or tails game, heads may be associated with $1$ and tails with $0$. Then, for a given number $N$ of draws, one can count the number of times heads comes up. This number $k$ lies between $0$ and $N$, and the ratio $k/N$ is the frequency of heads. If the coin is unbiased, this frequency fluctuates around $1/2$ when the game ($N$ draws per game) is played many times. Importantly, the frequency is observed ex post, i.e., after the game is played; the mean frequency is used as a measure of the probability of getting heads. This is the usual way of assessing probabilities in the frequentist perspective of statistics. Remember that assessing probabilities in order to anticipate the outcome of future events is the very purpose of statistics. However, it is not always possible to deduce probabilities from frequency measurements. For example, suppose that each coin is tossed only once. Can we still assess the dispersion of chances among gamblers?
Dispersion of chances is far from being limited to gamblers. Disease risk is another area where people may be, and actually are, unequal for genetic or environmental reasons. In this case, the result of a “draw” is whether or not you have a disease $D$. The “game” is then limited to one “draw” per person. Of course, the mean probability of becoming ill can still be observed. Yet, can we assess the dispersion of disease risks and, if so, how? As a last emblematic example, we mention social opportunities. Measuring inequality of opportunity is a crucial issue with considerable political stakes, though it is extremely difficult to assess. On this last point, we postpone the in-depth study of the measurement of unequal opportunities to future work.
In all these examples, be it gambling, disease, or social opportunity, the ex ante chances are themselves random variables that cannot be deduced from frequency measurements nor be induced by symmetry arguments. They are hidden variables. Nevertheless, we argue here that the probability distribution function (pdf) of the ex ante chances can be assessed and we propose some tools to (i) first test the existence of some dispersion of chances in the population; (ii) then, infer the pdf of the ex ante chances; and (iii) explore more specifically the relevance of those tools to and their consequences in the field of chronic diseases, i.e., diseases that occur at various ages and persist throughout life [7]. Importantly, we do not assume any hypothetical functional form for the pdf of chances and then infer its parameters by Bayesian inference as is usually carried out. Here, we first test the inequality of chances in the population, then infer the functional form of the pdf by means of the principle of maximum entropy.

2. A Simple Draw Is Not Enough

Let us first assume that there is a sample of $n$ people tossing a coin and that each of them has a probability $p_i$ to win (hence, $1 - p_i$ to lose). In an unbiased game, all the $p_i$ are identical and equal to $1/2$. Imagine that some gamblers are luckier and others less fortunate, so that some $p_i$ are greater than $1/2$ and others smaller. This means that the $p_i$ are random variables drawn from a probability distribution $f(p)$ that is different from $\delta(p - 1/2)$, where $\delta$ is the Dirac delta function. Let $\Phi$ and $\Sigma^2$ be the mean and variance of $f(p)$. Let us assume now that each individual plays $N$ times. The result of each draw $j$ of the individual $i$ is a random variable $X_{ij}$, either $1$ in case of success or $0$ in case of failure. This is a Bernoulli process: for each $i$, the random variables $X_{ij}$ are i.i.d. (independent, identically distributed, i.e., the probability of success $p_i$ is the same for the $N$ draws of $i$). Let us define $S_i = \sum_{j=1}^{N} X_{ij}$ as the score over $N$ draws. It is the number of times the individual $i$ has won. Given the risk $p_i$, $S_i$ is a random variable that follows a binomial distribution $B(N, p_i)$. The mean and the variance of $S_i$ for a given risk $p_i$ are
$$E_N(S_i \mid p_i) = N p_i, \qquad \mathrm{Var}_N(S_i \mid p_i) = N p_i (1 - p_i)$$
Once every individual has played $N$ times, we obtain an estimation of the distribution of the $n$ random variables $S_i$ as a histogram over the $N+1$ values $k = 0, 1, 2, \ldots, N$. These random variables $S_i$ are independent but non-identically distributed, as the $p_i$ differ from one individual to another.
Just as the $p_i$ are drawn from the distribution $f(p)$, the $S_i$ are the realizations of a random variable $S$ (which takes the $N+1$ discrete values $k = 0, 1, 2, \ldots, N$). The underlying distribution is no longer only on the random variable $S$, but on the joint probability of $S$ and $p$. Thus, the marginal probability distribution function of $S$ is given as follows:
$$P_N(S = k) = E\big[P_N(S = k \mid p)\big] = E\big[C_N^k\, p^k (1-p)^{N-k}\big] = \int_0^1 dp\, f(p)\, C_N^k\, p^k (1-p)^{N-k}, \qquad k = 0, 1, \ldots, N \qquad (1)$$
where $E[\,\cdot\,]$ is the expected value with respect to the probability distribution $f(p)$ of $p$, and $C_N^k = \frac{N!}{k!\,(N-k)!}$ is the binomial coefficient “$N$ choose $k$”, i.e., the number of $k$-combinations of $N$ elements. The mean of $S$ is
$$E_N(S) = E\big[E_N(S \mid p)\big] = E\Big[\sum_{k=0}^{N} k\, C_N^k\, p^k (1-p)^{N-k}\Big] = E[N p]$$
where $E_N[\,\cdot\,]$ is the expected value with respect to the probability distribution $P_N(S)$ of $S$, and $E_N(S \mid p)$ is the conditional expected value of $S$ for a given underlying probability $p$, i.e., under the binomial distribution $B(N, p)$. Since $\Phi$ is the mean of the distribution $f(p)$,
$$E_N(S) = N\, E[p] = N \Phi \qquad (2)$$
and the variance of S is
$$\mathrm{Var}_N(S) = E_N(S^2) - E_N(S)^2$$
where
$$E_N(S^2) = E\big[E_N(S^2 \mid p)\big] = E\Big[\sum_{k=0}^{N} k^2\, C_N^k\, p^k (1-p)^{N-k}\Big] = E\big[N p (1-p) + N^2 p^2\big]$$
hence
$$E_N(S^2) = N\big(E[p] - E[p^2]\big) + N^2 E[p^2]$$
and
$$\mathrm{Var}_N(S) = N\big(E[p] - E[p^2]\big) + N^2\big(E[p^2] - E[p]^2\big)$$
Now, we recall the first two moments of $f(p)$, given its mean $\Phi$ and its variance $\Sigma^2$:
$$E[p] = \Phi, \qquad E[p^2] = \Sigma^2 + \Phi^2$$
so that
$$\mathrm{Var}_N(S) = N\big(\Phi(1-\Phi) - \Sigma^2\big) + N^2 \Sigma^2 = N\,\Phi(1-\Phi) + N(N-1)\,\Sigma^2 \qquad (3)$$
Equation (3) gives the variance of the score $S$ as a function of the variance $\Sigma^2$ of $f(p)$. In the following, for the sake of clarity, we will refer to $\Sigma^2$ as the dispersion of chances.
Note that in the limit $N \to \infty$, the probability distribution of the random variable $S/N$ converges to the distribution $f(p)$.
We simulated two populations of $n$ gamblers, each drawing $N$ times. Both populations have the same mean chance of gain $\Phi = 1/2$. However, in the first population, the chance distribution is
$$f_0(p) = \delta\!\left(p - \tfrac{1}{2}\right)$$
for which there is no dispersion of chances, i.e., $\Sigma^2 = 0$. In contrast, in the second population, the chance distribution is
$$f_1(p) = \tfrac{1}{2}\big[\delta(p) + \delta(1-p)\big]$$
for which the dispersion is maximal, i.e., $\Sigma^2 = 1/4$. The histograms are plotted in Figure 1 for a number $n$ of gamblers ranging from $10$ to $100$ and a number $N$ of draws ranging from $1$ to $4$. Equation (3) shows that if $N = 1$, the variance $\mathrm{Var}_1(S) = \Phi(1-\Phi)$ does not depend on the dispersion of chances $\Sigma^2$. As a matter of fact, when $N = 1$, the gains are either $0$ or $1$, so that the histogram of gains has only two bins, one at $0$ and the other at $1$. The mean of the gains is $\Phi$ and the variance is $\Phi(1-\Phi)$. Neither the mean nor the variance depends on the dispersion of chances $\Sigma^2$. Moreover, according to Equation (1), the histogram of gains itself depends only on the mean of the distribution $f(p)$:
$$P_1(S = 0) = E[1 - p] = 1 - \Phi, \qquad P_1(S = 1) = E[p] = \Phi \qquad (4)$$
The histogram of gains, therefore, cannot provide information on the dispersion of chances, as shown in Figure 1 for the case $N = 1$, where the histograms for $f_0$ and $f_1$ are indistinguishable. This means that a simple draw is not enough to extract the variance of $f(p)$ from the histogram of gains; multiple draws are necessary, though are they sufficient?
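To make the argument concrete, here is a minimal simulation sketch (not the authors' code; it assumes NumPy) comparing the two populations $f_0$ and $f_1$ defined above. For $N = 1$ the empirical variances of the two populations coincide, whereas for $N \geq 2$ they separate according to Equation (3).
```python
# Simulation sketch: n gamblers, N draws each, chances drawn either from
# f_0(p) = delta(p - 1/2) (no dispersion) or f_1(p) = [delta(p) + delta(1-p)]/2
# (maximum dispersion). Both populations have the same mean chance Phi = 1/2.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000  # a large number of gamblers, to tame sampling noise

def scores(N, dispersed):
    """Scores S_i after N draws; p_i drawn from f_1 if dispersed, else from f_0."""
    p = rng.choice([0.0, 1.0], size=n) if dispersed else np.full(n, 0.5)
    return rng.binomial(N, p)

for N in (1, 2, 3, 4):
    s0, s1 = scores(N, dispersed=False), scores(N, dispersed=True)
    # Equation (3): Var_N(S) = N*Phi*(1-Phi) + N*(N-1)*Sigma^2, with Phi = 1/2
    print(N, round(s0.var(), 3), N * 0.25,                         # f_0: Sigma^2 = 0
             round(s1.var(), 3), N * 0.25 + N * (N - 1) * 0.25)    # f_1: Sigma^2 = 1/4
```
For $N = 1$ the two printed variances agree, so the gains alone cannot reveal the dispersion; from $N = 2$ onwards they differ, which is precisely what the test of the next section exploits.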

3. A Statistical Test of the Dispersion of Chances

We then note in Figure 1 that the histogram of gains for two draws ($N = 2$) has three bins, one at $0$, the second at $1$ and the third at $2$, with the following values:
$$P_2(S = 0) = E\big[(1-p)^2\big] = (1-\Phi)^2 + \Sigma^2, \qquad P_2(S = 1) = E\big[2 p (1-p)\big] = 2\Phi(1-\Phi) - 2\Sigma^2 \qquad (5)$$
$$P_2(S = 2) = E\big[p^2\big] = \Phi^2 + \Sigma^2 \qquad (6)$$
Hence, the histogram of gains now depends on (and only on) both the mean and the variance of $f(p)$. Note that Equation (5) shows that $\Sigma^2 \leq \Phi(1-\Phi)$, since $P_2(S = 1) \geq 0$; moreover, $\Phi(1-\Phi)$ is maximal when $\Phi = 1/2$. For three or more draws, we could also have access to higher-order moments of $f(p)$. Nevertheless, the minimum condition for the presence of a dispersion of chances is that the variance of $f(p)$ is non-zero. We therefore propose to design a statistical test that will be able to discriminate between the two following hypotheses:
Null hypothesis $H_0$: everybody has the same probability $\Phi$ of gain. This means that $f_0(p) = \delta(p - \Phi)$, whose mean is $E[p] = \Phi$ and whose dispersion is $\Sigma^2 = 0$.
Alternative hypothesis $H_1$: $f$ has the same mean $\Phi$, but there is some dispersion of chances among the population, so that some people are luckier than others; hence, $f$ has a non-zero dispersion $\Sigma^2$.
According to $H_0$, the mean score over $N$ draws is $N\Phi$ and the variance is $N\Phi(1-\Phi)$, whereas according to $H_1$, the mean score over $N$ draws is also $N\Phi$ but the variance is $N\big(\Phi(1-\Phi) - \Sigma^2\big) + N^2\Sigma^2$. Hence, if the variance $\mathrm{Var}_N(S)$ grows linearly with $N$, then all individuals have the same probability $p$ of success. If, on the contrary, $\mathrm{Var}_N(S)$ grows quadratically with $N$, then not all individuals have the same chance of success. We can, therefore, rephrase our hypothesis test as the following alternative based on the dependence of the variance $\mathrm{Var}_N(S)$ on the number $N$ of draws:
Null hypothesis $H_0$: the variance $\mathrm{Var}_N(S)$ grows linearly with $N$.
Alternative hypothesis $H_1$: the variance $\mathrm{Var}_N(S)$ grows quadratically with $N$.
Figure 2 shows how the variance of $S$ varies as a function of the number of draws $N$ for two typical distributions of mean $1/2$: $f_0$, with zero dispersion, and $f_1$, with maximum dispersion $1/4$. The distribution $f_0$ (resp. $f_1$) illustrates the case of a variance growing linearly (resp. quadratically) with $N$.
A relevant statistical test is needed to discriminate between the two hypotheses H 0 and H 1 , or at least to reject the null hypothesis H 0 . Moreover, in the remainder of this paper, we are particularly interested in the case N = 2 . It is then necessary to reformulate our hypotheses because it becomes difficult to discriminate the quadratic behavior from the linear behavior with only three points. Therefore, we rephrase our hypothesis test, based on the fact that the number of draws is limited to N = 2 :
Null hypothesis $H_0$: the variance of $S$ reads $\mathrm{Var}_2(S) = 2\Phi(1-\Phi)$, i.e., $\Sigma^2 = 0$.
Alternative hypothesis $H_1$: the variance of $S$ reads $\mathrm{Var}_2(S) = 2\Phi(1-\Phi) + 2\Sigma^2$ with $\Sigma^2 > 0$.
To estimate the variance of S from a sample of n individuals, the unbiased variance estimator is used:
$$V_n = \frac{1}{n-1} \sum_{i=1}^{n} \left(S_i - \bar{S}\right)^2$$
where S ¯ is the mean estimator
$$\bar{S} = \frac{1}{n} \sum_{i=1}^{n} S_i$$
The estimation of the variance of S , V n , from a sample of finite size n is subject to statistical fluctuations. Thus, our hypotheses become:
Null hypothesis $H_0$: $V_n - 2\Phi(1-\Phi)$ is compatible with $0$ given the error bars, i.e., the standard deviation of $V_n$.
Alternative hypothesis $H_1$: $V_n - 2\Phi(1-\Phi) = 2\Sigma^2 > 0$.
The variance of $V_n$ reads (see Appendix A)
$$\mathrm{Var}_{\Sigma^2}(V_n) = \frac{2n}{(n-1)^2}\Big[\Psi(1-2\Psi) + 7(1-4\Psi)\,\Sigma^2 - 2\Sigma^4\Big] + \frac{8}{(n-1)^2}\big(\Psi + \Sigma^2\big)^2 \qquad (7)$$
where $\Psi = \Phi(1-\Phi)$. Its asymptotic expression for $n \gg 1$ reads
$$\mathrm{Var}_{\Sigma^2}(V_n) \sim \frac{2}{n}\Big[\Psi(1-2\Psi) + 7(1-4\Psi)\,\Sigma^2 - 2\Sigma^4\Big] \qquad (8)$$
Figure 3 compares the expression of the variance $\mathrm{Var}_{\Sigma^2}(V_n)$ (black dashed line) obtained in Equation (7) and its asymptotic expression (grey dashed line) in Equation (8) with simulations (blue dots) and shows good agreement.
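A check in the same spirit can be run in a few lines; the following sketch (not the authors' code; it assumes NumPy, and the two-point chance distribution is chosen purely for convenience) compares the empirical variance of $V_n$ over many simulated samples with Equation (7).
```python
# Monte Carlo check of Equation (7) for Var(V_n), in the spirit of Figure 3.
# Assumption: chances drawn from a two-point distribution p = Phi +/- Sigma,
# which has the requested mean Phi and dispersion Sigma^2.
import numpy as np

rng = np.random.default_rng(2)
phi, sigma2 = 0.5, 0.15
n, trials = 200, 20_000          # n players per sample, many independent samples

p = rng.choice([phi - np.sqrt(sigma2), phi + np.sqrt(sigma2)], size=(trials, n))
scores = rng.binomial(2, p)                 # N = 2 draws per player
v_n = scores.var(axis=1, ddof=1)            # unbiased variance estimator, one per sample

psi = phi * (1 - phi)
eq7 = (2 * n / (n - 1) ** 2 * (psi * (1 - 2 * psi) + 7 * (1 - 4 * psi) * sigma2 - 2 * sigma2 ** 2)
       + 8 / (n - 1) ** 2 * (psi + sigma2) ** 2)
# The two values should agree up to Monte Carlo noise and the large-n
# approximations behind Equation (7).
print(v_n.var(), eq7)
```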
It can be noted that the distribution of $V_n$ tends towards a normal distribution $\mathcal{N}\big(E_{\Sigma^2}(V_n), \mathrm{Var}_{\Sigma^2}(V_n)\big)$ of mean $E_{\Sigma^2}(V_n) = \mathrm{Var}_2(S)$ and variance $\mathrm{Var}_{\Sigma^2}(V_n)$. Now, we wish to estimate the probability of having obtained a value as high as $V_n$ under the null hypothesis $H_0$, i.e., the p-value. Since $V_n$ follows a normal distribution, the p-value can be expressed as follows
$$\text{p-value} = \frac{1}{2}\left[1 - \operatorname{erf}\!\left(\frac{z}{\sqrt{2}}\right)\right] = \frac{1}{2}\operatorname{erfc}\!\left(\frac{z}{\sqrt{2}}\right) \qquad (9)$$
where $\operatorname{erf}$ and $\operatorname{erfc}$ are, respectively, the error function and the complementary error function. Denoting by $E_0(V_n)$ and $\mathrm{Var}_0(V_n)$ the mean and the variance of $V_n$ under the null hypothesis $H_0$, i.e., $\Sigma^2 = 0$, we have
$$z = \frac{V_n - E_0(V_n)}{\sqrt{\mathrm{Var}_0(V_n)}} = \frac{V_n - 2\Phi(1-\Phi)}{\sqrt{\mathrm{Var}_0(V_n)}}$$
In the limit of large sample sizes $n \gg 1$, one can write, using again $\Psi = \Phi(1-\Phi)$:
$$z \simeq \sqrt{n}\;\frac{V_n - 2\Psi}{\sqrt{2\Psi(1-2\Psi)}}$$
In the context of Figure 2 restricted to the case $N = 2$ and $n = 100$, the estimated variance $V_n$ for the distribution $f_1$ (of mean $1/2$ and dispersion $1/4$) leads to $z \approx 20$, i.e., a p-value of the order of $10^{-87}$. This allows us to reject the null hypothesis in this case.
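As an illustration, the whole test fits into a short function; the following sketch (not the authors' code; it assumes NumPy and the Python standard library) applies Equation (9) to a simulated population of gamblers whose chances are drawn from $f_1$.
```python
# Sketch of the N = 2 dispersion test of this section. Given the scores S_i of n
# individuals (two draws each) and the mean chance Phi, it returns the unbiased
# variance V_n, the z statistic and the p-value of Equation (9).
import math
import numpy as np

def dispersion_test(scores, phi):
    n = len(scores)
    v_n = np.var(scores, ddof=1)                  # unbiased estimator of Var_2(S)
    psi = phi * (1 - phi)
    z = math.sqrt(n) * (v_n - 2 * psi) / math.sqrt(2 * psi * (1 - 2 * psi))
    p_value = 0.5 * math.erfc(z / math.sqrt(2))   # Equation (9)
    return v_n, z, p_value

rng = np.random.default_rng(0)
p = rng.choice([0.0, 1.0], size=100)      # chances drawn from f_1 (maximum dispersion)
scores = rng.binomial(2, p)               # N = 2 draws per gambler
v_n, z, p_value = dispersion_test(scores, phi=0.5)
print(v_n, z, p_value)   # z is large and the p-value negligible: H_0 is rejected
```
Replacing $f_1$ by $f_0$ (all chances equal to $1/2$) gives, by contrast, a $z$ of order unity and a p-value compatible with $H_0$.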

4. Dispersion of Disease Risks for Twins

Inequality in disease risk is a major public health issue [8,9]. Of course, part of this inequality is known to depend on genetic and environmental factors. At the turn of the 2000s, a new approach called genome-wide association studies (GWAS) was designed to characterize the genetic predisposition to a chronic disease [10]. GWAS are supposed to find, in particular, the genes involved in a given disease and, among these genes, the variants most at risk, i.e., the DNA sequences of a given gene that are over-represented in the people affected by the disease. Such variants characterize the genetic predisposition to the disease. The mean frequency with which individuals become ill in a given population, specified by genetic and environmental factors, can then be measured. As usual, this frequency can be used as a measure of the probability of becoming ill. However, can we assess the dispersion of disease risk, if it exists at all, in this specific population? More generally, is there any way to assess the dispersion of risk in a more objective manner, without any a priori assumption on presumed risk factors? Here, providential help comes from the existence of twins. Identical twins, also called monozygotic (MZ) twins, have the same genome, shared the same fetal environment and, generally, share the same living conditions. Therefore, they are most likely to also share the same probability of becoming ill, whatever the disease. Identical twins are, therefore, like a player betting twice. This is closely related to the gambling question addressed above for $N = 2$ (two draws). Indeed, as both twins have the same probability $p$ of having disease $D$, the status (healthy or ill) of each of the two twins is equivalent, respectively, to the outcome (loss or gain) of each of the two draws by one and the same gambler. In this situation, the probability $p$ is called a risk. Let $f(p)$ be the probability distribution function of the risk of having disease $D$ in the population. We define the random variable $S$ as above, i.e., $S = 0$ if both twins are healthy, $S = 1$ if only one of the two twins is ill and $S = 2$ if both twins are ill. The mean and variance of $S$ are given by Equations (2) and (3), respectively; hence, for $N = 2$,
$$E_2(S) = 2\Phi$$
$$\mathrm{Var}_2(S) = 2\Phi(1-\Phi) + 2\Sigma^2$$
Then, if $V_n$ is significantly greater than $\bar{S}\,(1 - \bar{S}/2)$ (i.e., than $2\hat{\Phi}(1-\hat{\Phi})$, where $\hat{\Phi} = \bar{S}/2$ estimates the prevalence), which amounts to carrying out the hypothesis test presented in the above section, we can conclude that there is some dispersion of the disease risk. As we will see below, the dispersion is in fact unusually large. However, before that, let us calculate the twin concordance rate of the disease $D$. In genetics, the twin concordance rate is the probability $\tau$ that a twin is affected given that his/her co-twin is affected:
$$\tau = P(X_2 = 1 \mid X_1 = 1) = \frac{P(X_1 = 1, X_2 = 1)}{P(X_1 = 1)} = \frac{P_2(S = 2)}{P(X_1 = 1, X_2 = 1) + P(X_1 = 1, X_2 = 0)}$$
hence
$$\tau = \frac{P_2(S = 2)}{P_2(S = 2) + \tfrac{1}{2} P_2(S = 1)} = \frac{2 P_2(S = 2)}{2 P_2(S = 2) + P_2(S = 1)}$$
Note that τ is equal to the probandwise concordance rate, which is known to best assess the twin concordance rate [11].
Using Equations (4) and (6), we can also reformulate the concordance rate of twins in terms of the moments of the distribution $f(p)$:
$$\tau = \frac{P(X_1 = 1, X_2 = 1)}{P(X_1 = 1)} = \frac{P_2(S = 2)}{P_1(S = 1)} = \frac{E[p^2]}{E[p]} \qquad (10)$$
Note that we can generalize the concordance rate to an $N$-tuple:
$$\tau_N = \frac{P_N(S = N)}{P_1(S = 1)} = \frac{E[p^N]}{E[p]}$$
Using Equations (6) and (10), we obtain
$$\tau = \frac{\Phi^2 + \Sigma^2}{\Phi} \qquad (11)$$
Thus, the relative risk $RR = \tau / \Phi$ is equal to
$$RR = \frac{E[p^2]}{E[p]^2} = \frac{\Phi^2 + \Sigma^2}{\Phi^2} = 1 + \frac{\Sigma^2}{\Phi^2} \qquad (12)$$
The twin concordance rate can also be computed using the probability density function $f_a(p)$ restricted to the population of affected people. Let $f(X, p)$ be the joint probability for an individual to have a risk $p \in [0, 1]$ and to be in the state $X \in \{0, 1\}$. According to Bayes’ theorem, also known as the theorem of the probability of causes since it was independently rediscovered by Laplace [12], we write
$$f(X, p) = f(p \mid X)\, P(X) = f(X \mid p)\, f(p)$$
hence
$$f(p \mid X) = \frac{f(X \mid p)\, f(p)}{P(X)}$$
Then, $f(p \mid X = 1)$ is the distribution of the risk $p$ in the population of affected people:
$$f(p \mid X = 1) = f_a(p)$$
Now, by definition, we have
$$f(X = 1 \mid p) = p$$
and by noting that $P(X = 1) = P_1(S = 1)$, we also have
$$P(X = 1) = E\big[f(X = 1 \mid p)\big] = E[p]$$
This leads to the following expression for the risk distribution function among affected people:
$$f_a(p) = \frac{p\, f(p)}{E[p]}$$
Note that $f_a(p)$ is the so-called “size-biased law” of the risk $p$ of becoming ill. Size-biased laws are found in many contexts, notably rare events [13], Poisson point processes [14] or familial risk of disease [15].
The mean risk in the affected population is then
$$E_a[p] = \int_0^1 p\, f_a(p)\, dp = \frac{\int_0^1 p^2 f(p)\, dp}{E[p]} = \frac{E[p^2]}{E[p]}$$
where $E_a[\,\cdot\,]$ is the expected value among affected people, i.e., with respect to the probability distribution $f_a(p)$. Using Equation (11), we obtain
$$E_a[p] = \tau$$
which proves that the mean risk in the affected population is equal to the twin concordance rate.
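These identities are easy to check numerically; the following toy sketch (not from the paper; the risk levels and weights are made up for illustration) verifies that $f_a(p) = p f(p)/E[p]$ is a proper distribution and that its mean equals the concordance rate $\tau = E[p^2]/E[p]$.
```python
# Toy check of the size-biased law and of E_a[p] = tau for a discrete risk
# distribution f(p). The three risk levels and their weights are hypothetical.
import numpy as np

p = np.array([0.001, 0.02, 0.9])    # risk levels
f = np.array([0.90, 0.09, 0.01])    # f(p): weights of each level (sum to 1)

phi = np.sum(f * p)                  # prevalence E[p]
tau = np.sum(f * p**2) / phi         # twin concordance rate E[p^2]/E[p]

f_a = p * f / phi                    # size-biased law among affected people
assert np.isclose(f_a.sum(), 1.0)            # f_a is normalized
assert np.isclose(np.sum(f_a * p), tau)      # mean risk among affected equals tau
print(phi, tau, tau / phi)                   # prevalence, concordance rate, relative risk
```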
We now proceed to evaluate the functional form of the distribution $f(p)$. Using the prevalence and the twin concordance rate of the disease $D$, we have access to, and only to, the mean $\Phi$ and the dispersion $\Sigma^2$ of $f(p)$. The principle of maximum entropy then provides us with the least arbitrary distribution [16,17]. Dowson and Wragg proved [18] that in the class $\mathcal{P}$ of absolutely continuous probability distributions on $[0, 1]$ with given first and second moments (i.e., given mean and variance), there exists a distribution in $\mathcal{P}$ which maximizes the entropy
$$H(f) = -\int_0^1 f(p) \ln f(p)\, dp \qquad (13)$$
and that the corresponding density function $f(p)$ on $[0, 1]$ is a truncated normal distribution $f(p; m, s, 0, 1)$, which may be either bell-shaped or U-type. Dowson and Wragg show that when $\Phi \ll 1$ and $\Sigma > \Phi$, which is usual for most if not all chronic diseases (unpublished results), the distribution $f(p; m, s, 0, 1)$ is U-type (see Appendix B). This distribution, which will simply be denoted $f(p; m, s)$ in the following, can then be written
$$f(p; m, s) = \frac{1}{s Z}\sqrt{\frac{2}{\pi}}\; \exp\!\left(\frac{(p-m)^2}{2 s^2}\right)$$
(note that, in the U-type case, the exponent is positive: the density is the restriction to $[0, 1]$ of a normal density with imaginary standard deviation, which is why the imaginary error function appears in the normalization) with
$$Z = \operatorname{erfi}\!\left(\frac{m}{s\sqrt{2}}\right) + \operatorname{erfi}\!\left(\frac{1-m}{s\sqrt{2}}\right)$$
The imaginary error function $\operatorname{erfi}(x)$ can be expressed using the Dawson function $D(x)$:
$$\operatorname{erfi}(x) = \frac{2}{\sqrt{\pi}}\, e^{x^2} D(x)$$
Therefore, $f(p; m, s)$ can finally be written
$$f(p; m, s) = \frac{1}{\sqrt{2}\, s}\; \frac{\exp\!\left(\dfrac{p^2 - 2 m p}{2 s^2}\right)}{D\!\left(\dfrac{m}{s\sqrt{2}}\right) + e^{\frac{1-2m}{2 s^2}}\, D\!\left(\dfrac{1-m}{s\sqrt{2}}\right)}$$
It is straightforward to express $\Phi$ and $\Sigma^2$ in terms of the parameters $m$ and $s$:
$$\Phi = m - \frac{s}{\sqrt{2}}\; \frac{1 - e^{\frac{1-2m}{2 s^2}}}{D\!\left(\dfrac{m}{s\sqrt{2}}\right) + e^{\frac{1-2m}{2 s^2}}\, D\!\left(\dfrac{1-m}{s\sqrt{2}}\right)} \qquad (14)$$
$$\Sigma^2 = s^2\left[-1 + \frac{1}{s\sqrt{2}}\; \frac{m + (1-m)\, e^{\frac{1-2m}{2 s^2}}}{D\!\left(\dfrac{m}{s\sqrt{2}}\right) + e^{\frac{1-2m}{2 s^2}}\, D\!\left(\dfrac{1-m}{s\sqrt{2}}\right)} - \frac{1}{2}\left(\frac{1 - e^{\frac{1-2m}{2 s^2}}}{D\!\left(\dfrac{m}{s\sqrt{2}}\right) + e^{\frac{1-2m}{2 s^2}}\, D\!\left(\dfrac{1-m}{s\sqrt{2}}\right)}\right)^{\!2}\right] \qquad (15)$$
Inverting this system of equations to obtain the risk distribution function of the disease D in terms of Φ and Σ 2 is a bit trickier and requires a numerical solver. In the next section, we show the outcome of this general formalism for one specific chronic disease, namely Crohn’s disease.
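As an indication of how such a solver might look, here is a sketch (not the authors' code; it assumes SciPy, and the starting guess is taken from the values reported in the next section) that implements Equations (14) and (15) with the Dawson function and inverts them numerically.
```python
# Sketch of a numerical solver for Equations (14)-(15): given a prevalence Phi
# and a dispersion Sigma^2, find the parameters (m, s) of the U-type truncated
# normal f(p; m, s). Uses scipy.special.dawsn for the Dawson function D(x).
import numpy as np
from scipy.special import dawsn
from scipy.optimize import fsolve

def moments(m, s):
    """Mean Phi and variance Sigma^2 of f(p; m, s), Equations (14) and (15)."""
    e = np.exp((1 - 2 * m) / (2 * s**2))
    den = dawsn(m / (s * np.sqrt(2))) + e * dawsn((1 - m) / (s * np.sqrt(2)))
    phi = m - s / np.sqrt(2) * (1 - e) / den
    sigma2 = s**2 * (-1 + (m + (1 - m) * e) / (s * np.sqrt(2) * den)
                     - 0.5 * ((1 - e) / den) ** 2)
    return phi, sigma2

def solve_ms(phi_target, sigma2_target, guess=(0.505, 0.028)):
    """Invert Equations (14)-(15) for (m, s); the guess is assumed to be close."""
    def residual(x):
        phi, sigma2 = moments(*x)
        return [phi - phi_target, sigma2 - sigma2_target]
    return fsolve(residual, guess)

# Note: Phi is a small difference between two larger terms, so (m, s) must be
# resolved to several digits for the target prevalence to be reproduced.
m, s = solve_ms(0.0025, 0.00096)      # Crohn's disease values of Section 5
print(m, s, moments(m, s))            # (m, s) close to (0.505, 0.0278)
```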

5. Application to Crohn’s Disease (CD)

Crohn’s disease (CD) is one of the most well-documented chronic diseases, particularly in the field of genetics [19]. Its prevalence $\Phi$ and twin concordance rate $\tau$ are [20]:
$$\Phi \approx 0.0025, \qquad \tau \approx 0.385$$
Then, the twin relative risk is
$$RR \approx 154$$
hence
$$\Sigma^2 = \Phi^2 (RR - 1) \approx 0.00096, \qquad \Sigma \approx 0.031$$
which means that
$$\frac{\Sigma}{\Phi} \approx 12 \qquad (16)$$
The dispersion of the risk of being affected is, therefore, huge for CD.
It is now necessary to calculate the p-value according to Equation (9) in order to be able to reject (or not) our null hypothesis $H_0$. To do this, we first need to estimate the number of twin pairs $n$, which remains unknown in the Swedish study [20]. Nevertheless, the number of twin pairs with at least one affected twin is known and equal to $n_1 + n_2 = 31.5$, where $n_1 = 24$ and $n_2 = 7.5$ are the numbers of discordant and concordant twin pairs, respectively [20]. We can reconstruct the sample size $n$ that would have been needed to obtain $n_1$ and $n_2$, with probabilities $P_2(S = 1)$ and $P_2(S = 2)$:
$$P_2(S = 1) + P_2(S = 2) = \frac{n_1 + n_2}{n}$$
By using Equations (5) and (6), we obtain the following sample size:
$$n = \frac{n_1 + n_2}{1 - (1-\Phi)^2 - \Sigma^2} \approx 7809$$
The p-value of Equation (9) is then computed with $z$ evaluated in the limit of large sample sizes $n \gg 1$. This results in $z \approx 2.4$, which allows us to reject the null hypothesis $H_0$ with a p-value of approximately $8 \times 10^{-3}$.
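The arithmetic of this section can be reproduced in a few lines; the sketch below (not the authors' code; it only uses the published prevalence, concordance rate and pair counts) rebuilds the sample, the variance estimator and the p-value.
```python
# Worked example for Crohn's disease: prevalence Phi, concordance tau and the
# discordant/concordant pair counts n1, n2 from the Swedish twin study [20].
import math

phi, tau = 0.0025, 0.385
n1, n2 = 24.0, 7.5                       # discordant / concordant twin pairs

rr = tau / phi                           # twin relative risk, Equation (12)
sigma2 = phi**2 * (rr - 1.0)             # dispersion of the risk
n = (n1 + n2) / (1.0 - (1.0 - phi)**2 - sigma2)   # reconstructed number of pairs

# Variance estimator V_n from the reconstructed sample (S = 0, 1 or 2 per pair).
s_bar = (n1 + 2.0 * n2) / n
v_n = ((n - n1 - n2) * s_bar**2 + n1 * (1.0 - s_bar)**2 + n2 * (2.0 - s_bar)**2) / (n - 1.0)

psi = phi * (1.0 - phi)
z = math.sqrt(n) * (v_n - 2.0 * psi) / math.sqrt(2.0 * psi * (1.0 - 2.0 * psi))
p_value = 0.5 * math.erfc(z / math.sqrt(2.0))     # Equation (9)
print(rr, sigma2, round(n), z, p_value)           # ~154, ~0.00096, ~7.8e3, ~2.4, ~8e-3
```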
It is then legitimate to calculate the parameters $m$ and $s$ of the truncated normal distribution $f(p; m, s, 0, 1)$ which maximizes the entropy $H(f)$ given the mean $\Phi$ and the dispersion $\Sigma^2$. Solving the system of Equations (14) and (15) for $\Phi = 0.0025$ and $\Sigma = 0.031$ gives
$$m \approx 0.505, \qquad s \approx 0.0278$$
Both probability distribution functions $f(p; m, s)$ and $f_a(p; m, s) = p\, f(p; m, s)/\Phi$ for CD are plotted in Figure 4a and zoomed in on in Figure 4b. Quite remarkably, the probability density function $f_a(p; m, s)$ in the population of affected people has two narrow peaks, one close to $p = 0$ and the other one close to $p = 1$. This means that there are two quite separate categories of people who become ill: in the left peak (close to $p = 0$), people are at very low risk but still have been unlucky enough to become ill, whereas in the right peak (close to $p = 1$), people are at extremely high risk, hence are unlucky a priori, and indeed were bound to become ill. Not having any luck (becoming ill because of high risk) or having been unlucky (becoming ill despite low risk), that is the question!
Finally, we note that concordant twins are very likely to be in the right peak, whereas discordant twins are in the left one. Indeed, when two MZ twins have their common risk $p$ in the left peak, their probability of being concordant is extremely low, of the order of the mean of $p^2$ restricted to the left peak of $f_a(p)$, which is of the order of $10^{-5}$. On the contrary, when two MZ twins have their common risk $p$ in the right peak, their probability of being concordant is extremely high, of the order of $0.997$. Interestingly enough, the fraction of people in the right peak (area under the curve) is $38.52\%$, quite similar to the (probandwise) twin concordance rate of $38.65\%$ [20]. This strongly suggests that concordant twins for a given disease both have a strong predisposition for this disease, whereas discordant twins both have no particular predisposition.

6. Conclusions

Assessing the inequality of chances in a given population is a critical problem with considerable stakes, notably in health and social opportunity. Starting with the simple heads or tails game, we have shown that, although hidden variables such as the ex ante chances of gamblers (possibly cheating) cannot be assessed individually, their distribution can actually be assessed whenever multiple draws are available. For this purpose, we have proposed a hypothesis test to evidence the inequality of chances in a given population, and we have then inferred the functional form of the probability distribution function of the ex ante chances by means of the principle of maximum entropy, which gives the least arbitrary distribution given the mean and variance of the probability distribution function.
We applied this methodology to chronic diseases and found that the distribution of the risk of becoming ill is usually a U-type truncated normal distribution. We have computed the parameters of this U-type distribution in the case of Crohn’s disease using the prevalence and the twin concordance rate of this pathology. Moreover, we have found that the risk distribution function among affected people is bimodal with two narrow peaks, one corresponding to people with no liable risk factor and the other one to people genetically or environmentally destined to become ill. An interesting consequence is that concordant twins for a given disease both have a strong predisposition for that disease, while discordant twins both have no particular predisposition.
One should still not over-interpret the results, as they still rely only on estimates of the prevalence and the twin concordance rate of the disease. It can be thought of as the best possible interpretation in terms of distribution, based on the available information. Nevertheless, maximizing the entropy of the risk distribution function leads to significantly different conclusions than more arbitrary distributions such as, for example, beta-distributions [21].
Twins provide a unique means to play twice at the lottery of diseases. Of course, twins are all the more relevant to assess ex ante chances as they share the same environmental factors. In the same vein, “social twins” or more generally “social clones” would be of great help in assessing inequality of opportunities. However, controlling the environment of such social clones would be rather challenging as the issue of choice comes into play, which may change people’s lives with the same opportunities. Assessing the inequality of opportunities is, therefore, one of the most delicate, almost completely open, issues.
Pascal could never complete his treatise “Geometry of Chance”. This never-ending treatise is still being written, as evidenced in this Special Issue.

Author Contributions

Conceptualization, J.-M.V.; methodology, A.L., J.-M.V.; writing—original draft preparation, A.L., J.-M.V.; writing—review and editing, A.L., J.-M.V. Both authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

This work has benefited from fruitful discussions with Anne Dumay, Jean-Pierre Hugot and Alberto Romagnoni. We thank Bastien Mallein and Laurent Tournier for their careful reading of the manuscript and helpful comments.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Computing the Variance of $V_n$

To estimate the variance of S from a sample of n individuals, the unbiased variance estimator is used:
$$V_n = \frac{1}{n-1} \sum_{i=1}^{n} \left(S_i - \bar{S}\right)^2$$
where $\bar{S}$ is the mean estimator
$$\bar{S} = \frac{1}{n} \sum_{i=1}^{n} S_i$$
We first recall the following properties of $\bar{S}$:
$$E(\bar{S}) = E(S), \qquad \mathrm{Var}(\bar{S}) = \frac{\mathrm{Var}(S)}{n}$$
Setting $E(\bar{S}) = E(S) = m$, we can write
$$V_n = \frac{1}{n-1} \sum_{i=1}^{n} (S_i - m)^2 - \frac{n}{n-1} (\bar{S} - m)^2$$
hence
$$\mathrm{Var}(V_n) = \frac{n}{(n-1)^2} \mathrm{Var}\big[(S - m)^2\big] + \frac{n^2}{(n-1)^2} \mathrm{Var}\big[(\bar{S} - m)^2\big] - \frac{2n}{(n-1)^2} \mathrm{Cov}\!\left(\sum_{i=1}^{n} (S_i - m)^2,\; (\bar{S} - m)^2\right)$$
To simplify the calculation, we consider sufficiently large samples (typically $n > 30$), so that the distribution of $\bar{S}$ tends towards the normal distribution $\mathcal{N}\big(E(\bar{S}), \mathrm{Var}(\bar{S})\big)$ of mean $E(\bar{S}) = m$ and variance $\mathrm{Var}(\bar{S}) = \mathrm{Var}(S)/n$, according to the central limit theorem. The variable $\bar{S}$ is then treated as independent of the $S_i$, which has, as an immediate effect, a null covariance: $\mathrm{Cov}\big(\sum_{i=1}^{n} (S_i - m)^2, (\bar{S} - m)^2\big) = 0$. Then, $\mathrm{Var}[(S - m)^2]$ and $\mathrm{Var}[(\bar{S} - m)^2]$ remain to be determined. Let us start with the latter, which is simpler.
$$\mathrm{Var}\big[(\bar{S} - m)^2\big] = E\big[(\bar{S} - m)^4\big] - E\big[(\bar{S} - m)^2\big]^2$$
with
$$E\big[(\bar{S} - m)^2\big] = \mathrm{Var}(\bar{S}) = \frac{\mathrm{Var}(S)}{n}$$
and
$$E\big[(\bar{S} - m)^4\big] = E(\bar{S}^4) - 4\, E(\bar{S}^3)\, m + 6\, E(\bar{S}^2)\, m^2 - 3 m^4$$
Since $\bar{S}$ is a Gaussian variable, its moments read
$$E(\bar{S}^2) = m^2 + \mathrm{Var}(\bar{S})$$
$$E(\bar{S}^3) = m\big(m^2 + 3\,\mathrm{Var}(\bar{S})\big)$$
$$E(\bar{S}^4) = m^4 + 6 m^2\, \mathrm{Var}(\bar{S}) + 3\, \mathrm{Var}(\bar{S})^2$$
All the terms in $m$ cancel each other out, hence
$$\mathrm{Var}\big[(\bar{S} - m)^2\big] = 2\, \mathrm{Var}(\bar{S})^2 = 2 \left(\frac{\mathrm{Var}(S)}{n}\right)^2$$
Now all that remains is to determine $\mathrm{Var}[(S - m)^2]$. This term requires expressing the moments of $S$ as functions of the moments (up to fourth order) of the distribution $f$. First, let us write out the variance:
$$\mathrm{Var}\big[(S - m)^2\big] = E\big[(S - m)^4\big] - E\big[(S - m)^2\big]^2$$
with
$$E\big[(S - m)^2\big] = E(S^2) - m^2$$
and
$$E\big[(S - m)^4\big] = E(S^4) - 4\, E(S^3)\, m + 6\, E(S^2)\, m^2 - 3 m^4$$
Then, we calculate the $\ell$-th moments of $S$ (for $\ell = 2, 3, 4$):
$$E(S^\ell) = E_p\big[E_S(S^\ell \mid p)\big] = E_p\Big[\sum_{k=0}^{N} k^\ell\, C_N^k\, p^k (1-p)^{N-k}\Big]$$
$$E(S^2) = N E[p] + N(N-1) E[p^2]$$
$$E(S^3) = N E[p] + 3 N(N-1) E[p^2] + N(N-1)(N-2) E[p^3]$$
$$E(S^4) = N E[p] + 7 N(N-1) E[p^2] + 6 N(N-1)(N-2) E[p^3] + N(N-1)(N-2)(N-3) E[p^4]$$
We also have the variance of $S$ expressed with the moments of $p$:
$$\mathrm{Var}(S) = N E[p]\big(1 - N E[p]\big) + N(N-1) E[p^2]$$
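These binomial moment identities are easy to verify numerically; a quick sketch (not from the paper; the values of $N$ and $p$ are arbitrary) is given below.
```python
# Numerical check of the conditional moments E(S^l | p) of a binomial B(N, p)
# quoted above, for arbitrary N and p (here N = 5, p = 0.3).
import numpy as np
from scipy.stats import binom

N, p = 5, 0.3
k = np.arange(N + 1)
w = binom.pmf(k, N, p)                      # binomial weights

formulas = {
    2: N*p + N*(N-1)*p**2,
    3: N*p + 3*N*(N-1)*p**2 + N*(N-1)*(N-2)*p**3,
    4: N*p + 7*N*(N-1)*p**2 + 6*N*(N-1)*(N-2)*p**3 + N*(N-1)*(N-2)*(N-3)*p**4,
}
for l, value in formulas.items():
    assert np.isclose(np.sum(w * k**l), value)

# The variance formula follows with E[p] = p and E[p^2] = p^2 for a fixed p:
assert np.isclose(np.sum(w * k**2) - (N*p)**2, N*p*(1 - N*p) + N*(N-1)*p**2)
```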
In general, we need to know the higher order moments of the distribution f if we want to go further. However, we are only interested here in the case N = 2 , where some welcome simplifications arise. It turns out the higher order moments of the distribution f do not contribute to the moments of S .
$$E(S^2) = 2 E[p] + 2 E[p^2], \qquad E(S^3) = 2 E[p] + 6 E[p^2], \qquad E(S^4) = 2 E[p] + 14 E[p^2]$$
hence
$$\mathrm{Var}\big[(\bar{S} - m)^2\big] = \frac{8}{n^2}\Big(E[p]\big(1 - 2 E[p]\big) + E[p^2]\Big)^2$$
and
$$\mathrm{Var}\big[(S - m)^2\big] = 2 E[p]\big(1 - 2 E[p]\big)\big(1 - 4 E[p]\big)^2 + 2\big(7 - 28 E[p] + 32 E[p]^2\big) E[p^2] - 4 E[p^2]^2$$
This is further simplified by using $E[p] = \Phi$ and $E[p^2] = \Phi^2 + \Sigma^2$:
$$\mathrm{Var}\big[(\bar{S} - m)^2\big] = \frac{8}{n^2}\big(\Phi(1-\Phi) + \Sigma^2\big)^2, \qquad \mathrm{Var}\big[(S - m)^2\big] = 2\Phi(1-\Phi)\big(1 - 2\Phi + 2\Phi^2\big) + 14\,(2\Phi - 1)^2\, \Sigma^2 - 4\Sigma^4$$
Then, we obtain the following expression:
$$\mathrm{Var}(V_n) = \frac{2n}{(n-1)^2}\Big[\Phi(1-\Phi)\big(1 - 2\Phi + 2\Phi^2\big) + 7\,(2\Phi - 1)^2\, \Sigma^2 - 2\Sigma^4\Big] + \frac{8}{(n-1)^2}\big(\Phi(1-\Phi) + \Sigma^2\big)^2$$
Finally, we can simplify further by setting $\Psi = \Phi(1-\Phi)$:
$$\mathrm{Var}(V_n) = \frac{2n}{(n-1)^2}\Big[\Psi(1-2\Psi) + 7(1-4\Psi)\,\Sigma^2 - 2\Sigma^4\Big] + \frac{8}{(n-1)^2}\big(\Psi + \Sigma^2\big)^2$$

Appendix B. The Truncated Normal Distribution $f(p; m, s, 0, 1)$ Is U-Type When $\Phi \ll 1$ and $\Sigma > \Phi$

The prevalence $\Phi$ of chronic diseases is most generally of the order of $10^{-3}$, and the relative risk $RR$ of MZ twins is then of the order of $100$. Thus, according to Equation (12), $\Sigma/\Phi$ is of the order of $10$. As an example, $\Sigma/\Phi \approx 12$ for Crohn’s disease (see Equation (16)). Therefore, $\Phi \ll 1$ and $\Sigma > \Phi$ is the rule for chronic diseases.
Dowson and Wragg [18] show that the truncated normal distribution $f(p)$ that maximizes the entropy $H(f)$ (see Equation (13)) with given mean $\mu_1 = \Phi$ and second moment $\mu_2 = \Phi^2 + \Sigma^2$ is U-type when $\mu_1$ and $\mu_2$ lie above the arc $\widehat{OMA}$ (see Figure 1 and the text below it in [18]). This dividing curve separates U-type from bell-shaped distributions. On this curve, the distribution $f(p)$ that maximizes the entropy $H(f)$ is no longer a truncated normal distribution but becomes a truncated exponential distribution (the arc $\widehat{OMA}$ is the set of points $(\mu_1, \mu_2)$ whose coordinates are the first two moments of truncated exponential distributions on $[0, 1]$). A truncated exponential distribution on $[0, 1]$ can be written
$$f_{\mathrm{exp}}(p) = \frac{\lambda}{1 - e^{-\lambda}}\, e^{-\lambda p}$$
with $\lambda \in (-\infty, +\infty)$. On the dividing curve $\widehat{OMA}$, the first and second moments of $f_{\mathrm{exp}}(p)$ are given by
$$m_1 = \frac{1}{\lambda} - \frac{1}{e^{\lambda} - 1} \qquad (A1)$$
$$m_2 = \frac{2}{\lambda^2} - \frac{1 + 2/\lambda}{e^{\lambda} - 1} \qquad (A2)$$
It is easily seen that $0 < m_1 < 1/2$ when $\lambda \in (0, +\infty)$ and $1/2 < m_1 < 1$ when $\lambda \in (-\infty, 0)$. The limiting case $\lambda \to 0$ corresponds to $m_1 = 1/2$.
The truncated normal distribution $f(p)$ that maximizes the entropy $H(f)$ with given mean $\mu_1 = \Phi$ and second moment $\mu_2 = \Phi^2 + \Sigma^2$ is U-type when $\mu_1$ and $\mu_2$ lie above the arc $\widehat{OMA}$, i.e., when $\mu_2 > m_2$ for $\mu_1 = m_1$. Now, when $m_1 = \Phi \ll 1$, Equation (A1) gives $\lambda \gg 1$, so that $\lambda \sim 1/m_1$. Then, Equation (A2) gives $m_2 \sim 2/\lambda^2$, hence $m_2 \sim 2 m_1^2$, i.e., $m_2 \sim 2\Phi^2$. Therefore, $f(p)$ is U-type if $\Phi^2 + \Sigma^2 > 2\Phi^2$, i.e., $\Sigma > \Phi$.

References

  1. Debnath, L.; Basu, K. A short history of probability theory and its applications. Int. J. Math. Educ. Sci. Technol. 2014, 46, 13–39. [Google Scholar] [CrossRef]
  2. Godfroy-Génin, A.-S. Pascal: The geometry of chance. Math. Sci. Hum. 2000, 150, 7–39. [Google Scholar]
  3. Maxwell, J.C. Illustrations of the Dynamical Theory of Gases. Part I. On the motions and collisions of perfectly elastic spheres. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1860, 19, 19–32. [Google Scholar] [CrossRef]
  4. Stern, O. Eine direkte Messung der thermischen Molekulargeschwindigkeit. Z. Phys. 1920, 2, 49. [Google Scholar] [CrossRef] [Green Version]
  5. Gerlach, W.; Stern, O. Das magnetische Moment des Silberatoms. Z. Phys. 1922, 9, 353–355. [Google Scholar] [CrossRef]
  6. Bruza, P.D.; Wang, Z.; Busemeyer, J.R. Quantum cognition: A new theoretical approach to psychology. Trends Cogn. Sci. 2015, 19, 383–393. [Google Scholar] [CrossRef] [Green Version]
  7. Corbett, S.; Courtiol, A.; Lummaa, V.; Moorad, J.; Stearns, S. The transition to modernity and chronic disease: Mismatch and natural selection. Nat. Rev. Genet. 2018, 19, 419–430. [Google Scholar] [CrossRef]
  8. Arcaya, M.C.; Arcaya, A.L.; Subramanian, S.V. Inequalities in health: Definitions, concepts, and theories. Glob. Health Action 2015, 8, 27106. [Google Scholar] [CrossRef]
  9. Gomes, M.G.M.; Oliveira, J.F.; Bertolde, A.; Ayabina, D.; Nguyen, T.A.; Maciel, E.L.; Duarte, R.; Nguyen, B.H.; Shete, P.B.; Lienhardt, C. Introducing risk inequality metrics in tuberculosis policy development. Nat. Commun. 2019, 10, 2480. [Google Scholar] [CrossRef] [Green Version]
  10. Tam, V.; Patel, N.; Turcotte, M.; Bossé, Y.; Paré, G.; Meyre, D. Benefits and limitations of genome-wide association studies. Nat. Rev. Genet. 2019, 20, 467–484. [Google Scholar] [CrossRef]
  11. McGue, M. When assessing twin concordance, use the probandwise not the pairwise rate. Schizophr. Bull. 1992, 18, 171–176. [Google Scholar] [CrossRef]
  12. Stigler, S.M. Memoir on the Probability of the Causes of Events, Pierre Simon Laplace. Statist. Sci. 1986, 1, 364–378. [Google Scholar] [CrossRef]
  13. Patil, G.P.; Rao, C.R. Weighted distributions and size-based sampling with applications to wildlife populations and human families. Biometrics 1978, 34, 179–189. [Google Scholar] [CrossRef] [Green Version]
  14. Perman, M.; Pitman, J.; Yor, M. Size-biased sampling of Poisson point processes and excursions. Probab. Theory Relat. Fields 1992, 92, 21–39. [Google Scholar]
  15. Davidov, O.; Zelen, M. Referent sampling, family history and relative risk: The role of length-biased sampling. Biostatistics 2001, 2, 173–181. [Google Scholar] [CrossRef]
  16. Pressé, S.; Ghosh, K.; Lee, J.; Dill, K.A. Principles of maximum entropy and maximum caliber in statistical physics. Rev. Mod. Phys. 2013, 85, 1115–1141. [Google Scholar] [CrossRef] [Green Version]
  17. Cofré, R.; Herzog, R.; Corcoran, D.; Rosas, F.E. A Comparison of the Maximum Entropy Principle Across Biological Spatial Scales. Entropy 2019, 21, 1009. [Google Scholar] [CrossRef] [Green Version]
  18. Dowson, D.C.; Wragg, A. Maximum-Entropy Distributions Having Prescribed First and Second Moments. IEEE Trans. Inf. Theory 1973, 19, 689–693. [Google Scholar] [CrossRef]
  19. Verstockt, B.; Smith, K.G.C.; Lee, J.C. Genome-wide association studies in Crohn’s disease: Past, present and future. Clin. Transl. Immunol. 2018, 7, e1001. [Google Scholar] [CrossRef]
  20. Halfvarson, J. Genetics in twins with Crohnʼs disease: Less pronounced than previously believed? Inflamm. Bowel Dis. 2011, 17, 6–12. [Google Scholar] [CrossRef]
  21. Stensrud, M.J.; Valberg, M. Inequality in genetic cancer risk suggests bad genes rather than bad luck. Nat. Commun. 2017, 8, 1165. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Figure 1. Simulated histograms of gains $S$ for two distributions $f_0$ and $f_1$: on the left-hand side, $f_0(p) = \delta(p - 1/2)$, and on the right-hand side, $f_1(p) = \frac{1}{2}[\delta(p) + \delta(1-p)]$. In each case, the histogram of success is plotted for increasing values of the number $N$ of draws ($N = 1, 2, 3, 4$) and for two numbers $n$ of gamblers: $n = 10$ (in blue) and $n = 100$ (in orange). For $N = 1$, note that the histogram for $f_0$ is similar to the histogram for $f_1$ and both histograms converge to the same limit as $n$ goes to infinity. On the contrary, for each $N \geq 2$, the histograms for $f_0$ and $f_1$ diverge as $n$ increases.
Figure 2. Linear regression fits $\mathrm{Var}_N(S)$ for $f_0$ (blue dotted line), with $a_0 = 0.251 \pm 0.005$, in agreement with Equation (3) when $\Sigma^2 = 0$. Moreover, $a_0$ agrees with the expected value $\Phi(1-\Phi) = 1/4$. The quadratic fit (blue dashed line) yields an equivalent result. At odds with $f_0$, the linear regression does not fit $\mathrm{Var}_N(S)$ for $f_1$ (orange dotted line), whereas the quadratic fit (orange dashed line) is excellent, with $a_1 = 0.01 \pm 0.01$ and $b_1 = 0.244 \pm 0.006$. Here, $b_1$ agrees with the expected value $\Sigma^2 = 1/4$ and $a_1$ with the expected value $\Phi(1-\Phi) - \Sigma^2 = 0$.
Figure 3. Evolution of the variance of $V_n$ for $N = 2$ as a function of the number $n$ of players. The blue dots are simulated with $\Phi = 0.5$ and $\Sigma^2 = 0.15$. The black dashed line corresponds to the variance of $V_n$ according to Equation (7). The grey dashed line corresponds to the leading-order term in $1/n$ of the expected variance in Equation (8).
Figure 4. (a) CD risk distribution function $f(p; m, s)$ among the population (in blue), narrowly peaked at $p = 0$. The risk distribution function $f_a(p) = p\, f(p; m, s)/\Phi$ among affected people (in orange) has two narrow peaks. (b) Zoom in the vicinity of both peaks, $p = 0$ and $p = 1$. Concordant twins (almost) all belong to the right peak (at $p = 1$), whereas discordant twins (almost) all belong to the left peak (at $p = 0$).

