Article

A Central Limit Theorem for Predictive Distributions

Patrizia Berti 1, Luca Pratelli 2 and Pietro Rigo 3,*
1 Dipartimento di Scienze Fisiche, Informatiche e Matematiche, Università di Modena e Reggio-Emilia, Via Campi 213/B, 41100 Modena, Italy
2 Accademia Navale di Livorno, 57127 Livorno, Italy
3 Dipartimento di Scienze Statistiche “P. Fortunati”, Università di Bologna, Via delle Belle Arti 41, 40126 Bologna, Italy
* Author to whom correspondence should be addressed.
Submission received: 30 October 2021 / Revised: 29 November 2021 / Accepted: 8 December 2021 / Published: 12 December 2021

Abstract: Let $S$ be a Borel subset of a Polish space and $F$ the set of bounded Borel functions $f : S \to \mathbb{R}$. Let $a_n(\cdot) = P(X_{n+1} \in \cdot \mid X_1, \ldots, X_n)$ be the $n$-th predictive distribution corresponding to a sequence $(X_n)$ of $S$-valued random variables. If $(X_n)$ is conditionally identically distributed, there is a random probability measure $\mu$ on $S$ such that $\int f \, da_n \overset{a.s.}{\longrightarrow} \int f \, d\mu$ for all $f \in F$. Define $D_n(f) = d_n \bigl( \int f \, da_n - \int f \, d\mu \bigr)$ for all $f \in F$, where $d_n > 0$ is a constant. In this note, it is shown that, under some conditions on $(X_n)$ and with a suitable choice of $d_n$, the finite-dimensional distributions of the process $D_n = \{ D_n(f) : f \in F \}$ stably converge to a Gaussian kernel with a known covariance structure. In addition, $E\bigl[\varphi(D_n(f)) \mid X_1, \ldots, X_n\bigr]$ converges in probability for all $f \in F$ and $\varphi \in C_b(\mathbb{R})$.

1. Introduction

All random elements appearing in the sequel are defined on a common probability space, say $(\Omega, \mathcal{A}, P)$. We denote by $S$ a Borel subset of a Polish space and by $\mathcal{B}$ the Borel $\sigma$-field on $S$. We let
$$\mathcal{P} = \{ \text{probability measures on } \mathcal{B} \} \quad \text{and} \quad F = \{ \text{real bounded Borel functions on } S \}.$$
Moreover, if $\lambda \in \mathcal{P}$ and $f \in F$, we write $\lambda(f)$ to denote
$$\lambda(f) = \int f \, d\lambda.$$
In other terms, depending on the context, $\lambda$ is regarded as a function on $\mathcal{B}$ or a function on $F$. This slight abuse of notation is quite usual (see, e.g., [1,2]) and very useful for the purposes of this note.
Let
$$X = (X_1, X_2, \ldots)$$
be a sequence of $S$-valued random variables and
$$\mathcal{F}_0 = \{ \emptyset, \Omega \} \quad \text{and} \quad \mathcal{F}_n = \sigma(X_1, \ldots, X_n).$$
The predictive distributions of $X$ are the random probability measures on $(S, \mathcal{B})$ given by
$$a_n(\cdot) = P(X_{n+1} \in \cdot \mid \mathcal{F}_n) \quad \text{for all } n \ge 0.$$
Under some conditions, there is a further random probability measure $\mu$ on $(S, \mathcal{B})$ such that
$$\mu(f) \overset{a.s.}{=} \lim_n a_n(f) \quad \text{for each } f \in F. \tag{1}$$
For instance, condition (1) holds if $X$ is exchangeable. More generally, it holds if $X$ is conditionally identically distributed (c.i.d.), as defined in Section 2. Note also that, since $S$ is separable, condition (1) implies $a_n \to \mu$ weakly. Regarding $a_n$ and $\mu$ as measurable functions from $\Omega$ into $\mathcal{P}$, one obtains
$$P\{ \omega \in \Omega : a_{n,\omega} \to \mu_\omega \text{ weakly} \} = 1.$$
Assume condition (1), fix a sequence $(d_n)$ of positive constants, and define
$$D_n(f) = d_n \bigl( a_n(f) - \mu(f) \bigr) \quad \text{for each } f \in F.$$
This note deals with the process
$$D_n = \{ D_n(f) : f \in F \}.$$
Our goal is to show that, under some conditions on $X$ and with a suitable choice of the constants $d_n$, the finite-dimensional distributions of $D_n$ stably converge, as $n \to \infty$, to a certain Gaussian limit.
To be more precise, we recall that a kernel on $(S, \mathcal{B})$ is a measurable map $\alpha : S \to \mathcal{P}$. This means that $\alpha(x) \in \mathcal{P}$ for each $x \in S$, and the function $x \mapsto \alpha(x)(A)$ is $\mathcal{B}$-measurable for each $A \in \mathcal{B}$. In what follows, we write
$$\alpha(x)(f) = \int f(y) \, \alpha(x)(dy) \quad \text{for all } x \in S \text{ and } f \in F.$$
Next, as in [3], suppose the predictive distributions of $X$ satisfy the recursive equation
$$a_{n+1} = q_n \, a_n + (1 - q_n) \, \alpha(X_{n+1}) \quad \text{a.s. for all } n \ge 0, \tag{2}$$
where $q_0, q_1, \ldots \in (0, 1)$ are constants and $\alpha$ is a kernel on $(S, \mathcal{B})$. Moreover, let
$$\nu(\cdot) = P(X_1 \in \cdot)$$
be the marginal distribution of $X_1$. Under condition (2), $X$ is c.i.d. whenever $\alpha$ is a regular conditional distribution for $\nu$ given a sub-$\sigma$-field $\mathcal{G} \subset \mathcal{B}$; see ([3] Section 5). Hence, we assume
$$\alpha(\cdot)(A) = E_\nu(1_A \mid \mathcal{G}), \quad \nu\text{-a.s.}, \tag{3}$$
for all $A \in \mathcal{B}$ and some sub-$\sigma$-field $\mathcal{G} \subset \mathcal{B}$. For instance, condition (3) holds if
$$\alpha(x) = \delta_x \quad \text{for all } x \in S,$$
where $\delta_x$ denotes the unit mass at the point $x$ (just let $\mathcal{G} = \mathcal{B}$). In addition, we assume
$$\sum_n (1 - q_n)^2 < \infty \quad \text{and} \quad \lim_n \, d_n \, \sup_{k \ge n} (1 - q_{k-1}) = 0,$$
where
$$d_n = \Bigl\{ \sum_{k \ge n} (1 - q_k)^2 \Bigr\}^{-1/2}.$$
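As a purely illustrative aside (ours, not part of the paper's argument), the following sketch computes the constants $d_n$ and checks the two assumptions above for one concrete choice of $q_n$; the choice $q_n = (n+1)/(n+2)$ and the truncation level are arbitrary assumptions made only for the demonstration.

```python
import numpy as np

# Minimal numerical sketch (not from the paper): take q_n = (n + 1)/(n + 2),
# so 1 - q_n = 1/(n + 2).  Then sum_n (1 - q_n)^2 is finite and d_n ~ sqrt(n).
N = 10**6                                   # truncation of the infinite tail sums
n = np.arange(N)
one_minus_q = 1.0 / (n + 2.0)

# d_n = { sum_{k >= n} (1 - q_k)^2 }^{-1/2}, via a reversed cumulative sum
tail = np.cumsum((one_minus_q ** 2)[::-1])[::-1]
d = tail ** -0.5

# First assumption: sum_n (1 - q_n)^2 < infinity (the full sum is ~ 0.395 here)
print(tail[0])

# Second assumption: d_n * sup_{k >= n} (1 - q_{k-1}) -> 0.  Since 1 - q_n
# decreases, the sup equals 1/(n + 1), and the product decays like 1/sqrt(n).
for m in (10, 10**2, 10**4):
    print(m, d[m] / (m + 1))
```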
In this framework, it is shown that
$$\bigl( D_n(f_1), \ldots, D_n(f_p) \bigr) \to N_p(0, \Sigma) \quad \text{stably} \tag{4}$$
for all $p \ge 1$ and all $f_1, \ldots, f_p \in F$, where $\Sigma$ is the random covariance matrix with entries
$$\sigma_{jk} = \int \alpha(x)(f_j) \, \alpha(x)(f_k) \, \mu(dx) - \mu(f_j) \, \mu(f_k).$$
We actually prove something more than (4). Let $C_b(\mathbb{R})$ denote the set of real bounded continuous functions on $\mathbb{R}$. Then, it is shown that
$$E\bigl[ \varphi(D_n(f)) \mid \mathcal{F}_n \bigr] \overset{P}{\longrightarrow} N(0, \sigma^2)(\varphi) \tag{5}$$
for all $f \in F$ and $\varphi \in C_b(\mathbb{R})$, where
$$\sigma^2 = \int \alpha(x)(f)^2 \, \mu(dx) - \mu(f)^2.$$
Based on (5), it is not hard to deduce condition (4).
Before concluding the Introduction, several remarks are in order.
(i)
A remarkable special case is $\alpha(x) = \delta_x$ for all $x \in S$. Indeed, Equation (2) holds with $\alpha = \delta$ in some meaningful situations, including Dirichlet sequences; see ([3] Section 4) for other examples. Thus, suppose $\alpha = \delta$. Then, the above formulae reduce to $\sigma_{jk} = \mu(f_j f_k) - \mu(f_j)\,\mu(f_k)$ and $\sigma^2 = \mu(f^2) - \mu(f)^2$. Moreover, if $\nu$ is non-atomic and
$$\prod_{j=0}^n q_j \to 0 \quad \text{and} \quad \sum_n \prod_{j=0}^n q_j = \infty,$$
then $\mu$ takes the form
$$\mu \overset{a.s.}{=} \sum_n V_n \, \delta_{Y_n},$$
where $(V_n)$ and $(Y_n)$ are independent sequences and $(Y_n)$ is i.i.d. with $Y_1 \sim \nu$; see ([3] Theorem 20) and [4] for details.
(ii)
Let $l^\infty(G)$ be the set of real bounded functions on $G$, where $G$ is any subset of $F$. For instance, if $S = \mathbb{R}$, one could take $G = \{ 1_{(-\infty, x]} : x \in \mathbb{R} \}$. In view of (4), a natural question is whether $D_n$ has a limit in distribution when $l^\infty(G)$ is equipped with a suitable distance. As an example, $l^\infty(G)$ could be equipped with the uniform distance (as in [1,2]) or with some weaker distance (as in [5]). Natural as it is, this question is not addressed in this note. We hope and plan to investigate it in a forthcoming paper.
(iii)
For fixed $f \in F$, condition (4) provides some information on the convergence rate of $a_n(f)$ to $\mu(f)$. Define $L_n = u_n \, |a_n(f) - \mu(f)|$, where $(u_n)$ is any sequence of positive constants. Then, condition (4) yields $L_n \overset{P}{\to} 0$ whenever $u_n / d_n \to 0$. Furthermore, $L_n \overset{P}{\to} \infty$ provided $u_n / d_n \to \infty$ and $\sigma^2 > 0$ a.s.
(iv)
The condition $\lim_n d_n \sup_{k \ge n}(1 - q_{k-1}) = 0$ is just a technical assumption which guarantees that, asymptotically, there are no dominating terms. In a sense, this condition is analogous to the weak Lindeberg condition in the classical CLT for independent summands.
(v)
From a Bayesian point of view, $\mu$ can be seen as a random parameter of the data sequence $X$. This is quite clear if $X$ is exchangeable, for, in this case, $X$ is conditionally i.i.d. given $\mu$. If $X$ is only c.i.d., the role of $\mu$ is not as crucial, but $\mu$ still contributes to specifying the probability distribution of $X$; see ([3] Section 2.1). Thus, in a Bayesian framework, conditions (4)–(5) may be useful for making (asymptotic) inference about $\mu$. To this end, an alternative would be to prove a limit theorem for $W_n = w_n (\mu_n - \mu)$, where $w_n$ is a suitable constant and $\mu_n = (1/n) \sum_{j=1}^n \delta_{X_j}$ is the empirical measure. However, $D_n$ has two advantages over $W_n$: it usually converges at a better rate, and the variance of the limit distribution is smaller; see, e.g., Example 3.
(vi)
Conditions (4)–(5) are our main results. They can be motivated in at least two ways. Firstly, from the theoretical perspective, conditions (4)–(5) fit into the body of results concerning the asymptotic behavior of conditional expectations (see, e.g., [6,7,8] and references therein). Secondly, from the practical perspective, conditions (4)–(5) play a role in all those fields where predictive distributions are basic objects. The main example is Bayesian predictive inference. Indeed, the predictive distributions investigated in this note were introduced in connection with Bayesian prediction problems; see [3]. Another example is the asymptotic behavior of certain urn schemes. Related subjects, where (4)–(5) are potentially useful, are empirical processes for dependent data, Glivenko–Cantelli-type theorems, and merging of opinions. Without any claim of being exhaustive, a list of references is [3,5,9,10,11,12,13,14,15,16,17,18,19,20,21].

2. Preliminaries

In this note, $N_p(0, C)$ denotes the Gaussian law on the Borel sets of $\mathbb{R}^p$ with mean $0$ and covariance matrix $C$, where $C$ is symmetric and positive semidefinite. If $p = 1$ and $c \ge 0$ is a scalar, we write $N(0, c)$ instead of $N_1(0, c)$ and
$$N(0, c)(\varphi) = \int \varphi(x) \, N(0, c)(dx)$$
for all bounded measurable $\varphi : \mathbb{R} \to \mathbb{R}$. Note that, if $\Sigma$ is a random covariance matrix, $N_p(0, \Sigma)$ is a random probability measure on the Borel sets of $\mathbb{R}^p$.
Let us briefly recall stable convergence. Let $\mathcal{A}^+ = \{ H \in \mathcal{A} : P(H) > 0 \}$. Fix a random probability measure $K$ on $(S, \mathcal{B})$ and define
$$\lambda_H(A) = E\bigl[ K(A) \mid H \bigr] \quad \text{for all } A \in \mathcal{B} \text{ and } H \in \mathcal{A}^+.$$
Each $\lambda_H$ is a probability measure on $\mathcal{B}$. Then, $X_n$ converges stably to $K$, written $X_n \to K$ stably, if
$$P(X_n \in \cdot \mid H) \to \lambda_H \quad \text{weakly for all } H \in \mathcal{A}^+.$$
In particular, $X_n$ converges in distribution to $\lambda_\Omega$. However, stable convergence is stronger than convergence in distribution. To see this, take a further random variable $X : \Omega \to S$. Then, $X_n \overset{P}{\to} X$ if, and only if, $X_n \to \delta_X$ stably. Thus, stable convergence is strictly connected to convergence in probability. Moreover, $(X_n, X) \to K \times \delta_X$ stably whenever $X_n \to K$ stably. Therefore, if $X_n$ converges stably, $(X_n, X)$ still converges stably for any $S$-valued random variable $X$.
We next turn to conditional identity in distribution. Say that $X$ is conditionally identically distributed (c.i.d.) if
$$P(X_k \in \cdot \mid \mathcal{F}_n) = P(X_{n+1} \in \cdot \mid \mathcal{F}_n) \quad \text{a.s. for all } k > n \ge 0.$$
Thus, at each time $n$, the future observations $(X_k : k > n)$ are identically distributed given the past. This is actually weaker than exchangeability. Indeed, $X$ is exchangeable if, and only if, it is stationary and c.i.d.
C.i.d. sequences were introduced in [9,22] and then investigated in various papers; see, e.g., [3,4,5,11,23,24,25,26,27,28,29].
The asymptotics of c.i.d. sequences is similar to that of exchangeable ones. To see this, suppose $X$ is c.i.d. and define the empirical measures
$$\mu_n = \frac{1}{n} \sum_{j=1}^n \delta_{X_j}.$$
Then, there is a random probability measure $\mu$ on $(S, \mathcal{B})$ such that
$$\mu(A) \overset{a.s.}{=} \lim_m \mu_m(A) \quad \text{for each fixed } A \in \mathcal{B}.$$
It follows that
$$E\bigl[ \mu(A) \mid \mathcal{F}_n \bigr] = \lim_m E\bigl[ \mu_m(A) \mid \mathcal{F}_n \bigr] = \lim_m \frac{1}{m} \sum_{j=n+1}^m P(X_j \in A \mid \mathcal{F}_n) = P(X_{n+1} \in A \mid \mathcal{F}_n) \quad \text{a.s.}$$
for all $n \ge 0$ and $A \in \mathcal{B}$. Therefore, as in the exchangeable case, the predictive distributions can be written as
$$a_n(\cdot) = P(X_{n+1} \in \cdot \mid \mathcal{F}_n) = E\bigl[ \mu(\cdot) \mid \mathcal{F}_n \bigr] \quad \text{a.s.}$$
Using the martingale convergence theorem, this implies
$$\mu(f) \overset{a.s.}{=} \lim_n E\bigl[ \mu(f) \mid \mathcal{F}_n \bigr] = \lim_n a_n(f) \quad \text{for all } f \in F.$$
Furthermore, $X$ is asymptotically exchangeable, in the sense that the probability distribution of the shifted sequence $(X_n, X_{n+1}, \ldots)$ converges weakly to an exchangeable probability measure on $(S^\infty, \mathcal{B}^\infty)$.
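As a concrete illustration of these facts (our sketch, not part of the paper), the following simulation generates one path of a Pólya-urn (Dirichlet) sequence, which is exchangeable and hence c.i.d., and watches the predictive mean $a_n(f)$ and the empirical mean $\mu_n(f)$ approach the same random limit $\mu(f)$; the choices $\nu = \mathrm{Uniform}(0,1)$, $\theta_0 = 1$ and $f(x) = x$ are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# One path of a Dirichlet (Polya urn) sequence with nu = Uniform(0,1) and
# theta0 = 1: a_n = (theta0 * nu + sum_{i <= n} delta_{X_i}) / (n + theta0).
# Both a_n(f) and the empirical mean mu_n(f), with f(x) = x, converge a.s.
# to the same random limit mu(f).
theta0, n_steps = 1.0, 5000
X = []
for n in range(n_steps):
    if rng.random() < theta0 / (n + theta0):
        X.append(rng.random())          # fresh draw from nu
    else:
        X.append(X[rng.integers(n)])    # repeat a past value, chosen uniformly
X = np.asarray(X)
csum = np.cumsum(X)                     # partial sums of f(X_i), f(x) = x
ns = np.arange(1, n_steps + 1)
a_n = (theta0 * 0.5 + csum) / (ns + theta0)   # a_n(f); here nu(f) = 1/2
mu_n = csum / ns                              # empirical mean mu_n(f)
print(a_n[-1], mu_n[-1])                      # nearly identical random limit
```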
Finally, we state a technical result to be used later on.
Lemma 1.
Let $(Y_n)$ be a sequence of real integrable random variables, adapted to the filtration $(\mathcal{F}_n)$, and
$$Z_n = E(Y_{n+1} \mid \mathcal{F}_n).$$
Let $V$ be a real non-negative random variable and $0 < b_1 < b_2 < \cdots$ an increasing sequence of constants, such that $b_n \to \infty$ and $b_n / b_{n+1} \to 1$. Suppose $(Y_n^2)$ is uniformly integrable, $Z_n \overset{a.s.}{\to} Z$ for some random variable $Z$, and define
$$T_n = b_n (Z_n - Z).$$
Then,
$$E\bigl[ \varphi(T_n) \mid \mathcal{F}_n \bigr] \overset{P}{\longrightarrow} N(0, V)(\varphi) \quad \text{for all } \varphi \in C_b(\mathbb{R})$$
provided
$$b_n^2 \sum_{k \ge n} (Z_k - Z_{k-1})^2 \overset{P}{\longrightarrow} V; \tag{6}$$
$$\lim_n \, b_n \, E\Bigl[ \sup_{k \ge n} |Z_k - Z_{k-1}| \Bigr] = 0; \tag{7}$$
$$\sum_{k \ge n} E\bigl| E(Z_{k+1} \mid \mathcal{F}_k) - Z_k \bigr| = o(1 / b_n). \tag{8}$$
Proof. 
Just repeat the proof of ([10] Theorem 1) with $b_n$ in the place of $\sqrt{n}$. □
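For intuition, here is a small simulation of a setting where the lemma's conclusion is classical: in a standard Pólya urn, the proportion of red balls is a martingale and the stable CLT holds with $b_n = \sqrt{n}$ and $V = Z(1 - Z)$ (cf. the urn results in [10]). This is our own illustration, not taken from the paper; the urn composition and all sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Classical Polya urn started with one red and one black ball.  The proportion
# Z_n of red balls is a martingale with Z_n -> Z ~ Uniform(0,1); with
# b_n = sqrt(n) and V = Z(1 - Z), unconditionally one expects
# Var(sqrt(n) * (Z_n - Z)) -> E[Z(1 - Z)] = 1/6.
n, N, reps = 1000, 100000, 2000
red = np.ones(reps)                 # red balls; total after i draws is i + 2
z_n = None
for i in range(N):
    red += rng.random(reps) < red / (i + 2)   # draw a ball, reinforce its color
    if i + 1 == n:
        z_n = red / (n + 2)                   # Z_n across replications
z_N = red / (N + 2)                 # proxy for the almost-sure limit Z
print(np.var(np.sqrt(n) * (z_n - z_N)), 1 / 6)
```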

3. Main Result

Let us go back to the notation of Section 1. Recall that $q_n \in (0, 1)$ is a constant for each $n \ge 0$ and $d_n = \{ \sum_{k \ge n} (1 - q_k)^2 \}^{-1/2}$. We aim to prove the following CLT.
Theorem 1.
Assume conditions (2)–(3) and
$$\sum_n (1 - q_n)^2 < \infty \quad \text{and} \quad \lim_n \, d_n \, \sup_{k \ge n} (1 - q_{k-1}) = 0.$$
Then, there is a random probability measure $\mu$ on $(S, \mathcal{B})$ such that
$$\mu(f) \overset{a.s.}{=} \lim_n a_n(f) \quad \text{and} \quad E\bigl[ \varphi(D_n(f)) \mid \mathcal{F}_n \bigr] \overset{P}{\longrightarrow} N(0, \sigma^2)(\varphi)$$
for all $f \in F$ and $\varphi \in C_b(\mathbb{R})$, where
$$\sigma^2 = \int \alpha(x)(f)^2 \, \mu(dx) - \mu(f)^2.$$
As a consequence,
$$\bigl( D_n(f_1), \ldots, D_n(f_p) \bigr) \to N_p(0, \Sigma) \quad \text{stably}$$
for all $p \ge 1$ and all $f_1, \ldots, f_p \in F$, where the covariance matrix $\Sigma$ has entries
$$\sigma_{jk} = \int \alpha(x)(f_j) \, \alpha(x)(f_k) \, \mu(dx) - \mu(f_j) \, \mu(f_k).$$
Proof. 
Due to conditions (2)–(3), $X$ is c.i.d.; see ([3] Section 5). Hence, as noted in Section 2, there is a random probability measure $\mu$ on $(S, \mathcal{B})$ such that
$$a_n(f) \overset{a.s.}{=} E\bigl[ \mu(f) \mid \mathcal{F}_n \bigr] \quad \text{for all } f \in F.$$
By martingale convergence, it follows that $a_n(f) \overset{a.s.}{\to} \mu(f)$ for all $f \in F$.
We next prove condition (5). Fix $f \in F$ and define
$$b_n = d_n, \quad Y_n = a_n(f), \quad Z = \mu(f) \quad \text{and} \quad V = \sigma^2.$$
Then, $(Y_n^2)$ is uniformly integrable (for $f$ is bounded) and $b_n$ satisfies the conditions of Lemma 1. Moreover,
$$Z_n = E(Y_{n+1} \mid \mathcal{F}_n) = E\bigl[ E\bigl( \mu(f) \mid \mathcal{F}_{n+1} \bigr) \mid \mathcal{F}_n \bigr] = E\bigl[ \mu(f) \mid \mathcal{F}_n \bigr] = a_n(f) \quad \text{a.s.},$$
so that $Z_n \overset{a.s.}{\to} Z$. Therefore, Lemma 1 applies. Hence, to prove (5), it suffices to check conditions (6)–(8).
Let $c = \sup |f|$. Since $E(Z_{k+1} \mid \mathcal{F}_k) = Z_k$ a.s., condition (8) is trivially true. Moreover, condition (2) implies
$$Z_k - Z_{k-1} = a_k(f) - a_{k-1}(f) = q_{k-1} \, a_{k-1}(f) + (1 - q_{k-1}) \, \alpha(X_k)(f) - a_{k-1}(f) = (1 - q_{k-1}) \bigl( \alpha(X_k)(f) - a_{k-1}(f) \bigr) \quad \text{a.s. for all } k \ge 1.$$
Hence, condition (7) holds, since
$$d_n \, E\Bigl[ \sup_{k \ge n} |Z_k - Z_{k-1}| \Bigr] \le 2c \, d_n \sup_{k \ge n} (1 - q_{k-1}) \to 0.$$
It remains to prove condition (6), namely
$$d_n^2 \sum_{k \ge n} (1 - q_{k-1})^2 \bigl( \alpha(X_k)(f) - a_{k-1}(f) \bigr)^2 \overset{P}{\longrightarrow} \sigma^2.$$
First note that, since $a_{k-1}(f)^2 \overset{a.s.}{\to} \mu(f)^2$ as $k \to \infty$, one obtains
$$d_n^2 \sum_{k \ge n} (1 - q_{k-1})^2 \, a_{k-1}(f)^2 = \frac{\sum_{k \ge n} (1 - q_{k-1})^2 \, a_{k-1}(f)^2}{\sum_{k \ge n} (1 - q_k)^2} \overset{a.s.}{\longrightarrow} \mu(f)^2.$$
Next, define
$$R_k = \alpha(X_k)(f)^2 \quad \text{and} \quad M_n = d_n^2 \sum_{k \ge n} (1 - q_{k-1})^2 \bigl\{ R_k - E(R_k \mid \mathcal{F}_{k-1}) \bigr\}.$$
Then,
$$E(M_n^2) = d_n^4 \sum_{k \ge n} (1 - q_{k-1})^4 \, E\Bigl[ \bigl\{ R_k - E(R_k \mid \mathcal{F}_{k-1}) \bigr\}^2 \Bigr] \le 4 c^4 \, d_n^4 \sum_{k \ge n} (1 - q_{k-1})^4 \le 4 c^4 \, d_n^2 \sup_{k \ge n} (1 - q_{k-1})^2 \cdot d_n^2 \sum_{k \ge n} (1 - q_{k-1})^2 \to 0.$$
Moreover,
$$E(R_k \mid \mathcal{F}_{k-1}) = E\Bigl[ \int \alpha(x)(f)^2 \, \mu(dx) \,\Big|\, \mathcal{F}_{k-1} \Bigr] \overset{a.s.}{\longrightarrow} \int \alpha(x)(f)^2 \, \mu(dx).$$
Therefore,
$$d_n^2 \sum_{k \ge n} (1 - q_{k-1})^2 \, R_k = M_n + d_n^2 \sum_{k \ge n} (1 - q_{k-1})^2 \, E(R_k \mid \mathcal{F}_{k-1}) \overset{P}{\longrightarrow} \int \alpha(x)(f)^2 \, \mu(dx).$$
By the same argument, it follows that
$$d_n^2 \sum_{k \ge n} (1 - q_{k-1})^2 \, \alpha(X_k)(f) \, a_{k-1}(f) \overset{P}{\longrightarrow} \mu(f) \int \alpha(x)(f) \, \mu(dx).$$
In addition, as proved in the Claim below,
$$\int \alpha(x)(f) \, \mu(dx) \overset{a.s.}{=} \mu(f).$$
Collecting all pieces together, one finally obtains
$$d_n^2 \sum_{k \ge n} (1 - q_{k-1})^2 \bigl( \alpha(X_k)(f) - a_{k-1}(f) \bigr)^2 \overset{P}{\longrightarrow} \mu(f)^2 + \int \alpha(x)(f)^2 \, \mu(dx) - 2 \mu(f)^2 = \sigma^2.$$
Hence, condition (6) holds.
This concludes the proof of (5). We next prove that (5) ⇒ (4). Let $p \ge 1$ and $f_1, \ldots, f_p \in F$. Fix $u_1, \ldots, u_p \in \mathbb{R}$ and define
$$U_n = \sum_{j=1}^p u_j \, D_n(f_j) \quad \text{and} \quad \sigma_u^2 = \sum_{j,k} u_j u_k \, \sigma_{jk}.$$
Moreover, for each $H \in \mathcal{A}^+$, define the probability measure
$$\lambda_H(A) = E\bigl[ N(0, \sigma_u^2)(A) \mid H \bigr] \quad \text{for each Borel set } A \subset \mathbb{R}.$$
We have to show that
$$P(U_n \in \cdot \mid H) \to \lambda_H \quad \text{weakly for each } H \in \mathcal{A}^+. \tag{9}$$
To this end, call $\phi_H$ the characteristic function of $\lambda_H$, namely
$$\phi_H(t) = E\Bigl[ \int e^{itx} \, N(0, \sigma_u^2)(dx) \,\Big|\, H \Bigr] = E\bigl[ e^{-t^2 \sigma_u^2 / 2} \mid H \bigr] \quad \text{for all } t \in \mathbb{R}.$$
Letting $f = \sum_{j=1}^p u_j f_j$, one obtains
$$U_n = D_n(f) \quad \text{and} \quad \sigma_u^2 = \int \alpha(x)(f)^2 \, \mu(dx) - \mu(f)^2.$$
Therefore, condition (5) yields
$$E\bigl[ e^{i t U_n} \bigr] = E\Bigl[ E\bigl( e^{i t D_n(f)} \mid \mathcal{F}_n \bigr) \Bigr] \to E\bigl[ e^{-t^2 \sigma_u^2 / 2} \bigr] = \phi_\Omega(t)$$
for each $t \in \mathbb{R}$. Hence, condition (9) holds for $H = \Omega$. Next, suppose $H \in \bigcup_n \mathcal{F}_n$ and $P(H) > 0$. Then, for large $n$, one obtains
$$E\bigl[ 1_H \, e^{i t U_n} \bigr] = E\Bigl[ 1_H \, E\bigl( e^{i t D_n(f)} \mid \mathcal{F}_n \bigr) \Bigr].$$
Hence, for each $t \in \mathbb{R}$, condition (5) still implies
$$P(H) \, \phi_H(t) = E\bigl[ 1_H \, e^{-t^2 \sigma_u^2 / 2} \bigr] = \lim_n E\Bigl[ 1_H \, E\bigl( e^{i t D_n(f)} \mid \mathcal{F}_n \bigr) \Bigr] = \lim_n E\bigl[ 1_H \, e^{i t U_n} \bigr].$$
Therefore, condition (9) holds whenever $H \in \bigcup_n \mathcal{F}_n$ and $P(H) > 0$. Based on this fact, by standard arguments, condition (9) easily follows for each $H \in \mathcal{A}^+$.
To conclude the proof of the Theorem, it remains only to show the following:
Claim: 
$$\int \alpha(x)(f) \, \mu(dx) \overset{a.s.}{=} \mu(f) \quad \text{for all } f \in F.$$
Proof of the Claim:
By (3), $\alpha$ is a regular conditional distribution for $\nu$ given a sub-$\sigma$-field of $\mathcal{B}$, where $\nu$ is the marginal distribution of $X_1$. Therefore, as proved in ([3] Lemma 6), there is a set $A \in \mathcal{B}$ such that $\nu(A) = 1$ and
$$\int \alpha(z)(f) \, \alpha(x)(dz) = \alpha(x)(f) \quad \text{for all } x \in A \text{ and } f \in F.$$
Since $X$ is c.i.d. (and, thus, identically distributed), one also obtains $P(X_n \in A) = \nu(A) = 1$ for all $n \ge 1$.
Having noted these facts, fix $f \in F$. Since $a_0 = \nu$ and $\alpha$ is a regular conditional distribution for $\nu$,
$$\int \alpha(x)(f) \, a_0(dx) = a_0(f).$$
Moreover, if $\int \alpha(x)(f) \, a_n(dx) = a_n(f)$ a.s. for some $n \ge 0$, then
$$\int \alpha(x)(f) \, a_{n+1}(dx) = q_n \int \alpha(x)(f) \, a_n(dx) + (1 - q_n) \int \alpha(x)(f) \, \alpha(X_{n+1})(dx) = q_n \, a_n(f) + (1 - q_n) \, \alpha(X_{n+1})(f) = a_{n+1}(f) \quad \text{a.s.}$$
By induction, one obtains $\int \alpha(x)(f) \, a_n(dx) = a_n(f)$ a.s. for each $n \ge 0$. Hence,
$$\int \alpha(x)(f) \, \mu(dx) = \lim_n \int \alpha(x)(f) \, a_n(dx) = \lim_n a_n(f) = \mu(f) \quad \text{a.s.} \qquad \square$$
We do not know whether $E\bigl[ \varphi(D_n(f)) \mid \mathcal{F}_n \bigr]$ converges a.s. (and not only in probability) under the conditions of Theorem 1. However, it can be shown that $E\bigl[ \varphi(D_n(f)) \mid \mathcal{F}_n \bigr]$ converges a.s. under slightly stronger conditions on $q_n$.
Under conditions (2)–(3), for Theorem 1 to apply, it suffices that
$$\lim_n \, n^b (1 - q_n) = c \quad \text{for some } b > 1/2 \text{ and } c > 0. \tag{10}$$
In addition, if (10) holds, then
$$\frac{n^{b - 1/2}}{d_n} \to \frac{c}{\sqrt{2b - 1}},$$
since $\sum_{k \ge n} (1 - q_k)^2 \sim c^2 \sum_{k \ge n} k^{-2b} \sim c^2 \, n^{1 - 2b} / (2b - 1)$. Hence, letting $D_n^* = n^{b - 1/2} (a_n - \mu)$, one obtains
$$\bigl( D_n^*(f_1), \ldots, D_n^*(f_p) \bigr) \to N_p\Bigl( 0, \, \frac{c^2}{2b - 1} \, \Sigma \Bigr) \quad \text{stably},$$
for all $p \ge 1$ and all $f_1, \ldots, f_p \in F$, provided conditions (2), (3) and (10) hold.
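The limit $n^{b-1/2}/d_n \to c/\sqrt{2b-1}$ is easy to check numerically. The following quick sketch is ours; the values $b = 3/2$, $c = 2$ and the specific sequence $1 - q_n = 2/(n+5)^{3/2}$ are arbitrary assumptions chosen only so that $q_n \in (0,1)$ and (10) holds.

```python
import numpy as np

# Numerical check (ours) of n^(b - 1/2) / d_n -> c / sqrt(2b - 1),
# with the arbitrary choice b = 3/2, c = 2, i.e., 1 - q_n = 2/(n + 5)^(3/2).
b, c, N = 1.5, 2.0, 10**6
n = np.arange(N, dtype=float)
one_minus_q = c / (n + 5.0) ** b
d = (np.cumsum((one_minus_q ** 2)[::-1])[::-1]) ** -0.5   # d_n via tail sums
print(c / np.sqrt(2 * b - 1))                  # limit value, ~ 1.414 here
for m in (10**2, 10**3, 10**4):
    print(m, m ** (b - 0.5) / d[m])            # approaches the limit value
```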
We close this note with some examples.
Example 1.
Let
$$q_n = \frac{n + \theta_n}{n + 1 + \theta_{n+1}},$$
where $(\theta_n)$ is a bounded increasing sequence with $\theta_0 > 0$. Then, $X$ is c.i.d. (because of (2)–(3)) but is exchangeable if and only if $\theta_n = \theta_0$ for all $n$. In any case, since condition (10) holds with $b = c = 1$, Theorem 1 applies and $d_n$ can be replaced by $\sqrt{n}$. Letting $D_n^* = \sqrt{n} \, (a_n - \mu)$, it follows that
$$\bigl( D_n^*(f_1), \ldots, D_n^*(f_p) \bigr) \to N_p(0, \Sigma) \quad \text{stably}.$$
It is worth noting that, in the special case $\theta_n = \theta_0$ for all $n$, the predictive distributions of $X$ reduce to
$$a_n = \frac{\theta_0 \, \nu + \sum_{i=1}^n \alpha(X_i)}{n + \theta_0}.$$
Therefore, $X$ is a Dirichlet sequence if $\alpha = \delta$. The general case, where $\alpha$ is any kernel satisfying condition (3), is investigated in [30]. It turns out that $X$ satisfies most properties of Dirichlet sequences. In particular, $\mu$ has the same distribution as
$$\mu^* = \sum_n V_n \, \alpha(Y_n),$$
where $(V_n)$ and $(Y_n)$ are independent sequences, $(Y_n)$ is i.i.d. with $Y_1 \sim \nu$, and $(V_n)$ has the stick-breaking distribution. Nevertheless, as shown in the next example, $X$ can behave quite differently from a Dirichlet sequence.
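The CLT of Example 1 lends itself to a Monte Carlo check. The sketch below is our illustration, not part of the paper: it takes the Dirichlet case $\alpha = \delta$, $\theta_n = \theta_0$, with the arbitrary assumptions $\nu = \mathrm{Uniform}(0,1)$, $f(x) = x$, and $\mu(f)$, $\mu(f^2)$ approximated by $a_N(f)$, $a_N(f^2)$ for $N \gg n$. Unconditionally, the variance of $D_n^*(f) = \sqrt{n}\,(a_n(f) - \mu(f))$ should be close to $E[\sigma^2]$ with $\sigma^2 = \mu(f^2) - \mu(f)^2$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo sketch of the CLT in Example 1 (Dirichlet case, theta0 = 1,
# nu = Uniform(0,1), f(x) = x).  For this setup a direct computation gives
# E[sigma^2] = E[mu(f^2) - mu(f)^2] = 1/24, which np.var(D) should approach.
theta0, n, N, reps = 1.0, 500, 20000, 200
X = np.empty((reps, N))
for i in range(N):
    fresh = rng.random(reps) < theta0 / (i + theta0)   # draw from nu?
    j = rng.integers(0, max(i, 1), size=reps)          # else copy X_j, j < i
    X[:, i] = np.where(fresh, rng.random(reps), X[np.arange(reps), j])
s1, s2 = np.cumsum(X, axis=1), np.cumsum(X**2, axis=1)
a_n   = (theta0 * 0.5 + s1[:, n - 1]) / (n + theta0)   # a_n(f); nu(f) = 1/2
mu_f  = (theta0 * 0.5 + s1[:, -1]) / (N + theta0)      # ~ mu(f)
mu_f2 = (theta0 / 3 + s2[:, -1]) / (N + theta0)        # ~ mu(f^2); nu(f^2) = 1/3
D = np.sqrt(n) * (a_n - mu_f)
print(np.var(D), np.mean(mu_f2 - mu_f**2))             # roughly equal
```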
Example 2
(Example 1 continued). Let $\mathcal{H}$ be a countable partition of $S$ such that $H \in \mathcal{B}$ and $\nu(H) > 0$ for all $H \in \mathcal{H}$. Define
$$\alpha(x) = \sum_{H \in \mathcal{H}} 1_H(x) \, \nu(\cdot \mid H) = \nu(\cdot \mid H_x) \quad \text{for all } x \in S,$$
where $H_x$ is the only element of the partition $\mathcal{H}$ such that $x \in H_x$. Then, $\alpha$ is a regular conditional distribution for $\nu$ given $\sigma(\mathcal{H})$ (i.e., condition (3) holds). If the $q_n$ are as in Example 1 with $\theta_n = \theta_0$ for all $n$, one obtains
$$a_n = \frac{\theta_0 \, \nu + \sum_{i=1}^n \nu(\cdot \mid H_{X_i})}{n + \theta_0}.$$
Therefore,
$$a_n \ll \nu \quad \text{for all } n \ge 0. \tag{11}$$
This is a striking difference with respect to Dirichlet sequences. For instance, if $\nu$ is non-atomic, condition (11) yields
$$P(X_i = X_j \ \text{for some} \ i \ne j) = 0,$$
while $P(X_i = X_j \ \text{for some} \ i \ne j) = 1$ if $X$ is a Dirichlet sequence. Note also that, for each $f \in F$,
$$\sigma^2 = \int \alpha(x)(f)^2 \, \mu(dx) - \mu(f)^2 = \sum_{H \in \mathcal{H}} \nu(f \mid H)^2 \, \mu(H) - \mu(f)^2,$$
while $\sigma^2 = \mu(f^2) - \mu(f)^2$ if $X$ is a Dirichlet sequence. Other choices of $\alpha$, which make $X$ quite different from a Dirichlet sequence, are given in [30].
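To see condition (11) at work, here is a small simulation (ours, not from the paper): with $S = (0,1)$, $\nu = \mathrm{Uniform}(0,1)$, $\theta_0 = 1$, and $\mathcal{H}$ the partition of $(0,1)$ into ten equal cells (all arbitrary assumptions), sampling from the predictive never repeats a value exactly, in contrast with the Dirichlet case.

```python
import numpy as np

rng = np.random.default_rng(2)

# Example 2 on S = (0,1) with nu = Uniform(0,1) and H_j = (j/10, (j+1)/10].
# Then alpha(x) = nu(.|H_x) is uniform on the cell containing x, and the
# predictive is a_n = (theta0*nu + sum_i nu(.|H_{X_i})) / (n + theta0).
theta0, n_steps, cells = 1.0, 2000, 10
X = np.empty(n_steps)
for n in range(n_steps):
    if rng.random() < theta0 / (n + theta0):
        X[n] = rng.random()                          # draw from nu
    else:
        j = min(int(X[rng.integers(n)] * cells), cells - 1)  # cell of a past point
        X[n] = (j + rng.random()) / cells            # fresh draw from nu(.|H_j)
print(len(np.unique(X)) == n_steps)                  # True: no ties, since a_n << nu
```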
Example 3.
A meaningful special case is $\sum_n (1 - q_n) < \infty$. In this case,
$$\prod_{j=0}^\infty q_j := \lim_n \prod_{j=0}^n q_j$$
exists and is strictly positive. Hence, $\mu$ admits the representation
$$\mu = \nu \prod_{j=0}^\infty q_j + \sum_{i=1}^\infty \alpha(X_i) \, (1 - q_{i-1}) \prod_{j=i}^\infty q_j.$$
As an example, under conditions (2)–(3), Theorem 1 applies whenever
$$q_n = \exp\bigl\{ -(c + n)^{-2} \bigr\} \quad \text{for some constant } c > 0.$$
With this choice of $q_n$, one obtains $(1 - q_n)(c + n)^2 \to 1$, so that $\sum_n (1 - q_n) < \infty$ and $\mu$ can be written as above. Note also that
$$\lim_n \frac{d_n}{(c + n)^{3/2}} = \sqrt{3}.$$
Therefore, for fixed $f \in F$, the rate of convergence of $a_n(f)$ to $\mu(f)$ is $n^{3/2}$ and not the usual $n^{1/2}$.
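As a quick numerical sanity check of the last two displays (ours; the value $c = 1$ and the truncation level are arbitrary assumptions):

```python
import numpy as np

# Numerical check for Example 3 with c = 1: q_n = exp{-(c + n)^(-2)} gives
# 1 - q_n ~ (c + n)^(-2), and the tail sum sum_{k >= n} (c + k)^(-4) is
# ~ (c + n)^(-3) / 3, so that d_n ~ sqrt(3) * (c + n)^(3/2).
c, N = 1.0, 10**6
n = np.arange(N, dtype=float)
one_minus_q = -np.expm1(-(c + n) ** -2)      # 1 - q_n, computed accurately
tail = np.cumsum((one_minus_q ** 2)[::-1])[::-1]
d = tail ** -0.5
for m in (10**2, 10**3, 10**4):
    print(m, d[m] / (c + m) ** 1.5)          # approaches sqrt(3) ~ 1.732
```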

Author Contributions

Methodology, P.B., L.P. and P.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant agreement No 817257.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

We are grateful to Giorgio Letta and Eugenio Regazzini. They not only introduced us to probability theory, they also shared with us their enthusiasm and some of their expertise.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Dudley, R.M. Uniform Central Limit Theorems; Cambridge University Press: Cambridge, UK, 1999.
2. Van der Vaart, A.; Wellner, J.A. Weak Convergence and Empirical Processes; Springer: New York, NY, USA, 1996.
3. Berti, P.; Dreassi, E.; Pratelli, L.; Rigo, P. A class of models for Bayesian predictive inference. Bernoulli 2021, 27, 702–726.
4. Berti, P.; Dreassi, E.; Pratelli, L.; Rigo, P. Asymptotics of certain conditionally identically distributed sequences. Statist. Prob. Lett. 2021, 168, 108923.
5. Berti, P.; Pratelli, L.; Rigo, P. Limit theorems for empirical processes based on dependent data. Electron. J. Probab. 2012, 17, 1–18.
6. Crimaldi, I.; Pratelli, L. Convergence results for conditional expectations. Bernoulli 2005, 11, 737–745.
7. Goggin, E.M. Convergence in distribution of conditional expectations. Ann. Probab. 1994, 22, 1097–1114.
8. Lan, G.; Hu, Z.C.; Sun, W. Products of conditional expectation operators: Convergence and divergence. J. Theoret. Probab. 2021, 34, 1012–1028.
9. Berti, P.; Pratelli, L.; Rigo, P. Limit theorems for a class of identically distributed random variables. Ann. Probab. 2004, 32, 2029–2052.
10. Berti, P.; Crimaldi, I.; Pratelli, L.; Rigo, P. A central limit theorem and its applications to multicolor randomly reinforced urns. J. Appl. Probab. 2011, 48, 527–546.
11. Berti, P.; Pratelli, L.; Rigo, P. Exchangeable sequences driven by an absolutely continuous random measure. Ann. Probab. 2013, 41, 2090–2102.
12. Blackwell, D.; Dubins, L.E. Merging of opinions with increasing information. Ann. Math. Statist. 1962, 33, 882–886.
13. Cifarelli, D.M.; Regazzini, E. De Finetti’s contribution to probability and statistics. Statist. Sci. 1996, 11, 253–282.
14. Cifarelli, D.M.; Dolera, E.; Regazzini, E. Frequentistic approximations to Bayesian prevision of exchangeable random elements. Int. J. Approx. Reason. 2016, 78, 138–152.
15. Dolera, E.; Regazzini, E. Uniform rates of the Glivenko-Cantelli convergence and their use in approximating Bayesian inferences. Bernoulli 2019, 25, 2982–3015.
16. Fortini, S.; Ladelli, L.; Regazzini, E. Exchangeability, predictive distributions and parametric models. Sankhyā Indian J. Stat. Ser. A 2000, 62, 86–109.
17. Hahn, P.R.; Martin, R.; Walker, S.G. On recursive Bayesian predictive distributions. J. Am. Stat. Assoc. 2018, 113, 1085–1093.
18. Morvai, G.; Weiss, B. On universal algorithms for classifying and predicting stationary processes. Probab. Surv. 2021, 18, 77–131.
19. Pitman, J. Some developments of the Blackwell-MacQueen urn scheme. Stat. Probab. Game Theory IMS Lect. Notes Mon. Ser. 1996, 30, 245–267.
20. Pitman, J. Combinatorial Stochastic Processes; Lectures from the XXXII Summer School in Saint-Flour; Springer: Berlin/Heidelberg, Germany, 2006.
21. Regazzini, E. Old and recent results on the relationship between predictive inference and statistical modeling either in nonparametric or parametric form. In Bayesian Statistics 6; Oxford University Press: Oxford, UK, 1999; pp. 571–588.
22. Kallenberg, O. Spreading and predictable sampling in exchangeable sequences and processes. Ann. Probab. 1988, 16, 508–534.
23. Airoldi, E.M.; Costa, T.; Bassetti, F.; Leisen, F.; Guindani, M. Generalized species sampling priors with latent beta reinforcements. J. Am. Stat. Assoc. 2014, 109, 1466–1480.
24. Bassetti, F.; Crimaldi, I.; Leisen, F. Conditionally identically distributed species sampling sequences. Adv. Appl. Probab. 2010, 42, 433–459.
25. Cassese, A.; Zhu, W.; Guindani, M.; Vannucci, M. A Bayesian nonparametric spiked process prior for dynamic model selection. Bayesian Anal. 2019, 14, 553–572.
26. Fong, E.; Holmes, C.; Walker, S.G. Martingale posterior distributions. arXiv 2021, arXiv:2103.15671v1.
27. Fortini, S.; Petrone, S. Predictive construction of priors in Bayesian nonparametrics. Braz. J. Probab. Statist. 2012, 26, 423–449.
28. Fortini, S.; Petrone, S.; Sporysheva, P. On a notion of partially conditionally identically distributed sequences. Stoch. Proc. Appl. 2018, 128, 819–846.
29. Fortini, S.; Petrone, S. Quasi-Bayes properties of a procedure for sequential learning in mixture models. J. R. Stat. Soc. B 2020, 82, 1087–1114.
30. Berti, P.; Dreassi, E.; Leisen, F.; Pratelli, L.; Rigo, P. Kernel based Dirichlet sequences. arXiv 2021, arXiv:2106.00114.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
