Article

On the Rate-Distortion Function of Sampled Cyclostationary Gaussian Processes

1 Department of Electrical and Computer Engineering, Ben-Gurion University, Be’er-Sheva 8410501, Israel
2 Faculty of Mathematics and Computer Science, Weizmann Institute of Science, Rehovot 7610001, Israel
* Author to whom correspondence should be addressed.
Submission received: 14 February 2020 / Revised: 8 March 2020 / Accepted: 10 March 2020 / Published: 17 March 2020
(This article belongs to the Special Issue Wireless Networks: Information Theoretic Perspectives)

Abstract
Man-made communications signals are typically modelled as continuous-time (CT) wide-sense cyclostationary (WSCS) processes. As modern processing is digital, it is applied to discrete-time (DT) processes obtained by sampling the CT processes. When sampling is applied to a CT WSCS process, the statistics of the resulting DT process depend on the relationship between the sampling interval and the period of the statistics of the CT process: when these two parameters are commensurate, i.e., their ratio is a rational number, the DT process is WSCS. This situation is referred to as synchronous sampling. When this is not the case, which is referred to as asynchronous sampling, the resulting DT process is wide-sense almost cyclostationary (WSACS). The sampled CT processes are commonly encoded using a source code to facilitate storage or transmission over wireless networks, e.g., using compress-and-forward relaying. In this work, we study the fundamental tradeoff between rate and distortion for source codes applied to sampled CT WSCS processes, characterized via the rate-distortion function (RDF). We note that while the RDF characterization for the case of synchronous sampling directly follows from classic information-theoretic tools utilizing ergodicity and the law of large numbers, when sampling is asynchronous, the resulting process is not information stable. In such cases, the commonly used information-theoretic tools are inapplicable to RDF analysis, which poses a major challenge. Using the information-spectrum framework, we show that the RDF for asynchronous sampling in the low distortion regime can be expressed as the limit superior of a sequence of RDFs in which each element corresponds to the RDF of a synchronously sampled WSCS process (yet their limit is not guaranteed to exist). The resulting characterization allows us to introduce novel insights on the relationship between sampling synchronization and the RDF. For example, we demonstrate that, differently from stationary processes, small differences in the sampling rate and the sampling time offset can notably affect the RDF of sampled CT WSCS processes.

1. Introduction

Man-made signals are typically generated using a repetitive procedure, which takes place at fixed intervals. The resulting signals are thus commonly modeled as continuous-time (CT) random processes exhibiting periodic statistical properties [1,2,3], which are referred to as wide-sense cyclostationary (WSCS) processes. In digital communications, where the transmitted waveforms commonly obey the WSCS model [3], the received CT signal is first sampled to obtain a discrete-time (DT) received signal. In the event that the sampling interval is commensurate with the period of the statistics of the CT WSCS signal, cyclostationarity is preserved in DT ([3] Section 3.9). In this work, we refer to this situation as synchronous sampling. However, it is practically common to encounter scenarios in which the sampling rate at the receiver and the symbol rate of the received CT WSCS process are incommensurate, which is referred to as asynchronous sampling. The resulting sampled process in such cases is a DT wide-sense almost cyclostationary (WSACS) stochastic process ([3] Section 3.9).
This research aims at investigating lossy source coding for asynchronously sampled CT WSCS processes. In the source coding problem, every sequence of information symbols from the source is mapped into a sequence of code symbols, referred to as codewords, taken from a predefined codebook. In lossy source coding, the source sequence is recovered up to a predefined distortion constraint, within an arbitrary small tolerance of error. The figure-of-merit for lossy source coding is the rate-distortion function (RDF) which characterizes the minimum number of bits per source symbol required to compress the source sequence such that it can be reconstructed at the decoder within the specified maximal distortion [4]. For an independent and identically distributed (IID) random source process, the RDF can be expressed as the minimum mutual information between the source variable and the reconstruction variable, such that with the corresponding conditional distribution of the reconstruction symbol given the source symbol, the distortion constraint is satisfied ([5] Chapter 10). The source coding problem has been further studied in multiple different scenarios, including the reconstruction of a single source at multiple destinations [6] and the reconstruction of multiple correlated stationary Gaussian sources at a single destination [7,8,9].
For sampled stationary source processes, ergodicity theory and the asymptotic equipartition property (AEP) ([5] Chapter 3) were utilized for characterizing the RDF in different scenarios ([10] Chapter 9), ([4] Section I), [11]. However, in a broad range of applications, including digital communications networks, the CT signals are WSCS processes, and the sampling operation results in DT source signals whose statistics depend on the relationship between the sampling rate and the period of the statistics of the source signal. When sampling is synchronous, the resulting DT source signal is WSCS ([3] Section 3.9). The RDF for lossy compression of DT WSCS Gaussian sources with memory was studied in [12]. The work [12] used the fact that any WSCS signal can be transformed into a set of stationary subprocesses [2], thereby facilitating the application of information-theoretic results obtained for multivariate stationary sources to the derivation of the RDF. Nonetheless, in many digital communications scenarios, the sampling rate and the symbol rate of the CT WSCS process are not related in any way, and are possibly incommensurate, resulting in a sampled process which is a DT WSACS stochastic process. Such situations can occur as a result of the a-priori determined values of the sampling interval and the symbol duration of the WSCS source signal, as well as due to sampling clock jitter resulting from hardware impairments. A comprehensive review of trends and applications for almost cyclostationary signals can be found in [13]. Despite their frequent occurrence, the RDF for lossy compression of WSACS sources has not been characterized, which is the motivation for the current research. A major challenge associated with characterizing fundamental limits for asynchronously sampled WSCS processes stems from the fact that the resulting processes are not information stable, in the sense that their conditional distributions are not ergodic ([14] Page X), [15,16]. As a result, the standard information-theoretic tools cannot be employed, making the characterization of the RDF a very challenging problem.
Our recent study in [17] on channel coding reveals that for the case of additive CT WSCS Gaussian noise, capacity varies significantly with sampling rates, whether the Nyquist criterion is satisfied or not. In particular, it was observed that the capacity can change dramatically with minor variations in the sampling rate, causing it to switch from synchronous sampling to asynchronous sampling. This is in direct contrast to the results obtained for wide-sense stationary noise for which the capacity remains unchanged for any sampling rate above the Nyquist rate [18]. A natural fundamental question that arises from this result is how the RDF of a sampled Gaussian source process varies with the sampling rate. As a motivating example, one may consider compress-and-forward (CF) relaying, in which the relay samples at a rate which can be incommensurate with the symbol rate of the incoming communications signal.
In this work, we employ the information-spectrum framework [14] for characterizing the RDF of asynchronously sampled memoryless Gaussian WSCS processes, as this framework is applicable to the information-theoretic analysis of processes which are not information stable ([14] Page VII). We further note that while rate characterizations obtained using information-spectrum tools and the associated quantities may be difficult to evaluate ([14] Remark 1.7.3), here we obtain a numerically computable characterization of the RDF. In particular, we focus on the mean squared error (MSE) distortion measure in the low distortion regime, namely, source codes for which the average square of the difference between the source and the reproduction process is not larger than the minimal source variance. The results of this research facilitate accurate modelling of signal compression in current and future digital communications systems. The derived RDF, which characterizes the fundamental performance limits in encoding sampled CT WSCS Gaussian processes into a digital representation, makes it possible to evaluate source coding schemes associated with different levels of complexity in terms of their gap from optimality, when applied to this important class of signals.
Furthermore, we utilize our characterization of the RDF to examine how the RDF for a sampled CT WSCS Gaussian source varies with different sampling rates and sampling time offsets. We demonstrate that, differently from stationary sources, when applying a lossy source code to a sampled WSCS process, the achievable rate-distortion tradeoff can be significantly affected by minor variations in the sampling time offset and the sampling rate. Our results thus allow identifying the sampling rate and sampling time offsets which minimize the RDF in systems involving sampled WSCS processes.
The rest of this work is organised as follows: Section 2 provides the scientific background on cyclostationary processes and on rate-distortion analysis of DT WSCS Gaussian sources. Section 3 presents the problem formulation and auxiliary results, and Section 4 details the main result, namely, the RDF characterization for sampled WSCS Gaussian processes. Numerical examples and discussions are provided in Section 5, and Section 6 concludes the paper.

2. Preliminaries and Background

In the following we review the main tools and framework used in this work: In Section 2.1 we detail the notations, and in Section 2.2 we review the basics of cyclostationary processes and the statistical properties of a DT process resulting from sampling a CT WSCS process. In Section 2.3 we recall some preliminaries of rate-distortion theory as well as the RDF for DT WSCS Gaussian source processes. This background creates a premise for the statement of the main result provided in Section 4 of this paper.

2.1. Notations

In this paper, random vectors are denoted by boldface uppercase letters, e.g., $\mathbf{X}$; boldface lowercase letters denote deterministic column vectors, e.g., $\mathbf{x}$. Scalar RVs and deterministic values are denoted via standard uppercase and lowercase fonts, respectively, e.g., $X$ and $x$. Scalar random processes are denoted with $X(t)$, $t \in \mathbb{R}$, for CT and with $X[n]$, $n \in \mathbb{Z}$, for DT. Uppercase sans-serif fonts represent matrices, e.g., $\mathsf{A}$, and the element at the $i$th row and the $l$th column of $\mathsf{A}$ is denoted with $(\mathsf{A})_{i,l}$. We use $|\cdot|$ to denote the absolute value, $\lfloor d \rfloor$, $d \in \mathbb{R}$, to denote the floor function, and $[d]^{+}$, $d \in \mathbb{R}$, to denote $\max\{0, d\}$. $\delta[\cdot]$ denotes the Kronecker delta function: $\delta[n] = 1$ for $n = 0$ and $\delta[n] = 0$ otherwise, and $\mathbb{E}\{\cdot\}$ denotes the stochastic expectation. The sets of positive integers, integers, rational numbers, real numbers, positive real numbers, and complex numbers are denoted by $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{R}^{++}$, and $\mathbb{C}$, respectively. The cumulative distribution function (CDF) is denoted by $F_X(x) \triangleq \Pr(X \le x)$ and the probability density function (PDF) of a CT random variable (RV) is denoted by $p_X(x)$. We represent a real Gaussian distribution with mean $\mu$ and variance $\sigma^2$ by the notation $\mathcal{N}(\mu, \sigma^2)$. All logarithms are taken to base 2, and $j = \sqrt{-1}$. Lastly, for any sequence $y[i]$, $i \in \mathbb{N}$, and positive integer $k \in \mathbb{N}$, $y^{(k)}$ denotes the column vector $(y[1], \ldots, y[k])^T$.

2.2. Wide-Sense Cyclostationary Random Processes

Here, we review some preliminaries from the theory of cyclostationarity. We begin by recalling the definition of wide-sense cyclostationary processes for CT and for DT:
Definition 1
(CT wide-sense cyclostationary processes ([3] Section 3.2.1)). A scalar stochastic process $\{S(t)\}_{t \in \mathbb{R}}$ is called WSCS if both its first-order and its second-order moments are periodic with respect to $t \in \mathbb{R}$ with some period $T_p \in \mathbb{R}$.
Definition 2
(DT wide-sense cyclostationary processes ([2] Section 17.2)). A scalar stochastic process $\{S[n]\}_{n \in \mathbb{Z}}$ is called WSCS if both its first-order and its second-order moments are periodic with respect to $n \in \mathbb{Z}$ with some period $N_p \in \mathbb{Z}$.
WSCS signals are thus random processes whose first-order and second-order moments are periodic functions with the same period. To define WSACS signals, we first recall the definition of almost-periodic functions:
Definition 3
(Almost-periodic functions ([19] Definition 2.1)). A DT function $x[n]$, $n \in \mathbb{Z}$, is called an almost-periodic function if for every $\epsilon > 0$ there exists a number $l(\epsilon) \in \mathbb{N}$ with the property that for any $n \in \mathbb{Z}$ and any $\alpha \in \mathbb{Z}$ there exists $\Delta \in [\alpha, \alpha + l(\epsilon)]$ such that
$$\left| x[n + \Delta] - x[n] \right| < \epsilon.$$
Definition 4
(DT wide-sense almost-cyclostationary processes ([2] Section 17.2)). A scalar stochastic process $\{S[n]\}_{n \in \mathbb{Z}}$ is called WSACS if its first-order and its second-order moments are almost-periodic functions with respect to $n \in \mathbb{Z}$.
Remark 1.
Note that when the mean and the autocorrelation function are each periodic with periods which are incommensurate, the resulting process is WSACS. We note that in many practical cases the mean is zero, see, e.g., ([2] Section 17.2); hence, the classification of the process is determined by the periodicity of the autocorrelation function.
The DT WSCS model is commonly used in the communications literature, as it facilitates the analysis of many problems of interest, such as fundamental rate limits analysis [20,21,22], channel identification [23], synchronization [24], and noise mitigation [25]. However, in many scenarios, the considered signals are WSACS rather than WSCS. To see how the WSACS model arises in the context of sampled signals, we briefly recall the discussion in [17] on sampled WSCS processes (please refer to ([17] Section II.B) for more details): Consider a CT WSCS random process $S(t)$, which is sampled uniformly with a sampling interval of $T_s$ and a sampling time offset $\phi$, resulting in the DT random process $S[i] = S(i \cdot T_s + \phi)$. It is well known that, contrary to stationary processes, whose statistical characteristics are time-invariant, the statistics of sampled WSCS processes are significantly affected by the values of $T_s$ and $\phi$ ([17] Section II.B). To demonstrate this point, consider a CT WSCS process with variance $\sigma_s^2(t) = \frac{1}{2} \sin(2 \pi t / T_{\mathrm{sym}}) + 2$ for some $T_{\mathrm{sym}} > 0$. The sampled process for $\phi = 0$ (no sampling time offset) and $T_s = \frac{T_{\mathrm{sym}}}{3}$ has a variance function whose period is $N_p = 3$: $\sigma_s^2(i T_s) = \{2, 2.433, 1.567, 2, 2.433, 1.567, \ldots\}$ for $i = 0, 1, 2, 3, 4, 5, \ldots$; the DT process obtained with the same sampling interval and a sampling time offset of $\phi = \frac{T_s}{2\pi}$ also has a periodic variance with $N_p = 3$, but with the values $\sigma_s^2(i T_s + \phi) = \{2.155, 2.335, 1.510, 2.155, 2.335, 1.510, \ldots\}$ for $i = 0, 1, 2, 3, 4, 5, \ldots$, which differ from the values of the DT variance for $\phi = 0$. It follows that both variances are periodic in discrete time with the same period $N_p = 3$, although with different values within the period as a result of the sampling time offset; yet both DT processes correspond to instances of synchronous sampling. Lastly, consider the sampled variance obtained by sampling without a time offset (i.e., $\phi = 0$) at a sampling interval of $T_s = \left(1 + \frac{1}{2\pi}\right) \frac{T_{\mathrm{sym}}}{3}$. In this case, $T_s$ is not an integer divisor of $T_{\mathrm{sym}}$ or of any of its integer multiples (i.e., $\frac{T_{\mathrm{sym}}}{T_s} = \frac{6\pi}{2\pi + 1} = 2 + \frac{2\pi - 2}{2\pi + 1} \triangleq 2 + \epsilon$, where $\epsilon \notin \mathbb{Q}$ and $\epsilon \in [0, 1)$), resulting in the variance values $\sigma_s^2(i T_s) = \{2, 2.335, 1.5027, 2.405, 1.896, 1.75, \ldots\}$ for $i = 0, 1, 2, 3, 4, 5$. In this scenario, the DT variance is not periodic but almost-periodic, corresponding to asynchronous sampling, and the resulting DT process is not WSCS but WSACS ([3] Section 3.2). The example above demonstrates that the statistical properties of sampled WSCS processes depend on the sampling rate and the sampling time offset, implying that the RDF of such processes should also depend on these quantities, as we demonstrate in the sequel.
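The following minimal sketch (our illustration, not code from the paper; the unit period $T_{\mathrm{sym}} = 1$ and the function names are our own choices) reproduces the variance sequences of this example numerically:

```python
# A minimal numerical sketch (ours) of the example above: the CT variance is
# sigma_s^2(t) = 0.5*sin(2*pi*t/T_sym) + 2, and we inspect the DT variance
# sequence sigma_s^2(i*T_s + phi) for different T_s and phi.
import numpy as np

T_sym = 1.0  # period of the CT statistics (arbitrary units)

def ct_variance(t):
    return 0.5 * np.sin(2 * np.pi * t / T_sym) + 2

def dt_variance(T_s, phi, num_samples):
    """Variance of the sampled process S[i] = S_c(i*T_s + phi)."""
    i = np.arange(num_samples)
    return ct_variance(i * T_s + phi)

# Synchronous sampling (T_sym/T_s = 3): the DT variance is periodic with N_p = 3.
print(np.round(dt_variance(T_sym / 3, 0.0, 6), 3))
# -> [2.    2.433 1.567 2.    2.433 1.567]
print(np.round(dt_variance(T_sym / 3, (T_sym / 3) / (2 * np.pi), 6), 3))
# -> same period N_p = 3, but different values within the period

# Asynchronous sampling (T_sym/T_s irrational): the DT variance is only
# almost-periodic, so the sampled process is WSACS rather than WSCS.
print(np.round(dt_variance((1 + 1 / (2 * np.pi)) * T_sym / 3, 0.0, 6), 3))
```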

2.3. The Rate-Distortion Function for DT WSCS Processes

In this subsection we review the source coding problem and the existing results on the RDF of WSCS processes. We begin by recalling the definition of a source coding scheme, see, e.g., ([26] Chapter 30), ([5] Chapter 10):
Definition 5
(Source coding scheme). A source coding scheme with blocklength l consists of (see Figure 1):
1. 
An encoder $f_S$ which maps a block of $l$ source samples $\{S[i]\}_{i=1}^{l}$ into an index from a set of $M = 2^{lR}$ indexes, $f_S : \{S[i]\}_{i=1}^{l} \mapsto \{1, 2, \ldots, M\}$.
2. 
A decoder $g_S$ which maps the received index into a reconstructed sequence of length $l$, $\{\hat{S}[i]\}_{i=1}^{l}$, $g_S : \{1, 2, \ldots, M\} \mapsto \{\hat{S}[i]\}_{i=1}^{l}$.
The encoder-decoder pair is referred to as an $(R, l)$ source code, where $R$ is the rate of the code in bits per source symbol, defined as:
$$R = \frac{1}{l} \log_2 M. \quad (1)$$
The RDF characterizes the minimal average number of bits per source symbol, denoted $R(D)$, that can be used to encode a source process such that it can be reconstructed from its encoded representation with a recovery distortion not larger than $D > 0$ ([5] Section 10.2). In the current work, we use the MSE distortion measure, under which the distortion due to decoding a source symbol $S$ into $\hat{S}$ is $d(S, \hat{S}) = (S - \hat{S})^2$. The distortion for a sequence of source samples $S^{(l)}$ decoded into a reproduction sequence $\hat{S}^{(l)}$ is given by $d(S^{(l)}, \hat{S}^{(l)}) = \frac{1}{l} \sum_{i=1}^{l} (S[i] - \hat{S}[i])^2$, and the average distortion in decoding a random source sequence $S^{(l)}$ into a random reproduction sequence $\hat{S}^{(l)}$ is defined as:
$$\bar{d}\left(S^{(l)}, \hat{S}^{(l)}\right) \triangleq \mathbb{E}\left\{ d\left(S^{(l)}, \hat{S}^{(l)}\right) \right\} = \frac{1}{l} \sum_{i=1}^{l} \mathbb{E}\left\{ \left(S[i] - \hat{S}[i]\right)^2 \right\}, \quad (2)$$
where the expectation in Equation (2) is taken with respect to the joint probability distribution of the source $S[i]$ and its reproduction $\hat{S}[i]$. Using Definition 5, we can now define an achievable rate-distortion pair for a source $S[i]$, as stated in the following definition ([10] Page 471):
Definition 6
(Achievable rate-distortion pair). A rate-distortion pair $(R, D)$ is achievable for a process $\{S[i]\}_{i \in \mathbb{N}}$ if for any $\eta > 0$ and for all sufficiently large $l$ it is possible to construct an $(R_s, l)$ source code such that
$$R_s \le R + \eta, \quad (3)$$
and
$$\bar{d}\left(S^{(l)}, \hat{S}^{(l)}\right) \le D + \eta. \quad (4)$$
Definition 7.
The rate-distortion function $R(D)$ is defined as the infimum of all achievable rates $R$ for a given maximum allowed distortion $D$.
Definition 6 defines a rate-distortion pair to be achievable if the rate and the distortion constraints are satisfied using source codes with any sufficiently large blocklength. In the following lemma, which will be used to characterize the RDF of DT WSCS signals, we state that it is sufficient to consider only source codes whose blocklength is an integer multiple of some fixed positive integer:
Lemma 1.
Consider the process $\{S[i]\}_{i \in \mathbb{N}}$ with a finite and bounded variance. For a given maximum allowed distortion $D$, the optimal reproduction process $\{\hat{S}[i]\}_{i \in \mathbb{N}}$ is also the optimal reproduction process when restricted to using source codes whose blocklengths are integer multiples of some fixed positive integer $r$.
Proof. 
The proof of the lemma is detailed in Appendix A. □
This lemma facilitates switching between multivariate and scalar representations of the source and the reproduction processes.
The RDF obviously depends on the distribution of the source { S [ i ] } i N . Thus, statistically different sources have different RDFs. However, when a source is scaled by some positive constant, the RDF of the scaled process with the MSE distortion can be inferred from that of the original source process, as stated in the following theorem:
Theorem 1.
Let $\{S[i]\}_{i \in \mathbb{N}}$ be a source process for which the rate-distortion pair $(R, D)$ is achievable under the MSE distortion. Then, for every $\alpha \in \mathbb{R}^{++}$, it holds that the rate-distortion pair $(R, \alpha^2 \cdot D)$ is achievable for the scaled source $\{\alpha \cdot S[i]\}_{i \in \mathbb{N}}$.
Proof. 
The proof of the theorem is detailed in Appendix B. □
Lastly, in the proof of our main result, we make use of the RDF for DT WSCS sources derived in ([12] Theorem 1), repeated below for ease of reference. Prior to the statement of the theorem, we recall that for blocklengths which are integer multiples of $N_p$, a WSCS process $S[i]$ with period $N_p > 0$ can be represented as an equivalent $N_p$-dimensional stationary process $\mathbf{S}^{(N_p)}[i]$ via the decimated component decomposition (DCD) ([2] Section 17.2). The power spectral density (PSD) of the process $\mathbf{S}^{(N_p)}[i]$ is defined as ([12] Section II):
$$\left( \rho_S\left(e^{j 2 \pi f}\right) \right)_{u,v} = \sum_{\Delta \in \mathbb{Z}} \left( \mathsf{R}_S[\Delta] \right)_{u,v} e^{-j 2 \pi f \Delta}, \quad -\frac{1}{2} \le f \le \frac{1}{2}, \quad u, v \in \{1, 2, \ldots, N_p\}, \quad (5)$$
where $\mathsf{R}_S[\Delta] \triangleq \mathbb{E}\left\{ \mathbf{S}^{(N_p)}[i] \cdot \left( \mathbf{S}^{(N_p)}[i + \Delta] \right)^T \right\}$ ([2] Section 17.2). We now proceed to the statement of ([12] Theorem 1):
Theorem 2.
([12] Theorem 1) Consider a zero-mean real DT WSCS Gaussian source $S[i]$, $i \in \mathbb{N}$, with memory, and let $N_p \in \mathbb{N}$ denote the period of its statistics. The RDF is expressed as:
$$R(D) = \frac{1}{2 N_p} \sum_{m=1}^{N_p} \int_{-0.5}^{0.5} \left[ \log \frac{\lambda_m\left(e^{j 2 \pi f}\right)}{\theta} \right]^{+} df, \quad (6a)$$
where $\lambda_m\left(e^{j 2 \pi f}\right)$, $m = 1, 2, \ldots, N_p$, denote the eigenvalues of the PSD matrix of the process $\mathbf{S}^{(N_p)}[i]$, which is obtained from $S[i]$ by applying the $N_p$-dimensional DCD, and $\theta$ is selected such that
$$D = \frac{1}{N_p} \sum_{m=1}^{N_p} \int_{-0.5}^{0.5} \min\left\{ \lambda_m\left(e^{j 2 \pi f}\right), \theta \right\} df. \quad (6b)$$
We note that $\mathbf{S}^{(N_p)}[i]$ corresponds to a vector of stationary processes whose elements are not identically distributed; hence, the variance is different for each scalar stationary component process. Using ([12] Theorem 1), we can directly obtain the RDF for the special case of a DT memoryless WSCS Gaussian process. This is stated in the following corollary:
Corollary 1.
Let $\{S[i]\}_{i \in \mathbb{N}}$ be a zero-mean DT memoryless real WSCS Gaussian source with period $N_p \in \mathbb{N}$, and set $\sigma_m^2 = \mathbb{E}\{S^2[m]\}$ for $m = 1, 2, \ldots, N_p$. The RDF for compression of $S[i]$ is expressed as:
$$R(D) = \begin{cases} \frac{1}{2 N_p} \sum_{m=1}^{N_p} \log\left( \frac{\sigma_m^2}{D_m} \right) & D \le \frac{1}{N_p} \sum_{m=1}^{N_p} \sigma_m^2 \\ 0 & D > \frac{1}{N_p} \sum_{m=1}^{N_p} \sigma_m^2 \end{cases}, \quad (7a)$$
where $D_m \triangleq \min\{\sigma_m^2, \theta\}$, and $\theta$ is defined such that
$$D = \frac{1}{N_p} \sum_{m=1}^{N_p} D_m. \quad (7b)$$
Proof. 
Applying Equations (6a) and (6b) to our specific case of a memoryless WSCS source, we obtain Equations (7a) and (7b) as follows: First, note that the DCD components of a zero-mean memoryless WSCS process are also zero-mean and memoryless; hence, the PSD matrix of the multivariate process $\mathbf{S}^{(N_p)}[i]$ is a diagonal matrix whose eigenvalues are the constant diagonal elements, such that the $m$th diagonal element is equal to the variance $\sigma_m^2$: $\lambda_m\left(e^{j 2 \pi f}\right) = \sigma_m^2$. Now, writing Equation (6a) for this case we obtain:
$$R(D) = \frac{1}{2 N_p} \sum_{m=1}^{N_p} \int_{-0.5}^{0.5} \left[ \log \frac{\lambda_m\left(e^{j 2 \pi f}\right)}{\theta} \right]^{+} df = \frac{1}{2 N_p} \sum_{m=1}^{N_p} \left[ \log \frac{\sigma_m^2}{\theta} \right]^{+}. \quad (8)$$
Since $\left[ \log\left( \frac{\sigma_m^2}{\theta} \right) \right]^{+} = \max\left\{ 0, \log\left( \frac{\sigma_m^2}{\theta} \right) \right\} = \log\left( \frac{\sigma_m^2}{D_m} \right)$, it follows that (8) coincides with (7a). Next, expressing Equation (6b) for the memoryless source process, we obtain:
$$D = \frac{1}{N_p} \sum_{m=1}^{N_p} \int_{-0.5}^{0.5} \min\left\{ \lambda_m\left(e^{j 2 \pi f}\right), \theta \right\} df = \frac{1}{N_p} \sum_{m=1}^{N_p} \min\left\{ \sigma_m^2, \theta \right\}, \quad (9)$$
proving Equation (7b). □
Now, from Lemma 1, we conclude that the RDF for compression of source sequences whose blocklength is an integer multiple of $N_p$ is the same as the RDF for compressing source sequences whose blocklength is arbitrary. We recall that from ([5] Chapter 10.3.3) it follows that for the zero-mean memoryless Gaussian DCD vector source process $\mathbf{S}^{(N_p)}[i]$, the optimal reproduction process which achieves the RDF is an $N_p \times 1$ memoryless process whose covariance matrix is diagonal with non-identical diagonal entries. From [2], we can apply the inverse DCD to obtain a WSCS process. Hence, from Lemma 1 we can conclude that the optimal reproduction process for the DT WSCS Gaussian source is a DT WSCS Gaussian process.
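To make Corollary 1 concrete, the following sketch (our implementation, not code from the paper; the function name and the bisection depth are our choices) computes $R(D)$ of (7a) by numerically solving the reverse waterfilling condition (7b) for $\theta$:

```python
# Hedged sketch (ours) of the reverse waterfilling in Corollary 1: given the
# per-cycle variances sigma_m^2 of a memoryless WSCS Gaussian source and a
# distortion D, find theta via bisection so that
# D = (1/N_p) * sum_m min{sigma_m^2, theta}, then evaluate R(D).
import numpy as np

def rdf_memoryless_wscs(variances, D):
    """R(D) in bits per source sample for a zero-mean memoryless WSCS
    Gaussian source with the given periodic variance values (Corollary 1)."""
    variances = np.asarray(variances, dtype=float)
    if D >= variances.mean():          # above the average variance: R(D) = 0
        return 0.0
    lo, hi = 0.0, variances.max()      # theta lies in this interval
    for _ in range(200):               # bisection on the waterfilling level
        theta = 0.5 * (lo + hi)
        if np.minimum(variances, theta).mean() < D:
            lo = theta
        else:
            hi = theta
    D_m = np.minimum(variances, theta)
    return 0.5 * np.mean(np.log2(variances / D_m))

# Example: the synchronously sampled variances {2, 2.433, 1.567} from Section 2.2.
print(rdf_memoryless_wscs([2.0, 2.433, 1.567], D=0.18))
```

Note that in the low distortion regime $D < \min_m \sigma_m^2$, the bisection simply returns $\theta = D$, consistent with Remark 2 below.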

3. Problem Formulation and Auxiliary Results

Our objective is to characterize the RDF for compression of asynchronously sampled CT WSCS Gaussian sources when the sampling interval is larger than the memory of the source. In particular, we focus on the minimal rate required to achieve a high-fidelity reproduction, representing the RDF curve for distortion values not larger than the variance of the source. Such a characterization of the RDF for asynchronous sampling is essential for comprehending the relationship between the minimal required number of bits and the sampling rate at a given distortion. Our analysis constitutes an important step towards constructing joint source-channel coding schemes for scenarios in which the symbol rate of the transmitter is not necessarily synchronized with the sampling rate of the source to be transmitted. Such scenarios arise, for example, when recording a communications signal for storage or processing, or in compress-and-forward relaying (([26] Chapter 16.7), [27]), in which the relay compresses the sampled received signal, which is then forwarded to the assisted receiver. As the relay operates with its own sampling clock, which need not be synchronized with the symbol rate of the assisted transmitter, sampling at the relay may result in a DT WSACS source signal. In the following, we first characterize the sampled source model in Section 3.1. Then, as a preliminary step towards our characterization of the RDF for asynchronously sampled CT WSCS Gaussian processes stated in Section 4, we recall in Section 3.2 the definitions of some information-spectrum quantities used in this study. Finally, in Section 3.3, we recall an auxiliary result relating the information-spectrum quantities of a collection of sequences of RVs to those of its limit sequence of RVs. This result will be applied in the derivation of the RDF with asynchronous sampling.

3.1. Source Model

Consider a real CT, zero-mean WSCS Gaussian random process $S_c(t)$ with period $T_{\mathrm{ps}}$. Let the variance function of $S_c(t)$ be defined as $\sigma_{S_c}^2(t) \triangleq \mathbb{E}\{S_c^2(t)\}$, and assume it is both upper bounded and lower bounded away from zero, and that it is continuous in $t \in \mathbb{R}$. Let $\tau_m > 0$ denote the maximal correlation length of $S_c(t)$, i.e., $r_{S_c}(t, \tau) \triangleq \mathbb{E}\{S_c(t) S_c(t - \tau)\} = 0$ for all $|\tau| > \tau_m$. By the cyclostationarity of $S_c(t)$, we have that $\sigma_{S_c}^2(t) = \sigma_{S_c}^2(t + T_{\mathrm{ps}})$ for all $t \in \mathbb{R}$. Let $S_c(t)$ be sampled uniformly with the sampling interval $T_s > 0$ such that $T_{\mathrm{ps}} = (p + \epsilon) \cdot T_s$ for $p \in \mathbb{N}$ and $\epsilon \in [0, 1)$, yielding $S_\epsilon[i] \triangleq S_c(i \cdot T_s)$, where $i \in \mathbb{Z}$. The variance of $S_\epsilon[i]$ is given by $\sigma_{S_\epsilon}^2[i] \triangleq r_{S_\epsilon}[i, 0] = \sigma_{S_c}^2\left( i \cdot \frac{T_{\mathrm{ps}}}{p + \epsilon} \right)$.
In this work, as in [17], we assume that the duration of temporal correlation of the CT signal is shorter than the sampling interval $T_s$, namely, $\tau_m < T_s$. Consequently, the DT Gaussian process $S_\epsilon[i]$ is a memoryless zero-mean Gaussian process and its autocorrelation function is given by:
$$r_{S_\epsilon}[i, \Delta] = \mathbb{E}\left\{ S_\epsilon[i] S_\epsilon[i + \Delta] \right\} = \mathbb{E}\left\{ S_c\left( i \cdot \tfrac{T_{\mathrm{ps}}}{p + \epsilon} \right) \cdot S_c\left( (i + \Delta) \cdot \tfrac{T_{\mathrm{ps}}}{p + \epsilon} \right) \right\} = \sigma_{S_c}^2\left( i \cdot \tfrac{T_{\mathrm{ps}}}{p + \epsilon} \right) \cdot \delta[\Delta] = \sigma_{S_\epsilon}^2[i] \cdot \delta[\Delta]. \quad (10)$$
While we do not explicitly account for sampling time offsets in our definition of the sampled process $S_\epsilon[i]$, an offset can be incorporated by replacing $\sigma_{S_c}^2(t)$ with a time-shifted version, i.e., $\sigma_{S_c}^2(t - \phi)$, see also ([17] Section II.C).
It can be noted from (10) that if $\epsilon$ is a rational number, i.e., there exist $u, v \in \mathbb{N}$, $u$ and $v$ relatively prime, such that $\epsilon = \frac{u}{v}$, then $\{S_\epsilon[i]\}_{i \in \mathbb{Z}}$ is a DT memoryless WSCS process with period $p_{u,v} = p \cdot v + u \in \mathbb{N}$ ([17] Section II.C). For this class of processes, the RDF can be obtained from ([12] Theorem 1), as stated in Corollary 1. On the other hand, if $\epsilon$ is an irrational number, then sampling is asynchronous and yields a WSACS process, whose RDF has not been characterized to date.
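As a small illustration (ours, under the assumptions of this subsection), the DT period under synchronous sampling follows directly from the rational mismatch $\epsilon = u/v$:

```python
# Hedged sketch (ours): for a rational mismatch eps = u/v in lowest terms,
# the sampled process is WSCS with DT period p_{u,v} = p*v + u; an irrational
# eps yields a WSACS process instead.
from fractions import Fraction

def dt_period(p, eps):
    """DT period p_{u,v} for rational eps = u/v in lowest terms."""
    f = Fraction(eps)
    return p * f.denominator + f.numerator

print(dt_period(2, Fraction(1, 2)))  # T_ps = 2.5*T_s  ->  p_{1,2} = 5
print(dt_period(2, Fraction(0)))     # T_ps = 2*T_s    ->  p_{0,1} = 2
```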

3.2. Definitions of Relevant Information-Spectrum Quantities

Conventional information-theoretic tools for characterizing RDFs rely on an underlying ergodicity of the source. Consequently, these techniques cannot be applied to characterize the RDF of asynchronously sampled WSCS processes. To tackle this challenge, we use the information-spectrum framework [14], which can be utilized to obtain general formulas for rate limits for any arbitrary class of processes. The resulting expressions are not restricted to specific statistical models of the considered processes and, in particular, do not require information stability or stationarity. In the following, we recall the definitions of several information-spectrum quantities used in this study, see also ([14] Definitions 1.3.1 and 1.3.2):
Definition 8.
The limit-inferior in probability of a sequence of real RVs $\{Z_k\}_{k \in \mathbb{N}}$ is defined as
$$\operatorname{p-}\liminf_{k \to \infty} Z_k \triangleq \sup\left\{ \alpha \in \mathbb{R} \,\middle|\, \lim_{k \to \infty} \Pr(Z_k < \alpha) = 0 \right\} \triangleq \alpha_0. \quad (11)$$
Hence, $\alpha_0$ is the largest real number such that for every $\tilde{\alpha} < \alpha_0$ and every $\mu > 0$ there exists $k_0(\mu, \tilde{\alpha}) \in \mathbb{N}$ for which $\Pr(Z_k < \tilde{\alpha}) < \mu$ for all $k > k_0(\mu, \tilde{\alpha})$.
Definition 9.
The limit-superior in probability of a sequence of real RVs $\{Z_k\}_{k \in \mathbb{N}}$ is defined as
$$\operatorname{p-}\limsup_{k \to \infty} Z_k \triangleq \inf\left\{ \beta \in \mathbb{R} \,\middle|\, \lim_{k \to \infty} \Pr(Z_k > \beta) = 0 \right\} \triangleq \beta_0. \quad (12)$$
Hence, $\beta_0$ is the smallest real number such that for every $\tilde{\beta} > \beta_0$ and every $\mu > 0$ there exists $k_0(\mu, \tilde{\beta}) \in \mathbb{N}$ for which $\Pr(Z_k > \tilde{\beta}) < \mu$ for all $k > k_0(\mu, \tilde{\beta})$.
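As a concrete illustration of Definitions 8 and 9, the following toy sketch (our own, not from the paper; the sequence choice and sample sizes are arbitrary) numerically estimates $\Pr(Z_k > \beta)$ for $Z_k$ given by the sample mean of $k$ IID standard Gaussians, for which the p-liminf and p-limsup both equal 0:

```python
# Toy illustration (ours) of Definitions 8 and 9: let Z_k be the sample mean
# of k IID N(0,1) RVs, so Z_k ~ N(0, 1/k). Then Pr(Z_k > beta) -> 0 for every
# beta > 0 and Pr(Z_k < alpha) -> 0 for every alpha < 0, hence
# p-liminf Z_k = p-limsup Z_k = 0.
import numpy as np

rng = np.random.default_rng(seed=0)

def prob_exceeds(k, beta, trials=200_000):
    Z_k = rng.normal(0.0, 1.0 / np.sqrt(k), size=trials)  # Z_k ~ N(0, 1/k)
    return np.mean(Z_k > beta)

for k in (10, 100, 1000):
    print(k, prob_exceeds(k, beta=0.1))  # decays toward 0 as k grows
```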
The notion of uniform integrability of a sequence of RVs is a basic property in probability ([28] Chapter 12), which is not directly related to information spectrum methods. However, since it plays an important role in the information spectrum characterization of RDFs, we include its statement in the following definition:
Definition 10
(Uniform integrability ([28] Definition 12.1), ([14] Equation (5.3.2))). The sequence of real-valued random variables $\{Z_k\}_{k=1}^{\infty}$ is said to be uniformly integrable if
$$\lim_{u \to \infty} \sup_{k \ge 1} \int_{z : |z| > u} p_{Z_k}(z) \, |z| \, dz = 0. \quad (13)$$
The aforementioned quantities facilitate characterizing the RDF of arbitrary sources. Consider a general source process $\{S[i]\}_{i=1}^{\infty}$ (stationary or non-stationary) taking values from the source alphabet $\mathcal{S}$ and a reproduction process $\{\hat{S}[i]\}_{i=1}^{\infty}$ with values from the reproduction alphabet $\hat{\mathcal{S}}$. It follows from ([14] Section 5.5) that for a distortion measure which satisfies the uniform integrability criterion, i.e., for which there exists a deterministic sequence $\{r[i]\}_{i=1}^{\infty}$ such that the sequence of RVs $\left\{ d\left(S^{(k)}, r^{(k)}\right) \right\}_{k=1}^{\infty}$ satisfies Definition 10 ([14] Page 336), the RDF is expressed as ([14] Equation (5.4.2)):
$$R(D) = \inf_{F_{S,\hat{S}} \,:\, \bar{d}_S\left(S^{(k)}, \hat{S}^{(k)}\right) \le D} \bar{I}\left(S^{(k)}; \hat{S}^{(k)}\right), \quad (14)$$
where $\bar{d}_S\left(S^{(k)}, \hat{S}^{(k)}\right) = \limsup_{k \to \infty} \mathbb{E}\left\{ d\left(S^{(k)}, \hat{S}^{(k)}\right) \right\}$, $F_{S,\hat{S}}$ denotes the joint CDF of $\{S[i]\}_{i=1}^{\infty}$ and $\{\hat{S}[i]\}_{i=1}^{\infty}$, and $\bar{I}\left(S^{(k)}; \hat{S}^{(k)}\right)$ represents the limit superior in probability of the mutual information rate of $S^{(k)}$ and $\hat{S}^{(k)}$, given by:
$$\bar{I}\left(S^{(k)}; \hat{S}^{(k)}\right) \triangleq \operatorname{p-}\limsup_{k \to \infty} \frac{1}{k} \log \frac{p_{S^{(k)} | \hat{S}^{(k)}}\left(S^{(k)} \,\middle|\, \hat{S}^{(k)}\right)}{p_{S^{(k)}}\left(S^{(k)}\right)}. \quad (15)$$
In order to use the RDF characterization in (14), the distortion measure must satisfy the uniform integrability criterion. For the considered class of sources detailed in Section 3.1, the MSE distortion satisfies this criterion, as stated in the following lemma:
Lemma 2.
For any real memoryless zero-mean Gaussian source $\{S[i]\}_{i=1}^{\infty}$ with bounded variance, i.e., there exists $\sigma_{\max}^2 < \infty$ such that $\mathbb{E}\{S^2[i]\} \le \sigma_{\max}^2$ for all $i \in \mathbb{N}$, the MSE distortion satisfies the uniform integrability criterion.
Proof. 
Set the deterministic sequence $\{r[i]\}_{i=1}^{\infty}$ to be the all-zero sequence. Under this setting and the MSE distortion, it holds that $d\left(S^{(k)}, r^{(k)}\right) = \frac{1}{k} \sum_{i=1}^{k} S^2[i]$. To prove the lemma, we show that the sequence of RVs $\left\{ d\left(S^{(k)}, r^{(k)}\right) \right\}_{k=1}^{\infty}$ has a bounded $\mathcal{L}_2$ norm, which implies that it is uniformly integrable by ([28] Corollary 12.8). The $\mathcal{L}_2$ norm of $d\left(S^{(k)}, r^{(k)}\right)$ satisfies
$$\mathbb{E}\left\{ d\left(S^{(k)}, r^{(k)}\right)^2 \right\} = \frac{1}{k^2} \mathbb{E}\left\{ \sum_{i=1}^{k} S^2[i] \sum_{j=1}^{k} S^2[j] \right\} = \frac{1}{k^2} \sum_{i=1}^{k} \sum_{j=1}^{k} \mathbb{E}\left\{ S^2[i] S^2[j] \right\} \stackrel{(a)}{\le} \frac{1}{k^2} \sum_{i=1}^{k} \sum_{j=1}^{k} 3 \sigma_{\max}^4 = 3 \sigma_{\max}^4, \quad (16)$$
where $(a)$ follows since $\mathbb{E}\{S^2[i] S^2[j]\} = \mathbb{E}\{S^2[i]\} \mathbb{E}\{S^2[j]\} \le \sigma_{\max}^4$ for $i \ne j$, while $\mathbb{E}\{S^4[i]\} \le 3 \sigma_{\max}^4$ ([29] Chapter 5.4). Equation (16) proves that $d\left(S^{(k)}, r^{(k)}\right)$ is $\mathcal{L}_2$-bounded by $3 \sigma_{\max}^4 < \infty$ for all $k \in \mathbb{N}$, which in turn implies that the MSE distortion is uniformly integrable for the source $\{S[i]\}_{i=1}^{\infty}$. □
Since, as detailed in Section 3.1, we focus in the following on memoryless zero-mean Gaussian sources, Lemma 2 implies that the RDF of the source can be characterized using (14). However, (14) is in general difficult to evaluate, and thus does not lead to a meaningful understanding of how the RDF of sampled WSCS sources behaves, motivating our analysis in Section 4.

3.3. Information Spectrum Limits

The following theorem, originally stated in ([17] Theorem 1), presents a fundamental result which is directly useful for the derivation of the RDF:
Theorem 3.
([17] Theorem 1) Let $\{\tilde{Z}_{k,n}\}_{n,k \in \mathbb{N}}$ be a set of sequences of real scalar RVs satisfying two assumptions:
AS1 
For every fixed $n \in \mathbb{N}$, every convergent subsequence of $\{\tilde{Z}_{k,n}\}_{k \in \mathbb{N}}$ converges in distribution, as $k \to \infty$, to a finite deterministic scalar. Each subsequence may converge to a different scalar.
AS2 
For every fixed $k \in \mathbb{N}$, the sequence $\{\tilde{Z}_{k,n}\}_{n \in \mathbb{N}}$ converges uniformly in distribution, as $n \to \infty$, to a scalar real-valued RV $Z_k$. Specifically, letting $\tilde{F}_{k,n}(\alpha)$ and $F_k(\alpha)$, $\alpha \in \mathbb{R}$, denote the CDFs of $\tilde{Z}_{k,n}$ and of $Z_k$, respectively, it follows from AS2 that for every $\eta > 0$ there exists $n_0(\eta)$ such that for every $n > n_0(\eta)$,
$$\left| \tilde{F}_{k,n}(\alpha) - F_k(\alpha) \right| < \eta,$$
for each $\alpha \in \mathbb{R}$ and $k \in \mathbb{N}$.
Then, for $\{\tilde{Z}_{k,n}\}_{n,k \in \mathbb{N}}$ it holds that
$$\operatorname{p-}\liminf_{k \to \infty} Z_k = \lim_{n \to \infty} \operatorname{p-}\liminf_{k \to \infty} \tilde{Z}_{k,n}, \quad (17a)$$
$$\operatorname{p-}\limsup_{k \to \infty} Z_k = \lim_{n \to \infty} \operatorname{p-}\limsup_{k \to \infty} \tilde{Z}_{k,n}. \quad (17b)$$
Proof. 
In Appendix C we explicitly prove Equation (17b). This complements the proof in ([17] Appendix A) which explicitly considers only (17a). □

4. Rate-Distortion Characterization for Sampled CT WSCS Gaussian Sources

4.1. Main Result

Using the information-spectrum based characterization of the RDF (14), combined with the characterization of the limit of a sequence of information-spectrum quantities in Theorem 3, we now analyze the RDF of asynchronously sampled WSCS processes. Our analysis is based on constructing a sequence of synchronously sampled WSCS processes, whose RDF is given in Corollary 1. Then, we show that the RDF of the asynchronously sampled process can be obtained as the limit superior of the computable RDFs of the sequence of synchronously sampled processes. We begin by letting $\epsilon_n \triangleq \frac{\lfloor n \cdot \epsilon \rfloor}{n}$ for $n \in \mathbb{N}$ and defining a Gaussian source process $S_n[i] = S_c\left( i \cdot \frac{T_{\mathrm{ps}}}{p + \epsilon_n} \right)$. From the discussion in Section 3.1 (see also ([17] Section II.C)), it follows that since $\epsilon_n$ is rational, $S_n[i]$ is a WSCS process and its period is given by $p_n = p \cdot n + \lfloor n \cdot \epsilon \rfloor$. Accordingly, the periodic correlation function of $S_n[i]$ can be obtained similarly to (10) as:
$$r_{S_n}[i, \Delta] = \mathbb{E}\left\{ S_n[i] S_n[i + \Delta] \right\} = \sigma_{S_c}^2\left( i \cdot \frac{T_{\mathrm{ps}}}{p + \epsilon_n} \right) \cdot \delta[\Delta]. \quad (18)$$
Due to the cyclostationarity of $S_n[i]$, we have that $r_{S_n}[i, \Delta] = r_{S_n}[i + p_n, \Delta]$ for all $i, \Delta \in \mathbb{Z}$, and we let $\sigma_{S_n}^2[i] \triangleq r_{S_n}[i, 0]$ denote its periodic variance.
We next restate Corollary 1 in terms of ϵ n as follows:
Proposition 1.
Consider a DT, memoryless, zero-mean WSCS Gaussian random process $S_n[i]$ with variance $\sigma_{S_n}^2[i]$, obtained from $S_c(t)$ by sampling with a sampling interval of $T_s(n) = \frac{T_{\mathrm{ps}}}{p + \epsilon_n}$. Let $\mathbf{S}_n^{(p_n)}[i]$ denote the memoryless stationary multivariate random process obtained by applying the DCD to $S_n[i]$, and let $\sigma_{S_n}^2[m]$, $m = 1, 2, \ldots, p_n$, denote the variance of the $m$th component of $\mathbf{S}_n^{(p_n)}[i]$. The rate-distortion function is given by:
$$R_n(D) = \begin{cases} \frac{1}{2 p_n} \sum_{m=1}^{p_n} \log\left( \frac{\sigma_{S_n}^2[m]}{D_n[m]} \right) & D \le \frac{1}{p_n} \sum_{m=1}^{p_n} \sigma_{S_n}^2[m] \\ 0 & D > \frac{1}{p_n} \sum_{m=1}^{p_n} \sigma_{S_n}^2[m] \end{cases}, \quad (19a)$$
where for $D \le \frac{1}{p_n} \sum_{m=1}^{p_n} \sigma_{S_n}^2[m]$ we let $D_n[m] \triangleq \min\{\sigma_{S_n}^2[m], \theta_n\}$, and $\theta_n$ is selected such that
$$D = \frac{1}{p_n} \sum_{m=1}^{p_n} D_n[m]. \quad (19b)$$
We recall that the RDF of $S_n[i]$ is characterized in Proposition 1 via the RDF of the multivariate stationary process $\mathbf{S}_n^{(p_n)}[i]$, obtained by applying the $p_n$-dimensional DCD to $S_n[i]$. Next, we recall that the relationship between the source process $\mathbf{S}_n^{(p_n)}[i]$ and the optimal reconstruction process, denoted by $\hat{\mathbf{S}}_n^{(p_n)}[i]$, is characterized in ([5] Chapter 10.3.3) via a linear, multivariate, time-invariant backward channel with a $p_n \times 1$ additive vector noise process $\mathbf{W}_n^{(p_n)}[i]$, and is given by:
$$\mathbf{S}_n^{(p_n)}[i] = \hat{\mathbf{S}}_n^{(p_n)}[i] + \mathbf{W}_n^{(p_n)}[i], \quad i \in \mathbb{N}. \quad (20)$$
It also follows from ([5] Section 10.3.3) that for the IID Gaussian multivariate process whose entries are independent and distributed via $\left( \mathbf{S}_n^{(p_n)}[i] \right)_m \sim \mathcal{N}\left(0, \sigma_{S_n}^2[m]\right)$, $m \in \{1, 2, \ldots, p_n\}$, the optimal reconstruction vector process $\hat{\mathbf{S}}_n^{(p_n)}[i]$ and the corresponding noise vector process $\mathbf{W}_n^{(p_n)}[i]$ each follow a multivariate Gaussian distribution:
$$\hat{\mathbf{S}}_n^{(p_n)}[i] \sim \mathcal{N}\left( \mathbf{0}, \operatorname{diag}\left( \sigma_{\hat{S}_n}^2[1], \ldots, \sigma_{\hat{S}_n}^2[p_n] \right) \right) \quad \text{and} \quad \mathbf{W}_n^{(p_n)}[i] \sim \mathcal{N}\left( \mathbf{0}, \operatorname{diag}\left( D_n[1], \ldots, D_n[p_n] \right) \right),$$
where $D_n[m] \triangleq \min\{\sigma_{S_n}^2[m], \theta_n\}$; here $\theta_n$ denotes the reverse waterfilling threshold defined in Proposition 1 for the index $n$, and is selected such that $D = \frac{1}{p_n} \sum_{m=1}^{p_n} D_n[m]$. The optimal reconstruction process $\hat{\mathbf{S}}_n^{(p_n)}[i]$ and the noise process $\mathbf{W}_n^{(p_n)}[i]$ are mutually independent, and for each $m \in \{1, 2, \ldots, p_n\}$ it holds that $\mathbb{E}\left\{ \left( \mathbf{S}_n^{(p_n)}[i] - \hat{\mathbf{S}}_n^{(p_n)}[i] \right)_m^2 \right\} = D_n[m]$, see ([5] Chapters 10.3.2 and 10.3.3). The multivariate relationship between stationary processes in (20) can be transformed into an equivalent linear relationship between cyclostationary Gaussian memoryless processes via the inverse DCD transformation ([2] Section 17.2) applied to each of the processes, resulting in:
$$S_n[i] = \hat{S}_n[i] + W_n[i], \quad i \in \mathbb{N}. \quad (21)$$
We are now ready to state our main result: the RDF of asynchronously sampled DT sources $S_\epsilon[i]$, $\epsilon \notin \mathbb{Q}$, in the low MSE regime, i.e., when the distortion $D$ is smaller than the minimal variance of the source. The RDF is stated in the following theorem, which applies both to synchronous sampling and to asynchronous sampling:
Theorem 4.
Consider a DT source $\{S_\epsilon[i]\}_{i=1}^{\infty}$ obtained by sampling a CT WSCS source whose period of statistics is $T_{\mathrm{ps}}$, at intervals $T_s$. Then, for any distortion constraint $D$ such that $D < \min_{0 \le t \le T_{\mathrm{ps}}} \sigma_{S_c}^2(t)$ and any $\epsilon \in [0, 1)$, the RDF $R_\epsilon(D)$ for compressing $\{S_\epsilon[i]\}_{i=1}^{\infty}$ can be obtained as the limit:
$$R_\epsilon(D) = \limsup_{n \to \infty} R_n(D), \quad (22)$$
where $R_n(D)$ is defined in Proposition 1.
Proof. 
The detailed proof is provided in Appendix D. Here, we give a brief outline: The derivation of the RDF with asynchronous sampling follows three steps: First, we define a sequence of rational numbers $\epsilon_n$ such that $\epsilon_n \to \epsilon$ as $n \to \infty$, and note that sampling at intervals $T_s(n) = \frac{T_{\mathrm{ps}}}{p + \epsilon_n}$ results in a sequence of DT WSCS sources $\{S_n[i]\}_{i \in \mathbb{N}}$, $n \in \mathbb{N}$, whose sampling interval $T_s(n)$ asymptotically approaches, as $n \to \infty$, the sampling interval $T_s = \frac{T_{\mathrm{ps}}}{p + \epsilon}$ corresponding to irrational $\epsilon$. Building upon this insight, we prove that the RDF for sampling with interval $T_s$ can be stated as a double limit, where the outer limit is with respect to the blocklength and the inner limit is with respect to $\epsilon_n$. Lastly, we use Theorem 3 to show that the order of the limits can be exchanged, obtaining a limit of computable expressions. □
Remark 2.
Theorem 4 focuses on the low distortion regime, defined as the values of $D$ satisfying $D < \min_{0 \le t \le T_{\mathrm{ps}}} \sigma_{S_c}^2(t)$. This implies that $\theta_n$ has to be smaller than $\min_{0 \le t \le T_{\mathrm{ps}}} \sigma_{S_c}^2(t)$; hence, from Proposition 1 it follows that for the corresponding stationary noise vector $\mathbf{W}_n^{(p_n)}[i]$ in (20), $D_n[m] = \min\{\sigma_{S_n}^2[m], \theta_n\} = \theta_n$ and $D = \frac{1}{p_n} \sum_{m=1}^{p_n} D_n[m] = \theta_n = D_n[m]$. We note that since every element $\left( \mathbf{W}_n^{(p_n)}[i] \right)_m$ has the same variance $D_n[m] = D$ for all $n \in \mathbb{N}$ and $m = 1, 2, \ldots, p_n$, then by applying the inverse DCD to $\mathbf{W}_n^{(p_n)}[i]$, the resulting scalar DT process $W_n[i]$ is wide-sense stationary, and in fact IID with $\mathbb{E}\{W_n[i]^2\} = D$.
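The following self-contained sketch (ours, not code from the paper) numerically approximates the limit superior in (22): it evaluates $R_n(D)$ of Proposition 1 for the rational approximations $\epsilon_n = \lfloor n \epsilon \rfloor / n$ and scans a tail of large $n$. The sinusoidal variance profile is the running example of Section 2.2, and the cutoff $n \le 500$ is an arbitrary numerical choice.

```python
# Hedged numerical sketch (ours) of Theorem 4: approximate R_eps(D) by
# evaluating R_n(D) from Proposition 1 for eps_n = floor(n*eps)/n and taking
# the supremum over a tail of large n.
import math
import numpy as np

T_ps = 1.0
def var_ct(t):
    return 0.5 * np.sin(2 * np.pi * t / T_ps) + 2  # CT periodic variance

def R_n(D, p, eps, n):
    """R_n(D) of Proposition 1 for the synchronous mismatch eps_n = floor(n*eps)/n."""
    u = math.floor(n * eps)
    p_n = p * n + u                      # DT period for eps_n
    T_s = T_ps / (p + u / n)             # sampling interval T_s(n)
    variances = var_ct(np.arange(p_n) * T_s)
    if D >= variances.mean():            # above the average variance: R_n(D) = 0
        return 0.0
    lo, hi = 0.0, variances.max()        # bisection for the waterfilling level
    for _ in range(200):
        theta = 0.5 * (lo + hi)
        if np.minimum(variances, theta).mean() < D:
            lo = theta
        else:
            hi = theta
    return 0.5 * np.mean(np.log2(variances / np.minimum(variances, theta)))

# R_eps(D) = limsup_n R_n(D); numerically, scan a tail of large n.
D, p, eps = 0.18, 2, math.pi / 7
rates = [R_n(D, p, eps, n) for n in range(200, 501)]
print(max(rates))  # proxy for the limit superior in (22)
```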

4.2. Discussion and Relationship with the Capacity Derivation in [17]

Theorem 4 provides a meaningful and computable characterization of the RDF of sampled WSCS signals. We note that the proof of the main theorem uses some of the steps from our recent study on the capacity of memoryless channels with sampled CT WSCS Gaussian noise [17]. It should be emphasized, however, that there are several fundamental differences between the two studies, which require the introduction of new treatments and derivations original to the current work. First, it is important to note that in the study on capacity, a physical channel model exists, and therefore the conditional PDF of the output signal given the input signal can be characterized explicitly for both synchronous and asynchronous sampling, for every input distribution. For the current study of the RDF, we note that the relationship (21), commonly referred to as the backward channel [30], ([5] Chapter 10.3.2), characterizes the relationship between the source process and the optimal reproduction process, and hence is valid only for synchronous sampling and for the optimal reproduction process. Consequently, in the RDF analysis the limiting relationship (21) as $n \to \infty$ is not even known to exist and, in fact, we can show it exists under a rather strict condition on the distortion (namely, the condition $D < \min_{0 \le t \le T_{\mathrm{ps}}} \sigma_{S_c}^2(t)$ stated in Theorem 4). In particular, to prove the statement in Theorem 4, we had to show that from the backward channel (21) we can define an asymptotic relationship, as $n \to \infty$, which corresponds to the asynchronously sampled source process, denoted by $S_\epsilon[i]$, and relates $S_\epsilon[i]$ with its optimal reconstruction process $\hat{S}_\epsilon[i]$. This is done by showing that the PDFs of the reproduction process $\hat{S}_n[i]$ and of the noise process $W_n[i]$ from (21) each converge uniformly, as $n \to \infty$, to a respective limiting PDF, which has to be defined as well. This enabled us to relate the RDFs for the synchronous sampling and the asynchronous sampling cases using Theorem 3, eventually leading to (22). Accordingly, in our detailed proof of Theorem 4 given in Appendix D, Lemmas A6 and A8, as well as a significant part of Lemma A4, are largely new, addressing the special aspects of the proof arising from the fundamental differences between the current setup and the setup in [17], while the derivations of Lemmas A3 and A7 follow similarly to ([17] Lemma B.1) and ([17] Lemma B.5), respectively, and parts of Lemma A4 coincide with ([17] Lemma B.2).

5. Numerical Examples

In this section we demonstrate the insights arising from our RDF characterization via numerical examples. Recalling that Theorem 4 states the RDF of an asynchronously sampled CT WSCS Gaussian process, $R_\epsilon(D)$, as the limit superior of a sequence of RDFs $\{R_n(D)\}_{n \in \mathbb{N}}$ corresponding to DT memoryless WSCS Gaussian source processes, we first consider the convergence of $\{R_n(D)\}_{n \in \mathbb{N}}$ in Section 5.1. Next, in Section 5.2 we study the variation of the RDF of the sampled CT process due to changes in the sampling rate and in the sampling time offset.
Similarly to ([17] Section IV), define a periodic continuous pulse function, denoted by $\Pi_{t_{\mathrm{dc}}, t_{\mathrm{rf}}}(t)$, with equal rise/fall time $t_{\mathrm{rf}} = 0.01$, duty cycle $t_{\mathrm{dc}} \in [0, 0.98]$, and period 1, i.e., $\Pi_{t_{\mathrm{dc}}, t_{\mathrm{rf}}}(t + 1) = \Pi_{t_{\mathrm{dc}}, t_{\mathrm{rf}}}(t)$ for all $t \in \mathbb{R}$. Specifically, for $t \in [0, 1)$ the function $\Pi_{t_{\mathrm{dc}}, t_{\mathrm{rf}}}(t)$ is given by
$$\Pi_{t_{\mathrm{dc}}, t_{\mathrm{rf}}}(t) = \begin{cases} \frac{t}{t_{\mathrm{rf}}} & t \in [0, t_{\mathrm{rf}}] \\ 1 & t \in (t_{\mathrm{rf}}, t_{\mathrm{dc}} + t_{\mathrm{rf}}) \\ 1 - \frac{t - t_{\mathrm{dc}} - t_{\mathrm{rf}}}{t_{\mathrm{rf}}} & t \in [t_{\mathrm{dc}} + t_{\mathrm{rf}}, t_{\mathrm{dc}} + 2 t_{\mathrm{rf}}] \\ 0 & t \in (t_{\mathrm{dc}} + 2 t_{\mathrm{rf}}, 1) \end{cases}. \quad (23)$$
In the following, we model the time-varying variance of the WSCS source, $\sigma_{S_c}^2(t)$, as a linear periodic function of $\Pi_{t_{\mathrm{dc}}, t_{\mathrm{rf}}}(t)$. To that aim, we define a time offset between the first sample and the rise start time of the periodic continuous pulse function; we denote this offset, normalized to the period $T_{\mathrm{ps}}$, by $\phi \in [0, 1)$. The variance of $S_c(t)$ is a periodic function with period $T_{\mathrm{ps}}$, defined as
$$\sigma_{S_c}^2(t) = 0.2 + 4.8 \cdot \Pi_{t_{\mathrm{dc}}, t_{\mathrm{rf}}}\left( \frac{t}{T_{\mathrm{ps}}} - \phi \right), \quad t \in [0, T_{\mathrm{ps}}), \quad (24)$$
where $T_{\mathrm{ps}} = 5$ μs.
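The following sketch (ours; the function names and the sampling grid are illustrative choices) implements the pulse (23) and the variance profile (24) used throughout this section:

```python
# A sketch (ours) of the evaluation setup: the rise/fall pulse (23) and the
# periodic CT variance profile (24) used in the numerical study.
import numpy as np

def pulse(t, t_dc, t_rf=0.01):
    """Periodic pulse Pi_{t_dc,t_rf}(t) with period 1, as in (23)."""
    t = np.mod(t, 1.0)                       # periodic extension
    return np.select(
        [t <= t_rf,                          # rising edge
         t < t_dc + t_rf,                    # flat top
         t <= t_dc + 2 * t_rf],              # falling edge
        [t / t_rf,
         1.0,
         1.0 - (t - t_dc - t_rf) / t_rf],
        default=0.0)                         # off interval

T_ps = 5e-6  # 5 microseconds

def var_ct(t, t_dc, phi=0.0):
    """CT variance sigma_{S_c}^2(t) from (24)."""
    return 0.2 + 4.8 * pulse(t / T_ps - phi, t_dc)

# Quick check: the variance swings between 0.2 and 5.0 over one period.
t = np.linspace(0, T_ps, 11, endpoint=False)
print(np.round(var_ct(t, t_dc=0.45), 3))
```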

5.1. Convergence of $R_n(D)$ in $n$

From Theorem 4 it follows that if the distortion satisfies $D < \min_{0 \le t \le T_{\mathrm{ps}}} \sigma_{S_c}^2(t)$, the RDF of the asynchronously sampled CT WSCS Gaussian process is given by the limit superior of the sequence $\{R_n(D)\}_{n \in \mathbb{N}}$, where $R_n(D)$ is defined in Proposition 1. In this subsection, we study the sequence of RDFs $\{R_n(D)\}_{n \in \mathbb{N}}$ as $n$ increases. For this evaluation setup, we fixed the distortion constraint at $D = 0.18$ and set $\epsilon = \frac{\pi}{7}$ and $p = 2$. The variance of the CT WSCS Gaussian source process, $\sigma_{S_c}^2(t)$, is modelled by Equation (24) for two sampling time offsets $\phi \in \{0, \frac{1}{16}\}$. For each offset $\phi$, four duty cycle values were considered: $t_{\mathrm{dc}} \in \{20, 45, 75, 98\}\%$. For each $n$ we obtain the synchronous sampling mismatch $\epsilon_n \triangleq \frac{\lfloor n \cdot \epsilon \rfloor}{n}$, which approaches $\epsilon$ as $n \to \infty$. Since $\epsilon_n$ is a rational number, corresponding to a sampling period of $T_s(n) = \frac{T_{\mathrm{ps}}}{p + \epsilon_n}$, for each $n$ the resulting DT process is WSCS with period $p_n = p \cdot n + \lfloor n \cdot \epsilon \rfloor$, and its RDF follows from Proposition 1.
Figure 2 and Figure 3 depict $R_n(D)$ for $n \in [1, 500]$ with the specified duty cycles and sampling time offsets, where in Figure 2 there is no sampling time offset, i.e., $\phi = 0$, and in Figure 3 the sampling time offset is $\phi = \frac{1}{16}$. We observe that in both figures the RDF values are higher for higher $t_{\mathrm{dc}}$. This can be explained by noting that for higher $t_{\mathrm{dc}}$ values, the time-averaged variance of the DT source process increases; hence, a higher number of bits per source sample is required to encode the source process while maintaining the same distortion value. Moreover, in all configurations, $R_n(D)$ varies significantly for smaller values of $n$. Comparing Figure 2 and Figure 3, we see that the pattern of these variations depends on the sampling time offset $\phi$. For example, when $t_{\mathrm{dc}} = 45\%$ and $n \in [4, 15]$, then for $\phi = 0$ the RDF varies in the range $[1.032, 1.143]$ bits per source sample, while for $\phi = \frac{1}{16}$ the RDF varies in the range $[1.071, 1.237]$ bits per source sample. However, as $n$ increases above 230, the variations in $R_n(D)$ become smaller and less dependent on the sampling time offset, and the resulting values of $R_n(D)$ are approximately in the same range for each $t_{\mathrm{dc}}$ in both Figure 2 and Figure 3 for $n \ge 230$. This behaviour can be explained by noting that as $n$ varies, the period $p_n$ also varies, and hence the statistics of the DT variance differ over the respective period. This consequently affects the resulting RDF (especially for small periods). As $n$ increases, $\epsilon_n$ approaches the asynchronous sampling mismatch $\epsilon$ and the period $p_n$ becomes sufficiently large such that the samples of the DT variance over a period cover essentially the entire range of values of the CT variance irrespective of the value of $\phi$, leading to a negligible variation in the RDF, as seen in the above figures.

5.2. The Variation of the RDF with the Sampling Rate

Next, we observe the dependence of the RDF of the sampled memoryless WSCS Gaussian process on the value of the sampling interval $T_s$. For this setup, we fix the distortion constraint to $D = 0.18$ and set the duty cycle in the source process (24) to $t_{\mathrm{dc}} \in \{45, 75\}\%$. Figure 4 and Figure 5 depict the numerically evaluated values of $R_n(D)$ at sampling intervals in the range $2 < \frac{T_{\mathrm{ps}}}{T_s} < 4$ with sampling time offsets $\phi = 0$ and $\phi = \frac{1}{16}$, respectively. We note that while the discussion which follows focuses on this range, as it corresponds to relatively low sampling rates, which are typically preferable in practice, the statements and observations regarding the relationship between the denominator of $\frac{T_{\mathrm{ps}}}{T_s}$ and the value of $R_n(D)$, and regarding the continuity of the RDF in the parameter $\frac{T_{\mathrm{ps}}}{T_s}$, are directly applicable to any range of values of $\frac{T_{\mathrm{ps}}}{T_s}$, e.g., when higher sampling rates are preferable. A very important insight which arises from the figures is that the sequence of RDFs $R_n(D)$ is not convergent; hence, for example, one cannot approach the RDF for $\frac{T_{\mathrm{ps}}}{T_s} = 2.5$ by simply taking rational values of $\frac{T_{\mathrm{ps}}}{T_s}$ which approach 2.5. This verifies that the RDF for asynchronous sampling cannot be obtained by a straightforward application of previous results; indeed, the entire analysis carried out in this paper is necessary for the desired characterization.
We observe in Figure 4 and Figure 5 that when $\frac{T_{\mathrm{ps}}}{T_s}$ has a fractional part with a relatively small integer denominator, the variations in the RDF are significant and depend on the sampling time offset. These variations can either degrade the ability to accurately represent the source, corresponding to the observed peaks in Figure 4 and Figure 5, or, alternatively, allow encoding the signal to within the same distortion with smaller code rates, corresponding to the dips in these figures. However, when $\frac{T_{\mathrm{ps}}}{T_s}$ approaches an irrational number, the period of the sampled variance function becomes very long and, consequently, the RDF is approximately constant and independent of the sampling time offset. As an example, consider $\frac{T_{\mathrm{ps}}}{T_s} = 2.5$ and $t_{\mathrm{dc}} = 75\%$: for sampling time offset $\phi = 0$ the RDF takes a value of 1.469 bits per source sample, as shown in Figure 4, while for the offset $\phi = \frac{1}{16}$ the RDF peaks at 1.934 bits per source sample, as can be seen in Figure 5. On the other hand, when approaching asynchronous sampling, the RDF takes a nearly constant value of 1.85 bits per source sample for all the considered values of $\frac{T_{\mathrm{ps}}}{T_s}$, and this value is invariant to the offset $\phi$. This follows since, when the denominator of the fractional part of $\frac{T_{\mathrm{ps}}}{T_s}$ increases, the DT period of the resulting sampled variance, $p_n$, increases and practically captures the entire set of values of the CT variance, regardless of the sampling time offset. In a similar manner to the study on capacity in [17], we conjecture that since asynchronous sampling captures the entire set of values of the CT variance, the respective RDF represents the RDF of the analog source, which does not depend on the specific sampling rate and offset. Figure 4 and Figure 5 demonstrate how slight variations in the sampling rate can result in significant changes in the RDF. For instance, at $\phi = 0$ we observe in Figure 4 that when the sampling rate switches from $T_{\mathrm{ps}} = 2.25 \cdot T_s$ to $T_{\mathrm{ps}} = 2.26 \cdot T_s$, i.e., from synchronous to nearly asynchronous sampling, the RDF changes from 1.624 bits per source sample to 1.859 bits per source sample for $t_{\mathrm{dc}} = 75\%$; similarly, we observe in Figure 5 for $t_{\mathrm{dc}} = 45\%$ that when the sampling rate switches from $T_{\mathrm{ps}} = 2.5 \cdot T_s$ to $T_{\mathrm{ps}} = 2.51 \cdot T_s$, i.e., again from synchronous to nearly asynchronous sampling, the RDF changes from 1.005 bits per source sample to 1.154 bits per source sample.
Lastly, Figure 6 and Figure 7 numerically evaluate the RDF versus the distortion constraint $D \in [0.05, 0.19]$ for sampling time offsets of $0$ and $\frac{1}{16}$, respectively. For each $\phi$, the result is evaluated at three different values of the synchronization mismatch $\epsilon$. For this setup, we fix $t_{\mathrm{dc}} = 75\%$, $p = 2$, and $\epsilon \in \{0.5, \frac{5\pi}{32}, 0.6\}$. The only mismatch value corresponding to asynchronous sampling is $\epsilon = \frac{5\pi}{32}$; its sampling interval is approximately 2.007 μs, a negligible variation from the sampling intervals corresponding to $\epsilon \in \{0.5, 0.6\}$, which are 2.000 μs and 1.923 μs, respectively. Observing both figures, we see that the RDF may vary significantly for a very slight variation in the sampling rate. For instance, as shown in Figure 6 for $\phi = 0$, at $D = 0.18$ a slight change in the synchronization mismatch from $\epsilon = \frac{5\pi}{32}$ (i.e., $T_s \approx 2.007$ μs) to $\epsilon = 0.5$ (i.e., $T_s = 2.000$ μs) results in an approximately 20% decrease in the RDF. For $\phi = \frac{1}{16}$, the same change in the sampling synchronization mismatch at $D = 0.18$ results in an increase in the RDF by roughly 4%. These results demonstrate the unique and counter-intuitive characteristics of the RDF of sampled WSCS signals which arise from our derivation. It is also interesting to examine how the RDF varies with the sampling time offset $\phi$. To that aim, we plot in Figure 8 the RDF versus $\phi$ for the three sampling rates used in Figure 6 and Figure 7 at $D = 0.18$. The points marked on the plot correspond to $\phi = 0$ and $\phi = \frac{1}{16}$, considered in Figure 6 and Figure 7, respectively. We observe that the RDF is indeed periodic in $\phi$. These variations in the RDF occur since, by changing $\phi$, the number of high-variance samples within a period of the variance of the DT process changes, due to the duty cycle of the CT variance. In particular, at $\phi = 0$ the periodic variance of the DT process corresponding to $T_s = 2.000$ μs has the smallest number of high-variance values within a period, and at $\phi = \frac{1}{16}$ the periodic variance of the DT process corresponding to asynchronous sampling has the smallest number of high-variance values within a period. For the asynchronous sampling rate, the sampling time offset does not matter, as in any case (nearly) all values of the CT variance are reflected in the variance of the DT process.

6. Conclusions

In this work, the RDF of a sampled CT WSCS Gaussian source process was characterized for scenarios in which the resulting DT process is memoryless and the distortion is relatively small. This characterization shows the relationship between the sampling rate and the minimal number of bits per source sample required for compression at a given distortion. For cases in which the sampling rate is synchronized with the period of the statistics of the source process, the resulting DT process is WSCS, and the standard information-theoretic framework can be used for deriving its RDF. For asynchronous sampling, information stability does not hold, and hence we resorted to the information-spectrum framework to obtain a characterization. To that aim, we derived a relationship between relevant information-spectrum quantities for uniformly convergent sequences of RVs. This relationship was then applied to characterize the RDF of an asynchronously sampled CT WSCS Gaussian source process as the limit superior of a sequence of RDFs, each corresponding to a synchronous sampling of the CT WSCS Gaussian process. The results were derived in the low distortion regime, i.e., under the condition that the distortion constraint $D$ is less than the minimal variance of the source, and for sampling intervals which are larger than the correlation length of the CT process. Our numerical examples give rise to non-intuitive insights which follow from the derivations. In particular, the numerical evaluation demonstrates that the RDF of a sampled CT WSCS Gaussian source can change dramatically with minor variations in the sampling rate and the sampling time offset. Notably, when the sampling rate switches from synchronous to asynchronous and vice versa, the RDF may change considerably, as the statistical model of the source switches between WSCS and WSACS. The resulting analysis enables setting the sampling system parameters so as to facilitate accurate and efficient source coding of acquired CT signals.

Author Contributions

Conceptualization, E.A., N.S. and R.D.; Methodology, E.A., N.S. and R.D.; Derivation, E.A., N.S. and R.D.; Writing—original draft preparation, E.A., N.S. and R.D.; writing—review and editing, E.A., N.S. and R.D.; Supervision, N.S. and R.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Israel Science Foundation under Grants 1685/16 and 0100101, and by the Israeli Ministry of Economy through the HERON 5G consortium.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Lemma 1

Proof. 
To prove that the minimum achievable rate at a given maximum distortion for a code with arbitrary blocklength can be achieved by considering only codes whose blocklength is an integer multiple of r, we apply the following approach: We first show that every rate-distortion pair achievable when restricted to using source codes whose blocklength is an integer multiple of r is also achievable when using arbitrary blocklengths; we then prove that every achievable rate-distortion pair is also achievable when restricted to using codes whose blocklength is an integer multiple of r. Combining these two assertions proves that the rate-distortion function of the source $\{S[i]\}_{i\in\mathbb{N}}$ can be obtained when restricting the blocklengths to be an integer multiple of r. Consequently, a reproduction signal $\{\hat{S}[i]\}_{i\in\mathbb{N}}$ which achieves the minimal rate for a given D under the restriction to use only blocklengths which are an integer multiple of r is also the reproduction signal achieving the minimal rate without this restriction, and vice versa, thus proving the lemma.
To prove the first assertion, consider a rate-distortion pair $(R, D)$ which is achievable when using codes whose blocklength is an integer multiple of r. It thus follows directly from Definition 6 that for every $\eta > 0$ there exists $b_0\in\mathbb{N}$ such that for all $b > b_0$ there exists a source code $\big(R^{(b\cdot r)}, b\cdot r\big)$ with rate $R^{(b\cdot r)} \le R + \eta$ satisfying $\bar{d}\big(S^{(b\cdot r)}, \hat{S}^{(b\cdot r)}\big) \le D + \frac{\eta}{2}$. We now show that we can construct a code with an arbitrary blocklength $l = b\cdot r + j$, where $0 < j < r$ (i.e., the blocklength l is not an integer multiple of r), satisfying Definition 6 for all $j\in\{1,\ldots,r-1\}$ as follows: Apply the code $\big(R^{(b\cdot r)}, b\cdot r\big)$ to the first $b\cdot r$ samples of $S[i]$ and then concatenate each codeword with j zeros to obtain a source code having codewords of length $b\cdot r + j$. The average distortion (see (2)) of the resulting $\big(R^{(b\cdot r+j)}, b\cdot r+j\big)$ code is given by:

$$\bar{d}\big(S^{(b\cdot r+j)},\hat{S}^{(b\cdot r+j)}\big) = \frac{1}{b\cdot r+j}\left(\sum_{i=1}^{b\cdot r}\mathbb{E}\Big\{\big(S[i]-\hat{S}[i]\big)^2\Big\} + \sum_{i=b\cdot r+1}^{b\cdot r+j}\mathbb{E}\big\{S[i]^2\big\}\right) = \frac{1}{b\cdot r+j}\left(b\cdot r\cdot\bar{d}\big(S^{(b\cdot r)},\hat{S}^{(b\cdot r)}\big) + \sum_{i=1}^{j}\sigma_S^2[i]\right) = \frac{b\cdot r}{b\cdot r+j}\cdot\bar{d}\big(S^{(b\cdot r)},\hat{S}^{(b\cdot r)}\big) + \frac{1}{b\cdot r+j}\sum_{i=1}^{j}\sigma_S^2[i]. \qquad (A1)$$

Thus, there exists $b_1 > b_0$ such that $\frac{1}{b_1\cdot r+j}\sum_{i=1}^{j}\sigma_S^2[i] < \frac{\eta}{2}$; hence, for all $b > b_1$:

$$\bar{d}\big(S^{(b\cdot r+j)},\hat{S}^{(b\cdot r+j)}\big) = \frac{b\cdot r}{b\cdot r+j}\cdot\bar{d}\big(S^{(b\cdot r)},\hat{S}^{(b\cdot r)}\big) + \frac{1}{b\cdot r+j}\sum_{i=1}^{j}\sigma_S^2[i] \le \frac{b\cdot r}{b\cdot r+j}\cdot\bar{d}\big(S^{(b\cdot r)},\hat{S}^{(b\cdot r)}\big) + \frac{\eta}{2} \le \bar{d}\big(S^{(b\cdot r)},\hat{S}^{(b\cdot r)}\big) + \frac{\eta}{2} \le D + \eta. \qquad (A2)$$

The rate $R^{(b\cdot r+j)}$ satisfies:

$$R^{(b\cdot r+j)} = \frac{1}{b\cdot r+j}\cdot\log_2 M = R^{(b\cdot r)}\cdot\frac{b\cdot r}{b\cdot r+j} \le (R+\eta)\cdot\frac{b\cdot r}{b\cdot r+j} \le R+\eta. \qquad (A3)$$
Consequently, any rate-distortion pair achievable with codes whose blocklength is an integer multiple of r can be achieved by codes with arbitrary blocklengths.
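As an illustrative numerical instance of this padding construction (our example, not part of the proof), take $r = 4$, $b = 100$ and $j = 3$; then, by (A1) and (A3),

```latex
\[
  R^{(403)} = R^{(400)}\cdot\frac{400}{403},
  \qquad
  \bar{d}\big(S^{(403)},\hat{S}^{(403)}\big)
    = \frac{400}{403}\,\bar{d}\big(S^{(400)},\hat{S}^{(400)}\big)
      + \frac{1}{403}\sum_{i=1}^{3}\sigma_S^2[i],
\]
```

so both the rate reduction factor and the added distortion term are $O(1/b)$ and vanish as the blocklength grows.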
Next, we prove that any achievable rate-distortion pair $(R, D)$ can be achieved by codes whose blocklength is an integer multiple of r. To that aim, we fix $\eta > 0$. By Definition 6, there exists a code of blocklength l satisfying (3) and (4). To show that $(R, D)$ is achievable using codes whose blocklength is an integer multiple of r, we assume that l is not an integer multiple of r; hence, there exist positive integers b and j, with $j < r$, such that $l = b\cdot r + j$. We denote this code by $\big(R^{(b\cdot r+j)}, b\cdot r+j\big)$. It follows from Definition 6 that $R^{(b\cdot r+j)} \le R + \eta$ and $\bar{d}\big(S^{(b\cdot r+j)}, \hat{S}^{(b\cdot r+j)}\big) \le D + \frac{\eta}{2}$. Next, we construct a code $\big(R^{((b+1)\cdot r)}, (b+1)\cdot r\big)$ with codewords whose length is $(b+1)\cdot r$, i.e., an integer multiple of r, by adding $r - j$ zeros at the end of each codeword of the code $\big(R^{(b\cdot r+j)}, b\cdot r+j\big)$. The average distortion can now be computed as follows:

$$\bar{d}\big(S^{((b+1)\cdot r)},\hat{S}^{((b+1)\cdot r)}\big) = \frac{1}{(b+1)\cdot r}\left(\sum_{i=1}^{b\cdot r+j}\mathbb{E}\Big\{\big(S[i]-\hat{S}[i]\big)^2\Big\} + \sum_{i=b\cdot r+j+1}^{(b+1)\cdot r}\mathbb{E}\big\{S[i]^2\big\}\right) = \frac{b\cdot r+j}{(b+1)\cdot r}\cdot\bar{d}\big(S^{(b\cdot r+j)},\hat{S}^{(b\cdot r+j)}\big) + \frac{1}{(b+1)\cdot r}\sum_{i=b\cdot r+j+1}^{(b+1)\cdot r}\sigma_S^2[i]. \qquad (A4)$$

Since the variance is finite and bounded, there exists $b_1 > b_0$ such that $\frac{1}{(b_1+1)\cdot r}\sum_{i=b_1\cdot r+j+1}^{(b_1+1)\cdot r}\sigma_S^2[i] < \frac{\eta}{2}$. Hence, for all $b > b_1$:

$$\bar{d}\big(S^{((b+1)\cdot r)},\hat{S}^{((b+1)\cdot r)}\big) \le \frac{b\cdot r+j}{(b+1)\cdot r}\cdot\bar{d}\big(S^{(b\cdot r+j)},\hat{S}^{(b\cdot r+j)}\big) + \frac{\eta}{2} \le \bar{d}\big(S^{(b\cdot r+j)},\hat{S}^{(b\cdot r+j)}\big) + \frac{\eta}{2} \le D + \eta. \qquad (A5)$$

The rate $R^{((b+1)\cdot r)}$ can be expressed as follows:

$$R^{((b+1)\cdot r)} = \frac{1}{(b+1)\cdot r}\cdot\log_2 M = R^{(b\cdot r+j)}\cdot\frac{b\cdot r+j}{(b+1)\cdot r} \le (R+\eta)\cdot\frac{b\cdot r+j}{(b+1)\cdot r} < R+\eta. \qquad (A6)$$

It follows that $R^{((b+1)\cdot r)} \le R + \eta$ for an arbitrarily small $\eta$ by selecting a sufficiently large b. This proves that every rate-distortion pair achievable with arbitrary blocklengths (e.g., $l = b\cdot r + j$, $j < r$) is also achievable when considering source codes whose blocklength is an integer multiple of r (i.e., $l = b\cdot r$). This concludes the proof. □

Appendix B. Proof of Theorem 1

Recall that $\alpha\in\mathbb{R}^{++}$. To prove the theorem, we fix a rate-distortion pair $(R, D)$ that is achievable for the source $\{S[i]\}_{i\in\mathbb{N}}$. By Definition 6, this implies that for all $\eta > 0$ there exists $l_0(\eta)\in\mathbb{N}$ such that for all $l > l_0(\eta)$ there exists a source code $\mathcal{C}_l$ with rate $R^{(l)} \le R + \eta$ and MSE distortion $D^{(l)} = \mathbb{E}\big\{\frac{1}{l}\|S^{(l)} - \hat{S}^{(l)}\|^2\big\} \le D + \eta$, where $\|\cdot\|$ denotes the norm of a vector. Next, we use the code $\mathcal{C}_l$ to define the source code $\mathcal{C}_l(\alpha)$, which operates in the following manner: The encoder first scales its input block by $1/\alpha$. Then, the block is encoded using the source code $\mathcal{C}_l$. Finally, the selected codeword is scaled by $\alpha$. Since $\mathcal{C}_l(\alpha)$ has the same number of codewords and the same blocklength as $\mathcal{C}_l$, it follows that its rate, denoted $R^{(l)}(\alpha)$, satisfies $R^{(l)}(\alpha) = R^{(l)} \le R + \eta$. Furthermore, by the construction of $\mathcal{C}_l(\alpha)$, its reproduction vector when applied to $\alpha\cdot S^{(l)}$ is equal to the output of $\mathcal{C}_l$ applied to $S^{(l)}$, scaled by $\alpha$, i.e., $\alpha\cdot\hat{S}^{(l)}$. Consequently, the MSE of $\mathcal{C}_l(\alpha)$ when applied to the source $\{\alpha\cdot S[i]\}_{i\in\mathbb{N}}$, denoted $D^{(l)}(\alpha)$, satisfies $D^{(l)}(\alpha) = \mathbb{E}\big\{\frac{1}{l}\|\alpha\cdot S^{(l)} - \alpha\cdot\hat{S}^{(l)}\|^2\big\} = \alpha^2\cdot D^{(l)} \le \alpha^2\cdot D + \alpha^2\eta$.
It thus follows that for all $\tilde{\eta} > 0$ there exists $\tilde{l}_0(\tilde{\eta}) = l_0\big(\min(\tilde{\eta}, \alpha^{-2}\tilde{\eta})\big)$ such that for all $l > \tilde{l}_0(\tilde{\eta})$ there exists a code $\mathcal{C}_l(\alpha)$ with rate $R^{(l)}(\alpha) \le R + \tilde{\eta}$ which achieves an MSE distortion of $D^{(l)}(\alpha) \le \alpha^2\cdot D + \tilde{\eta}$ when applied to the compression of $\{\alpha\cdot S[i]\}_{i\in\mathbb{N}}$. Hence, $(R, \alpha^2\cdot D)$ is achievable for the compression of $\{\alpha\cdot S[i]\}_{i\in\mathbb{N}}$ by Definition 6, proving the theorem.
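The construction used in this proof is straightforward to mimic in simulation. In the sketch below (Python), the one-bit scalar quantizer `base_code` is a hypothetical stand-in for an arbitrary source code $\mathcal{C}_l$; wrapping it with the scaling by $1/\alpha$ at the input and by $\alpha$ at the output leaves the number of codewords (and hence the rate) unchanged, while the empirical MSE scales by exactly $\alpha^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

def base_code(block):
    # hypothetical stand-in for C_l: a one-bit-per-sample quantizer with
    # reproduction levels +/- mean(|block|)
    return np.sign(block) * np.mean(np.abs(block))

def scaled_code(block, alpha):
    # C_l(alpha): scale the input by 1/alpha, apply C_l, scale the codeword by alpha
    return alpha * base_code(block / alpha)

s = rng.normal(size=100_000)
alpha = 3.0
d_base = np.mean((s - base_code(s)) ** 2)                              # MSE of C_l on S
d_scaled = np.mean((alpha * s - scaled_code(alpha * s, alpha)) ** 2)   # MSE of C_l(alpha) on alpha*S
print(d_scaled / d_base)   # = alpha**2: same rate, distortion scaled by alpha^2
```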

Appendix C. Proof of Theorem 3

In this appendix, we prove (17b) by applying an approach similar to that used for proving (17a) in ([17] Appendix A). We first note that Definition 9 can equivalently be written as follows:

$$\text{p-}\limsup_{k\to\infty} Z_k \stackrel{(a)}{=} \inf\Big\{\beta\in\mathbb{R}\ \Big|\ \limsup_{k\to\infty}\Pr\big\{Z_k > \beta\big\} = 0\Big\} \stackrel{(b)}{=} \inf\Big\{\beta\in\mathbb{R}\ \Big|\ \liminf_{k\to\infty}F_k(\beta) = 1\Big\}. \qquad (A7)$$
For the equality (a), we note that the probabilities $\{\Pr\{Z_k > \beta\}\}_{k\in\mathbb{N}}$ are bounded in $[0,1]$; hence, for any $\beta\in\mathbb{R}$ for which $\limsup_{k\to\infty}\Pr\{Z_k > \beta\} = 0$, it also holds from ([31] Theorem 3.17) that the limit of every subsequence of $\{\Pr\{Z_k > \beta\}\}_{k\in\mathbb{N}}$ is 0, since non-negativity of the probability implies $\liminf_{k\to\infty}\Pr\{Z_k > \beta\} \ge 0$. Then, combined with the relationship $\liminf_{k\to\infty}\Pr\{Z_k > \beta\} \le \limsup_{k\to\infty}\Pr\{Z_k > \beta\}$, we conclude:

$$0 \le \liminf_{k\to\infty}\Pr\{Z_k > \beta\} \le \limsup_{k\to\infty}\Pr\{Z_k > \beta\} = 0 \ \Longrightarrow\ \liminf_{k\to\infty}\Pr\{Z_k > \beta\} = \limsup_{k\to\infty}\Pr\{Z_k > \beta\} \stackrel{(a)}{=} \lim_{k\to\infty}\Pr\{Z_k > \beta\} = 0,$$

where (a) follows from ([31] Example 3.18(c)). This implies that $\lim_{k\to\infty}\Pr\{Z_k > \beta\}$ exists and is equal to 0.
In the opposite direction, if $\lim_{k\to\infty}\Pr\{Z_k > \beta\} = 0$, then it follows from ([31] Example 3.18(c)) that $\limsup_{k\to\infty}\Pr\{Z_k > \beta\} = 0$. Next, we note that since $F_k(\beta)$ is bounded in $[0,1]$, $\liminf_{k\to\infty}F_k(\beta)$ is finite for every $\beta\in\mathbb{R}$, even if $\lim_{k\to\infty}F_k(\beta)$ does not exist. Equality (b) follows since $\limsup_{k\to\infty}\Pr\{Z_k > \beta\} = \limsup_{k\to\infty}\big(1 - \Pr\{Z_k \le \beta\}\big)$, which according to ([32] Theorem 7.3.7) is equal to $1 + \limsup_{k\to\infty}\big(-\Pr\{Z_k \le \beta\}\big)$. By ([33] Chapter 1, Page 29), this quantity is also equal to $1 - \liminf_{k\to\infty}\Pr\{Z_k \le \beta\} = 1 - \liminf_{k\to\infty}F_k(\beta)$.
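As a simple illustration of (A7) (our example, not part of the proof), consider the deterministic sequence $Z_k = z_k$ with $z_k = 1 + \frac{(-1)^k}{2}$, which alternates between $\frac{1}{2}$ and $\frac{3}{2}$; for $\beta\in[\frac{1}{2},\frac{3}{2})$ the CDFs $F_k(\beta)$ alternate between 0 and 1, so their limit inferior equals 1 only for $\beta\ge\frac{3}{2}$:

```latex
\[
  z_k = 1 + \tfrac{(-1)^k}{2}
  \;\Longrightarrow\;
  F_k(\beta) = \mathbb{1}\{\beta \ge z_k\},
  \qquad
  \liminf_{k\to\infty} F_k(\beta) = \mathbb{1}\big\{\beta \ge \tfrac{3}{2}\big\},
\]
\[
  \text{hence}\quad
  \text{p-}\limsup_{k\to\infty} Z_k
  = \inf\Big\{\beta\in\mathbb{R} \;\Big|\; \liminf_{k\to\infty} F_k(\beta) = 1\Big\}
  = \tfrac{3}{2}
  = \limsup_{k\to\infty} z_k.
\]
```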
Next, we state the following lemma:
Lemma A1.
Given assumption AS2, for all $\beta\in\mathbb{R}$ it holds that

$$\liminf_{k\to\infty}F_k(\beta) = \lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta). \qquad (A8)$$
Proof. 
To prove the lemma, we first show that $\liminf_{k\to\infty}F_k(\beta) \ge \lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta)$, and then we show that $\liminf_{k\to\infty}F_k(\beta) \le \lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta)$. Recall that by AS2, for all $\beta\in\mathbb{R}$ and $k\in\mathbb{N}$, $\tilde{F}_{k,n}(\beta)$ converges as $n\to\infty$ to $F_k(\beta)$, uniformly over k and β, i.e., for all $\eta > 0$ there exist $n_0(\eta)\in\mathbb{N}$ and $k_0\big(n_0(\eta),\eta\big)\in\mathbb{N}$ such that for every $n > n_0(\eta)$, $\beta\in\mathbb{R}$ and $k > k_0\big(n_0(\eta),\eta\big)$, it holds that $\big|\tilde{F}_{k,n}(\beta) - F_k(\beta)\big| < \eta$. Consequently, for every subsequence $0 < k_1 < k_2 < \cdots$ such that $\lim_{l\to\infty}\tilde{F}_{k_l,n}(\beta)$ exists for any $n > n_0(\eta)$, it follows from ([31] Theorem 7.11) that, as the convergence over k is uniform, the limits over n and l are interchangeable:

$$\lim_{n\to\infty}\lim_{l\to\infty}\tilde{F}_{k_l,n}(\beta) = \lim_{l\to\infty}\lim_{n\to\infty}\tilde{F}_{k_l,n}(\beta) = \lim_{l\to\infty}F_{k_l}(\beta). \qquad (A9)$$

The existence of such a convergent subsequence is guaranteed by the Bolzano-Weierstrass theorem ([31] Theorem 2.42), as $\tilde{F}_{k,n}(\beta)\in[0,1]$.
From the properties of the limit inferior ([31] Theorem 3.17), it follows that there exists a subsequence of $\{F_k(\beta)\}_{k\in\mathbb{N}}$, denoted $\{F_{k_m}(\beta)\}_{m\in\mathbb{N}}$, such that $\lim_{m\to\infty}F_{k_m}(\beta) = \liminf_{k\to\infty}F_k(\beta)$. Consequently,

$$\liminf_{k\to\infty}F_k(\beta) = \lim_{m\to\infty}F_{k_m}(\beta) \stackrel{(a)}{=} \lim_{n\to\infty}\lim_{m\to\infty}\tilde{F}_{k_m,n}(\beta) \stackrel{(b)}{\ge} \lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta), \qquad (A10)$$
where (a) follows from (A9), and (b) follows from the definition of the limit inferior ([31] Definition 3.16). Similarly, by ([31] Theorem 3.17), for any $n\in\mathbb{N}$ there exists a subsequence of $\{\tilde{F}_{k,n}(\beta)\}_{k\in\mathbb{N}}$, denoted $\{\tilde{F}_{k_l,n}(\beta)\}_{l\in\mathbb{N}}$, where $\{k_l\}_{l\in\mathbb{N}}$ satisfies $0 < k_1 < k_2 < \cdots$, such that $\lim_{l\to\infty}\tilde{F}_{k_l,n}(\beta) = \liminf_{k\to\infty}\tilde{F}_{k,n}(\beta)$. Therefore,

$$\lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta) = \lim_{n\to\infty}\lim_{l\to\infty}\tilde{F}_{k_l,n}(\beta) \stackrel{(a)}{=} \lim_{l\to\infty}F_{k_l}(\beta) \stackrel{(b)}{\ge} \liminf_{k\to\infty}F_k(\beta), \qquad (A11)$$

where (a) follows from (A9), and (b) follows from the definition of the limit inferior ([31] Definition 3.16). Therefore, $\liminf_{k\to\infty}F_k(\beta) \le \lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta)$. Combining (A10) and (A11) proves (A8) in the statement of the lemma. □
Lemma A2.
Given assumptions AS1–AS2, the sequence of RVs $\{\tilde{Z}_{k,n}\}_{k,n\in\mathbb{N}}$ satisfies

$$\lim_{n\to\infty}\ \text{p-}\limsup_{k\to\infty}\tilde{Z}_{k,n} = \inf\Big\{\beta\in\mathbb{R}\ \Big|\ \lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta) = 1\Big\}. \qquad (A12)$$
Proof. 
Since by assumption AS1, for every $n\in\mathbb{N}$, every convergent subsequence of $\{\tilde{Z}_{k,n}\}_{k\in\mathbb{N}}$ converges in distribution as $k\to\infty$ to a deterministic scalar, it follows that every convergent subsequence of $\tilde{F}_{k,n}(\beta)$ converges as $k\to\infty$ to a step function, which is the CDF of the corresponding sublimit of $\tilde{Z}_{k,n}$. In particular, the limit $\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta)$ is a step function representing the CDF of a deterministic scalar $\zeta_n$, i.e.,

$$\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta) = \begin{cases} 0 & \beta < \zeta_n \\ 1 & \beta \ge \zeta_n. \end{cases} \qquad (A13)$$
Since, by Lemma A1, AS2 implies that the limit $\lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta)$ exists (convergence to a discontinuous function is in the sense of ([31] Ex. 7.3)), $\lim_{n\to\infty}\zeta_n$ exists. Hence, we obtain that

$$\lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta) = \begin{cases} 0 & \beta < \lim_{n\to\infty}\zeta_n \\ 1 & \beta \ge \lim_{n\to\infty}\zeta_n, \end{cases} \qquad (A14)$$
and from the right-hand side of (A12) we have that

$$\inf\Big\{\beta\in\mathbb{R}\ \Big|\ \lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta) = 1\Big\} = \lim_{n\to\infty}\zeta_n. \qquad (A15)$$
Next, from (A7) and (A13) we note that

$$\text{p-}\limsup_{k\to\infty}\tilde{Z}_{k,n} = \inf\Big\{\beta\in\mathbb{R}\ \Big|\ \liminf_{k\to\infty}\tilde{F}_{k,n}(\beta) = 1\Big\} = \zeta_n.$$

Consequently, the left-hand side of (A12) is equal to $\lim_{n\to\infty}\zeta_n$. Combining with (A15), we arrive at the equality (A12) in the statement of the lemma. □
Substituting (A8) into (A7) results in
$$\text{p-}\limsup_{k\to\infty}Z_k = \inf\Big\{\beta\in\mathbb{R}\ \Big|\ \lim_{n\to\infty}\liminf_{k\to\infty}\tilde{F}_{k,n}(\beta) = 1\Big\} \stackrel{(a)}{=} \lim_{n\to\infty}\ \text{p-}\limsup_{k\to\infty}\tilde{Z}_{k,n}, \qquad (A16)$$

where (a) follows from (A12). Equation (A16) concludes the proof of (17b).

Appendix D. Proof of Theorem 4

In this appendix we detail the proof of Theorem 4. The outline of the proof is given as follows:
  • We first show in Appendix D.1 that the PDF of the random vector $S_n^{(k)}$, representing the first k samples of the CT WSCS source $S_c(t)$ sampled with the sampling interval $T_s(n) = \frac{T_{\rm ps}}{p+\epsilon_n}$, converges as $n\to\infty$, for any $k\in\mathbb{N}$, to the PDF of $S_\epsilon^{(k)}$, which represents the first k samples of the CT WSCS source $S_c(t)$ sampled with the sampling interval $T_s = \frac{T_{\rm ps}}{p+\epsilon}$. We prove that this convergence is uniform in $k\in\mathbb{N}$ and in the realization vector $s^{(k)}\in\mathbb{R}^k$. This is stated in Lemma A3.
  • Next, in Appendix D.2 we apply Theorem 3 to relate the mutual information density rates for the random source vector $S_n^{(k)}$ and its reproduction $\hat{S}_n^{(k)}$ with those of the random source vector $S_\epsilon^{(k)}$ and its reproduction $\hat{S}_\epsilon^{(k)}$. To that aim, let $F_{S_n,\hat{S}_n}$ and $F_{S_\epsilon,\hat{S}_\epsilon}$ denote the joint distributions of arbitrary-dimensional source and reproduction vectors corresponding to the synchronously sampled and the asynchronously sampled source process, respectively. We define the following mutual information density rates:

$$\tilde{Z}_{k,n}\big(F_{S_n,\hat{S}_n}\big) \triangleq \frac{1}{k}\log\frac{p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(S_n^{(k)}\,\big|\,\hat{S}_n^{(k)}\big)}{p_{S_n^{(k)}}\big(S_n^{(k)}\big)}$$

    and

$$Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big) \triangleq \frac{1}{k}\log\frac{p_{S_\epsilon^{(k)}|\hat{S}_\epsilon^{(k)}}\big(S_\epsilon^{(k)}\,\big|\,\hat{S}_\epsilon^{(k)}\big)}{p_{S_\epsilon^{(k)}}\big(S_\epsilon^{(k)}\big)}, \qquad (A17)$$

    for $k, n\in\mathbb{N}$. The RVs $\tilde{Z}_{k,n}\big(F_{S_n,\hat{S}_n}\big)$ and $Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ in (A17) denote the mutual information density rates ([14] Definition 3.2.1) between the DT source process and the corresponding reproduction process for the case of synchronous sampling and for the case of asynchronous sampling, respectively.
    We then show that if the pairs of source process and optimal reproduction process $\big(S_n[i], \hat{S}_n[i]\big)_{i\in\mathbb{N}}$ and $\big(S_\epsilon[i], \hat{S}_\epsilon[i]\big)_{i\in\mathbb{N}}$ satisfy that $p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big)\xrightarrow[n\to\infty]{} p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)$ uniformly with respect to $\hat{s}^{(k)}\in\mathbb{R}^k$ and $k\in\mathbb{N}$, and that $p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)\xrightarrow[n\to\infty]{} p_{S_\epsilon^{(k)}|\hat{S}_\epsilon^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)$ uniformly in $\big[s^{(k)T},\hat{s}^{(k)T}\big]^T\in\mathbb{R}^{2k}$ and $k\in\mathbb{N}$, then $\tilde{Z}_{k,n}\big(F_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ uniformly in $k\in\mathbb{N}$. In addition, Lemma A5 proves that every subsequence of $\big\{\tilde{Z}_{k,n}\big(F_{S_n,\hat{S}_n}\big)\big\}_{k\in\mathbb{N}}$ with respect to k, indexed as $k_l$, converges in distribution as $l\to\infty$ to a deterministic scalar.
  • Lastly, in Appendix D.3 we combine the above results to show in Lemmas A7 and A8 that $R_\epsilon(D) \le \limsup_{n\to\infty}R_n(D)$ and $R_\epsilon(D) \ge \limsup_{n\to\infty}R_n(D)$, respectively; this implies that $R_\epsilon(D) = \limsup_{n\to\infty}R_n(D)$, which proves the theorem.
To facilitate our proof, we will need uniform convergence in $k\in\mathbb{N}$ of $p_{S_n^{(k)}}\big(s^{(k)}\big)$, $p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big)$ and $p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)$ to $p_{S_\epsilon^{(k)}}\big(s^{(k)}\big)$, $p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)$ and $p_{S_\epsilon^{(k)}|\hat{S}_\epsilon^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)$, respectively. To that aim, we make the following scaling assumption w.l.o.g.:
Assumption A1.
The variance of the source and the allowed distortion are scaled by some factor $\alpha^2$ such that

$$\alpha^2\cdot\min\Big\{D,\ \min_{0\le t\le T_{\rm ps}}\sigma_{S_c}^2(t) - D\Big\} > \frac{1}{2\pi}.$$
Note that this assumption has no effect on the generality of the RDF for multivariate stationary processes detailed in ([5] Section 10.3.3), ([34] Section IV). Moreover, by Theorem 1, for every $\alpha > 0$, any rate R achievable when compressing the original source $S_c(t)$ with distortion not larger than D is also achievable when compressing the scaled source $\alpha\cdot S_c(t)$ with distortion not larger than $\alpha^2\cdot D$. Note that if for the source $S_c(t)$ the distortion satisfies $D < \min_{0\le t\le T_{\rm ps}}\sigma_{S_c}^2(t)$, then for the scaled source and distortion we have $\alpha^2\cdot D < \min_{0\le t\le T_{\rm ps}}\alpha^2\cdot\sigma_{S_c}^2(t)$.

Appendix D.1. Convergence in Distribution of $S_n^{(k)}$ to $S_\epsilon^{(k)}$ Uniformly with Respect to $k\in\mathbb{N}$

In order to prove the uniform convergence in distribution $S_n^{(k)}\xrightarrow[n\to\infty]{(dist.)} S_\epsilon^{(k)}$, uniformly with respect to $k\in\mathbb{N}$, we first prove, in Lemma A3, that as $n\to\infty$ the sequence of PDFs of $S_n^{(k)}$, $p_{S_n^{(k)}}\big(s^{(k)}\big)$, converges to the PDF of $S_\epsilon^{(k)}$, $p_{S_\epsilon^{(k)}}\big(s^{(k)}\big)$, uniformly in $s^{(k)}\in\mathbb{R}^k$ and in $k\in\mathbb{N}$. Next, we show in Corollary A1 that $S_n^{(k)}\xrightarrow[n\to\infty]{(dist.)} S_\epsilon^{(k)}$ uniformly in $k\in\mathbb{N}$.
To that aim, let us define the set $\mathcal{K}\triangleq\{1,2,\ldots,k\}$ and consider the k-dimensional zero-mean, memoryless random vectors $S_n^{(k)}$ and $S_\epsilon^{(k)}$ with their respective diagonal correlation matrices:

$$R_n^{(k)} \triangleq \mathbb{E}\big\{S_n^{(k)}\big(S_n^{(k)}\big)^T\big\} = {\rm diag}\big(\sigma_{S_n}^2[1],\ldots,\sigma_{S_n}^2[k]\big),$$

$$R_\epsilon^{(k)} \triangleq \mathbb{E}\big\{S_\epsilon^{(k)}\big(S_\epsilon^{(k)}\big)^T\big\} = {\rm diag}\big(\sigma_{S_\epsilon}^2[1],\ldots,\sigma_{S_\epsilon}^2[k]\big).$$
Since $\epsilon_n \triangleq \frac{\lfloor n\cdot\epsilon\rfloor}{n}$, it holds that $\frac{n\cdot\epsilon - 1}{n} \le \epsilon_n \le \frac{n\cdot\epsilon}{n}$; therefore

$$\lim_{n\to\infty}\epsilon_n = \epsilon. \qquad (A20)$$

Now we note that since $\sigma_{S_c}^2(t)$ is uniformly continuous, then by the definition of a uniformly continuous function, for each $i\in\mathbb{N}$, the limit in (A20) implies that

$$\lim_{n\to\infty}\sigma_{S_n}^2[i] \triangleq \lim_{n\to\infty}\sigma_{S_c}^2\Big(i\cdot\frac{T_{\rm ps}}{p+\epsilon_n}\Big) = \sigma_{S_c}^2\Big(i\cdot\frac{T_{\rm ps}}{p+\epsilon}\Big) \triangleq \sigma_{S_\epsilon}^2[i].$$

From Assumption A1, it follows that $\sigma_{S_n}^2[i] > \frac{1}{2\pi}$; hence, we can state the following lemma:
Lemma A3.
The PDF of $S_n^{(k)}$, $p_{S_n^{(k)}}\big(s^{(k)}\big)$, converges as $n\to\infty$ to the PDF of $S_\epsilon^{(k)}$, $p_{S_\epsilon^{(k)}}\big(s^{(k)}\big)$, uniformly in $s^{(k)}\in\mathbb{R}^k$ and in $k\in\mathbb{N}$:

$$\lim_{n\to\infty}p_{S_n^{(k)}}\big(s^{(k)}\big) = p_{S_\epsilon^{(k)}}\big(s^{(k)}\big), \qquad \forall s^{(k)}\in\mathbb{R}^k,\ k\in\mathbb{N}.$$
Proof. 
The proof of the lemma directly follows the steps in the proof of ([17] Lemma B.1), which applies to Gaussian random vectors with independent entries whose variances are larger than $\frac{1}{2\pi}$. □
Lemma A3 gives rise to the following corollary:
Corollary A1.
For any $k\in\mathbb{N}$ it holds that $S_n^{(k)}\xrightarrow[n\to\infty]{(dist.)} S_\epsilon^{(k)}$, and the convergence is uniform over k.
Proof. 
The corollary holds due to ([35] Theorem 1): Since $p_{S_n^{(k)}}\big(s^{(k)}\big)$ converges to $p_{S_\epsilon^{(k)}}\big(s^{(k)}\big)$, then $S_n^{(k)}\xrightarrow[n\to\infty]{(dist.)} S_\epsilon^{(k)}$. In addition, since the convergence of the PDFs is uniform in $k\in\mathbb{N}$, the convergence of the CDFs is also uniform in $k\in\mathbb{N}$. □
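The convergence statements of this appendix are easy to probe numerically. The sketch below (Python; the smooth raised-cosine profile is a hypothetical stand-in for a uniformly continuous $\sigma^2_{S_c}(t)$, and the parameter values are ours) computes $\epsilon_n = \lfloor n\epsilon\rfloor/n$ and reports the worst-case deviation $\max_{i\le k}\big|\sigma^2_{S_n}[i]-\sigma^2_{S_\epsilon}[i]\big|$ over a fixed block of k samples, which shrinks as $\epsilon_n\to\epsilon$:

```python
import numpy as np

T_PS, p, eps = 5.0, 2, 5 * np.pi / 32      # assumed CT period and asynchronous mismatch

def var_ct(t):
    # hypothetical smooth (hence uniformly continuous) periodic CT variance
    return 0.6 + 0.4 * np.cos(2 * np.pi * t / T_PS)

k = 200
i = np.arange(1, k + 1)
var_eps = var_ct(i * T_PS / (p + eps))               # sigma^2_{S_eps}[i]
for n in (10, 100, 1000, 10_000):
    eps_n = np.floor(n * eps) / n                    # rational approximation of eps
    var_n = var_ct(i * T_PS / (p + eps_n))           # sigma^2_{S_n}[i]
    print(n, np.max(np.abs(var_n - var_eps)))        # -> 0 as eps_n -> eps (fixed k)
```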

Appendix D.2. Showing that $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$ and $Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ Satisfy the Conditions of Theorem 3

Let $F^{opt}_{S_n,\hat{S}_n}$ denote the joint distribution of the source process and the corresponding optimal reproduction process satisfying the distortion constraint D. We next prove that if $F^{opt}_{S_n,\hat{S}_n}\xrightarrow[n\to\infty]{(dist.)} F_{S_\epsilon,\hat{S}_\epsilon}$, then $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$ and $Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ satisfy AS1–AS2. In particular, Lemma A4 proves that $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ uniformly in $k\in\mathbb{N}$ for the optimal zero-mean Gaussian reproduction vectors with independent entries. Lemma A5 proves that for any fixed n, $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$ converges in distribution to a deterministic scalar as $k\to\infty$.
Lemma A4.
Let $\{\hat{S}_n^{(k)}\}_{n\in\mathbb{N}}$ and $\{W_n^{(k)}\}_{n\in\mathbb{N}}$ be two sequences of mutually independent $k\times 1$ zero-mean Gaussian random vectors related via the backward channel (20), each having independent entries, and denote their PDFs by $p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big)$ and $p_{W_n^{(k)}}\big(w^{(k)}\big)$, respectively. Consider two other zero-mean Gaussian random vectors $\hat{S}_\epsilon^{(k)}$ and $W_\epsilon^{(k)}$, each having independent entries, with PDFs $p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)$ and $p_{W_\epsilon^{(k)}}\big(w^{(k)}\big)$, respectively, such that $\lim_{n\to\infty}p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big) = p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)$ uniformly in $\hat{s}^{(k)}\in\mathbb{R}^k$ and in $k\in\mathbb{N}$, and $\lim_{n\to\infty}p_{W_n^{(k)}}\big(w^{(k)}\big) = p_{W_\epsilon^{(k)}}\big(w^{(k)}\big)$ uniformly in $w^{(k)}\in\mathbb{R}^k$ and in $k\in\mathbb{N}$. Then, the RVs $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$ and $Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ defined via (A17) satisfy $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ uniformly over $k\in\mathbb{N}$.
Proof. 
To begin the proof, for $\big[s^{(k)T},\hat{s}^{(k)T}\big]^T\in\mathbb{R}^{2k}$, define

$$f_{k,n}\big(s^{(k)},\hat{s}^{(k)}\big) \triangleq \frac{p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)}{p_{S_n^{(k)}}\big(s^{(k)}\big)}, \qquad f_{k,\epsilon}\big(s^{(k)},\hat{s}^{(k)}\big) \triangleq \frac{p_{S_\epsilon^{(k)}|\hat{S}_\epsilon^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)}{p_{S_\epsilon^{(k)}}\big(s^{(k)}\big)}.$$
Now, we recall the backward channel relationship (20):
$$S_n^{(k)} = \hat{S}_n^{(k)} + W_n^{(k)}, \qquad (A23)$$
where $\hat{S}_n^{(k)}$ and $W_n^{(k)}$ are mutually independent zero-mean Gaussian random vectors with independent entries, corresponding to the optimal compression process and its respective distortion. From this relationship we obtain

$$p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big) \stackrel{(a)}{=} p_{\hat{S}_n^{(k)}+W_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big) = p_{W_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}|\hat{s}^{(k)}\big) \stackrel{(b)}{=} p_{W_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}\big), \qquad (A24)$$
where (a) follows since $S_n^{(k)} = \hat{S}_n^{(k)} + W_n^{(k)}$, see (A23), and (b) follows since $W_n^{(k)}$ and $\hat{S}_n^{(k)}$ are mutually independent. The joint PDF of $S_n^{(k)}$ and $\hat{S}_n^{(k)}$ can be expressed via the conditional PDF as:
$$p_{S_n^{(k)},\hat{S}_n^{(k)}}\big(s^{(k)},\hat{s}^{(k)}\big) = p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)\cdot p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big) \stackrel{(a)}{=} p_{W_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}\big)\cdot p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big), \qquad (A25)$$
where (a) follows from (A24). Since $\hat{S}_n^{(k)}$ and $W_n^{(k)}$ are Gaussian and mutually independent, and since the product of two multivariate Gaussian PDFs is also a multivariate Gaussian PDF ([36] Section 3), it follows from (A25) that $S_n^{(k)}$ and $\hat{S}_n^{(k)}$ are jointly Gaussian. By the mutual independence of $W_n^{(k)}$ and $\hat{S}_n^{(k)}$, the right-hand side (RHS) of (A25) is also equal to the joint PDF of $\big[W_n^{(k)T},\hat{S}_n^{(k)T}\big]^T$, denoted by $p_{W_n^{(k)},\hat{S}_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)},\hat{s}^{(k)}\big)$. Now, from (A24), the assumption $\lim_{n\to\infty}p_{W_n^{(k)}}\big(w^{(k)}\big) = p_{W_\epsilon^{(k)}}\big(w^{(k)}\big)$ implies that a limit exists for the conditional PDF $p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)$, which we denote by $p_{S_\epsilon^{(k)}|\hat{S}_\epsilon^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)$. Combining this with the assumption $\lim_{n\to\infty}p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big) = p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)$, we have that
$$\lim_{n\to\infty}p_{S_n^{(k)},\hat{S}_n^{(k)}}\big(s^{(k)},\hat{s}^{(k)}\big) = \lim_{n\to\infty}p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)\cdot p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big) \stackrel{(a)}{=} \lim_{n\to\infty}p_{W_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}\big)\cdot p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big) \stackrel{(b)}{=} \lim_{n\to\infty}p_{W_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}\big)\cdot\lim_{n\to\infty}p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big) = p_{S_\epsilon^{(k)}|\hat{S}_\epsilon^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)\cdot p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big) = p_{S_\epsilon^{(k)},\hat{S}_\epsilon^{(k)}}\big(s^{(k)},\hat{s}^{(k)}\big), \qquad (A26)$$
where (a) follows from (A24), and (b) follows since the limit of each sequence in the product exists ([31] Theorem 3.3). Convergence is uniform in $\big[s^{(k)T},\hat{s}^{(k)T}\big]^T\in\mathbb{R}^{2k}$ and in $k\in\mathbb{N}$, as each sequence converges uniformly in $k\in\mathbb{N}$ ([31] Page 165). Observe that the joint PDF of the zero-mean Gaussian random vectors $S_n^{(k)},\hat{S}_n^{(k)}$ is given by the general expression:
$$p_{S_n^{(k)},\hat{S}_n^{(k)}}\big(s^{(k)},\hat{s}^{(k)}\big) = \Big({\rm Det}\big(2\pi\tilde{C}_n^{(2k)}\big)\Big)^{-\frac{1}{2}}\exp\Big(-\frac{1}{2}\big[\hat{s}^{(k)T},s^{(k)T}\big]\big(\tilde{C}_n^{(2k)}\big)^{-1}\big[\hat{s}^{(k)T},s^{(k)T}\big]^T\Big), \qquad (A27)$$
where $\tilde{C}_n^{(2k)}$ denotes the joint covariance matrix of $\big[\hat{S}_n^{(k)T},S_n^{(k)T}\big]^T$. From (A27) we note that $p_{S_n^{(k)},\hat{S}_n^{(k)}}\big(s^{(k)},\hat{s}^{(k)}\big)$ is a continuous mapping of $\tilde{C}_n^{(2k)}$ with respect to the index n, see ([17] Lemma B.1). Hence, the convergence in (A26) of $p_{S_n^{(k)},\hat{S}_n^{(k)}}\big(s^{(k)},\hat{s}^{(k)}\big)$ as $n\to\infty$ directly implies the convergence of $\tilde{C}_n^{(2k)}$ as $n\to\infty$ to a limit which we denote by $\tilde{C}_\epsilon^{(2k)}$. It therefore follows that the limit function $p_{S_\epsilon^{(k)},\hat{S}_\epsilon^{(k)}}\big(s^{(k)},\hat{s}^{(k)}\big)$ corresponds to the PDF of a Gaussian vector with covariance matrix $\tilde{C}_\epsilon^{(2k)}$.
The joint PDF of the zero-mean Gaussian random vectors $W_n^{(k)},\hat{S}_n^{(k)}$ can be obtained using their mutual independence as:

$$p_{W_n^{(k)},\hat{S}_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)},\hat{s}^{(k)}\big) = \Big({\rm Det}\big(2\pi\Sigma_n^{(2k)}\big)\Big)^{-\frac{1}{2}}\exp\Big(-\frac{1}{2}\big[\big(s^{(k)}-\hat{s}^{(k)}\big)^T,\hat{s}^{(k)T}\big]\big(\Sigma_n^{(2k)}\big)^{-1}\big[\big(s^{(k)}-\hat{s}^{(k)}\big)^T,\hat{s}^{(k)T}\big]^T\Big),$$
where $\Sigma_n^{(2k)}$ denotes the joint covariance matrix of $\big[W_n^{(k)T},\hat{S}_n^{(k)T}\big]^T$. Since the vectors $W_n^{(k)}$ and $\hat{S}_n^{(k)}$ are zero-mean and mutually independent and, by the relationship (20), each vector has independent entries, it follows that $\Sigma_n^{(2k)}$ is a diagonal matrix whose diagonal elements are the corresponding temporal variances at the respective indexes $i\in\{1,2,\ldots,k\}$, i.e.,

$$\Sigma_n^{(2k)} \triangleq \mathbb{E}\Big\{\big[W_n^{(k)T},\hat{S}_n^{(k)T}\big]^T\cdot\big[W_n^{(k)T},\hat{S}_n^{(k)T}\big]\Big\} = {\rm diag}\Big(\mathbb{E}\big\{W_n[1]^2\big\},\mathbb{E}\big\{W_n[2]^2\big\},\ldots,\mathbb{E}\big\{W_n[k]^2\big\},\sigma_{\hat{S}_n}^2[1],\sigma_{\hat{S}_n}^2[2],\ldots,\sigma_{\hat{S}_n}^2[k]\Big). \qquad (A29)$$
The convergence of $p_{W_n^{(k)},\hat{S}_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)},\hat{s}^{(k)}\big)$, from (A26), implies the convergence of the diagonal elements in (A29) as $n\to\infty$. Hence, $\Sigma_n^{(2k)}$ converges as $n\to\infty$ to a diagonal joint covariance matrix which we denote by $\Sigma_\epsilon^{(2k)}$. This further implies that the limiting vectors $W_\epsilon^{(k)}$ and $\hat{S}_\epsilon^{(k)}$ are zero-mean and mutually independent, and each vector has independent entries for $i\in\{1,2,\ldots,k\}$.
Relationship (A26) implies that the joint limit distribution satisfies $p_{S_\epsilon^{(k)},\hat{S}_\epsilon^{(k)}}\big(s^{(k)},\hat{s}^{(k)}\big) = p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)\cdot p_{W_\epsilon^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}\big)$. Consequently, we can define an asymptotic backward channel that satisfies (A26) via the expression:

$$S_\epsilon^{(k)}[i] = \hat{S}_\epsilon^{(k)}[i] + W_\epsilon^{(k)}[i]. \qquad (A30)$$
Next, by the convergence of the joint PDF $p_{W_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}\big)\cdot p_{\hat{S}_n^{(k)}}\big(\hat{s}^{(k)}\big)$, uniformly in $k\in\mathbb{N}$ and in $\big[s^{(k)T},\hat{s}^{(k)T}\big]^T\in\mathbb{R}^{2k}$, it follows from ([35] Theorem 1) that $\big[\hat{S}_n^{(k)T},W_n^{(k)T}\big]^T\xrightarrow[n\to\infty]{(dist.)}\big[\hat{S}_\epsilon^{(k)T},W_\epsilon^{(k)T}\big]^T$, and the convergence is uniform in $k\in\mathbb{N}$ and in $\big[s^{(k)T},\hat{s}^{(k)T}\big]^T\in\mathbb{R}^{2k}$. Then, by the continuous mapping theorem (CMT) ([37] Theorem 7.7), we have

$$\big[S_n^{(k)T},\hat{S}_n^{(k)T}\big]^T = \big[\big(\hat{S}_n^{(k)}+W_n^{(k)}\big)^T,\hat{S}_n^{(k)T}\big]^T \xrightarrow[n\to\infty]{(dist.)} \big[\big(\hat{S}_\epsilon^{(k)}+W_\epsilon^{(k)}\big)^T,\hat{S}_\epsilon^{(k)T}\big]^T = \big[S_\epsilon^{(k)T},\hat{S}_\epsilon^{(k)T}\big]^T.$$
Now, using the extended CMT ([37] Theorem 7.24), we will show that $f_{k,n}\big(S_n^{(k)},\hat{S}_n^{(k)}\big)\xrightarrow[n\to\infty]{(dist.)} f_{k,\epsilon}\big(S_\epsilon^{(k)},\hat{S}_\epsilon^{(k)}\big)$ for each $k\in\mathbb{N}$, following the same approach as in the proof of ([17] Lemma B.2). Then, since $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big) = \frac{1}{k}\log f_{k,n}\big(S_n^{(k)},\hat{S}_n^{(k)}\big)$ and $Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big) = \frac{1}{k}\log f_{k,\epsilon}\big(S_\epsilon^{(k)},\hat{S}_\epsilon^{(k)}\big)$, we conclude that $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$, where it also follows from the proof of ([17] Lemma B.2) that the convergence is uniform in $k\in\mathbb{N}$. Specifically, to prove that $f_{k,n}\big(S_n^{(k)},\hat{S}_n^{(k)}\big)\xrightarrow[n\to\infty]{(dist.)} f_{k,\epsilon}\big(S_\epsilon^{(k)},\hat{S}_\epsilon^{(k)}\big)$, we show that the following two properties hold:
P1. The distribution of $\big[S_\epsilon^{(k)T},\hat{S}_\epsilon^{(k)T}\big]^T$ is separable (as defined in ([37] Pg. 101)).
P2. For any convergent sequence $\big[s_n^{(k)T},\hat{s}_n^{(k)T}\big]^T\in\mathbb{R}^{2k}$ such that $\lim_{n\to\infty}\big(s_n^{(k)},\hat{s}_n^{(k)}\big) = \big(s_\epsilon^{(k)},\hat{s}_\epsilon^{(k)}\big)$, it holds that $\lim_{n\to\infty}f_{k,n}\big(s_n^{(k)},\hat{s}_n^{(k)}\big) = f_{k,\epsilon}\big(s_\epsilon^{(k)},\hat{s}_\epsilon^{(k)}\big)$.
To prove property P1, we show that $U^{(k)}\triangleq\big[S_\epsilon^{(k)T},\hat{S}_\epsilon^{(k)T}\big]^T$ (here we slightly abuse the dimension notation, as $U^{(k)}$ denotes a 2k-dimensional vector) is separable ([37] Pg. 101), i.e., we show that for every $\eta > 0$ there exists $\beta > 0$ such that $\Pr\big\{\|U^{(k)}\|^2 > \beta\big\} < \eta$. To that aim, recall first that by Markov's inequality ([29] Pg. 114), it follows that $\Pr\big\{\|U^{(k)}\|^2 > \beta\big\} < \frac{1}{\beta}\mathbb{E}\big\{\|U^{(k)}\|^2\big\}$. For the asynchronously sampled source process, we note that $\sigma_{S_\epsilon}^2[i]\triangleq\mathbb{E}\big\{S_\epsilon[i]^2\big\}\in\big[0,\max_{0\le t\le T_{\rm ps}}\sigma_{S_c}^2(t)\big]$. By the independence of $W_\epsilon^{(k)}$ and $\hat{S}_\epsilon^{(k)}$, and by the fact that their means are zero, we have from (A30) that $\mathbb{E}\big\{S_\epsilon[i]^2\big\} = \mathbb{E}\big\{\hat{S}_\epsilon[i]^2\big\} + \mathbb{E}\big\{W_\epsilon[i]^2\big\} \le \max_{0\le t\le T_{\rm ps}}\sigma_{S_c}^2(t)$; hence $\mathbb{E}\big\{\hat{S}_\epsilon[i]^2\big\} \le \max_{0\le t\le T_{\rm ps}}\sigma_{S_c}^2(t)$ and $\mathbb{E}\big\{W_\epsilon[i]^2\big\} \le \max_{0\le t\le T_{\rm ps}}\sigma_{S_c}^2(t)$. This further implies that $\mathbb{E}\big\{\|U^{(k)}\|^2\big\} = \mathbb{E}\big\{\big\|\big[S_\epsilon^{(k)T},\hat{S}_\epsilon^{(k)T}\big]^T\big\|^2\big\} \le 2\cdot k\cdot\max_{0\le t\le T_{\rm ps}}\sigma_{S_c}^2(t)$; therefore, for each $\beta > \frac{1}{\eta}\mathbb{E}\big\{\|U^{(k)}\|^2\big\}$ we have that $\Pr\big\{\|U^{(k)}\|^2 > \beta\big\} < \eta$, and thus $U^{(k)}$ is separable.
By the assumption in this lemma, for every $\eta > 0$ there exists $n_0(\eta) > 0$ such that for all $n > n_0(\eta)$, $w^{(k)}\in\mathbb{R}^k$ and all sufficiently large $k\in\mathbb{N}$, it holds that $\big|p_{W_n^{(k)}}\big(w^{(k)}\big) - p_{W_\epsilon^{(k)}}\big(w^{(k)}\big)\big| < \eta$. Consequently, for all $\big[s^{(k)T},\hat{s}^{(k)T}\big]^T\in\mathbb{R}^{2k}$, $n > n_0(\eta)$ and sufficiently large $k\in\mathbb{N}$, it follows from (A24) that

$$\Big|p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big) - p_{S_\epsilon^{(k)}|\hat{S}_\epsilon^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)\Big| = \Big|p_{W_n^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}\big) - p_{W_\epsilon^{(k)}}\big(s^{(k)}-\hat{s}^{(k)}\big)\Big| < \eta.$$
By the continuity of $p_{S_n^{(k)}|\hat{S}_n^{(k)}}\big(s^{(k)}|\hat{s}^{(k)}\big)$ and of $p_{S_n^{(k)}}\big(s^{(k)}\big)$, $f_{k,n}\big(s^{(k)},\hat{s}^{(k)}\big)$ is also continuous ([31] Theorem 4.9); hence, when $\lim_{n\to\infty}\big(s_n^{(k)},\hat{s}_n^{(k)}\big) = \big(s^{(k)},\hat{s}^{(k)}\big)$, then $\lim_{n\to\infty}f_{k,n}\big(s_n^{(k)},\hat{s}_n^{(k)}\big) = f_{k,\epsilon}\big(s^{(k)},\hat{s}^{(k)}\big)$. This satisfies condition P2 of the extended CMT; therefore, by the extended CMT, we have that $f_{k,n}\big(S_n^{(k)},\hat{S}_n^{(k)}\big)\xrightarrow[n\to\infty]{(dist.)} f_{k,\epsilon}\big(S_\epsilon^{(k)},\hat{S}_\epsilon^{(k)}\big)$. Since the RVs $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$ and $Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ defined in (A17) are continuous mappings of $f_{k,n}\big(S_n^{(k)},\hat{S}_n^{(k)}\big)$ and of $f_{k,\epsilon}\big(S_\epsilon^{(k)},\hat{S}_\epsilon^{(k)}\big)$, respectively, it follows from the CMT ([37] Theorem 7.7) that $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$.
Finally, to prove that the convergence $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ is uniform in $k\in\mathbb{N}$, we note that $\hat{S}_n^{(k)}$ and $\hat{S}_\epsilon^{(k)}$ have independent entries, and that the backward channels (21) and (A30) are memoryless. Hence, it follows from the proof of ([17] Lemma B.2) that the characteristic function of the RV $k\cdot\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$, denoted by $\Phi_{k\cdot\tilde{Z}_{k,n}}(\alpha)\triangleq\mathbb{E}\big\{e^{j\cdot\alpha\cdot k\cdot\tilde{Z}_{k,n}}\big\}$, converges to the characteristic function of $k\cdot Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$, denoted by $\Phi_{k\cdot Z_{k,\epsilon}}(\alpha)$, uniformly over $k\in\mathbb{N}$. Thus, for all sufficiently small $\eta > 0$ and $k_0\in\mathbb{N}$, there exists $n_0(\eta,k_0)\in\mathbb{N}$ such that for all $n > n_0(\eta,k_0)$ and $k > k_0$:

$$\big|\Phi_{k\cdot\tilde{Z}_{k,n}}(\alpha) - \Phi_{k\cdot Z_{k,\epsilon}}(\alpha)\big| < \eta, \qquad \forall\alpha\in\mathbb{R}.$$
Hence, following Lévy's convergence theorem ([38] Theorem 18.1), we conclude that $k\cdot\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} k\cdot Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$, and that this convergence is uniform for sufficiently large k. Finally, since the CDF of $k\cdot\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$ (respectively, of $k\cdot Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$) evaluated at $\alpha\in\mathbb{R}$ equals the CDF of $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$ (respectively, of $Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$) evaluated at $\frac{\alpha}{k}\in\mathbb{R}$, we conclude that $\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ uniformly in $k\in\mathbb{N}$. □
The following convergence lemma corresponds to ([17] Lemma B.3):
Lemma A5.
Let $n\in\mathbb{N}$ be given. Every subsequence of $\big\{\tilde{Z}_{k,n}\big(F^{opt}_{\hat{S}_n,S_n}\big)\big\}_{k\in\mathbb{N}}$, indexed by $k_l$, converges in distribution, as $l\to\infty$, to a finite deterministic scalar.
Proof. 
Recall that the RVs $\tilde{Z}_{k,n}\big(F^{opt}_{\hat{S}_n,S_n}\big)$ represent the mutual information density rate between k samples of the source process $S_n[i]$ and the corresponding samples of its reproduction process $\hat{S}_n[i]$, where these processes are jointly distributed via the Gaussian distribution measure $F^{opt}_{\hat{S}_n,S_n}$. Further, recall that the relationship between the source signal and the reproduction process which achieves the RDF can be described via the backward channel (21) for a Gaussian source. The channel (21) is a memoryless additive WSCS Gaussian noise channel with period $p_n$; thus, by [21], it can be equivalently represented as a $p_n\times 1$ multivariate memoryless additive stationary Gaussian noise channel, which is an information stable channel ([39] Section 1.5). For such channels, in which the source and its reproduction obey the RDF-achieving joint distribution $F^{opt}_{S_n,\hat{S}_n}$, the mutual information density rate converges as k increases, almost surely, to the finite and deterministic mutual information rate ([14] Theorem 5.9.1). Since almost sure convergence implies convergence in distribution ([37] Lemma 7.21), this proves the lemma. □
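The WSCS-to-multivariate equivalence invoked in this proof can be sketched numerically: serializing a memoryless WSCS Gaussian process into non-overlapping blocks of one period yields i.i.d. multivariate Gaussian vectors, i.e., a stationary (and information stable) vector source. A minimal illustration (Python, with an arbitrary period-3 variance pattern chosen only for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
p_n = 3                                    # assumed period of the DT variance
var = np.array([1.0, 0.3, 0.7])            # arbitrary per-phase variances
m = 200_000                                # number of periods
blocks = rng.normal(size=(m, p_n)) * np.sqrt(var)   # memoryless WSCS samples, one period per row
# The rows are i.i.d. p_n-variate Gaussian vectors: the blocked (polyphase)
# process is stationary, while the serialized process has periodic variance.
print(blocks.var(axis=0))                  # ~ [1.0, 0.3, 0.7] for every block index
```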

Appendix D.3. Showing that $R_\epsilon(D) = \limsup_{n\to\infty}R_n(D)$

This section completes the proof of Theorem 4. We note from (14) that the RDF of the source process $S_n[i]$ (for fixed-length coding and the MSE distortion measure) is given by:

$$R_n(D) = \inf_{F_{\hat{S}_n,S_n}:\ \bar{d}_S\big(F_{\hat{S}_n,S_n}\big)\le D}\ \text{p-}\limsup_{k\to\infty}\tilde{Z}_{k,n}\big(F_{\hat{S}_n,S_n}\big),$$

where $\bar{d}_S\big(F_{\hat{S}_n,S_n}\big) = \limsup_{k\to\infty}\frac{1}{k}\mathbb{E}\big\{\big\|S_n^{(k)}-\hat{S}_n^{(k)}\big\|^2\big\}$.
We now state the following lemma, characterizing the asymptotic statistics of the optimal reproduction process $\hat{S}_n^{(k)}$ and of the respective noise process $W_n^{(k)}$ used in the backward channel relationship (21):
Lemma A6.
Consider the RDF-achieving distribution with distortion D for the compression of the vector Gaussian source process $S_n^{(k)}$, characterized by the backward channel (21). Then, there exists a subsequence in the index $n\in\mathbb{N}$, denoted $n_1 < n_2 < \cdots$, such that for the RDF-achieving distribution, the sequences of reproduction vectors $\{\hat{S}_{n_l}^{(k)}\}_{l\in\mathbb{N}}$ and backward-channel noise vectors $\{W_{n_l}^{(k)}\}_{l\in\mathbb{N}}$ satisfy $\lim_{l\to\infty}p_{\hat{S}_{n_l}^{(k)}}\big(\hat{s}^{(k)}\big) = p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)$ uniformly in $\hat{s}^{(k)}\in\mathbb{R}^k$ and in $k\in\mathbb{N}$, as well as $\lim_{l\to\infty}p_{W_{n_l}^{(k)}}\big(w^{(k)}\big) = p_{W_\epsilon^{(k)}}\big(w^{(k)}\big)$ uniformly in $w^{(k)}\in\mathbb{R}^k$ and in $k\in\mathbb{N}$, where $p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)$ and $p_{W_\epsilon^{(k)}}\big(w^{(k)}\big)$ are Gaussian PDFs.
Proof. 
Recall from the analysis of the RDF for WSCS processes that for each $n\in\mathbb{N}$, the marginal distributions of the RDF-achieving reproduction process $\hat{S}_n[i]$ and of the backward-channel noise $W_n[i]$ are Gaussian, memoryless and zero-mean, with variances $\sigma_{\hat{S}_n}^2[i]\triangleq\mathbb{E}\big\{\hat{S}_n[i]^2\big\}$ and

$$\mathbb{E}\big\{W_n[i]^2\big\} = \sigma_{S_n}^2[i] - \sigma_{\hat{S}_n}^2[i], \qquad (A34)$$
respectively. Consequently, the sequences of reproduction vectors $\{\hat{S}_n^{(k)}\}_{n\in\mathbb{N}}$ and backward-channel noise vectors $\{W_n^{(k)}\}_{n\in\mathbb{N}}$ are zero-mean Gaussian with independent entries for each $k\in\mathbb{N}$. Since $\sigma_{S_n}^2[i] \le \max_{t\in\mathbb{R}}\sigma_{S_c}^2(t)$, it follows from (A34) that $\sigma_{\hat{S}_n}^2[i]$ is also bounded in the interval $\big[0,\max_{t\in\mathbb{R}}\sigma_{S_c}^2(t)\big]$ for all $n\in\mathbb{N}$. Therefore, by the Bolzano-Weierstrass theorem ([31] Theorem 2.42), $\sigma_{\hat{S}_n}^2[i]$ has a convergent subsequence; we let $n_1 < n_2 < \cdots$ denote the indexes of this convergent subsequence and denote the limit of the subsequence by $\sigma_{\hat{S}_\epsilon}^2[i]$. From the CMT, as applied in the proof of ([17] Lemma B.1), the convergence $\sigma_{\hat{S}_{n_l}}^2[i]\xrightarrow[l\to\infty]{}\sigma_{\hat{S}_\epsilon}^2[i]$ for each $i\in\mathbb{N}$ implies that the subsequence of PDFs $p_{\hat{S}_{n_l}^{(k)}}\big(\hat{s}^{(k)}\big)$ corresponding to the memoryless Gaussian random vectors $\{\hat{S}_{n_l}^{(k)}\}_{l\in\mathbb{N}}$ converges as $l\to\infty$ to a Gaussian PDF, which we denote by $p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)$, and the convergence of $p_{\hat{S}_{n_l}^{(k)}}\big(\hat{s}^{(k)}\big)$ is uniform in $\hat{s}^{(k)}$ for any fixed $k\in\mathbb{N}$. By Remark 2, $W_n[i]$ is a memoryless stationary process with variance $\mathbb{E}\big\{W_n[i]^2\big\} = D$ and, by Equation (A34), $\sigma_{\hat{S}_n}^2[i] = \sigma_{S_n}^2[i] - D$. Hence, by Assumption A1 and by the proof of ([17] Lemma B.1), it follows that for fixed $\eta > 0$ and $k_0\in\mathbb{N}$, there exists $n_0(\eta,k_0)$ such that for all $n > n_0(\eta,k_0)$ and all sufficiently large k, it holds that $\big|p_{\hat{S}_{n_l}^{(k)}}\big(\hat{s}^{(k)}\big) - p_{\hat{S}_\epsilon^{(k)}}\big(\hat{s}^{(k)}\big)\big| < \eta$ for every $\hat{s}^{(k)}\in\mathbb{R}^k$. Since $n_0(\eta,k_0)$ does not depend on k (only on the fixed $k_0$), this implies that the convergence is uniform with respect to $k\in\mathbb{N}$.
The fact that $W_n[i]$ is a zero-mean stationary Gaussian process with variance D for each $n\in\mathbb{N}$ implies that the sequence of PDFs $p_{W_n^{(k)}}\big(w^{(k)}\big)$ converges as $n\to\infty$ to a Gaussian PDF, which we denote by $p_{W_\epsilon^{(k)}}\big(w^{(k)}\big)$; hence its subsequence with indexes $n_1 < n_2 < \cdots$ also converges to $p_{W_\epsilon^{(k)}}\big(w^{(k)}\big)$. Since $D > \frac{1}{2\pi}$ by Assumption A1, combined with the proof of ([17] Lemma B.1), it follows that this convergence is uniform in $w^{(k)}$ and in $k\in\mathbb{N}$.
Following the proof of Corollary A1, it holds that the subsequences of the memoryless Gaussian random vectors $\hat{S}_{n_l}^{(k)}$ and $W_{n_l}^{(k)}$ converge in distribution as $l\to\infty$ to Gaussian limits, and the convergence is uniform in $k\in\mathbb{N}$. Hence, as shown in Lemma A4, the joint distribution satisfies $\big[S_{n_l}^{(k)T},\hat{S}_{n_l}^{(k)T}\big]^T\xrightarrow[l\to\infty]{(dist.)}\big[S_\epsilon^{(k)T},\hat{S}_\epsilon^{(k)T}\big]^T$, and the limit distribution is jointly Gaussian. □
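As a numerical sanity check of the backward-channel representation used in Lemma A6 (a sketch under the stated Gaussian assumptions, with arbitrary example values for the variance and distortion), drawing $\hat{S}_n[i]\sim\mathcal{N}(0,\sigma^2-D)$ independently of $W_n[i]\sim\mathcal{N}(0,D)$ and setting $S_n[i]=\hat{S}_n[i]+W_n[i]$ reproduces a source of variance $\sigma^2$, an MSE of exactly D, and a per-sample mutual information of $\frac{1}{2}\log_2(\sigma^2/D)$:

```python
import numpy as np

rng = np.random.default_rng(2)
var_s, D, m = 1.0, 0.18, 1_000_000                    # example variance and distortion
s_hat = rng.normal(scale=np.sqrt(var_s - D), size=m)  # reproduction, variance sigma^2 - D
w = rng.normal(scale=np.sqrt(D), size=m)              # backward-channel noise, variance D
s = s_hat + w                                         # source samples, variance sigma^2
print(s.var(), np.mean((s - s_hat) ** 2))             # ~ 1.0 and ~ 0.18
rho2 = np.corrcoef(s, s_hat)[0, 1] ** 2               # squared correlation coefficient
print(-0.5 * np.log2(1 - rho2), 0.5 * np.log2(var_s / D))  # empirical vs analytic I(S; S_hat)
```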
Lemma A7.
The RDF of $\{S_\epsilon[i]\}$ satisfies $R_\epsilon(D) \le \limsup_{n\to\infty}R_n(D)$, and the rate $\limsup_{n\to\infty}R_n(D)$ is achievable for the source $\{S_\epsilon[i]\}$ with distortion D when the reproduction process obeys a Gaussian distribution.
Proof. 
According to Lemma A6, the sequence of joint distributions $\{F^{opt}_{S_n,\hat{S}_n}\}_{n\in\mathbb{N}}$ has a convergent subsequence, i.e., there exists a set of indexes $n_1 < n_2 < \cdots$ such that the sequence of distributions with independent entries $\{F^{opt}_{S_{n_l},\hat{S}_{n_l}}\}_{l\in\mathbb{N}}$ converges as $l\to\infty$ to a joint Gaussian distribution $F_{S_\epsilon,\hat{S}_\epsilon}$, and the convergence is uniform in $k\in\mathbb{N}$. Hence, the condition of Lemma A4 is satisfied, implying that $\tilde{Z}_{k,n_l}\big(F^{opt}_{S_{n_l},\hat{S}_{n_l}}\big)\xrightarrow[l\to\infty]{(dist.)} Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ uniformly in $k\in\mathbb{N}$. Moreover, by Lemma A5, every subsequence of $\big\{\tilde{Z}_{k,n_l}\big(F^{opt}_{S_{n_l},\hat{S}_{n_l}}\big)\big\}_{l\in\mathbb{N}}$ converges in distribution to a finite deterministic scalar as $k\to\infty$. Therefore, by Theorem 3, it holds that

$$\lim_{l\to\infty}\ \text{p-}\limsup_{k\to\infty}\tilde{Z}_{k,n_l}\big(F^{opt}_{S_{n_l},\hat{S}_{n_l}}\big) = \text{p-}\limsup_{k\to\infty}Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big) \ge \inf_{F_{S_\epsilon,\hat{S}_\epsilon}}\ \text{p-}\limsup_{k\to\infty}Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big) = R_\epsilon(D). \qquad (A35)$$
From (14) we have that $R_n(D) = \text{p-}\limsup_{k\to\infty}\tilde{Z}_{k,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)$; then, from (A35), it follows that

$$R_\epsilon(D) \le \lim_{l\to\infty}R_{n_l}(D) \stackrel{(a)}{\le} \limsup_{n\to\infty}R_n(D),$$
where (a) follows since, by ([31] Definition 3.16), the limit of every subsequence is not greater than the limit superior. Noting that $F_{S_\epsilon,\hat{S}_\epsilon}$ is Gaussian by Lemma A6 concludes the proof. □
Lemma A8.
The RDF of $\{S_\epsilon[i]\}$ satisfies $R_\epsilon(D) \ge \limsup_{n\to\infty}R_n(D)$.
Proof. 
To prove this lemma, we first show that for a joint distribution $F_{S_\epsilon,\hat{S}_\epsilon}$ which achieves a rate-distortion pair $(R_\epsilon, D)$, it holds that $R_\epsilon \ge \limsup_{k\to\infty}\mathbb{E}\big\{Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\}$: Recall that $(R_\epsilon, D)$ is an achievable rate-distortion pair for the source $\{S_\epsilon[i]\}$, namely, there exists a sequence of codes $\{\mathcal{C}_l\}$ whose rate-distortion pairs approach $(R_\epsilon, D)$ when applied to $\{S_\epsilon[i]\}$. This implies that for any $\eta > 0$ there exists $l_0(\eta)$ such that for all $l > l_0(\eta)$, $\mathcal{C}_l$ has a code rate $R_l = \frac{1}{l}\log_2 M_l$ satisfying $R_l \le R_\epsilon + \eta$ by (3). Recalling Definition 5, the source code maps $S_\epsilon^{(l)}$ into a discrete index $J_l\in\{1,2,\ldots,M_l\}$, which is in turn mapped into $\hat{S}_\epsilon^{(l)}$, i.e., $S_\epsilon^{(l)}\to J_l\to\hat{S}_\epsilon^{(l)}$ form a Markov chain. Since $J_l$ is a discrete random variable taking values in $\{1,2,\ldots,M_l\}$, it holds that

$$\log_2 M_l \ge H(J_l) \stackrel{(a)}{\ge} I\big(S_\epsilon^{(l)};J_l\big) \stackrel{(b)}{\ge} I\big(S_\epsilon^{(l)};\hat{S}_\epsilon^{(l)}\big), \qquad (A37)$$
where (a) follows since $I\big(S_\epsilon^{(l)};J_l\big) = H(J_l) - H\big(J_l|S_\epsilon^{(l)}\big)$, which is not larger than $H(J_l)$ as $J_l$ takes discrete values, while (b) follows from the data processing inequality ([5] Chapter 2.8). Now, (A37) implies that for each $l > l_0(\eta)$, the reproduction obtained using the code $\mathcal{C}_l$ satisfies $\frac{1}{l}I\big(S_\epsilon^{(l)};\hat{S}_\epsilon^{(l)}\big) \le \frac{1}{l}\log_2 M_l \le R_\epsilon + \eta$. Since for every arbitrarily small $\eta \ge 0$ this inequality holds for all $l > l_0(\eta)$, i.e., for all sufficiently large l, it follows that $R_\epsilon \ge \limsup_{l\to\infty}\frac{1}{l}I\big(S_\epsilon^{(l)};\hat{S}_\epsilon^{(l)}\big)$. Hence, replacing the blocklength symbol l with k, as $\frac{1}{k}I\big(S_\epsilon^{(k)};\hat{S}_\epsilon^{(k)}\big) = \mathbb{E}\big\{Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\}$ ([5] Equation (2.3)), we conclude that
$$R_\epsilon(D) \ge \limsup_{k\to\infty}\mathbb{E}\big\{Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\}. \qquad (A38)$$
Next, we consider $\limsup_{k\to\infty}\mathbb{E}\big\{Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\}$: Let $\big\{\mathbb{E}\big\{Z_{k_l,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\}\big\}_{l\in\mathbb{N}}$ be a subsequence of $\big\{\mathbb{E}\big\{Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\}\big\}_{k\in\mathbb{N}}$ with indexes $k_1 < k_2 < \cdots$ such that its limit equals the limit superior, i.e., $\lim_{l\to\infty}\mathbb{E}\big\{Z_{k_l,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\} = \limsup_{k\to\infty}\mathbb{E}\big\{Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\}$. Since, by Lemma A4, the sequence of non-negative RVs $\big\{\tilde{Z}_{k_l,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\big\}_{n\in\mathbb{N}}$ converges in distribution to $Z_{k_l,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ as $n\to\infty$, uniformly in $k\in\mathbb{N}$, it follows from ([40] Theorem 3.5) that $\mathbb{E}\big\{Z_{k_l,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\} = \lim_{n\to\infty}\mathbb{E}\big\{\tilde{Z}_{k_l,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\big\}$. Moreover, we define the family of distributions $\mathcal{F}(D) = \big\{F_{S,\hat{S}}:\ \bar{d}_S\big(F_{S,\hat{S}}\big)\le D\big\}$. Consequently, Equation (A38) can now be written as:
$$R_\epsilon(D) \ge \limsup_{k\to\infty}\mathbb{E}\big\{Z_{k,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)\big\} = \lim_{l\to\infty}\lim_{n\to\infty}\mathbb{E}\big\{\tilde{Z}_{k_l,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\big\} \stackrel{(a)}{=} \lim_{n\to\infty}\lim_{l\to\infty}\mathbb{E}\big\{\tilde{Z}_{k_l,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\big\} \stackrel{(b)}{=} \limsup_{n\to\infty}\lim_{l\to\infty}\mathbb{E}\big\{\tilde{Z}_{k_l,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\big\} \ge \limsup_{n\to\infty}\lim_{l\to\infty}\inf_{F_{S,\hat{S}}\in\mathcal{F}(D)}\mathbb{E}\big\{\tilde{Z}_{k_l,n}\big(F_{S,\hat{S}}\big)\big\} \stackrel{(c)}{=} \limsup_{n\to\infty}\lim_{l\to\infty}\inf_{F_{S,\hat{S}}\in\mathcal{F}(D)}\frac{1}{k_l}I\big(\hat{S}_n^{(k_l)};S_n^{(k_l)}\big), \qquad (A39)$$
where (a) follows since the convergence $\tilde{Z}_{k_l,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\xrightarrow[n\to\infty]{(dist.)} Z_{k_l,\epsilon}\big(F_{S_\epsilon,\hat{S}_\epsilon}\big)$ is uniform with respect to $k_l$, thus the limits are interchangeable ([31] Theorem 7.11); (b) follows since the limit of the subsequence $\mathbb{E}\big\{\tilde{Z}_{k_l,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\big\}$ exists in the index n and is therefore equal to the limit superior $\limsup_{n\to\infty}\mathbb{E}\big\{\tilde{Z}_{k_l,n}\big(F^{opt}_{S_n,\hat{S}_n}\big)\big\}$ ([31] Page 57); and (c) holds since mutual information is the expected value of the mutual information density rate ([5] Equation (2.30)). Finally, we recall that in the proof of Lemma A5 it was established that the backward channel for the RDF at the distortion constraint D, defined in (21), is information stable; hence, for such backward channels, we have from ([41] Theorem 1) that the minimum rate is given by $R_n(D) = \lim_{k\to\infty}\inf_{F_{S,\hat{S}}\in\mathcal{F}(D)}\frac{1}{k}I\big(\hat{S}_n^{(k)};S_n^{(k)}\big)$ and the limit exists; hence, $\lim_{k\to\infty}\inf_{F_{S,\hat{S}}\in\mathcal{F}(D)}\frac{1}{k}I\big(\hat{S}_n^{(k)};S_n^{(k)}\big) = \lim_{l\to\infty}\inf_{F_{S,\hat{S}}\in\mathcal{F}(D)}\frac{1}{k_l}I\big(\hat{S}_n^{(k_l)};S_n^{(k_l)}\big)$ in the index k. Substituting this into Equation (A39) yields the result:
$$R_\epsilon(D) \ge \limsup_{n\to\infty}R_n(D).$$
This proves the lemma. □
Combining Lemmas A7 and A8 proves that $R_\epsilon(D) = \limsup_{n\to\infty}R_n(D)$, and that this rate is achievable with Gaussian inputs, completing the proof of the theorem.

References

  1. Gardner, W.; Brown, W.; Chen, C.K. Spectral correlation of modulated signals: Part II—Digital modulation. IEEE Trans. Commun. 1987, 35, 595–601.
  2. Giannakis, G.B. Cyclostationary signal analysis. In Digital Signal Processing Handbook; CRC Press: Boca Raton, FL, USA, 1998; pp. 17–21.
  3. Gardner, W.A.; Napolitano, A.; Paura, L. Cyclostationarity: Half a century of research. Signal Process. 2006, 86, 639–697.
  4. Berger, T.; Gibson, J.D. Lossy source coding. IEEE Trans. Inf. Theory 1998, 44, 2693–2723.
  5. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley & Sons: New York, NY, USA, 2006.
  6. Wolf, J.K.; Wyner, A.D.; Ziv, J. Source coding for multiple descriptions. Bell Syst. Tech. J. 1980, 59, 1417–1426.
  7. Wyner, A.D.; Ziv, J. The rate-distortion function for source coding with side information at the decoder. IEEE Trans. Inf. Theory 1976, 22, 1–10.
  8. Oohama, Y. Gaussian multiterminal source coding. IEEE Trans. Inf. Theory 1997, 43, 1912–1923.
  9. Pandya, A.; Kansal, A.; Pottie, G.; Srivastava, M. Lossy source coding of multiple Gaussian sources: m-helper problem. In Proceedings of the IEEE Information Theory Workshop, San Antonio, TX, USA, 24–29 October 2004; pp. 34–38.
  10. Gallager, R.G. Information Theory and Reliable Communication; Springer: Berlin, Germany, 1968; Volume 588.
  11. Harrison, M.T. The generalized asymptotic equipartition property: Necessary and sufficient conditions. IEEE Trans. Inf. Theory 2008, 54, 3211–3216.
  12. Kipnis, A.; Goldsmith, A.J.; Eldar, Y.C. The distortion rate function of cyclostationary Gaussian processes. IEEE Trans. Inf. Theory 2018, 64, 3810–3824.
  13. Napolitano, A. Cyclostationarity: New trends and applications. Signal Process. 2016, 120, 385–408.
  14. Han, T.S. Information-Spectrum Methods in Information Theory; Springer: Berlin, Germany, 2003; Volume 50.
  15. Verdú, S.; Han, T.S. A general formula for channel capacity. IEEE Trans. Inf. Theory 1994, 40, 1147–1157.
  16. Zeng, W.; Mitran, P.; Kavcic, A. On the information stability of channels with timing errors. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Seattle, WA, USA, 9–14 July 2006; pp. 1885–1889.
  17. Shlezinger, N.; Abakasanga, E.; Dabora, R.; Eldar, Y.C. The capacity of memoryless channels with sampled cyclostationary Gaussian noise. IEEE Trans. Commun. 2020, 68, 106–121.
  18. Shannon, C.E. Communication in the presence of noise. Proc. IEEE 1998, 86, 447–457.
  19. Cherif, F. A various types of almost periodic functions on Banach spaces: Part I. Int. Math. Forum 2011, 6, 921–952.
  20. Shlezinger, N.; Dabora, R. On the capacity of narrowband PLC channels. IEEE Trans. Commun. 2015, 63, 1191–1201.
  21. Shlezinger, N.; Dabora, R. The capacity of discrete-time Gaussian MIMO channels with periodic characteristics. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, 10–16 July 2016; pp. 1058–1062.
  22. Shlezinger, N.; Zahavi, D.; Murin, Y.; Dabora, R. The secrecy capacity of Gaussian MIMO channels with finite memory. IEEE Trans. Inf. Theory 2017, 63, 1874–1897.
  23. Heath, R.W.; Giannakis, G.B. Exploiting input cyclostationarity for blind channel identification in OFDM systems. IEEE Trans. Signal Process. 1999, 47, 848–856.
  24. Shaked, R.; Shlezinger, N.; Dabora, R. Joint estimation of carrier frequency offset and channel impulse response for linear periodic channels. IEEE Trans. Commun. 2017, 66, 302–319.
  25. Shlezinger, N.; Dabora, R. Frequency-shift filtering for OFDM signal recovery in narrowband power line communications. IEEE Trans. Commun. 2014, 62, 1283–1295.
  26. El Gamal, A.; Kim, Y.H. Network Information Theory; Cambridge University Press: Cambridge, UK, 2011.
  27. Wu, X.; Xie, L.L. On the optimal compressions in the compress-and-forward relay schemes. IEEE Trans. Inf. Theory 2013, 59, 2613–2628.
  28. Zitkovic, G. Lecture Notes on the Theory of Probability, Parts I and II. Available online: https://web.ma.utexas.edu/users/gordanz/lecture_notes_page.html (accessed on 12 March 2020).
  29. Papoulis, A. Probability, Random Variables, and Stochastic Processes; McGraw-Hill: New York, NY, USA, 2002.
  30. Zamir, R.; Kochman, Y.; Erez, U. Achieving the Gaussian rate-distortion function by prediction. IEEE Trans. Inf. Theory 2008, 54, 3354–3364.
  31. Rudin, W. Principles of Mathematical Analysis; International Series in Pure and Applied Mathematics; McGraw-Hill: New York, NY, USA, 1976.
  32. Dixmier, J. General Topology; Springer: New York, NY, USA, 1984.
  33. Stein, E.M.; Shakarchi, R. Real Analysis: Measure Theory, Integration, and Hilbert Spaces; Princeton University Press: Princeton, NJ, USA, 2009.
  34. Kolmogorov, A. On the Shannon theory of information transmission in the case of continuous signals. IRE Trans. Inf. Theory 1956, 2, 102–108.
  35. Scheffé, H. A useful convergence theorem for probability distributions. Ann. Math. Stat. 1947, 18, 434–438.
  36. Bromiley, P. Products and convolutions of Gaussian probability density functions. Tina-Vision Memo 2003, 3, 1.
  37. Kosorok, M.R. Introduction to Empirical Processes and Semiparametric Inference; Springer: New York, NY, USA, 2008.
  38. Williams, D. Probability with Martingales; Cambridge University Press: Cambridge, UK, 1991.
  39. Dobrushin, R.L. A general formulation of the fundamental theorem of Shannon in the theory of information. Uspekhi Matematicheskikh Nauk 1959, 14, 3–104.
  40. Billingsley, P. Convergence of Probability Measures; John Wiley & Sons: New York, NY, USA, 2013.
  41. Venkataramanan, R.; Pradhan, S.S. Source coding with feed-forward: Rate-distortion theorems and error exponents for a general source. IEEE Trans. Inf. Theory 2007, 53, 2154–2179.
Figure 1. Source coding block diagram.
Figure 2. $R_n(D)$ versus n; offset $\phi = 0$.
Figure 3. $R_n(D)$ versus n; offset $\phi = \frac{1}{16}$.
Figure 4. $R_n(D)$ versus $\frac{T_{\rm ps}}{T_s}$; offset $\phi = 0$.
Figure 5. $R_n(D)$ versus $\frac{T_{\rm ps}}{T_s}$; offset $\phi = \frac{1}{16}$.
Figure 6. $R_n(D)$ versus D; offset $\phi = 0$.
Figure 7. $R_n(D)$ versus D; offset $\phi = \frac{1}{16}$.
Figure 8. $R_n(D)$ versus $\phi$ at $t_{\rm dc} = 75\%$.
