Article

Rate-Distortion Function Upper Bounds for Gaussian Vectors and Their Applications in Coding AR Sources

by Jesús Gutiérrez-Gutiérrez *, Marta Zárraga-Rodríguez, Fernando M. Villar-Rosety and Xabier Insausti
Tecnun, University of Navarra, Paseo de Manuel Lardizábal 13, 20018 San Sebastián, Spain
* Author to whom correspondence should be addressed.
Submission received: 16 April 2018 / Revised: 14 May 2018 / Accepted: 20 May 2018 / Published: 23 May 2018
(This article belongs to the Special Issue Rate-Distortion Theory and Information Theory)

Abstract:
In this paper, we give upper bounds for the rate-distortion function (RDF) of any Gaussian vector, and we propose coding strategies to achieve such bounds. We use these strategies to reduce the computational complexity of coding Gaussian asymptotically wide sense stationary (AWSS) autoregressive (AR) sources. Furthermore, we also give sufficient conditions for AR processes to be AWSS.

1. Introduction

In 1956, Kolmogorov [1] gave a formula for the rate-distortion function (RDF) of Gaussian vectors and the RDF of Gaussian wide sense stationary (WSS) sources. Later, in 1970 Gray [2] obtained a formula for the RDF of Gaussian autoregressive (AR) sources.
In 1973, Pearl [3] gave an upper bound for the RDF of finite-length data blocks of Gaussian WSS sources, but he did not propose a coding strategy to achieve his bound for a given block length. In [4], we presented two tighter upper bounds for the RDF of finite-length data blocks of Gaussian WSS sources, and we proposed low-complexity coding strategies, based on the discrete Fourier transform (DFT), to achieve such bounds. Moreover, we proved that those two upper bounds tend to the RDF of the WSS source (computed by Kolmogorov in [1]) when the size of the data block grows.
In the present paper, we generalize the upper bounds and the two low-complexity coding strategies presented in [4] to any Gaussian vector. Therefore, in contrast to [4], here no assumption is made about the structure of the correlation matrix of the Gaussian vector (observe that, since the sources in [4] were WSS, the correlation matrices of the Gaussian vectors considered there were Toeplitz). To obtain such a generalization, we start our analysis by first proving several new results on the DFT of random vectors. Although another new result on the DFT was presented in [4] (Theorem 1), it cannot be used here, because that result and its proof rely on the power spectral density (PSD) of a WSS process and its properties.
The two low-complexity strategies presented here are applied to the coding of finite-length data blocks of Gaussian AR sources. Specifically, we prove that the rates (upper bounds) corresponding to these two strategies tend to the RDF of the AR source (computed by Gray in [2]) as the size of the data block grows, provided that the AR source is asymptotically WSS (AWSS).
The definition of an AWSS process was introduced by Gray in [5] (Chapter 6), and it is based on his concept of asymptotically equivalent sequences of matrices [6]. Sufficient conditions for AR processes to be AWSS can be found in [5] (Theorem 6.2) and [7] (Theorem 7). In this paper, we present other sufficient conditions which make it easier to check in practice whether an AR process is AWSS.
The paper is organized as follows. In Section 2 we obtain several new results on the DFT of random vectors which are used in Section 3. In Section 3 we give upper bounds for the RDF of Gaussian vectors, and we propose coding strategies to achieve such bounds. In Section 4 we apply the strategies proposed in Section 3 to reduce the computational complexity of coding Gaussian AWSS AR sources. In Section 5 we give sufficient conditions for AR processes to be AWSS. We finish the paper with a numerical example and conclusions.

2. Several New Results on the DFT of Random Vectors

We begin by introducing some notation. $\mathbb{C}$ denotes the set of (finite) complex numbers, $i$ is the imaginary unit, $\mathrm{Re}$ and $\mathrm{Im}$ denote real and imaginary parts, respectively, $*$ stands for conjugate transpose, $\top$ denotes transpose, and $\lambda_k(A)$, $k \in \{1,\ldots,n\}$, are the eigenvalues of an $n \times n$ Hermitian matrix $A$ arranged in decreasing order. $E$ stands for expectation, and $V_n$ is the $n \times n$ Fourier unitary matrix, i.e.,
$$[V_n]_{j,k} = \frac{1}{\sqrt{n}}\, e^{-\frac{2\pi (j-1)(k-1)}{n} i}, \qquad j,k \in \{1,\ldots,n\}.$$
If $z \in \mathbb{C}$ then $\hat{z}$ denotes the real (column) vector
$$\hat{z} = \begin{pmatrix} \mathrm{Re}(z) \\ \mathrm{Im}(z) \end{pmatrix}.$$
If $z_k \in \mathbb{C}$ for all $k \in \{1,\ldots,n\}$ then $z_{n:1}$ is the $n$-dimensional vector given by
$$z_{n:1} = \begin{pmatrix} z_n \\ z_{n-1} \\ z_{n-2} \\ \vdots \\ z_1 \end{pmatrix}.$$
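As a quick numerical illustration of the notation above (a minimal sketch, assuming Python with NumPy; the helper name is ours), the following builds $V_n$, checks that it is unitary, and computes the DFT $y_{n:1} = V_n^* x_{n:1}$ of a random vector:

```python
import numpy as np

def fourier_unitary_matrix(n):
    """n x n Fourier unitary matrix: [V_n]_{j,k} = exp(-2*pi*(j-1)*(k-1)*i/n) / sqrt(n)."""
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(-2j * np.pi * j * k / n) / np.sqrt(n)

n = 8
V = fourier_unitary_matrix(n)
print(np.allclose(V @ V.conj().T, np.eye(n)))       # True: V_n is unitary

x = np.random.default_rng(0).standard_normal(n)     # a realization of a random vector
y = V.conj().T @ x                                   # DFT: y_{n:1} = V_n^* x_{n:1}
print(np.allclose(y, np.sqrt(n) * np.fft.ifft(x)))   # with this sign convention, V_n^* x = sqrt(n)*ifft(x)
```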
In this section, we give several new results on the DFT of random vectors in two theorems and one lemma.
Theorem 1.
Let $y_{n:1}$ be the DFT of an $n$-dimensional random vector $x_{n:1}$, that is, $y_{n:1} = V_n^* x_{n:1}$.
  • If $k \in \{1,\ldots,n\}$ then
    $$\lambda_n\big(E(x_{n:1}x_{n:1}^*)\big) \le E\big(|x_k|^2\big) \le \lambda_1\big(E(x_{n:1}x_{n:1}^*)\big) \tag{1}$$
    and
    $$\lambda_n\big(E(x_{n:1}x_{n:1}^*)\big) \le E\big(|y_k|^2\big) \le \lambda_1\big(E(x_{n:1}x_{n:1}^*)\big). \tag{2}$$
  • If the random vector $x_{n:1}$ is real and $k \in \{1,\ldots,n-1\} \setminus \{\frac{n}{2}\}$ then
    $$\frac{\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)}{2} \le E\big((\mathrm{Re}(y_k))^2\big) \le \frac{\lambda_1\big(E(x_{n:1}x_{n:1}^\top)\big)}{2} \tag{3}$$
    and
    $$\frac{\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)}{2} \le E\big((\mathrm{Im}(y_k))^2\big) \le \frac{\lambda_1\big(E(x_{n:1}x_{n:1}^\top)\big)}{2}. \tag{4}$$
Proof. 
(1) We first prove that if $W_n$ is an $n \times n$ unitary matrix then
$$\lambda_n\big(E(x_{n:1}x_{n:1}^*)\big) \le \Big[W_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, W_n^*\Big]_{n-k+1,\,n-k+1} \le \lambda_1\big(E(x_{n:1}x_{n:1}^*)\big). \tag{5}$$
We have
$$\begin{aligned}
\Big[W_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, W_n^*\Big]_{k_1,k_2}
&= \sum_{h=1}^{n} [W_n]_{k_1,h} \Big[\mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, W_n^*\Big]_{h,k_2} \\
&= \sum_{h=1}^{n} [W_n]_{k_1,h} \sum_{l=1}^{n} \Big[\mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\Big]_{h,l}\, [W_n^*]_{l,k_2} \\
&= \sum_{h=1}^{n} [W_n]_{k_1,h}\, \lambda_h\big(E(x_{n:1}x_{n:1}^*)\big)\, \overline{[W_n]_{k_2,h}}
\end{aligned} \tag{6}$$
for all $k_1,k_2 \in \{1,\ldots,n\}$, and hence,
$$\Big[W_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, W_n^*\Big]_{n-k+1,\,n-k+1} = \sum_{h=1}^{n} \lambda_h\big(E(x_{n:1}x_{n:1}^*)\big)\, \big|[W_n]_{n-k+1,h}\big|^2.$$
Consequently,
$$\lambda_n\big(E(x_{n:1}x_{n:1}^*)\big) \sum_{h=1}^{n} \big|[W_n]_{n-k+1,h}\big|^2 \le \Big[W_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, W_n^*\Big]_{n-k+1,\,n-k+1} \le \lambda_1\big(E(x_{n:1}x_{n:1}^*)\big) \sum_{h=1}^{n} \big|[W_n]_{n-k+1,h}\big|^2,$$
and applying
$$\sum_{h=1}^{n} \big|[W_n]_{n-k+1,h}\big|^2 = \sum_{h=1}^{n} [W_n]_{n-k+1,h}\, [W_n^*]_{h,n-k+1} = [W_n W_n^*]_{n-k+1,n-k+1} = [I_n]_{n-k+1,n-k+1} = 1,$$
where $I_n$ denotes the $n \times n$ identity matrix, we obtain Equation (5).
Let $E(x_{n:1}x_{n:1}^*) = U_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, U_n^{-1}$ be a diagonalization of $E(x_{n:1}x_{n:1}^*)$ where the eigenvector matrix $U_n$ is unitary. As
$$E\big(|x_k|^2\big) = \big[E(x_{n:1}x_{n:1}^*)\big]_{n-k+1,n-k+1} = \Big[U_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, U_n^*\Big]_{n-k+1,n-k+1},$$
Equation (1) follows directly by taking $W_n = U_n$ in Equation (5).
Since
$$\begin{aligned}
E\big(|y_k|^2\big) &= \big[E(y_{n:1}y_{n:1}^*)\big]_{n-k+1,n-k+1}
= \big[E\big(V_n^* x_{n:1} (V_n^* x_{n:1})^*\big)\big]_{n-k+1,n-k+1}
= \big[V_n^*\, E(x_{n:1}x_{n:1}^*)\, (V_n^*)^*\big]_{n-k+1,n-k+1} \\
&= \Big[V_n^* U_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, U_n^* (V_n^*)^*\Big]_{n-k+1,n-k+1}
= \Big[V_n^* U_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^*)\big)\big)\, \big(V_n^* U_n\big)^*\Big]_{n-k+1,n-k+1},
\end{aligned} \tag{7}$$
taking $W_n = V_n^* U_n$ in Equation (5) we obtain Equation (2).
(2) Applying [4] (Equation (10)) and taking $W_n = U_n$ in Equation (6) yields
$$\begin{aligned}
E\big((\mathrm{Re}(y_k))^2\big)
&= \frac{1}{n} \sum_{k_1,k_2=1}^{n} \cos\frac{2\pi(1-k_1)k}{n}\, \cos\frac{2\pi(1-k_2)k}{n}\, E\big(x_{n-k_1+1} x_{n-k_2+1}\big)
= \frac{1}{n} \sum_{k_1,k_2=1}^{n} \cos\frac{2\pi(1-k_1)k}{n}\, \cos\frac{2\pi(1-k_2)k}{n}\, \big[E(x_{n:1}x_{n:1}^\top)\big]_{k_1,k_2} \\
&= \frac{1}{n} \sum_{k_1,k_2=1}^{n} \cos\frac{2\pi(1-k_1)k}{n}\, \cos\frac{2\pi(1-k_2)k}{n}\, \Big[U_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^\top)\big)\big)\, U_n^*\Big]_{k_1,k_2} \\
&= \frac{1}{n} \sum_{k_1,k_2=1}^{n} \cos\frac{2\pi(1-k_1)k}{n}\, \cos\frac{2\pi(1-k_2)k}{n} \sum_{h=1}^{n} [U_n]_{k_1,h}\, \lambda_h\big(E(x_{n:1}x_{n:1}^\top)\big)\, \overline{[U_n]_{k_2,h}} \\
&= \frac{1}{n} \sum_{h=1}^{n} \lambda_h\big(E(x_{n:1}x_{n:1}^\top)\big) \sum_{k_1=1}^{n} \cos\frac{2\pi(1-k_1)k}{n}\, [U_n]_{k_1,h}\, \overline{\sum_{k_2=1}^{n} \cos\frac{2\pi(1-k_2)k}{n}\, [U_n]_{k_2,h}}
= \frac{1}{n} \sum_{h=1}^{n} \lambda_h\big(E(x_{n:1}x_{n:1}^\top)\big) \left| \sum_{l=1}^{n} \cos\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2,
\end{aligned}$$
and therefore,
$$\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)\, \frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} \cos\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2 \le E\big((\mathrm{Re}(y_k))^2\big) \le \lambda_1\big(E(x_{n:1}x_{n:1}^\top)\big)\, \frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} \cos\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2.$$
Analogously, it can be proved that
$$\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)\, \frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} \sin\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2 \le E\big((\mathrm{Im}(y_k))^2\big) \le \lambda_1\big(E(x_{n:1}x_{n:1}^\top)\big)\, \frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} \sin\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2.$$
To finish the proof we only need to show that
$$\frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} \cos\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2 = \frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} \sin\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2 = \frac{1}{2}. \tag{8}$$
If $b_1, \ldots, b_n$ are $n$ real numbers then
$$\frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} b_l\, [U_n]_{l,h} \right|^2
= \frac{1}{n} \sum_{h=1}^{n} \sum_{k_1=1}^{n} b_{k_1} [U_n]_{k_1,h}\, \overline{\sum_{k_2=1}^{n} b_{k_2} [U_n]_{k_2,h}}
= \frac{1}{n} \sum_{k_1,k_2=1}^{n} b_{k_1} b_{k_2} \sum_{h=1}^{n} [U_n]_{k_1,h} [U_n^*]_{h,k_2}
= \frac{1}{n} \sum_{k_1,k_2=1}^{n} b_{k_1} b_{k_2} \big[U_n U_n^*\big]_{k_1,k_2}
= \frac{1}{n} \sum_{k_1,k_2=1}^{n} b_{k_1} b_{k_2} \big[I_n\big]_{k_1,k_2}
= \frac{1}{n} \sum_{l=1}^{n} b_l^2, \tag{9}$$
and thus,
$$\frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} \sin\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2
= \frac{1}{n} \sum_{l=1}^{n} \left( \sin\frac{2\pi(1-l)k}{n} \right)^2
= \frac{1}{n} \sum_{l=1}^{n} \left( 1 - \left( \cos\frac{2\pi(1-l)k}{n} \right)^2 \right)
= 1 - \frac{1}{n} \sum_{l=1}^{n} \left( \cos\frac{2\pi(1-l)k}{n} \right)^2
= 1 - \frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} \cos\frac{2\pi(1-l)k}{n}\, [U_n]_{l,h} \right|^2.$$
Equation (8) now follows directly from [4] (Equation (15)). ☐
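The bounds of Theorem 1 can be checked numerically. The following is a minimal sketch (assuming Python with NumPy, and a correlation matrix generated at random purely for illustration): it computes the second-order moments of the DFT coefficients directly from $E(x_{n:1}x_{n:1}^\top)$ and verifies Equations (2)–(4).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
B = rng.standard_normal((n, n))
R = B @ B.T + n * np.eye(n)                    # a positive definite E(x x^T), no structure assumed
lam = np.linalg.eigvalsh(R)                    # ascending: lam[0] = lambda_n, lam[-1] = lambda_1

V = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n) / np.sqrt(n)
W = V.conj().T                                 # y = W x is the DFT of x

E_abs_y2 = np.real(np.diag(W @ R @ W.conj().T))        # E(|y_k|^2)
E_re2 = np.diag(W.real @ R @ W.real.T)                 # E(Re(y_k)^2), since Re(y) = Re(W) x for real x
E_im2 = np.diag(W.imag @ R @ W.imag.T)                 # E(Im(y_k)^2)

# Equation (2): lambda_n <= E(|y_k|^2) <= lambda_1 for every k
print(np.all(lam[0] <= E_abs_y2 + 1e-9) and np.all(E_abs_y2 <= lam[-1] + 1e-9))

# Equations (3)-(4): for the complex coefficients (all rows except the ones holding y_n and y_{n/2})
mask = np.ones(n, dtype=bool)
mask[[0, n // 2]] = False
print(np.all(lam[0] / 2 <= E_re2[mask] + 1e-9) and np.all(E_re2[mask] <= lam[-1] / 2 + 1e-9))
print(np.all(lam[0] / 2 <= E_im2[mask] + 1e-9) and np.all(E_im2[mask] <= lam[-1] / 2 + 1e-9))
```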
Lemma 1.
Let $y_{n:1}$ be the DFT of an $n$-dimensional random vector $x_{n:1}$. If $k \in \{1,\ldots,n\}$ then
  • $E\big(|y_k|^2\big) = \big[V_n^*\, E(x_{n:1}x_{n:1}^*)\, V_n\big]_{n-k+1,\,n-k+1}$.
  • $E\big(y_k^2\big) = \big[V_n^*\, E(x_{n:1}x_{n:1}^\top)\, \overline{V_n}\big]_{n-k+1,\,n-k+1}$.
  • $E\big(\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\big) = \frac{1}{2}\,\mathrm{Im}\big(E(y_k^2)\big)$.
  • $E\big((\mathrm{Re}(y_k))^2\big) = \frac{E(|y_k|^2) + \mathrm{Re}(E(y_k^2))}{2}$.
  • $E\big((\mathrm{Im}(y_k))^2\big) = \frac{E(|y_k|^2) - \mathrm{Re}(E(y_k^2))}{2}$.
Proof. 
(1) It is a direct consequence of Equation (7).
(2) We have
$$E\big(y_k^2\big) = \big[E(y_{n:1}y_{n:1}^\top)\big]_{n-k+1,n-k+1} = \big[E\big(V_n^* x_{n:1} (V_n^* x_{n:1})^\top\big)\big]_{n-k+1,n-k+1} = \big[E\big(V_n^* x_{n:1} x_{n:1}^\top\, \overline{V_n}\big)\big]_{n-k+1,n-k+1} = \big[V_n^*\, E(x_{n:1}x_{n:1}^\top)\, \overline{V_n}\big]_{n-k+1,n-k+1}.$$
(3) Observe that
$$E\big(y_k^2\big) = E\big((\mathrm{Re}(y_k))^2 - (\mathrm{Im}(y_k))^2 + 2\,\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\, i\big) = E\big((\mathrm{Re}(y_k))^2\big) - E\big((\mathrm{Im}(y_k))^2\big) + 2\, E\big(\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\big)\, i, \tag{10}$$
and hence,
$$\mathrm{Im}\big(E(y_k^2)\big) = 2\, E\big(\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\big).$$
(4) and (5) From Equation (10) we obtain
$$\mathrm{Re}\big(E(y_k^2)\big) = E\big((\mathrm{Re}(y_k))^2\big) - E\big((\mathrm{Im}(y_k))^2\big). \tag{11}$$
Furthermore,
$$E\big(|y_k|^2\big) = E\big((\mathrm{Re}(y_k))^2 + (\mathrm{Im}(y_k))^2\big) = E\big((\mathrm{Re}(y_k))^2\big) + E\big((\mathrm{Im}(y_k))^2\big). \tag{12}$$
(4) and (5) follow directly from Equations (11) and (12). ☐
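The identities of Lemma 1 are also easy to verify numerically. A minimal sketch (Python with NumPy, illustrative correlation matrix) follows; items (3)–(5) are checked against moments computed directly from $\mathrm{Re}(y) = \mathrm{Re}(V_n^*)\,x$ and $\mathrm{Im}(y) = \mathrm{Im}(V_n^*)\,x$ for a real vector $x$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
B = rng.standard_normal((n, n))
R = B @ B.T + n * np.eye(n)                                  # E(x x^T) of a real random vector
V = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n) / np.sqrt(n)
W = V.conj().T

E_abs2 = np.real(np.diag(W @ R @ W.conj().T))                # item 1: E(|y_k|^2)
E_sq = np.diag(W @ R @ W.T)                                  # item 2: E(y_k^2)

E_re2 = np.diag(W.real @ R @ W.real.T)
E_im2 = np.diag(W.imag @ R @ W.imag.T)
E_re_im = np.diag(W.real @ R @ W.imag.T)

print(np.allclose(E_re_im, np.imag(E_sq) / 2))               # item 3
print(np.allclose(E_re2, (E_abs2 + np.real(E_sq)) / 2))      # item 4
print(np.allclose(E_im2, (E_abs2 - np.real(E_sq)) / 2))      # item 5
```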
Theorem 2.
Let $y_{n:1}$ be the DFT of a real $n$-dimensional random vector $x_{n:1}$. If $k \in \{1,\ldots,n-1\} \setminus \{\frac{n}{2}\}$ then
$$\frac{\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)}{2} \le \lambda_2\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big) \le \lambda_1\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big) \le \frac{\lambda_1\big(E(x_{n:1}x_{n:1}^\top)\big)}{2}.$$
Proof. 
Fix $r \in \{1,2\}$ and consider a real unit eigenvector $v = (v_1, v_2)^\top$ corresponding to $\lambda_r\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big)$. We have
$$\lambda_r\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big) = \lambda_r\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big)\, v^\top v = v^\top \lambda_r\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big)\, v = v^\top E\big(\hat{y_k}\,\hat{y_k}^{\,\top}\big)\, v.$$
From [4] (Equation (10)) we obtain
$$E\big(\hat{y_k}\,\hat{y_k}^{\,\top}\big) = \frac{1}{n} \sum_{k_1,k_2=1}^{n} E\big(x_{n-k_1+1} x_{n-k_2+1}\big)
\begin{pmatrix}
\cos\frac{2\pi(1-k_1)k}{n} \cos\frac{2\pi(1-k_2)k}{n} & \cos\frac{2\pi(1-k_1)k}{n} \sin\frac{2\pi(1-k_2)k}{n} \\[1ex]
\sin\frac{2\pi(1-k_1)k}{n} \cos\frac{2\pi(1-k_2)k}{n} & \sin\frac{2\pi(1-k_1)k}{n} \sin\frac{2\pi(1-k_2)k}{n}
\end{pmatrix}
= \frac{1}{n} \sum_{k_1,k_2=1}^{n} \big[E(x_{n:1}x_{n:1}^\top)\big]_{k_1,k_2}\, w_{k_1} w_{k_2}^\top$$
with
$$w_l = \begin{pmatrix} \cos\frac{2\pi(1-l)k}{n} \\[1ex] \sin\frac{2\pi(1-l)k}{n} \end{pmatrix}, \qquad l \in \{1,\ldots,n\},$$
and consequently,
$$\begin{aligned}
\lambda_r\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big)
&= \frac{1}{n} \sum_{k_1,k_2=1}^{n} \big[E(x_{n:1}x_{n:1}^\top)\big]_{k_1,k_2}\, v^\top w_{k_1} w_{k_2}^\top v
= \frac{1}{n} \sum_{k_1,k_2=1}^{n} \sum_{h=1}^{n} [U_n]_{k_1,h}\, \lambda_h\big(E(x_{n:1}x_{n:1}^\top)\big)\, \overline{[U_n]_{k_2,h}}\; v^\top w_{k_1} w_{k_2}^\top v \\
&= \frac{1}{n} \sum_{k_1,k_2=1}^{n} w_{k_1}^\top v \sum_{h=1}^{n} [U_n]_{k_1,h}\, \lambda_h\big(E(x_{n:1}x_{n:1}^\top)\big)\, \overline{[U_n]_{k_2,h}}\; w_{k_2}^\top v \\
&= \frac{1}{n} \sum_{h=1}^{n} \lambda_h\big(E(x_{n:1}x_{n:1}^\top)\big) \sum_{k_1=1}^{n} w_{k_1}^\top v\, [U_n]_{k_1,h}\, \overline{\sum_{k_2=1}^{n} w_{k_2}^\top v\, [U_n]_{k_2,h}}
= \frac{1}{n} \sum_{h=1}^{n} \lambda_h\big(E(x_{n:1}x_{n:1}^\top)\big) \left| \sum_{l=1}^{n} w_l^\top v\, [U_n]_{l,h} \right|^2
\end{aligned}$$
with $E(x_{n:1}x_{n:1}^\top) = U_n\, \mathrm{diag}_{1\le j\le n}\big(\lambda_j\big(E(x_{n:1}x_{n:1}^\top)\big)\big)\, U_n^{-1}$ being a diagonalization of $E(x_{n:1}x_{n:1}^\top)$ where the eigenvector matrix $U_n$ is unitary. Therefore,
$$\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)\, \frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} w_l^\top v\, [U_n]_{l,h} \right|^2 \le \lambda_r\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big) \le \lambda_1\big(E(x_{n:1}x_{n:1}^\top)\big)\, \frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} w_l^\top v\, [U_n]_{l,h} \right|^2.$$
To finish the proof we only need to show that
$$\frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} w_l^\top v\, [U_n]_{l,h} \right|^2 = \frac{1}{2}.$$
Applying Equation (9) and [4] (Equations (14) and (15)) yields
$$\begin{aligned}
\frac{1}{n} \sum_{h=1}^{n} \left| \sum_{l=1}^{n} w_l^\top v\, [U_n]_{l,h} \right|^2
&= \frac{1}{n} \sum_{l=1}^{n} \big( w_l^\top v \big)^2
= \frac{1}{n} \sum_{l=1}^{n} \left( \cos\frac{2\pi(1-l)k}{n}\, v_1 + \sin\frac{2\pi(1-l)k}{n}\, v_2 \right)^2 \\
&= v_1^2\, \frac{1}{n} \sum_{l=1}^{n} \left( \cos\frac{2\pi(1-l)k}{n} \right)^2 + v_2^2\, \frac{1}{n} \sum_{l=1}^{n} \left( \sin\frac{2\pi(1-l)k}{n} \right)^2 + 2 v_1 v_2\, \frac{1}{n} \sum_{l=1}^{n} \cos\frac{2\pi(1-l)k}{n}\, \sin\frac{2\pi(1-l)k}{n} \\
&= v_1^2\, \frac{1}{n} \sum_{l=1}^{n} \left( \cos\frac{2\pi(1-l)k}{n} \right)^2 + \frac{v_2^2}{2} + v_1 v_2\, \frac{1}{n} \sum_{l=1}^{n} \sin\frac{4\pi(1-l)k}{n} \\
&= v_1^2\, \frac{1}{n} \sum_{l=1}^{n} \left( 1 - \left( \sin\frac{2\pi(1-l)k}{n} \right)^2 \right) + \frac{v_2^2}{2} - v_1 v_2\, \frac{1}{n} \sum_{l=1}^{n} \sin\frac{4\pi(l-1)k}{n} \\
&= v_1^2 \left( 1 - \frac{1}{n} \sum_{l=1}^{n} \left( \sin\frac{2\pi(1-l)k}{n} \right)^2 \right) + \frac{v_2^2}{2} - v_1 v_2\, \frac{1}{n} \sum_{l=1}^{n} \mathrm{Im}\left( e^{\frac{4\pi(l-1)k}{n} i} \right) \\
&= \frac{v_1^2}{2} + \frac{v_2^2}{2} - v_1 v_2\, \frac{1}{n}\, \mathrm{Im}\left( \sum_{l=1}^{n} e^{\frac{4\pi(l-1)k}{n} i} \right)
= \frac{1}{2}\, v^\top v = \frac{1}{2}. \;\; ☐
\end{aligned}$$

3. RDF Upper Bounds for Real Gaussian Vectors

We first review the formula for the RDF of a real Gaussian vector given by Kolmogorov in [1].
Theorem 3.
If $x_{n:1}$ is a real zero-mean Gaussian $n$-dimensional vector with positive definite correlation matrix, its RDF is given by
$$R_{x_{n:1}}(D) = \frac{1}{n} \sum_{k=1}^{n} \max\left\{ 0,\ \frac{1}{2} \ln \frac{\lambda_k\big(E(x_{n:1}x_{n:1}^\top)\big)}{\theta} \right\}, \qquad D \in \left( 0,\ \frac{\mathrm{tr}\big(E(x_{n:1}x_{n:1}^\top)\big)}{n} \right],$$
where $\mathrm{tr}$ denotes trace and $\theta$ is a real number satisfying
$$D = \frac{1}{n} \sum_{k=1}^{n} \min\left\{ \theta,\ \lambda_k\big(E(x_{n:1}x_{n:1}^\top)\big) \right\}.$$
We recall that $R_{x_{n:1}}(D)$ can be thought of as the minimum rate (measured in nats) at which one must encode (compress) $x_{n:1}$ in order to be able to recover it with a mean square error (MSE) per dimension not larger than $D$, that is:
$$\frac{E\big( \| x_{n:1} - \widetilde{x_{n:1}} \|_2^2 \big)}{n} \le D,$$
where $\widetilde{x_{n:1}}$ denotes the estimation of $x_{n:1}$ and $\|\cdot\|_2$ is the spectral norm.
The following result provides an optimal coding strategy for $x_{n:1}$ in order to achieve $R_{x_{n:1}}(D)$ whenever $D \le \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)$. Observe that if $D \le \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)$ then
$$R_{x_{n:1}}(D) = \frac{1}{2n} \sum_{k=1}^{n} \ln \frac{\lambda_k\big(E(x_{n:1}x_{n:1}^\top)\big)}{D} = \frac{1}{2n} \ln \frac{\det\big(E(x_{n:1}x_{n:1}^\top)\big)}{D^n}. \tag{13}$$
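The reverse water-filling description in Theorem 3 translates directly into a small numerical routine. The sketch below (Python with NumPy; the function name and the bisection on $\theta$ are ours) computes $R_{x_{n:1}}(D)$ from the eigenvalues of the correlation matrix and, for $D \le \lambda_n$, checks it against the closed form in Equation (13):

```python
import numpy as np

def gaussian_vector_rdf(eigenvalues, D):
    """RDF (nats per dimension) of a real zero-mean Gaussian vector whose correlation matrix has the
    given eigenvalues (Theorem 3): find theta such that D = (1/n) sum_k min(theta, lambda_k),
    then R = (1/n) sum_k max(0, 0.5 * ln(lambda_k / theta))."""
    lam = np.asarray(eigenvalues, dtype=float)
    lo, hi = 0.0, lam.max()
    for _ in range(200):                              # bisection: theta -> distortion is non-decreasing
        theta = (lo + hi) / 2
        if np.minimum(theta, lam).mean() < D:
            lo = theta
        else:
            hi = theta
    return np.maximum(0.0, 0.5 * np.log(lam / theta)).mean()

lam = np.array([4.0, 2.0, 1.0, 0.5])
D = 0.4                                               # D <= lambda_n, so Equation (13) applies
print(gaussian_vector_rdf(lam, D))
print(0.5 * np.mean(np.log(lam / D)))                 # closed form (13): the two values agree
```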
Corollary 1.
Suppose that $x_{n:1}$ is as in Theorem 3. Let $E(x_{n:1}x_{n:1}^\top) = U_n\, \mathrm{diag}_{1\le k\le n}\big(\lambda_k\big(E(x_{n:1}x_{n:1}^\top)\big)\big)\, U_n^{-1}$ be a diagonalization of $E(x_{n:1}x_{n:1}^\top)$ where the eigenvector matrix $U_n$ is real and orthogonal. If $D \in \big( 0,\ \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big) \big]$ then
$$R_{x_{n:1}}(D) = \frac{1}{n} \sum_{k=1}^{n} R_{z_k}(D) = \frac{1}{2n} \sum_{k=1}^{n} \ln \frac{E\big(z_k^2\big)}{D} \tag{14}$$
with $z_{n:1} = U_n^\top x_{n:1}$.
Proof. 
We encode $z_1, \ldots, z_n$ separately with $E\big( (z_k - \widetilde{z_k})^2 \big) \le D$ for all $k \in \{1,\ldots,n\}$. Let $\widetilde{x_{n:1}} := U_n\, \widetilde{z_{n:1}}$, where
$$\widetilde{z_{n:1}} := \begin{pmatrix} \widetilde{z_n} \\ \vdots \\ \widetilde{z_1} \end{pmatrix}.$$
As $U_n$ is unitary (in fact, it is a real orthogonal matrix) and the spectral norm is unitarily invariant, we have
$$\frac{E\big( \| x_{n:1} - \widetilde{x_{n:1}} \|_2^2 \big)}{n} = \frac{E\big( \| U_n^\top x_{n:1} - U_n^\top \widetilde{x_{n:1}} \|_2^2 \big)}{n} = \frac{E\big( \| z_{n:1} - \widetilde{z_{n:1}} \|_2^2 \big)}{n} = \frac{E\big( \sum_{k=1}^{n} (z_k - \widetilde{z_k})^2 \big)}{n} = \frac{\sum_{k=1}^{n} E\big( (z_k - \widetilde{z_k})^2 \big)}{n} \le D,$$
and thus,
$$R_{x_{n:1}}(D) \le \frac{1}{n} \sum_{k=1}^{n} R_{z_k}(D).$$
To finish the proof we show Equation (14). Since
$$E\big(z_{n:1} z_{n:1}^\top\big) = E\big(U_n^\top x_{n:1} x_{n:1}^\top U_n\big) = U_n^\top E\big(x_{n:1} x_{n:1}^\top\big)\, U_n = \mathrm{diag}_{1\le k\le n}\big(\lambda_k\big(E(x_{n:1}x_{n:1}^\top)\big)\big),$$
we obtain
$$E\big(z_k^2\big) = \big[E(z_{n:1} z_{n:1}^\top)\big]_{n-k+1,n-k+1} = \lambda_{n-k+1}\big(E(x_{n:1}x_{n:1}^\top)\big) \ge \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big) \ge D > 0.$$
Hence, applying Equation (13) yields
$$\frac{1}{n} \sum_{k=1}^{n} R_{z_k}(D) = \frac{1}{n} \sum_{k=1}^{n} \frac{1}{2} \ln \frac{E\big(z_k^2\big)}{D} = \frac{1}{2n} \sum_{k=1}^{n} \ln \frac{\lambda_{n-k+1}\big(E(x_{n:1}x_{n:1}^\top)\big)}{D} = \frac{1}{2n} \sum_{k=1}^{n} \ln \frac{\lambda_k\big(E(x_{n:1}x_{n:1}^\top)\big)}{D} = R_{x_{n:1}}(D). \;\; ☐$$
Corollary 1 shows that an optimal coding strategy for $x_{n:1}$ is to encode $z_1, \ldots, z_n$ separately.
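A minimal numerical sketch of this optimal (eigenvector-based) strategy, assuming Python with NumPy and an illustrative correlation matrix: the transformed components $z_k$ are uncorrelated with $E(z_k^2) = \lambda_k\big(E(x_{n:1}x_{n:1}^\top)\big)$, and the average of their scalar rates reproduces the vector RDF of Equation (13).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
Rxx = B @ B.T + n * np.eye(n)                         # E(x x^T), positive definite

lam, U = np.linalg.eigh(Rxx)                          # U is a real orthogonal eigenvector matrix
D = 0.9 * lam.min()                                   # D in (0, lambda_n(E(x x^T))]

# Corollary 1: z = U^T x has E(z_k^2) = lambda_k, and each z_k is encoded separately
# at rate R_{z_k}(D) = 0.5 * ln(E(z_k^2) / D)
R_from_corollary = np.mean(0.5 * np.log(lam / D))
R_closed_form = 0.5 * np.log(np.linalg.det(Rxx) / D ** n) / n      # Equation (13)
print(np.isclose(R_from_corollary, R_closed_form))                 # True
```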
We now give two coding strategies for $x_{n:1}$ based on the DFT whose computational complexity is lower than the computational complexity of the optimal coding strategy provided in Corollary 1.
Theorem 4.
Let $x_{n:1}$ be as in Theorem 3. Suppose that $y_{n:1}$ is the DFT of $x_{n:1}$ and $D \in \big( 0,\ \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big) \big]$. Then
$$R_{x_{n:1}}(D) \le \widetilde{R}_{x_{n:1}}(D) \le \breve{R}_{x_{n:1}}(D) \le \frac{1}{2n} \sum_{k=1}^{n} \ln \frac{E\big(|y_k|^2\big)}{D} \tag{15}$$
$$\le R_{x_{n:1}}(D) + \frac{1}{2} \ln \left( 1 + \frac{ \Big\| E(x_{n:1}x_{n:1}^\top) - V_n\, \mathrm{diag}_{1\le k\le n}\big( \big[V_n^*\, E(x_{n:1}x_{n:1}^\top)\, V_n\big]_{k,k} \big)\, V_n^* \Big\|_F }{ \sqrt{n}\; \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big) } \right), \tag{16}$$
where $\|\cdot\|_F$ is the Frobenius norm,
$$\widetilde{R}_{x_{n:1}}(D) := \begin{cases} \dfrac{ R_{y_{\frac{n}{2}}}(D) + 2 \sum_{k=\frac{n}{2}+1}^{n-1} R_{\hat{y_k}}\!\left(\frac{D}{2}\right) + R_{y_n}(D) }{n} & \text{if } n \text{ is even}, \\[2.5ex] \dfrac{ 2 \sum_{k=\frac{n+1}{2}}^{n-1} R_{\hat{y_k}}\!\left(\frac{D}{2}\right) + R_{y_n}(D) }{n} & \text{if } n \text{ is odd}, \end{cases}$$
and
$$\breve{R}_{x_{n:1}}(D) := \begin{cases} \dfrac{ R_{y_{\frac{n}{2}}}(D) + \sum_{k=\frac{n}{2}+1}^{n-1} \left( R_{\mathrm{Re}(y_k)}\!\left(\frac{D}{2}\right) + R_{\mathrm{Im}(y_k)}\!\left(\frac{D}{2}\right) \right) + R_{y_n}(D) }{n} & \text{if } n \text{ is even}, \\[2.5ex] \dfrac{ \sum_{k=\frac{n+1}{2}}^{n-1} \left( R_{\mathrm{Re}(y_k)}\!\left(\frac{D}{2}\right) + R_{\mathrm{Im}(y_k)}\!\left(\frac{D}{2}\right) \right) + R_{y_n}(D) }{n} & \text{if } n \text{ is odd}. \end{cases}$$
Proof. 
Equations (15) and (16) were presented in [4] (Equations (16) and (20)) for the case where the correlation matrix $E(x_{n:1}x_{n:1}^\top)$ is Toeplitz. They were proved by using a result on the DFT of random vectors with Toeplitz correlation matrix, namely, [4] (Theorem 1). The proof of Theorem 4 is similar to the proof of [4] (Equations (16) and (20)) but using Theorem 1 instead of [4] (Theorem 1). Observe that in Theorems 1 and 4 no assumption about the structure of $E(x_{n:1}x_{n:1}^\top)$ has been made. ☐
Theorem 4 shows that a coding strategy for $x_{n:1}$ is to encode $y_{\lceil\frac{n}{2}\rceil}, \ldots, y_n$ separately, where $\lceil\frac{n}{2}\rceil$ denotes the smallest integer greater than or equal to $\frac{n}{2}$. Theorem 4 also shows that another coding strategy for $x_{n:1}$ is to encode separately the real part and the imaginary part of $y_k$, instead of encoding $y_k$, when $k \in \{\lceil\frac{n}{2}\rceil, \ldots, n-1\} \setminus \{\frac{n}{2}\}$. The computational complexity of these two coding strategies based on the DFT is lower than the computational complexity of the optimal coding strategy provided in Corollary 1. Specifically, the complexity of computing the DFT ($y_{n:1} = V_n^* x_{n:1}$) is $O(n \log n)$ whenever the fast Fourier transform (FFT) algorithm is used, while the complexity of computing $z_{n:1} = U_n^\top x_{n:1}$ is $O(n^2)$. Moreover, when the coding strategies based on the DFT are used, we do not need to compute a real orthogonal eigenvector matrix $U_n$ of $E(x_{n:1}x_{n:1}^\top)$. It should also be mentioned that for these coding strategies based on the DFT the knowledge of $E(x_{n:1}x_{n:1}^\top)$ is not even required; in fact, for them we only need to know $E\big(\hat{y_k}\,\hat{y_k}^{\,\top}\big)$ with $k \in \{\lceil\frac{n}{2}\rceil, \ldots, n\}$.
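To illustrate the last member of Equation (15), the following sketch (Python with NumPy, illustrative correlation matrix) computes the DFT-based upper bound $\frac{1}{2n}\sum_k \ln\frac{E(|y_k|^2)}{D}$ without any eigendecomposition of $E(x_{n:1}x_{n:1}^\top)$, and compares it with the exact RDF of Equation (13):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
B = rng.standard_normal((n, n))
Rxx = B @ B.T + n * np.eye(n)                         # E(x x^T); no structure is assumed
D = 0.9 * np.linalg.eigvalsh(Rxx).min()

# Exact RDF for this D, Equation (13)
R = 0.5 * np.log(np.linalg.det(Rxx) / D ** n) / n

# DFT-based upper bound in Equation (15): it only needs the diagonal of V_n^* E(x x^T) V_n (Lemma 1),
# and in a real coder the transform itself would be computed with the FFT in O(n log n)
F = np.fft.fft(np.eye(n)) / np.sqrt(n)                # unitary DFT matrix (up to the ordering convention)
E_abs_y2 = np.real(np.diag(F @ Rxx @ F.conj().T))
R_dft_bound = 0.5 * np.mean(np.log(E_abs_y2 / D))

print(R, R_dft_bound)                                 # R <= R_dft_bound
```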
The rates corresponding to the two coding strategies given in Theorem 4, $\widetilde{R}_{x_{n:1}}(D)$ and $\breve{R}_{x_{n:1}}(D)$, can be written in terms of $E(x_{n:1}x_{n:1}^\top)$ and $V_n$ by using Lemma 1 and the following lemma.
Lemma 2.
Let $y_{n:1}$ and $D$ be as in Theorem 4. Then
  • $R_{y_k}(D) = \frac{1}{2} \ln \frac{E(|y_k|^2)}{D}$ for all $k \in \{1,\ldots,n\} \cap \{\frac{n}{2}, n\}$.
  • $R_{\hat{y_k}}\!\left(\frac{D}{2}\right) = \frac{1}{4} \ln \frac{E\big((\mathrm{Re}(y_k))^2\big)\, E\big((\mathrm{Im}(y_k))^2\big) - \big(E\big(\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\big)\big)^2}{\left(\frac{D}{2}\right)^2}$ for all $k \in \{1,\ldots,n-1\} \setminus \{\frac{n}{2}\}$.
  • $R_{\mathrm{Re}(y_k)}\!\left(\frac{D}{2}\right) = \frac{1}{2} \ln \frac{E\big((\mathrm{Re}(y_k))^2\big)}{\frac{D}{2}}$ for all $k \in \{1,\ldots,n-1\} \setminus \{\frac{n}{2}\}$.
  • $R_{\mathrm{Im}(y_k)}\!\left(\frac{D}{2}\right) = \frac{1}{2} \ln \frac{E\big((\mathrm{Im}(y_k))^2\big)}{\frac{D}{2}}$ for all $k \in \{1,\ldots,n-1\} \setminus \{\frac{n}{2}\}$.
Proof. 
(1) Applying Equation (2) and [4] (Lemma 1) yields
$$0 < D \le \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big) \le E\big(|y_k|^2\big) = E\big(y_k^2\big).$$
Assertion (1) now follows directly from Equation (13).
(2) Applying Theorem 2 we have
$$0 < \frac{D}{2} \le \frac{\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)}{2} \le \lambda_2\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big).$$
Consequently, from Equation (13) we obtain
$$R_{\hat{y_k}}\!\left(\frac{D}{2}\right) = \frac{1}{4} \ln \frac{\det\big(E(\hat{y_k}\,\hat{y_k}^{\,\top})\big)}{\left(\frac{D}{2}\right)^2} = \frac{1}{4} \ln \frac{\det \begin{pmatrix} E\big((\mathrm{Re}(y_k))^2\big) & E\big(\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\big) \\ E\big(\mathrm{Im}(y_k)\,\mathrm{Re}(y_k)\big) & E\big((\mathrm{Im}(y_k))^2\big) \end{pmatrix}}{\left(\frac{D}{2}\right)^2}.$$
(3) and (4) Applying Equations (3) and (4) yields
$$0 < \frac{D}{2} \le \frac{\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)}{2} \le E\big((\mathrm{Re}(y_k))^2\big)$$
and
$$0 < \frac{D}{2} \le \frac{\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)}{2} \le E\big((\mathrm{Im}(y_k))^2\big).$$
Assertions (3) and (4) now follow directly from Equation (13). ☐
We end this section with a result that is a direct consequence of Lemma 2. This result shows when the rates corresponding to the two coding strategies given in Theorem 4, $\widetilde{R}_{x_{n:1}}(D)$ and $\breve{R}_{x_{n:1}}(D)$, are equal.
Lemma 3.
Let $x_{n:1}$, $y_{n:1}$, and $D$ be as in Theorem 4. Then the two following assertions are equivalent:
  • $\widetilde{R}_{x_{n:1}}(D) = \breve{R}_{x_{n:1}}(D)$.
  • $E\big(\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\big) = 0$ for all $k \in \{\lceil\frac{n}{2}\rceil, \ldots, n-1\} \setminus \{\frac{n}{2}\}$.
Proof. 
Fix $k \in \{\lceil\frac{n}{2}\rceil, \ldots, n-1\} \setminus \{\frac{n}{2}\}$. From Lemma 2 we have
$$\begin{aligned}
2\, R_{\hat{y_k}}\!\left(\frac{D}{2}\right)
&= \frac{1}{2} \ln \frac{E\big((\mathrm{Re}(y_k))^2\big)\, E\big((\mathrm{Im}(y_k))^2\big) - \big(E\big(\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\big)\big)^2}{\left(\frac{D}{2}\right)^2}
\le \frac{1}{2} \ln \frac{E\big((\mathrm{Re}(y_k))^2\big)\, E\big((\mathrm{Im}(y_k))^2\big)}{\left(\frac{D}{2}\right)^2} \\
&= \frac{1}{2} \ln \frac{E\big((\mathrm{Re}(y_k))^2\big)}{\frac{D}{2}} + \frac{1}{2} \ln \frac{E\big((\mathrm{Im}(y_k))^2\big)}{\frac{D}{2}}
= R_{\mathrm{Re}(y_k)}\!\left(\frac{D}{2}\right) + R_{\mathrm{Im}(y_k)}\!\left(\frac{D}{2}\right),
\end{aligned}$$
with equality if and only if $E\big(\mathrm{Re}(y_k)\,\mathrm{Im}(y_k)\big) = 0$. ☐

4. Low-Complexity Coding Strategies for Gaussian AWSS AR Sources

We begin by introducing some notation. The symbols $\mathbb{N}$, $\mathbb{Z}$, and $\mathbb{R}$ denote the set of positive integers, integers, and (finite) real numbers, respectively. If $f:\mathbb{R}\to\mathbb{C}$ is continuous and $2\pi$-periodic, we denote by $T_n(f)$ the $n \times n$ Toeplitz matrix given by
$$[T_n(f)]_{j,k} = t_{j-k},$$
where $\{t_k\}_{k\in\mathbb{Z}}$ is the sequence of Fourier coefficients of $f$, i.e.,
$$t_k = \frac{1}{2\pi} \int_0^{2\pi} f(\omega)\, e^{-k\omega i}\, d\omega, \qquad k \in \mathbb{Z}.$$
If $A_n$ and $B_n$ are $n \times n$ matrices for all $n\in\mathbb{N}$, we write $\{A_n\} \sim \{B_n\}$ if the sequences $\{A_n\}$ and $\{B_n\}$ are asymptotically equivalent, that is, $\{\|A_n\|_2\}$ and $\{\|B_n\|_2\}$ are bounded and $\lim_{n\to\infty} \frac{\|A_n - B_n\|_F}{\sqrt{n}} = 0$ (see [5] (Section 2.3) and [6]).
We now review the definitions of AWSS processes and AR processes.
Definition 1.
A random process $\{x_n\}$ is said to be AWSS if it has constant mean (i.e., $E(x_j) = E(x_k)$ for all $j,k\in\mathbb{N}$) and there exists a continuous $2\pi$-periodic function $f:\mathbb{R}\to\mathbb{C}$ such that $\{E(x_{n:1}x_{n:1}^*)\} \sim \{T_n(f)\}$. The function $f$ is called (asymptotic) PSD of $\{x_n\}$.
Definition 2.
A real zero-mean random process $\{x_n\}$ is said to be AR if
$$x_n = w_n - \sum_{k=1}^{n-1} a_k x_{n-k}, \qquad n \in \mathbb{N}, \tag{17}$$
or equivalently,
$$\sum_{k=0}^{n-1} a_k x_{n-k} = w_n, \qquad n \in \mathbb{N},$$
where $a_0 = 1$, $a_k \in \mathbb{R}$ for all $k \in \mathbb{N}$, and $\{w_n\}$ is a real zero-mean random process satisfying $E(w_j w_k) = \delta_{j,k}\, \sigma^2$ for all $j,k\in\mathbb{N}$, with $\sigma^2 > 0$ and $\delta_{j,k}$ being the Kronecker delta (i.e., $\delta_{j,k} = 1$ if $j = k$, and it is zero otherwise).
The AR process $\{x_n\}$ in Equation (17) is of finite order if there exists $p\in\mathbb{N}$ such that $a_k = 0$ for all $k > p$. In this case, $\{x_n\}$ is called an AR($p$) process.
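A minimal simulation sketch of Definition 2 (Python with NumPy; the function name is ours, and Gaussian innovations are assumed for the $w_n$) may help fix the indexing of the recursion:

```python
import numpy as np

def simulate_ar(n, a, sigma2=1.0, rng=None):
    """Draw x_1, ..., x_n from the AR process of Definition 2:
    x_m = w_m - sum_{k=1}^{m-1} a_k x_{m-k}, with i.i.d. zero-mean w_m of variance sigma2.
    `a` holds a_1, ..., a_p (a_0 = 1 is implicit); coefficients beyond p are zero."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.sqrt(sigma2) * rng.standard_normal(n)
    x = np.zeros(n)
    for m in range(n):                                   # 0-based index: x[m] stores x_{m+1}
        k_max = min(m, len(a))
        x[m] = w[m] - sum(a[k - 1] * x[m - k] for k in range(1, k_max + 1))
    return x

print(simulate_ar(10, a=[0.5], rng=np.random.default_rng(0)))   # a realization of an AR(1) process
```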
The following theorem shows that if $x_{n:1}$ is a large enough data block of a Gaussian AWSS AR source, the rate does not increase whenever we encode it using the two coding strategies based on the DFT presented in Section 3, instead of encoding $x_{n:1}$ using an eigenvector matrix of its correlation matrix.
Theorem 5.
Let $\{x_n\}$ be as in Definition 2. Suppose that $\{a_{-k}\}_{k\in\mathbb{Z}}$, with $a_{-k} = 0$ for all $k\in\mathbb{N}$, is the sequence of Fourier coefficients of a function $a:\mathbb{R}\to\mathbb{C}$ which is continuous and $2\pi$-periodic. Then
  • $\displaystyle \inf_{n\in\mathbb{N}} \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big) \ge \frac{\sigma^2}{\max_{\omega\in[0,2\pi]} |a(\omega)|^2} > 0$.
  • Consider $D \in \Big( 0,\ \inf_{n\in\mathbb{N}} \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big) \Big]$.
    (a)
    If $\{x_n\}$ is Gaussian,
    $$\frac{1}{2}\ln\frac{\sigma^2}{D} = R_{x_{n:1}}(D) \le \widetilde{R}_{x_{n:1}}(D) \le \breve{R}_{x_{n:1}}(D) \le K_1(n,D) \le K_2(n,D) \le K_3(n,D), \qquad n\in\mathbb{N}, \tag{18}$$
    where $K_1(n,D)$ is given by Equation (16), and $K_2(n,D)$ and $K_3(n,D)$ are obtained by replacing $\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)$ in Equation (16) by $\inf_{n\in\mathbb{N}} \lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big)$ and $\frac{\sigma^2}{\max_{\omega\in[0,2\pi]}|a(\omega)|^2}$, respectively.
    (b)
    If $\{x_n\}$ is Gaussian and AWSS,
    $$\lim_{n\to\infty} R_{x_{n:1}}(D) = \lim_{n\to\infty} \widetilde{R}_{x_{n:1}}(D) = \lim_{n\to\infty} \breve{R}_{x_{n:1}}(D) = \lim_{n\to\infty} K_3(n,D). \tag{19}$$
Proof. (1) Equation (17) can be rewritten as
$$T_n(a)\, x_{n:1} = w_{n:1}, \qquad n\in\mathbb{N}.$$
Consequently,
$$T_n(a)\, E\big(x_{n:1}x_{n:1}^\top\big)\, \big(T_n(a)\big)^\top = E\Big(T_n(a)\, x_{n:1} \big(T_n(a)\, x_{n:1}\big)^\top\Big) = E\big(w_{n:1} w_{n:1}^\top\big) = \sigma^2 I_n, \qquad n\in\mathbb{N}.$$
As $\det(T_n(a)) = 1$, $T_n(a)$ is invertible, and therefore,
$$E\big(x_{n:1}x_{n:1}^\top\big) = \sigma^2 \big(T_n(a)\big)^{-1} \Big(\big(T_n(a)\big)^\top\Big)^{-1} = \sigma^2 \Big(\big(T_n(a)\big)^\top T_n(a)\Big)^{-1} = \sigma^2 \Big(\big(T_n(a)\big)^* T_n(a)\Big)^{-1} = \sigma^2 \Big( N_n\, \mathrm{diag}_{1\le k\le n}\big( (\sigma_k(T_n(a)))^2 \big)\, N_n^* \Big)^{-1} = N_n\, \mathrm{diag}_{1\le k\le n}\left( \frac{\sigma^2}{(\sigma_k(T_n(a)))^2} \right) N_n^* \tag{20}$$
for all $n\in\mathbb{N}$, where $T_n(a) = M_n\, \mathrm{diag}_{1\le k\le n}\big(\sigma_k(T_n(a))\big)\, N_n^*$ is a singular value decomposition of $T_n(a)$. Thus, applying [8] (Theorem 4.3) yields
$$\lambda_n\big(E(x_{n:1}x_{n:1}^\top)\big) = \frac{\sigma^2}{(\sigma_1(T_n(a)))^2} \ge \frac{\sigma^2}{\max_{\omega\in[0,2\pi]} |a(\omega)|^2} > 0, \qquad n\in\mathbb{N}.$$
(2a) From Equation (13) we have
$$R_{x_{n:1}}(D) = \frac{1}{2n} \ln \frac{\det\big(E(x_{n:1}x_{n:1}^\top)\big)}{D^n} = \frac{1}{2n} \ln \frac{\det\Big( \sigma^2 \big(T_n(a)\big)^{-1} \big(\big(T_n(a)\big)^\top\big)^{-1} \Big)}{D^n} = \frac{1}{2n} \ln \frac{\sigma^{2n}}{D^n \det\big(T_n(a)\big)\, \det\big(\big(T_n(a)\big)^\top\big)} = \frac{1}{2n} \ln \frac{\sigma^{2n}}{D^n} = \frac{1}{2} \ln \frac{\sigma^2}{D}, \qquad n\in\mathbb{N}.$$
Assertion (2a) now follows from Theorem 4 and Assertion (1).
(2b) From Assertion (2a) we only need to show that
$$\lim_{n\to\infty} \frac{\Big\| E(x_{n:1}x_{n:1}^\top) - V_n\, \mathrm{diag}_{1\le k\le n}\big( \big[V_n^*\, E(x_{n:1}x_{n:1}^\top)\, V_n\big]_{k,k} \big)\, V_n^* \Big\|_F}{\sqrt{n}} = 0.$$
As the Frobenius norm is unitarily invariant we obtain
$$\begin{aligned}
0 &\le \frac{\Big\| E(x_{n:1}x_{n:1}^\top) - V_n\, \mathrm{diag}_{1\le k\le n}\big( \big[V_n^*\, E(x_{n:1}x_{n:1}^\top)\, V_n\big]_{k,k} \big)\, V_n^* \Big\|_F}{\sqrt{n}} \\
&\le \frac{\big\| E(x_{n:1}x_{n:1}^\top) - T_n(f) \big\|_F}{\sqrt{n}} + \frac{\big\| T_n(f) - \widehat{C}_n(f) \big\|_F}{\sqrt{n}} + \frac{\Big\| V_n\, \mathrm{diag}_{1\le k\le n}\big( \big[V_n^*\, E(x_{n:1}x_{n:1}^\top)\, V_n\big]_{k,k} \big)\, V_n^* - \widehat{C}_n(f) \Big\|_F}{\sqrt{n}} \\
&= \frac{\big\| E(x_{n:1}x_{n:1}^\top) - T_n(f) \big\|_F}{\sqrt{n}} + \frac{\big\| T_n(f) - \widehat{C}_n(f) \big\|_F}{\sqrt{n}} + \frac{\Big\| V_n\, \mathrm{diag}_{1\le k\le n}\big( \big[V_n^*\, \big(E(x_{n:1}x_{n:1}^\top) - T_n(f)\big)\, V_n\big]_{k,k} \big)\, V_n^* \Big\|_F}{\sqrt{n}} \\
&= \frac{\big\| E(x_{n:1}x_{n:1}^\top) - T_n(f) \big\|_F}{\sqrt{n}} + \frac{\big\| T_n(f) - \widehat{C}_n(f) \big\|_F}{\sqrt{n}} + \frac{\Big\| \mathrm{diag}_{1\le k\le n}\big( \big[V_n^*\, \big(E(x_{n:1}x_{n:1}^\top) - T_n(f)\big)\, V_n\big]_{k,k} \big) \Big\|_F}{\sqrt{n}} \\
&\le \frac{\big\| E(x_{n:1}x_{n:1}^\top) - T_n(f) \big\|_F}{\sqrt{n}} + \frac{\big\| T_n(f) - \widehat{C}_n(f) \big\|_F}{\sqrt{n}} + \frac{\big\| V_n^*\, \big(E(x_{n:1}x_{n:1}^\top) - T_n(f)\big)\, V_n \big\|_F}{\sqrt{n}} \\
&= 2\, \frac{\big\| E(x_{n:1}x_{n:1}^\top) - T_n(f) \big\|_F}{\sqrt{n}} + \frac{\big\| T_n(f) - \widehat{C}_n(f) \big\|_F}{\sqrt{n}},
\end{aligned}$$
where $f$ is the (asymptotic) PSD of $\{x_n\}$ and $\widehat{C}_n(f) = V_n\, \mathrm{diag}_{1\le k\le n}\big( \big[V_n^*\, T_n(f)\, V_n\big]_{k,k} \big)\, V_n^*$. Assertion (2b) now follows from $\{E(x_{n:1}x_{n:1}^\top)\} \sim \{T_n(f)\}$ and [9] (Lemma 4.2). ☐
If $\sum_{k=0}^{\infty} |a_k| < \infty$, there always exists such a function $a$ and it is given by $a(\omega) = \sum_{k=0}^{\infty} a_k e^{-k\omega i}$ for all $\omega\in\mathbb{R}$ (see, e.g., [8] (Appendix B)). In particular, if $\{x_n\}$ is an AR($p$) process, $a(\omega) = \sum_{k=-p}^{0} a_{-k}\, e^{k\omega i} = \sum_{k=0}^{p} a_k\, e^{-k\omega i}$ for all $\omega\in\mathbb{R}$.
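The lower bound in the first assertion of Theorem 5 can be checked numerically for a finite-order AR process. A minimal sketch (Python with NumPy; the AR(2) coefficients are just an illustration) builds $T_n(a)$, forms $E(x_{n:1}x_{n:1}^\top) = \sigma^2\big((T_n(a))^\top T_n(a)\big)^{-1}$ as in Equation (20), and compares its smallest eigenvalue with $\sigma^2/\max_\omega |a(\omega)|^2$:

```python
import numpy as np

a = np.array([1.0, 0.5, -0.3])                     # a_0 = 1, a_1, a_2 of an illustrative AR(2) process
sigma2 = 1.0

omega = np.linspace(0.0, 2.0 * np.pi, 4096)
a_of_omega = np.polynomial.polynomial.polyval(np.exp(-1j * omega), a)   # a(w) = sum_k a_k e^{-k w i}
bound = sigma2 / np.max(np.abs(a_of_omega) ** 2)   # sigma^2 / max |a(w)|^2

for n in (10, 50, 200):
    T = sum(a[k] * np.eye(n, k=k) for k in range(len(a)))        # T_n(a): upper triangular Toeplitz
    lam_min = np.linalg.eigvalsh(sigma2 * np.linalg.inv(T.T @ T)).min()
    print(n, lam_min, lam_min >= bound)            # the lower bound of Theorem 5 holds for every n
```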

5. Sufficient Conditions for AR Processes to be AWSS

In the following two results we give sufficient conditions for AR processes to be AWSS.
Theorem 6.
Let $\{x_n\}$ be as in Definition 2. Suppose that $\{a_{-k}\}_{k\in\mathbb{Z}}$, with $a_{-k} = 0$ for all $k\in\mathbb{N}$, is the sequence of Fourier coefficients of a function $a:\mathbb{R}\to\mathbb{C}$ which is continuous and $2\pi$-periodic. Then the following assertions are equivalent:
  • $\{x_n\}$ is AWSS.
  • $\big\{ \| E(x_{n:1}x_{n:1}^\top) \|_2 \big\}$ is bounded.
  • $\{T_n(a)\}$ is stable (that is, $\big\{ \| (T_n(a))^{-1} \|_2 \big\}$ is bounded).
  • $a(\omega) \ne 0$ for all $\omega\in\mathbb{R}$ and $\{x_n\}$ is AWSS with (asymptotic) PSD $\frac{\sigma^2}{|a|^2}$.
Proof. 
(1)⇒(2) This is a direct consequence of the definition of AWSS process, i.e., of Definition 1.
(2)⇔(3) From Equation (20) we have
$$\big\| E(x_{n:1}x_{n:1}^\top) \big\|_2 = \frac{\sigma^2}{(\sigma_n(T_n(a)))^2} = \sigma^2 \left\| N_n\, \mathrm{diag}_{1\le k\le n}\left( \frac{1}{\sigma_k(T_n(a))} \right) M_n^* \right\|_2^2 = \sigma^2 \big\| (T_n(a))^{-1} \big\|_2^2$$
for all $n\in\mathbb{N}$.
(3)⇒(4) It is well known that if $f:\mathbb{R}\to\mathbb{C}$ is continuous and $2\pi$-periodic, and $\{T_n(f)\}$ is stable, then $f(\omega) \ne 0$ for all $\omega\in\mathbb{R}$. Hence, $a(\omega) \ne 0$ for all $\omega\in\mathbb{R}$.
Applying [8] (Lemma 4.2.1) yields $(T_n(a))^\top = (T_n(a))^* = T_n(\overline{a})$. Consequently, from [7] (Theorem 3) we obtain
$$\big\{ (T_n(a))^\top T_n(a) \big\} = \big\{ T_n(\overline{a})\, T_n(a) \big\} \sim \big\{ T_n(\overline{a}\, a) \big\} = \big\{ T_n(|a|^2) \big\}.$$
Observe that the sequence
$$\left\{ \Big\| \big( (T_n(a))^\top T_n(a) \big)^{-1} \Big\|_2 \right\} = \left\{ \left\| \frac{1}{\sigma^2}\, E\big(x_{n:1}x_{n:1}^\top\big) \right\|_2 \right\} = \left\{ \frac{1}{\sigma^2} \big\| E\big(x_{n:1}x_{n:1}^\top\big) \big\|_2 \right\}$$
is bounded. As the function $|a|^2$ is real, applying [8] (Theorem 4.4) we have that $T_n(|a|^2)$ is Hermitian and $0 < \min_{\omega\in[0,2\pi]} |a(\omega)|^2 \le \lambda_n\big(T_n(|a|^2)\big)$ for all $n\in\mathbb{N}$, and therefore,
$$\Big\| \big( T_n(|a|^2) \big)^{-1} \Big\|_2 = \frac{1}{\lambda_n\big(T_n(|a|^2)\big)} \le \frac{1}{\min_{\omega\in[0,2\pi]} |a(\omega)|^2}, \qquad n\in\mathbb{N}.$$
Thus, from [5] (Theorem 1.4) we obtain
$$\left\{ \frac{1}{\sigma^2}\, E\big(x_{n:1}x_{n:1}^\top\big) \right\} = \left\{ \big( (T_n(a))^\top T_n(a) \big)^{-1} \right\} \sim \left\{ \big( T_n(|a|^2) \big)^{-1} \right\}.$$
Hence, applying [10] (Theorem 4.2) and [5] (Theorem 1.2) yields
$$\left\{ \frac{1}{\sigma^2}\, E\big(x_{n:1}x_{n:1}^\top\big) \right\} \sim \left\{ T_n\!\left( \frac{1}{|a|^2} \right) \right\}.$$
Consequently, from [8] (Lemma 3.1.3) and [8] (Lemma 4.2.3) we have
$$\big\{ E\big(x_{n:1}x_{n:1}^\top\big) \big\} \sim \left\{ \sigma^2\, T_n\!\left( \frac{1}{|a|^2} \right) \right\} = \left\{ T_n\!\left( \frac{\sigma^2}{|a|^2} \right) \right\}.$$
(4)⇒(1) It is obvious. ☐
Corollary 2.
Let $\{x_n\}$ be as in Definition 2 with $\sum_{k=0}^{\infty} |a_k| < \infty$. If $\sum_{k=0}^{\infty} a_k z^k \ne 0$ for all $|z| \le 1$ then $\{x_n\}$ is AWSS.
Proof. 
It is well known that if a sequence of complex numbers $\{t_k\}_{k\in\mathbb{Z}}$ satisfies that $\sum_{k=-\infty}^{\infty} |t_k| < \infty$ and that $\sum_{k=-\infty}^{\infty} t_k z^k \ne 0$ for all $|z| \le 1$ then $\{T_n(f)\}$ is stable with $f(\omega) = \sum_{k=-\infty}^{\infty} t_k e^{k\omega i}$ for all $\omega\in\mathbb{R}$. Therefore, $\{T_n(b)\}$ is stable with $b(\omega) = \sum_{k=0}^{\infty} a_k e^{k\omega i}$ for all $\omega\in\mathbb{R}$. Thus,
$$\big\| (T_n(a))^{-1} \big\|_2 = \Big\| \big( (T_n(a))^{-1} \big)^\top \Big\|_2 = \Big\| \big( (T_n(a))^\top \big)^{-1} \Big\|_2 = \big\| (T_n(b))^{-1} \big\|_2$$
is bounded, with $a(\omega) = \sum_{k=0}^{\infty} a_k e^{-k\omega i}$ for all $\omega\in\mathbb{R}$. As $\{T_n(a)\}$ is stable, from Theorem 6 we conclude that $\{x_n\}$ is AWSS. ☐
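For a finite-order AR process, the condition of Corollary 2 reduces to a polynomial root check: $\sum_{k=0}^{p} a_k z^k$ must have no zeros in the closed unit disk. A minimal sketch (Python with NumPy; the helper name is ours):

```python
import numpy as np

def corollary2_condition(a_coeffs):
    """Check the sufficient condition of Corollary 2 for an AR(p) process:
    the polynomial a_0 + a_1 z + ... + a_p z^p has no roots with |z| <= 1.
    `a_coeffs` = [a_0, a_1, ..., a_p] with a_0 = 1."""
    roots = np.roots(list(reversed(a_coeffs)))        # np.roots expects the highest-degree coefficient first
    return bool(np.all(np.abs(roots) > 1.0))

print(corollary2_condition([1.0, 0.5]))     # True: the AR(1) process with a_1 = 1/2 is AWSS
print(corollary2_condition([1.0, -2.0]))    # False: the condition fails for a_1 = -2
```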

6. Numerical Example and Conclusions

6.1. Example

Let $\{x_n\}$ be as in Definition 2 with $a_k = 0$ for all $k > 1$. Observe that $\frac{\sigma^2}{\max_{\omega\in[0,2\pi]} |a(\omega)|^2} = \frac{\sigma^2}{(1+|a_1|)^2}$. If $|a_1| < 1$, from Corollary 2 we obtain that the AR(1) process $\{x_n\}$ is AWSS. Figure 1 shows $R_{x_{n:1}}(D)$, $\widetilde{R}_{x_{n:1}}(D)$, and $\breve{R}_{x_{n:1}}(D)$ assuming that $\{x_n\}$ is Gaussian, $a_1 = \frac{1}{2}$, $\sigma^2 = 1$, $D = \frac{\sigma^2}{(1+|a_1|)^2} = \frac{4}{9}$, and $n \le 100$. Figure 1 also shows the highest upper bound of $R_{x_{n:1}}(D)$ presented in Theorem 5, namely, $K_3(n,D)$. Observe that the figure bears evidence of the equalities and inequalities given in Equations (18) and (19).
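The curves of Figure 1 can be reproduced with a short script. The following sketch (Python with NumPy; even $n$ is assumed for brevity, and the function name is ours) computes $R_{x_{n:1}}(D)$, $\widetilde{R}_{x_{n:1}}(D)$, $\breve{R}_{x_{n:1}}(D)$ via Lemmas 1 and 2, and $K_3(n,D)$ for the AR(1) example:

```python
import numpy as np

def example_rates(n, a1=0.5, sigma2=1.0, D=4.0 / 9.0):
    """Rates of the Gaussian AWSS AR(1) example (a_1 = 1/2, sigma^2 = 1, D = 4/9), n even."""
    T = np.eye(n) + a1 * np.eye(n, k=1)                       # T_n(a) for x_{n:1} = (x_n, ..., x_1)^T
    Rxx = sigma2 * np.linalg.inv(T.T @ T)                     # E(x x^T), Equation (20)
    R = 0.5 * np.log(sigma2 / D)                              # R_{x_{n:1}}(D), Theorem 5 (2a)

    W = np.exp(2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n) / np.sqrt(n)   # V_n^*
    C, S = W.real, W.imag
    E_re2 = np.diag(C @ Rxx @ C.T)
    E_im2 = np.diag(S @ Rxx @ S.T)
    E_ri = np.diag(C @ Rxx @ S.T)
    E_abs2 = E_re2 + E_im2

    # Rows 0 and n/2 hold the real coefficients y_n and y_{n/2}; rows 1, ..., n/2 - 1 hold one
    # coefficient of each conjugate pair (k = n - 1, ..., n/2 + 1 in the notation of the paper).
    half = np.arange(1, n // 2)
    R_real = 0.5 * np.log(E_abs2[[0, n // 2]] / D).sum()      # R_{y_n}(D) + R_{y_{n/2}}(D)
    det_pair = E_re2[half] * E_im2[half] - E_ri[half] ** 2
    R_tilde = (R_real + 2.0 * (0.25 * np.log(det_pair / (D / 2) ** 2)).sum()) / n
    R_breve = (R_real + (0.5 * np.log(E_re2[half] / (D / 2))
                         + 0.5 * np.log(E_im2[half] / (D / 2))).sum()) / n

    # K3(n, D): Equation (16) with lambda_n(E(x x^T)) replaced by sigma^2 / (1 + |a_1|)^2
    C_hat = W.conj().T @ np.diag(np.diag(W @ Rxx @ W.conj().T)) @ W
    K3 = R + 0.5 * np.log(1.0 + np.linalg.norm(Rxx - C_hat, "fro")
                          / (np.sqrt(n) * sigma2 / (1.0 + abs(a1)) ** 2))
    return R, R_tilde, R_breve, K3

print(example_rates(100))      # R <= R_tilde <= R_breve <= K3, in agreement with Equation (18)
```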

6.2. Conclusions

The computational complexity of coding finite-length data blocks of Gaussian sources can be reduced by using any of the two low-complexity coding strategies here presented instead of the optimal coding strategy. Moreover, the rate does not increase if we use those strategies instead of the optimal one whenever the Gaussian source is AWSS and AR, and the considered data block is large enough.

Author Contributions

Authors are listed in order of their degree of involvement in the work, with the most active contributors listed first. All authors have read and approved the final manuscript.

Funding

This work was supported in part by the Spanish Ministry of Economy and Competitiveness through the CARMEN project (TEC2016-75067-C4-3-R).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kolmogorov, A.N. On the Shannon theory of information transmission in the case of continuous signals. IRE Trans. Inf. Theory 1956, 2, 102–108.
  2. Gray, R.M. Information rates of autoregressive processes. IEEE Trans. Inf. Theory 1970, 16, 412–421.
  3. Pearl, J. On coding and filtering stationary signals by discrete Fourier transforms. IEEE Trans. Inf. Theory 1973, 19, 229–232.
  4. Gutiérrez-Gutiérrez, J.; Zárraga-Rodríguez, M.; Insausti, X. Upper bounds for the rate distortion function of finite-length data blocks of Gaussian WSS sources. Entropy 2017, 19, 554.
  5. Gray, R.M. Toeplitz and circulant matrices: A review. Found. Trends Commun. Inf. Theory 2006, 2, 155–239.
  6. Gray, R.M. On the asymptotic eigenvalue distribution of Toeplitz matrices. IEEE Trans. Inf. Theory 1972, 18, 725–730.
  7. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Asymptotically equivalent sequences of matrices and multivariate ARMA processes. IEEE Trans. Inf. Theory 2011, 57, 5444–5454.
  8. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Block Toeplitz matrices: Asymptotic results and applications. Found. Trends Commun. Inf. Theory 2011, 8, 179–257.
  9. Gutiérrez-Gutiérrez, J.; Zárraga-Rodríguez, M.; Insausti, X.; Hogstad, B.O. On the complexity reduction of coding WSS vector processes by using a sequence of block circulant matrices. Entropy 2017, 19, 95.
  10. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Asymptotically equivalent sequences of matrices and Hermitian block Toeplitz matrices with continuous symbols: Applications to MIMO systems. IEEE Trans. Inf. Theory 2008, 54, 5671–5680.
Figure 1. Considered rates for a Gaussian AWSS AR(1) source.
