Information Length Analysis of Linear Autonomous Stochastic Processes

Guel-Cortez, Adrian-Josue; Kim, Eun-jin

doi:10.3390/e22111265

Open AccessArticle

Information Length Analysis of Linear Autonomous Stochastic Processes

by

Adrian-Josue Guel-Cortez

^*

and

Eun-jin Kim

Centre for Fluid and Complex Systems, Coventry University, Priory St, Coventry CV1 5FB, UK

^*

Author to whom correspondence should be addressed.

Entropy 2020, 22(11), 1265; https://0-doi-org.brum.beds.ac.uk/10.3390/e22111265

Submission received: 7 October 2020 / Revised: 2 November 2020 / Accepted: 5 November 2020 / Published: 7 November 2020

(This article belongs to the Special Issue Entropies, Information Geometry and Fluctuations in Non-equilibrium Systems)

Download

Browse Figures

Versions Notes

Abstract

:

When studying the behaviour of complex dynamical systems, a statistical formulation can provide useful insights. In particular, information geometry is a promising tool for this purpose. In this paper, we investigate the information length for n-dimensional linear autonomous stochastic processes, providing a basic theoretical framework that can be applied to a large set of problems in engineering and physics. A specific application is made to a harmonically bound particle system with the natural oscillation frequency

ω

, subject to a damping

γ

and a Gaussian white-noise. We explore how the information length depends on

ω

and

γ

, elucidating the role of critical damping

γ = 2 ω

in information geometry. Furthermore, in the long time limit, we show that the information length reflects the linear geometry associated with the Gaussian statistics in a linear stochastic process.

Keywords:

non-equilibrium; stochastic processes; time-dependent PDF; information length; information geometry; entropy; fluctuations

1. Introduction

Stochastic processes are common in nature or laboratories, and play a major role across traditional disciplinary boundaries (e.g., see [1,2]). These stochastic processes often exhibit complex temporal behaviour and even the emergence of order (self-organization). The latter can also be artificially designed to complete an orderly task (guided self-organization) [3,4,5,6]. In order to study and compare the dynamics of different stochastic processes and self-organization, it is valuable to utilize a measurement which is independent of any specifics of a system [7,8,9,10,11] (e.g., physical variables, units, dimensions, etc.). This can be achieved by using information theory based on probability density functions (PDFs) and working in terms of information content or information change, e.g., by quantifying the statistical difference between two states [12,13,14]. Mathematically, we do this by assigning a metric to probability and by using the notion of ‘length’ or ‘distance’ in the statistical space.

One method of measuring the information content in a system is utilizing the Fisher information, which represents the degree of certainty, or order. The opposite is entropy, which is a popular concept for the uncertainty or amount of disorder. Comparing entropy at different times then gives a measure of the difference in information content between the two states, which is known as relative entropy (e.g., see [15]). Another example is the Wasserstein metric [16,17], which provides an exact solution to the Fokker-Planck equation for a gradient flow subject to the minimization of the energy functional defined as the sum of the entropy and potential energy [18,19,20]. This metric has units of a physical length in comparison with other metrics, for instance the dimensionless statistical distance based on the Fisher information metric [21,22,23]. Interestingly, there is a link between the Fisher information and the Wasserstein distance [24]. Furthermore, the relative entropy can be expressed by the integral of the Fisher information along the same path [25].

Although quite useful, the relative entropy lacks the locality of a metric as it concerns only about the difference between given two PDFs. For instance, when these two PDFs represent the two states at different times, the relative entropy between them tells us nothing about how one PDF evolves to the other PDF over time or what intermediate states a system passes through between the two PDFs. As a result, it can only inform us of the changes that affect the overall system evolution [26]. To overcome this limitation, the information length

L (t)

was proposed in recent works, which quantifies the total number of different states that the system evolves through in time [27,28]. This means that the information length is a measure that depends on the evolution path between two states (PDFs). Its formulation allows us to measure local changes in the evolution of the system as well as providing an intriguing link between stochastic processes and geometry [26].

For instance, the relation between the information length

L_{\infty} = L (t \to \infty)

and the mean value of the initial PDF for the fixed values of all other parameters was invoked as a new way of mapping out an attractor structure in a relaxation problem where any initial PDF relaxes into its equilibrium PDF in the long time limit. Specifically, for the Ornstein-Uhlenbeck (O-U) process driven by a Gaussian white-noise (which is a linearly damped, relaxation problem),

L_{\infty}

increases linearly with the distance between the mean position of an initial PDF and the stable equilibrium point for further details, see [28,29], with its minimum value zero at the stable equilibrium point. This linear dependence manifests that the information length preserves the linear geometry of the underlying Gaussian process, which is lost in other metrics [26]. For a nonlinear stochastic process with nonlinear damping,

L_{\infty}

still takes its minimum value at the stable equilibrium point but exhibits a power-law dependence on the distance between the mean value of an initial PDF and the stable equilibrium point. In contrast, for a chaotic attractor,

L_{\infty}

changes abruptly under an infinitesimal change of the mean value of an initial PDF, reminiscent of the sensitive dependence on initial conditions of the Lyapunov exponent [30]. These results suggest that

L_{\infty}

elucidates how different (non)linear forces affect (information) geometry.

With the above background in mind, this paper aims to extend the analysis of the information length of the O-U process to an arbitrary n-th order linear autonomous stochastic processes, providing a basic theoretical framework to be utilized in a large set of problems in both engineering and physics. In particular, we provide a useful analytical result that defines the information diagnostics as a function of the covariance matrix and the mean vector of the system, which enormously reduces the computational cost of numerical simulations of high-order systems.

This is followed by a specific application to a harmonically bound particle system (Kramers equation) for the position x and velocity

v = \frac{d x}{d t}

, with the natural oscillation frequency

ω

, subject to a damping constant

γ

and a Gaussian white-noise (short-correlated). We find an exact time-dependent joint PDF

p (x, v, t)

starting from an initial Gaussian PDF which has a finite-width. Note that as far as we are aware of, our result

p (x, v, t)

is original since in literature, the calculation was done only for the case of a delta-function initial PDF. Since this process is governed by the two variables, x and v, we investigate how

L_{\infty}

depends on their initial mean values

〈 x_{0} 〉

and

〈 v_{0} 〉

. Here, the angular brackets denote the average. Furthermore, the two characteristic time scales associated with

ω

and

γ

raise the interesting question as to their role in

L_{\infty}

. Thus, we explore how the information length depends on

ω

and

γ

. Our principle results are as follows: (i)

L_{\infty}

tends to increase linearly with either the deviation of initial mean position

〈 x_{0} 〉

or the initial mean velocity

〈 v_{0} 〉

from their equilibrium values

〈 x (0) 〉 = 〈 v (0) 〉 = 0

; (ii) a linear geometry is thus preserved for our linearly coupled stochastic processes driven by a Gaussian noise; (iii)

L_{\infty}

tends to take its minimum value near the critical damping

γ = 2 ω

for the same initial conditions and other parameters.

The remainder of this paper is organized as follows: Section 2 presents the basic concept of information length and the formulation of our problem. In Section 3, our main theoretical results are provided (see also Appendix A). In Section 4, we apply the results in Section 3 to analyze a harmonically bound particle system with the natural oscillation frequency

ω

subject to a damping constant

γ

and a Gaussian white-noise. Finally, Section 5 contains our concluding remarks.

To help readers, we here provide a summary of our notations:

R

and

C

are the sets of real and complex numbers, respectively.

x \in R^{n}

represents a column vector

x

of real numbers of dimension n,

A \in R^{n \times n}

represents a real matrix of dimension

n \times n

,

tr (A)

corresponds to the trace of the matrix

A

,

A^{T}

and

A^{- 1}

are the transpose and inverse of matrix

A

, respectively. (Bold-face letters are used to represent vectors and matrices.) In some places,

\partial_{t}

or the prime both are used for the partial derivative with respect to time. Besides,

i = \sqrt{- 1}

and for

s \in C

,

L^{- 1} [F (s)] = \frac{1}{2 π i} {lim}_{b \to \infty} \int_{a - b i}^{a + b i} e^{s t} F (s) d s

corresponds to the inverse Laplace transform of the complex function

F (s)

. Finally, the average of a random vector

x

is denoted by

〈 x 〉

.

2. Preliminaries

2.1. Information Length

As noted in Section 1, the information length [26,27,31] is a dimensionless measurement of the total number of statistically different states that a system passes through in time in non-equilibrium processes. We cannot overemphasize that it is a measure that depends on the evolution of the system, being a useful index for understanding the information geometry underlying non-equilibrium processes. For example, for a time-dependent PDF

p (x, t)

of one stochastic variable x, the information length

L (t)

is the total information change between time 0 and t, and is defined by

L (t) = \int_{0}^{t} \frac{d t_{1}}{τ (t_{1})} = \int_{0}^{t} d t_{1} \sqrt{E (t_{1})} = \int_{0}^{t} d t_{1} \sqrt{\int_{- \infty}^{\infty} d x \frac{1}{p (x, t_{1})} {[\frac{\partial p (x, t_{1})}{\partial t_{1}}]}^{2}} .

(1)

Here,

E (t_{1}) = \int_{- \infty}^{\infty} d x \frac{1}{p (x, t_{1})} {[\frac{\partial p (x, t_{1})}{\partial t_{1}}]}^{2}

is the square of the information velocity (recalling we are working with the unit where the distance given by the information length has no dimension). As we can see, to define the information length, we compute the dynamic time unit

τ (t) = \frac{1}{\sqrt{E}}

, which quantifies the correlation time over which the PDF

p (x, t)

changes. Besides,

τ

serves as the time unit in the statistical space. Alternatively, the information velocity

\frac{1}{τ (t_{1})}

quantifies the (average) rate of change of information in time.

2.2. Problem Formulation

We consider the following linear autonomous process

\dot{x} (t) = A x (t) + Γ (t) .

(2)

Here,

A

is an

n \times n

constant real matrix;

Γ \in R^{n}

is a stochastic driving given by a n dimensional vector of

δ

-correlated Gaussian noises

Γ_{i}

(

i = 1, 2, . . . n

), with the following statistical property

〈 Γ_{i} (t) 〉 = 0, 〈 Γ_{i} (t) Γ_{j} (t_{1}) 〉 = 2 D_{i j} δ (t - t_{1}), D_{i j} = D_{j i}, \forall i, j = 1, \dots, n .

(3)

Note that

D_{i i}

represents the strength of the i-th stochastic noise while

D_{i j}

for

i \neq j

denotes the correlation between i-th and j-th noises (i.e., random fluctuations). Then, from the joint PDF

p (x, t)

, we define the information length

L

of system (2) by the following integral

L (t) = \int_{0}^{t} d t_{1} \sqrt{\int_{- \infty}^{\infty} d x \frac{1}{p (x, t_{1})} {[\frac{\partial p (x, t_{1})}{\partial t_{1}}]}^{2}} = \int_{0}^{t} d t_{1} \sqrt{E},

(4)

where

E = \int_{- \infty}^{\infty} d x \frac{1}{p (x, t_{1})} {[\frac{\partial p (x, t_{1})}{\partial t_{1}}]}^{2}

is the square of the information velocity.

The first goal of this paper is to provide theoretical results for the information length (4) for the system (2) and (3). This is done in the following Section 3.

3. General Analytical Results

In the section, we provide the analytical results for Problem 2.1, summarizing the main steps required to calculate information length (4). To this end, we assume that an initial PDF is Gaussian and then take the advantage of the fact that a linear process driven by a Gaussian noise with an initial Gaussian PDF is always Gaussian. The joint PDF for (2) and (3) is thus Gaussian, whose form is provided below.

Proposition 1

(Joint probability). The system (2) and (3) for a Gaussian random variable

x

at any time t has the following joint PDF

p (x, t) = \frac{1}{\sqrt{det (2 π Σ)}} e^{- \frac{1}{2} {(x - 〈 x (t) 〉)}^{T} Σ^{- 1} (x - 〈 x (t) 〉)},

(5)

where

\begin{matrix} 〈 x (t) 〉 & = & e^{A t} 〈 x (0) 〉, \end{matrix}

(6)

\begin{matrix} Σ (t) & = & e^{A t} 〈δ x (0) δ x {(0)}^{T}〉 e^{A^{T} t} + 2 \int_{0}^{t} e^{A (t - t_{1})} D e^{A^{T} (t - t_{1})} d t_{1}, \end{matrix}

(7)

and

D \in R^{n \times n}

is the matrix of elements

D_{i j}

. Here,

〈 x (t) 〉

is the mean value of

x (t)

while Σ is the covariance matrix.

Proof.

For a Gaussian PDF of

x

, all we need to calculate are the mean and covariance of

x

and substitute them in the general expression for multi-variable Gaussian distribution (5). To this end, we first write down the solution of Equation (2) as follows

x (t) = e^{A t} x (0) + \int_{0}^{t} e^{A (t - t_{1})} Γ (t_{1}) d t_{1} .

(8)

By taking the average of Equation (8), we find the mean value of

x (t)

of (8) as follows

〈 x (t) 〉 = 〈 e^{A t} x (0) 〉 + \int_{0}^{t} e^{A (t - t_{1})} 〈 Γ (t_{1}) 〉 d t_{1} = e^{A t} 〈 x (0) 〉,

(9)

which is Equation (6). On the other hand, to find covariance

Σ (t)

, we let

x = 〈 x 〉 + δ x

, and use the property

〈 δ x (0) Γ (t) 〉 = 0

to find

\begin{matrix} Σ (t) & = & 〈δ x δ x^{T}〉 \\ = & 〈(e^{A t} δ x (0) + \int_{0}^{t} e^{A (t - t_{2})} 〈 Γ (t_{2}) 〉 d t_{2}) {(e^{A t} δ x (0) + \int_{0}^{t} e^{A (t - t_{1})} 〈 Γ (t_{1}) 〉 d t_{1})}^{T}〉 \\ = & 〈(e^{A t} δ x (0) + \int_{0}^{t} e^{A (t - t_{2})} Γ (t_{2}) d t_{2}) (δ x {(0)}^{T} e^{A^{T} t} + \int_{0}^{t} Γ {(t_{1})}^{T} {(e^{A (t - t_{1})})}^{T} d t_{1})〉 \\ = & e^{A t} 〈δ x (0) δ x {(0)}^{T}〉 e^{A^{T} t} + 〈(\int_{0}^{t} e^{A (t - t_{2})} Γ (t_{2}) d t_{2}) (\int_{0}^{t} Γ {(t_{1})}^{T} e^{A^{T} (t - t_{1})} d t_{1})〉 \\ = & e^{A t} 〈δ x (0) δ x {(0)}^{T}〉 e^{A^{T} t} + \int_{0}^{t} \int_{0}^{t} e^{A (t - t_{2})} 〈 Γ (t_{2}) Γ {(t_{1})}^{T} 〉 e^{A^{T} (t - t_{1})} d t_{2} d t_{1} \\ = & e^{A t} 〈δ x (0) δ x {(0)}^{T}〉 e^{A^{T} t} + 2 \int_{0}^{t} e^{A (t - t_{1})} D e^{A^{T} (t - t_{1})} d t_{1} . \end{matrix}

(10)

Here

δ x (0) = δ x (t = 0)

is the initial fluctuation at

t = 0

. Equation (10) thus proves Equation (7). Substitution of Equations (6) and (7) in Equation (5) thus gives us a joint PDF

p (x, t)

☐

Next, in order to calculate the information length from the joint PDF

p (x, t)

in Equation (5), we now use the following Theorem:

Theorem 1

(Information Length). The information length of the joint PDF of system (2) and (3) is given by the following integral

L (t) = \int_{0}^{t} d t_{1} \sqrt{E (t_{1})} = \frac{1}{\sqrt{2}} \int_{0}^{t} d t_{1} \sqrt{\partial_{t_{1}} [t r (Q \partial_{t_{1}} Σ)] + 2 {〈 x^{'} (t_{1}) 〉}^{T} Q 〈 x^{'} (t_{1}) 〉 + tr (Q^{″} Σ)},

(11)

where

Q = Σ^{- 1}

(recall, a prime denotes

\frac{\partial}{\partial t}

).

Proof.

To prove this theorem, we use the PDF (5) in (4). To simplify the expression, we let

w \equiv δ x = x - 〈 x (t) 〉, Q = Σ^{- 1} .

We then compute step by step

\frac{{[\partial_{t_{1}} p (x, t_{1})]}^{2}}{p (x, t_{1})}

as follows:

\begin{matrix} \partial_{t_{1}} p (x, t_{1}) & = & \frac{\partial}{\partial t_{1}} [{(det (2 π Σ))}^{- \frac{1}{2}} e^{- \frac{1}{2} w^{T} Q w}] \\ = & - \frac{1}{2} e^{- \frac{1}{2} w^{T} Q w} {(det (2 π Σ))}^{- \frac{3}{2}} \partial_{t_{1}} (det (2 π Σ)) - \frac{1}{2} {(det (2 π Σ))}^{- \frac{1}{2}} e^{- \frac{1}{2} w^{T} Q w} \partial_{t_{1}} (w^{T} Q w), \end{matrix}

(12)

\begin{matrix} {[\partial_{t_{1}} p (x, t_{1})]}^{2} & = & \frac{1}{4} e^{- w^{T} Q w} {(det (2 π Σ))}^{- 3} {[\partial_{t_{1}} det (2 π Σ)]}^{2} + \frac{1}{4} {(det (2 π Σ))}^{- 1} e^{- w^{T} Q w} {(\partial_{t_{1}} [w^{T} Q w])}^{2} \\ + \frac{1}{2} {(det (2 π Σ))}^{- 2} \partial_{t_{1}} [det (2 π Σ)] \partial_{t_{1}} [w^{T} Q w] e^{- w^{T} Q w}, \end{matrix}

(13)

\begin{matrix} \frac{{[\partial_{t_{1}} p (x, t_{1})]}^{2}}{p (x, t_{1})} & = & \frac{1}{4} {(det (2 π Σ))}^{- \frac{5}{2}} {[\partial_{t_{1}} det (2 π Σ)]}^{2} e^{- \frac{1}{2} w^{T} Q w} + \frac{1}{4} {(det (2 π Σ))}^{- \frac{1}{2}} e^{- \frac{1}{2} w^{T} Q w} {(\partial_{t_{1}} (w^{T} Q w))}^{2} \\ + \frac{1}{2} {(det (2 π Σ))}^{- \frac{3}{2}} \partial_{t_{1}} (det (2 π Σ)) \partial_{t_{1}} (w^{T} Q w) e^{- \frac{1}{2} w^{T} Q w} . \end{matrix}

(14)

Now, using Equation (12) in Equation (14), we compute the integral

E (t_{1}) = \int_{- \infty}^{\infty} x (\frac{{[\partial_{t_{1}} p (x, t_{1})]}^{2}}{p (x, t_{1})})

as follows

\begin{matrix} E (t_{1}) & = & \int_{- \infty}^{\infty} p (x, t_{1}) (\frac{{(det (2 π Σ))}^{- 2}}{4} {[\partial_{t_{1}} (det (2 π Σ))]}^{2} + \frac{{(\partial_{t_{1}} (w^{T} Q w))}^{2}}{4} + \frac{\partial_{t_{1}} [det (2 π Σ)] \partial_{t_{1}} [w^{T} Q w]}{2 det (2 π Σ)}) d x \\ = & 〈{(\frac{\partial_{t_{1}} [det (2 π Σ)]}{2 det (2 π Σ)})}^{2}〉 + 〈{(\frac{\partial_{t_{1}} [w^{T} Q w]}{2})}^{2}〉 + 〈\frac{\partial_{t_{1}} [det (2 π Σ)] \partial_{t_{1}} [w^{T} Q w]}{2 det (2 π Σ)}〉 . \end{matrix}

(15)

To calculate the three averages in (15), we use the properties

\int_{- \infty}^{\infty} e^{- \frac{1}{2} w^{T} Q w} w = \sqrt{det (2 π Σ)}

[32],

\partial_{t_{1}} [e^{- \frac{1}{2} w^{T} Q w}] = - \frac{1}{2} e^{- \frac{1}{2} w^{T} Q w} \partial_{t_{1}} [w^{T} Q w]

and

\partial_{t_{1} t_{1}} [e^{- \frac{1}{2} w^{T} Q w}] = - \frac{1}{2} \partial_{t_{1} t_{1}} [w^{T} Q w] e^{- \frac{1}{2} w^{T} Q w} + \frac{1}{4} e^{- \frac{1}{2} w^{T} Q w} {(\partial_{t_{1}} [w^{T} Q w])}^{2}

. We then have

\begin{matrix} E (t_{1}) & = & {(\frac{\partial_{t_{1}} [det (2 π Σ)]}{2 det (2 π Σ)})}^{2} + \int_{- \infty}^{\infty} p (x, t_{1}) {(\frac{\partial_{t_{1}} [w^{T} Q w]}{2})}^{2} d x + \frac{\partial_{t_{1}} [det (2 π Σ)]}{2 det (2 π Σ)} \int_{- \infty}^{\infty} p (x, t_{1}) \partial_{t_{1}} [w^{T} Q w] d x \\ = & \frac{1}{4} {(\frac{\partial_{t_{1}} [det (2 π Σ)]}{det (2 π Σ)})}^{2} + \frac{1}{4 {(det (2 π Σ))}^{\frac{1}{2}}} \int_{- \infty}^{\infty} [4 \partial_{t_{1} t_{1}} e^{- \frac{1}{2} w^{T} Q w} + 2 \partial_{t_{1} t_{1}} [w^{T} Q w] e^{- \frac{1}{2} w^{T} Q w}] d x \\ - 2 \frac{\partial_{t_{1}} [det (2 π Σ)]}{2 {(det (2 π Σ))}^{\frac{3}{2}}} \int_{- \infty}^{\infty} \partial_{t_{1}} [e^{- \frac{1}{2} w^{T} Q w}] d x \\ = & \frac{1}{4} {(\frac{\partial_{t_{1}} [det (2 π Σ)]}{det (2 π Σ)})}^{2} + \frac{\partial_{t_{1} t_{1}} [\sqrt{det (2 π Σ)}]}{\sqrt{det (2 π Σ)}} + \frac{1}{2 \sqrt{det (2 π Σ)}} \int_{- \infty}^{\infty} \partial_{t_{1} t_{1}} [w^{T} Q w] e^{- \frac{1}{2} w^{T} Q w} d x \\ - \frac{\partial_{t_{1}} [det (2 π Σ)]}{{(det (2 π Σ))}^{\frac{3}{2}}} \partial_{t_{1}} \sqrt{det (2 π Σ)} . \end{matrix}

(16)

Here

\partial_{t_{1} t_{1}} [w^{T} Q w] = \sum_{i, j = 1}^{n} [\partial_{t_{1} t_{1}} (q_{i j} w_{i} w_{j})] = \sum_{i, j = 1}^{n} [4 q_{i j}^{'} w_{i}^{'} w_{j} + \underset{independent of x}{\underset{︸}{2 q_{i j} w_{i}^{'} w_{j}^{'}}} + 2 q_{i j} w_{i}^{″} w_{j} + \underset{w^{T} Q^{″} w}{\underset{︸}{q_{i j}^{″} w_{i} w_{j}}}] .

(17)

We recall that

ω_{i}^{'}, q_{i j}^{'}

and

ω_{i}^{″}, q_{i j}^{″}

denote the first and second derivative over time of the elements

ω_{i}

and

q_{i j}

. By substituting (17) in (16) and making some arrangements, we obtain

\begin{matrix} E (t_{1}) & = & \frac{1}{4} {(\frac{\partial_{t_{1}} [det (2 π Σ)]}{det (2 π Σ)})}^{2} + \frac{\partial_{t_{1} t_{1}} [\sqrt{det (2 π Σ)}]}{\sqrt{det (2 π Σ)}} + \frac{1}{2} {〈4 \sum_{i, j = 1}^{n} q_{i j}^{'} w_{i}^{'} w_{j}〉}^{0} + \frac{1}{2} {〈\sum_{i, j = 1}^{n} 2 q_{i j} w_{i}^{″} w_{j}〉}^{0} \\ + \frac{1}{2} 〈\sum_{i, j = 1}^{n} 2 q_{i j} w_{i}^{'} w_{j}^{'}〉 + \frac{1}{2} 〈w^{T} Q^{″} w〉 - \frac{1}{2} {(\frac{\partial_{t_{1}} [det (2 π Σ)]}{det (2 π Σ)})}^{2} . \end{matrix}

(18)

Now with the help of the following relations

〈w^{T} Q^{″} w〉 = tr (Q^{″} Σ)

[33],

\partial_{t_{1}} det (Σ) = det (Σ) tr (Q \partial_{t_{1}} Σ)

[34], and

\partial_{t_{1} t_{1}} \sqrt{det (2 π Σ)} = \frac{1}{4} \sqrt{det (2 π Σ)} {(tr (Q \partial_{t_{1}} Σ))}^{2} + \frac{1}{2} \sqrt{det (2 π Σ)} \partial_{t_{1}} (tr (Q \partial_{t_{1}} Σ))

, we then have

\begin{matrix} E (t_{1}) & = & - \frac{1}{4} {(tr (Q \partial_{t_{1}} Σ))}^{2} + \frac{1}{2} \partial_{t_{1}} [tr (Q \partial_{t_{1}} Σ)] + \frac{1}{4} {(tr (Q \partial_{t_{1}} Σ))}^{2} \\ + {〈 x^{'} (t_{1}) 〉}^{T} Q 〈 x^{'} (t_{1}) 〉 + \frac{1}{2} tr (Q^{″} Σ) \\ = & \frac{1}{2} \partial_{t_{1}} [tr (Q \partial_{t_{1}} Σ)] + {〈 x^{'} (t_{1}) 〉}^{T} Q 〈 x^{'} (t_{1}) 〉 + \frac{1}{2} tr (Q^{″} Σ) . \end{matrix}

(19)

Equation (19) thus proves Equation (11). ☐

Given important properties of the covariance matrix eigenvalues (see, e.g., [35]), it is useful to express Equation (19) and the information length as a function of these covariance matrix eigenvalues. This is done in the following Corollary.

Corollary 1.

Let

φ_{i} (t)

’s (

i = 1, . . . n

) be the eigenvalues of the covariance matrix Σ, and

\bar{x} = {〈 x^{'} (t) 〉}^{T} P

where

P

is an orthonormal matrix whose column vectors are linearly independent eigenvectors of

Q = Σ^{- 1}

. We can rewrite the information length (11) as

L (t) = \int_{0}^{t} d t_{1} \sqrt{E (t_{1})} = \int_{0}^{t} d t_{1} \sqrt{\sum_{i = 1}^{n} (\frac{{φ^{'}}_{i}^{2} (t_{1}) + 2 φ_{i} (t_{1}) {\bar{x}}_{i}^{2}}{2 φ_{i}^{2} (t_{1})})} .

(20)

Proof.

The proof follows straightforwardly from the fact that

Σ

is a symmetric matrix which can be diagonalised by finding the orthonormal matrix

P

such that

P^{T} Σ^{- 1} P = Φ

. Here

Φ

is the diagonal matrix whose entries are the eigenvalues

\frac{1}{φ_{i} (t)}

\forall i = 1, 2, \dots, n

(recall that

φ_{i} (t)

is i-th the eigenvalue of

Σ^{- 1}

). This gives us

E (t) = \frac{1}{2} \sum_{i = 1}^{n} (\partial_{t} [\frac{φ_{i}^{'} (t)}{φ_{i} (t)}] + φ_{i} (t) \partial_{t t} [\frac{1}{φ (t)}] + 2 \frac{{\bar{x}}_{i}^{2}}{φ_{i} (t)}) = \sum_{i = 1}^{n} (\frac{{φ^{'}}_{i}^{2} (t) + 2 φ_{i} (t) {\bar{x}}_{i}^{2}}{2 φ_{i}^{2} (t)}) .

(21)

This finishes the proof. ☐

It is useful to check that Equation (20) reproduces the previous result for the O-U process [36]

E = \frac{β^{' 2}}{2 β^{2}} + 2 β {〈 x^{'} 〉}^{2},

(22)

where

β = \frac{1}{2 〈 {(x - 〈 x 〉)}^{2} 〉}

is the inverse temperature. Here,

β^{'}

denotes the time derivative of

β

. To show this, we note that for the O-U process, the covariance matrix is a scalar (

n = 1

) with the value

Σ = \frac{1}{2 β} = φ (t)

and thus

Q = \frac{1}{φ (t)} = 2 β

while

〈 x^{'} (t) 〉 = 〈 x^{'} 〉

. Thus,

E (t) = \frac{1}{2} {(\frac{- \frac{1}{2} β^{- 2} β^{'}}{\frac{1}{2 β}})}^{2} + 2 β {〈 x^{'} 〉}^{2} = \frac{β^{' 2}}{2 β^{2}} + 2 β {〈 x^{'} 〉}^{2} .

In sum, for the O-U process, the square of the information velocity (shown in expression (22)) increases with the ‘roughness’ of the process, as quantified by the squared ratio of the rate of change of the inverse temperature (or precision) and the precision – plus a term that depends upon this precision times the variance of the state velocity.

4. Kramers Equation

In this section we apply our results in Section 3 to the Kramers equation for a harmonically bound particle [19,37]. As noted in Introduction, we investigate the behaviour of the information length when varying various parameters and initial conditions to elucidate how the information geometry is affected by the damping, oscillations, strength of the stochastic noises and initial mean values.

Consider the Kramers equation

\begin{matrix} \frac{d x}{d t} & = & v \\ \frac{d v}{d t} & = & - γ v - ω^{2} x + ξ (t) . \end{matrix}

(23)

Here,

ω

is a natural frequency and

γ

is the damping constant, both positive real numbers.

ξ

is a Gaussian white-noise acting on v with the zero mean value

〈 ξ (t) 〉 = 0

, with the statistical property

〈 ξ (t) ξ (t_{1}) 〉 = 2 D δ (t - t_{1}) .

(24)

Comparing Equations (23) and (24) with Equations (2) and (3), we note that

x_{1} = x

,

x_{2} = v

,

ξ_{1} = 0

,

ξ_{2} = ξ

,

D_{11} = 0

,

D_{12} = 0

, and

D_{22} = D

while the matrix

A

for (23) has the element

A_{11} = 0, A_{12} = 1, A_{21} = - ω^{2}, A_{22} = - γ

. Thus, the eigenvalues of

A

are

λ_{1, 2} = - \frac{1}{2} (γ \pm \sqrt{γ^{2} - 4 ω^{2}})

.

To find the information length for the system (23), we use Proposition 1 and Theorem 1. First, Proposition 1 requires the computation of the exponential matrix

e^{A t}

involving a rather long algebra with the help of [38]. The result is:

\begin{matrix} e^{A t} = L^{- 1} [{(s I - A)}^{- 1}] = L^{- 1} [[\begin{matrix} \frac{s + γ}{(s - λ_{1}) (s - λ_{2})} & \frac{1}{(s - λ_{1}) (s - λ_{2})} \\ - \frac{ω^{2}}{(s - λ_{1}) (s - λ_{2})} & \frac{s}{(s - λ_{1}) (s - λ_{2})} \end{matrix}]] = [\begin{matrix} \frac{e^{λ_{1} t} (γ + λ_{1}) - e^{λ_{2} t} (γ + λ_{2})}{λ_{1} - λ_{2}} & \frac{e^{λ_{1} t} - e^{λ_{2} t}}{λ_{1} - λ_{2}} \\ - \frac{(e^{λ_{1} t} - e^{λ_{2} t}) ω^{2}}{λ_{1} - λ_{2}} & \frac{e^{λ_{1} t} λ_{1} - e^{λ_{2} t} λ_{2}}{λ_{1} - λ_{2}} \end{matrix}] . \end{matrix}

(25)

Here,

I \in R^{n \times n}

is the identity matrix. Similarly, we can show

\begin{matrix} 2 \int_{0}^{t} e^{A (t - t_{1})} D e^{A^{T} (t - t_{1})} d t_{1} = \\ [\begin{matrix} \frac{D (\frac{- 1 + e^{2 λ_{1} t}}{λ_{1}} + \frac{- λ_{1} - 4 e^{(λ_{1} + λ_{2}) t} λ_{2} + 3 λ_{2} + e^{2 λ_{2} t} (λ_{1} + λ_{2})}{λ_{2} (λ_{1} + λ_{2})})}{{(λ_{1} - λ_{2})}^{2}} & \frac{D {(e^{λ_{1} t} - e^{λ_{2} t})}^{2}}{{(λ_{1} - λ_{2})}^{2}} \\ \frac{D {(e^{λ_{1} t} - e^{λ_{2} t})}^{2}}{{(λ_{1} - λ_{2})}^{2}} & \frac{D ((- 1 + e^{2 λ_{2} t}) λ_{2} + λ_{1} (- \frac{4 e^{(λ_{1} + λ_{2}) t} λ_{2}}{λ_{1} + λ_{2}} + \frac{4 λ_{2}}{λ_{1} + λ_{2}} + e^{2 λ_{1} t} - 1))}{{(λ_{1} - λ_{2})}^{2}} \end{matrix}] . \end{matrix}

(26)

Using Equations (25) and (26) in Equations (6) and (7), we have the time-dependent (joint) PDF (5) at any time t for our system (23) and (24). To calculate Equation (11) with the help of Equations (25) and (26), we perform numerical simulations (integrations) for various parameters in Equations (23) and (24) as well as initial conditions. Note that while we have simulated many different cases, for illustration, we show some representative cases by varying D,

ω

,

γ

and

〈 x (0) 〉

,

〈 v (0) 〉

in Section 4.1, Section 4.2 and Section 4.3 and Appendix A, respectively, for the same initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01

and

Σ_{12} (0) = Σ_{21} (0) = 0

. Note that the initial marginal distributions of

p (x (0))

and

p (v (0))

are Gaussian with the same variance

0.01

. Results in the limit

ω \to 0

are presented in Section 4.4.

4.1. Varying D

Figure 1 shows the results when varying D as

D \in (0.0005, 0.04)

for the fixed parameters

γ = 2

and

ω = 1

. The initial joint PDFs are Gaussian with the fixed mean values

〈 x (0) 〉 = - 0.5

,

〈 v (0) 〉 = 0.7

; as noted above, the covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01

and

Σ_{12} (0) = Σ_{21} (0) = 0

. Consequently, at

t = 0

, the marginal distributions of

p (x (0))

and

p (v (0))

are Gaussian PDFs with the same variance

0.01

and the mean values

〈 x (0) 〉 = - 0.5

and

〈 v (0) 〉 = 0.7

, respectively.

Figure 1a,b show the snapshots of time-dependent joint PDF

p (x, t)

(in contour plots) for the two different values of

D = 0.0005

and

D = 0.04

, respectively. The black solid represents the phase portrait of the mean value of

〈 x (t) 〉

and

〈 v (t) 〉

while the red arrows display the direction of time increase. Note that in Figure 1b, only some of the initial snapshots of the PDFs are shown for clarity, given the great amount of overlapping between different PDFs. Figure 1c,d show the time-evolution of the information velocity

E (t)

and information length

L (t)

, respectively, for different values of

D \in (0.0005, 0.04)

. It can be seen that the system approaches a stationary (equilibrium) state for

t ≳ 20

for all values of D,

L (t)

approaching constant values (recall

L (t)

does not change in a stationary state). Therefore, we approximate the total information length as

L_{\infty} = L (t = 50)

, for instance. Finally, the total information length

L_{\infty} = L (t = 50)

is shown in Figure 1e. We determine the dependence of

L_{\infty}

on D by fitting an exponential function as

L_{\infty} (D) = 7.84 e^{- 329.05 D} + 11.21 e^{- 11.86 D}

(shown in red solid line).

4.2. Varying $ω$ or $γ$

We now explore how results depend on the two parameters

ω

and

γ

, associated with oscillation and damping, respectively. To this end, we use

D = 0.0005

and the same initial conditions as in Figure 1 but vary

ω \in (0, 2)

and

γ \in (0, 6)

in Figure 2 and Figure 3, respectively. Specifically, in different panels of these figures, we show the snapshots of the joint PDF

p (x, t)

, the time-evolutions of

E (t)

and

L (t)

for different values of

ω \in (0, 2)

and

γ \in (0, 6)

, and

L_{\infty}

against either

ω

or

γ

. From Figure 2e and Figure 3e, we can see that the system is in a stationary state for sufficiently large

t = 10

and

t = 100

, respectively. Thus, we use

L_{\infty} = L (t = 10) = L (10)

in Figure 2f,g and

L_{\infty} = L (t = 100) = L (10)

in Figure 3f,g.

Notably, Figure 2f,g (shown on linear-linear and log-linear scales on

x - y

axes, respectively) exhibit an interesting a non-monotonic dependence of

L_{\infty}

on

ω

for the fixed

γ = 2

, with the presence of a distinct minimum in

L_{\infty}

at certain

ω

. Similarly, Figure 3f,g (shown in linear-linear and log-log scales on

x - y

axes, respectively) also shows a non-monotonic dependence of

L_{\infty}

on

γ

for the fixed

ω = 1

. These non-monotonic dependences are more clearly seen in Figure 2g and Figure 3g. A close inspection of these figures then reveals that the minimum value of

L_{\infty}

occurs close to the critical damping (CD)

γ \sim 2 ω

; specifically, this happens at

ω \sim 1

for

γ = 2

in Figure 2f,g while at

γ \sim 2

for

ω = 1

in Figure 3f,g. We thus fit

L_{\infty}

against

ω

or

γ

depending on whether

ω

or

γ

is smaller/larger than its critical value as follows:

\begin{matrix} L_{10} (ω) & = & - 0.03 e^{4.34 ω} + 19.63 e^{0.06 ω} \forall ω \in (0, 1), \end{matrix}

(27)

\begin{matrix} L_{10} (ω) & = & 19.52 e^{- 0.12 ω} + 0.11 e^{2.48 ω} \forall ω \in (1, 2), \end{matrix}

(28)

\begin{matrix} L_{100} (γ) & = & 413.22 e^{- 12.4 γ} + 95.39 e^{- 1.02 γ} \forall γ \in (0, 2), \end{matrix}

(29)

\begin{matrix} L_{100} (γ) & = & 3.23 γ \forall γ \in (2, 6) . \end{matrix}

(30)

The fitted curves in Equations (27)–(30) are superimposed in Figure 2f and Figure 3f, respectively. It is important to notice from Equations (27)–(30) that

L_{\infty}

tends to increase as either

ω \to \infty

for a finite, fixed

γ

(

< \infty

) or

γ \to \infty

for a finite, fixed

ω

(

< \infty

).

Finally, we note that for the critical damping

γ = 2 ω

, the eigenvalue becomes a real double root with the value

λ_{1, 2} \to - ω

. Thus, in this limit, we have that

〈 x (t) 〉 = [\begin{matrix} e^{- t ω} (x (0) + t (v (0) + (γ - ω) x (0))) \\ e^{- t ω} (- t x (0) ω^{2} - t v (0) ω + v (0)) \end{matrix}],

(31)

and

Σ (t)

is composed by the following elements

\begin{matrix} Σ_{11} (t) & = & \frac{e^{- 2 t ω} (2 ω^{3} (Σ_{11} {(γ t - t ω + 1)}^{2} + t^{2} ((Σ_{12} + Σ_{21}) (γ - ω) + Σ_{22}) + t (Σ_{12} + Σ_{21})) + D (- 2 t ω (t ω + 1) + e^{2 t ω} - 1))}{2 ω^{3}}, \\ Σ_{12} (t) & = & e^{- 2 t ω} (t (- ω^{2} (Σ_{11} γ t + Σ_{11} + Σ_{21} t) + Σ_{11} t ω^{3} - Σ_{22} t ω + Σ_{22} + D t) - Σ_{12} (t ω - 1) (γ t - t ω + 1)), \\ Σ_{21} (t) & = & e^{- 2 t ω} (t (- ω^{2} (Σ_{11} γ t + Σ_{11} + Σ_{12} t) + Σ_{11} t ω^{3} - Σ_{22} t ω + Σ_{22} + D t) - Σ_{21} (t ω - 1) (γ t - t ω + 1)), \\ Σ_{22} (t) & = & \frac{e^{- 2 t ω} (2 t ω^{2} (t ω^{2} (Σ_{11} ω + Σ_{12} + Σ_{21}) - ω (Σ_{12} + Σ_{21}) + Σ_{22} (t ω - 2)) + 2 Σ_{22} ω + D (- 2 t ω (t ω - 1) + e^{2 t ω} - 1))}{2 ω} . \end{matrix}

(32)

Equations (31) and (32) are used in Section 4.1 (Figure 1).

4.3. Varying $〈 x (0) 〉$ or $〈 v (0) 〉$

To elucidate the information geometry associated with the Kramer equation (Equations (23) and (24)), we now investigate how

L_{\infty}

behaves near the equilibrium point

〈 x (0) 〉 = 〈 v (0) 〉 = 0

. To this end, we scan over

〈 x (0) 〉

for

〈 v (0) 〉 = 0

in Figure 4a–e while scanning over

〈 v (0) 〉

for

〈 x 0) 〉 = 0

in Figure 4f–i. For our illustrations in Figure 4, we use the same initial covariance matrix

Σ (0)

as in Figure 1, Figure 2 and Figure 3,

D = 0.0005

and

ω = 1

and a few different values of

γ

(above/below/at the critical value

γ = 2

). We note that the information geometry near a non-equilibrium point is studied in Appendix A.

Specifically, snapshots of

p (x, t)

are shown in Figure 4a–f for

γ = 2.5

(above its critical value

γ = 2 = 2 ω

) while those in Figure 4c–g are for

γ = 0.1

below the critical value 2. By approximating

L_{\infty} = L (t = 100)

, we then show how

L_{\infty}

depends on

〈 x (0) 〉

and

〈 v (0) 〉

for different values of

γ

in Figure 4d,e and Figure 4h,i, respectively.

Figure 4d,e show the presence of a minimum in

L_{\infty}

at the equilibrium

〈 x (0) 〉 = 0

(recall

〈 v (0) 〉 = 0

);

L_{\infty}

is a linear function of

〈 x (0) 〉

for

〈 x (0) 〉 ≫ 0.1

, which can be described as

L_{\infty} (x (0), γ) = h (γ) | 〈 x (0) 〉 | + f (γ)

. Here,

h (γ)

and

f (γ)

are constant functions depending on

γ

for a fixed

ω

which represent the slope and the y-axis intercept, respectively. A non-zero value of

L_{\infty}

at

〈 x (0) 〉 = 0

is caused by the adjustment (oscillation and damping) of the width of the PDFs in time due to the disparity between the width of the initial and equilibrium PDFs (see Figure 4b). In other words, even though the mean values remain in equilibrium for all time

{[〈 x (0) 〉, 〈 v (0) 〉]}^{T} = {lim}_{t \to \infty} 〈 x (t) 〉 = {[0, 0]}^{T}

, the information length (11) depends on the covariance matrix

Σ

which changes from its initial value to the final equilibrium value as follows

Σ (0) = [\begin{matrix} 0.01 & 0 \\ 0 & 0.01 \end{matrix}] to lim_{t \to \infty} Σ (t) = [\begin{matrix} \frac{D}{γ ω^{2}} & 0 \\ 0 & \frac{D}{γ} \end{matrix}] .

On the other hand,

L_{\infty}

against

〈 x (0) 〉

shows parabolic behaviour for small

〈 x (0) 〉 < 0.1

in Figure 4e. This is caused by the finite width

0.1 = \sqrt{Σ_{11} (0)} = \sqrt{Σ_{22} (0)}

of the initial

p (x, 0)

; we see that

〈 x (0) 〉 < 0.1

is within the uncertainty of the initial

p (x, 0)

.

Similarly, Figure 4h,i exhibit a minimum in

L_{\infty}

at the equilibrium

〈 v (0) 〉 = 0

(recall

〈 x (0) 〉 = 0

in this case);

L_{\infty}

is a linear function of

〈 v (0) 〉

for

〈 v (0) 〉 ≫ 0.1

described by

L_{\infty} (v (0), γ) = H (γ) | 〈 v (0) | + F (γ)

(again parabolic for

〈 v (0) 〉 < 0.1

, see Figure 4i). Here again,

H (γ)

and

F (γ)

are constant functions depending on

γ

for a fixed

ω

which represent the slope and the y-axis intercept, respectively.

Finally, Figure 4j shows in logarithmic scale that the minimum value of

L_{\infty}

at

〈 x (0) 〉 = 〈 v (0) 〉

monotonically increases with

γ

.

4.4. The Limit Where $ω \to 0$

When the natural frequency

ω = 0

(i.e., damped-driven system like the O-U process [36]) in Equation (23), the two eigenvalues of the matrix

A

become

λ_{1} \to - γ

and

λ_{2} \to 0

. It then easily follows that

〈 x (t) 〉 = [\begin{matrix} \frac{v (0) - e^{- γ t} v (0)}{γ} + x (0) \\ e^{- γ t} v (0) \end{matrix}],

(33)

and

Σ (t)

is composed by the elements

\begin{matrix} Σ_{11} (t) & = & \frac{e^{- 2 γ t} (- D + Σ_{22} (0) γ + e^{γ t} (4 D - γ (2 Σ_{22} (0) + (Σ_{12} (0) + Σ_{21} (0)) γ)) + e^{2 γ t} (γ (Σ_{22} (0) + γ (Σ_{12} (0) + Σ_{21} (0) + Σ_{11} (0) γ)) + D (2 γ t - 3)))}{γ^{3}}, \\ Σ_{12} (t) & = & \frac{e^{- 2 γ t} (D {(- 1 + e^{γ t})}^{2} - Σ_{22} (0) γ + e^{γ t} γ (Σ_{22} (0) + Σ_{12} (0) γ))}{γ^{2}}, \\ Σ_{21} (t) & = & \frac{e^{- 2 γ t} (D {(- 1 + e^{γ t})}^{2} - Σ_{22} (0) γ + e^{γ t} γ (Σ_{22} (0) + Σ_{21} (0) γ))}{γ^{2}}, \\ Σ_{22} (t) & = & \frac{e^{- 2 γ t} (D (- 1 + e^{2 γ t}) + Σ_{22} (0) γ)}{γ} . \end{matrix}

(34)

To investigate the case of

ω \to 0

, we consider the scan over

D \in (0.0005, 0.04)

for the same parameter value

γ = 2

, and the initial conditions as in Figure 1, apart from using

ω = 0

instead of

ω = 1

. Figure 5 presents the results – snapshots of

p (x, t)

, time evolutions of

E (t)

,

L (t)

, and

L_{\infty} = L (t = 50)

against D in Figure 5a–e. In particular, in Figure 5e, we identify the dependence of

L_{\infty}

on D by fitting the results to the curve

L_{=} 8.99 e^{- 324.19 D} + 10.83 e^{- 12.24 D}

.

5. Concluding Remarks

We have presented theoretical results of time-dependent PDFs and the information length for n-th order linear autonomous stochastic processes, which can be applied to a variety of practical problems. In particular, the information length diagnostics was found as a function of the mean and covariance matrices; the latter was further expressed in terms of the covariance matrix eigenvalues. A Specific application was made to a harmonically bound particle system with the natural oscillation frequency

ω

, subject to a damping

γ

and a Gaussian white-noise (Kramer equation). We investigated how the information length depends on

ω

and

γ

, elucidating the role of critical damping

γ = 2 ω

in information geometry. The fact that the information length tends to take its minimum value near the critical damping can be viewed as the simplification of dynamics and thus the decrease in information change due to the reduction of the two characteristic time scales associated with

ω

and

γ

to the one value. On the other hand, the information length in the long time limit was shown to preserve the linear geometry associated with the Gaussian statistics in a linear stochastic process, as in the case of the O-U process.

Future works would include the exploration of our results when applied to high-dimensional processes and the extension of our work to a more general (e.g., finite-correlated) stochastic noise, non-autonomous systems or non-linearly coupled systems. In particular, it will be of interest to look for a geodesic solution in non-autonomous systems [9] with the help of an external force, optimization or guiding self-organization (multi-agent systems) as well as elucidating the role of critical damping and resonances in self-organization. In addition, it would also be interesting to utilize the results introduced in [39] to predict the bound on the evolution of any observable for the Kramers problem (23), and compare it with a natural observable in such a system, the energy, for instance.

Author Contributions

Conceptualization, A.-J.G.-C. and E.-j.K.; Formal analysis, A.-J.G.-C.; Investigation, A.-J.G.-C. and E.-j.K.; Methodology, E.-j.K.; Project administration, E.-j.K.; Resources, A.-J.G.-C.; Software, A.-J.G.-C.; Supervision, E.-j.K.; Validation, A.-J.G.-C. and E.-j.K.; Visualization, A.-J.G.-C. and E.-j.K.; Writing—original draft, A.-J.G.-C. and E.-j.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no funding.

Acknowledgments

E.-j.K. acknowledges the Leverhulme Trust Research Fellowship (RF-2018-142-9).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Analysis for Non-Zero Fixed Initial Conditions

In Section 4.3 we analysed the behaviour of the information geometry associated with the Kramer equation (Equations (23) and (24)) for different

γ \in (0, 2.5)

near the equilibrium point

〈 x (0) 〉 = 〈 v (0) 〉 = 0

. To this end, we plotted

L_{\infty}

when varying

〈 x (0) 〉

and

〈 v (0) 〉

for a fixed

〈 v (0) 〉 = 0

and

〈 x (0) 〉 = 0

, respectively. In this Appendix, we want to show how such information geometry changes near a non-equilibrium point by scanning over

〈 x (0) 〉

and

〈 v (0) 〉

for a fixed non-zero

〈 v (0) 〉 = 0.7

and

〈 x (0) 〉 = - 0.5

, respectively. We show that the use of non-zero fixed initial conditions changes the location of the minimum

L_{\infty}

depending on

γ

. Here, we use the same parameter values

D = 0.0005

,

ω = 1

,

Σ_{12} (0) = Σ_{21} (0) = 0

and

Σ_{11} (0) = Σ_{22} (0) = 0.01

.

First, snapshots of

p (x, t)

are shown in Figure A1a,f for

γ = 2.5

(above its critical value

γ = 2 = 2 ω

) while those in Figure A1b,g are for

γ = 0.1

below the critical value 2. It is important to notice that there is a non-symmetric behaviour of the trajectories of the system for

γ ≫ 0

. This is shown at Figure A1a,f whose trajectories asymmetrically vary over the initial conditions in comparison with the results shown in Figure 4a,f. By approximating

L_{\infty} = L (t = 100)

, we then show how

L_{\infty}

depends on

〈 x (0) 〉

and

〈 v (0) 〉

for different values of

γ

in Figure A1c,d and Figure A1h,i, respectively. Of prominence in Figure A1c,d is the presence of a distinct minimum in

L_{\infty}

for a particular value of

〈 x (0) 〉 = x_{c}

,

L_{\infty}

linearly increasing with

| 〈 x (0) 〉 - x_{c} |

for a sufficiently large

| 〈 x (0) 〉 - x_{c} |

; similarly, Figure A1h,i shows a distinct minimum in

L_{\infty}

for a particular value of

〈 v (0) 〉 = v_{c}

,

L_{\infty}

linearly increasing with

| 〈 v (0) 〉 - v_{c} |

for a sufficiently large

| 〈 v (0) 〉 - v_{c} |

.

Finally, we scan over

〈 x (0) 〉

and

〈 v (0) 〉

and identify the minimum value of

L_{\infty}

for a given

γ

and plot this minimum value of

L_{\infty}

(at

x_{c}

and

v_{c}

) against

γ

in Figure A1e,j. In Figure A1e,j,

L_{\infty}

against

γ

takes its minimum near the critical damping

γ = 2 ω = 2

(shown in a vertical line), as observed previously in Section 4.1 and Section 4.2. This is clearly different from the behaviour of the minimum value of

L_{\infty}

against

γ

(for the equilibrium point

〈 x (0) 〉 = 0

and

〈 v (0) 〉 = 0

) in Figure 4j where

L_{\infty}

monotonically increases with

γ

. This is because for

〈 x (0) 〉 = 0

and

〈 v (0) 〉 = 0

, mean values does not change over time, with less effect of oscillations (

ω

) and thus the critical damping

γ = 2 ω

.

Figure A1. Results of Equations (23) and (24) scanned over

〈 x (0) 〉 \in (- 5, 5)

for

〈 v (0) 〉 = 0.7

[Figure A1a–e] and

〈 v (0) 〉 \in (- 5, 5)

for

〈 x (0) 〉 = - 0.5

[Figure A1f–j]. The parameter values

ω = 1

,

D = 0.0005

, and

γ \in (0, 2.5)

while the initial covariance matrix

Σ (0)

has the elements

Σ_{11} (0) = Σ_{22} (0) = 0.01

,

Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure A1. Results of Equations (23) and (24) scanned over

〈 x (0) 〉 \in (- 5, 5)

for

〈 v (0) 〉 = 0.7

[Figure A1a–e] and

〈 v (0) 〉 \in (- 5, 5)

for

〈 x (0) 〉 = - 0.5

[Figure A1f–j]. The parameter values

ω = 1

,

D = 0.0005

, and

γ \in (0, 2.5)

while the initial covariance matrix

Σ (0)

has the elements

Σ_{11} (0) = Σ_{22} (0) = 0.01

,

Σ_{12} (0) = Σ_{21} (0) = 0

.

References

Klebaner, F. Introduction to Stochastic Calculus with Applications; Imperial College Press: London, UK, 2012. [Google Scholar]
Ramstead, M.; Friston, K.; Hipólito, I. Is the Free-Energy Principle a Formal Theory of Semantics? From Variational Density Dynamics to Neural and Phenotypic Representations. Entropy 2020, 22, 889. [Google Scholar] [CrossRef]
Gershenson, C. Guiding the Self-organization of Cyber-Physical Systems. Front. Robot. AI 2020, 7, 41. [Google Scholar] [CrossRef] [Green Version]
Prokopenko, M.; Boschett, F.; Ryan, A. An information-theoretic primer on complexity, self-organization. Complexity 2009, 15, 11–28. [Google Scholar] [CrossRef]
Trianni, V.; Nolfi, S.; Dorigo, M. Evolution, Self-Organization and Swarm Robotics; Springer: Berlin/Heidelberg, Germangy, 2008; pp. 163–191. [Google Scholar]
Wilson, A.; Schultz, J.; Murphey, T. Trajectory synthesis for Fisher information maximization. IEEE Trans. Robot. 2014, 30, 1358–1370. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Correa, J. Metrics of emergence, self-organization, and complexity for EWOM research. Front. Phys. 2020, 8, 5232–5241. [Google Scholar] [CrossRef]
Gros, C. Generating functionals for guided self-organization. In Guided Self-Organization: Inception; Springer: Berlin/Heidelberg, Germangy, 2014; pp. 53–66. [Google Scholar]
Kim, E.; Lee, U.; Heseltine, J.; Hollerbach, R. Geometric structure and geodesic in a solvable model of nonequilibrium process. Phys. Rev. E 2016, 93, 062127. [Google Scholar] [CrossRef] [Green Version]
Nicholson, S.; Kim, E. Investigation of the statistical distance to reach stationary distributions. Phys. Lett. A 2015, 379, 83–88. [Google Scholar] [CrossRef]
Li, W. Transport information geometry I: Riemannian calculus on probability simplex. arXiv 2020, arXiv:math.DG/1803.06360. [Google Scholar]
Kowalski, A.; Martín, M.; Plastino, A.; Rosso, O.; Casas, M. Distances in Probability Space and the Statistical Complexity Setup. Entropy 2011, 13, 1055–1075. [Google Scholar] [CrossRef] [Green Version]
Prokopenko, M.; Gershenson, C. Entropy Methods in Guided Self-Organisation. Entropy 2014, 16, 5232–5241. [Google Scholar] [CrossRef] [Green Version]
Parr, T.; Da-Costa, L.; Friston, K. Markov blankets, information geometry and stochastic thermodynamics. Philos. Trans. R. Soc. A 2020, 378, 20190159. [Google Scholar] [CrossRef] [Green Version]
Erven, T.; Harremoës, P. Rényi divergence and Kullback-Leibler divergence. IEEE Trans. Inf. Theory 2014, 60, 3797–3820. [Google Scholar] [CrossRef] [Green Version]
Takatsu, A. Wasserstein geometry of Gaussian measures. Osaka J. Math. 2011, 48, 1005–1026. [Google Scholar]
Li, W.; Zhao, J. Wasserstein information matrix. arXiv 2020, arXiv:math.ST/1910.11248. [Google Scholar]
Lott, J. Some Geometric Calculations on Wasserstein Space. Commun. Math. Phys. 2008, 277, 423–437. [Google Scholar] [CrossRef] [Green Version]
Risken, H. The Fokker-Planck Equation: Methods of Solution and Applications; Springer: Berlin, Germany, 1996. [Google Scholar]
Gangbo, W.; McCann, R. The geometry of optimal transportation. Acta Math 1996, 177, 113. [Google Scholar] [CrossRef]
Frieden, B. Science from Fisher Information: A Unification; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
Costa, S.; Santos, S.; Strapasson, J. Fisher information distance: A geometrical reading. Discret. Math. 2015, 197, 59–69. [Google Scholar] [CrossRef]
Martin, M.; Plastron, A.; Ross, O. Statistical complexity and disequilibrium. Phys. Lett. A 2003, 311, 126. [Google Scholar] [CrossRef]
Otto, F.; Villani, C. Generalization of an Inequality by Talagrand and Links with the Logarithmic Sobolev Inequality. J. Funct. Anal. 2000, 173, 361–400. [Google Scholar] [CrossRef] [Green Version]
Zamir, R. A proof of the Fisher information inequality via a data processing argument. IEEE Trans. Inf. Theory 1998, 44, 1246–1250. [Google Scholar] [CrossRef]
Heseltine, J.; Kim, E. Comparing information metrics for a coupled Ornstein-Uhlenbeck process. Entropy 2019, 21, 775. [Google Scholar] [CrossRef] [Green Version]
Kim, E. Investigating Information Geometry in Classical and Quantum Systems through Information Length. Entropy 2018, 20, 574–585. [Google Scholar] [CrossRef] [Green Version]
Kim, E.; Hollerbach, R. Signature of nonlinear damping in geometric structure of a nonequilibrium process. Phys. Rev. E 2017, 95, 022137. [Google Scholar] [CrossRef] [Green Version]
Hollerbach, R.; Dimanche, D.; Kim, E. Information geometry of nonlinear stochastic systems. Entropy 2018, 20, 550. [Google Scholar] [CrossRef] [Green Version]
Nicholson, S.; Kim, E. Geometric method for forming periodic orbits in the Lorenz system. Phys. Scr. 2016, 91, 044006. [Google Scholar] [CrossRef]
Kim, E.; Lewis, P. Information length in quantum systems. J. Stat. Mech. Theory Exp. 2018, 2018, 043106. [Google Scholar] [CrossRef] [Green Version]
Zee, A. Quantum Field Theory in A Nutshell; Princeton University Press: Princeton, NJ, USA, 2010; Volume 7. [Google Scholar]
Tracy, D.; Jinadasa, K. Expectations of products of random quadratic forms. Stoch. Anal. Appl. 1986, 4, 111–116. [Google Scholar] [CrossRef]
Magnus, J.R.; Neudecker, H. Matrix Differential Calculus with Applications in Statistics and Econometrics; John Wiley & Sons: Chichester, UK, 2019. [Google Scholar]
Jolliffe, I.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2016, 374, 20150202. [Google Scholar] [CrossRef]
Heseltine, J.; Kim, E. Novel mapping in non-equilibrium stochastic processes. J. Phys. A 2016, 49, 175002. [Google Scholar] [CrossRef]
Velasco, S. On the Brownian motion of a harmonically bound particle and the theory of a Wiener process. Eur. J. Phys. 1985, 6, 259–265. [Google Scholar] [CrossRef]
Chen, C.T. Linear System Theory and Design; Oxford University Press: New York, NY, USA, 2013; Volume 7. [Google Scholar]
Ito, S.; Dechant, A. Stochastic Time Evolution, Information Geometry, and the Cramér-Rao Bound. Phys. Rev. X 2020, 10, 021056. [Google Scholar] [CrossRef]

Figure 1. Results of Equations (23) and (24) for

〈 x (0) 〉 = - 0.5, 〈 v (0) 〉 = 0.7, γ = 2, ω = 1, D \in (0.0005, 0.04)

and the initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 1. Results of Equations (23) and (24) for

〈 x (0) 〉 = - 0.5, 〈 v (0) 〉 = 0.7, γ = 2, ω = 1, D \in (0.0005, 0.04)

and the initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 2. Results of Equations (23) and (24) for

〈 x (0) 〉 = - 0.5, 〈 v (0) 〉 = 0.7, γ = 2, ω \in (0, 2), D \in 0.0005,

and the initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 2. Results of Equations (23) and (24) for

〈 x (0) 〉 = - 0.5, 〈 v (0) 〉 = 0.7, γ = 2, ω \in (0, 2), D \in 0.0005,

and the initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 3. Results of Equations (23) and (24) for

〈 x (0) 〉 = - 0.5, 〈 v (0) 〉 = 0.7, γ \in (0, 6), ω = 1, D = 0.0005,

and the initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 3. Results of Equations (23) and (24) for

〈 x (0) 〉 = - 0.5, 〈 v (0) 〉 = 0.7, γ \in (0, 6), ω = 1, D = 0.0005,

and the initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 4. Results of Equations (23) and (24) scanned over

〈 x (0) 〉 \in (- 5, 5)

for

〈 v (0) 〉 = 0

[Figure 4a–e] and

〈 v (0) 〉 \in (- 5, 5)

for

〈 x (0) 〉 \in = 0

[Figure 4f–j]. The parameter values

ω = 1, D = 0.0005,

and

γ \in (0, 2.5)

while the initial covariance matrix

Σ (0)

has the elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 4. Results of Equations (23) and (24) scanned over

〈 x (0) 〉 \in (- 5, 5)

for

〈 v (0) 〉 = 0

[Figure 4a–e] and

〈 v (0) 〉 \in (- 5, 5)

for

〈 x (0) 〉 \in = 0

[Figure 4f–j]. The parameter values

ω = 1, D = 0.0005,

and

γ \in (0, 2.5)

while the initial covariance matrix

Σ (0)

has the elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 5. Results of Equations (23) and (24) for

〈 x (0) 〉 = - 0.5, 〈 v (0) 〉 = 0.7, γ = 2, ω = 0, D \in (0.0005, 0.04)

and the initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Figure 5. Results of Equations (23) and (24) for

〈 x (0) 〉 = - 0.5, 〈 v (0) 〉 = 0.7, γ = 2, ω = 0, D \in (0.0005, 0.04)

and the initial covariance matrix

Σ (0)

with elements

Σ_{11} (0) = Σ_{22} (0) = 0.01, Σ_{12} (0) = Σ_{21} (0) = 0

.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Guel-Cortez, A.-J.; Kim, E.-j. Information Length Analysis of Linear Autonomous Stochastic Processes. Entropy 2020, 22, 1265. https://0-doi-org.brum.beds.ac.uk/10.3390/e22111265

AMA Style

Guel-Cortez A-J, Kim E-j. Information Length Analysis of Linear Autonomous Stochastic Processes. Entropy. 2020; 22(11):1265. https://0-doi-org.brum.beds.ac.uk/10.3390/e22111265

Chicago/Turabian Style

Guel-Cortez, Adrian-Josue, and Eun-jin Kim. 2020. "Information Length Analysis of Linear Autonomous Stochastic Processes" Entropy 22, no. 11: 1265. https://0-doi-org.brum.beds.ac.uk/10.3390/e22111265

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Information Length Analysis of Linear Autonomous Stochastic Processes

Abstract

1. Introduction

2. Preliminaries

2.1. Information Length

2.2. Problem Formulation

3. General Analytical Results

4. Kramers Equation

4.1. Varying D

4.2. Varying $ω$ or $γ$

4.3. Varying $〈 x (0) 〉$ or $〈 v (0) 〉$

4.4. The Limit Where $ω \to 0$

5. Concluding Remarks

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

Appendix A. Analysis for Non-Zero Fixed Initial Conditions

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

Information Length Analysis of Linear Autonomous Stochastic Processes

Abstract

1. Introduction

2. Preliminaries

2.1. Information Length

2.2. Problem Formulation

3. General Analytical Results

4. Kramers Equation

4.1. Varying D

4.2. Varying ω or γ

4.3. Varying 〈 x ( 0 ) 〉 or 〈 v ( 0 ) 〉

4.4. The Limit Where ω → 0

5. Concluding Remarks

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

Appendix A. Analysis for Non-Zero Fixed Initial Conditions

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

4.2. Varying $ω$ or $γ$

4.3. Varying $〈 x (0) 〉$ or $〈 v (0) 〉$

4.4. The Limit Where $ω \to 0$