Article

Lagrangian submanifolds generated by the Maximum Entropy principle

Dipartimento di Matematica Pura ed Applicata, Università di Padova, via Belzoni, 7 -- 35131 Padova, Italy
Submission received: 25 October 2004 / Accepted: 12 January 2005 / Published: 12 January 2005

Abstract

We show that the Maximum Entropy principle (E.T. Jaynes, [8]) has a natural description in terms of Morse Families of a Lagrangian submanifold. This geometric approach becomes useful when dealing with the M.E.P. with nonlinear constraints. Examples are presented using the Ising and Potts models of a ferromagnetic material.
MSC 2000 codes:
53D12; 62B10; 82B20
PACS codes:
05.20.Gg; 05.50.+q; 05.70.Fh

1 A synopsis of Morse families

Let M be a smooth even–dimensional manifold and ω a non–degenerate closed two–form. The pair (M, ω) is called a symplectic manifold. A submanifold S ⊂ M such that 2 dim S = dim M and ω|S = 0 is a Lagrangian submanifold of M.
As an example, given a manifold Q, its cotangent bundle T*Q equipped with the two–form ω = dp ∧ dq, where (q, p) is a local chart of T*Q, is a symplectic manifold. A function G : Q → ℝ defines a Lagrangian submanifold of T*Q
$$\Lambda_G = \left\{ (q, p) \in T^*Q \;:\; p = \frac{\partial G}{\partial q}(q) \right\}$$
(dim ΛG = dim Q and d((∂G/∂q) dq) = d(dG) = 0) that is transversal to the fibers of π : T*Q → Q, since the restriction of π to ΛG is a diffeomorphism.
Let us consider a family of functions over Q depending on k parameters
Entropy 07 00001 i002
and let us denote with
Entropy 07 00001 i003
the critical set (the set of critical points over the fibers of Q × U → Q).
The family introduced above is a Morse family (see e.g. Weinstein, [12]) of functions if ε is a regular set in Q × U, i.e. iff the following local condition holds on ε
Entropy 07 00001 i004
where Gu denotes the partial derivative of G with respect to u.
Proposition 1.1
(Weinstein, [12] Lect. 6) Let G be a Morse family. Then
Entropy 07 00001 i005
is a Lagrangian submanifold¹ of T*Q, generated by G.
If the above rank condition defining a Morse family is satisfied by the square matrix Guu, i.e.
Entropy 07 00001 i006
then, by the implicit function theorem, locally at q, ε is the graph of a function gW : W ⊂ Q → U, u = gW(q), and setting Entropy 07 00001 i101 one has that
Entropy 07 00001 i007
Proposition 1.2
If the above condition (2) holds we say that all the parameters u can be eliminated. Hence, on W, Λ ∩ T*W coincides with graph Entropy 07 00001 i102; therefore Λ is a Lagrangian submanifold locally transversal to the fibers of π.
If G is a Morse family, then the set Γ ⊂ Q of the points q ∈ Q such that at (q, p) ∈ Λ the Lagrangian submanifold Λ is not transversal to the fibers of π is called the caustic of Λ and it is given by
Entropy 07 00001 i008
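The notions of critical set, generated Lagrangian submanifold and caustic can be illustrated on a one-parameter toy family. The family G(q; u) = u³/3 + q u used below is a hypothetical example chosen for this sketch and does not come from the paper. Its Morse condition holds everywhere, since the row (G_uq, G_uu) = (1, 2u) always has rank one, while the parameter u cannot be eliminated where G_uu = 2u vanishes.

```python
import math

def G_u(q, u):
    """Fiber derivative of the hypothetical family G(q; u) = u**3 / 3 + q * u."""
    return u**2 + q

def G_uu(q, u):
    """Second fiber derivative; it vanishes where the parameter u cannot be eliminated."""
    return 2.0 * u

def critical_parameters(q):
    """Solve G_u(q, u) = 0 for u; the set merges the double root at q = 0."""
    if q > 0:
        return []
    r = math.sqrt(-q)
    return sorted({-r, r})

def lagrangian_points(q):
    """Points (q, p) of Lambda above q, with p = dG/dq = u on the critical set."""
    return [(q, u) for u in critical_parameters(q)]

def on_caustic(q, tol=1e-12):
    """True if some critical point above q has G_uu = 0."""
    return any(abs(G_uu(q, u)) < tol for u in critical_parameters(q))

for q in (-4.0, 0.0, 1.0):
    print(q, lagrangian_points(q), on_caustic(q))
```

In this toy case Λ is the parabola q = −p², which is a smooth curve in the (q, p) plane but is two-valued over q < 0 and tangent to the fiber of π at q = 0, the only caustic point.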

1.1 An example: Lagrange multipliers method

We restate the classical Lagrange multipliers method for the constrained extremization of a differentiable function in terms of Morse families.
Let M be an n–dimensional smooth manifold and f be a smooth real function defined on it. Let Entropy 07 00001 i103, k ≤ n, be a smooth submersion, i.e. satisfying
Entropy 07 00001 i009
As is well known, m* ∈ M is an extremal point of f restricted to the fiber ζ⁻¹(c), c Entropy 07 00001 i104, if and only if, upon introducing the smooth map
Entropy 07 00001 i010
the pair (m*, λ) satisfies the system of equations (gradient system)
Entropy 07 00001 i011
Note that, if the hypothesis rk dζ = max holds, by (5)₂ we can associate to the constrained extremum m* a uniquely defined Lagrange multiplier λ*.
It is straightforward to realize that the set of solutions of the above system of equations (5) when c varies in Range(ζ) has the structure of a critical set with respect to the projection (c, λ, m) ↦ c, i.e.
Entropy 07 00001 i012
Moreover, it is easy to see that the condition that G be a Morse family reads, with u = (λ, m),
Entropy 07 00001 i013
where Ik is the identity matrix in Entropy 07 00001 i105 and Gmm ∈ Sym(n) is the Hessian matrix of G with respect to the m variables (summation over repeated indices is understood)
Entropy 07 00001 i014
By Proposition 1.1, if G is a Morse family, the set
Entropy 07 00001 i015
is a Lagrangian submanifold of Entropy 07 00001 i106.
The local condition (2) for the elimination of all the parameters is, in this setting,
Entropy 07 00001 i016
A sufficient condition (not necessary in the general case) for the elimination of all the parameters is given by the following Proposition (see Bertsekas, [3], p. 69)
Proposition 1.3
If the symmetric matrix Gmm(λ, m) ∈ Sym(n) in (7) is (positive or negative) definite on ker Entropy 07 00001 i107 for (c, λ, m) ∈ ε, then the square matrix Guu in (9) has maximal rank, and hence all the parameters u = (λ, m) can be eliminated and the Lagrangian submanifold in (8) is, locally at c, transversal to the fibers of π.
Proof.
We have to show that Guuz = 0 implies that z = 0, where Entropy 07 00001 i108. Now, setting Entropy 07 00001 i109, we have that
Entropy 07 00001 i017
Now, if v = 0, the equation Entropy 07 00001 i110 implies that w = 0, since dζᵀ is injective (dζ is surjective by hypothesis), and we are done. Arguing by contradiction, suppose that v ≠ 0 and dζ v = 0, i.e. v ∈ ker dζ. The equation dζ v = 0 implies vᵀ dζᵀ = 0ᵀ; multiplying the second equation on the left by vᵀ ≠ 0 we get
Entropy 07 00001 i018
which is a contradiction, since Gmm is a (positive or negative) definite matrix on ker dζ(m). ☐
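Proposition 1.3 can be checked by hand on a small case with k = 1 constraint and n = 2 variables. The sketch below assumes that Guu has the bordered block form [[0, −dζ], [−dζᵀ, Gmm]]; the sign of the off-diagonal blocks is an assumption of this sketch (the original display is not recoverable here), and it does not affect the rank. The matrices are hypothetical test data.

```python
def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def bordered(dzeta, gmm):
    """Assemble the block matrix [[0, -dzeta], [-dzeta^T, gmm]] (assumed form of Guu)."""
    return [
        [0.0, -dzeta[0], -dzeta[1]],
        [-dzeta[0], gmm[0][0], gmm[0][1]],
        [-dzeta[1], gmm[1][0], gmm[1][1]],
    ]

dzeta = [1.0, 0.0]  # ker(dzeta) is spanned by (0, 1)

# Indefinite on R^2, yet negative definite on ker(dzeta): the hypothesis holds.
definite_on_kernel = [[1.0, 0.0], [0.0, -1.0]]
# Vanishes on ker(dzeta): the hypothesis fails.
degenerate_on_kernel = [[1.0, 0.0], [0.0, 0.0]]

det_ok = det3(bordered(dzeta, definite_on_kernel))
det_bad = det3(bordered(dzeta, degenerate_on_kernel))
print(det_ok, det_bad)
```

The first matrix shows that definiteness is only needed on the kernel of dζ, not on the whole space. In the second case the hypothesis fails and the bordered matrix happens to be singular, although the Proposition itself only gives a sufficient condition.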
If all the parameters can be eliminated, then, locally at c, the critical set ε is the graph of a function Entropy 07 00001 i111 and, since (5)₁ holds, locally at c, the Morse family Entropy 07 00001 i112 reduces to
Entropy 07 00001 i019
Remark.
The above sufficient condition involves the “second order” (Hessian) matrix Gmm in (7). In case ζ is a linear fibration, the above condition is satisfied if Hess(f) is a definite matrix on ker dζ(m).
Example.
If f is the potential energy of a mechanical system subject to conservative forces, M is the space of configurations of the system, and ζ: Entropy 07 00001 i113 is the fibration defining an ideal (workless) constraint, then the Lagrangian submanifold Λ introduced above is the set of pairs (c, −λ), respectively the value of the constraint and the reaction force of the constraint in the equilibrium configuration m. If for every given c there is a unique equilibrium configuration Entropy 07 00001 i114, then the value of the reaction force Entropy 07 00001 i115 is a function of c. This situation is an example of the case where all the parameters can be globally eliminated.
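The elimination of the parameters can be followed explicitly on a small quadratic problem. The sketch below uses a hypothetical f(m) = m₁² + 2m₂² with the linear constraint ζ(m) = m₁ + m₂ = c, and the convention ∇f = λ∇ζ at the constrained extremum. That convention is an assumption of this sketch, and it may differ in sign from the one used for the pairs (c, −λ) in the Example above. With it, the derivative of the reduced function c ↦ f(m(c)) equals the multiplier.

```python
def solve(c):
    """Constrained extremum of f(m) = m1**2 + 2*m2**2 on m1 + m2 = c.

    Stationarity (2*m1, 4*m2) = lam * (1, 1) gives m1 = lam/2, m2 = lam/4,
    and the constraint then fixes lam = 4*c/3.
    """
    lam = 4.0 * c / 3.0
    return (lam / 2.0, lam / 4.0), lam

def reduced(c):
    """Reduced generating function: f evaluated at the constrained extremum."""
    (m1, m2), _ = solve(c)
    return m1**2 + 2.0 * m2**2

def slope(c, h=1e-6):
    """Central finite difference of the reduced function."""
    return (reduced(c + h) - reduced(c - h)) / (2.0 * h)

c = 1.5
(m1, m2), lam = solve(c)
print(m1 + m2, lam, slope(c))
```

Here the map c ↦ (λ(c), m(c)) is defined for every c, so the parameters are eliminated globally and the generated submanifold is the graph of c ↦ 4c/3.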

1.2 Global transversality for Λ

The above condition (2) states that the set of critical points (m, λ) over the fiber ζ⁻¹(c) is made of isolated points. If the condition of Proposition 1.3 above holds, then all the critical points are local maxima (or minima) of f. We now look at the hypotheses that imply the existence of a unique maximum for f on ζ⁻¹(c), so that the Lagrangian submanifold Λ is globally transversal to the fibers of π.
A sufficient condition is the following (see [2], p.136):
Proposition 1.4
If the matrix Guu is positive definite on some bounded convex domain D, then the map Gu is globally univalent on D, hence the equation Gu = 0 has at most one solution in D.

2 The M.E.P. with linear constraints

The Maximum Entropy principle (M.E.P.) (Jaynes, [8]) is a general method of statistical inference that is able to single out a unique probability distribution over the space of possible states i = 1,…,n of a system (to be considered as outcomes of n independent, identically distributed variables) among those compatible with the results of measurements made on the system, usually in the form of average values of observables defined for the system.
Let χ = {1, 2,…, n} be the (finite) set of possible states (outcomes) of a system. As an example, one can think of i = (qi, pi), with χ a finite discretization into cells of the phase space of a Hamiltonian system.
Let Entropy 07 00001 i116 be the value assumed by the observable² ϕL : Entropy 07 00001 i117 on the state i ∈ χ, L = 1,…, k ≤ n.
The n × k matrix Entropy 07 00001 i1163 defines a linear fibration from Entropy 07 00001 i118 if the observables are independent, i.e.
Entropy 07 00001 i020
A statistical state over χ (an ensemble in the Statistical Mechanics language) is a probability vector p = (p1,…, pn) ∈ Entropy 07 00001 i119; the (information) entropy of p is given by the following function (entropy) defined on Entropy 07 00001 i120
$$S(p) = -\sum_{i=1}^{n} p_i \ln p_i \qquad (12)$$
Note that S is a concave function and that the Hessian of S is
$$H_S(p) = \operatorname{diag}\left(-\frac{1}{p_1}, \dots, -\frac{1}{p_n}\right)$$
We will call Entropy 07 00001 i121 the space of microscopic states of the system and Entropy 07 00001 i122 the space of macroscopic states.
Remark.
 In the sequel, the fact that the linear operator ϕ will be restricted to the manifold with boundary Entropy 07 00001 i123 will play no role since all the critical points of S will be in the interior of Entropy 07 00001 i124, where Entropy 07 00001 i125.
(M.E.P.) The selected probability distribution³ p is the (unique) one maximizing the information entropy (uncertainty)
Entropy 07 00001 i023
among those verifying
Entropy 07 00001 i024
Now we show that the M.E.P. procedure can be recast in the scheme developed in the previous Section. (See also [9] for a different approach to the M.E.P. in the framework of Symplectic Geometry.)
The above constrained maximization problem (M.E.P.) can be dealt with by the Lagrange Multipliers Method by introducing the function
Entropy 07 00001 i025
Note that the apparatus of the Lagrange multipliers method supplemented by second order sufficient conditions of Proposition 1.3 and Proposition 1.4 determines the constrained local maximizers of the entropy function. The main feature of the linear constraint case is that there is a unique maximizer.
As a straightforward consequence of Proposition 1.3 (with m = p and Gmm = −HS) we have the following result:
Proposition 2.1
Since ϕ defines a (linear) fibration and −HS(p) ∈ Sym+(n) is a positive definite matrix, then
i) condition (1) holds, hence S(c; λ, p) is a Morse family, where
ii) all the k + n parameters (λ, p) can be eliminated; hence
iii) Λ in (8) is a Lagrangian submanifold locally transversal to the fibers of π.
As we have seen in Proposition 1.2 the critical set ε
Entropy 07 00001 i026
is, locally at c, the graph of a map
Entropy 07 00001 i027
such that
Entropy 07 00001 i028
Now, setting Entropy 07 00001 i126 – see (10)– we have that
Entropy 07 00001 i029
and, by (3) and Proposition 1.2, that
Entropy 07 00001 i030
since (c, g(c)) ∈ ε. Hence the projected submanifold Λ is, locally on W, the graph of Entropy 07 00001 i127
Entropy 07 00001 i031
It is instructive to look at the explicit form of the local map g that allows for the elimination of the extra parameters u = (λ, p). The map g has to be determined by the equations Entropy 07 00001 i128 in (14). More in detail (here summation over repeated indices is understood):
Entropy 07 00001 i032
Note that we do not take into account explicitly the normalization constraint. From the second equation we get
Entropy 07 00001 i033
which is injective since ϕT is injective by hypothesis (11); hence we have succeeded in expressing the parameters p as a function of λ. Moreover, by substituting (18) in the first line of (17) we get the following system of k equations in the k unknowns λL:
Entropy 07 00001 i034
where Entropy 07 00001 i129 is defined as follows:
Entropy 07 00001 i035
If the gradient map in (19) has an (at least local) inverse Entropy 07 00001 i130, then by substitution in (18) we get Entropy 07 00001 i131 and we are done. The existence of a local inverse is guaranteed, through the implicit function theorem, by Proposition 2.1, which is the reformulation of Proposition 1.3 in the M.E.P. setting. However, the existence can be checked directly given the explicit form of g as follows: a sufficient condition for local invertibility of (19) is that the matrix
Entropy 07 00001 i036
be non-singular.
The global invertibility can be investigated using Proposition 1.4: a sufficient condition for global invertibility of the map (19) on a bounded convex domain is that the matrix Entropy 07 00001 i132 be (positive or negative) definite. But, for Entropy 07 00001 i133,
Entropy 07 00001 i037
from which we get
Entropy 07 00001 i038
hence Entropy 07 00001 i134 and we have proved the global invertibility; therefore Λ is globally transversal to the fibers of π.
Of course, an explicit formula for the inverse of the gradient map (19) does not exist in general.
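Even without a closed formula, the inverse can be approximated numerically. The following is a minimal sketch for a single observable (k = 1), assuming the exponential form pᵢ = exp(−λϕᵢ)/Z for the maximizer. The sign convention for λ, the function names and the sample values of ϕ are assumptions made for this illustration and are not taken from the paper. Since the derivative of the mean with respect to λ is minus the variance of ϕ, the mean is strictly decreasing in λ and bisection finds the unique multiplier.

```python
import math

def gibbs(lam, phi):
    """Probability vector p_i = exp(-lam * phi_i) / Z (assumed sign convention)."""
    w = [math.exp(-lam * x) for x in phi]
    z = sum(w)
    return [wi / z for wi in w]

def mean(lam, phi):
    """Average value of the observable phi in the state gibbs(lam, phi)."""
    return sum(p * x for p, x in zip(gibbs(lam, phi), phi))

def solve_multiplier(c, phi, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for mean(lam) = c; the mean is strictly decreasing in lam."""
    if not min(phi) < c < max(phi):
        raise ValueError("c must lie strictly between min(phi) and max(phi)")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(mid, phi) > c:
            lo = mid   # mean too large: a larger multiplier is needed
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

phi = [-1.0, 0.5, 2.0]   # hypothetical values of the observable
c = 0.3                  # prescribed average value
lam = solve_multiplier(c, phi)
p = gibbs(lam, phi)
print(lam, p, mean(lam, phi))
```

Bisection is used here instead of Newton's method only because it needs no derivative and cannot overshoot; the monotonicity that makes it work is the one-dimensional counterpart of the definiteness argument above.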
Let us denote by Entropy 07 00001 i135 the inverse of (19). Then the generating function in (15) of the projected Lagrangian submanifold (16) has the form
Entropy 07 00001 i039
Remark 1.
The above relation between Entropy 07 00001 i136 is the same as the one between the Lagrangian Entropy 07 00001 i137 and the Hamiltonian H(p).
Remark 2.
Legendre transform. It is now easy to show that the inverse of a gradient map, if it exists, is a gradient map too. In fact we have
Entropy 07 00001 i040
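For a two-state system the Legendre structure of Remark 2 can be verified in closed form. The sketch below takes ϕ = (+1, −1), writes the maximizer as p± = exp(∓λ)/Z with Z = 2 cosh λ, and checks numerically that λ is the derivative of the maximal entropy with respect to c, while c is the derivative of −ln Z with respect to λ. The sign convention and all function names are assumptions of this sketch.

```python
import math

def entropy(c):
    """Maximal entropy at mean c: the constraints fix p_up, p_dn uniquely."""
    p_up, p_dn = 0.5 * (1.0 + c), 0.5 * (1.0 - c)
    return -(p_up * math.log(p_up) + p_dn * math.log(p_dn))

def multiplier(c):
    """Inverse of c = -tanh(lam), which follows from p_up/p_dn = exp(-2*lam)."""
    return -math.atanh(c)

def neg_log_z(lam):
    """Candidate potential for the inverse gradient map: -ln(2*cosh(lam))."""
    return -math.log(2.0 * math.cosh(lam))

def derivative(f, x, h=1e-6):
    """Central finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

c = 0.4
lam = multiplier(c)
print(derivative(entropy, c), lam)        # gradient map: c -> lam
print(derivative(neg_log_z, lam), c)      # inverse map: lam -> c, again a gradient
print(entropy(c), lam * c - neg_log_z(lam))  # Legendre relation between the two potentials
```

Both maps are gradients, and the two potentials are related by S(c) = λc − (−ln Z(λ)), which is the Legendre duality described in the Remark for this special case.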
Remark 3.
It is interesting to consider the case that the matrix ϕ depends smoothly on α real parameters (a1,…, aα) ∈ A ⊂ ℝα, that is
Entropy 07 00001 i041
As a consequence, the above Legendre transform formula inherits a dependence on the parameters a
Entropy 07 00001 i042

3 Examples of M.E.P. with nonlinear constraints

In the case of M.E.P. with linear constraints we have seen that the Lagrangian submanifold associated to the constrained maximization problem is always (globally) transversal to the fibers of π, hence the M.E.P. singles out a unique statistical state. We will see that this feature is no longer conserved in case of M.E.P. with nonlinear constraints.
We point out some of the relevant features of the nonlinear case through some examples and remarks.
Example 1.
Let χ = {1,…, n} be the state space of a system and suppose that the observable ϕ satisfies the hypothesis: there exist i, j ∈ χ such that ϕi > 0 and ϕj < 0. Then for every c > 0 there are solutions pi > 0 to the constraint equation
Entropy 07 00001 i043
or
Entropy 07 00001 i044
Now we suppose that the result of an experiment gives the value of 〈ϕ〉²; therefore we are led to find the maximum entropy probability distribution subject to the nonlinear constraint
Entropy 07 00001 i045
The solutions to this constrained maximization problem are the union of the solutions (each of which is unique) of the maximum entropy problems with linear constraints (+) and (−). The latter can be found as explained in Sect. 2 and are
Entropy 07 00001 i046
Hence, for every c > 0 there are exactly two maximum entropy distributions, and the critical set is
Entropy 07 00001 i047
The associated Lagrangian submanifold is given by equation
Entropy 07 00001 i048
It is easy to see that Λ has the structure of the graph of a two–valued function defined for c > 0. Note that for c = 0 the two constraints (+) and (−) coincide, therefore the above function is single–valued there. However, for c = 0 the constraint ζ(p) = 〈ϕ〉² fails to have maximal rank on the set of solutions of ζ(p) = c = 0. The following physical examples can easily be recast into this scheme.
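The two branches of Example 1 can be computed numerically for a concrete observable. The sketch below is a minimal illustration that assumes the exponential form pᵢ = exp(−λϕᵢ)/Z on each linear branch and uses hypothetical values of ϕ. It also requires that both ±√c lie strictly between the smallest and largest value of ϕ, a restriction added here so that both linear problems are feasible.

```python
import math

PHI = [-1.0, 0.5, 2.0]   # hypothetical observable taking values of both signs

def gibbs(lam):
    """Probability vector p_i = exp(-lam * phi_i) / Z (assumed sign convention)."""
    w = [math.exp(-lam * x) for x in PHI]
    z = sum(w)
    return [wi / z for wi in w]

def mean(lam):
    return sum(p * x for p, x in zip(gibbs(lam), PHI))

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p)

def solve_linear(target, lo=-50.0, hi=50.0):
    """Bisection for mean(lam) = target; the mean is strictly decreasing in lam."""
    if not min(PHI) < target < max(PHI):
        raise ValueError("target outside the open range of the observable")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mean(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def maxent_branches(c):
    """Solve the two linear problems <phi> = +sqrt(c) and <phi> = -sqrt(c)."""
    branches = []
    for target in (math.sqrt(c), -math.sqrt(c)):
        lam = solve_linear(target)
        p = gibbs(lam)
        branches.append((target, lam, p, entropy(p)))
    return branches

for target, lam, p, s in maxent_branches(0.25):
    print(target, lam, p, s)
```

For this asymmetric choice of ϕ the two branches carry different multipliers and different entropies, so the same value of c corresponds to two distinct points of Λ.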
Example 2.
A physical instance of the M.E.P. with nonlinear constraints is the Curie–Weiss approximation of the Ising model for ferromagnetism that we briefly describe below (see e.g. [4]). Let Entropy 07 00001 i139 be a lattice containing N sites, each occupied by an atom whose spin can assume the value s = +1 or s = −1. The set of possible spin configurations of the lattice is Entropy 07 00001 i140. The potential energy of the lattice (ferromagnet) in the configuration ω ∈ Ω and subject to an external magnetic field B is defined as
Entropy 07 00001 i049
where [i, j] is a pair of nearest–neighbor sites. We are therefore in the case of an observable (the energy) depending on a parameter (the field strength B). The Curie–Weiss approximation of this model is to neglect the correlations between neighboring spins and hence to assume that the values of the spin at the different sites i = 1,…,N are independent identically distributed random variables taking values in {−1, +1}; therefore
Entropy 07 00001 i050
The average value of the energy of the lattice is
Entropy 07 00001 i051
where
Entropy 07 00001 i052
Here p+ and p− are the probabilities of the spin up and spin down configurations, respectively, for the spin at an arbitrary site. Therefore the linear constraint on the lattice energy
Entropy 07 00001 i053
where Hmin ≤ E ≤ Hmax, is equivalent to the nonlinear constraint on 〈s〉 = p+ − p−
Entropy 07 00001 i054
for the average value of the spin on a single site.
From now on we consider the state space to be {1,−1} and we look for the probability distribution Entropy 07 00001 i141 maximizing the entropy (12) subject to the normalization and energy constraints. In order to apply the theory developed in the previous Sections we are led to introduce the normalization constraint in the general form Entropy 07 00001 i142; in this setting the average value of the spin is
Entropy 07 00001 i055
The constraint equations are therefore (see (22))
Entropy 07 00001 i056
Entropy 07 00001 i057
with
Entropy 07 00001 i058
In this rather special example (with k = n, the number of constraints equal to the dimension of the microscopic states manifold) the fiber Entropy 07 00001 i143, c = (α, γ), is made of isolated points; these are the intersections in the positive quarter plane Entropy 07 00001 i144 of the line Entropy 07 00001 i145 with the parabola Entropy 07 00001 i146. There can be none, one or two intersection points.
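The count of intersection points can be reproduced with a small sketch. It takes the normalization p+ + p− = 1 and assumes the common mean-field form −(J/2)〈s〉² − B〈s〉 = e for the energy per site, where J is an effective coupling. This specific form, the symbol J and the numerical values are assumptions made for the illustration, since the constraint equations of the paper are not recoverable here. Admissible solutions are the roots 〈s〉 in the open interval (−1, 1), which correspond to p+ > 0 and p− > 0.

```python
import math

def spin_averages(e, J=1.0, B=0.2):
    """Roots s in (-1, 1) of the assumed relation -(J/2)*s**2 - B*s = e."""
    disc = B * B - 2.0 * J * e
    if disc < 0:
        return []          # the line misses the parabola
    r = math.sqrt(disc)
    roots = sorted({(-B - r) / J, (-B + r) / J})
    return [s for s in roots if -1.0 < s < 1.0]

def probabilities(s):
    """Recover (p_up, p_dn) from s = p_up - p_dn and p_up + p_dn = 1."""
    return (0.5 * (1.0 + s), 0.5 * (1.0 - s))

for e in (0.1, -0.5, -0.1):
    sols = spin_averages(e)
    print(e, len(sols), [probabilities(s) for s in sols])
```

With these sample values the three energies give none, one and two admissible points respectively, matching the three cases listed above.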
Example 3.
Planar Potts model. A straightforward generalization of the Ising model is the planar Potts model. Here the spin vector can point in q ≥ 2 directions in the plane, each forming an angle Entropy 07 00001 i147 with a fixed reference axis in the plane. The set of possible spin configurations of the lattice is Entropy 07 00001 i148. The potential energy of the lattice (ferromagnet) in the configuration ω ∈ Ω and subject to an external magnetic field B is
Entropy 07 00001 i059
where [i, j] is a pair of nearest–neighbor sites, the dot denotes the scalar product and n is the unit vector of the fixed reference axis. In the Curie–Weiss approximation of statistical independence of the state si with respect to sj (i.e. there is no correlation between the spins at neighboring sites) the average value of the lattice energy is given by
Entropy 07 00001 i060
where the superscript T denotes transposition, A ∈ Sym(q) is the interaction symmetric matrix
Entropy 07 00001 i061
and Entropy 07 00001 i1164 has components
Entropy 07 00001 i1162
The spectrum of the matrix A has the following expression
Entropy 07 00001 i062
therefore equation (25) defines a hypercylinder in ℝq. The fiber ζ⁻¹(c), Entropy 07 00001 i1165, is the intersection, in the positive octant, of the normalization constraint hyperplane
Entropy 07 00001 i063
with the (convex) energy hypercylinder (25).

4 Phase transitions in the M.E.P. setting

We consider, as in the previous examples, a physical system in which, for simplicity's sake, the only observable of interest is the energy, and we suppose that the information we have on the system energy E places a (possibly nonlinear) constraint on the probability distribution
Entropy 07 00001 i064
Then the set of macroscopic states of the system, determined by the M.E.P., is the Lagrangian submanifold –see (8)–
Entropy 07 00001 i065
where β is the Lagrange multiplier associated to the energy constraint, which can be given in the linear case (see (20) and (15)) as
Entropy 07 00001 i066
In case Λ is transversal to the base manifold, then there is a unique β associated to E; if Λ is in general position w.r.t. the base manifold, then there can be multiple values β associated to the system energy E. This multi–valuedness is interpreted as a phase transition for the system. Note that, as developed in the previous Sections, phase transitions are possible within this scheme only if nonlinear constraints are in force.
Critical values of E, where transitions can occur, can be (in principle) determined through formula (4). However, (4) requires the computation of the solution of a nonlinear system of (n + k) ≈ n equations and of the determinant of a matrix of order ≈ n. Therefore (4) both gives an experimentally testable quantity (the critical values of E or the related temperature β) and sets the limits of applicability of this formalism in terms of system complexity.
Finally, let us mention briefly some other approaches to the existence of phase transitions for discrete systems with finite volume (i.e. not in the thermodynamic limit).
For some Hamiltonian systems the entropy of the system having energy E can be, explicitly or at least numerically, computed through the Boltzmann formula
Entropy 07 00001 i067
where W (E) is the area in phase space of the E–level set of the Hamiltonian H(q, p); the system temperature is defined as (compare with the above formula (26))
Entropy 07 00001 i068
If S(E) is not a convex function of E on its domain, i.e.
Entropy 07 00001 i069
where Cv is the specific heat at constant volume, then there may be multiple values of the energy for a given value of the temperature, which is again interpreted as the presence of a phase transition for the system.
This approach (which dates back to Boltzmann) has been applied in the literature to self–gravitating systems ([10], [11]), to the Potts models ([7]), and to mechanical systems with non convex potential energy ([1], [5]). For systems with a Hamiltonian of the form kinetic plus potential energy, the phenomenon of negative specific heat stems from the fact that in some energy intervals, an increase of the total energy E determines an increase of the average potential energy of the system and a decrease of the average kinetic energy (i.e. the temperature of the system).

Acknowledgments

The author wishes to thank F. Cardin for useful discussions on the subject of this paper, P. Dai Pra for suggesting the Potts model example and C. Zanella for advice on the Proof of Proposition 1.3.

References

  1. Berdichevsky, V.L. Thermodynamics of chaos and order; Pitman Monographs and Surveys in Pure and Applied Mathematics 90; Addison Wesley Longman.
  2. Berger, M.; Berger, M. Perspectives in Nonlinearity. An Introduction to Nonlinear Analysis; W. A. Benjamin Inc.: New York, 1968.
  3. Bertsekas, D. P. Constrained Optimization and Lagrange Multiplier Methods; Athena Scientific: Belmont, Massachusetts, 1996.
  4. Brout, R. Phase Transitions. In Statistical Physics. Phase transitions and Superfluidity; vol. 1, pp. 5–103. Brandeis University Summer Institute in Theoretical Physics, Gordon and Breach Science Publishers, 1996.
  5. Cardin, F.; Favretti, M. On the Helmholtz–Boltzmann Thermodynamics of Mechanical Systems. Continuum Mech. Thermodyn. 2004, 16, 15–29.
  6. Favretti, M. Isotropic submanifold generated by the Maximum Entropy principle and Onsager reciprocity relations. Journal of Functional Analysis. to appear.
  7. Gross, D.H.E.; Ecker, A.; Zhang, X.Z. Microcanonical thermodynamics of first order phase transitions studied in the Potts model. Ann. Physik (8) 5, no. 5. 1996, 446–452. available online: http://xxx.lanl.gov/abs/cond-mat/9607150.
  8. Jaynes, E. T. Information theory and statistical mechanics. Phys. Rev. 1957, 106, 620–630.
  9. Omohundro, S. Geometric Perturbation Theory in Physics; World Scientific Publishing Co.: Singapore, 1986.
  10. Thirring, W. Systems with Negative Specific Heat. Z. Phys. 1970, 235, 339–352.
  11. Votyakov, E.V.; Hidmi, H.I.; De Martino, A.; Gross, D.H.E. Microcanonical mean–field thermodynamics of self–gravitating and rotating systems. 2002. available online: http://arXiv.org/abs/condmat/0202140.
  12. Weinstein, A. Lectures on symplectic manifolds. In C.B.M.S. Conferences Series in Mathematics, A.M.S. 1977; 29.
  • 1. An extension of Weinstein's result to the infinite dimensional Banach space setting is contained in [6].
  • 2. We tacitly assume that the normalization constraint Entropy 07 00001 i149 has been added as the observable Entropy 07 00001 i150.
  • 3. From now on we use the letter p to denote a probability distribution, while the letter p was used above to denote a covector at q.

Share and Cite

MDPI and ACS Style

Favretti, M. Lagrangian submanifolds generated by the Maximum Entropy principle. Entropy 2005, 7, 1-14. https://0-doi-org.brum.beds.ac.uk/10.3390/e7010001
