Entropy-Based Measure of Statistical Complexity of a Game Strategy

Falniowski, Fryderyk

doi:10.3390/e22040470

Fryderyk Falniowski

Department of Mathematics, Cracow University of Economics, Rakowicka 27, 31-510 Kraków, Poland

Entropy 2020, 22(4), 470; https://0-doi-org.brum.beds.ac.uk/10.3390/e22040470

Submission received: 11 March 2020 / Revised: 15 April 2020 / Accepted: 18 April 2020 / Published: 20 April 2020

Abstract

In this note, we introduce excess strategic entropy—an entropy-based measure of complexity of the strategy. It measures complexity and predictability of the (mixed) strategy of a player. We show and discuss properties of this measure and its possible applications.

Keywords:

entropy-based measure; complexity of a strategy; predictability of a strategy

1. Introduction

Many social (economic, political, etc.) interactions have been modeled as formal games. In particular, repeated games play the central role in models of the long-term competition in economic theory and modeling interactions which are repeated frequently [1,2]. In such games, a strategy is a set of history-contingent plans of action. Each plan of action (play) is an infinite sequence of pairs (for two-player games) or n-tuples (in the n-player game) of strategies in each stage game. In every stage game, each player uses a mixed strategy (in order to avoid possible confusion, we point out that, as is standard in the literature, we consider mixed strategies so long as their support lies in the set of feasible pure strategies; a possible interpretation of mixed strategies in games in general is that they are distributions of pure strategies in a population of potential players). Generally, in the literature, it is assumed that players can carry out any strategy in a specified strategy set, should they choose to play it. While this latter assumption may seem innocuous in a model where few strategies are available to each player, it may be criticized as being unrealistically rational in more complex models where a theoretical definition of strategy leads to a strategy set that contains a large number of choices, many of which are impractically complex. This is usually the case in repeated games—at every stage, players are playing a history (in-)dependent game and therefore the set of all strategies can be huge. Thus, it is reasonable to consider the complexity of the strategy (the idea that the assumption of fully, or unboundedly, rational players is unrealistic is not new; see, e.g., [3]). In addition, putting the complexity problem in the wider context, one of the main benchmark results in the theory of repeated games is the folk theorem. 
The folk theorem demonstrates that any individually rational and feasible payoff vector of a two-player normal form game is a perfect equilibrium outcome of the repeated game when the discount factor is sufficiently near one [4]. This multiplicity of equilibria has led some researchers, both theoretical and applied, to restrict attention to certain equilibria. Along with other restrictions, the restriction to certain simple strategies, such as the grim trigger strategy, is commonly used. Moreover, these types of equilibrium selection arguments are not restricted to repeated games. Such restrictions are the norm rather than the exception in dynamic models in economics. Most analyses of dynamic macro models, search models, and random matching models restrict attention to equilibria in which agents use history-independent strategies. This is also true for many models in dynamic games, including stochastic games and multi-person bargaining.

In general, restricting the set of feasible strategies having regard to complexity of the strategy seems to be particularly relevant. A popular approach to the problem of complexity of strategies is to assume bounded rationality of an agent. There have been many attempts to model feasible (implementable) sets of strategies that reflect some aspects of the bounded rationality of players. Finite automata, bounded recall, and Turing machines are a few of the approaches taken. These models are useful because they provide us with quantitative measures of complexity of strategies, e.g., number of states of automata and the length of recall. This approach, used e.g., in [5,6], describes the complexity of the strategy from the point of view of an individual who wants to implement the strategy, and is based on the idea similar to Kolmogorov–Chaitin complexity of a channel in the context of information theory. However, the complexity of the strategy comes not only from the fact of how complicated it is to implement. The complex, structured strategy is the strategy that is difficult to predict. The player may face the problem of choosing the best strategy against the strategy of the opponent, which he learns during the repeated game. The more structured the strategy is, the more difficult it is to predict what the opponent will do, and the more difficult it is to find the best response strategy. Therefore, there is a necessity to consider measures that would help us to detect more or less structured strategies.

In their seminal article, Neyman and Okada introduced entropy-based measures of uncertainty for mixed strategies of repeated games: strategic entropy and strategic entropy rate [7]. They studied repeated two-player zero-sum games in which a player, say Player 1, with a restricted set of strategies plays against an unrestricted player, Player 2. Concerning the values of such games, two questions arise: “What is the number of repetitions needed for Player 2 to take advantage of Player 1’s restriction?” and “How long can Player 1 protect himself against an unrestricted Player 2?” These questions can be posed in the language of asymptotic behavior of the value of the repeated game: “What is the relationship between the number of repetitions and the unpredictability bound so that, as they tend to infinity, either (a) Player 2 can hold Player 1’s payoff down to his one-shot game max-min value in pure actions or (b) Player 1 can still secure the value of the one-shot game?” They imposed a restriction directly on mixed strategies. To this end, they employed entropy as a measure of uncertainty of the mixed strategy relative to the other player’s strategy. The strategic entropy rate of a mixed strategy is the maximal entropy rate of the play with respect to the other player’s strategy. Thus, it is the maximal uncertainty of the play that the other player faces against the (mixed) strategy. Of course, their entropy concept is not a measure of strategic complexity in the way that the size of the automaton or the length of recall was intended to be, but it captures an abstract informational feature common in bounded recall restrictions, and thereby serves as a useful tool to analyze them.

To provide a proper and more precise insight into the structure and complexity of strategy, we propose taking the next step—looking at convergence rates of the entropy. It is commonly agreed that a slow entropy convergence rate is a sign of complex structure of a process [8]. Obviously, a more structured strategy is more difficult to predict. Therefore, strategies with a slow convergence rate are less predictable. Hence, restricting the set of feasible strategies to the set of those with bounded unpredictability can give more precise insight on the theory of long-term competition.

We propose to use the concept of excess entropy, which measures an entropy convergence rate. It can result in better understanding of strategies played—measuring not only randomness of the strategy but its structure, regularity, and predictability as well. To this end, we will construct a measure based on the quantity widely used in information theory and physics.

Structure and correlation are not completely independent of randomness. It is generally agreed that both maximally random and perfectly ordered systems possess no structure [9]. Nevertheless, at a given level of randomness away from these extremes, there can be an enormously wide range of differently structured processes. There are many ad hoc methods for detecting structure, but none are as widely applicable as entropy is for indicating randomness. The quantities that have been proposed as general structural measures are often referred to as complexity measures. To reduce confusion, it has become convenient to refer to them instead as statistical complexity measures. In doing so, they are immediately distinguished from deterministic complexities. Unlike deterministic complexities, statistical complexity measures discount for randomness, and so provide a measure of the regularities present in an object above and beyond pure randomness.

In 1983, Grassberger and Procaccia in their seminal paper [10] introduced a measure of complexity, the excess entropy (they called it the effective complexity measure), to study the complexity of chaotic signals. It made a huge impact, in particular on statistical mechanics and information theory (Google Scholar records over 1600 citations of this article), and in economics as well [11]. The most precise and consistent approach to this measure can be found in the article by Crutchfield and Feldman [8]. The main idea is to look carefully at the manner in which the block entropy $H (n)$, $n \in N$, converges to its asymptotic form. Let us consider infinite sequences defined over the alphabet $A$. We define the n-block entropy as

H (n) = - \sum_{s^{n} \in A^{n}} μ (s^{n}) log μ (s^{n}),

where $s^{n}$ are blocks of length n and $μ$ is a probability distribution. The average (per-symbol) uncertainty is given by

h = lim_{n \to \infty} \frac{H (n)}{n} = lim_{n \to \infty} (H (n + 1) - H (n)) .

After defining these quantities, and with the convention $H (0) = 0$, the excess entropy can be introduced as

E = \sum_{n = 0}^{\infty} (H (n + 1) - H (n) - h) .

The excess entropy has a number of different interpretations and goes also by a variety of different names (also stored information, effective measure complexity, Grassberger’s complexity, and predictive information). Among others, it measures the excess randomness of the system, and therefore tells us how much additional information must be gained about the sequences in order to reveal the actual per-symbol uncertainty h. Thus, it can be interpreted as a measure of statistical complexity.
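As a rough illustration of these definitions, the following Python sketch computes block entropies exactly for a stationary two-state Markov chain and sums the increments H(n + 1) - H(n) - h. The transition matrix, the function names, and the use of natural logarithms are assumptions made only for this example. The sketch takes H(0) = 0 and starts the sum at n = 0, which matches Definition 6 later in this note. For a first-order Markov chain the increments equal h from n = 1 on, so the sum should reduce to H(1) - h.

```python
import itertools
import math

# Hypothetical two-state Markov chain (not taken from the article).
P = [[0.9, 0.1],
     [0.5, 0.5]]
PI = [5 / 6, 1 / 6]  # stationary distribution of P


def block_entropy(n):
    """H(n): Shannon entropy (in nats) of blocks of length n, with H(0) = 0."""
    if n == 0:
        return 0.0
    total = 0.0
    for block in itertools.product(range(2), repeat=n):
        p = PI[block[0]]
        for a, b in zip(block, block[1:]):
            p *= P[a][b]
        if p > 0:
            total -= p * math.log(p)
    return total


def entropy_rate():
    """h for a stationary Markov chain: sum_i pi_i * H(P[i])."""
    return -sum(PI[i] * P[i][j] * math.log(P[i][j])
                for i in range(2) for j in range(2) if P[i][j] > 0)


def excess_entropy(n_max=12):
    """Partial sum of (H(n+1) - H(n) - h) for n = 0, ..., n_max - 1."""
    h = entropy_rate()
    H = [block_entropy(n) for n in range(n_max + 1)]
    return sum(H[n + 1] - H[n] - h for n in range(n_max))
```

Because the blocks are enumerated exhaustively, this approach is only practical for small n and small alphabets.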

The aim of this note is to incorporate the concept of excess entropy into the game theory setting. It can be used to answer the question of how fast the strategic entropy of the (first) n stage games of the repeated game converges to its limit, that is, to the (strategic) entropy rate of the strategy. To this end, we propose an analogue of excess entropy, which we will call excess strategic entropy (for short: excess s-entropy). We will use it to measure statistical complexity and unpredictability of strategies. Natural interpretations of this measure suggest its applicability, e.g., in the case of bounded recall and reputation models. The research undertaken and the results obtained fit into the growing field of applications of entropy-based concepts in economics, and game theory in particular (see [12,13,14,15,16,17] and references therein).

2. Results

2.1. Formal Description of the Game

This section is intended to make the reader familiar with basic concepts of game theory. For clarity of presentation, we consider two-player zero-sum games, where A and B denote the finite action sets of Player 1 and Player 2, respectively. A play of a repeated game is an infinite sequence $ω = {(ω_{k})}_{k = 1}^{\infty}$, where $ω_{k} = (a_{k}, b_{k}) \in A \times B$. We denote the set of all plays by $Ω$, i.e., $Ω = {(A \times B)}^{N}$.

Two plays $ω = {(ω_{k})}_{k = 1}^{\infty}$ and $ω^{'} = {(ω_{k}^{'})}_{k = 1}^{\infty}$ are said to be n-equivalent if $ω_{k} = ω_{k}^{'}$ for $k = 1, \dots, n$. The n-equivalence is clearly an equivalence relation on $Ω$. Denote by $Q_{n}$ the finite partition of $Ω$ into n-equivalence classes. Each n-equivalence class of plays is called an n-history, and it represents the information available to the players at the end of stage n.

We denote by $Σ_{n}$ the algebra on $Ω$ generated by $Q_{n}$. Set $Σ_{0} = \{\emptyset, Ω\}$. Clearly, $Σ_{n - 1} \subset Σ_{n}$ for every $n \in N$. The $σ$-algebra generated by $\bigcup_{n \geq 0} Σ_{n}$ is denoted by $Σ_{\infty}$.

$S_{n}$ and $T_{n}$ denote the sets of measurable mappings from $(Ω, Σ_{n - 1})$ to A and B, respectively. Each element of $S_{n}$ and $T_{n}$ represents a ‘strategy at stage n’. For each $s_{n} \in S_{n}$ (and similarly for $t_{n} \in T_{n}$), since $s_{n} (ω)$ depends only on the first $n - 1$ coordinates of $ω$, we sometimes write $s_{n} (ω_{1}, \dots, ω_{n - 1})$. A pure strategy of Player 1 (resp. 2) is a sequence $s = (s_{n})$ with $s_{n} \in S_{n}$ ($t = (t_{n})$ with $t_{n} \in T_{n}$, respectively). Thus, the sets of pure strategies of the two players are $S = \times_{n \geq 1} S_{n}$ and $T = \times_{n \geq 1} T_{n}$. We consider S and T to be endowed with the product topologies with the discrete topology on each factor.

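$\mathcal{S}$ and $\mathcal{T}$ denote the Borel $σ$-algebras of S and T, respectively. A mixed strategy of Player 1 (resp. 2) is then a probability on $(S, \mathcal{S})$ (resp. $(T, \mathcal{T})$). We write $Δ (S)$ and $Δ (T)$ for the sets of mixed strategies of Player 1 and Player 2.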

Every pair of pure strategies $(s, t)$ induces a play $ω = {(ω_{k})}_{k = 1}^{\infty} \in Ω$, where $ω_{k}$ is defined inductively by

ω_{k} = (a_{k}, b_{k}) = \begin{cases} (s_{1}, t_{1}) & \text{for } k = 1, \\ (s_{k} (ω_{1}, \dots, ω_{k - 1}), t_{k} (ω_{1}, \dots, ω_{k - 1})) & \text{for } k > 1 . \end{cases}
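The inductive construction of the play can be sketched in Python as follows. This is only an illustration under simplifying assumptions: a pure strategy is represented by a single function of the history so far (the stage k is implied by the history length), actions are encoded as integers, and the names `induced_play`, `copy_last`, and `alternate` are hypothetical.

```python
from typing import Callable, List, Tuple

Action = int
History = Tuple[Tuple[Action, Action], ...]
StageStrategy = Callable[[History], Action]


def induced_play(s: StageStrategy, t: StageStrategy, n: int) -> List[Tuple[Action, Action]]:
    """First n stages of the play induced by the pure strategies s and t.

    Each strategy maps the history (omega_1, ..., omega_{k-1}) to an action;
    at k = 1 the history is empty, so the first action is a constant.
    """
    history: List[Tuple[Action, Action]] = []
    for _ in range(n):
        h = tuple(history)
        history.append((s(h), t(h)))
    return history


# Hypothetical strategies: Player 1 copies Player 2's previous action
# (starting with 0); Player 2 alternates 1, 0, 1, 0, ...
def copy_last(h: History) -> Action:
    return h[-1][1] if h else 0


def alternate(h: History) -> Action:
    return (len(h) + 1) % 2


first_stages = induced_play(copy_last, alternate, 4)
```

Both players act on the same history within a stage, which mirrors the simultaneous-move structure of the stage game.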

Note that $s_{1}$ and $t_{1}$, being $Σ_{0}$-measurable, are constant everywhere. Accordingly, every pair $(σ, τ) \in Δ (S) \times Δ (T)$ induces a probability $P_{σ, τ}$ on $(Ω, Σ_{\infty})$. Equivalently, $(σ, τ)$ induces a sequence of random actions, or a random play $(X_{k})$, where $X_{k} = (a_{k}, b_{k})$ is an $(A \times B)$-valued random variable.

Remark 1.

To simplify discussion, one can identify A (and B) with a set of non-negative integers, $A = \{0, 1, \dots, card A - 1\}$ (and $B = \{0, 1, \dots, card B - 1\}$, respectively). Then the space Ω can be represented as the full shift $(A^{N}, τ)$, where $A^{N}$ is the set of all one-sided infinite sequences over the alphabet $A$, and $τ : A^{N} \to A^{N}$ is the shift transformation ${(τ x)}_{n} = x_{n + 1}$. If at every stage every action is accessible, then Ω is the set of all one-sided sequences on $A$. On the other hand, if only some of the actions are accessible for Player 1 or Player 2, then the space of possible histories is a subshift of Ω. Moreover, the partition imposed by n-equivalence classes is the partition into cylinders; see Section 2.2.2.

2.2. Basic Concepts

2.2.1. Information Theoretic Concepts

In this subsection, we recall well-known quantities and facts from information theory. Let $Γ$ be a finite set and X be a random variable which takes values in $Γ$ and whose distribution is $p \in Δ (Γ)$, i.e., $p (γ) = Prob (X = γ)$ for each $γ \in Γ$. To simplify the notation, we define a function $η : [0, \infty) \to R$ by

η (x) = \begin{cases} - x \log x & \text{for } x > 0, \\ 0 & \text{for } x = 0 . \end{cases}

Definition 1.

The entropy $H (X)$ of X, defined as the expected value of the information, is equal to

H (X) = E (\log \frac{1}{p (X)}) = \sum_{γ \in Γ} η (p (γ)) .

The notion of entropy can be naturally extended to an arbitrary finite dimensional vector of random variables or probability distributions.

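Definition 2.

The conditional entropy $H (X_{2} | X_{1})$ of $X_{2}$ given $X_{1}$ is defined as

H (X_{2} | X_{1}) = \sum_{γ \in Γ} p (γ) H (X_{2} | X_{1} = γ),

where p is the distribution of $X_{1}$ and $H (X_{2} | X_{1} = γ)$ is the entropy of the conditional distribution of $X_{2}$ given $X_{1} = γ$.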

Lemma 1 (see [18]).

$H (Y | X) = H (X, Y) - H (X)$ and, more generally,

H (X_{1}, \dots, X_{n}) = H (X_{1}) + \sum_{i = 2}^{n} H (X_{i} | X_{1}, \dots, X_{i - 1}) \leq \sum_{i = 1}^{n} H (X_{i}) .
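The identities of Lemma 1 can be checked numerically on a small example. The sketch below is illustrative only: the joint distribution `JOINT` is made up, natural logarithms are used, and the helper names are hypothetical. Conditional entropy is computed as the weighted average of the entropies of the conditional distributions and then compared with H(X, Y) - H(X).

```python
import math
from collections import defaultdict

# Hypothetical joint distribution of (X, Y) on {0, 1} x {0, 1}.
JOINT = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}


def eta(x):
    """eta(x) = -x log x for x > 0 and eta(0) = 0."""
    return -x * math.log(x) if x > 0 else 0.0


def entropy(dist):
    """Shannon entropy (nats) of a distribution given as {outcome: prob}."""
    return sum(eta(p) for p in dist.values())


def marginal(joint, index):
    """Marginal distribution of coordinate `index` of a joint distribution."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome[index]] += p
    return dict(out)


def conditional_entropy(joint):
    """H(Y | X) = sum_x p(x) * H(Y | X = x)."""
    px = marginal(joint, 0)
    total = 0.0
    for x, p_x in px.items():
        if p_x == 0:
            continue  # outcomes of probability zero contribute nothing
        cond = {y: p / p_x for (x2, y), p in joint.items() if x2 == x}
        total += p_x * entropy(cond)
    return total
```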

Definition 3.

Let $(X_{n})$ be a stochastic process, where each $X_{n}$ takes values in a finite set Γ. The entropy rate of the process $(X_{n})$ is defined as

h ((X_{n})) = \limsup_{k \to \infty} \frac{1}{k} H (X_{1}, \dots, X_{k}) .

(1)

2.2.2. Stationary Processes and Symbolic Dynamics

Stationary processes (and the stationary strategies they induce) will be an important tool in our considerations. In this section, we recall the concept of a stationary process and show the connection between stationarity of a process and the existence of a shift-invariant measure (an in-depth discussion of these connections can be found in [19]).

We say that a process $(X_{n})$ is stationary if the joint distributions do not depend on the choice of time origin, that is,

Prob (X_{i} = a_{i}, m \leq i \leq n) = Prob (X_{i + 1} = a_{i}, m \leq i \leq n)

(2)

for all $m, n \in N$ and $a = (a_{1}, a_{2}, \dots) \in A^{N}$. The statement that a process $(X_{n})$ on the finite alphabet $A$ is stationary simply translates into the statement that the Kolmogorov measure introduced on cylinder sets $[a_{m}^{n}] = \{x \in A^{N} : x_{i} = a_{i}, m \leq i \leq n\}$ by

μ ([a_{1}^{n}]) = μ (\{x : x_{i} = a_{i}, 1 \leq i \leq n\}) = Prob (X_{i} = a_{i}, 1 \leq i \leq n)

(3)

is invariant under the shift transformation $τ : A^{N} \to A^{N}$, that is, $μ (B) = μ (τ^{- 1} B)$ for any measurable set B.

In this note, we will focus our attention on stationary strategies.

Definition 4.

A strategy profile $(σ, t) \in Δ (S) \times T$ is called stationary (the strategy σ is stationary with respect to t) if the random play $(X_{n})$ induced by $(σ, t)$ is stationary.

For stationary processes, the upper limit in Formula (1) can be replaced by the limit. In the next section (Lemma 3), we will show a similar fact for stationary strategies.

2.3. Strategic Entropy Rate

2.3.1. Derivation of the Strategic Entropy Rate

We define the strategic entropy rate of the strategy $σ \in Δ (S)$, given a strategy $t \in T$, following the construction proposed in [7]. It is a measure of the average degree of uncertainty that Player 2 faces when playing the pure strategy t if Player 1 plays a mixed strategy $σ$.

For each $n \in N$, we define a function $H_{n} (\cdot, \cdot) : Δ (S) \times T \to R$ as follows. For a given $(σ, t) \in Δ (S) \times T$, let $(X_{k})$ be the random play induced by $(σ, t)$. Then, $H_{n} (σ, t)$ is defined as the entropy of this random play up to stage n, that is,

H_{n} (σ, t) = H (X_{1}, \dots, X_{n}) = \sum_{C \in Q_{n}} η (P_{σ, t} (C)) .

Recall that $Q_{n}$ is the partition of $Ω$ with respect to actions in the first n stages. Therefore, $H_{n} (σ, t)$ is the uncertainty about the play up to stage n that Player 2 faces when Player 1 uses $σ$ and Player 2 uses t. The dual interpretation is that it is the amount of information on the play of the game which Player 2 can obtain using t when Player 1 uses $σ$. Properties of $H_{n}$ are listed in the following lemma; its proof can be found in [7]:

Lemma 2.

1. $H_{n} (\cdot, \cdot)$ is continuous on $Δ (S) \times T$.
2. For each $t \in T$, $H_{n} (\cdot, t)$ is concave on $Δ (S)$ and constant on each equivalence class of $Δ (S)$.
3. For each $σ \in Δ (S)$, $H_{n} (σ, \cdot)$ is constant on each n-equivalence class of T.
4. $H_{n} (σ, t) \in [0, n \log card A]$ for every $(σ, t) \in Δ (S) \times T$.

Moreover, if $(X_{k})$ is the random play induced by $(σ, t)$, then

H_{n} (σ, t) = H (X_{1}) + \sum_{k = 2}^{n} H (X_{k} | X_{1}, \dots, X_{k - 1}) .
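To make $H_{n} (σ, t)$ concrete, here is a minimal Python sketch that computes it by summing η over the n-histories that occur with positive probability. It assumes, purely for illustration, that the mixed strategy is a finite mixture of pure strategies given as (probability, function) pairs, that pure strategies are functions of the history, and that natural logarithms are used; all names are hypothetical.

```python
import math
from collections import Counter


def play_prefix(s, t, n):
    """First n stages of the play induced by pure strategies s and t."""
    history = []
    for _ in range(n):
        h = tuple(history)
        history.append((s(h), t(h)))
    return tuple(history)


def H_n(mixture, t, n):
    """Entropy (nats) of the first n stages when Player 1 draws a pure
    strategy from `mixture` (a list of (probability, strategy) pairs) and
    Player 2 plays the pure strategy t."""
    cell_prob = Counter()
    for prob, s in mixture:
        # Pure strategies that induce the same n-history share one cell of Q_n.
        cell_prob[play_prefix(s, t, n)] += prob
    return -sum(p * math.log(p) for p in cell_prob.values() if p > 0)


# Hypothetical example: Player 1 commits at the start, with equal
# probability, to "always 0" or "always 1"; Player 2 always plays 0.
always0 = lambda h: 0
always1 = lambda h: 1
sigma = [(0.5, always0), (0.5, always1)]
```

In this example all uncertainty is resolved at the first stage, so the block entropy stays at log 2 for every n ≥ 1 and the entropy rate is zero.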

Now, we can define the strategic entropy rate.

Definition 5.

If $(X_{n})$ is the random play induced by $(σ, t) \in Δ (S) \times T$, then the entropy rate of the strategy σ with respect to the strategy t is

h (σ, t) = \limsup_{n \to \infty} \frac{1}{n} H_{n} (σ, t) = \limsup_{n \to \infty} \frac{1}{n} H (X_{1}, \dots, X_{n})

(4)

and the strategic entropy rate of the strategy σ is the supremum of $h (σ, t)$ over $t \in T$:

h (σ) = \sup_{t \in T} h (σ, t) .

(5)

In general, the limit of $H_{n} (σ, t) / n$ need not exist. However, in the next section, we will show that, for a large class of strategies, it does.

2.3.2. Limits for Stationary Strategies

We start this section with a discussion on the existence of the limit in Formula (4) for stationary strategies.

Lemma 3.

The sequences $h_{n} (σ, t) = \frac{1}{n} H_{n} (σ, t)$ and $g_{n} (σ, t) = H_{n + 1} (σ, t) - H_{n} (σ, t)$ are non-negative and

\limsup_{n \to \infty} g_{n} (σ, t) = \limsup_{n \to \infty} h_{n} (σ, t) = h (σ, t) .

If additionally σ is stationary, then they are decreasing and have a common limit.

Proof.

For any process, both sequences are non-negative, since

\begin{aligned} H_{n + 1} (σ, t) &= H (X_{1}, \dots, X_{n + 1}) = H (X_{1}, \dots, X_{n}) + H (X_{n + 1} | X_{1}, \dots, X_{n}) \\ &\geq H (X_{1}, \dots, X_{n}) = H_{n} (σ, t) . \end{aligned}

Moreover, because $h_{n} (σ, t)$ is equal to the Cesàro average of $g_{n} (σ, t)$, we obtain that

h (σ, t) = \limsup_{n \to \infty} g_{n} (σ, t) = \limsup_{n \to \infty} h_{n} (σ, t) .

(6)

From now on, we assume that $(X_{n})$ is stationary. Therefore, we can look at the Kolmogorov measure $μ$ (see (3)). To simplify notation, let $p_{ι} = μ (\{x : x_{j} = a_{i_{j}}, 1 \leq j \leq n\})$ for $ι = (i_{1}, \dots, i_{n}) \in A^{n}$. Then, from invariance of the measure $μ$ and strong subadditivity of Shannon entropy [20], we obtain

\begin{aligned} H_{n + 2} (σ, t) - H_{n + 1} (σ, t) &= \sum_{i = 1}^{card A} \sum_{ι \in A^{n}} \sum_{j = 1}^{card A} η (p_{i ι j}) - \sum_{ι \in A^{n}} \sum_{i = 1}^{card A} η (p_{i ι}) \\ &= \sum_{i = 1}^{card A} \sum_{ι \in A^{n}} \sum_{j = 1}^{card A} η (p_{i ι j}) - \sum_{ι \in A^{n}} \sum_{i = 1}^{card A} η (\sum_{j = 1}^{card A} p_{i ι j}) \\ &\leq \sum_{ι \in A^{n}} \sum_{j = 1}^{card A} η (\sum_{i = 1}^{card A} p_{i ι j}) - \sum_{ι \in A^{n}} η (\sum_{i = 1}^{card A} \sum_{j = 1}^{card A} p_{i ι j}) \\ &= \sum_{ι \in A^{n}} \sum_{j = 1}^{card A} η (p_{ι j}) - \sum_{ι \in A^{n}} η (p_{ι}) \\ &= H_{n + 1} (σ, t) - H_{n} (σ, t) . \end{aligned}

Therefore, the sequence $(g_{n} (σ, t))$ is non-negative and decreasing; thus, it has a limit. Since $h_{n} (σ, t)$ is the Cesàro average of the $g_{k} (σ, t)$, it is decreasing as well and converges to the same limit. □

Corollary 1.

If $(σ, t) \in Δ (S) \times T$ is stationary, then $h (σ, t) = \lim_{n \to \infty} \frac{1}{n} H_{n} (σ, t)$.

Naturally, we can ask whether the upper limits in Formula (6) can be replaced by limits. Unfortunately, in general, neither $h_{n} (σ, t)$ (see the discussion after Definition 5) nor $g_{n} (σ, t)$ (see Example 1) has to possess a limit.

Example 1.

Let $A = \{0, 1\}$ and let the strategy σ of Player 1 be the following:

at every even stage, Player 1 chooses 1;
at every stage $4 n + 1$, Player 1 chooses 0 and 1 with equal probability;
at every stage $4 n + 3$, Player 1 chooses 0 with probability $1 / 4$ and 1 with probability $3 / 4$.

Then,

g_{n} (σ, t) = H_{n + 1} (σ, t) - H_{n} (σ, t) = H (X_{n + 1} | X_{1}, \dots, X_{n}) .

For every $k \in N$, we have that

H (X_{k} | X_{1}, \dots, X_{k - 1}) = \begin{cases} 0, & \text{if } k \text{ is even}, \\ \log 2, & \text{if } k = 4 n + 1, \\ - \frac{1}{4} \log \frac{1}{4} - \frac{3}{4} \log \frac{3}{4}, & \text{if } k = 4 n + 3 . \end{cases}

Thus, $g_{n}$ does not have a limit.
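The oscillation in Example 1 can be reproduced with a few lines of Python. The sketch assumes that Player 1's choices at different stages are independent of the history and that t is pure, so that each conditional entropy reduces to the entropy of the mixed action used at that stage; natural logarithms and the function names are choices made for this illustration.

```python
import math


def binary_entropy(p):
    """Entropy (nats) of the two-point distribution (p, 1 - p)."""
    return sum(-x * math.log(x) for x in (p, 1 - p) if x > 0)


def prob_of_zero(k):
    """Probability that Player 1 plays 0 at stage k in Example 1."""
    if k % 2 == 0:
        return 0.0   # even stage: always 1
    if k % 4 == 1:
        return 0.5   # stage 4n + 1
    return 0.25      # stage 4n + 3


def g(n):
    """g_n = H(X_{n+1} | X_1, ..., X_n) under the independence assumption."""
    return binary_entropy(prob_of_zero(n + 1))


values = [g(n) for n in range(12)]
```

The list `values` cycles through three distinct numbers with period 4, so the sequence cannot converge.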

We can see that the assumption that the strategy is stationary gives a nice theory of the strategic entropy (and, as we will see in the next section, of excess s-entropy). One may ask whether the assumption of stationarity is too restrictive. Using only stationary strategies may indeed seem quite restrictive, but we know that, for many types of games, there exists an optimal stationary strategy [21,22]. Such a strategy can be used widely because of its simplicity and optimality.

2.4. Excess Strategic Entropy

2.4.1. Excess s-Entropy

Now, we are able to introduce the fundamental tool of this note.

Definition 6.

The excess strategic entropy of the strategy $σ \in Δ (S)$ with respect to the strategy $t \in T$ (for short, the excess s-entropy of $(σ, t)$) is defined as

E (σ, t) = \sum_{n = 0}^{\infty} (g_{n} (σ, t) - h (σ, t)),

where we set $H_{0} (σ, t) = 0$, so that $g_{0} (σ, t) = H_{1} (σ, t)$. We define the excess strategic entropy of the strategy $σ \in Δ (S)$ (excess s-entropy of σ) as

E (σ) = \sup_{t \in T} E (σ, t) .

Excess s-entropy is an analogue of the excess entropy concept discussed thoroughly, e.g., in [8]. It measures how structured and how complex the strategy is. From the point of view of Player 1, it can be interpreted as a measure of the complexity of his choices. On the other hand, from the point of view of Player 2, it can be understood as a measure of the predictability of Player 1, that is, how eager Player 1 is to change the (one-shot) strategy. In the following considerations, we show properties of this quantity and calculate the excess s-entropy of strategies in a few games.

Theorem 1.

For any $σ \in Δ (S)$ and $t \in T$, we have

E (σ, t) = \limsup_{n \to \infty} n \cdot (h_{n} (σ, t) - h (σ, t)) = \limsup_{n \to \infty} n \cdot (\frac{1}{n} H_{n} (σ, t) - h (σ, t)) .

Proof.

Let $M \in N$. Then,

\begin{aligned} \sum_{n = 0}^{M - 1} (g_{n} (σ, t) - h (σ, t)) &= H_{M} (σ, t) - H_{0} (σ, t) - M h (σ, t) \\ &= H_{M} (σ, t) - M h (σ, t) = M (\frac{H_{M} (σ, t)}{M} - h (σ, t)) . \end{aligned}

Taking the upper limit over M, we complete the proof. □

Directly from Theorem 1, we obtain the condition for finiteness of excess s-entropy.

Corollary 2.

If $h_{n} (σ, t) \sim \frac{1}{n}$, that is,

0 < \liminf_{n \to \infty} n h_{n} (σ, t) \leq \limsup_{n \to \infty} n h_{n} (σ, t) < \infty,

then $E (σ, t) \in (0, \infty)$.

Having two equivalent formulas for excess s-entropy, one may ask which of them is better to use. Both are useful. Because $h_{n}$ is a Cesàro average of $g_{n}$, it is easy to see that $g_{n}$ converges to the (upper) limit faster than $h_{n}$, so it is better to use it in numerical computations. On the other hand, calculating the upper limit of $(n (h_{n} - h))$ can be easier analytically.
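The telescoping identity used in the proof of Theorem 1 can be checked numerically. In the sketch below, the block entropies `H` are made-up numbers (chosen so that the increments decrease to h = 0.3), and the function name is hypothetical; the point is only that the partial sum of the series and the rescaled average coincide for every M.

```python
def excess_partial_sums(H, h):
    """Given H = [H_0, H_1, ..., H_M] with H_0 = 0 and a rate h, return the
    pair (sum_{n < M} (g_n - h), M * (H_M / M - h))."""
    M = len(H) - 1
    g = [H[n + 1] - H[n] for n in range(M)]
    series_form = sum(g_n - h for g_n in g)
    average_form = M * (H[M] / M - h)
    return series_form, average_form


# Hypothetical block entropies: H_n = 0.3 * n + 1 - 2 ** (-n), so h = 0.3
# and the excess tends to 1 as M grows.
H = [0.3 * n + 1 - 2.0 ** (-n) for n in range(41)]
series_form, average_form = excess_partial_sums(H, 0.3)
```

In practice the block entropies would be estimated from data or computed from a model, and h would itself have to be estimated, which is where the faster convergence of the increments matters.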

2.4.2. Examples

Now, we give examples of the excess s-entropy of a strategy for a few games.

Example 2 (Pure actions).

If σ is a pure strategy, then $H_{n} (σ, t) = 0$ for every n, and thus $E (σ) = 0$.

Example 3 (Independent actions).

Let $σ = (σ_{k})$ be a sequence of independent mixed actions, where $σ_{k} = α_{k} \in Δ (A)$. Then,

H_{n} (σ, t) = \sum_{k = 1}^{n} H (α_{k})

and

h (σ, t) = \limsup_{n \to \infty} \frac{1}{n} \sum_{k = 1}^{n} H (α_{k}) .

Moreover,

E (σ, t) = \limsup_{M \to \infty} M (\frac{1}{M} \sum_{k = 1}^{M} H (α_{k}) - \limsup_{n \to \infty} \frac{1}{n} \sum_{k = 1}^{n} H (α_{k})) .

In addition, if $α_{k} = α$ for every k, then

H_{n} (σ, t) = n H (α)

and

h (σ, t) = H (α) .

Thus,

E (σ, t) = 0 \text{ and } E (σ) = 0 .

From Examples 2 and 3, we can see that excess s-entropy does not respond to the unpredictability of a one-stage game. In both examples, a player does not play in a complex way, even when (as in Example 3) he plays in an unpredictable way by using a mixed strategy. Nevertheless, excess s-entropy detects unpredictability of the long-run play, that is, how predictable the change in a player’s mixed strategy (probability distribution) is.

Example 4 (Periodic strategy).

Suppose that σ is a periodic strategy of period p. Then, $h (σ, t) = 0$ and $E (σ, t) = \log p$ for any $t \in T$. Thus, $E (σ) = \log p$.
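One way to see the value log p numerically is sketched below. The sketch relies on an assumption that is not spelled out in the example: the periodic strategy is modeled as a fixed action pattern of minimal period p started at a uniformly random phase, which makes the induced play stationary. Player 2's actions are left out, since a pure strategy t adds no uncertainty here. Names and the pattern are hypothetical, and natural logarithms are used.

```python
import math
from collections import Counter


def periodic_block_entropy(pattern, n):
    """H_n (nats) for the sequence repeating `pattern`, started at a
    uniformly random phase (an assumption that makes the play stationary)."""
    p = len(pattern)
    blocks = Counter()
    for phase in range(p):
        block = tuple(pattern[(phase + i) % p] for i in range(n))
        blocks[block] += 1 / p
    return -sum(q * math.log(q) for q in blocks.values() if q > 0)


pattern = [0, 0, 1]   # hypothetical action pattern with minimal period 3
H = [periodic_block_entropy(pattern, n) for n in range(9)]
h = 0.0               # the block entropy is bounded, so the rate is zero
excess = H[8] - 8 * h
```

Once the observed block is long enough to identify the phase, the block entropy stops growing at log p, which is the excess s-entropy in this model.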

Example 5 (Infinite excess s-entropy).

Let $σ = (σ_{k})$ be such that $σ_{k} = (\frac{1}{2^{M}}, \frac{2^{M} - 1}{2^{M}})$, where $M \in N$ is such that $k \in [10^{M - 1}, 10^{M})$. Then, $h (σ, t) = 0$ and $E (σ, t) = \sum_{n = 0}^{\infty} (H_{n + 1} (σ, t) - H_{n} (σ, t)) = \infty$. Therefore, $E (σ) = \infty$.
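The divergence in Example 5 can be illustrated numerically. The sketch below assumes that the mixed actions at different stages are independent, so that the block entropy is the sum of the per-stage entropies, and it groups the stages into the decades $[10^{M - 1}, 10^{M})$ on which the mixed action is constant. Function names and the cut-off at M = 10 are arbitrary choices for this illustration.

```python
import math


def binary_entropy(p):
    """Entropy (nats) of the two-point distribution (p, 1 - p)."""
    return sum(-x * math.log(x) for x in (p, 1 - p) if x > 0)


def block_contribution(M):
    """Total entropy added over stages k in [10^(M-1), 10^M), where the
    mixed action is (2^-M, 1 - 2^-M) and stages are assumed independent."""
    stages = 10 ** M - 10 ** (M - 1)
    return stages * binary_entropy(2.0 ** (-M))


contributions = [block_contribution(M) for M in range(1, 11)]

# H_n evaluated at n = 10^M - 1, i.e. at the end of each decade.
H_end = []
total = 0.0
for c in contributions:
    total += c
    H_end.append(total)

# Per-stage averages H_n / n at the same points.
averages = [H_end[M - 1] / (10 ** M - 1) for M in range(1, 11)]
```

Each decade contributes more entropy than the previous one, so the partial sums grow without bound, while the per-stage average keeps shrinking, in line with a zero entropy rate.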

We can see that we should expect infinite excess s-entropy when changes of strategies are frequent, while, if changes are rare, the excess s-entropy should be finite. Now, we show another example which catches this intuition.

Example 6.

Let us consider a two-player matrix game, where

A = \{Top, Bottom\}, B = \{Left, Right\} .

The first player plays “Top” as long as Player 2 chooses “Left”. If Player 2 plays “Right” at stage m for the first time, then Player 1 chooses a mixed action with distribution $(\frac{1}{2}, \frac{1}{2})$ (“Top” and “Bottom” are equally probable) for the following $m^{2}$ stages. After this time, he always chooses “Top”. Therefore, for every $t \in T$, the play $(X_{n})$ induced by $(σ, t)$ is

X_{n} = (Top, Left) \text{ for every } n,

or there exists $m \in N$ such that

X_{n} = \begin{cases} (Top, Left), & \text{for } n = 1, \dots, m - 1, \\ (Top, Right), & \text{for } n = m, \\ ((\frac{1}{2}, \frac{1}{2}), t_{n}), & \text{for } n = m + 1, \dots, m + m^{2}, \\ (Top, t_{n}), & \text{for } n \geq m + m^{2} + 1 . \end{cases}

If the first condition is fulfilled, then $H_{n} (σ, t) = 0$ for every n and therefore $E (σ, t) = 0$. If Player 2 plays “Right” for the first time at some stage m, then (with logarithms taken to base 2)

H_{n} (σ, t) = \begin{cases} 0, & \text{for } n = 1, \dots, m, \\ n - m, & \text{for } n = m + 1, \dots, m + m^{2}, \\ m^{2}, & \text{for } n \geq m + m^{2} + 1 . \end{cases}

Therefore,

h (σ, t) = 0,

E (σ, t) = m^{2},

and

E (σ) = m^{2} .
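The computation in Example 6 can be retraced with a short Python sketch. It encodes the piecewise formula for the block entropy (in bits, i.e., assuming base-2 logarithms) and sums the increments as in Definition 6 with h = 0. The function names are hypothetical.

```python
def H_n_example6(n, m):
    """Block entropy in bits for Example 6 when Player 2 first plays
    "Right" at stage m (base-2 logarithms assumed)."""
    if n <= m:
        return 0
    if n <= m + m * m:
        return n - m
    return m * m


def excess_partial_sum(M, m, h=0.0):
    """sum_{n=0}^{M-1} (H_{n+1} - H_n - h) for the play of Example 6."""
    return sum(H_n_example6(n + 1, m) - H_n_example6(n, m) - h
               for n in range(M))
```

For a fixed m, the partial sums stop changing once M exceeds m + m², since no further uncertainty is added after Player 1 returns to “Top”.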

One may ask how this concept works and how this quantity can be calculated in more sophisticated and complex problems. Although this is not the aim of this article, we briefly discuss it here. The purpose of this section was to give illustrative examples, showing the intuition behind the introduced quantity. Nevertheless, to describe how excess s-entropy can be used in more complex games, we observe that we are building on the theory of entropy convergence rates. Therefore, we are able to use tools and ideas designed to study these rates. First, looking through the lens of information theory, Feldman et al. showed that excess entropy is the proper tool to detect and analyze patterns produced by a process [23,24]. It is able to distinguish between different patterns that have the same structure factors. Now, $E (σ, t)$ is the excess entropy of a process generated by $(σ, t)$. Thus, the excess s-entropy of a strategy $σ$ with respect to a given strategy t can be used to detect and quantify patterns produced by a play $(σ, t)$ ($E (σ, t)$ can be calculated numerically). Second, from a dynamical systems perspective, we can study entropy convergence rates using different tools, especially from symbolic dynamics.

2.4.3. Excess s-Entropy for Stationary Strategies

As we can see from the previous section, excess s-entropy agrees with the intuition of the complexity of the strategy. The natural question to ask is how we can bound excess s-entropy. Here, we will concentrate on stationary strategies. First, from Corollary 1, if a play induces a stationary process, then

E (σ, t) = \lim_{n \to \infty} n (\frac{1}{n} H_{n} (σ, t) - h (σ, t)) .

In Theorems 2 and 3, we give a few properties of the excess s-entropy.

Theorem 2.

Let σ be a stationary strategy. Then, the following conditions are equivalent:

1. $E (σ, t) = 0$,
2. $H_{n} (σ, t) = n h (σ, t)$ for all $n \in N$,
3. $H_{1} (σ, t) = h (σ, t)$.

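Proof.

We will prove the equivalence of the conditions by showing the implications $(1) \Rightarrow (2) \Rightarrow (3) \Rightarrow (1)$.

To show the implication from (1) to (2), it is sufficient to see that, because (by Lemma 3) the sequence $(g_{n} (σ, t))$ is decreasing with limit $h (σ, t)$, every term of the series defining $E (σ, t)$ is non-negative; if the series sums to zero, then $g_{n} (σ, t)$ has to be equal to $h (σ, t)$ for every n, and summing these increments gives (2). Condition (3) is an immediate consequence of (2). Finally, by (3) we have $g_{0} (σ, t) = H_{1} (σ, t) = h (σ, t)$, and since $(g_{n} (σ, t))$ is decreasing and converges to $h (σ, t)$, all its terms are equal to $h (σ, t)$, which implies (1). □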

Moreover, for stationary strategies, we have a simple lower bound for excess s-entropy:

Theorem 3.

If σ is stationary, then

H_{1} (σ, t) - h (σ, t) \leq E (σ, t)

and

E (σ) \geq \sup_{t \in T} (H_{1} (σ, t) - h (σ, t)) .

Proof.

First, let $G_{n} (σ, t) = H_{n + 1} (σ, t) - H_{n} (σ, t)$, so that $G_{n} (σ, t) = g_{n} (σ, t)$ and $G_{0} (σ, t) = H_{1} (σ, t)$. Then, for every n,

H_{1} (σ, t) - G_{n - 1} (σ, t) = G_{0} (σ, t) - G_{n - 1} (σ, t) = \sum_{k = 1}^{n - 1} (G_{k - 1} (σ, t) - G_{k} (σ, t)) .

Letting n tend to infinity, and using that all terms are non-negative by Lemma 3, results in

\begin{aligned} H_{1} (σ, t) - h (σ, t) = g_{0} (σ, t) - h (σ, t) &= \sum_{k = 1}^{\infty} (G_{k - 1} (σ, t) - G_{k} (σ, t)) \\ &\leq \sum_{k = 1}^{\infty} k (G_{k - 1} (σ, t) - G_{k} (σ, t)) = \sum_{k = 1}^{\infty} k (g_{k - 1} (σ, t) - g_{k} (σ, t)) \leq E (σ, t) . \end{aligned}

□
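The chain of inequalities in the proof can be illustrated numerically. The following is a minimal Python sketch under the assumption that the play induces a stationary second-order binary Markov chain; the kernel `Q`, the truncation level `K`, and all function names are hypothetical choices, not taken from the article. It computes $G_{k} = H_{k + 1} - H_{k}$ by enumeration and compares the plain telescoping sum, the $k$-weighted sum, and the truncated excess $H_{K + 1} - (K + 1) h$.

```python
import itertools
import math

# Hypothetical second-order binary Markov kernel: Q[(a, b)][c] is the
# probability of symbol c after the pair (a, b). Values are assumptions.
Q = {
    (0, 0): [0.8, 0.2],
    (0, 1): [0.3, 0.7],
    (1, 0): [0.6, 0.4],
    (1, 1): [0.1, 0.9],
}
PAIRS = list(Q)


def stationary_pairs(iterations=2000):
    """Approximate the stationary law of consecutive pairs by power iteration."""
    mu = {pair: 0.25 for pair in PAIRS}
    for _ in range(iterations):
        nxt = {pair: 0.0 for pair in PAIRS}
        for (a, b), mass in mu.items():
            for c in (0, 1):
                nxt[(b, c)] += mass * Q[(a, b)][c]
        mu = nxt
    return mu


MU = stationary_pairs()


def entropy(probs):
    """Shannon entropy (in bits) of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def block_entropy(n):
    """H_n by enumerating all binary blocks of length n, with H_0 = 0."""
    if n == 0:
        return 0.0
    if n == 1:
        return entropy([MU[(a, 0)] + MU[(a, 1)] for a in (0, 1)])
    probs = []
    for block in itertools.product((0, 1), repeat=n):
        p = MU[block[:2]]
        for i in range(n - 2):
            p *= Q[block[i:i + 2]][block[i + 2]]
        probs.append(p)
    return entropy(probs)


def compare_sums(K=10):
    """Return (plain sum, k-weighted sum, truncated excess, H_1 - h)."""
    H = [block_entropy(n) for n in range(K + 2)]
    G = [H[n + 1] - H[n] for n in range(K + 1)]  # G_0, ..., G_K
    h = sum(MU[pair] * entropy(Q[pair]) for pair in PAIRS)
    plain = sum(G[k - 1] - G[k] for k in range(1, K + 1))
    weighted = sum(k * (G[k - 1] - G[k]) for k in range(1, K + 1))
    excess = H[K + 1] - (K + 1) * h
    return plain, weighted, excess, H[1] - h


plain, weighted, excess, gap = compare_sums()
print(plain, weighted, excess, gap)
```

For a chain with memory two, only the first two differences $G_{k - 1} - G_{k}$ are non-negligible, so the weighted sum is expected to coincide with the truncated excess up to rounding, while the plain sum stays below it.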

3. Discussion

In this article, we proposed a new measure of statistical complexity and unpredictability of a strategy: excess strategic entropy. Because the measure is entropy-based, we were able to build intuition for it by examining how entropy and entropy convergence rates behave in the context of game theory. We showed and discussed its properties. To simplify the discussion, we introduced this quantity for a two-player game, but it can be easily extended to games with more players. In the remainder of this section, we look at two possible applications of this quantity.

It is natural to restrict the strategies of a player to those of finite statistical complexity. Clearly, if play starts at a Nash equilibrium, the excess s-entropy (and the complexity of the strategy) will be equal to zero, as the player will not be eager to change his strategy. Nevertheless, if a player is not in a Nash equilibrium from the beginning, he may prefer to choose a mixed strategy that is not too complicated, for instance, one that does not force him to change his one-shot strategy too much. Thus, restricting the set of strategies to those with bounded excess s-entropy seems to be a natural choice. Even if the player wants to “converge to the Nash equilibrium”, he may prefer those Nash equilibria to which his sequences of one-shot strategies converge using strategies with bounded excess s-entropy. Moreover, results obtained by Neyman et al. [7,25,26,27] suggest that restricting strategies to those of bounded strategic entropy can affect both the Nash equilibria of the player and the responses of unrestricted players. Similarly to their considerations on strategies with small strategic entropy, restricting strategies to those with small excess s-entropy will impose high predictability of the mixed strategy (the probability distribution that the player uses is predictable), and will affect the ability of a player with an unrestricted set of strategies to choose a best response. Therefore, bounded excess s-entropy strategies should be thoroughly explored.

Another research area where studying excess s-entropy may give better insight is reputation models. The notion that commitment is valuable has long been a critical insight of non-cooperative game theory, and has deeply affected a number of social science fields, including macroeconomics, international finance, and industrial organization. It seems natural to expect that agents playing in long-run competition will base their decisions on the reputation of the opponent. Moreover, if the opponent is more predictable (in the sense that the probability distribution that he uses is predictable), then the player can find a better response to the predicted strategy, or may prefer to play with the more predictable player. It should be added that using entropy-based measures in reputation models is not a new idea. For instance, in [28], Ekmekci et al. used entropy bounds to study the impact of unobservable stochastic replacements for the long-run player in the classical reputation model with a long-run player and a series of short-run players. Other applications of the entropy concept to reputation models appear, e.g., in [29,30].

4. Materials and Methods

The methods used in this article are standard and are thoroughly discussed in Section 2.

Funding

This research was funded by the National Science Centre, Poland, Grant No. 2016/21/D/HS4/01798.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Maenner, E. Adaptation and complexity in repeated games. Games Econ. Behav. 2008, 63, 166–187.
2. Dal Bó, P.; Fréchette, G.R. On the determinants of cooperation in infinitely repeated games: A survey. J. Econ. Lit. 2018, 56, 60–114.
3. Aumann, R. Rationality and bounded rationality: The 1986 Nancy L. Schwartz memorial lecture. Games Econ. Behav. 1997, 21, 2–14.
4. Fudenberg, D.; Maskin, E. The folk theorem in repeated games with discounting or with incomplete information. Econometrica 1986, 54, 533–554.
5. Chatterjee, K.; Sabourian, H. Game theory and strategic complexity. In Computational Complexity; Meyers, R.A., Ed.; Springer: New York, NY, USA, 2012; pp. 1292–1308.
6. Kalai, E. Bounded rationality and strategic complexity in repeated games. In Game Theory and Applications; Academic Press: Cambridge, MA, USA, 1990; pp. 131–157.
7. Neyman, A.; Okada, D. Strategic entropy and complexity in repeated games. Games Econ. Behav. 1999, 29, 191–223.
8. Crutchfield, J.P.; Feldman, D.P. Regularities unseen, randomness observed: Levels of entropy convergence. Chaos 2003, 13, 25–54.
9. Feldman, D.P.; Crutchfield, J.P. Measures of statistical complexity: Why? Phys. Lett. A 1998, 238, 244–252.
10. Grassberger, P.; Procaccia, I. Estimation of the Kolmogorov entropy from a chaotic signal. Phys. Rev. A 1983, 28, 2591–2593.
11. Hsieh, D.A. Chaos and nonlinear dynamics: Application to financial markets. J. Financ. 1991, 46, 1839–1877.
12. Cabrales, A.; Gossner, O.; Serrano, R. Entropy and the value of information for investors. Am. Econ. Rev. 2013, 103, 360–377.
13. Cabrales, A.; Gossner, O.; Serrano, R. A normalized value for information purchases. J. Econ. Theory 2018, 170, 266–288.
14. Peretz, R. Correlation through bounded recall strategies. Int. J. Game Theory 2013, 42, 867–890.
15. Bavly, G.; Peretz, R. Limits of correlation in repeated games with bounded memory. Games Econ. Behav. 2019, 115, 131–145.
16. Valizadeh, M.; Gohari, A. Playing games with bounded entropy. Games Econ. Behav. 2019, 115, 363–380.
17. Hellman, Z.; Peretz, R. A survey on entropy and economic behaviour. Entropy 2020, 22, 157.
18. Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley-Interscience: Hoboken, NJ, USA, 2006.
19. Shields, P.C. The Ergodic Theory of Discrete Sample Paths; American Mathematical Society: Providence, RI, USA, 1996.
20. Csiszár, I. Axiomatic characterization of information measures. Entropy 2008, 10, 261–273.
21. Flesch, J.; Predtetchinski, A.; Sudderth, W. Simplifying optimal strategies in limsup and liminf stochastic games. Discret. Appl. Math. 2018, 251, 40–56.
22. Flesch, J.; Thuijsman, F.; Vrieze, O.J. Simplifying optimal strategies in stochastic games. SIAM J. Control Optim. 1998, 36, 1331–1347.
23. Feldman, D.P.; Crutchfield, J.P. Structural information in two-dimensional patterns: Entropy convergence and excess entropy. Phys. Rev. E 2003, 67, 051104.
24. Feldman, D.P.; McTague, C.S.; Crutchfield, J.P. The organization of intrinsic computation: Complexity-entropy diagrams and the diversity of natural information processing. Chaos 2008, 18, 043106.
25. Neyman, A.; Okada, D. Repeated games with bounded entropy. Games Econ. Behav. 2000, 30, 228–247.
26. Neyman, A.; Okada, D. Growth of strategy sets, entropy and nonstationary bounded recall. Games Econ. Behav. 2009, 66, 404–425.
27. Neyman, A.; Spencer, J. Complexity and effective prediction. Games Econ. Behav. 2010, 69, 165–168.
28. Ekmekci, M.; Gossner, O.; Wilson, A. Impermanent types and permanent reputations. J. Econ. Theory 2012, 147, 162–178.
29. Gossner, O. Simple bounds on the value of a reputation. Econometrica 2011, 79, 1627–1641.
30. Hu, J. Reputation in the presence of noisy exogenous learning. J. Econ. Theory 2014, 153, 64–73.

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

