A Family of Fitness Landscapes Modeled through Gene Regulatory Networks

Yang, Chia-Hung; Scarpino, Samuel V.

doi:10.3390/e24050622

by Chia-Hung Yang ¹,* and Samuel V. Scarpino ¹,²,³,⁴,⁵,⁶,*

¹ Network Science Institute, Northeastern University, Boston, MA 02115, USA
² Physics Department, Northeastern University, Boston, MA 02115, USA
³ Roux Institute, Northeastern University, Boston, MA 02115, USA
⁴ Institute for Experiential AI, Northeastern University, Boston, MA 02115, USA
⁵ Santa Fe Institute, Santa Fe, NM 87501, USA
⁶ Vermont Complex Systems Center, University of Vermont, Burlington, VT 05405, USA
* Authors to whom correspondence should be addressed.

Entropy 2022, 24(5), 622; https://doi.org/10.3390/e24050622

Submission received: 2 December 2021 / Revised: 11 April 2022 / Accepted: 26 April 2022 / Published: 29 April 2022

(This article belongs to the Special Issue Foundations of Biological Computation)

Abstract:

Fitness landscapes are a powerful metaphor for understanding the evolution of biological systems. These landscapes describe how genotypes are connected to each other through mutation and related through fitness. Empirical studies of fitness landscapes have increasingly revealed conserved topographical features across diverse taxa, e.g., the accessibility of genotypes and “ruggedness”. As a result, theoretical studies are needed to investigate how evolution proceeds on fitness landscapes with such conserved features. Here, we develop and study a model of evolution on fitness landscapes using the lens of Gene Regulatory Networks (GRNs), where the regulatory products are computed from multiple genes and collectively treated as phenotypes. With the assumption that regulation is a binary process, we prove the existence of empirically observed, topographical features such as accessibility and connectivity. We further show that these results hold across arbitrary fitness functions and that a trade-off between accessibility and ruggedness need not exist. Then, using graph theory and a coarse-graining approach, we deduce a mesoscopic structure underlying GRN fitness landscapes where the information necessary to predict a population’s evolutionary trajectory is retained with minimal complexity. Using this coarse-graining, we develop a bottom-up algorithm to construct such mesoscopic backbones, which does not require computing the genotype network and is therefore far more efficient than brute-force approaches. Altogether, this work provides mathematical results of high-dimensional fitness landscapes and a path toward connecting theory to empirical studies.

Keywords:

fitness landscapes; gene regulatory networks; coarse-graining; biological computation; graph theory

1. Introduction

Since its introduction by Wright [1], the concept of fitness landscapes has grown and matured into a cornerstone of biology [2,3,4]. A fitness landscape consists of a space of genotypes that are mutually accessible through mutations and a fitness value associated with the phenotype each genotype encodes. In this context, fitness describes the evolutionary potential of each genotype, and the set of navigable genotypes on these landscapes is termed the genotype network [5]. Continuing with this metaphor, the evolution of a population can be depicted as a trajectory wandering on the fitness landscape. As a consequence, the topography of a fitness landscape sheds light on various evolutionary processes, including constraints on adaptation [6,7,8,9], speciation via genetic incompatibilities [10,11], (dis)advantages of sexual reproduction and recombination [12,13,14], the repeatability/reversibility (or not) of evolutionary trajectories [15,16,17,18], and the role of neutral networks—components of the genotype network with the same fitness—in epochal evolution [19,20,21,22,23,24].

Although the concept of fitness landscapes was introduced by Wright [1], Fisher’s 1930 geometric model of adaptation is the first mathematical model of evolution on what we now call fitness landscapes [15,25,26]. Later work by Kingman [27] and Kauffman and Levin [28] constructed what they termed a “house of cards” (HoC) model, where fitness values for each genotype are drawn independently from a specified probability distribution. Building on the HoC model, Kauffman and Weinberger [29] introduced the NK model, which forces each locus to interact with a fixed number of other loci and where a genotype’s fitness becomes the sum of the fitness contributions of every interaction group. More recently, the “rough Mount Fuji model” [30,31] combined the HoC landscape with an additional field penalizing a genotype’s Hamming distance from a reference genotype with the optimal fitness. The dependence of a genotype’s fitness on that of neighboring genotypes is thought to be a key feature of empirical fitness landscapes.

Over the past three decades, the fitness landscapes for various organisms, including bacteria [32,33,34], fungi [35,36], and fruit flies [37], have been empirically reconstructed. While the number of genotypes included in these early landscapes was limited, modern sequencing techniques and high-throughput analyses have enabled the construction of many large landscapes. Notable studies have been conducted in HIV [38,39], yeast [40], E. coli [41], jellyfish [42], human cancers [43], human stem cells [44], and DNA/RNA networks [45,46,47,48,49]. Comprehensive landscapes for multiple eukaryotic species have also been analyzed based on the binding affinity of transcription factors [50] and after accounting for the ecological context the species experiences [51]. What emerged from these studies is a set of prominent topographical features conserved across diverged taxa [52].

Together, empirical and modeled fitness landscapes exhibit three key topographical features. First, fitness landscapes are more often “rugged” than smooth [52]. The degree of ruggedness can be assessed via a variety of measures, such as the roughness of the slope ratio [53,54] and the number of local fitness maxima [55], which are often strongly correlated with each other [4]. Empirical studies typically show moderate ruggedness in the observed fitness landscapes [32,33,34,36,50,56]. The degree of ruggedness in these empirical landscapes is less than the HoC model assumes and comparable to a fine-tuned NK model or rough Mount Fuji model [4]. Second, fitness landscapes reveal mutational trajectories from one genotype to another where the fitness is non-decreasing, which implies accessibility (typically to a fitness optimum) across the landscape [32,57,58,59,60]. Lastly, whereas the inaccessible region in the HoC model expands when distant from the fitness optimum [61], other models find accessible trajectories despite high genotypic dimensionality [62,63].

Due to the often pervasive interaction between loci, determining phenotype from genotype can have a high degree of computational complexity [64,65]. Many existing fitness landscape models have dealt with this complexity by strongly constraining the state-space of possible genotypic interactions and/or reducing the complexity of how information is processed when mapping genotype to phenotype. For example, studies have focused on the folded structure of short RNA sequences, where the resulting stability or affinity is a fitness proxy [66,67,68,69,70], networks of molecular/genetic pathways whose expression pattern or homeostasis determines fitness [71,72,73,74,75], and modular mutational effects at different loci in Fisher’s geometric model [76].

Here, we model the genotype–phenotype map using the pathway framework of gene regulatory networks (GRNs), where mechanistic knowledge of how phenotypes are computed from genotypes is encoded in the GRN (see [77,78] for a more formal introduction). To study the fitness landscapes induced by GRN evolution, we integrate the pathway framework into a family of fitness landscape models where the fitness value is uniquely determined by the phenotype corresponding to the regulatory outcome of a genotype. For a fitness landscape of GRNs, we first prove the existence of two key topographical features: (a) GRNs with the same phenotype are themselves connected in the underlying genotype network, and (b) there exist accessible trajectories between all pairs of GRNs with similar phenotypes. Second, utilizing the idea of symmetries and automorphisms in the genotype network, we coarse-grain GRNs into groups with equivalent roles in the fitness landscapes and deduce an underlying mesoscopic structure with which we can predict the trajectory of evolution with minimal complexity. Lastly, using this coarse-graining, we develop a bottom-up algorithm for constructing the underlying fitness landscape of GRNs, which does not require computing the genotype network and is thus more efficient than the conventional brute-force approach.

2. Methods

Here, we introduce a family of fitness landscape models where the genotype–phenotype mapping is constructed from regulatory interactions. We first summarize a modeling framework of GRNs proposed in our previous work [77,78], termed the pathway framework, and then describe the fitness landscape models of GRNs built upon it.

2.1. Pathway Framework of GRNs

Genotypes in the pathway framework of GRNs contain all necessary information to construct a regulatory network [77,78]. More specifically, alleles at each locus include both a transcription activator and a protein product, which means the regulatory interactions among the loci can be deduced by connecting genes whose expression product corresponds to the activator of another. Compared to existing work on regulatory circuits—where mutations are modeled as rewiring a single interaction between genes [79,80]—the pathway framework considers a mutation as changing the activator/product of a gene. Lastly, the phenotype is determined by the set of loci reached in a regulatory cascade induced by external stimuli. These stimuli could be completely external to the individual or simply come from another regulatory network in the organism. For additional details on the pathway framework, see [77,78] and Figure 1 for illustration.

In this work, when building the family of fitness landscape models, we restrict the pathway framework with four assumptions. First, we consider a fixed set of genes underlying the genotypes, i.e., gene duplication and deletion events are excluded. Second, we assume a fixed underlying collection of proteins that can possibly exist in the organism. Third, we consider the case where a gene’s expression is activated by a specific protein, and it generates only one protein product. Fourth, we assume that the associated chemical state of each protein is modeled as a Boolean/binary variable (present or absent), and external environmental signals stimulate the existence of specific proteins in the organism. As a consequence, the Boolean state of a phenotype-related protein is determined by whether it is reached by a regulatory cascade starting from an initial stimulus.

While the above assumptions seem naive, as we will show in Section 3, this simplified model still predicts the topographical features observed in empirical landscapes (see Section 1). As a result of these assumptions, we are able to derive rigorous theoretical insights into GRN evolution and obtain fitness landscapes consistent with far more complicated models. We believe these assumptions are conservative with respect to the biology and a justified starting point for modeling fitness landscapes, and we discuss the implications of these assumptions and possible extensions to the model in Section 4.

2.2. Fitness Landscape of GRNs under the Pathway Framework

Let $Γ$ and $Ω$ be the fixed, underlying collections of loci and proteins, respectively. A genotype is represented by its GRN g such that every locus $γ \in Γ$ is associated with a protein activator/product pair $e_{g}(γ) = (u, v)$, $u, v \in Ω$. Equivalently, any GRN is a directed graph with $|Ω|$ nodes labeled by the proteins $Ω$ and $|Γ|$ edges labeled by the loci $Γ$. In the rest of this paper, we will use the terminology “source/target” node of edge $γ$ interchangeably with the protein activator/product of locus $γ$. We also write $\mathcal{G}$ for the set of all GRNs with the underlying loci $Γ$ and proteins $Ω$.

The backbone of a fitness landscape of GRNs, i.e., the genotype network, is an undirected network of networks encoding the mutational relationship between the GRNs. Let G be the genotype network; we denote its mega-nodes, i.e., the set of all GRNs, by $V(G) = \mathcal{G}$ and its edges by $E(G)$. There is an edge $(g_{1}, g_{2}) \in E(G)$ between any GRNs $g_{1}, g_{2} \in \mathcal{G}$ when they differ only by the allele of a single locus $γ$, $e_{g_{1}}(γ) \neq e_{g_{2}}(γ)$. In other words, $g_{1}$ and $g_{2}$ are connected in G when they can be transformed into each other through one edge rewiring.
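As a minimal sketch of this definition, the snippet below enumerates the mutational neighbors of a GRN stored as a dictionary from locus to (activator, product) pair. The function names, the toy protein set, and the dictionary representation are illustrative assumptions rather than the authors' implementation, and every pair in Ω × Ω is allowed as an allele at this stage.

```python
from itertools import product

def all_alleles(proteins):
    """Every (activator, product) pair over the protein set (no further constraints here)."""
    return list(product(proteins, repeat=2))

def mutational_neighbors(grn, alleles):
    """GRNs that differ from `grn` by the allele at exactly one locus (one edge rewiring)."""
    neighbors = []
    for locus, current in grn.items():
        for allele in alleles:
            if allele != current:
                mutant = dict(grn)       # copy, then rewire a single edge
                mutant[locus] = allele
                neighbors.append(mutant)
    return neighbors

# Hypothetical toy example: two loci and three proteins.
proteins = [1, 2, 3]
alleles = all_alleles(proteins)
grn = {"A": (1, 2), "B": (2, 3)}
neighbors = mutational_neighbors(grn, alleles)
```

Under these assumptions, each GRN has $|Γ| (|Ω|^{2} - 1)$ mutational neighbors, which already hints at how quickly the genotype network grows.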

Furthermore, we write $x_{ω}$ for the binary state of protein $ω \in Ω$, where $x_{ω} = 1$ indicates the presence of $ω$, and $x_{ω} = 0$ designates its absence. We also partition $Ω$ into three disjoint groups: (a) proteins $Ω_{0}$ whose presence is externally stimulated by the given environment, (b) proteins $\hat{Ω}$ whose states influence the fitness value, and (c) the remaining ones, which we call the dummy proteins $Ω'$ since their specific identities are irrelevant to the external environment and the resultant phenotypes/fitness. (In this paper, we assume that the stimuli $Ω_{0}$ must be proteins that cannot be produced by expression, and we impose no constraint on the fitness-relevant and dummy proteins $\hat{Ω}$ and $Ω'$.)

A phenotype is then treated as a vector of zeros and ones, where each entry corresponds to the binary state of a protein in $\hat{Ω}$. The resultant phenotype $x_{\hat{Ω}}(g)$ of a GRN g is determined by the reachability in g: For any $ω \in \hat{Ω}$, $x_{ω} = 1$ if and only if there is a stimulus $ω_{0} \in Ω_{0}$ and a path from $ω_{0}$ to $ω$ in g, which represents a chain of sequentially expressed genes that generates protein $ω$. Finally, the fitness f is simply a function of the phenotype $x_{\hat{Ω}}(g)$.
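The reachability rule can be sketched with a breadth-first search from the stimuli. The function name `phenotype`, the dictionary encoding of a GRN, and the toy proteins below are assumptions made for illustration only.

```python
from collections import deque

def phenotype(grn, stimuli, fitness_relevant):
    """Binary state of each fitness-relevant protein: 1 iff it is reachable from a stimulus."""
    successors = {}
    for source, target in grn.values():
        successors.setdefault(source, set()).add(target)
    present = set(stimuli)            # stimuli are present by assumption
    queue = deque(stimuli)
    while queue:
        protein = queue.popleft()
        for nxt in successors.get(protein, ()):
            if nxt not in present:    # newly expressed product
                present.add(nxt)
                queue.append(nxt)
    return tuple(int(p in present) for p in fitness_relevant)

# Hypothetical toy GRN: stimulus 1, dummy protein 2, fitness-relevant proteins 3 and 4.
grn = {"A": (1, 2), "B": (2, 3), "C": (4, 4)}
y = phenotype(grn, stimuli=[1], fitness_relevant=[3, 4])
```

In this toy case protein 3 is produced through the cascade 1 → 2 → 3, whereas protein 4 only appears in a self-loop and is never triggered, so the phenotype is (1, 0).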

Combined, a fitness landscape of GRNs is characterized by three key elements: the genotype network G, the external stimuli $Ω_{0}$, and the fitness function of phenotype f (which implicitly identifies the fitness-relevant proteins $\hat{Ω}$). The genotype network G serves as the skeleton of the fitness landscape, whereas the environment-dependent stimuli $Ω_{0}$ and fitness function f determine the phenotypes of GRNs and their selective advantages.

3. Results

In this work, we derive three theoretical insights into fitness landscape models using GRNs as the embedded genotype–phenotype mapping. First, we show that the resulting family of fitness landscapes must always exhibit two topographical properties: connectivity, i.e., GRNs with the same phenotype can be mutually reached via mutations, and accessibility, i.e., any GRN can be reached from an arbitrary less-fit GRN (once a certain similarity criterion is met). Second, we propose a mesoscopic coarse-graining for fitness landscapes, which is a more compact alternative to the original landscape for analyzing evolutionary processes. This mesoscopic backbone recognizes “symmetries” in the genotype network, and it aggregates GRNs with the same role in the fitness landscape into a single representative genotype. Third, we provide a bottom-up approach to algorithmically construct this mesoscopic backbone and demonstrate its efficiency over coarse-graining the genotype network using brute force.

3.1. Connectivity and Accessibility in a Fitness Landscape of GRNs

A fitness landscape model of GRNs features a handful of properties that have either been discovered in empirical fitness landscapes or investigated mathematically. First, its underlying space, i.e., the genotype network G, presents immense dimensionality. Second, the fitness function f is flexible and can effectively tune the ruggedness of the fitness landscape. For example, a highly rugged “holey” landscape can be modeled by a binary f such that any GRN $g \in \mathcal{G}$ has high fitness once some single protein $ω \in \hat{Ω}$ is present, $x_{ω} = 1$, and otherwise, g has low/zero fitness. Because one can always find several mutational neighbors of g whose phenotype shows the opposite state $x_{ω}$, the resultant fitness landscape is inevitably rugged. In what follows, we further show that fitness landscape models of GRNs must exhibit the characteristics of connectivity and accessibility.

Let $y$ be a phenotype and denote by $G_{y}$ the set of all GRNs with phenotype $y$, i.e., $x_{\hat{Ω}}(g) = y$ for $g \in G_{y}$, under the given external stimuli $Ω_{0}$. We also write $Ω_{y}^{+}$ for the required-present proteins in the phenotype $y$, so $x_{ω} = 1$ for $ω \in Ω_{y}^{+}$ and $x_{ω'} = 0$ for any other $ω' \in \hat{Ω} \setminus Ω_{y}^{+}$. Note that the number of required-present proteins $|Ω_{y}^{+}|$ is bounded from above by the number of loci $|Γ|$ since any present protein that is not a stimulus must be triggered by the expression of some locus.

We observe that some GRNs $\tilde{G}_{y} \subset G_{y}$ play a “central” role among GRNs with the same phenotype $y$. Specifically, for any $\tilde{g} \in \tilde{G}_{y}$, all the edges in $\tilde{g}$ point from the stimuli $Ω_{0}$ to the required-present proteins $Ω_{y}^{+}$, and each $ω \in Ω_{y}^{+}$ is targeted by at least one edge in $\tilde{g}$. We demonstrate an example of such $\tilde{g}$ in Figure 2a. These $\tilde{G}_{y}$ are deemed central because they can be reached by any GRN $g \in G_{y}$ through mutations among $G_{y}$ themselves: First, for every edge in g that points to an $ω \in Ω_{y}^{+}$, we rewire the edge such that it still points to $ω$ but now from an $ω_{0} \in Ω_{0}$. Arbitrarily rewiring the remaining edges between $Ω_{0}$ and $Ω_{y}^{+}$ then leads to some central GRN in $\tilde{G}_{y}$ (see Figure 2a).

In addition, if the phenotype $y$ has strictly fewer required-present proteins than the number of loci, the central GRNs $\tilde{G}_{y}$ are mutually reachable by edge rewiring among $\tilde{G}_{y}$. There is always a redundant edge whose rewiring makes no change to the phenotype, and it helps us rewire each edge to any desired source/target pair between $Ω_{0}$ and $Ω_{y}^{+}$ (see Figure 2b), which subsequently creates a chain of mutations between any $\tilde{g}, \tilde{g}' \in \tilde{G}_{y}$. These results imply that, for any phenotype $y$ with $|Ω_{y}^{+}| < |Γ|$ and any $g_{1}, g_{2} \in G_{y}$, there is always a mutational trajectory between $g_{1}$ and $g_{2}$ that only traverses GRNs in $G_{y}$, especially through the central ones $\tilde{G}_{y}$ (see Figure 2c). In the extreme case where $|Ω_{y}^{+}| = |Γ|$, however, $G_{y}$ fragments into multiple connected components (detailed in Appendix A).
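The connectivity statement can be checked by brute force on a toy landscape. The sketch below is an illustrative assumption-laden check, not the proof: GRNs are stored as tuples of (activator, product) alleles indexed by locus, products are restricted to non-stimulus proteins (our reading of the assumption in Section 2.2), and the helper names are hypothetical.

```python
from itertools import product

def phenotype(alleles, stimuli, relevant):
    """1 for each fitness-relevant protein reachable from a stimulus, else 0."""
    present, frontier = set(stimuli), list(stimuli)
    while frontier:
        p = frontier.pop()
        for u, v in alleles:
            if u == p and v not in present:
                present.add(v)
                frontier.append(v)
    return tuple(int(w in present) for w in relevant)

def count_components(n_loci, proteins, stimuli, relevant, y):
    """Connected components of the genotype network restricted to GRNs with phenotype y."""
    allele_set = [(u, v) for u in proteins for v in proteins if v not in stimuli]
    unseen = {g for g in product(allele_set, repeat=n_loci)
              if phenotype(g, stimuli, relevant) == y}
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        while stack:                      # depth-first search inside G_y
            g = stack.pop()
            for i in range(n_loci):
                for a in allele_set:
                    h = g[:i] + (a,) + g[i + 1:]   # rewire the edge of locus i
                    if h in unseen:
                        unseen.remove(h)
                        stack.append(h)
    return components

# Two loci, stimulus 1. First case: one required protein (fewer than loci).
connected_case = count_components(2, [1, 2, 3], [1], [3], (1,))
# Second case: two required proteins (equal to the number of loci).
extreme_case = count_components(2, [1, 2, 3], [1], [2, 3], (1, 1))
```

In this toy setting, the first case yields a single component, while the extreme case splits into two components, in line with the statement above.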

Next, we turn to accessibility between GRNs of different phenotypes $y$ and $y'$, where without loss of generality $f(y') \geq f(y)$. We observe that, if $|Ω_{y}^{+} \cup Ω_{y'}^{+}| \leq |Γ| + 1$, there are always two “peripheral” GRNs $\hat{g} \in G_{y}$ and $\hat{g}' \in G_{y'}$ that differ by only one edge rewiring. To be more specific, there are two independent chains in $\hat{g}$, one of which begins with a stimulus $ω_{0} \in Ω_{0}$ and sequentially connects the proteins required to be present in $y$ but not in $y'$, i.e., $Ω_{y}^{+} \setminus Ω_{y'}^{+}$, while the other consecutively joins $Ω_{y'}^{+} \setminus Ω_{y}^{+}$. The rest of the edges in $\hat{g}$ merely point from $Ω_{0}$ to $Ω_{y}^{+} \cap Ω_{y'}^{+}$, and each $ω \in Ω_{y}^{+} \cap Ω_{y'}^{+}$ is targeted by at least one edge (see example in Figure 3, left). The other GRN $\hat{g}'$ differs from $\hat{g}$ only by the first edge in the chain of $Ω_{y}^{+} \setminus Ω_{y'}^{+}$, which is rewired such that it points from the stimulus $ω_{0}$ to the first node in the chain of $Ω_{y'}^{+} \setminus Ω_{y}^{+}$ (Figure 3, right).

Our observation suggests that there is a sequence of mutations with non-decreasing fitness from any GRN $g \in G_{y}$ to any GRN $g' \in G_{y'}$, as long as $|Ω_{y}^{+} \cup Ω_{y'}^{+}| \leq |Γ| + 1$. In particular, when $|Ω_{y}^{+}| < |Γ|$, the mutational trajectory starting at g first traverses within $G_{y}$ to a peripheral GRN and then transitions into $G_{y'}$ to reach $g'$. An analogous trajectory exists even under the extreme scenario $|Ω_{y}^{+}| = |Γ|$ (see Appendix A). We also note that if the number of fitness-relevant proteins is $|\hat{Ω}| \leq |Γ| + 1$, then the condition $|Ω_{y}^{+} \cup Ω_{y'}^{+}| \leq |Γ| + 1$ is assuredly satisfied for any two phenotypes $y$ and $y'$. As a corollary, if $|\hat{Ω}| \leq |Γ| + 1$, the fitness optimum will always be accessible.
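The corollary can be illustrated on a small example by searching backwards from the fittest GRNs along mutations whose fitness does not decrease. The code below is a brute-force sketch under stated assumptions: the tuple encoding of GRNs, the restriction of products to non-stimulus proteins, the helper names, and the deliberately non-monotone fitness table are all hypothetical choices for illustration.

```python
from itertools import product

def phenotype(alleles, stimuli, relevant):
    """1 for each fitness-relevant protein reachable from a stimulus, else 0."""
    present, frontier = set(stimuli), list(stimuli)
    while frontier:
        p = frontier.pop()
        for u, v in alleles:
            if u == p and v not in present:
                present.add(v)
                frontier.append(v)
    return tuple(int(w in present) for w in relevant)

def optimum_accessible_from_all(n_loci, proteins, stimuli, relevant, fitness):
    """True if every GRN has a non-decreasing-fitness mutational path to a fittest GRN."""
    allele_set = [(u, v) for u in proteins for v in proteins if v not in stimuli]
    grns = list(product(allele_set, repeat=n_loci))
    f = {g: fitness[phenotype(g, stimuli, relevant)] for g in grns}
    best = max(f.values())
    reached = {g for g in grns if f[g] == best}   # start from the optimum
    stack = list(reached)
    while stack:
        g = stack.pop()
        for i in range(n_loci):
            for a in allele_set:
                h = g[:i] + (a,) + g[i + 1:]
                # h can step to g without losing fitness, so h inherits g's path.
                if h not in reached and f[h] <= f[g]:
                    reached.add(h)
                    stack.append(h)
    return len(reached) == len(grns)

# Two loci, stimulus 1, fitness-relevant proteins 2 and 3, so |Ω̂| = 2 <= |Γ| + 1 = 3.
toy_fitness = {(1, 1): 3, (0, 0): 2, (1, 0): 1, (0, 1): 0}
accessible = optimum_accessible_from_all(2, [1, 2, 3], [1], [2, 3], toy_fitness)
```

Even though the all-absent phenotype is fitter here than either single-protein phenotype, every GRN in this toy landscape still has an accessible path to the optimum.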

3.2. Mesoscopic Skeleton Derived from “Symmetries” in the Genotype Network of GRNs

Because the number of possible GRNs grows super-exponentially as the underlying loci and proteins expand, constructing the genotype network becomes extremely challenging beyond a small $Γ$ and $Ω$. Here, we present a more compact skeleton of the fitness landscape of GRNs based on “symmetries” in the genotype network.

As the underlying space of a fitness landscape of GRNs, the genotype network G appears to contain redundant information. On the one hand, GRNs leading to the identical phenotype are deemed to have equal fitness. On the other hand, given any GRN, for example, the mega-node outlined in orange in Figure 4a, one can always find some other GRN such that their neighborhoods in G are locally similar, e.g., the mega-node outlined in blue. This simple demonstration suggests that the structure of the genotype network G is not arbitrary; instead, some structural symmetries exist.

In graph theory, symmetries in a network are formally described through the network’s automorphisms. An automorphism of a graph is a way to shuffle the labels of its nodes such that the graph remains identical before and after shuffling. For instance, in Figure 5b, exchanging nodes 2 and 3 generates the same network and is thus an automorphism, whereas exchanging nodes 2 and 4 is not because there is an edge from 2 to 3 after shuffling. Formally, an automorphism of the genotype network G is a permutation $σ$ of all plausible GRNs $\mathcal{G} = V(G)$ such that, for any $g_{1}, g_{2} \in \mathcal{G}$, $(σ(g_{1}), σ(g_{2})) \in E(G)$ if and only if we also have $(g_{1}, g_{2}) \in E(G)$. (A permutation of $\mathcal{G}$ is a mapping $σ : \mathcal{G} \to \mathcal{G}$ where no two GRNs are mapped to the same GRN, i.e., $σ(g_{1}) \neq σ(g_{2})$ if $g_{1} \neq g_{2}$ for any $g_{1}, g_{2} \in \mathcal{G}$.) Once two GRNs g and $g'$ are related through an automorphism $σ$ of G, e.g., $g' = σ(g)$, they share the same mega-node properties that are fully determined by the connections in the genotype network (see Proposition A1).
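The formal definition translates directly into a small check on an undirected graph. The sketch below uses a three-node path as a stand-in example (it is not the graph of Figure 5b), and the function name is an illustrative assumption.

```python
def is_automorphism(edges, perm):
    """True if `perm` is a bijection on the nodes that maps the undirected edge set onto itself."""
    if sorted(perm.values()) != sorted(perm.keys()):
        return False                      # not a permutation of the node labels
    edge_set = {frozenset(e) for e in edges}
    mapped = {frozenset((perm[u], perm[v])) for u, v in edges}
    return mapped == edge_set

# Hypothetical example: the path 1 - 2 - 3.
path = [(1, 2), (2, 3)]
swap_ends = {1: 3, 2: 2, 3: 1}        # mirror the path
swap_first_two = {1: 2, 2: 1, 3: 3}   # would create an edge between 1 and 3
```

Mirroring the path is an automorphism, whereas exchanging the first two nodes is not, because the edge set changes under the relabeling.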

Furthermore, automorphisms partition the GRNs by their roles in the genotype network through the mathematical concept of equivalence classes. For a high-level and general description, imagine a set of elements and a group of operations acting on them. Each operation turns one element into another, and these two elements are related by the operation, which describes the similarity between them. An equivalence class consists of elements that are mutually related by some operation, and the set of elements is said to be partitioned into equivalence classes under the action of the operations (see Figure 5a for an illustrative example). For automorphisms $Σ(G')$ of a graph $G'$, the equivalence classes of nodes $V(G')$ under the action of $Σ(G')$ then gather nodes with a similar “structural position” in $G'$ (Figure 5c).

However, to reveal GRNs with identical roles in a fitness landscape, these automorphisms also need to preserve the phenotype. Denote by $Σ_{x}(G)$ the set of such automorphisms of G, i.e., for any $σ \in Σ_{x}(G)$ and GRN $g \in \mathcal{G}$, $σ(g)$ and g have the same phenotype. The equivalence classes of mega-nodes $V(G)$ under the action of phenotype-preserving automorphisms $Σ_{x}(G)$ then unite GRNs that (a) show similar mutational relationships with others and (b) lead to the same fitness due to their identical phenotype. We will mildly abuse the terminology to call them the equivalence classes of GRNs, which we denote by $Θ$, and each $θ \in Θ$ is a set of GRNs related through $Σ_{x}(G)$. Crucially, since the mutational relationship and the resultant phenotype are the two components that characterize a GRN in the fitness landscape, GRNs in a $θ \in Θ$ are deemed semantically equivalent, and they can be reduced to an arbitrary representative among them. Therefore, the equivalence classes of GRNs provide an efficient way to depict the underlying space of the fitness landscape.

However, what exactly composes the phenotype-preserving automorphisms $Σ_{x}(G)$ of the genotype network? From a sufficiency direction, we show that there exist a few graphical operations on the GRNs that produce phenotype-preserving automorphisms. These graphical operations involve permuting/shuffling different sorts of elements in a GRN:

(i): The identities of loci $Γ$, e.g., exchanging edge labels of loci A and B in Figure 4b;
(ii): The identities of dummy proteins $Ω'$, e.g., exchanging node labels of proteins 3 and 4 in Figure 4c.

Then, potentially rewiring a given edge (see details in Definitions A1 and A2):

(iii): Change the source node of an edge from one stimulus to another stimulus and vice versa, e.g., in Figure 4d, moving an edge pointing from node 1 to node 3 to pointing from node 2. (Note that this operation is not necessarily equivalent to permuting the identities of stimuli since at most only the single focal edge will be affected.)
(iv): Move a self-loop at one node to another node and vice versa, for example, re-allocating a self-loop at node 3 to node 4 in Figure 4e.

For the formal proofs, we point the reader to Theorem A1. Additionally, from a necessity direction, one can computationally obtain a partition $\hat{Θ}$ of the GRNs $\mathcal{G}$ that is coarser than the equivalence classes $Θ$. (A partition $P$ is coarser than another partition $P'$ if any group in $P'$ is included in some group in $P$.) Specifically, start with a partition $\hat{Θ}_{0}$ where GRNs with the same resultant phenotype are grouped together. We create a sequence of partitions of $\mathcal{G}$ through the following iterative procedure: Given the partition $\hat{Θ}_{i}$, the next partition $\hat{Θ}_{i+1}$ is obtained by further dividing groups in $\hat{Θ}_{i}$ (if needed) such that for each group $θ \in \hat{Θ}_{i}$ and $θ' \in \hat{Θ}_{i+1}$, any two GRNs in $θ'$ have the same number of neighbors among $θ$. This iterative procedure is terminated when no further division is required, i.e., $\hat{Θ}_{k+1} = \hat{Θ}_{k}$ for some integer k (see Figure 6a for an illustrative cartoon of the iterative procedure). We then take $\hat{Θ} = \hat{Θ}_{k}$ to be our desired partition of GRNs.
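The iterative procedure resembles what is commonly called color refinement on a labeled graph. The sketch below is one plausible reading of it, written for a generic graph: in the GRN setting, the nodes would be GRNs, the adjacency would be the genotype network, and the initial labels would be phenotypes. The function name and the toy graph are assumptions.

```python
def refine_partition(adjacency, initial_label):
    """Split groups until all members of a group have equally many neighbors in every group."""
    label = dict(initial_label)
    while True:
        signature = {}
        for node, nbrs in adjacency.items():
            counts = {}
            for n in nbrs:                       # count neighbors per current group
                counts[label[n]] = counts.get(label[n], 0) + 1
            # Keeping the old label guarantees the new partition refines the old one.
            signature[node] = (label[node], frozenset(counts.items()))
        ids = {}
        new_label = {node: ids.setdefault(sig, len(ids))
                     for node, sig in signature.items()}
        if len(set(new_label.values())) == len(set(label.values())):
            return new_label                     # no group was divided: stop
        label = new_label

# Hypothetical example: the path 1 - 2 - 3 with a single initial group.
adjacency = {1: [2], 2: [1, 3], 3: [2]}
groups = refine_partition(adjacency, {1: 0, 2: 0, 3: 0})
```

On this path, the two end nodes stay together while the middle node is separated, since it is the only node with two neighbors.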

To see why the proposed iterative procedure generates a coarser partition $\hat{Θ}$ than the equivalence classes $Θ$ of GRNs, we stress that the equivalence classes under automorphisms always form an equitable partition. A partition $P = \{P_{i}\}_{i=1}^{m}$ of nodes of a graph is equitable [81] if for every $P_{i}, P_{j} \in P$, any two nodes $u, v$ in group $P_{i}$ have the same number of neighbors in $P_{j}$ (Figure 6b). Since GRNs in an equivalence class $θ \in Θ$ must have the same number of neighbors for each different phenotype, we inductively show that any two GRNs $g_{1}, g_{2} \in θ$ are never separated during the iterative procedure that generates $\hat{Θ}$ (see Theorem A2). Therefore, any equivalence class $θ \in Θ$ must be included in a computationally acquired group $\hat{θ} \in \hat{Θ}$.

Figure 7 demonstrates the coarser partition $\hat{Θ}$ generated by the iterative procedure for an arbitrary toy example. The obtained $\hat{Θ}$ contains 154 groups of GRNs, and the size of groups ranges from 2 to 96. We also count the number of different kinds of GRNs that cannot be transformed into each other through graphical operations (i) and (ii), and this number varies from 1 to 4 in our example $\hat{Θ}$. Moreover, for every group in $\hat{Θ}$, we observe that those different kinds of GRNs can be related by changing the stimulus that an edge is pointing from and re-allocating self-loops (e.g., see Figure 7b). $\hat{Θ}$ is thus not simply a coarser partition than the equivalence classes; according to (i)–(iv), we know that groups in $\hat{Θ}$ are exactly the equivalence classes $Θ$. This arguably general toy example suggests that there is no need for other graphical operations to determine the equivalence classes of GRNs.

As a result, we conjecture that all the phenotype-preserving automorphisms $Σ_{x}(G)$ of the genotype network can be generated by combining graphical operations (i) to (iv) on the GRNs. In other words, two GRNs $g_{1}$ and $g_{2}$ belong to the same equivalence class if and only if, after removing all the self-loops and merging stimuli $Ω_{0}$ into a single node, there exist permutations of loci $Γ$ and dummy proteins $Ω'$ that jointly transform $g_{1}$ into $g_{2}$. This condition is closely related to the concept of isomorphisms between graphs. Whereas an automorphism is a mapping of nodes such that a graph preserves itself, an isomorphism is a mapping of nodes that transforms one graph into another. We will borrow the terminology and call the two permutations of $Γ$ and $Ω'$ together a phenotype-preserving isomorphism from $g_{1}$ to $g_{2}$.
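Assuming the conjecture holds, membership in the same equivalence class can be tested by comparing a canonical form of each GRN. The sketch below drops self-loops, merges all stimuli into one node, treats edges as an unlabeled multiset (which absorbs permutations of loci), and minimizes over all relabelings of the dummy proteins by brute force. The function name, the dictionary encoding, and the toy GRNs are hypothetical, and the brute-force search is only practical for a handful of dummy proteins.

```python
from itertools import permutations

def canonical_form(grn, stimuli, dummies):
    """Canonical edge multiset of a GRN under the conjectured equivalence."""
    merged = min(stimuli)                 # one label stands for all stimuli
    edges = [(merged if u in stimuli else u, merged if v in stimuli else v)
             for u, v in grn.values() if u != v]   # self-loops are removed
    dummies = sorted(dummies)
    forms = []
    for perm in permutations(dummies):    # every relabeling of dummy proteins
        relabel = dict(zip(dummies, perm))
        forms.append(tuple(sorted((relabel.get(u, u), relabel.get(v, v))
                                  for u, v in edges)))
    return min(forms)

# Hypothetical toy setting: stimuli 1 and 2, dummies 3 and 4, fitness-relevant protein 5.
stimuli, dummies = {1, 2}, {3, 4}
g1 = {"A": (1, 3), "B": (3, 5), "C": (4, 4)}
g2 = {"A": (4, 5), "B": (3, 3), "C": (2, 4)}   # same cascade via the other stimulus and dummy
g3 = {"A": (1, 5), "B": (3, 3), "C": (4, 4)}   # direct activation instead of a cascade
same_class = canonical_form(g1, stimuli, dummies) == canonical_form(g2, stimuli, dummies)
```

Here `g1` and `g2` share a canonical form, while `g3` does not, even though all three GRNs produce the same phenotype.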

3.3. Algorithmic Construction of the Mesoscopic Backbone of GRN Fitness Landscape

Next, we investigate algorithmic approaches to construct the mesoscopic backbone of a fitness landscape based on equivalence classes, where a representative GRN replaces all other GRNs in an equivalence class due to their identical role. In particular, the desired algorithm must (a) acquire the equivalence classes $Θ$ from scratch and (b), for a representative GRN in any equivalence class, count the number of its mutational neighbors in other equivalence classes and also within the class it belongs to.

To avoid any confusion, we emphasize that, although drawing mutational connections between equivalence classes $Θ$ can be achieved by grouping mega-nodes in the genotype network G, this naive exercise is unsuitable. First and foremost, grouping mega-nodes demands prior knowledge of the genotype network itself, but its construction is computationally heavy. Second, in contrast to coarse-graining nodes in a graph where the groups of nodes are pre-specified, listing all GRNs in an equivalence class requires examining pairs of GRNs and assuring a phenotype-preserving isomorphism between them after removing self-loops and merging stimuli. Determining the equivalence classes $Θ$ from all the GRNs $\mathcal{G} = V(G)$ can thus be costly as well. These reasons again show the value of the equivalence classes $Θ$, which consolidate GRNs into their equivalent representatives.

Here, we present a bottom-up approach that enumerates each equivalence class of GRNs and simultaneously computes the number of mutational connections among them. To begin, recall from Section 2.2 that a mutation from a GRN

g_{1} \in G

to another

g_{2} \in G

corresponds to rewiring a single edge in

g_{1}

, where

g_{1}

may rewire a self-loop/non-self-loop edge to a self-loop/non-self-loop edge in

g_{2}

. We observe that the number of non-self-loop edges in mutational neighbors

g_{1}

and

g_{2}

differs by at most one. We denote by

Γ^{'} (g)

the loci representing the non-self-loop edges in the GRN g, and

| Γ^{'} (g) |

the number of those non-self-loop edges. In other words, given equivalence classes

θ, θ^{'} \in Θ

and representative GRNs

g \in θ

and

g^{'} \in θ^{'}

, g has no mutational neighbors in

θ^{'}

if

| | Γ^{'} (g) | - | Γ^{'} (g^{'}) | | > 1

.
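The layer bound above can be checked on a toy encoding. The following is a minimal sketch, assuming (hypothetically) that a GRN is stored as a Python dict from each locus to its source–target pair and that a mutation changes the pair at exactly one locus; the helper names `non_self_loop_count` and `is_mutational_neighbor` are illustrative and do not come from the paper.

```python
def non_self_loop_count(grn):
    """Return |Γ'(g)|: the number of loci whose edge is not a self-loop."""
    return sum(1 for (u, v) in grn.values() if u != v)


def is_mutational_neighbor(g1, g2):
    """Two GRNs over the same loci are neighbors if exactly one edge differs."""
    assert g1.keys() == g2.keys()
    return sum(1 for locus in g1 if g1[locus] != g2[locus]) == 1


# Toy GRNs over three loci; node 0 plays the role of a stimulus (an assumption).
g = {"a": (0, 1), "b": (2, 2), "c": (3, 3)}
h = {"a": (0, 1), "b": (1, 2), "c": (3, 3)}  # self-loop at "b" rewired to (1, 2)

# Neighbors never differ by more than one non-self-loop edge.
assert is_mutational_neighbor(g, h)
assert abs(non_self_loop_count(g) - non_self_loop_count(h)) <= 1
```

Because a single mutation touches one locus only, the count of non-self-loop edges can change by at most one, which is what allows classes more than one layer apart to be skipped.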

We can therefore build the mesoscopic backbone by incrementally examining each equivalence class with an increasing number of non-self-loop edges in the representative GRN. This strategy is illustrated in Figure 8, where the backbone can be viewed as “layers” of equivalence classes of GRNs. Let

Θ_{k}

be the set of equivalence classes where for every

θ \in Θ_{k}

, the representative GRN

g \in θ

has exactly k non-self-loop edges,

| Γ^{'} (g) | = k

. We start with layer

Θ_{0}

, which consists of the only equivalence class with no non-self-loop edges. Then, with layers

Θ_{0}, Θ_{1}, \dots, Θ_{k}

and all the mutational connections among them, we will find the equivalence classes in the next layer

Θ_{k + 1}

and their mutational connections with layer

Θ_{k}

and within themselves up until

k = | Γ |

, where all the edges are non-self-loops.

To be more precise, we introduce the concept of

M^{+}

neighborhood: For any GRN

g \in G

, denote by

M^{+} (g)

the mutational neighbors of g that have one more non-self-loop edge than g.

M^{+}

neighborhoods are sufficient to capture the relationship between two mutational neighbors g and

g^{'}

:

If $g^{'}$ has one more non-self-loop edge than g, then $g^{'} \in M^{+} (g)$ ;
If $g^{'}$ has one less non-self-loop edge than g, then we have $g \in M^{+} (g^{'})$ ;
If $g^{'}$ has the same number of non-self-loop edges as g, then they share a common mutational neighbor $g^{″}$ , where the only different edge between g and $g^{'}$ is rewired to a self-loop and thus $g, g^{'} \in M^{+} (g^{″})$ .
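The three cases above can be made concrete with a small enumeration. This is a sketch under stated assumptions: GRNs are dicts from locus to source–target pair, and the lists `sources` and `targets` of admissible edge endpoints are hypothetical inputs (the admissible endpoints in the paper are fixed by the model in Section 2). The functions `m_plus` and `common_parent` are illustrative names.

```python
from itertools import product


def m_plus(grn, sources, targets):
    """Yield the M+ neighbors of `grn`: one self-loop rewired to a non-self-loop edge."""
    for locus, (u, v) in grn.items():
        if u != v:
            continue  # only rewiring a self-loop adds a non-self-loop edge
        for s, t in product(sources, targets):
            if s != t:
                neighbor = dict(grn)
                neighbor[locus] = (s, t)
                yield neighbor


def common_parent(g1, g2, loop_node):
    """For same-layer neighbors differing at one non-self-loop locus, build g''."""
    (locus,) = [l for l in g1 if g1[l] != g2[l]]
    parent = dict(g1)
    parent[locus] = (loop_node, loop_node)
    return parent


sources, targets = [0, 1, 2], [1, 2, 3]
g = {"a": (0, 1), "b": (1, 2)}
g_prime = {"a": (0, 1), "b": (1, 3)}      # same layer as g, differs at locus "b"
parent = common_parent(g, g_prime, loop_node=2)
siblings = list(m_plus(parent, sources, targets))
```

Here `g` and `g_prime` both appear in `siblings`, which mirrors the third case: two same-layer neighbors are recovered from the $M^{+}$ neighborhood of a shared GRN one layer below.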

The mutational connections between equivalence classes can hence be uncovered by examining the

M^{+}

neighborhood of the representative GRNs. Moreover, the

M^{+}

neighborhood of representative GRNs in layer

Θ_{k}

reveals the equivalence classes in layer

Θ_{k + 1}

because any GRN must have a mutational neighbor with one less non-self-loop edge. All that remains is to join different

M^{+}

neighbors into equivalence classes. In particular:

(A): For an equivalence class $θ \in Θ_{k}$ and its representative GRN $g \in θ$ , under what condition will $g_{1}^{'}, g_{2}^{'} \in M^{+} (g)$ belong to the same equivalence class in layer $Θ_{k + 1}$ ?
(B): For two distinct equivalence classes $θ_{1}, θ_{2} \in Θ_{k}$ and their representative GRNs $g_{1} \in θ_{1}$ and $g_{2} \in θ_{2}$ , under what condition will $g_{1}^{'} \in M^{+} (g_{1})$ and $g_{2}^{'} \in M^{+} (g_{2})$ belong to the same equivalence class in layer $Θ_{k + 1}$ ?

For ease of illustration, we hereafter choose the GRNs g,

g_{1}

,

g_{2}

,

g_{1}^{'}

and

g_{2}^{'}

such that only one stimulus node is incident to out-going edges.

To address (A), let

g_{1}^{'}, g_{2}^{'} \in M^{+} (g)

belong to the same equivalence class, so there is a phenotype-preserving isomorphism

π

from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. Recalling from Section 2.2,

e_{g} (γ) = (u, v)

denotes that “the source–target pair of edge

γ

is

(u, v)

in GRN g.” Furthermore, we write

e_{g_{1}^{'}} (γ_{1}) = (u_{1}, v_{1})

and

e_{g_{2}^{'}} (γ_{2}) = (u_{2}, v_{2})

, where

γ_{1}

and

γ_{2}

are the non-self-loop edges “added” to g to form

g_{1}^{'}

and

g_{2}^{'}

, respectively. A few observations follow:

There is an integer p such that $π^{p} (γ_{1}) = γ_{1}$ and $(π^{p} (u_{1}), π^{p} (v_{1})) = (u_{1}, v_{1})$ ;
There is another integer $q < p$ such that $π^{q} (γ_{1}) = γ_{2}$ and $(π^{q} (u_{1}), π^{q} (v_{1})) = (u_{2}, v_{2})$ ;
$e_{g_{2}^{'}} (π^{k} (γ_{1})) = (π^{k} (u_{1}), π^{k} (v_{1}))$ for $k = 1, 2, \dots, q$ ;
$e_{g_{2}^{'}} (π^{k} (γ_{1})) \neq (π^{k} (u_{1}), π^{k} (v_{1}))$ for $k = q + 1, q + 2, \dots, p$ ;
For any locus $γ$ and non-self-loop source–target pair $(u, v)$ such that $(γ, u, v) \neq (π^{k} (γ), π^{k} (u_{1}), π^{k} (v_{1}))$ for $0 \leq k \leq q - 1$ , we have $e_{g} (π (γ)) = (π (u), π (v))$ if and only if $e_{g} (γ) = (u, v)$ .

We detail the reasoning behind these observations in Lemmas A1–A3. Critically, our fifth observation implies that, after self-loop removal, the isomorphism

π

between

g_{1}^{'}

and

g_{2}^{'}

is in fact a phenotype-preserving automorphism of a subgraph

\bar{g}

of the GRN g. In addition, observations 3 and 4 show that the edges in g but not in

\bar{g}

are sequentially mapped from one to another via this automorphism

π

, i.e.,

Γ^{'} (g) ⧵ Γ^{'} (\bar{g}) = {\{π^{k} (γ_{1})\}}_{k = 1}^{q - 1}

, and they bridge the newly added edges

γ_{1}

and

γ_{2} = π^{q} (γ_{1})

. We show that the converse is also true (see Theorem A3): After self-loop removal, if we find a phenotype-preserving automorphism

π

of a subgraph

\bar{g}

of g where

γ_{1}

is consecutively mapped to

γ_{2}

through the edge differences

Γ^{'} (g) ⧵ Γ^{'} (\bar{g})

,

π

is guaranteed to be a phenotype-preserving isomorphism from

g_{1}^{'}

to

g_{2}^{'}

.

The sufficient and necessary condition for two

M^{+}

neighbors of g to be in the same equivalence class, intriguingly, lies in the phenotype-preserving automorphisms of subgraphs of the representative GRN g. Here, we demonstrate a few simple examples in Figure 9a. In the top row, an automorphism of g directly maps between the two additional edges

(u_{1}, v_{1}) = (3, 5)

and

(u_{2}, v_{2}) = (1, 4)

. In the middle row, the edge

(u_{1}, v_{1}) = (1, 2)

is consecutively mapped to

(u_{2}, v_{2}) = (3, 4)

through edge

(2, 3)

, and

(u_{2}, v_{2})

is consecutively mapped back to

(u_{1}, v_{1})

through the non-edge

(4, 1)

, so we have

q = 2

and

p = 4

. As a mixture of both, in the bottom row,

(u_{1}, v_{1}) = (2, 5)

is consecutively mapped to

(u_{2}, v_{2}) = (3, 6)

through edge

(1, 4)

, and this isomorphism is exactly an automorphism of a subgraph

\bar{g}

of g where edge

(1, 4)

is removed.
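The integers p and q in the middle-row example can be recovered by following the orbit of the added edge under the node permutation. The sketch below assumes the permutation is the 4-cycle 1 → 2 → 3 → 4 → 1, which is inferred from the edge sequence described in the text rather than read off Figure 9a; it also tracks only the action on source–target pairs, not on loci. The name `edge_orbit` is illustrative.

```python
def edge_orbit(pi, edge):
    """Return [edge, π(edge), π²(edge), ...], stopping before the orbit returns to `edge`."""
    orbit = [edge]
    u, v = pi[edge[0]], pi[edge[1]]
    while (u, v) != edge:
        orbit.append((u, v))
        u, v = pi[u], pi[v]
    return orbit


# Assumed node permutation for the middle-row example.
pi = {1: 2, 2: 3, 3: 4, 4: 1}
orbit = edge_orbit(pi, (1, 2))

p = len(orbit)              # smallest p > 0 with π^p(u1, v1) = (u1, v1)
q = orbit.index((3, 4))     # smallest q with π^q(u1, v1) = (u2, v2)
bridging = orbit[1:q]       # pairs between the two added edges
returning = orbit[q + 1:p]  # pairs on the way back to (u1, v1)
```

Under this assumed permutation, `bridging` contains the edge (2, 3) and `returning` contains the non-edge (4, 1), in line with q = 2 and p = 4 as stated above.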

Switching gears to the remaining question (B), suppose that

g_{1}

and

g_{2}

are the representative GRNs of two different equivalence classes where

| Γ^{'} (g_{1}) | = | Γ^{'} (g_{2}) |

and that

g_{1}^{'} \in M^{+} (g_{1})

and

g_{2}^{'} \in M^{+} (g_{2})

belong to the same equivalence class. Let

γ_{1}

and

γ_{2}

be the newly added edges to

g_{1}

and

g_{2}

that generate

g_{1}^{'}

and

g_{2}^{'}

, respectively, where

e_{g_{1}^{'}} (γ_{1}) = (u_{1}, v_{1})

and

e_{g_{2}^{'}} (γ_{2}) = (u_{2}, v_{2})

, and let

π

be a phenotype-preserving isomorphism from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. We observe that applying the permutation

π

on

g_{1}

transforms it into another GRN

{\tilde{g}}_{1}

in the same equivalence class. Since

g_{1}

simply has one less edge

γ_{1}

than

g_{1}^{'}

,

{\tilde{g}}_{1}

and

g_{2}^{'}

only differ by a missing edge

π (γ_{1})

. Namely, we have

g_{2}^{'} \in M^{+} ({\tilde{g}}_{1})

with the additional edge

e_{g_{2}^{'}} (π (γ_{1})) = (π (u_{1}), π (v_{1}))

. Moreover, since

g_{2}^{'}

also belongs to the

M^{+}

neighborhood of

g_{2}

with the additional edge

e_{g_{2}^{'}} (γ_{2}) = (u_{2}, v_{2})

, by removing both the extra edges from

g_{2}^{'}

, we find a GRN

g^{″}

such that

{\tilde{g}}_{1}, g_{2} \in M^{+} (g^{″})

.

We again present an illustrative example in Figure 9b. Here, a GRN

{\tilde{g}}_{1}

in the equivalence class of

g_{1}

can be found via the isomorphism

π

between

g_{1}^{'}

and

g_{2}^{'}

. We note that the newly added edge

(u_{1}, v_{1}) = (4, 1)

is transformed into

(π (u_{1}), π (v_{1})) = (3, 4)

in

g_{2}^{'}

, which is missing in

{\tilde{g}}_{1}

. Removing both

(π (u_{1}), π (v_{1})) = (3, 4)

and

(u_{2}, v_{2}) = (3, 1)

from

g_{2}^{'}

produces a GRN

g^{″}

, which is a common neighbor of

g_{2}

and

{\tilde{g}}_{1}

with one less non-self-loop edge.

Our observation resolves the necessary condition of (B): For the representative GRNs of two different equivalence classes

g_{1} \in θ_{1}

and

g_{2} \in θ_{2}

, if their

M^{+}

neighbors

g_{1}^{'} \in M^{+} (g_{1})

and

g_{2}^{'} \in M^{+} (g_{2})

belong to the same equivalence class, then we can always find two GRNs

{\tilde{g}}_{1}

and

g^{″}

such that (a)

{\tilde{g}}_{1}

falls into the equivalence class of

g_{1}

, and (b)

{\tilde{g}}_{1}

and

g_{2}

are

M^{+}

neighbors of

g^{″}

. Moreover, the converse is true as well (Theorem A4). Therefore, whether the

M^{+}

neighborhoods of

g_{1}

and

g_{2}

reveal a common equivalence class depends on the existence of a GRN

g^{″}

that both the equivalence classes

θ_{1}

and

θ_{2}

are rooted from.

Our strategy to build the mesoscopic backbone is now complete, and here, we detail our algorithm that incrementally generates the equivalence classes

Θ

of GRNs and establishes the mutational connections among them. Suppose that we have already built layers of equivalence classes

Θ_{0}, Θ_{1}, \dots, Θ_{k}

and determined the mutational connections among them. For each representative GRN g in layer

Θ_{k}

and every

g^{'} \in M^{+} (g)

, we will view

g^{'}

as the combination of g and an additional, non-self-loop edge

e_{g^{'}} (γ) = (u, v)

, for which we write

g^{'} = g \oplus (γ, u, v)

. All such combinations form a collection of

M^{+}

neighbors of the representative GRNs in layer

Θ_{k}

, for which we abuse the notation

M^{+} (Θ_{k})

.

We initially put each

g^{'} \in M^{+} (Θ_{k})

into an individual group, and we define a collection of operations

Φ

that join groups of

M^{+}

neighbors:

(I): For every representative GRN g in $Θ_{k}$ and every phenotype-preserving automorphism $σ$ of g, there is an operation $ψ_{g, σ}$ that joins together the groups of $g_{1}^{'} = g \oplus (γ, u_{1}, v_{1})$ and $g_{2}^{'} = g \oplus (γ, u_{2}, v_{2})$ , where $u_{1}, u_{2} \in Ω_{0}$ and $v_{2} = σ (v_{1})$ ;
(II): For every representative GRN g in $Θ_{k}$ and every phenotype-preserving automorphism $\bar{σ}$ of each subgraph $\bar{g}$ of g such that the edge differences $Γ^{'} (g) ⧵ Γ^{'} (\bar{g})$ are sequentially connected via $\bar{σ}$ , there is an operation $ϕ_{g, \bar{g}, \bar{σ}}$ that joins together the groups of $g_{1}^{'} = g \oplus (γ_{1}, u_{1}, v_{1})$ and $g_{2}^{'} = g \oplus (γ_{2}, u_{2}, v_{2})$ , where automorphism $\bar{σ}$ consecutively transforms edge $γ_{1}$ into $γ_{2}$ through $Γ^{'} (g) ⧵ Γ^{'} (\bar{g})$ ;
(III): For every representative GRN $g^{″}$ in $Θ_{k - 1}$ and each ${\tilde{g}}_{1} = g^{″} \oplus (γ_{1}^{″}, u_{1}^{″}, v_{1}^{″})$ and ${\tilde{g}}_{2} = g^{″} \oplus (γ_{2}^{″}, u_{2}^{″}, v_{2}^{″})$ in two different equivalence classes $θ_{1}$ and $θ_{2}$ , such that we have phenotype-preserving isomorphisms $π_{1}$ / $π_{2}$ from ${\tilde{g}}_{1}$ / ${\tilde{g}}_{2}$ to the representative GRN $g_{1}$ / $g_{2}$ after self-loop removal, there is an operation $φ_{g^{″}, {\tilde{g}}_{1}, {\tilde{g}}_{2}}$ that joins together the groups of $g_{1}^{'} = g_{1} \oplus (π_{2} (γ_{2}^{″}), π_{2} (u_{2}^{″}), π_{2} (v_{2}^{″}))$ and $g_{2}^{'} = g_{2} \oplus (π_{1} (γ_{1}^{″}), π_{1} (u_{1}^{″}), π_{1} (v_{1}^{″}))$ .

The resulting groups of

M^{+}

neighbors, after applying the joining operations

Φ

, constitute the equivalence classes in the next layer

Θ_{k + 1}

. We hereafter denote by

M_{Φ}^{+} (θ^{'})

the resulting group that corresponds to an equivalence class

θ^{'} \in Θ_{k + 1}

. We then choose an arbitrary

M^{+}

neighbor in

M_{Φ}^{+} (θ^{'})

as the representative GRN of the equivalence class

θ^{'}

, such that only one stimulus node is incident to out-going edges in the chosen representative GRN.
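Applying a collection of pairwise joining operations to singleton groups is the classical use case for a disjoint-set (union-find) structure. The sketch below is one possible way to carry out the grouping and is not claimed to be the authors' implementation; the neighbor tuples and the list `joins` are hypothetical placeholders standing in for the operations of types (I) to (III).

```python
class DisjointSet:
    """Minimal union-find over hashable items."""

    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

    def groups(self):
        out = {}
        for x in self.parent:
            out.setdefault(self.find(x), []).append(x)
        return sorted(sorted(members) for members in out.values())


# Hypothetical M+ neighbors, written as (representative, locus, source, target).
neighbors = [("g", "c", 0, 1), ("g", "c", 0, 2), ("g", "c", 1, 2), ("h", "c", 0, 3)]
# Hypothetical joining operations, each given as a pair of neighbors to merge.
joins = [(neighbors[0], neighbors[1]), (neighbors[2], neighbors[3])]

ds = DisjointSet(neighbors)
for a, b in joins:
    ds.union(a, b)
classes = ds.groups()  # each group stands for one equivalence class in the next layer
```

Any member of a resulting group can then serve as the representative, subject to the constraint on stimulus nodes stated above.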

The joining operations

Φ

further provide useful information to count the number of mutational neighbors that a representative GRN

g \in θ

in layer

Θ_{k}

has among any equivalence class

θ^{'}

, which we will denote by

A_{g} (θ^{'})

. Let us first consider

θ^{'} \in Θ_{k + 1}

. For any

{\tilde{g}}^{'} \in M_{Φ}^{+} (θ^{'})

,

{\tilde{g}}^{'}

is a mutational neighbor of g if it can be viewed as a combination of g and an arbitrary extra non-self-loop edge, and hence

A_{g} (θ^{'}) = | M_{Φ}^{+} (θ^{'}) \cap M^{+} (g) |, for θ^{'} \in Θ_{k + 1} .

(1)

Note that, in this case,

A_{g} (θ^{'})

is easily acquired when building up the layer

Θ_{k + 1}

through

Φ

.

Second, for

θ^{'} \in Θ_{k - 1}

,

A_{g} (θ^{'})

can be computed given

A_{g^{'}} (θ)

, where

g^{'}

is the representative GRN of

θ^{'}

. Since the equivalence classes

Θ

generate an equitable partition of the genotype network G (see Section 3.2), we have

A_{g} (θ^{'}) \times | θ | = A_{g^{'}} (θ) \times | θ^{'} |

equal to the total number of mutational connections between

θ

and

θ^{'}

. Moreover, the size of the equivalence class

θ

is (see Appendix D)

| θ | = \frac{| Π^{'} |}{| Σ^{'} (g) |} \times n_{l} (| Γ | - k) \times m_{s} (g) \times r (g),

(2)

where (a) we denote by

Π^{'}

the set of all permutations of dummy proteins

Ω^{'}

and denote by

Σ^{'} (g)

the set of automorphisms of the representative GRN g after self-loop removal that only permutes

Ω^{'}

; (b)

n_{l} (| Γ | - k)

is the number of ways to allocate

| Γ | - k

labeled self-loops among the proteins

Ω

; (c)

m_{s} (g)

is the number of ways to re-distribute the edges pointing from stimuli

Ω_{0}

in g; and (d)

r (g)

is the number of ways to divide loci

Γ

into self-loops, non-self-loop edges pointing from stimuli, and others. As a result,

A_{g} (θ^{'}) = A_{g^{'}} (θ) \times \frac{| Σ^{'} (g) |}{| Σ^{'} (g^{'}) |} \times \frac{n_{l} (| Γ | - k + 1) m_{s} (g^{'}) r (g^{'})}{n_{l} (| Γ | - k) m_{s} (g) r (g)}, for θ^{'} \in Θ_{k - 1} .

(3)
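The step from the equitable-partition identity to Equation (3) is a single division by the class size. As a minimal sketch, assuming the two class sizes are already known (for instance from Equation (2)), the function below solves the identity for the unknown count; the name `neighbors_from_reverse_count` and the star-graph example are illustrative and not taken from the paper.

```python
from fractions import Fraction


def neighbors_from_reverse_count(a_reverse, size_theta, size_theta_prime):
    """Solve A_g(θ') * |θ| = A_{g'}(θ) * |θ'| for A_g(θ')."""
    value = Fraction(a_reverse * size_theta_prime, size_theta)
    if value.denominator != 1:
        raise ValueError("inputs are inconsistent with an equitable partition")
    return int(value)


# Star graph with one center and three leaves: θ = {center}, θ' = {leaves}.
# Each leaf has exactly one neighbor in θ, so A_{g'}(θ) = 1.
center_to_leaves = neighbors_from_reverse_count(a_reverse=1, size_theta=1, size_theta_prime=3)
```

Both sides of the identity count the same set of edges between the two classes, which is why the ratio of class sizes is all that is needed.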

Third, we turn to the case where

θ^{'} \in Θ_{k}

but

θ^{'} \neq θ

. Recall that, if any

{\tilde{g}}^{'} \in θ^{'}

is a mutational neighbor of g, then there is a GRN

{\tilde{g}}^{″}

in layer

Θ_{k - 1}

, where

g, {\tilde{g}}^{'} \in M^{+} ({\tilde{g}}^{″})

, and such

{\tilde{g}}^{″}

is unique up to arbitrary self-loop re-allocation. Additionally, the extra edge in g and

{\tilde{g}}^{'}

must correspond to the same locus, so

A_{g} (θ^{'}) = \sum_{θ^{″} \in Θ_{k - 1}} \frac{A_{g} (θ^{″})}{n_{l} (1)} \times A_{g^{″}} (θ^{'}), for θ^{'} \in Θ_{k}, θ^{'} \neq θ,

(4)

in which we use

g^{″}

to be the representative GRN of equivalence class

θ^{″}

. Lastly, if

θ^{'} = θ

, we also need to include the scenario that the mutational neighbor

{\tilde{g}}^{'}

of g is generated by rewiring a self-loop to another self-loop. Therefore,

A_{g} (θ) = (| Γ | - k) \times (n_{l} (1) - 1) + \sum_{θ^{″} \in Θ_{k - 1}} \frac{A_{g} (θ^{″})}{n_{l} (1)} \times [A_{g^{″}} (θ) - 1] .

(5)
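The same-layer counts can be evaluated directly once the counts to and from the layer below are available. The following sketch transcribes the two sums for a class in the same layer and for the class of g itself; it treats $n_{l} (1)$ as a given number, and the dictionaries `a_down` and `a_up` as well as the numeric values are arbitrary placeholders chosen only to exercise the arithmetic, not data from an actual GRN.

```python
from fractions import Fraction


def same_layer_count(a_down, a_up, n_l1, theta_prime):
    """Neighbors of g in another class θ' of the same layer.

    a_down[t] is A_g(t) for each class t one layer below;
    a_up[t][c] is A_{g''}(c) for the representative g'' of t.
    """
    return sum(Fraction(a_down[t], n_l1) * a_up[t].get(theta_prime, 0) for t in a_down)


def within_class_count(a_down, a_up, n_l1, theta, num_loci, k):
    """Neighbors of g inside its own class θ, including self-loop to self-loop rewiring."""
    loop_moves = (num_loci - k) * (n_l1 - 1)
    shared = sum(Fraction(a_down[t], n_l1) * (a_up[t].get(theta, 0) - 1) for t in a_down)
    return loop_moves + shared


# Placeholder inputs: one class "t1" below, two classes "A" (own) and "B" in the layer.
a_down = {"t1": 4}
a_up = {"t1": {"A": 3, "B": 2}}
to_other = same_layer_count(a_down, a_up, n_l1=2, theta_prime="B")
to_self = within_class_count(a_down, a_up, n_l1=2, theta="A", num_loci=5, k=3)
```

The subtraction of one in the within-class sum removes g itself from the $M^{+}$ neighbors of the shared GRN below, as in Equation (5).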

In Algorithm 1, we summarize our proposed approach that constructs the mesoscopic backbone. It is apparent that the core of our algorithm is determining the joining operations

Φ

for a given layer

Θ_{k}

. This task can be achieved by pre-computing the phenotype-preserving automorphisms of every representative GRN once it is chosen. In addition, since these joining operations reflect the mutational neighbors and the phenotype-preserving isomorphisms in previous layers, the type-(III)

Φ

for layer

Θ_{k}

is generated as a composition of the already uncovered operations. Furthermore, the remaining

Φ

of type (II) consists of combinations of the uncovered joining operations and the newly computed automorphisms of representative GRNs in layer

Θ_{k}

. As a result, the only prerequisite in our proposed algorithm is producing the phenotype-preserving automorphisms of a GRN.

Algorithm 1 Constructing the underlying space of a fitness landscape of GRNs

Require: The fixed underlying collections of loci $Γ$ and proteins $Ω$ of GRNs
Ensure: The representative GRN $g_{θ}$ of each equivalence class $θ \in Θ$ , and its number of mutational neighbors $A_{g_{θ}} (θ^{'})$ in any equivalence class $θ^{'} \in Θ$

1: $k \leftarrow 0$ ▹ initialization
2: $g_{θ_{0}} \leftarrow$ a GRN with no non-self-loop edge, where $θ_{0}$ is the only equivalence class in layer $Θ_{0}$
3: Store the phenotype-preserving automorphisms $Σ_{x} (g_{θ_{0}})$ .
4: Compute $A_{g_{θ_{0}}} (θ_{0})$ via Equation (5).
5: while $k < | Γ |$ do ▹ incrementally find $Θ$
6:     Construct and store the joining operations $Φ$ for layer $Θ_{k}$ .
7:     $M_{Φ}^{+} \leftarrow$ grouping of $M^{+} (Θ_{k})$ acted by $Φ$
8:     $Θ_{k + 1}$ corresponds to the groups in $M_{Φ}^{+}$ .
9:     for all $θ^{'} \in Θ_{k + 1}$ do
10:         $g_{θ^{'}} \leftarrow$ a GRN in $M_{Φ}^{+} (θ^{'})$ ▹ choose the representative GRN
11:         Store the phenotype-preserving automorphisms $Σ_{x} (g_{θ^{'}})$ .
12:     end for
13:     for all $θ \in Θ_{k}, θ^{'} \in Θ_{k + 1}$ do ▹ count the number of mutational neighbors
14:         Compute $A_{g_{θ}} (θ^{'})$ and $A_{g_{θ^{'}}} (θ)$ via Equations (1) and (3).
15:     end for
16:     for all $θ_{1}^{'}, θ_{2}^{'} \in Θ_{k + 1}$ do
17:         Compute $A_{g_{θ_{1}^{'}}} (θ_{2}^{'})$ via Equations (4) and (5).
18:     end for
19:     $k \leftarrow k + 1$
20: end while
21: Set any remaining, not computed $A_{g_{θ}} (θ^{'})$ to zero.

4. Conclusions

In this work, we integrate mechanistic knowledge of how phenotypes are computed from genotypes via regulatory interactions into fitness landscape models. The resulting family of fitness landscape models features flexibility for tunable ruggedness and accessibility among phenotypes. Furthermore, we introduce the concept of equivalence classes of GRNs, where GRNs of the same phenotype and with similar structural positions in the genotype network are coarse-grained into a group. These equivalence classes of GRNs lead to a compact and informative description of the fundamental space of a fitness landscape. Using this coarse-graining, we develop a bottom-up, efficient algorithm for constructing the underlying space of a fitness landscape based on the equivalence classes. Critically, this algorithm does not require pre-computing the genotype network and therefore permits the exploration of substantially larger GRNs.

Naively, ruggedness and accessibility would seem to be contradictory characteristics of a fitness landscape. Indeed, reciprocal sign epistasis has been shown to yield a strong influence on a landscape’s ruggedness and was regarded as an impediment to evolutionary accessibility when first introduced [2,32,55]. Nevertheless, recent studies suggest that fitness landscape models most closely aligned with empirical observations show that sign epistasis (and thus ruggedness) can co-exist with accessibility [63,82]. In addition to demonstrating that ruggedness and accessibility are not mutually exclusive, our model is compatible with three additional empirical observations. First, GRNs result in high dimensional genotype–phenotype maps [63]. Second, selection acts on the superposition of mutations and the background GRN rather than a few pairs of mutations [60]. Third, and perhaps most importantly, a GRN may experience a series of neutral mutations and then evolve into a nearby phenotype [3,8,83,84]. The accessibility induced in fitness landscapes of GRNs via neutral evolution agrees with the phenomenon of punctuated equilibrium/epochal evolution [23,85,86].

Our derived equivalence classes for GRNs provide a novel, mesoscopic, and optimally descriptive skeleton of a fitness landscape. Neither the genotypic space nor the phenotypic space alone fully characterizes a fitness landscape; however, models with even a relatively simple genotype–phenotype map are computationally intensive because they must retain all plausible genotypes [70,73,74,75]. Intuitively, the complexity of a genotype–phenotype map can be reduced by combining similar phenotypes into high-level descriptors [87]. The equivalence classes of GRNs, on the other hand, serve as an intermediate level between the genotypic and phenotypic space, which provides an optimal coarse-graining that encodes all necessary information to predict the evolutionary trajectory on the fitness landscape.

We argue that our proposed algorithm for coarse-graining GRN fitness landscapes is more efficient than brute-force approaches. First, because we consolidate an equivalence class into a single representative GRN, our method is less costly in memory and requires fewer computations when finding mutational neighbors. Second, even if all plausible GRNs were organized into layers by the number of non-self-loop edges (see Section 3.3), every layer would still contain super-exponentially many GRNs. Our algorithm instead finds the equivalence classes in each layer iteratively. To construct the (

k + 1

)-th layer, we only have to exhaust the representative GRNs in the k-th layer, along with any plausible additional non-self-loop edge(s); this number is significantly smaller than the number of GRNs in the (

k + 1

)-th layer. Lastly, existing heuristics for graph automorphisms [88,89] can be used to produce the phenotype-preserving automorphisms of the representative GRNs, which is the only prerequisite when joining together different GRN–edge pairs. Because the set of automorphisms becomes more limited as the complexity of GRNs increases, we expect only a minor overhead in the joining procedure as compared to the exhaustive, brute-force approach.

Despite our model being constrained to the pathway framework of GRNs [77,78] and a few naive assumptions described in Section 2, we believe our methodology to be flexible and, in what follows, we outline some potential directions to extend the framework. First, when GRNs are modeled through more complex computation, e.g., with different logic gates connecting multiple expression activators/suppressors/products, those GRNs that only consist of naive interactions are never excluded. Thus, the current model represents a subset of the complete landscape built by more complex gene regulation. The derived connectivity and accessibility among the naive GRNs still hold, and we expect these topographical features to manifest for complex GRNs if mutations between the simple and complex expressions are permitted. Second, hypergraphs [90] could be used to describe the expression behavior of genes where multiple activators/products appear. Third, stable motif identification [91] and target control [92] for Boolean network models could be used to explore the phenotypes of mutational neighbors of a focal complex GRN. Lastly, our methodologies are likely applicable to other classes of genotype–phenotype maps [93,94]. In particular, once the mapping and the genotype network are determined, one can simply follow the proposed iterative procedure (Figure 6) to obtain a genotype partition coarser than the equivalence classes.

More broadly, this work showcases the potential of combining biological computation across different scales along the hierarchy of living systems. Computing biological functionality on the organism level with genotype–phenotype mapping provides a blueprint of the overall fitness landscape, where evolutionary processes occur/compute on the population level. Furthermore, several intriguing perspectives arise from the proposed mesoscopic backbone if we consider evolution to be a random walk on the fitness landscape. The process of evolution not only manifests genotypes with higher fitness values but also reveals genotypes whose mutational neighbors are more fit [19,23,78]; in other words, the prevalence of different genotypes would reflect the connection counts between equivalence classes of GRNs. In addition, these “connection counts” could become associated with an analogous theory of computation in evolution that addresses questions such as how likely a genotype in an equivalence class is to evolve into a specified phenotype, as well as how likely it is to “reset” to another genotype in the same equivalence class and recover its position in the fitness landscape.

Author Contributions

Conceptualization, C.-H.Y. and S.V.S.; Formal analysis, C.-H.Y.; Funding acquisition, S.V.S.; Investigation, C.-H.Y.; Methodology, C.-H.Y.; Project administration, S.V.S.; Supervision, S.V.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The authors affirm that all data necessary for confirming the conclusions of the article are present within the article, figures, and tables.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Central and Peripheral GRNs Where No Regulation Is Present

Here, we show the connectivity of “central” and “peripheral” GRNs mentioned in Section 3.1 under the extreme scenario where no gene regulation appears, in particular, when

| Ω_{y}^{+} | = | Γ |

. We observe that, since

| Ω_{y}^{+} | = | Γ |

, each node

ω \in Ω_{y}^{+}

is incident to one and only one incoming edge, and thus, each central GRN

\tilde{g} \in {\tilde{G}}_{y}

corresponds to a bijective mapping from

Ω_{y}^{+}

to edge labels

Γ

.

First, we show that if two central GRNs

{\tilde{g}}_{1}, {\tilde{g}}_{2} \in {\tilde{G}}_{y}

correspond to different mappings between

Ω_{y}^{+}

and

Γ

, there is no mutational trajectory connecting them among

G_{y}

. Suppose, for contradiction, that such a mutational trajectory does exist. Due to the different associated mappings of

{\tilde{g}}_{1}

and

{\tilde{g}}_{2}

, there is an

ω \in \hat{Ω}

with distinct labels of the incident edge

γ_{1}

and

γ_{2}

in

{\tilde{g}}_{1}

and

{\tilde{g}}_{2}

, respectively. Because mutating

{\tilde{g}}_{1}

into

{\tilde{g}}_{2}

requires rewiring both

γ_{1}

and

γ_{2}

, there must exist a GRN g where either none or both the edges

γ_{1}

and

γ_{2}

point to

ω

. Nevertheless, g contradicts our observation following the constraint

| Ω_{y}^{+} | = | Γ |

so

g \notin G_{y}

. As a result, under this extreme scenario,

G_{y}

fragments into multiple connected components when only mutations among themselves are considered.

Next, for any phenotype

y^{'}

for which

| Ω_{y}^{+} \cup Ω_{y^{'}}^{+} | \leq | Γ | + 1

, we show that there is a mutational trajectory among

G_{y}

connecting an arbitrary central GRN

\tilde{g} \in {\tilde{G}}_{y}

and a peripheral GRN

\hat{g} \in G_{y}

at the boundary of

G_{y}

and

G_{y^{'}}

. Specifically, take

ω \in Ω_{y}^{+} ⧵ Ω_{y^{'}}^{+}

and its incident edge pointing from

ω_{0} \in Ω_{0}

. For each

ω^{'} \in Ω_{y}^{+} ⧵ Ω_{y^{'}}^{+}

where

ω^{'} \neq ω

, one can sequentially rewire the incident edge of

ω^{'}

to form a chain of

Ω_{y}^{+} ⧵ Ω_{y^{'}}^{+}

initiated by

ω_{0}

, which leads to a resultant GRN

\hat{g}

. Moreover, since

| Ω_{y}^{+} | = | Γ |

, we have

| Ω_{y^{'}}^{+} ⧵ Ω_{y}^{+} | = 1

. As described in Section 3.1, this

\hat{g}

is indeed a peripheral GRN between

G_{y}

and

G_{y^{'}}

.

Appendix B. Phenotype-Preserving Automorphisms of the Genotype Network of GRNs

In this section, we demonstrate (a) why GRNs mapped by automorphisms of the genotype network G are equivalent, (b) four graphical operations that generate phenotype-preserving automorphisms of G, and (c) the correctness of our iterative procedure to obtain a coarser partition than the equivalence classes of GRNs.

Proposition A1.

Given an automorphism σ of the genotype network G and a mega-node function

f_{G}

that depends on the adjacency matrix

A

of G, for any GRNs

g_{1}, g_{2} \in G

where

g_{2} = σ (g_{1})

,

f_{G} (g_{2}) = f_{G} (g_{1})

.

Proof.

Take

f_{G}^{'} = f_{G} \circ σ

. Since for any

g_{1}, g_{2} \in V (G)

,

(σ (g_{1}), σ (g_{2})) \in E (G)

if and only if

(g_{1}, g_{2}) \in E (G)

, the adjacency matrix A remains unchanged after permuting the mega-nodes through

σ

. As a result, we have

f_{G}^{'} = f_{G}

, and

f_{G} (g_{2}) = f_{G}^{'} (g_{1}) = f_{G} (g_{1})

. □
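Proposition A1 can be checked numerically on a small graph. The sketch below uses node degree as one example of a function that depends only on the adjacency structure; the path graph, the reflection `sigma`, and the helper names are illustrative choices rather than objects from the paper.

```python
def degree_profile(adj):
    """A node function that depends only on the adjacency structure."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def is_automorphism(adj, sigma):
    """Check that sigma is a node bijection mapping edges to edges and non-edges to non-edges."""
    if sorted(sigma) != sorted(adj) or sorted(sigma.values()) != sorted(adj):
        return False
    return all((sigma[v] in adj[sigma[u]]) == (v in adj[u]) for u in adj for v in adj)


# Path graph 1 - 2 - 3 and the reflection that swaps its two ends.
adj = {1: {2}, 2: {1, 3}, 3: {2}}
sigma = {1: 3, 2: 2, 3: 1}
f = degree_profile(adj)

assert is_automorphism(adj, sigma)
assert all(f[sigma[node]] == f[node] for node in adj)
```

A permutation that is not an automorphism, such as swapping nodes 1 and 2 here, fails the first check, and the degree of a node is then no longer guaranteed to match that of its image.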

Let

π

and

π^{'}

be a permutation of the loci

Γ

and the dummy proteins

Ω^{'}

, respectively. It is not hard to see that

π

and

π^{'}

also generate a permutation of the GRNs

G

. For

g \in G

, we abuse the notation

π (g)

to be the GRN mapped through the locus permutation

π

, where an edge with

e_{g} (γ) = (u, v)

is transformed into

e_{π (g)} (π (γ)) = (u, v)

. Similarly, in the GRN

π^{'} (g)

mapped through the dummy protein permutation

π^{'}

, an edge with

e_{g} (γ) = (u, v)

is transformed into

e_{π^{'} (g)} (γ) = (π^{'} (u), π^{'} (v))

.

Furthermore, we have two more types of graphical operations on GRNs:

Definition A1.

For a locus γ and two stimuli

ω, ω^{'} \in Ω_{0}

,

ρ_{γ, ω, ω^{'}} : G \to G

transforms a GRN g into

g^{'}

such that edge γ becomes

\begin{cases} e_{g^{'}} (γ) = (u, ω^{'}) & \text{if } e_{g} (γ) = (u, ω), \\ e_{g^{'}} (γ) = (u, ω) & \text{if } e_{g} (γ) = (u, ω^{'}), \\ \text{unchanged} & \text{otherwise.} \end{cases}

Definition A2.

For a locus γ and two nodes ω and

ω^{'}

,

ϱ_{γ, ω, ω^{'}} : G \to G

transforms a GRN g into

g^{'}

such that the edge γ becomes

\begin{cases} e_{g^{'}} (γ) = (ω^{'}, ω^{'}) & \text{if } e_{g} (γ) = (ω, ω), \\ e_{g^{'}} (γ) = (ω, ω) & \text{if } e_{g} (γ) = (ω^{'}, ω^{'}), \\ \text{unchanged} & \text{otherwise.} \end{cases}

Note both

ρ_{γ, ω, ω^{'}}

and

ϱ_{γ, ω, ω^{'}}

are permutations of

G

as well, where pairs of GRNs are mutually mapped from one to the other.

These four graphical operations introduced above are, more importantly, automorphisms of the genotype network G that also preserve the phenotype of GRNs:

Theorem A1.

The transformations π,

π^{'}

,

ρ_{γ, ω, ω^{'}}

, and

ϱ_{γ, ω, ω^{'}}

are phenotype-preserving automorphisms of G.

Proof.

For

g_{1}, g_{2} \in G

, let

Δ = \{γ \in Γ ∣ e_{g_{1}} (γ) \neq e_{g_{2}} (γ)\}

. Since

π^{'}

is a permutation of dummy proteins, it preserves

Δ

. Additionally, because

ρ_{γ, ω, ω^{'}}

and

ϱ_{γ, ω, ω^{'}}

can be viewed as permutations of the source–target pair of a single edge

γ

, they also preserve

Δ

. The permutation

π

of

Γ

, on the other hand, does not preserve

Δ

but maintains its size

| Δ |

. Since

(g_{1}, g_{2}) \in E (G)

if and only if

| Δ | = 1

, the four transformations are automorphisms of G.

Furthermore, since

π

,

π^{'}

and

ϱ_{γ, ω, ω^{'}}

simply change the labels of edges, labels of the intermediate nodes, and the location of a self-loop, they maintain any path from a stimulus

ω_{0} \in Ω_{0}

to a fitness-relevant protein

\hat{ω} \in \hat{Ω}

.

ρ_{γ, ω, ω^{'}}

may alter the path between

ω_{0}

and

\hat{ω}

, but the reachability from

Ω_{0}

to

\hat{ω}

remains. Therefore, the four transformations also preserve the phenotype of GRNs. □

Finally, we turn to the computationally acquired partition that can be shown to be coarser than the equivalence classes of GRNs. Recall from Section 3.2 that our iterative procedure starts from a partition

{\hat{Θ}}_{0}

where GRNs with the same phenotype are grouped together. Given the partition

{\hat{Θ}}_{i}

, the next partition

{\hat{Θ}}_{i + 1}

is obtained by further dividing groups in

{\hat{Θ}}_{i}

(if needed) such that for each group

θ \in {\hat{Θ}}_{i}

and

θ^{'} \in {\hat{Θ}}_{i + 1}

, any two GRNs in

θ^{'}

have the same number of neighbors among

θ

in the genotype network G. The procedure terminates when a stationary partition

\hat{Θ}

is reached. We then have:

Theorem A2.

Every equivalence class

θ \in Θ

is included in a group

\hat{θ} \in \hat{Θ}

,

θ \subset \hat{θ}

.

Proof.

Recall that GRNs in an equivalence class have the same phenotype, so for each

θ \in Θ

, there is some

θ_{0} \in {\hat{Θ}}_{0}

where

θ \subset θ_{0}

. Suppose that

θ \subset θ_{i}

for each

θ \in Θ

and some

θ_{i} \in {\hat{Θ}}_{i}

. Since

Θ

forms an equitable partition, every

g \in θ

has the same number of neighbors in each

θ^{'} \in Θ

and thus also in each

θ_{i} \in {\hat{Θ}}_{i}

. Consequently, no two GRNs in

θ

will be separated into two different groups in

{\hat{Θ}}_{i + 1}

, and the theorem is proved by induction. □
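
The iterative procedure recalled above can be sketched as a standard partition refinement. This is a brute-force illustration that assumes the genotype network is available explicitly as an adjacency dictionary, which is exactly what the bottom-up construction in the main text avoids; the function name `refine_partition` and its arguments are hypothetical.

```python
from collections import defaultdict


def refine_partition(adjacency, initial_label):
    """Split groups until the partition is stationary.

    `adjacency` maps each node to an iterable of its neighbors and
    `initial_label` maps each node to its starting group (for instance,
    its phenotype). Two nodes stay together only if they are in the same
    group and have the same number of neighbors in every group.
    """
    label = dict(initial_label)
    while True:
        signature = {}
        for node, neighbors in adjacency.items():
            counts = defaultdict(int)
            for nb in neighbors:
                counts[label[nb]] += 1
            # Keeping the old label guarantees that groups are only split.
            signature[node] = (label[node], tuple(sorted(counts.items())))

        # Relabel the distinct signatures with consecutive integers.
        ids = {}
        new_label = {n: ids.setdefault(s, len(ids)) for n, s in signature.items()}

        # No group was split in this round: the partition is stationary.
        if len(set(new_label.values())) == len(set(label.values())):
            return new_label
        label = new_label


# Toy network: a path on five nodes that all share one phenotype "p".
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
groups = refine_partition(path, {n: "p" for n in path})
```

On the five-node path, the procedure separates the two end nodes, the two nodes next to them, and the middle node, which is the coarsest equitable partition compatible with the initial grouping.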

Appendix C. Combining Mutational Neighbors into Equivalence Classes

In this section, we tackle the two questions raised in Section 3.3:

(A): For an equivalence class $θ \in Θ_{k}$ and its representative GRN $g \in θ$ , under what condition will $g_{1}^{'}, g_{2}^{'} \in M^{+} (g)$ belong to the same equivalence class in layer $Θ_{k + 1}$ ?
(B): For two distinct equivalence classes $θ_{1}, θ_{2} \in Θ_{k}$ and their representative GRNs $g_{1} \in θ_{1}$ and $g_{2} \in θ_{2}$ , under what condition will $g_{1}^{'} \in M^{+} (g_{1})$ and $g_{2}^{'} \in M^{+} (g_{2})$ belong to the same equivalence class in layer $Θ_{k + 1}$ ?

Furthermore, recall that for ease of demonstration, we constrain the GRNs g,

g_{1}

,

g_{2}

,

g_{1}^{'}

and

g_{2}^{'}

to the case where only one stimulus node has out-going edges.

Definition A3.

A phenotype-preserving isomorphism π from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal is a permutation of loci Γ and dummy proteins

Ω^{'}

such that for any locus γ and non-self-loop source–target pair

(u, v)

,

e_{g_{2}^{'}} (π (γ)) = (π (u), π (v))

if and only if

e_{g_{1}^{'}} (γ) = (u, v)

.

Starting with the question (A), we write

e_{g_{1}^{'}} (γ_{1}) = (u_{1}, v_{1})

and

e_{g_{2}^{'}} (γ_{2}) = (u_{2}, v_{2})

, where

γ_{1}

and

γ_{2}

are the non-self-loop edges newly rewired to generate

g_{1}^{'}

and

g_{2}^{'}

from g, respectively. A few observations appear when we assume

g_{1}^{'}

and

g_{2}^{'}

belong to the same equivalence class:

Lemma A1.

Suppose there exists a phenotype-preserving isomorphism π from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. Then there are two integers

q < p

such that

π^{p} (γ_{1}) = γ_{1}

,

(π^{p} (u_{1}), π^{p} (v_{1})) = (u_{1}, v_{1})

, and

π^{q} (γ_{1}) = γ_{2}

,

(π^{q} (u_{1}), π^{q} (v_{1})) = (u_{2}, v_{2})

.

Proof.

Since

π

is a permutation of finite sets, it must have a finite period, i.e., an integer p such that

π^{p} (γ_{1}) = γ_{1}

and

(π^{p} (u_{1}), π^{p} (v_{1})) = (u_{1}, v_{1})

.

If

π (γ_{1}) = γ_{2}

and

(π (u_{1}), π (v_{1})) = (u_{2}, v_{2})

, then we have

q = 1

. Otherwise, π must map γ_{1} to a non-self-loop edge in g because

γ_{2}

is the only additional non-self-loop edge in

g_{2}^{'}

, i.e.,

e_{g} (π (γ_{1})) = (π (u_{1}), π (v_{1}))

. Assume there is no integer

q < p

such that

π^{q} (γ_{1}) = γ_{2}

and

(π^{q} (u_{1}), π^{q} (v_{1})) = (u_{2}, v_{2})

. Then,

π^{p - 1} (γ_{1})

is a non-self-loop edge in g with

e_{g} (π^{p - 1} (γ_{1})) = (π^{p - 1} (u_{1}), π^{p - 1} (v_{1}))

. However, since

γ_{1}

is not a non-self-loop edge in

g_{2}^{'}

, the fact that

π^{p} (γ_{1}) = γ_{1}

and

(π^{p} (u_{1}), π^{p} (v_{1})) = (u_{1}, v_{1})

contradicts

π

being an isomorphism from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. Therefore, there is an integer

q < p

such that

π^{q} (γ_{1}) = γ_{2}

and

(π^{q} (u_{1}), π^{q} (v_{1})) = (u_{2}, v_{2})

. □
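
The two exponents in Lemma A1 can be found by simply following the orbit of the triple (γ_1, u_1, v_1) under repeated application of π. The sketch below assumes that π is stored as a single dictionary over locus and protein labels, with missing keys treated as fixed points; the helper names `apply` and `find_q_and_p` are hypothetical.

```python
def apply(perm, triple):
    """Apply the permutation componentwise; absent labels are fixed points."""
    return tuple(perm.get(x, x) for x in triple)


def find_q_and_p(perm, triple_1, triple_2):
    """Return (q, p) for the orbit of `triple_1` under `perm`.

    p is the smallest positive exponent returning `triple_1` to itself, and
    q is the smallest positive exponent mapping `triple_1` to `triple_2`
    (None if `triple_2` is not on the orbit).
    """
    q = None
    current, step = apply(perm, triple_1), 1
    # The loop ends because a permutation of a finite set has finite period.
    while current != triple_1:
        if q is None and current == triple_2:
            q = step
        current, step = apply(perm, current), step + 1
    return q, step


# Loci A, B, C form a 3-cycle; dummy proteins a, b are swapped; "s" is fixed.
perm = {"A": "B", "B": "C", "C": "A", "a": "b", "b": "a"}
q, p = find_q_and_p(perm, ("A", "s", "a"), ("C", "s", "a"))
```

Here the locus cycle has length 3 and the protein cycle has length 2, so the triple returns to itself after p = 6 steps and first reaches the second triple at q = 2.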

Lemma A2.

Suppose there exists a phenotype-preserving isomorphism π from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. For integers p and q in Lemma A1,

e_{g_{2}^{'}} (π^{k} (γ_{1})) = (π^{k} (u_{1}), π^{k} (v_{1}))

for

k = 1, 2, \dots, q

, and

e_{g_{2}^{'}} (π^{k} (γ_{1})) \neq (π^{k} (u_{1}), π^{k} (v_{1}))

for

k = q + 1, q + 2, \dots, p

.

Proof.

Since

π

is an isomorphism and

γ_{2} = π^{q} (γ_{1})

is the only additional non-self-loop edge in

g_{2}^{'}

, we have

π (γ_{1}), π^{2} (γ_{1}), \dots, π^{q - 1} (γ_{1})

to be non-self-loop edges in g. Thus,

e_{g_{2}^{'}} (π^{k} (γ_{1})) = (π^{k} (u_{1}), π^{k} (v_{1}))

for

k = 1, 2, \dots, q

. On the other hand, since

γ_{2} = π^{q} (γ_{1})

is not a non-self-loop edge in

g_{1}^{'}

, the isomorphism

π

guarantees that the source–target pairs

(π^{q + 1} (u_{1}), π^{q + 1} (v_{1})), (π^{q + 2} (u_{1}), π^{q + 2} (v_{1})), \dots, (π^{p} (u_{1}), π^{p} (v_{1}))

do not match edges in

g_{2}^{'}

, in particular,

e_{g_{2}^{'}} (π^{k} (γ_{1})) \neq (π^{k} (u_{1}), π^{k} (v_{1}))

for

k = q + 1, q + 2, \dots, p

. □

Lemma A3.

Suppose there exists a phenotype-preserving isomorphism π from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. Given integer q in Lemma A1, for a locus γ and non-self-loop source–target pair

(u, v)

where there is no

0 \leq k \leq q - 1

such that

(γ, u, v) = (π^{k} (γ_{1}), π^{k} (u_{1}), π^{k} (v_{1}))

,

e_{g} (π (γ)) = (π (u), π (v))

if and only if

e_{g} (γ) = (u, v)

.

Proof.

For

(γ, u, v) \notin {\{(π^{k} (γ_{1}), π^{k} (u_{1}), π^{k} (v_{1}))\}}_{k = 0}^{p - 1}

, since

(γ_{1}, u_{1}, v_{1})

and

(γ_{2}, u_{2}, v_{2})

are already excluded, and by Definition A3, we have

e_{g} (π (γ)) = (π (u), π (v))

if and only if

e_{g} (γ) = (u, v)

. Furthermore, for

k = q, q + 1, \dots, p - 1

, because the only additional non-self-loop edge in

g_{2}^{'}

follows

π^{q} (γ_{1}) = γ_{2}

and

(π^{q} (u_{1}), π^{q} (v_{1})) = (u_{2}, v_{2})

, according to Lemma A2, we know that

e_{g} (π^{k} (γ_{1})) \neq (π^{k} (u_{1}), π^{k} (v_{1}))

and

e_{g} (π^{k + 1} (γ_{1})) \neq (π^{k + 1} (u_{1}), π^{k + 1} (v_{1}))

. As a result, the statement in Lemma A3 is true for any locus

γ

and non-self-loop source–target pair

(u, v)

where

(γ, u, v) \notin {\{(π^{k} (γ_{1}), π^{k} (u_{1}), π^{k} (v_{1}))\}}_{k = 0}^{q - 1}

. □

The following theorem resolves the necessary and sufficient condition in question (A):

Theorem A3.

Let

g_{1}^{'}, g_{2}^{'} \in M^{+} (g)

with the additional non-self-loop edges

e_{g_{1}^{'}} (γ_{1}) = (u_{1}, v_{1})

and

e_{g_{2}^{'}} (γ_{2}) = (u_{2}, v_{2})

, respectively.

g_{1}^{'}

and

g_{2}^{'}

belong to the same equivalence class if and only if there exist two integers

q < p

and a phenotype-preserving automorphism σ of a subgraph

\bar{g}

of g such that

i.: $(σ^{q} (γ_{1}), σ^{q} (u_{1}), σ^{q} (v_{1})) = (γ_{2}, u_{2}, v_{2})$ ;
ii.: $(σ^{p} (γ_{1}), σ^{p} (u_{1}), σ^{p} (v_{1})) = (γ_{1}, u_{1}, v_{1})$ ;
iii.: ${\{σ^{k} (γ_{1})\}}_{k = 1}^{q - 1} = Γ^{'} (g) ⧵ Γ^{'} (\bar{g})$ and $e_{g} (σ^{k} (γ_{1})) = (σ^{k} (u_{1}), σ^{k} (v_{1}))$ for $k = 1, 2, \dots, q - 1$ ;
iv.: $e_{g} (σ^{k} (γ_{1})) \neq (σ^{k} (u_{1}), σ^{k} (v_{1}))$ for $k = q, q + 1, \dots, p$ .

Proof.

For one direction, suppose there exists a phenotype-preserving isomorphism

π

from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. According to Lemmas A1–A3, we see that

π

is a phenotype-preserving automorphism of a subgraph

\bar{g}

of g where

Γ^{'} (g) ⧵ Γ^{'} (\bar{g}) = {\{π^{k} (γ_{1})\}}_{k = 1}^{q - 1}

. Taking

σ = π

satisfies all four conditions.

For the other direction, we show that such a phenotype-preserving automorphism

σ

is also a phenotype-preserving isomorphism from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. First, regarding any locus

γ

and non-self-loop source–target pair

(u, v)

such that

(γ, u, v) \notin {\{(σ^{k} (γ_{1}), σ^{k} (u_{1}), σ^{k} (v_{1}))\}}_{k = 1}^{p}

, we have

e_{g_{2}^{'}} (σ (γ)) = (σ (u), σ (v))

if and only if

e_{g_{1}^{'}} (γ) = (u, v)

because

σ

is an automorphism of

\bar{g}

and

γ_{1}

,

γ_{2}

, and

Γ^{'} (g) ⧵ Γ^{'} (\bar{g})

are already excluded. Second, for

k = 0, 1, \dots, q - 1

, the conditions i.–iii. ensure that

e_{g_{1}^{'}} (σ^{k} (γ_{1})) = (σ^{k} (u_{1}), σ^{k} (v_{1}))

and

e_{g_{2}^{'}} (σ^{k + 1} (γ_{1})) = (σ^{k + 1} (u_{1}), σ^{k + 1} (v_{1}))

. Lastly, for

k = q, q + 1, \dots, p - 1

, the conditions i., ii., and iv. indicate

e_{g_{1}^{'}} (σ^{k} (γ_{1})) \neq (σ^{k} (u_{1}), σ^{k} (v_{1}))

and

e_{g_{2}^{'}} (σ^{k + 1} (γ_{1})) \neq (σ^{k + 1} (u_{1}), σ^{k + 1} (v_{1}))

. □
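
For small GRNs, membership in the same equivalence class can also be checked by exhaustive search, which is useful for sanity-checking the conditions of Theorem A3 on toy cases. The sketch below rests on two simplifying assumptions that are ours rather than the text's: stimuli and fitness-relevant proteins are held fixed, and, because loci may be permuted freely, only the multiset of non-self-loop source–target pairs needs to match. The GRN encoding and the function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def non_self_loops(grn):
    """List the (source, target) pairs of all non-self-loop edges."""
    return [(u, v) for (u, v) in grn.values() if u != v]


def same_class_brute_force(grn_1, grn_2, dummy_proteins):
    """Search for a permutation of dummy proteins that maps the
    non-self-loop edges of `grn_1` onto those of `grn_2`.

    The cost grows factorially with the number of dummy proteins, so this
    is only meant for toy examples.
    """
    dummy = sorted(dummy_proteins)
    target = Counter(non_self_loops(grn_2))
    edges_1 = non_self_loops(grn_1)
    for image in permutations(dummy):
        tau = dict(zip(dummy, image))
        # Nodes outside `dummy` (stimuli, fitness-relevant proteins) stay fixed.
        mapped = Counter((tau.get(u, u), tau.get(v, v)) for u, v in edges_1)
        if mapped == target:
            return True
    return False


# Three mutants of the GRN {"A": ("s","x"), "B": ("a","a"), "C": ("b","b")}.
g1 = {"A": ("s", "x"), "B": ("x", "a"), "C": ("b", "b")}
g2 = {"A": ("s", "x"), "B": ("a", "a"), "C": ("x", "b")}
g3 = {"A": ("s", "x"), "B": ("a", "x"), "C": ("b", "b")}
```

In this toy case `g1` and `g2` are matched by swapping the dummy proteins `a` and `b`, whereas `g3` reverses the direction of the new edge and cannot be matched to `g1`.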

Next, the necessary and sufficient condition of question (B) is described in the theorem below:

Theorem A4.

Let

g_{1}

and

g_{2}

belong to two different equivalence classes, and let

g_{1}^{'} \in M^{+} (g_{1})

and

g_{2}^{'} \in M^{+} (g_{2})

with the additional non-self-loop edge

e_{g_{1}^{'}} (γ_{1}) = (u_{1}, v_{1})

and

e_{g_{2}^{'}} (γ_{2}) = (u_{2}, v_{2})

, respectively.

g_{1}^{'}

and

g_{2}^{'}

belong to the same equivalence class if and only if there exist GRNs

{\tilde{g}}_{1}

and

g^{''}

such that

i.: ${\tilde{g}}_{1}$ and $g_{1}$ belong to the same equivalence class;
ii.: ${\tilde{g}}_{1}, g_{2} \in M^{+} (g^{''})$ ;
iii.: $g_{2}^{'} \in M^{+} ({\tilde{g}}_{1})$ .

Proof.

For one direction, suppose there exists a phenotype-preserving isomorphism

π

from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. Take

{\tilde{g}}_{1} = π (g_{1})

, i.e.,

e_{{\tilde{g}}_{1}} (π (γ)) = (π (u), π (v))

for each

γ \in Γ^{'} (g_{1})

with

e_{g_{1}} (γ) = (u, v)

, so

π

also becomes a phenotype-preserving isomorphism from

g_{1}

to

{\tilde{g}}_{1}

. Moreover, due to the isomorphism

π

, we note

g_{2}^{'} \in M^{+} ({\tilde{g}}_{1})

with the additional non-self-loop edge

e_{g_{2}^{'}} (π (γ_{1})) = (π (u_{1}), π (v_{1}))

. Let

g^{''}

be a GRN obtained by rewiring edges

π (γ_{1})

and

γ_{2}

from

g_{2}^{'}

to two arbitrary self-loops. We have

{\tilde{g}}_{1}, g_{2} \in M^{+} (g^{''})

with the additional non-self-loop edge

e_{{\tilde{g}}_{1}} (γ_{2}) = (u_{2}, v_{2})

and

e_{g_{2}} (π (γ_{1})) = (π (u_{1}), π (v_{1}))

, respectively. Observe that the GRNs

{\tilde{g}}_{1}

and

g^{''}

satisfy the conditions i.–iii.

For the other direction, suppose there exist GRNs

{\tilde{g}}_{1}

and

g^{''}

where the conditions i.–iii. hold. Let

π

be the phenotype-preserving isomorphism from

g_{1}

to

{\tilde{g}}_{1}

after self-loop removal such that the additional non-self-loop edge of

g_{2} \in M^{+} (g^{''})

satisfies

e_{g_{2}} (π (γ_{1})) = (π (u_{1}), π (v_{1}))

. Since

{\tilde{g}}_{1}, g_{2} \in M^{+} (g^{''})

,

g_{2}^{'} \in M^{+} (g_{2})

, and

g_{2}^{'} \in M^{+} ({\tilde{g}}_{1})

, we also have the additional non-self-loop edge in

g_{2}^{'} \in M^{+} ({\tilde{g}}_{1})

to follow

e_{g_{2}^{'}} (π (γ_{1})) = (π (u_{1}), π (v_{1}))

. Therefore,

π

is also a phenotype-preserving isomorphism from

g_{1}^{'}

to

g_{2}^{'}

after self-loop removal. □

Appendix D. Size of an Equivalence Class of GRNs

Here, given the representative GRN g of an equivalence class

θ

, we calculate the number of GRNs in

θ

in Equation (2). Observe that

| θ |

is proportional to (a) the number of GRNs to which there is a phenotype-preserving isomorphism from g, (b) the number of ways to arbitrarily allocate self-loops on g, and (c) the number of ways to arbitrarily rewire source nodes of edges among the stimuli

Ω_{0}

.

First, for part (a), if we temporarily ignore labels on the edges, the set of permutations over dummy proteins

Ω^{'}

is partitioned into groups of isomorphisms from g to different GRNs. In addition, the size of each group is exactly the number of automorphisms of g since the composition of an automorphism of g and an isomorphism from g to

g^{'}

also generates an isomorphism from g to

g^{'}

. Thus, there are in total

\frac{| Π^{'} |}{| Σ^{'} (g) |}

different GRNs isomorphic to g, where

Π^{'}

is the set of permutations over

Ω^{'}

, and

Σ^{'} (g)

is the set of automorphisms of g that only permute

Ω^{'}

.
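
The counting argument for part (a) can be verified by enumeration on a small graph. The following sketch, which ignores edge labels as in the argument above, lists all permutations of the dummy proteins, counts the distinct graphs they produce, and counts how many of them are automorphisms. The edge-list encoding and the function name are assumptions made for this illustration.

```python
from itertools import permutations
from math import factorial


def count_images_and_automorphisms(edges, dummy_proteins):
    """Return (number of distinct images, number of automorphisms) of the
    unlabeled edge list under all permutations of the dummy proteins."""
    dummy = sorted(dummy_proteins)
    # A sorted tuple acts as a canonical form and keeps parallel edges.
    original = tuple(sorted(edges))
    images = set()
    automorphisms = 0
    for image in permutations(dummy):
        tau = dict(zip(dummy, image))
        mapped = tuple(sorted((tau.get(u, u), tau.get(v, v)) for u, v in edges))
        images.add(mapped)
        if mapped == original:
            automorphisms += 1
    return len(images), automorphisms


# Stimulus "s" activates two of the three dummy proteins a, b, c.
n_images, n_aut = count_images_and_automorphisms(
    [("s", "a"), ("s", "b")], {"a", "b", "c"}
)
```

In this example there are 3! = 6 permutations, 2 of which are automorphisms (the identity and the swap of `a` and `b`), leaving 6 / 2 = 3 distinct graphs, in agreement with the ratio stated above.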

Second, for part (b), every possible allocation distributes

| Γ | - | Γ^{'} (g) |

self-loops over

| Ω ⧵ Ω_{0} |

nodes. Suppose that the labels of self-loops

Γ ⧵ Γ^{'} (g)

are given; we then have

n_{l} (| Γ ⧵ Γ^{'} (g) |) = | Ω ⧵ Ω_{0} |^{| Γ ⧵ Γ^{'} (g) |} .

Third, we write

k_{s} (g)

as the number of incident edges to stimuli

Ω_{0}

in g. Any possibility in part (c) chooses a source node among

Ω_{0}

for each of the

k_{s} (g)

incident edges. Provided that the labels of incident edges of the stimuli are already known, we have

m_{s} (g) = {| Ω_{0} |}^{k_{s} (g)} .

Combined, the size of the equivalence class

θ

becomes

\begin{aligned} | θ | &= \frac{| Γ |!}{| Γ ⧵ Γ^{'} (g) |! \, k_{s} (g)!} \, \frac{| Π^{'} |}{| Σ^{'} (g) |} \, n_{l} (| Γ ⧵ Γ^{'} (g) |) \, m_{s} (g) \\ &= \frac{| Γ |!}{| Γ ⧵ Γ^{'} (g) |! \, k_{s} (g)!} \, \frac{| Π^{'} |}{| Σ^{'} (g) |} \, | Ω ⧵ Ω_{0} |^{| Γ ⧵ Γ^{'} (g) |} \, {| Ω_{0} |}^{k_{s} (g)} \end{aligned}

where the first fraction represents first selecting combinations of

| Γ ⧵ Γ^{'} (g) |

and

k_{s} (g)

labels for self-loops and edges incident to stimuli and then permuting the remaining labels, which also contributes to different phenotype-preserving isomorphisms from g but was previously omitted.
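
For concreteness, the closed-form expression can be evaluated directly once the relevant counts are known. The sketch below assumes that the number of automorphisms |Σ'(g)| has been obtained separately (for example, by enumeration on a small GRN); the function and parameter names are hypothetical and merely mirror the symbols in the formula.

```python
from math import factorial


def equivalence_class_size(n_loci, n_self_loops, k_s, n_dummy,
                           n_automorphisms, n_non_stimulus, n_stimuli):
    """Evaluate the size of an equivalence class from plain counts.

    n_loci          : |Γ|
    n_self_loops    : |Γ ⧵ Γ'(g)|
    k_s             : k_s(g), the number of edges incident to stimuli
    n_dummy         : |Ω'|, so that |Π'| = n_dummy!
    n_automorphisms : |Σ'(g)|
    n_non_stimulus  : |Ω ⧵ Ω_0|
    n_stimuli       : |Ω_0|
    """
    if n_self_loops + k_s > n_loci:
        raise ValueError("self-loops and stimulus edges cannot exceed |Γ|")
    # First fraction: ways to assign locus labels.
    label_factor = factorial(n_loci) // (factorial(n_self_loops) * factorial(k_s))
    # Second fraction: distinct GRNs isomorphic to g.
    copies, remainder = divmod(factorial(n_dummy), n_automorphisms)
    if remainder:
        raise ValueError("the automorphism count must divide |Ω'|!")
    # Third and fourth factors: self-loop allocations and stimulus rewirings.
    self_loop_allocations = n_non_stimulus ** n_self_loops
    stimulus_rewirings = n_stimuli ** k_s
    return label_factor * copies * self_loop_allocations * stimulus_rewirings


# Example counts: 4 loci, 1 self-loop, 1 stimulus edge, 2 dummy proteins,
# a trivial automorphism group, 3 non-stimulus nodes, and 1 stimulus.
size = equivalence_class_size(4, 1, 1, 2, 1, 3, 1)
```

With these example counts the four factors are 24, 2, 3, and 1, giving a class of 144 GRNs.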

References

Wright, S. The roles of mutation, inbreeding, crossbreeding, and selection in evolution. In Proceedings of the Sixth International Congress on Genetics, Ithaca, NY, USA, 24–31 August 1932; pp. 356–366.
De Visser, J.A.G.M.; Krug, J. Empirical fitness landscapes and the predictability of evolution. Nat. Rev. Genet. 2014, 15, 480–490.
Fragata, I.; Blanckaert, A.; Louro, M.A.D.; Liberles, D.A.; Bank, C. Evolution in the light of fitness landscape theory. Trends Ecol. Evol. 2019, 34, 69–82.
Szendro, I.G.; Schenk, M.F.; Franke, J.; Krug, J.; De Visser, J.A.G. Quantitative analyses of empirical fitness landscapes. J. Stat. Mech. Theory Exp. 2013, 2013, P01005.
Wagner, A. The Origins of Evolutionary Innovations: A Theory of Transformative Change in Living Systems; Oxford University Press: Oxford, UK, 2011.
Jain, K.; Krug, J. Deterministic and stochastic regimes of asexual evolution on rugged fitness landscapes. Genetics 2007, 175, 1275–1288.
Kryazhimskiy, S.; Tkačik, G.; Plotkin, J.B. The dynamics of adaptation on correlated fitness landscapes. Proc. Natl. Acad. Sci. USA 2009, 106, 18638–18643.
Draghi, J.A.; Parsons, T.L.; Wagner, G.P.; Plotkin, J.B. Mutational robustness can facilitate adaptation. Nature 2010, 463, 353–355.
Wu, N.C.; Dai, L.; Olson, C.A.; Lloyd-Smith, J.O.; Sun, R. Adaptation in protein fitness landscapes is facilitated by indirect paths. eLife 2016, 5, e16965.
Gavrilets, S. Fitness Landscapes and the Origin of Species; Princeton University Press: Princeton, NJ, USA, 2004.
Fraïsse, C.; Gunnarsson, P.A.; Roze, D.; Bierne, N.; Welch, J.J. The genetics of speciation: Insights from Fisher’s geometric model. Evolution 2016, 70, 1450–1464.
de Visser, J.A.G.; Park, S.C.; Krug, J. Exploring the effect of sex on empirical fitness landscapes. Am. Nat. 2009, 174, S15–S30.
Otto, S.P. The evolutionary enigma of sex. Am. Nat. 2009, 174, S1–S14.
Watson, R.A.; Weinreich, D.M.; Wakeley, J. Genome structure and the benefit of sex. Evol. Int. J. Org. Evol. 2011, 65, 523–536.
Orr, H.A. The genetic theory of adaptation: A brief history. Nat. Rev. Genet. 2005, 6, 119–127.
Lobkovsky, A.E.; Wolf, Y.I.; Koonin, E.V. Predictability of evolutionary trajectories in fitness landscapes. PLoS Comput. Biol. 2011, 7, e1002302.
Salverda, M.L.; Dellus, E.; Gorter, F.A.; Debets, A.J.; Van Der Oost, J.; Hoekstra, R.F.; Taw, D.S.; de Visser, J.A.G. Initial mutations direct alternative pathways of protein evolution. PLoS Genet. 2011, 7, e1001321.
Bank, C.; Matuszewski, S.; Hietpas, R.T.; Jensen, J.D. On the (un)predictability of a large intragenic fitness landscape. Proc. Natl. Acad. Sci. USA 2016, 113, 14085–14090.
Van Nimwegen, E.; Crutchfield, J.P.; Huynen, M. Neutral evolution of mutational robustness. Proc. Natl. Acad. Sci. USA 1999, 96, 9716–9720.
Van Nimwegen, E.; Crutchfield, J.P. Metastable evolutionary dynamics: Crossing fitness barriers or escaping via neutral paths? Bull. Math. Biol. 2000, 62, 799–848.
Wilke, C.O. Adaptive evolution on neutral networks. Bull. Math. Biol. 2001, 63, 715–730.
Smith, T.; Husbands, P.; O’Shea, M. Neutral networks and evolvability with complex genotype-phenotype mapping. In European Conference on Artificial Life; Springer: Berlin/Heidelberg, Germany, 2001; pp. 272–281.
Aguirre, J.; Catalán, P.; Cuesta, J.A.; Manrubia, S. On the networked architecture of genotype spaces and its critical effects on molecular evolution. Open Biol. 2018, 8, 180069.
Manrubia, S.; Cuesta, J.A.; Aguirre, J.; Ahnert, S.E.; Altenberg, L.; Cano, A.V.; Catalán, P.; Diaz-Uriarte, R.; Elena, S.F.; García-Martín, J.A.; et al. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys. Life Rev. 2021, 38, 55–106.
Tenaillon, O. The utility of Fisher’s geometric model in evolutionary genetics. Annu. Rev. Ecol. Evol. Syst. 2014, 45, 179–201.
Fisher, R.A. The Genetical Theory of Natural Selection; Oxford University Press: Oxford, UK, 1930.
Kingman, J.F.C. A simple model for the balance between selection and mutation. J. Appl. Probab. 1978, 15, 1–12.
Kauffman, S.; Levin, S. Towards a general theory of adaptive walks on rugged landscapes. J. Theor. Biol. 1987, 128, 11–45.
Kauffman, S.A.; Weinberger, E.D. The NK model of rugged fitness landscapes and its application to maturation of the immune response. J. Theor. Biol. 1989, 141, 211–245.
Aita, T.; Uchiyama, H.; Inaoka, T.; Nakajima, M.; Kokubo, T.; Husimi, Y. Analysis of a local fitness landscape with a model of the rough Mt. Fuji-type landscape: Application to prolyl endopeptidase and thermolysin. Biopolym. Orig. Res. Biomol. 2000, 54, 64–79.
Neidhart, J.; Szendro, I.G.; Krug, J. Adaptation in tunably rugged fitness landscapes: The rough Mount Fuji model. Genetics 2014, 198, 699–721.
Weinreich, D.M.; Delaney, N.F.; DePristo, M.A.; Hartl, D.L. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 2006, 312, 111–114.
Chou, H.-H.; Chiu, H.-C.; Delaney, N.F.; Segrè, D.; Marx, C.J. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science 2011, 332, 1190–1192.
Khan, A.I.; Dinh, D.M.; Schneider, D.; Lenski, R.E.; Cooper, T.F. Negative epistasis between beneficial mutations in an evolving bacterial population. Science 2011, 332, 1193–1196.
De Visser, J.A.G.M.; Hoekstra, R.F.; van den Ende, H. Test of interaction between genetic markers that affect fitness in Aspergillus niger. Evolution 1997, 51, 1499–1505.
Hall, D.W.; Agan, M.; Pope, S.C. Fitness epistasis among 6 biosynthetic loci in the budding yeast Saccharomyces cerevisiae. J. Hered. 2010, 101 (Suppl. S1), S75–S84.
Whitlock, M.C.; Bourguet, D. Factors affecting the genetic load in Drosophila: Synergistic epistasis and correlations among fitness components. Evolution 2000, 54, 1654–1660.
Hinkley, T.; Martins, J.; Chappey, C.; Haddad, M.; Stawiski, E.; Whitcomb, J.M.; Petropoulos, C.J.; Bonhoeffer, S. A systems analysis of mutational effects in HIV-1 protease and reverse transcriptase. Nat. Genet. 2011, 43, 487–489.
Kouyos, R.D.; Leventhal, G.E.; Hinkley, T.; Haddad, M.; Whitcomb, J.M.; Petropoulos, C.J.; Bonhoeffer, S. Exploring the complexity of the HIV-1 fitness landscape. PLoS Genet. 2012, 8, e1002551.
Hietpas, R.T.; Jensen, J.D.; Bolon, D.N.A. Experimental illumination of a fitness landscape. Proc. Natl. Acad. Sci. USA 2011, 108, 7896–7901.
Otwinowski, J.; Nemenman, I. Genotype to phenotype mapping and the fitness landscape of the E. coli lac promoter. PLoS ONE 2013, 8, e61570.
Sarkisyan, K.S.; Bolotin, D.A.; Meer, M.V.; Usmanova, D.R.; Mishin, A.S.; Sharonov, G.V.; Ivankov, D.N.; Bozhanova, N.G.; Baranov, M.S.; Soylemez, O.; et al. Local fitness landscape of the green fluorescent protein. Nature 2016, 533, 397–401.
Rogers, Z.N.; McFarland, C.D.; Winters, I.P.; Seoane, J.A.; Brady, J.J.; Yoon, S.; Curtis, C.; Petrov, D.A.; Winslow, M.M. Mapping the in vivo fitness landscape of lung adenocarcinoma tumor suppression in mice. Nat. Genet. 2018, 50, 483–486.
Watson, C.J.; Papula, A.L.; Poon, G.Y.P.; Wong, W.H.; Young, A.L.; Druley, T.E.; Fisher, D.S.; Blundell, J.R. The evolutionary dynamics and fitness landscape of clonal hematopoiesis. Science 2020, 367, 1449–1454.
Rowe, W.; Platt, M.; Wedge, D.C.; Day, P.J.; Kell, D.B.; Knowles, J. Analysis of a complete DNA–protein affinity landscape. J. R. Soc. Interface 2010, 7, 397–408.
Pitt, J.N.; Ferré-D’Amaré, A.R. Rapid construction of empirical RNA fitness landscapes. Science 2010, 330, 376–379.
Jiménez, J.I.; Xulvi-Brunet, R.; Campbell, G.W.; Turk-MacLeod, R.; Chen, I.A. Comprehensive experimental fitness landscape and evolutionary network for small RNA. Proc. Natl. Acad. Sci. USA 2013, 110, 14984–14989.
Li, C.; Qian, W.; Maclean, C.J.; Zhang, J. The fitness landscape of a tRNA gene. Science 2016, 352, 837–840.
Bendixsen, D.P.; Østman, B.; Hayden, E.J. Negative epistasis in experimental RNA fitness landscapes. J. Mol. Evol. 2017, 85, 159–168.
Aguilar-Rodríguez, J.; Payne, J.L.; Wagner, A. A thousand empirical adaptive landscapes and their navigability. Nat. Ecol. Evol. 2017, 1, 0045.
Martin, C.H. Context dependence in complex adaptive landscapes: Frequency and trait-dependent selection surfaces within an adaptive radiation of Caribbean pupfishes. Evolution 2016, 70, 1265–1282.
Boucher, J.I.; Cote, P.; Flynn, J.; Jiang, L.; Laban, A.; Mishra, P.; Roscoe, B.P.; Bolon, D.N.A. Viewing protein fitness landscapes through a next-gen lens. Genetics 2014, 198, 461–471.
Aita, T.; Iwakura, M.; Husimi, Y. A cross-section of the fitness landscape of dihydrofolate reductase. Protein Eng. 2001, 14, 633–638.
Carneiro, M.; Hartl, D.L. Adaptive landscapes and protein evolution. Proc. Natl. Acad. Sci. USA 2010, 107 (Suppl. S1), 1747–1751.
Poelwijk, F.J.; Tănase-Nicola, S.; Kiviet, D.J.; Tans, S.J. Reciprocal sign epistasis is a necessary condition for multi-peaked fitness landscapes. J. Theor. Biol. 2011, 272, 141–144.
Lozovsky, E.R.; Chookajorn, T.; Brown, K.M.; Imwong, M.; Shaw, P.J.; Kamchonwongpaisan, S.; Neafsey, D.E.; Weinreich, D.M.; Hartl, D.L. Stepwise acquisition of pyrimethamine resistance in the malaria parasite. Proc. Natl. Acad. Sci. USA 2009, 106, 12025–12030.
Lunzer, M.; Miller, S.P.; Felsheim, R.; Dean, A.M. The biochemical architecture of an ancient adaptive landscape. Science 2005, 310, 499–501.
Bridgham, J.T.; Carroll, S.M.; Thornton, J.W. Evolution of hormone-receptor complexity by molecular exploitation. Science 2006, 312, 97–101.
Poelwijk, F.J.; Kiviet, D.J.; Tans, S.J. Evolutionary potential of a duplicated repressor-operator pair: Simulating pathways using mutation data. PLoS Comput. Biol. 2006, 2, e58.
Poelwijk, F.J.; Kiviet, D.J.; Weinreich, D.M.; Tans, S.J. Empirical fitness landscapes reveal accessible evolutionary paths. Nature 2007, 445, 383–386.
Franke, J.; Klözer, A.; de Visser, J.A.G.; Krug, J. Evolutionary accessibility of mutational pathways. PLoS Comput. Biol. 2011, 7, e1002134.
Hegarty, P.; Martinsson, A. On the existence of accessible paths in various models of fitness landscapes. Ann. Appl. Probab. 2014, 24, 1375–1395.
Zagorski, M.; Burda, Z.; Waclaw, B. Beyond the hypercube: Evolutionary accessibility of fitness landscapes with realistic mutational networks. PLoS Comput. Biol. 2016, 12, e1005218.
Kell, D.B. Genotype–phenotype mapping: Genes as computer programs. Trends Genet. 2002, 18, 555–559.
Kondrashov, D.A.; Kondrashov, F.A. Topological features of rugged fitness landscapes in sequence space. Trends Genet. 2015, 31, 24–33.
Chan, H.S.; Bornberg-Bauer, E. Perspectives on protein evolution from simple exact models. Appl. Bioinform. 2002, 50, 121–144.
Cowperthwaite, M.C.; Economo, E.P.; Harcombe, W.R.; Miller, E.L.; Meyers, L.A. The ascent of the abundant: How mutational networks constrain evolution. PLoS Comput. Biol. 2008, 4, e1000110.
Stich, M.; Lázaro, E.; Manrubia, S.C. Phenotypic effect of mutations in evolving populations of RNA molecules. BMC Evol. Biol. 2010, 10, 46.
Palmer, M.E.; Moudgil, A.; Feldman, M.W. Long-term evolution is surprisingly predictable in lattice proteins. J. R. Soc. Interface 2013, 10, 20130026.
Bershtein, S.; Serohijos, A.W.R.; Shakhnovich, E.I. Bridging the physical scales in evolutionary biology: From protein sequence space to fitness of organisms and populations. Curr. Opin. Struct. Biol. 2017, 42, 31–40.
Perfeito, L.; Ghozzi, S.; Berg, J.; Schnetz, K.; Lässig, M. Nonlinear fitness landscape of a molecular pathway. PLoS Genet. 2011, 7, e1002160.
Chou, H.-H.; Delaney, N.F.; Draghi, J.A.; Marx, C.J. Mapping the fitness landscape of gene expression uncovers the cause of antagonism and sign epistasis between adaptive mutations. PLoS Genet. 2014, 10, e1004149.
Friedlander, T.; Prizak, R.; Barton, N.H.; Tkačik, G. Evolution of new regulatory functions on biophysically realistic fitness landscapes. Nat. Commun. 2017, 8, 216.
Cuypers, T.D.; Rutten, J.P.; Hogeweg, P. Evolution of evolvability and phenotypic plasticity in virtual cells. BMC Evol. Biol. 2017, 17, 60.
Yubero, P.; Manrubia, S.; Aguirre, J. The space of genotypes is a network of networks: Implications for evolutionary and extinction dynamics. Sci. Rep. 2017, 7, 13813.
Harmand, N.; Gallet, R.; Jabbour-Zahab, R.; Martin, G.; Lenormand, T. Fisher’s geometrical model and the mutational patterns of antibiotic resistance across dose gradients. Evolution 2017, 71, 23–37.
Yang, C.-H.; Scarpino, S.V. Reproductive barriers as a byproduct of gene network evolution. bioRxiv 2020.
Yang, C.-H.; Scarpino, S.V. The ensemble of gene regulatory networks at mutation-selection balance. bioRxiv 2021.
Ciliberti, S.; Martin, O.C.; Wagner, A. Innovation and robustness in complex regulatory gene networks. Proc. Natl. Acad. Sci. USA 2007, 104, 13591–13596.
Payne, J.L.; Wagner, A. Latent phenotypes pervade gene regulatory circuits. BMC Syst. Biol. 2014, 8, 64.
Godsil, C.D. Compact graphs and equitable partitions. Linear Algebra Its Appl. 1997, 255, 259–266.
Das, S.G.; Direito, S.O.L.; Waclaw, B.; Allen, R.J.; Krug, J. Predictable properties of fitness landscapes induced by adaptational tradeoffs. eLife 2020, 9, e55155.
Wagner, A. Neutralism and selectionism: A network-based reconciliation. Nat. Rev. Genet. 2008, 9, 965–974.
Bendixsen, D.P.; Collet, J.; Østman, B.; Hayden, E.J. Genotype network intersections promote evolutionary innovation. PLoS Biol. 2019, 17, e3000300.
Hunt, G.; Hopkins, M.J.; Lidgard, S. Simple versus complex models of trait evolution and stasis as a response to environmental change. Proc. Natl. Acad. Sci. USA 2015, 112, 4885–4890.
Heasley, L.R.; Sampaio, N.M.V.; Argueso, J.L. Systemic and rapid restructuring of the genome: A new perspective on punctuated equilibrium. Curr. Genet. 2020, 67, 57–63.
Aguilar-Rodríguez, J.; Peel, L.; Stella, M.; Wagner, A.; Payne, J.L. The architecture of an empirical genotype-phenotype map. Evolution 2018, 72, 1242–1260.
López-Presa, J.L.; Chiroque, L.F.; Anta, A.F. Novel techniques to speed up the computation of the automorphism group of a graph. J. Appl. Math. 2014, 2014, 934637.
Stoichev, S.D. New exact and heuristic algorithms for graph automorphism group and graph isomorphism. J. Exp. Algorithmics (JEA) 2019, 24, 1–27.
Battiston, F.; Cencetti, G.; Iacopini, I.; Latora, V.; Lucas, M.; Patania, A.; Young, J.-G.; Petri, G. Networks beyond pairwise interactions: Structure and dynamics. Phys. Rep. 2020, 874, 1–92.
Maheshwari, P.; Albert, R. A framework to find the logic backbone of a biological network. BMC Syst. Biol. 2017, 11, 122.
Yang, G.; Gómez Tejeda Zañudo, J.; Albert, R. Target control in logical models using the domain of influence of nodes. Front. Physiol. 2018, 9, 454.
Hu, T.; Tomassini, M.; Banzhaf, W. A network perspective on genotype–phenotype mapping in genetic programming. Genet. Program. Evolvable Mach. 2020, 21, 375–397.
Greenbury, S.F.; Louis, A.A.; Ahnert, S.E. The structure of genotype-phenotype maps makes fitness landscapes navigable. bioRxiv 2021.

Figure 1. Cartoon illustration of the pathway framework of GRNs adapted from [77,78]. Under our four simplified assumptions, a GRN (genotype) consists of a fixed number of proteins as nodes and a constant number of directed edges depicting the activator/product pairs of genes. The phenotype is modeled as the Boolean states of proteins (colored), which are determined by their reachability from the external stimulus (lightning icon).

Figure 2. Connectivity exists between all GRNs of the same phenotype. (a) Any GRN can be rewired/mutated into a “central” GRN (shown on the right). (b) A redundant edge (dark green) makes it feasible to turn any central GRN into another via edge rewiring. (c) There is a mutational trajectory between any GRNs of the same phenotype through the central GRNs.

Figure 3. Example of peripheral GRNs connecting two different phenotypes—a peripheral GRN

\hat{g}

of phenotype

y

in this example. There is a chain that triggers the presence state of proteins

Ω_{y}^{+} ⧵ Ω_{y^{'}}^{+} = \{3, 4\}

. However, the other peripheral GRN

{\hat{g}}^{'}

of phenotype

y^{'}

contains a chain of proteins

Ω_{y^{'}}^{+} ⧵ Ω_{y}^{+} = {5, 6}

.

\hat{g}

and

{\hat{g}}^{'}

are mutational neighbors since they only differ by rewiring the dark green edge, i.e., the first edge in either chain.

Figure 4. The genotype network has symmetry such that multiple GRNs have similar local neighborhoods, as we demonstrate in (a) since the corresponding GRNs only differ by exchanging the role of loci A and B. More formally, these GRNs constitute an equivalence class under phenotype-preserving automorphisms, which can be found by graphical operations of (b) permuting loci, (c) permuting dummy proteins (circles), (d) exchanging edges pointing from two different stimuli (squares), and (e) exchanging self-loops at two different nodes.

Figure 5. (a) As a minimal example, imagine an operation that rotates a geometric object 90 degrees clockwise. The rotation maps one object onto another (dashed arrows), and it leads to equivalence classes where objects are grouped by their symmetry under rotation (pink rectangles). (b) An automorphism of a graph is a permutation of nodes that retains the same graph. (c) Equivalence classes under graph automorphisms bring together nodes that have similar roles connection-wise in the graph.

Figure 6. (a) Consider a toy example genotype network of GRNs. (Here, we omit the exact content of GRNs.) Given the partition $\hat{\Theta}_{k}$, note that mega-nodes in a group (dashed orange rectangle) may share a different number of connections among other groups (blue shaded circles), and they are further divided to generate the next partition $\hat{\Theta}_{k+1}$. (b) Both the equivalence classes of GRNs and the stationary partition from our iterative procedure are equitable, e.g., each mega-node in group (1) has one connection among (1), another connection with (2), and none with other groups.
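
The refinement step described in this caption can be sketched as a generic partition refinement on an undirected graph. This is a minimal illustration under stated assumptions, not the article's bottom-up algorithm: nodes stand in for mega-nodes, a group is split whenever its members differ in their number of connections to any group, and the loop stops once no group splits, at which point the partition is equitable. The function name `refine_to_equitable` and the path-graph example are hypothetical.

```python
def refine_to_equitable(adjacency, partition):
    """Refine `partition` until it is equitable with respect to `adjacency`.

    adjacency: dict mapping each node to an iterable of its neighbors
    partition: list of disjoint sets that together cover all nodes
    """
    blocks = [set(b) for b in partition]
    while True:
        block_of = {v: i for i, b in enumerate(blocks) for v in b}
        new_blocks = []
        for block in blocks:
            # Group members by how many neighbors they have in each block.
            by_signature = {}
            for v in block:
                counts = [0] * len(blocks)
                for u in adjacency[v]:
                    counts[block_of[u]] += 1
                by_signature.setdefault(tuple(counts), set()).add(v)
            # Sort signatures so the output order is deterministic.
            new_blocks.extend(by_signature[s] for s in sorted(by_signature))
        # Refinement only ever splits blocks, so an unchanged count
        # means the partition is stationary.
        if len(new_blocks) == len(blocks):
            return new_blocks
        blocks = new_blocks

# Hypothetical example: a path graph 0 - 1 - 2 - 3, starting from one group.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
stationary = refine_to_equitable(path, [{0, 1, 2, 3}])
```

Starting from a single group, the end nodes and the inner nodes of the path separate after one pass, and a second pass confirms that every node in a group has the same number of connections to each group.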

Figure 7. Example partition coarser than the equivalence classes of GRNs. We run the proposed iterative procedure with $|\Gamma| = 3$ and $|\Omega_{0}| = |\Omega'| = |\hat{\Omega}| = 2$, where stimuli $\Omega_{0}$ are drawn as squares and the present/absent states of fitness-relevant proteins $\hat{\Omega}$ are colored orange/blue. (a) The number of GRNs and the number of isomorphism classes of GRNs in each group of the obtained partition $\hat{\Theta}$, where the dashed lines separate groups of different phenotypes, and (b) isomorphism classes of GRNs in a group.

Figure 8. Layering the GRNs by their number of non-self-loop edges. A GRN’s mutational neighbors must fall into the same or the adjacent layers. For ease of illustration, we only show the non-self-loop edges and neglect the protein states in GRNs.

Figure 9. Sufficient conditions that two $M^{+}$ neighbors belong to an equivalence class. For illustration purposes, we only show the dummy proteins and omit the protein states, edge labels, and self-loops in (a) three examples such that two $M^{+}$ neighbors of a GRN $g$ are isomorphic, and (b) an example where the $M^{+}$ neighbors $g_{1}'$ and $g_{2}'$ of GRNs $g_{1}$ and $g_{2}$ in different equivalence classes are isomorphic.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yang, C.-H.; Scarpino, S.V. A Family of Fitness Landscapes Modeled through Gene Regulatory Networks. Entropy 2022, 24, 622. https://0-doi-org.brum.beds.ac.uk/10.3390/e24050622

