We select four primary measures to characterize each geographic location under assessment. Each measure reflects a specific aspect of the critical infrastructure. The four are centrality measures, criticality measures, interdependence measures, and community measures. Each is explained in detail below.
3.1.1. Centrality Measures
Centrality measures are a well-established mechanism in graph theory for recognizing critical nodes. Graph centrality measures are used here to assess the relative importance of a node in a graph $G$. Several centrality methods exist, each with different features. Three of them are tested on each graph to determine the most critical nodes: degree, betweenness, and closeness centrality. The adopted model follows prior work [11], where we used three centrality measures as an index of the importance of the nodes within a specific network and then aggregated the normalized value of each metric into an overall weight for each zone. In this paper, we model each geographical site, such as a zip code or neighborhood, as a graph $G = (V, E)$, where $V$ is the set of vertices (network nodes) located in the examined geographical area and $E$ is the set of edges (links) connecting the nodes. Nodes, in this case, comprise all the critical infrastructure elements.
The geographic area in this work is modeled using the Nearest Neighbor Algorithm (NNA), one of the fundamental heuristics for the traveling salesman problem, in which the salesman begins at a random city and repeatedly visits the nearest unvisited city until all have been visited. The algorithm computes the Euclidean distance [12] from each point in a point pattern to its nearest neighbor (the nearest other point of the pattern). It thereby matches a location with its nearest $k$ neighbors in a multi-dimensional space. The aim of using the NNA is to connect the infrastructure nodes with virtual links that match the physical topology. According to [13], most power stations and cell towers are positioned close to the nearest hospital. In the power scenario, power transmission is expensive, which leads to positioning the power station near the hospital. For cell towers, however, positioning is mainly driven by increasing user coverage.
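The nearest-neighbor linking step can be illustrated with a minimal Python sketch. The function name `nearest_neighbor_edges`, the node names, and the planar coordinates are hypothetical and are not taken from the data used in this work; the sketch assumes coordinates that are already projected to a plane, since Euclidean distance on raw latitude and longitude is only an approximation.

```python
import math

def nearest_neighbor_edges(points):
    """Map each node to its nearest other node by Euclidean distance.

    `points` maps a node name to an (x, y) coordinate pair.
    """
    edges = {}
    for name, xy in points.items():
        best, best_dist = None, math.inf
        for other, other_xy in points.items():
            if other == name:
                continue  # a node is never its own neighbor
            d = math.dist(xy, other_xy)
            if d < best_dist:
                best, best_dist = other, d
        edges[name] = best
    return edges

# Hypothetical planar coordinates (illustrative only).
sites = {
    "hospital": (0.0, 0.0),
    "power": (1.0, 0.0),
    "water": (5.0, 0.0),
    "tower": (5.0, 1.0),
}
links = nearest_neighbor_edges(sites)
```

Each entry of `links` is one directed virtual link from a node to its nearest neighbor. The brute-force double loop is quadratic in the number of nodes, which is acceptable for a single zip code or neighborhood; a spatial index would be a natural replacement for larger areas.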
In this case, every CI node has a directed edge to the nearest CI node based on the real distance extracted from the geographic data, as shown in Figure 2. Counting both one-way and two-way connections, the following constraints are added to the graphs: (1) healthcare nodes only receive connections from the other infrastructures (water, energy, telecommunication); (2) healthcare nodes cannot provide any outgoing connection to other infrastructure nodes; (3) water, energy, and telecommunication nodes have two-way connections between them; (4) every node receives only one connection from the nearest infrastructure; (5) centrality measures are calculated considering both in- and out-degrees (i.e., on the undirected graph), since every link has an impact on both sides. Forming the graph in this way yields a realistic graph on which the required centrality measures are computed as follows.
Degree Centrality: In a graph, the degree of a node is the number of nodes to which it is directly connected. Accordingly, the degree $k_i$ of node $i$ is the number of edges attached to $i$. The degree centrality $C_D(i)$ of node $i$ is given by:

$$C_D(i) = \frac{k_i}{N - 1}$$

where $k_i$ is the total number of edges of node $i$, normalized by the maximum feasible degree (i.e., $N - 1$, with $N$ the number of nodes in the graph). The nodes with the highest $C_D(i)$ are regarded as the most important.
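As an illustration, degree centrality (the node degree divided by the maximum feasible degree, i.e., the number of nodes minus one) can be computed with the short sketch below. The adjacency-set representation and the example graph are assumptions made for this sketch only.

```python
def degree_centrality(adjacency):
    """Degree of each node divided by the maximum feasible degree (N - 1)."""
    n = len(adjacency)
    if n < 2:
        # A single node has no feasible neighbors.
        return {node: 0.0 for node in adjacency}
    return {node: len(neigh) / (n - 1) for node, neigh in adjacency.items()}

# Hypothetical undirected graph stored as node -> set of neighbors.
graph = {
    "hospital": {"power", "water", "tower"},
    "power": {"hospital", "water"},
    "water": {"hospital", "power"},
    "tower": {"hospital"},
}
scores = degree_centrality(graph)
```

In this toy graph the hospital is linked to every other node, so it attains the maximum possible score.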
Betweenness Centrality: This is another major benchmark for recognizing essential nodes in a complex network. The node and edge betweenness are defined as the number of shortest paths passing through a node or an edge. The higher the betweenness of a node, the more critical the node is [14]. The betweenness centrality $C_B(i)$ of a vertex $i$ can be measured as the fraction of shortest paths that pass through $i$. Hence, $C_B(i)$ can be formulated as:

$$C_B(i) = \sum_{s \neq i \neq t} \frac{\sigma_{st}(i)}{\sigma_{st}}$$

where $\sigma_{st}$ is the total number of shortest paths between node $s$ and node $t$, with $\sigma_{st}(i)$ being the number of these paths that pass through node $i$. Thus, the larger $C_B(i)$, the more significant the node.
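A minimal sketch of the betweenness computation is given below. It uses Brandes' algorithm, a standard technique that is not named in this work, and assumes an unweighted, undirected graph in which each unordered pair of endpoints is counted once and no further normalization is applied.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    Each unordered pair (s, t) is counted once; values are not normalized.
    """
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was visited from both of its endpoints.
    return {v: b / 2.0 for v, b in score.items()}

# Hypothetical path graph a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
bc = betweenness_centrality(path)
```

On the path graph, the two interior nodes each lie on the shortest paths of two endpoint pairs, while the end nodes lie on none.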
Closeness Centrality: In a connected graph, the closeness centrality $C_C(i)$ of node $i$ is the reciprocal of the average shortest-path distance between node $i$ and all other nodes in the graph. The closeness can be represented as:

$$C_C(i) = \frac{N - 1}{\sum_{j \neq i} d_{ij}}$$

where $d_{ij}$ is the length of the shortest path joining nodes $i$ and $j$, and $N$ is the number of nodes in the graph. The larger $C_C(i)$, the more centrally positioned the node is in the graph.
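The closeness of a single node can be sketched as follows. The sketch assumes an unweighted, connected graph (every edge has length one) and one common normalization, namely the number of other nodes divided by the sum of shortest-path distances; a graph weighted by real distances would need Dijkstra's algorithm in place of the breadth-first search.

```python
from collections import deque

def closeness_centrality(adjacency, node):
    """(N - 1) divided by the sum of shortest-path distances from `node`.

    Assumes an unweighted, connected graph (every edge has length 1).
    """
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(adjacency) - 1) / total if total > 0 else 0.0

# Hypothetical path graph a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
inner = closeness_centrality(path, "b")
outer = closeness_centrality(path, "a")
```

The interior node `b` scores higher than the end node `a`, matching the intuition that it sits closer to the rest of the graph.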
The cumulative centrality measure of a geographical location aggregates the normalized values of the three centrality measures above.
3.1.3. Interdependence Measures
Modeling the interdependence among critical infrastructures has been a considerable challenge due to the complexity of the connections. Relevant work on recognizing and modeling dependencies uses either sector-specific designs (e.g., gas lines, the electric grid, or ICT) or more comprehensive methods that apply across different models of CIs [15,16,17].
One approach to modeling the interdependency between interconnected critical infrastructures is the dependency risk approach. This study applies the dependency risk methodology of Kotzanikolaou et al. [18] to analyze cascading failures by identifying direct relationships between pairs of critical infrastructures, as assessed by critical infrastructure operators; in this study, we limit the analysis to first-order dependencies.
The interdependency between nodes in each area is formulated based on a risk dependency graph. The risk level of each dependency is formulated following an approach similar to [18], differing in its focus on first-order dependencies. Dependencies are visualized in a graph $G = (V, E)$ and applied to the graphs produced for each geographic zone by the NNA in the previous section, where $V$ is the set of nodes (infrastructures or elements) and $E$ is the set of edges (dependencies). In addition, the weight of each edge is the level of cascading-failure risk for the receiving infrastructure due to the dependency, on a predefined risk range 0, …, 1. Each dependency is represented as a directed edge from a node $CI_i$ to a node $CI_j$ and is assigned an impact value $I_{i,j}$ and a likelihood value $L_{i,j}$. The product of the impact and the likelihood gives the dependency risk $R_{i,j}$ directed to infrastructure $CI_j$ and caused by infrastructure $CI_i$, as follows:

$$R_{i,j} = L_{i,j} \cdot I_{i,j}$$
where $L_{i,j}$ is the likelihood, defined as the conditional degree of belief that $CI_j$ will become unavailable given the unavailability of $CI_i$. The likelihood $L_{i,j}$ can take one of the following qualitative values: VL, L, M, H, VH (from very low to very high). To simplify the calculation, as Table 1 shows, we assign to each likelihood value in the scale an implied probability range as follows: VL = [0–0.05], L = [0.05–0.25], M = [0.25–0.5], H = [0.5–0.75], and VH = [0.75–1]. In addition, $I_{i,j}$ denotes the impact, set as the qualitative impact rate that the infrastructure $CI_j$ will experience if the link becomes unavailable due to a disruptive failure in $CI_i$. Table 2 displays a scale from one to nine indicating the direct impact between $i$ and $j$.
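The risk of a single dependency (likelihood multiplied by impact) can be sketched as below. The probability ranges are those quoted in the text; the use of the range midpoint as the numeric likelihood is an assumption of this sketch, since the text does not state which point of each range enters the product, and the function name `dependency_risk` is hypothetical.

```python
# Probability ranges for the qualitative likelihood scale quoted in the text.
LIKELIHOOD_RANGES = {
    "VL": (0.0, 0.05),
    "L": (0.05, 0.25),
    "M": (0.25, 0.5),
    "H": (0.5, 0.75),
    "VH": (0.75, 1.0),
}

def dependency_risk(likelihood_label, impact):
    """Risk of one dependency: likelihood times impact.

    Assumption: the midpoint of the probability range stands in for the
    likelihood; the source does not say which point of the range is used.
    """
    if not 1 <= impact <= 9:
        raise ValueError("impact must lie on the 1-9 scale")
    low, high = LIKELIHOOD_RANGES[likelihood_label]
    return (low + high) / 2.0 * impact

# A high-likelihood dependency with impact 8: midpoint 0.625 times 8.
risk = dependency_risk("H", 8)
```

Choosing the upper bound of each range instead of the midpoint would give a more conservative estimate; only the one line computing the likelihood would change.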
Note that these values can be mapped to ranges of economic damage or any other related loss, such as the effect on system function or public trust. All values are specified based on the knowledge available to us; in addition, we assume that adopting such a model will persuade the relevant sectors to release the required information, yielding definitive results.
For a better understanding of the dependencies in interconnected infrastructures, they can also be visualized as graphs, as shown in Figure 3. An infrastructure is denoted as a circle, and each risk dependency as a directed edge $CI_i \rightarrow CI_j$, i.e., an outgoing risk from the infrastructure $CI_i$ to the infrastructure $CI_j$. The number on each link gives the level of the incoming risk (cascading failure) for the end node due to the dependency, based on a risk scale (1–9). For example, $CI_j$ has an incoming dependency risk $R_{i,j}$ from the infrastructure $CI_i$. The risk assessment reflects the likelihood that a disruptive event cascades from $CI_i$ to $CI_j$, as well as the societal impact caused to $CI_j$ in the case of such a failure at the source of the dependency (the infrastructure $CI_i$). After computing the risk dependency value of every edge, the edge values are totaled, giving an overall risk value per graph ($R_G$) as follows:

$$R_G = \sum_{(i,j) \in E} R_{i,j}$$
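A minimal sketch of this aggregation is shown below, assuming that the overall value is the plain sum of the edge risks. The edge list and its risk values are hypothetical, and both function names are introduced here for illustration only.

```python
from collections import defaultdict

def incoming_risk(edge_risks):
    """Total incoming dependency risk for each receiving node."""
    totals = defaultdict(float)
    for (_source, target), risk in edge_risks.items():
        totals[target] += risk
    return dict(totals)

def total_graph_risk(edge_risks):
    """Overall risk of one graph: the sum over all directed edges."""
    return sum(edge_risks.values())

# Hypothetical directed edges (source, target) -> dependency risk.
edge_risks = {
    ("power", "hospital"): 5.0,
    ("water", "hospital"): 2.5,
    ("power", "water"): 1.5,
}
per_node = incoming_risk(edge_risks)
overall = total_graph_risk(edge_risks)
```

The per-node totals show which infrastructure absorbs the most cascading risk, while `overall` gives a single figure per zone that can be compared across zones.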
3.1.4. Community Measures
Community measures comprise community-based indicators for every geographic zone, such as poverty level and population. Owing to data availability, six features were selected for this measure and aggregated into a single value that serves as the community rate of the zone, as follows:
The total risk value is assigned using the risk assessment matrix from earlier work [19]. It is computed per geographical region, so that every zone has a specific risk value based on several terrestrial factors such as flooding type and frequency. Electricity use is the total electricity use of the zone, reported in millions. The flooding level is taken from the most up-to-date map provided by DHS [4] and is categorized into three classes (>0.1, 0.02–0.1, <0.02). Energy consumption is the total for the zone in million British thermal units, a traditional unit of heat that is widely used for estimating energy consumption. Population is the total population of each zone, and poverty is likewise the poverty percentage of each zone.