# Advancements in the Statistical Study, Modeling, and Simulation of Microwave-Links in Cellular Backhaul Networks

School of Electrical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel

Author to whom correspondence should be addressed.

Received: 3 June 2018 / Revised: 25 June 2018 / Accepted: 26 June 2018 / Published: 28 June 2018

(This article belongs to the Special Issue Environmental Science and Technologies for the Management of Natural Ecosystems and the Sustainable Development of Urban Areas)

While the effect of rainfall and other environmental phenomena on a link budget in microwave wireless communication has been well studied for network design, it has usually been studied for each microwave-link separately. Recently, attenuation measured simultaneously in multiple microwave-links has been used for rainfall mapping over specific areas, and consequently, rain-induced attenuation fields can be constructed. Dedicated algorithms have been designed to relate attenuation in multiple microwave-links to the corresponding rain-field. Their performance depends significantly on the structure of the network. As the topology of cellular microwave networks (CMNs) is region-dependent, a general theory for its effect on performance can only be developed statistically. In this paper we study the statistical nature of CMNs and lay the groundwork for such models based on empirical results.

In cellular backhaul networks, cellular microwave-links (CMLs) serve as wireless channels, each connecting two base stations (BS). Figure 1 portrays a single CML. Each BS is equipped with a transmitting-receiving antenna. Giuli et al. [1,2] describe how such microwave channels can be utilized for rainfall mapping. Capitalizing on the use of CMLs in cellular communications, Messer et al. [3] suggested using commercial backhaul networks for environmental monitoring. Because the commercial network is already set up and functional, it offers a cheap and opportunistic approach to the problem of precipitation monitoring. In that framework, the precipitation field is modeled as the signal to be reconstructed, and the CMLs are modeled as random line projections that sample the signal and serve as the observed data. As the microwave signal propagates along the CML, it accumulates attenuation that is largely attributable to the moisture present in the air, thus providing a form of spatial sampling of the precipitation field. This physical phenomenon is described in [4] and elaborated on in [1]. In [5] it is shown that in order to derive a reconstruction method for a given sampled field, one must characterize the sampling scheme, i.e., the distribution of CMLs. To that end, we studied the design of a backhaul network.

When a cellular microwave network (CMN) is initially architected, the chosen topology of the network’s BS, and thus of its CMLs, is based on several considerations. One can divide those considerations into two types, micro and macro.

Micro factors are those that have a minor impact on the chosen position of a BS. For instance, after deciding to position a BS on a specific street block, one would consider micro factors when choosing one building’s rooftop over another’s. The macro factors, on the other hand, inform the CMN architect how many CMLs to deploy in an area, how to distribute them spatially, and how their lengths (i.e., the distances between linked BS) should vary. Insights regarding the design of backhaul networks can be found in [7]. We suggest that there is a random factor in the spread of CMLs. This claim, which is the basis of this paper, relies on both the micro and macro factors. However, in this paper we focus on the macro factors, as they impact the spatial distribution of CMLs globally. As will be demonstrated, the CMLs’ spatial distribution can be divided into distribution categories, each corresponding to a characteristic population density and topography.

As pointed out in [8], CMLs tend to be distributed in clusters. Their spatial distribution is therefore not only non-uniform; it also leaves regions with little or no coverage. We will make the case that the locations of these clusters correspond to population densities. Supporting this hypothesis provides an intuition for the macro factors that govern the CMLs’ distribution and volume.

We analyze CMLs in the context of distribution categories, namely urban (most dense), suburban, and rural (least dense) [9,10]. A connection between population density and prospective BS service capacity can also be found in [11]. Population volumes may be high enough to saturate BS service capacity, thus calling for the allocation of additional BS to share the load. Table 1 shows results of BS density studies.

BS densities carry over to CML densities. To illustrate this relationship, Figure 2 presents the four common CML topologies. In each of these, the number of edges is similar to the number of nodes, where the edges represent CMLs and the nodes represent BS.

Figure 2 and Table 1 suggest that the spatial distribution of CMLs is not, and cannot be, homogeneous. However, our study is based on the assumption that any given region can be partitioned into sub-regions, each homogeneous in the sense of CML density, meaning that all the CMLs in a sub-region have their positions drawn from the same uniform distribution. Here and throughout this paper, the position of a CML refers to the position of its midpoint. Based on [12], we suggest describing the CML distribution in any homogeneous region by three characteristics:

- Spatial density of the CMLs;
- Orientations of the CMLs;
- Lengths of the CMLs.

The rest of the paper is organized as follows: in Section 2 and Section 3 we study the CMLs’ statistical characteristics as listed above and suggest statistical models. Section 2 addresses the CMLs’ spatial density, and Section 3 addresses their orientations and lengths. Section 4 suggests a mathematical model for the relationship between CMLs’ lengths and their spatial density. Section 5 then combines the statistical derivations from previous sections to suggest a novel computational method for simulating sets of synthetic CMLs. Section 6 then concludes and discusses the results. The geographical regions analyzed here are described in the appendix. Note that as this paper is an extension of [13], its novelty is presented in Section 4 and Section 5.

In order to characterize the distributions of CMLs in a given region, we suggest partitioning the region into sub-regions, each with a spatially uniform CML density. Such sub-regions are expected to correspond to the common environmental terms: urban, suburban, and rural. Without such a partition into homogeneous regions, the CMLs would appear clustered, and their spatial distribution, no longer simply uniform, would have to be treated more carefully in order to evaluate reconstruction potential.

When partitioning a region, it is critical to maintain a nominal area size that is appropriate for capturing the relevant rain phenomena. Typical rain clouds over Israel tend to stretch over an area of up to 10 × 10 (km^{2}) [14]. We therefore recommend that each sub-region be at least this size. Accordingly, in this paper we examined regions that are indeed 10 × 10 (km^{2}).

The locations and volume of CMLs that were discussed in Section 2 characterize CMLs collectively. The attributes of an individual CML, namely its orientation and length, also significantly affect the attenuation measurement. Understanding all three factors allows for the 2-D modeling of CMLs with which this paper is concerned. If one were to model CMLs in 3-D, the variability of the CMLs’ heights would be a factor as well.

The study presented here regards CMLs in the state of Israel belonging to a single cellular provider, Cellcom (see Appendix A for further details). All CMLs are operating in the same frequency range, the K-band.

Results show that in any type of region studied, the CMLs’ orientation takes on any angle with equal probability. This means the direction of CMLs is distributed uniformly and is not at all correlated with the type of population density. Figure 3 portrays this conclusion. Moreover, the orientation is found to be statistically independent of the other factors studied here, the CML’s length and density.

Sendik [12] studied the distribution of CMLs as well. To do so, a map of Israel was partitioned into four parts based on latitude (as Israel stretches from latitude 29.5° N to 33.29° N). These four parts were used as sub-regions to study CMLs. However, these four regions were heterogeneous in their environmental types. Here, by isolating sub-regions of homogeneous environment types and then characterizing CML statistics by such types, we suggest a contribution to Sendik’s work on statistical modeling of CMLs in Israel.

The study of CMLs’ lengths yielded very different results from those for the orientations. Unlike orientations, CML lengths are not distributed uniformly; they follow an exponential distribution. Moreover, the lengths’ statistical characteristics depend on the type of environment. Figure 4 illustrates both findings.

The sample-means specified in Figure 4 are used for the fitting of an exponential distribution. For an exponential random variable, e.g., $T\sim \mathrm{Exp}(1/\theta )$, the probability density function is:

$${f}_{T}(t|\theta )=\begin{cases}\frac{1}{\theta}{e}^{-\frac{t}{\theta}} & t\ge 0\\ 0 & t<0\end{cases},$$

where $\theta =E[T]$. This presents a direct tie between the sample mean of the CMLs’ length and the exponential fitting. Table 2 presents a data summary for CMLs in Israel used in this study.
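The link between the sample mean and the exponential fit can be illustrated with a short sketch. It assumes the standard result that the sample mean is the maximum-likelihood estimate of $\theta$ for an exponential variable; the function names and the synthetic lengths are hypothetical and do not reproduce the measured data.

```python
import math
import random


def fit_exponential_mean(lengths_km):
    """Estimate theta as the sample mean (the ML estimate for Exp(1/theta))."""
    if not lengths_km:
        raise ValueError("at least one CML length is required")
    return sum(lengths_km) / len(lengths_km)


def exponential_pdf(t, theta):
    """Evaluate f_T(t | theta): exp(-t/theta)/theta for t >= 0, and 0 otherwise."""
    if t < 0:
        return 0.0
    return math.exp(-t / theta) / theta


# Synthetic lengths with a 2.5 km mean (illustrative only, not the measured data).
rng = random.Random(0)
synthetic_lengths_km = [rng.expovariate(1.0 / 2.5) for _ in range(20000)]
theta_hat = fit_exponential_mean(synthetic_lengths_km)
print(f"fitted theta = {theta_hat:.2f} km")
```

With enough samples the fitted value should land close to the mean used to draw them, which is the sense in which the sample mean alone determines the fitted density.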

The findings in Table 2 suggest that there may be an underlying relationship between the CMLs’ density and their mean length for a given region. Following that intuition, an analysis on a larger scale was performed. Given the set of CMLs over Israel, a set of 3626 sub-regions was generated by scanning the area of Israel with a moving window of varying dimensions. For every iteration, that is, for every one of the 3626 windows, the subset of CMLs that fell within the window’s bounds was recorded. Two features were computed for each such subset: the CMLs’ density and the CMLs’ mean length. Thus, 3626 observations of pairs {density, mean length} were obtained. Figure 5 presents the scatter plot of those observations in blue.
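A minimal sketch of such a moving-window scan is given below. The window sizes, the step, the rectangular region, and the synthetic CMLs are all assumptions made for illustration, since the text does not specify them; a CML is assigned to a window by its midpoint, following the position convention used in this paper.

```python
import random


def window_features(cmls, x0, y0, width, height):
    """Return (density, mean_length) for CMLs whose midpoint is in the window.

    cmls is a list of (mid_x_km, mid_y_km, length_km) tuples.
    Returns None when the window captures no CML.
    """
    inside = [length for (x, y, length) in cmls
              if x0 <= x < x0 + width and y0 <= y < y0 + height]
    if not inside:
        return None
    density = len(inside) / (width * height)   # CMLs per km^2
    mean_length = sum(inside) / len(inside)    # km
    return density, mean_length


def scan(cmls, region_w, region_h, window_sizes, step):
    """Slide every window size over the region and collect the observations."""
    observations = []
    for w, h in window_sizes:
        y0 = 0.0
        while y0 + h <= region_h:
            x0 = 0.0
            while x0 + w <= region_w:
                features = window_features(cmls, x0, y0, w, h)
                if features is not None:
                    observations.append(features)
                x0 += step
            y0 += step
    return observations


# Synthetic CMLs in a hypothetical 50 x 50 km region (not the measured data).
rng = random.Random(0)
synthetic_cmls = [(rng.uniform(0, 50), rng.uniform(0, 50), rng.expovariate(1 / 3.0))
                  for _ in range(500)]
observations = scan(synthetic_cmls, 50.0, 50.0, [(10.0, 10.0), (20.0, 20.0)], 5.0)
```

Each element of `observations` is one {density, mean length} pair of the kind plotted in the scatter plot.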

A non-linear empirical relationship is hinted at in the scatter plot, thus calling for non-linear modeling. With the wide range of non-linear models available, we were only interested in an analytical mathematical formula and not a black-box model. Thus, parametric non-linear regression was chosen.

The data set of 3626 observations was split randomly into two, holding out 20% (725) of the observations to be designated as the test set. These observations were not used for model selection or for model training. The test set was used to report model performance after the model was chosen and trained. The remaining 80% (2901 observations) were used for training and validating.

We conducted model selection for a variety of parametric formulas. We constrained the pool of models to those having up to two parameters. The motivation for this constraint was to present a simple model.

The following models were examined:

$${\widehat{d}}_{1}(l)=\frac{{a}_{1}}{l},\hspace{1em}{\widehat{d}}_{2}(l)=\frac{{a}_{1}}{{l}^{{a}_{2}}},\hspace{1em}{\widehat{d}}_{3}(l)=\frac{1}{{a}_{1}(l-{a}_{2})},\hspace{1em}{\widehat{d}}_{4}(l)=\frac{1}{{l}^{{a}_{1}}-{a}_{2}}.$$

Here $\widehat{d}$ is the estimator of $d$, being the CMLs’ density (CMLs/km^{2}), and it is noted as an explicit function of $l$, the CML’s mean length (km). ${a}_{1}$ and ${a}_{2}$ are constant coefficients to be optimized via non-linear regression. They are chosen to be those that minimize the mean squared error (MSE). For example, this is the optimization process for the third model:

$$\underset{{a}_{1},{a}_{2}}{\mathrm{min}}\left\{\mathrm{MSE}\right\}=\underset{{a}_{1},{a}_{2}}{\mathrm{min}}\left\{\frac{1}{I}{\displaystyle \sum _{i=1}^{I}{\left(d[i]-\frac{1}{{a}_{1}(l[i]-{a}_{2})}\right)}^{2}}\right\}.$$

Here, $I$ is the number of training observations.

We performed 5-fold cross-validation when training each of the four models. For each model, the validation MSE was obtained by averaging the MSEs of the five validation folds. The third model achieved the lowest averaged validation MSE and was therefore selected:

$$\widehat{d}(l)=\frac{1}{{a}_{1}(l-{a}_{2})}.$$

The coefficients were derived to be (rounded to two decimal places), ${a}_{1}=3,{a}_{2}=1.14$, yielding an averaged validation MSE = 0.221, and a test MSE = 0.229.
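The selection procedure can be sketched as follows. This is not the solver used in the study: as an assumption made to keep the example dependency-free, the first and third models are written as $c\cdot g(l,s)$, the scale $c$ is solved in closed form, and the shape parameter $s$ (here ${a}_{2}$) is chosen from a grid. All names and the synthetic data are hypothetical.

```python
import random


def fit_scaled(ls, ds, basis, shapes):
    """Fit d ~ c * basis(l, s): closed-form c per candidate s, keep the lowest MSE."""
    best = None
    for s in shapes:
        g = [basis(l, s) for l in ls]
        c = sum(d * gi for d, gi in zip(ds, g)) / sum(gi * gi for gi in g)
        err = sum((d - c * gi) ** 2 for d, gi in zip(ds, g)) / len(ds)
        if best is None or err < best[0]:
            best = (err, c, s)
    return best[1], best[2]


def mse(ls, ds, basis, c, s):
    """Mean squared error of the fitted model on the given observations."""
    return sum((d - c * basis(l, s)) ** 2 for l, d in zip(ls, ds)) / len(ds)


def cross_validate(ls, ds, basis, shapes, k=5, seed=0):
    """Average validation MSE over k folds."""
    idx = list(range(len(ls)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        c, s = fit_scaled([ls[i] for i in train], [ds[i] for i in train], basis, shapes)
        scores.append(mse([ls[i] for i in fold], [ds[i] for i in fold], basis, c, s))
    return sum(scores) / k


def model_1(l, s):
    """Basis of d = a1 / l (no shape parameter; c plays the role of a1)."""
    return 1.0 / l


def model_3(l, s):
    """Basis of d = 1 / (a1 * (l - a2)) with s = a2 and c = 1 / a1."""
    return 1.0 / (l - s)


# Synthetic observations drawn around the reported curve (not the measured data).
rng = random.Random(1)
ls = [rng.uniform(1.5, 10.0) for _ in range(400)]
ds = [1.0 / (3.0 * (l - 1.14)) + rng.gauss(0.0, 0.01) for l in ls]

a2_grid = [i * 0.01 for i in range(141)]  # candidates 0.00 .. 1.40, all below min(l)
cv_1 = cross_validate(ls, ds, model_1, [None])
cv_3 = cross_validate(ls, ds, model_3, a2_grid)
c_hat, a2_hat = fit_scaled(ls, ds, model_3, a2_grid)
a1_hat = 1.0 / c_hat
```

On data generated this way the third model should obtain the lower averaged validation MSE, and the recovered coefficients should sit near the values used to generate the data. The held-out test split described above is omitted for brevity.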

$$\widehat{d}(l)=\frac{1}{3(l-1.14)}\iff \widehat{l}(d)=\frac{1+3.42d}{3d}$$

The regressed model is plotted in Figure 5 in black.
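For convenience, the regressed model and its inverse can be wrapped in two small functions. This is only a sketch using the rounded coefficients reported above; the function names are hypothetical, and the guard on the input range reflects the observation that the formula diverges at $l={a}_{2}$ and turns negative below it.

```python
A1 = 3.0   # rounded coefficient a1 reported in the text
A2 = 1.14  # rounded coefficient a2 reported in the text


def density_from_mean_length(mean_length_km):
    """Estimated CML density (CMLs/km^2) for a given mean CML length (km)."""
    if mean_length_km <= A2:
        raise ValueError("the model is only meaningful for mean lengths above a2")
    return 1.0 / (A1 * (mean_length_km - A2))


def mean_length_from_density(density_per_km2):
    """Inverse relation: estimated mean CML length (km) for a given density."""
    if density_per_km2 <= 0:
        raise ValueError("density must be positive")
    return (1.0 + A1 * A2 * density_per_km2) / (A1 * density_per_km2)


d = density_from_mean_length(2.5)
l_back = mean_length_from_density(d)
```

Applying one function after the other returns the starting value, which is a quick check that the two forms of the equation agree.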

Capitalizing on the statistical derivations of the previous sections, we now introduce a novel method for synthesizing a data set of CMLs. The motivation for using computer-simulated CMLs rather than real-world CML data rests on the following advantages:

- It allows for a controlled study. Simulated CMLs allow one to account for every attribute they possess.
- It strengthens the integrity of the results. When deducing an outcome of an experiment, since the simulation of the CMLs is controlled, one can determine a clear set of assumptions under which the outcome holds.
- It provides statistical robustness of the results. When simulating CMLs, the number of CMLs is not limited, thus allowing one to utilize as many CMLs as necessary for the statistical experiment.
- It introduces a new experimental feature, sensitivity analysis. The computer simulation allows one to tweak the CMLs’ parameters and evaluate their effect.

In [5] we discuss the modeling of rain-field reconstruction at great length, but the modeling of CMLs is discussed only briefly. Here we elaborate on it.

The computational structure that represents a single CML is a 2-D array. The dimensions of the array represent the physical area of interest (e.g., a 10 × 10 (km^{2}) area), each array element represents a pixel, and thus the size of the array is directly derived from the chosen resolution. A pixel that the CML crosses has a positive value equal to the length of the overlap that the CML has with that pixel. A pixel that the CML does not cross is zeroed. Here we use the terms “pixel” and “array element” interchangeably. Figure 6 portrays how an array models a CML.
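The array representation described above can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the code used in the study: the CML is given by its two end points in km, pixels are squares of side `pixel_km`, rows index the y axis and columns the x axis, and any part of the CML outside the area is simply dropped. All names are hypothetical.

```python
import math


def rasterize_cml(x0, y0, x1, y1, n_side, pixel_km):
    """Return an n_side x n_side array (list of rows) of CML overlap lengths in km."""
    grid = [[0.0] * n_side for _ in range(n_side)]
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0.0:
        return grid
    # Collect the parameters t in [0, 1] at which the segment crosses a grid line.
    cuts = {0.0, 1.0}
    for start, delta in ((x0, dx), (y0, dy)):
        if delta != 0.0:
            low, high = sorted((start, start + delta))
            k = math.ceil(low / pixel_km)
            while k * pixel_km <= high:
                t = (k * pixel_km - start) / delta
                if 0.0 < t < 1.0:
                    cuts.add(t)
                k += 1
    ts = sorted(cuts)
    # Each piece between consecutive cuts lies in one pixel, found from its midpoint.
    for t_a, t_b in zip(ts, ts[1:]):
        t_mid = 0.5 * (t_a + t_b)
        col = math.floor((x0 + t_mid * dx) / pixel_km)
        row = math.floor((y0 + t_mid * dy) / pixel_km)
        if 0 <= row < n_side and 0 <= col < n_side:
            grid[row][col] += (t_b - t_a) * length
    return grid


# A 2 km horizontal CML on a 4 x 4 grid of 1 km pixels.
horizontal = rasterize_cml(0.5, 0.5, 2.5, 0.5, n_side=4, pixel_km=1.0)
# A diagonal CML crossing two pixels of a 2 x 2 grid.
diagonal = rasterize_cml(0.0, 0.0, 2.0, 2.0, n_side=2, pixel_km=1.0)
```

For a CML that lies entirely inside the area, the array entries sum to the CML's length, which gives a simple sanity check on the rasterization.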

Two key factors are to be determined when simulating CMLs. The first is the CMLs’ mean length, which is a physical parameter, and the second is the spatial resolution, translating to the number of pixels per area, which is not a physical feature but a computational one. Since a CML is represented as a numerical array, the spatial resolution dictates how many elements will be in that array. The higher the resolution, the more pixels the area is partitioned into.

It may be easy to understand why we do not proclaim the CMLs’ orientation to be a key factor. Since we have established that the orientation is completely random and distributed uniformly, there is no distribution parameter that needs to be pre-set to model it. What is not so straightforward is the reason spatial density of CMLs is not declared as a key factor. The answer is computational modularity. Sets of CMLs are generated such that they allow for all relevant spatial densities to be utilized in the experiments they mean to serve. All sets are generated with a sufficiently high number of CMLs that allows for the highest spatial density of CMLs. This way, when lower spatial densities are desired, smaller randomly selected subsets of the CMLs can be utilized. For instance, for an area of 10 × 10 (km^{2}), a set of 1000 CMLs is generated. The experiment of interest requires a density of 2 CMLs per km^{2}, thus 200 CMLs are randomly selected, and the fact that there is a “sufficiently high” number of CMLs allows for many Monte-Carlo iterations with different subsets of CMLs. Here “sufficiently high” refers to a number so high that it allows for the maximal number of CMLs to be randomly selected out of the set, multiple times. The contrary would be to limit the set to having “just enough” CMLs, thus allowing only one manner for selecting the maximal number of CMLs by simply selecting the entire set. The latter would not allow for repetitions of the experiment in a Monte-Carlo setting.
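The subset selection described above can be sketched in a few lines. The function name is hypothetical, and integer identifiers stand in for the generated CML objects.

```python
import random


def draw_subset(cml_set, area_km2, density_per_km2, rng):
    """Randomly select the number of CMLs that realizes the requested density."""
    n_needed = round(area_km2 * density_per_km2)
    if n_needed > len(cml_set):
        raise ValueError("the generated set is too small for this density")
    return rng.sample(cml_set, n_needed)


rng = random.Random(0)
cml_set = list(range(1000))      # stand-ins for 1000 generated CMLs
area_km2 = 10 * 10               # the 10 x 10 (km^2) region from the example

# Two Monte-Carlo iterations at 2 CMLs per km^2, each with its own random subset.
subset_a = draw_subset(cml_set, area_km2, 2, rng)
subset_b = draw_subset(cml_set, area_km2, 2, rng)
```

Because the set holds many more CMLs than any single draw needs, repeated draws yield different subsets, which is what makes the Monte-Carlo repetitions meaningful.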

Pre-set variables (i.e., variables that are constant throughout the run):

- Region Area: This is a physical parameter, e.g., 10 × 10 (km^{2}).
- Spatial Resolution, N: N is the number of pixels the area is partitioned into.
- Environment Type, Mean CML Length: As established in Section 3, simulating a different environment corresponds to choosing a different mean CML length. Values typically range from 1.5 to 10 (km).
- The Number of CMLs to Generate, n_{set}: Given the region area and the maximum spatial density desired for the experiment, let n_{set} be sufficiently high so as to allow for multiple random draws of the maximal number of CMLs.

The computation of the CMLs set:

- Generate an Array of n_{set} Objects, Each Being a CML Length Value: Each length is drawn randomly from an exponential distribution with the above pre-set mean length.
- Assign Each Object an Orientation: Each of the generated objects is assigned an angle drawn uniformly from $\left[0,\pi \right)$.
- Assign Each Object a Position: Each object’s beginning point is drawn uniformly in a square of dimensions $\sqrt{N}\times \sqrt{N}$. Then, the end point is defined by drawing a line based on the length and orientation of the object. Note that in this step, each CML object is defined by “continuous”, i.e., not discrete, measures.
- Calculate Quantized Pixel Values: In order to suit the discrete model, the CML is represented as an array. For each CML object, partition the $\sqrt{N}\times \sqrt{N}$ area into $\sqrt{N}\times \sqrt{N}$ squares, each representing a pixel. Set to 0 every pixel that the CML does not pass through. For a pixel that the CML does pass through, assign a positive value equal to the physical length of the CML’s overlap with the pixel’s region (i.e., overlapping with the square). See Figure 6 for a graphical description.
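The first three steps above can be sketched as follows. This is a hedged illustration rather than the code used in the study: it works in kilometers instead of pixel units (so the square has side `side_km`), it does not handle end points that fall outside the region, and it leaves out the quantization step. The names `Cml` and `generate_cml_set` are hypothetical.

```python
import math
import random
from collections import namedtuple

Cml = namedtuple("Cml", "x0 y0 x1 y1 length angle")


def generate_cml_set(n_set, mean_length_km, side_km, seed=None):
    """Draw n_set synthetic CMLs with continuous (non-quantized) geometry."""
    rng = random.Random(seed)
    cmls = []
    for _ in range(n_set):
        # Step 1: exponential length with the pre-set mean.
        length = rng.expovariate(1.0 / mean_length_km)
        # Step 2: orientation drawn uniformly from [0, pi).
        angle = rng.random() * math.pi
        # Step 3: uniform beginning point; the end point follows from length and angle.
        x0 = rng.uniform(0.0, side_km)
        y0 = rng.uniform(0.0, side_km)
        x1 = x0 + length * math.cos(angle)
        y1 = y0 + length * math.sin(angle)
        cmls.append(Cml(x0, y0, x1, y1, length, angle))
    return cmls


# 5000 synthetic CMLs with a 2.5 km mean length in a 10 x 10 (km^2) region.
synthetic_set = generate_cml_set(5000, mean_length_km=2.5, side_km=10.0, seed=1)
sample_mean_km = sum(c.length for c in synthetic_set) / len(synthetic_set)
```

Changing `mean_length_km` is the only adjustment needed to simulate a different environment type, in line with the pre-set variables listed earlier.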

Figure 7 portrays the density, lengths, and orientations of a synthetic CML set.

CMLs possess three random features: spatial distribution, orientation, and length. All three were addressed in this study and empirical conclusions were derived. The simplest of the three is the orientation, which appears to be statistically independent of the other two features and distributed uniformly across all angles [0°, 180°). The CMLs’ spatial distribution was empirically found to depend on the environment type in the sense of population density. The denser the population is in the observed region, the denser the network is. This finding, which relates population density to network density, supports prior studies regarding backhaul BS densities [10]. The CMLs’ lengths were found to suit an exponential random variable. Moreover, the CMLs’ mean length, $l$, which we claim corresponds to the distribution parameter $\theta $ (see Equation (1)), was found to depend on population density as well. The denser the population, the shorter the CMLs. Thus, by association, statistical dependence is suggested between the CMLs’ lengths and their spatial density. Such dependence was modeled using a non-linear model.

The ability to apply statistical models to CMLs provides a much-needed foundation for the study of CML-based precipitation monitoring. Through these models one may design reconstruction algorithms engineered for the nature of these random projections. Moreover, as presented in Section 5, these statistical models allow for the computational simulation of CMLs.

Conceptualization, L.G. and H.M.; Formal Analysis, L.G.; Funding Acquisition, H.M. and L.G.; Methodology, L.G.; Software, L.G.; Validation, L.G.; Investigation, H.M. and L.G.; Writing Original Draft Preparation, L.G.; Visualization, L.G.; Supervision, H.M.

This research was funded in part by the German Research Foundation through the Integrating Microwave Link Data for Analysis of Precipitation in Complex Terrain: Theoretical Aspects and Hydrometeorological Applications project, grant number 04340365168, and in part by the Israel Water Authority on behalf of the Israeli Government, grant number 4500963260.

We deeply thank our research team members in Tel Aviv University, and especially Jonatan Ostrometzky for his fruitful cooperation and discussions. We are thankful to the Israel Water Authority for supporting and funding this research on behalf of the Israeli government. This research is also related to the German Research Foundation (DFG), the Integrating CML data for the Analysis of Precipitation in complex terrains project. We thank our friends in the Israeli cellular providers: Cellcom, Pelephone, and PHI who provided data. In Pelephone, N. Dvela, A. Hival and Y. Shachar. In Cellcom, E. Levi, Y. Koriat, B. Bar, and I. Alexandrovitz. In PHI, Y. Bar Asher, O. Tzur, Y. Sebton, A. Polikar, and O. Borukhov.

The authors declare no conflicts of interest.

This section provides descriptions for the regions analyzed to derive CML statistics. All CMLs belong to a single cellular provider, Cellcom, and are dated to January 2013. Figure A1 presents the distribution of these CMLs. As Table 2 specifies eight sub-regions, their coordinates are specified in Table A1, Table A2 and Table A3.

Geographic Boundary | Tel Aviv | Jerusalem | Haifa |
---|---|---|---|
Min. latitude coordinate | 32.013 | 31.74 | 32.765 |
Max. latitude coordinate | 32.096 | 31.81 | 32.825 |
Min. longitude coordinate | 34.776 | 35.175 | 34.985 |
Max. longitude coordinate | 34.8739 | 35.235 | 35.075 |

Geographic Boundary | Hasharon | Caesarea | Nazareth |
---|---|---|---|
Min. latitude coordinate | 32.15 | 32.41 | 32.615 |
Max. latitude coordinate | 32.3 | 32.52 | 32.732 |
Min. longitude coordinate | 34.83 | 34.91 | 35.224 |
Max. longitude coordinate | 34.98 | 35.04 | 35.374 |

Geographic Boundary | Top North Area | Kseifa Area |
---|---|---|
Min. latitude coordinate | 32.7 | 31.4 |
Max. latitude coordinate | 33.09 | 31.008 |
Min. longitude coordinate | 35.15 | 35.26 |
Max. longitude coordinate | 35.82 | 34.905 |

1. Giuli, D.; Toccafondi, A.; Gentili, G.B.; Freni, A. Tomographic reconstruction of rainfall fields through microwave attenuation measurements. J. Appl. Meteorol. **1991**, 30, 1323–1340.
2. Giuli, D.; Facheris, L.; Tanelli, S. Microwave tomographic inversion technique based on stochastic approach for rainfall fields monitoring. IEEE Trans. Geosci. Remote Sens. **1999**, 37, 2536–2555.
3. Messer, H.; Zinevich, A.; Alpert, P. Environmental monitoring by wireless communication networks. Science **2006**, 312, 713.
4. Olsen, R.L.; Rogers, D.V.; Hodge, D. The aR^{b} relation in the calculation of rain attenuation. IEEE Trans. Antennas Propag. **1978**, 26, 318–329.
5. Gazit, L.; Messer, H. Sufficient Conditions for Reconstructing 2-D Rainfall Maps. IEEE Trans. Geosci. Remote Sens. **2018**, 1–10.
6. Gazit, L.; Messer, H. Rain-Mapping through Compressed Sensing: Reconstruction Criteria Relating Image Sparsity, Resolution, and Random-Observations. Master’s Thesis, Tel-Aviv University, Tel Aviv, Israel, 2016.
7. Electronic Communications Committee (ECC). ECC Report 82. Compatibility Study for UMTS Operating within the GSM 900 and GSM 1800 Frequency Bands; Technical Report; Electronic Communications Committee: Copenhagen, Denmark, 2006.
8. Messer, H.; Gazit, L. From cellular networks to the garden hose: Advances in rainfall monitoring via cellular power measurements. In Proceedings of the 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Washington, DC, USA, 7–9 December 2016.
9. Zhang, J.; Wang, W.; Zhang, X.; Huang, Y.; Su, Z.; Liu, Z. Base stations from current mobile cellular networks: Measurement, spatial modeling and analysis. In Proceedings of the 2013 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), Shanghai, China, 7–10 April 2013.
10. Zhou, Y.; Zhao, Z.; Louët, Y.; Ying, Q.; Li, R.; Zhou, X.; Chen, X.; Zhang, H. Large-scale spatial distribution identification of base stations in cellular networks. IEEE Access **2015**, 3, 2987–2999.
11. Amaldi, E.; Capone, A.; Malucelli, F. Planning UMTS base station location: Optimization models with power control and algorithms. IEEE Trans. Wirel. Commun. **2003**, 2, 939–952.
12. Sendik, O. On the Coverage and Reconstructability of 2D Functions Sampled by Arbitrary Line Projections with an Application to Rain Field Mapping. Master’s Thesis, Tel-Aviv University, Tel Aviv, Israel, July 2013.
13. Gazit, L.; Messer, H. Microwave-Links in Cellular Backhaul Networks: Statistical Studying and Modeling. 2017. Available online: https://cest.gnest.org/sites/default/files/presentation_file_list/cest2017_00282_oral_paper.pdf (accessed on 3 June 2018).
14. Karklinsky, M.; Morin, E. Spatial characteristics of radar-derived convective rain cells over southern Israel. Meteorol. Z. **2006**, 15, 513–520.

Region | Area (km^{2}) | BS Amount | BS Density (1/km^{2}) |
---|---|---|---|
Most Dense City | 60 × 40 | 6251 | 2.604 |
Second Densest City | 30 × 50 | 1911 | 1.274 |
Third Densest City | 40 × 40 | 977 | 0.611 |
Rural | 200 × 200 | 12,691 | 0.317 |

Environment Type | Region | Area (km^{2}) | CMLs Amount | CMLs Density (1/km^{2}) | CMLs Mean Length (km) |
---|---|---|---|---|---|
- | All of Israel | 22,770 | 3624 | 0.16 | 3.54 |
Urban | Tel-Aviv | 85.18 | 264 | 3.1 | 1.48 |
Urban | Jerusalem | 44.16 | 141 | 3.2 | 1.26 |
Urban | Haifa | 56.14 | 159 | 2.8 | 1.58 |
Sub-urban | Hasharon | 235.54 | 124 | 0.53 | 2.5 |
Sub-urban | Caesarea Area | 149.27 | 77 | 0.52 | 2.3 |
Sub-urban | Nazareth Area | 182.78 | 89 | 0.49 | 2.4 |
Rural | Top North Israel | 2718.74 | 278 | 0.1 | 4.7 |
Rural | Kseifa Area | 1474.73 | 69 | 0.05 | 8.07 |

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).