Flash-Flood Susceptibility Assessment Using Multi-Criteria Decision Making and Machine Learning Supported by Remote Sensing and GIS Techniques

Costache, Romulus; Pham, Quoc Bao; Sharifi, Ehsan; Linh, Nguyen Thi Thuy; Abba, S.I.; Vojtek, Matej; Vojteková, Jana; Nhi, Pham Thi Thao; Khoi, Dao Nguyen

doi:10.3390/rs12010106



¹ Research Institute of the University of Bucharest, 90-92 Sos. Panduri, 5th District, 050663 Bucharest, Romania

² National Institute of Hydrology and Water Management, 97E Sos. București-Ploiești, 1st District, 013686 Bucharest, Romania

³ Department of Hydraulic and Ocean Engineering, National Cheng-Kung University, Tainan 701, Taiwan

⁴ Department of Meteorology and Geophysics, University of Vienna, 1090 Vienna, Austria

⁵ Faculty of Water Resource Engineering, Thuyloi University, Hanoi 100000, Vietnam

⁶ Department of Physical Planning Development, Yusuf Maitama Sule University, Kano 700231, Nigeria

⁷ Department of Geography and Regional Development, Faculty of Natural Sciences, Constantine the Philosopher University in Nitra, Trieda A. Hlinku 1, 94974 Nitra, Slovakia

⁸ Institute of Research and Development, Duy Tan University, Danang 550000, Vietnam

⁹ Faculty of Environment, University of Science, Vietnam National University Ho Chi Minh City, Ho Chi Minh City 700000, Vietnam

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Remote Sens. 2020, 12(1), 106; https://0-doi-org.brum.beds.ac.uk/10.3390/rs12010106

Submission received: 30 November 2019 / Revised: 23 December 2019 / Accepted: 24 December 2019 / Published: 27 December 2019

Abstract

Given the significant increase in the negative effects of flash-floods worldwide, the main goal of this research is to evaluate the power of the Analytical Hierarchy Process (AHP), k-Nearest Neighbor (kNN) and K-Star (KS) algorithms and their ensembles in flash-flood susceptibility mapping. To train the two stand-alone models and their ensembles, in the first stage the areas affected in the past by torrential phenomena are identified using remote sensing techniques. Approximately 70% of these areas are used as a training data set along with 10 flash-flood predictors. It should be remarked that the remote sensing techniques play a crucial role in obtaining eight out of 10 flash-flood conditioning factors. The predictive capability of the predictors is evaluated through the Information Gain Ratio (IGR) method. As expected, the slope angle proves to be the factor with the highest predictive capability. The application of the AHP model implies the construction of ten pair-wise comparison matrices for calculating the normalized weights of each flash-flood predictor. The computed weights are used as input data in the kNN–AHP and KS–AHP ensemble models for calculating the Flash-Flood Potential Index (FFPI). The FFPI is also determined through the kNN and KS stand-alone models. The performance of the models is evaluated using statistical metrics (i.e., sensitivity, specificity and accuracy), while the validation of the results is done by constructing the Receiver Operating Characteristics (ROC) Curve and Area Under Curve (AUC) values and by calculating the density of torrential pixels within FFPI classes. Overall, the best performance is obtained by the kNN–AHP ensemble model.

Keywords:

flash-flood potential index; K-Star; k-Nearest Neighbor; analytical hierarchy process; Prahova river catchment; machine learning


1. Introduction

The climatic changes and land-use changes represent the leading causes of the growth of the negative effects of flash-floods worldwide and, thus, flash-floods have become one of the most damaging natural phenomena [1]. Frequently, these risk phenomena generate material damage and loss of life at international levels [2]. Globally, the number of casualties due to flash-floods between 1996 and 2015 is estimated at 150,061 [3]. Therefore, the study of the susceptibility to this natural hazard has an increasing weight in the work of present researchers [4].

According to Elkhrachy [5], a flash-flood is the quick formation of a flood, caused by heavy rainfall that lasts only a few hours and produces relatively high peak discharges. Flash-floods occur because heavy rainfall generates active surface runoff on slopes. According to Jacinto et al. [6], the flash-flood susceptibility represents the propensity of an area, given by its physical-geographical characteristics, to be affected by flash-flooding. According to Khosravi et al. [7], the flash-flood susceptibility should play an essential role in the effective flash-flood management of basins. Furthermore, Youssef et al. [8] maintain that flash-flood forecast and warning activity should also be based on accurate identification of susceptible areas.

Regarding flash-flood warning and assessment, the Flash Flood Guidance (FFG) is considered one of the effective methods [9]. The FFG can be described as the spatio-temporal and uniform distribution of the rainfall depth with a specified duration over a basin, which is just enough to produce minor flooding at the basin outlet [9]. This method can be performed statistically or applying geomorphological principles [10]. It has been widely used as an operational method for the quick assessment of localized threats of flash-flooding for large areas. Particularly, it is performed by comparing the observed (real-time) or predicted rainfall to the computed rainfall depth of the same-duration [11]. The FFG method provides useful information for the operational practice of flash-flood early warnings. The details of this method are given, for example, by Carpenter et al. [12], Georgakakos [13], or Ntelekos et al. [14], who also studied the uncertainties in the FFG and its components.

Other methods try to incorporate hydrologic and/or hydraulic models for (flash-)flood modeling [15]. However, these models are generally time-consuming and require very detailed and accurate datasets represented mainly by the high-resolution digital elevation model, design rainfall, roughness parameters or design discharges. Alternately, the flash-flood susceptibility models can be used in large areas, such as in an entire river basin or a whole country. Moreover, these models take into account the areas affected by the past flash-floods as the dependent variable, as well as many independent variables (predictors) which influence the flash-flooding, for example, morphometric factors (e.g., slope angle), lithology, hydrological soil groups, land use/land cover and the like. These input datasets can be incorporated into specific models to derive the areas susceptible to flash-floods.

The current trend in flash-flood susceptibility assessment for large areas is mainly focused on using machine learning, bivariate statistics, and multi-criteria decision-making algorithms. Accordingly, the preferred machine learning algorithms include artificial neural networks (ANN) [16,17], decision trees (DT) [7,18], support vector machine (SVM) [19,20], and logistic regression (LR) [21]. Among the bivariate statistical models, the following are notable: frequency ratio (FR) [22], weights of evidence (WoE) [23], statistical index (SI) [24], certainty factor (CF) [1] and index of entropy (IE) [25]. Regarding multi-criteria decision making (MCDM), the following techniques are frequently used: Analytical Hierarchy Process (AHP) [26], VIseKriterijumska Optimizacija I Kompromisno Resenje (VIKOR) [27] and Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) [28].

Since the ensembles of machine learning with bivariate statistics and MCDM methods are considered more accurate than stand-alone models [29,30], their use in the research becomes increasingly more frequent. An example in which the machine learning was coupled with multi-criteria decision making is the attempt of Choubin et al. [31] who computed the flood susceptibility across the Khiyav–Chai watershed in Iran by using decision trees, a support vector machine, multi-criteria decision making and their ensemble. Tehrany et al. [32] carried out a flood susceptibility study on the Kelantan river basin in Malaysia highlighting that the hybrid model, created by the combination of frequency ratio and SVM, outperformed the DT stand-alone model.

Moreover, during the last years, researchers tried to increase the performance of machine learning algorithms with different optimization algorithms. Among these attempts is the case study realized by Ahmadlou et al. [33] in which the optimization of the adaptive neuro-fuzzy inference system (ANFIS) was performed by two metaheuristic algorithms to estimate the flood susceptibility. Another example is given by Bui et al. [34] who demonstrated that an extreme learning machine (ELM) optimized with particle swarm optimization (PSO) outperformed the ANN, SVM and DT models in terms of flash-flood susceptibility assessment.

Tien Bui et al. [35] highlighted the increase, during the last years, in the number of studies that use remote sensing in combination with many machine learning models to compute flood susceptibility maps. The high importance of remote sensing is due to the fact that these techniques play an essential role in the creation of input datasets for machine learning models. This fact is highlighted also by Wang et al. [36] who mentioned that the remote sensing images are a reliable source for flood inventory procedure.

Accordingly, the aim of this study is to define and calculate the Flash-Flood Potential Index (FFPI) employing two stand-alone models and their ensembles. The k-Nearest Neighbor (kNN) and lazy K-Star (KS) algorithms are used as stand-alone models, while the ensemble methods are represented by the integration of the AHP technique with both the kNN and KS models. It is worth remarking that a high percentage of the input datasets in these models are derived through remote sensing (RS) techniques and geographic information systems (GIS). Assessment of the models’ performance is carried out through several statistical measures. Regarding the validation of the FFPI maps, the ROC Curves and the share of torrential pixels in each FFPI class are used. The upper and middle part of the Prahova river basin in Romania is selected as the study area since it is considered one of the regions most affected by flash-flooding [4]. The novelty of this study, i.e., its difference with respect to previous similar studies, lies in the application of the K-Star (KS) model as well as the two ensemble models (kNN–AHP and KS–AHP), which are used for the first time for flash-flood or, in general, flood susceptibility mapping.

2. Study Area

The location of the upper and middle parts of the Prahova river basin is in the south-central part of Romania. The study area is defined by the parallels of 45°32′21.85″N and 45°00′23.33″N and the meridians of 25°27′32.00″E and 26°27′10.19″E (Figure 1). The total area of the selected basin part is 2600 km².

The study area is characterized by elevations between 128 m and 2505 m on the highest mountain peaks. It spreads over both the Carpathian region and the Subcarpathian region and is characterized by a mean slope angle of 12.43° [37]. The lithology of the study area contains hard rocks, flysch and sedimentary rocks such as sand, gravel and loess. The amount of precipitation in the Subcarpathian region is approximately 600 mm/year, while in the Carpathian region it increases to 1200 mm/year. During the summer, heavy rainfalls frequently generate severe flash-flood events. Land-use also substantially influences the surface runoff genesis. Thus, the built-up surfaces that create favorable conditions for flash-flood occurrence are spread over approximately 10% of the selected basin area. Forested areas represent 48.2%. Regarding the soil characteristics, the hydrological soil group A covers 801 km², hydrological soil group B has an area of 727 km² and the hydrological soil group D covers 682 km². It should be noted that the hydrological soil group D, which has the highest potential in the genesis of surface runoff, covers around 25% of the selected basin area.

3. Data

Given the fact that this study aims at the mapping of areas that are highly susceptible to flash-flooding, only the geospatial data were used for the analysis. Particularly, it should be mentioned that the RS and GIS were applied to process the input dataset used in the present research.

3.1. Inventory of Torrential Areas

The inventory of torrential areas is mandatory to correctly assess the relationship between the selected influencing factors and flash-flood susceptibility [37]. Based on the work of Costache and Zaharia [38], the surfaces, such as ravines and gullies, were used for analysis, representing the actual torrential microforms. The remote sensing techniques were involved in this stage of the study and the aerial imagery from the Google Earth application was used to delineate the areas affected by torrential processes within the study area (Figure 1). The validation of these areas was performed by field surveys using a professional GNSS device. Finally, a total surface of 260 km², representing torrential areas and containing about 289,000 pixels, was delineated (Figure 1). This surface represents approximately 10% of the entire territory covered by the study area (2600 km²).

3.2. Flash-Flood Conditioning Factors

To compute the FFPI for the study area, ten flash-flood variables were selected. Seven morphometric factors were derived from the digital elevation model (DEM), and the other three factors were obtained from vector databases. The DEM for the study area was generated from the Shuttle Radar Topography Mission (SRTM), which has a spatial resolution of 30 m, using RS techniques [39]. The DEM included in the SRTM database was obtained based on measurements carried out through the Synthetic Aperture Radar (SAR) interferometry [40]. Based on the DEM, the following variables were derived: slope angle, aspect, plan curvature, profile curvature, convergence index, topographic position index (TPI) and topographic wetness index (TWI). Thus, the application of RS techniques was crucial for obtaining an accurate DEM and, consequently, seven flash-flood conditioning factors. Moreover, the RS techniques, namely the supervised classification method, made a vital contribution to the delimitation of land-use categories. These land-use categories were further used to obtain the Curve Number, which is one of the most critical flash-flood predictors. The flash-flood predictors, along with their influence on runoff generation, are described in the following text.

Lithology has an important influence on surface runoff genesis because this factor controls the water infiltration rate. The lithological classes for the research area were extracted from the digital version of the Geological Map of Romania (1:200,000). Around 40% of the study area is underlain by marls, flysch and conglomerates. The lithological map was subsequently converted to a 30 m raster (Figure 2a). Regarding the convergence index (Ci), its importance for the computation of flash-flood potential was highlighted in many of the previous studies [38,41,42,43]. The river valleys are highlighted by the negative values of this morphometric parameter, while the positive ones indicate the interfluvial zones. The Ci values were classified into the following five classes: (−96)–(−3), (−3)–(−2), (−2)–(−1), (−1)–0 and 0–95 (Figure 2b). Profile curvature differentiates the areas prone to runoff (negative values) from those (positive values) which are less susceptible to surface runoff [43]. More than 50% of the torrential pixels belong to the class between −8.2 and 0 (Figure 2c). Curve Number (CN) (Figure 2d) is widely used to estimate the direct runoff based on a given rainfall event. For a specific territory, the CN values, which range from 0 to 100, are derived based on the land-use categories and hydrological soil groups (HSG) [37,44,45,46]. The HSG control the infiltration rate, the soil moisture and surface runoff, while the land-use has a high influence on surface runoff due to the different Manning coefficients [47,48]. The HSG were derived from the vector format of the Soil Map of Romania (1:200,000). Moreover, the land-use/land-cover was derived from the CORINE Land Cover (2018) database. It should be mentioned that the CORINE Land Cover dataset was obtained by the supervised classification of Sentinel-2 and Landsat 8 images. Using the natural breaks (Jenks) classification, the resulting CN values were classified as follows: 0–43, 43–52, 52–69, 69–80, 80–89, 88–100 (Figure 2d). The values between 43 and 80 cover 70% of the total basin area and correspond to approximately 45% of the torrential areas. Plan curvature is a flash-flood predictor which highlights the surfaces characterized by convergent or divergent runoff. Thus, this morphometric factor is very useful in analyzing the areas susceptible to surface runoff. The highest share (58%) of the study area corresponds to the plan curvature values from 0.1 to 0.5 (Figure 2e). The Modified Fournier Index (MFI), another flash-flood conditioning factor, describes the spatial variability of rainfall intensity within a specific region [37,49]. The following Equation (1) can be used to calculate the MFI values [50]:

$$MFI = \sum_{i=1}^{12} \frac{P_{i}^{2}}{P} \tag{1}$$

where $P_{i}$ is the monthly average amount of precipitation for month $i$ in mm, and $P$ is the average annual rainfall.
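As an illustration of Equation (1), the following minimal Python sketch computes the MFI for a single location. The function name and the monthly values are hypothetical and are not taken from the study, where the index was computed per raster cell from interpolated precipitation; the sketch also assumes that the annual amount $P$ is the sum of the twelve monthly averages.

```python
def modified_fournier_index(monthly_precip_mm):
    """Return MFI = sum(P_i**2) / P for twelve monthly precipitation averages (mm)."""
    if len(monthly_precip_mm) != 12:
        raise ValueError("expected 12 monthly values")
    # Assumption: the annual amount P is the sum of the monthly averages.
    annual = sum(monthly_precip_mm)
    if annual <= 0:
        raise ValueError("annual precipitation must be positive")
    return sum(p * p for p in monthly_precip_mm) / annual


# Hypothetical station record (mm per month), for illustration only.
example_station = [35, 30, 38, 55, 85, 100, 90, 70, 50, 45, 42, 40]
mfi_value = modified_fournier_index(example_station)
```

For perfectly uniform rainfall the index equals $P/12$; the more the precipitation is concentrated in a few months, the higher the value becomes.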

The average amount of precipitation from 23 meteorological stations located inside and around the study area was used (Figure 1). Based on the work of Kourgialas and Karatzas [50], the spatial variability of the precipitation was modeled with the use of Spline interpolation. Finally, the MFI values were derived by using the precipitation amount in raster format and Equation (1). These values were reclassified into five classes using the natural breaks (Jenks) algorithm: 53.56–57.43, 57.44–62.83, 62.84–68.69, 68.7–75.14, 75.15–83.46 (Figure 2f). The class between 53.76 and 57.43 accounts for around 28% of the study area. The highest percentage of torrential pixels (36.6%) is included in the MFI class between 68.7 and 75.14. According to the vast majority of studies [48,51,52,53,54,55,56], slope angle has the highest impact on the generation of surface runoff. This morphometric flash-flood predictor was derived from the DEM (Figure 3a). The following classes were established for slope angle: 0–3°, 3–7°, 7–15°, 15–25° and >25° [49]. Although the slope class between 15° and 25° contains almost 40% of the torrential areas, the highest density of torrential pixels was recorded in the class higher than 25°. Thus, the slope class higher than 25°, which accounts for 7.08% of the total study area, includes 20.7% of the total surface of torrential areas (Figure 4). This situation can be explained by the high velocity of water runoff on slopes steeper than 25°. The torrential relief microforms are also present in a high percentage (32.8%) on slopes between 7° and 15° because the gravitational force is high enough to generate rapid surface runoff.

The values of TPI denote the elevation difference between a grid cell and the neighboring cells [57]. The resulting TPI values were classified as follows: (−34.82)–(−2.08); (−2.08)–(−0.84); (−0.83)–0.14; 0.15–1.12; 1.13–27.96. The highest share (45%) is represented by the TPI class from (−0.83) to 0.14 (Figure 3b). TWI is an important runoff predictor which was computed based on Equation (2) [58]:

$$TWI = \ln\left(\frac{\alpha}{\tan \beta}\right) \tag{2}$$

where $\alpha$ is the cumulative upslope area draining through a point (per unit contour length), and $\beta$ is the slope angle at the point.
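Equation (2) can be sketched for a single grid cell as follows. The function name and the `min_slope_deg` floor are assumptions introduced here to avoid division by zero on flat cells; the source does not state how flat cells were handled.

```python
import math


def topographic_wetness_index(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Return ln(alpha / tan(beta)) for one cell.

    specific_catchment_area: upslope area per unit contour length (alpha).
    slope_deg: slope angle beta in degrees.
    min_slope_deg: hypothetical floor applied to flat cells (an assumption).
    """
    if specific_catchment_area <= 0:
        raise ValueError("specific catchment area must be positive")
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area / math.tan(beta))


# For the same contributing area, a gentle slope yields a higher index than a steep one.
twi_gentle = topographic_wetness_index(100.0, 5.0)
twi_steep = topographic_wetness_index(100.0, 30.0)
```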

The classification of the TWI values is as follows: −9.7–4.5; 4.6–8.4; 8.5–12; 13–15; 16–25 (Figure 3c). The largest areas correspond to the medium TWI class (8.5–12) which has a total share of 29% (Figure 4).

Aspect is a flash-flood predictor, derived from the DEM, whose values were divided into ten classes. Slope aspect has an indirect role in surface runoff due to its influence on other factors such as rainfall regime, soil humidity, and solar radiation. With a share of 15%, eastern and south-western slopes occupy the most extensive surfaces (Figure 3d). Eastern surfaces contain approximately 20% of the total torrential pixels (Figure 4).

4. Background of the Employed Algorithms

4.1. Analytical Hierarchy Process (AHP)

The AHP belongs to multi-criteria decision-making methods that solve complex problems in a simpler way [59]. The AHP model is based on the active participation of decision-makers within the entire methodological workflow [60]. Many studies applied this method for the evaluation of susceptibility to other natural hazards [26,36,52,61,62]. By using this method, an unstructured problem is broken into many components. The workflow applied in the AHP model can be described within the following five main steps [63]:

(i) Defining objectives and dividing the unstructured problem into its components.

(ii) Determining the detailed criteria and alternatives.

(iii) Creation of the pair-wise comparison matrix, which is constructed based upon the expert opinion regarding the influence of each factor or factor class/category to flash-flood occurrence. The relative value of a factor can range from 1 to 9 when that factor is more important than another and conversely from 1/2 to 1/9 when that factor is less important than another [64].

(iv) Computing the relative weights of each criterion (flash-flood predictors) using the eigenvalue technique.

(v) Computing the consistency ratio ($CR$) using Equations (3) and (4).

$$CI = \frac{\lambda_{\max} - n}{n - 1} \tag{3}$$

where $CI$ is the consistency index, $\lambda_{\max}$ is the largest eigenvalue of the comparison matrix, and $n$ is the number of flash-flood predictors or the number of factor classes/categories.

$$CR = \frac{CI}{RI} \tag{4}$$

where $RI$ is the random consistency index, which depends on the number of factors included in the comparison matrix [65].

When the $CR$ value is less than 0.1, the comparison can be considered consistent [65].
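Steps (iv) and (v) can be sketched in Python as shown below. This is a minimal illustration and not the spreadsheet workflow used in the study: the principal eigenvector is approximated by power iteration, the `RANDOM_INDEX` table holds the random consistency index values commonly cited from Saaty, and the comparison matrix is hypothetical.

```python
# Random consistency index by matrix size n (values commonly cited from Saaty).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24,
                7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix, iterations=100):
    """Return (weights, lambda_max, CR) for a reciprocal pair-wise comparison matrix."""
    n = len(matrix)
    if n not in RANDOM_INDEX:
        raise ValueError("this sketch supports matrices of size 3 to 10")
    weights = [1.0 / n] * n
    # Power iteration converges to the normalized principal eigenvector.
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)   # Equation (3)
    cr = ci / RANDOM_INDEX[n]         # Equation (4)
    return weights, lambda_max, cr


# Hypothetical judgments for three classes of one predictor (illustration only).
comparison = [[1.0, 3.0, 5.0],
              [1 / 3, 1.0, 2.0],
              [1 / 5, 1 / 2, 1.0]]
weights, lambda_max, cr = ahp_weights(comparison)
```

A matrix whose judgments are perfectly consistent returns $\lambda_{\max} = n$ and therefore $CR = 0$; the comparison is accepted when `cr` stays below 0.1.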

4.2. k-Nearest Neighbor (kNN)

According to Marjanovic et al. [66], this simple machine learning algorithm classifies a point in n-dimensional input space according to the classes of its k closest neighboring elements in the training dataset. Therefore, this method is used especially in predictive analysis [67]. Basically, the kNN algorithm assumes that elements located near each other share the same characteristics [68]. This algorithm uses a simple voting system to attribute, to a spatial object, the class value that is most common among the neighboring instances [66]. The neighbors are identified according to a distance metric between the new spatial object and the training instances. Usually, the Euclidean distance is used, calculated as follows (Equation (5)) [69]:

$$d(x, z) = \sqrt{(x_{1} - z_{1})^{2} + (x_{2} - z_{2})^{2} + \dots + (x_{n} - z_{n})^{2}} = \sqrt{\sum_{i=1}^{n} (x_{i} - z_{i})^{2}} \tag{5}$$

where $x_{i}$ is the $i$-th coordinate of point $x$ and $z_{i}$ is the $i$-th coordinate of point $z$.

Other distance measures such as Manhattan, Chebyshev, and Hamming can be more suitable for other model settings.

Regarding the present study, the kNN algorithm estimated the conditional probability for each class of flash-flood predictor ($X$) to belong to the torrential class ($y = 1$) or non-torrential class ($y = 0$). The conditional probability can be determined using Equation (6) [70]:

$$P(y = j \mid X = x) = \frac{1}{k} \sum_{i \in T} I(y^{(i)} = j) \tag{6}$$

where $i$ is a torrential or non-torrential pixel within the training dataset $T$, $I(\cdot)$ is the indicator function, and $k$ is the number of torrential or non-torrential pixels located in the proximity of each class of flash-flood predictors.

The $k$ value is a positive integer equal to the number of neighbors used to calculate the new class value for a specific spatial object (Figure 5). Determining an optimal $k$ value is therefore critical, because it considerably influences the results of the kNN model and the accuracy of the flash-flood susceptibility map. A low value of $k$ may lead to a large prediction variance, while a large value of $k$ may cause a large model bias.
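The distance in Equation (5) and the class probability in Equation (6) can be sketched as follows. The sketch reads the sum in Equation (6) as running over the $k$ nearest training samples, which is the usual kNN formulation; the function names and the toy samples are hypothetical.

```python
import math


def euclidean(x, z):
    """Euclidean distance between two points of equal dimension (Equation (5))."""
    return math.sqrt(sum((xi - zi) ** 2 for xi, zi in zip(x, z)))


def knn_class_probability(train_X, train_y, query, k, label=1):
    """Share of the k nearest training samples that carry `label` (Equation (6))."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must lie between 1 and the number of training samples")
    order = sorted(range(len(train_X)), key=lambda i: euclidean(train_X[i], query))
    nearest = order[:k]
    return sum(1 for i in nearest if train_y[i] == label) / k


# Hypothetical normalized predictor values; 1 = torrential, 0 = non-torrential.
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
           (0.80, 0.90), (0.90, 0.80), (0.85, 0.95)]
train_y = [0, 0, 0, 1, 1, 1]
p_torrential = knn_class_probability(train_X, train_y, (0.8, 0.8), k=3)
```

With a small `k` the estimate follows the closest samples only, while `k` equal to the training-set size returns the overall class share regardless of the query point, which illustrates the variance and bias trade-off described above.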

4.3. Lazy K-Star (KS) Algorithm

To the best of our knowledge, the presented approach represents the first attempt to determine the susceptibility to a specific natural hazard through the use of a lazy K-Star (KS) algorithm. The KS algorithm is an instance-based learning algorithm which creates explicitly the shortest string that connects two instances using the Kolmogorov distance [71]. The probability function used in the KS algorithm is the total probability of all paths from an instance $a$ to an instance $b$ and is defined as follows (Equation (7)):

$$P^{*}(b/a) = \sum_{t \in p : t(a) = b} P(t) \tag{7}$$

where $P^{*}$ is the probability function and $t$ is a transformation taken from the predefined set of transformations $T$.

Thus, the $KS$ function is written as follows (Equation (8)):

$$KS(b/a) = -\log_{2} P^{*}(b/a) \tag{8}$$

Although it is not strictly a distance function, the KS function has the following properties, highlighted in Equations (9) and (10) [72]:

$$KS(b/a) \geq 0 \tag{9}$$

$$KS(c/b) + KS(b/a) \geq KS(c/a) \tag{10}$$

The implementation of the KS model requires, at the beginning of the training process, a correct estimation of optimal parameters for X₀ (real numbers) and s (symbolic attributes) [73]. For each type of parameter, the probability distribution includes a number of instances varying from 1 to $N$. The number of instances is equal to $N$ if all the instances are equally weighted [73]. Different functions of conditional probability ($P^{*}$) can be used to estimate this number based on Equation (11) [71]:

$$n_{0} \leq \frac{\left(\sum_{b} P^{*}(b/a)\right)^{2}}{\sum_{b} P^{*}(b/a)^{2}} \leq N \tag{11}$$

where $N$ is the total number of training instances and $n_{0}$ is the number of training instances at the smallest distance from $a$.
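The middle term of Equation (11) can be read as an effective number of instances. The short sketch below evaluates it for hypothetical probability values; the function name is an assumption, and the probabilities are not produced by an actual K-Star model.

```python
def effective_instance_count(probabilities):
    """Middle term of Equation (11): (sum of P*)^2 divided by the sum of P*^2."""
    total = sum(probabilities)
    squares = sum(p * p for p in probabilities)
    if squares == 0:
        raise ValueError("at least one probability must be non-zero")
    return total * total / squares


# Hypothetical transformation probabilities from one instance a to four instances b.
equal_weights = effective_instance_count([0.25, 0.25, 0.25, 0.25])
one_dominant = effective_instance_count([1.0, 0.0, 0.0, 0.0])
mixed = effective_instance_count([0.7, 0.1, 0.1, 0.1])
```

Equally weighted instances give the upper bound $N$ (here 4), a single dominant instance gives 1, and intermediate weightings fall in between.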

The main advantage of a KS algorithm is represented by its high effectiveness when it works with large data sets and, also, by the fact that this model is robust to noisy training data [74].

To compute the FFPI values using the KS algorithm, Weka 3.9 software [75] was used.

5. Proposed Methodology for Predicting Flash-Flood Potential

The entire methodological workflow is graphically illustrated in Figure 6. As can be observed, the flash-flood susceptibility was computed through two stand-alone and two ensemble methods.

5.1. Establishment of Flash-Flood Database

During the first step of the methodological workflow, the flash-flood database for the study area was established using the ArcGIS 10.5 software. The database used in this study includes the spatial extent of the 10 flash-flood predictors described in Section 3.2 and a total surface of 260 km² (around 289,000 pixels) containing areas with torrential phenomena. According to literature [16,32,35], to obtain a higher performance of the machine learning models, another sample with non-torrential areas was generated for the training process, with the same number of pixels (289,000) as the torrential regions. These surfaces were randomly selected from territories with a slope angle below 3°. The slopes lower than 3° were chosen due to their very low potential for surface runoff; these surfaces were generally not influenced by torrential phenomena. Therefore, it was unlikely that flash-flooding would occur on these surfaces.

It should be mentioned that all three aforementioned elements of the flash-flood database were processed into 30 m resolution rasters.

5.2. Selection of Flash-Flood Predictors Applying Information Gain Ratio (IGR) Method

Flash-flood predictors that are uncorrelated with flash-flood phenomena, i.e., predictors having very low predictive capability, may generate noisy input data, which may decrease the predictive capability of the applied models [76]. For this reason, the IGR method was employed to evaluate the predictive capability of the selected flash-flood predictors [77]. In particular, a high value of IGR demonstrates that the flash-flood predictor has a high predictive capability.

The $IGR$ for each flash-flood predictor was computed with the use of Equations (12)–(15) [76]:

$$IGR(D, F) = \frac{Entropy(D) - Entropy(D, F)}{SplitEntropy(D, F)} \tag{12}$$

$$Entropy(D) = -\sum_{i=1}^{2} \frac{n(Y_{i}, D)}{|D|} \log_{2} \frac{n(Y_{i}, D)}{|D|} \tag{13}$$

$$Entropy(D, F) = \sum_{j=1}^{m} \frac{|D_{j}|}{|D|} Entropy(D_{j}) \tag{14}$$

$$SplitEntropy(D, F) = -\sum_{j=1}^{m} \frac{|D_{j}|}{|D|} \log_{2} \frac{|D_{j}|}{|D|} \tag{15}$$

where $D$ is the training dataset composed of $n$ input samples, $n(Y_{i}, D)$ is the number of samples in the training data $D$ belonging to the class label $Y_{i}$ (torrential, non-torrential), and $D_{j}$ ($j = 1, \dots, m$) are the subsets of $D$ obtained by splitting it on the $m$ classes of the flash-flood predictor $F$.
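The information gain ratio can be sketched for one categorical predictor as follows. The sketch assumes the standard information-gain form, in which the conditional entropy is the size-weighted mean of the entropies of the subsets produced by each predictor class; the function names and the toy samples are hypothetical, and the study itself relied on Weka for this step.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain_ratio(feature_classes, labels):
    """IGR of one predictor: information gain divided by split entropy."""
    n = len(labels)
    groups = {}
    for feature_class, label in zip(feature_classes, labels):
        groups.setdefault(feature_class, []).append(label)
    # Conditional entropy: size-weighted mean entropy of the subsets.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split = -sum(len(g) / n * math.log2(len(g) / n) for g in groups.values())
    if split == 0:
        return 0.0  # a predictor with a single class carries no information
    return (entropy(labels) - conditional) / split


# Hypothetical samples: 1 = torrential, 0 = non-torrential.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
slope_class = ["steep"] * 4 + ["flat"] * 4      # separates the classes perfectly
aspect_class = ["N", "N", "S", "S", "N", "N", "S", "S"]  # unrelated to the labels
igr_slope = information_gain_ratio(slope_class, labels)
igr_aspect = information_gain_ratio(aspect_class, labels)
```

A predictor that separates torrential from non-torrential samples perfectly reaches the maximum ratio, whereas a predictor unrelated to the labels scores zero and is a candidate for removal.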

Weka 3.9 software was employed to estimate the predictive ability of each flash-flood predictor.

5.3. Computing the AHP Weights for Factor Classes/Categories

The computation of the AHP weights for each factor class/category was carried out after the construction of 10 pair-wise comparison matrices, one for each flash-flood predictor. By assigning a relative dominance value, each class/category within the same flash-flood predictor was rated against every other. The entire workflow, including all the matrices, was implemented in Microsoft Excel 2016 according to the steps described in Section 4.1. The quality of the comparisons was tested by using the consistency ratio calculated according to Equation (4). The AHP weights were used as input in the kNN and KS models.

5.4. Training and Validation Dataset Preparation

In the majority of studies that assess susceptibility to natural hazards using machine learning algorithms, the initial dataset is divided into training and validation pixels. According to literature [34,78,79,80], 70% of the dataset was used to train the models and 30% to validate them. In the present case, both the torrential and non-torrential samples were divided according to these percentages. Thus, the training dataset contains 202,300 torrential pixels (≈182 km²), or 70%, and 202,300 non-torrential pixels (≈182 km²), also 70%. The validation dataset includes 86,700 torrential pixels (≈78 km²), or 30%, and 86,700 non-torrential pixels (≈78 km²), also 30%. The Subset Features tool of ArcGIS 10.5 software was employed to establish the training and validation datasets.
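The 70/30 partition can be reproduced in outline with a seeded random shuffle of pixel indices. This is only a sketch of the idea: the study used the Subset Features tool of ArcGIS, and the function name and the seed value below are assumptions.

```python
import random


def split_indices(n_samples, train_share=0.7, seed=42):
    """Shuffle sample indices and split them into training and validation parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumption)
    n_train = round(n_samples * train_share)
    return indices[:n_train], indices[n_train:]


# 289,000 torrential pixels, as reported for the study area.
train_idx, valid_idx = split_indices(289_000)
```

Applying the same function to the 289,000 non-torrential pixels yields the second half of each dataset.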

Furthermore, each pixel from the training and validation samples was attributed the normalized values (between 0 and 1) of the numerical factors and the categories of the categorical factors. According to literature [81], machine learning models require the input data to be normalized to the same range, since bias may occur in the results due to the larger magnitude of the untransformed data. The normalization followed the work of Amiri et al. [82]. This dataset was applied for training the kNN and KS stand-alone models. For the kNN–AHP and KS–AHP ensembles, the pixels from the training and validation samples were assigned the AHP weights calculated according to the methodology described in Section 4.1 and Section 5.3.

5.5. Configuration and Training of Flash-Flood Potential Models

The entire training process of the kNN stand-alone model and the kNN–AHP ensemble was performed with the XLSTAT add-in for Microsoft Excel 2016. For this purpose, the training and validation data were converted from GIS format into tabular format and imported into Microsoft Excel 2016. An essential step in the computation of the FFPI through the kNN stand-alone model and the kNN–AHP ensemble was the determination of the optimal number of nearest neighbors (the k-value). A 10-fold cross-validation procedure, based on the work of Nguyen et al. [83], was employed for this purpose: the k-value was varied until the highest classification accuracy was reached for both kNN and kNN–AHP. The best k-values were 19 for kNN and 18 for kNN–AHP.
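The search for the optimal k-value can be sketched in plain Python as follows. This is only an illustration of 10-fold cross-validation with a brute-force kNN classifier, not the XLSTAT implementation used in the study; the synthetic two-predictor data and all function names are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(xs, ys, k, n_folds=10, seed=0):
    """Mean kNN accuracy over n_folds cross-validation folds."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    folds = [order[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        tr_x = [xs[i] for i in order if i not in held_out]
        tr_y = [ys[i] for i in order if i not in held_out]
        correct = sum(knn_predict(tr_x, tr_y, xs[i], k) == ys[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / n_folds


def best_k(xs, ys, candidates):
    """Return (k, accuracy) for the candidate with the highest CV accuracy."""
    results = {k: cross_val_accuracy(xs, ys, k) for k in candidates}
    k_best = max(results, key=results.get)
    return k_best, results[k_best]


# Synthetic data with two normalized predictors (not the study's pixels).
rng = random.Random(1)
xs = [(rng.gauss(0.3, 0.1), rng.gauss(0.3, 0.1)) for _ in range(100)] + \
     [(rng.gauss(0.7, 0.1), rng.gauss(0.7, 0.1)) for _ in range(100)]
ys = [0] * 100 + [1] * 100
k_opt, acc = best_k(xs, ys, candidates=range(1, 26, 2))
print(k_opt, round(acc, 3))
```

Only odd candidate values are tested here so that the binary majority vote cannot tie; this is a design choice of the sketch rather than a requirement of the method.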

The KS stand-alone model and the KS–AHP ensemble were trained in Weka 3.9. In the first step, the same data used for the kNN models were converted into Comma Separated Values (CSV) format and imported into Weka 3.9. The performance of the KS models depends on the quality of the training data, but it is also strongly influenced by the choice of the global blending parameter. Similar to the procedure applied by Naji et al. [84], the best value of the global blending parameter for both KS and KS–AHP was established through a trial-and-error procedure based on classification accuracy. The best global blending value was 16 for the KS stand-alone model and 20 for the KS–AHP ensemble.

5.6. Evaluation of the Model Performance

After configuring and running the models, the quality of the results was evaluated through several statistical indices, namely Sensitivity, Specificity and Accuracy. Detailed definitions of these indices are given in several studies on flash-flood or landslide susceptibility [4,85]. The statistical measures used in this study were calculated through Equations (16)–(18):

Sensitivity = \frac{TP}{TP + FN}    (16)

Specificity = \frac{TN}{FP + TN}    (17)

Accuracy = \frac{TP + TN}{TP + FP + TN + FN}    (18)

where FP (false positive) and FN (false negative) are the numbers of pixels erroneously classified.
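Equations (16)–(18) translate directly into code. The sketch below uses hypothetical confusion-matrix counts for a balanced sample of 1000 torrential and 1000 non-torrential pixels; it does not reproduce the values reported in Table 2.

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, Specificity and Accuracy as in Equations (16)-(18)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 1000 torrential pixels (tp + fn), 1000 non-torrential (tn + fp).
sens, spec, acc = classification_metrics(tp=800, tn=850, fp=150, fn=200)
print(sens, spec, acc)
```

For a balanced sample such as this one, the Accuracy equals the mean of the Sensitivity and the Specificity.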

5.7. Flash-Flood Potential Mapping and Results Validation

Once the best configuration for the four models was established, the algorithms were run and the importance of each flash-flood predictor was determined. These importance values were then used to derive the FFPI. The importance values determined through the kNN and KS stand-alone models were multiplied with the rasters without AHP weights, while those derived through the kNN–AHP and KS–AHP ensembles were multiplied with the rasters to which the AHP weights had been assigned. These operations were carried out in ArcGIS 10.5.
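The map algebra step can be sketched for a single pixel as a weighted sum. The sketch assumes that the weighted rasters are summed, which the text implies but does not state explicitly. The importance values are the FFPI_kNN percentages reported in Section 6.3, expressed as fractions; the dictionary keys and the pixel values are hypothetical.

```python
# Predictor importance for FFPI_kNN as fractions (percentages from Section 6.3).
KNN_IMPORTANCE = {
    "slope": 0.209, "curve_number": 0.198, "lithology": 0.122, "mfi": 0.109,
    "convergence_index": 0.099, "tpi": 0.082, "twi": 0.074,
    "plan_curvature": 0.061, "profile_curvature": 0.029, "aspect": 0.017,
}


def ffpi(pixel, importance):
    """Weighted sum of the normalized predictor values of one pixel."""
    return sum(importance[name] * pixel[name] for name in importance)


# Hypothetical pixel: every predictor scaled to [0, 1], with the steepest slope.
pixel = {name: 0.5 for name in KNN_IMPORTANCE}
pixel["slope"] = 1.0
value = ffpi(pixel, KNN_IMPORTANCE)
print(round(value, 4))
```

In a GIS, the same expression is evaluated cell by cell over the ten predictor rasters.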

In the first stage, the flash-flood potential mapping was validated by calculating the share of training and validating samples in the computed FFPI classes. This widely used validation method [38] quantifies the total percentage of training and validation pixels that fall in the high and very high FFPI classes.
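This first validation step amounts to a simple count, sketched below. The class codes (1 = very low to 5 = very high) and the ten sample pixels are hypothetical.

```python
def share_in_high_classes(pixel_classes, high_classes=(4, 5)):
    """Percentage of sample pixels that fall in the given FFPI classes."""
    hits = sum(1 for c in pixel_classes if c in high_classes)
    return 100.0 * hits / len(pixel_classes)


# Hypothetical FFPI class (1 = very low ... 5 = very high) of ten torrential pixels.
sample_classes = [5, 4, 4, 3, 5, 2, 4, 5, 5, 4]
share = share_in_high_classes(sample_classes)
print(share)
```

The higher this percentage for known torrential pixels, the better the FFPI map agrees with the observed torrential areas.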

In the second stage, the results were validated through the Receiver Operating Characteristic (ROC) Curve, one of the most widely used methods for assessing model accuracy [86,87]. The ROC Curve was created by comparing the acquired results with the existing torrential areas. Both the success rate and the prediction rate were computed to assess the reliability and accuracy of the FFPI maps: the 70% of torrential areas used for training served to create the success rate, while the remaining 30% used for validation served to construct the prediction rate. The Area Under Curve (AUC) of the success rate indicates how well the model classified the training data, whereas the AUC of the prediction rate indicates the accuracy of the results. A model is considered efficient when the AUC value is close to 1, while an AUC value close to 0 indicates a non-informative model [88]. The AUC value can be estimated through Equation (19) [3]:

AUC = \frac{\sum TP + \sum TN}{P + N}    (19)

where TP (true positive) and TN (true negative) are the numbers of pixels that are correctly classified, P is the total number of pixels with torrential phenomena and N is the total number of pixels without torrential phenomena.
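For illustration, the area under a ROC curve can also be obtained from the ranking of the FFPI values. The sketch below uses the standard rank-based estimate, i.e., the probability that a torrential pixel receives a higher FFPI value than a non-torrential pixel. It is not a literal implementation of Equation (19), and the FFPI values are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical FFPI values at torrential and non-torrential pixels.
torrential = [0.9, 0.8, 0.7, 0.4]
non_torrential = [0.6, 0.3, 0.2, 0.1]
auc = roc_auc(torrential, non_torrential)
print(auc)
```

Applying the function to the training pixels corresponds to the success rate, and applying it to the validating pixels corresponds to the prediction rate. The pairwise loop is quadratic in the number of pixels, so a sort-based formulation would be preferable for full rasters.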

6. Results and Discussion

6.1. Predictive Ability of Flash-Flood Conditioning Factors

The results of the IGR method, which was applied to determine the predictive ability of the selected flash-flood conditioning factors, are shown in Figure 7.

Based on the results of the analysis, slope angle was the most significant flash-flood conditioning factor, with an average predictive ability of 0.9, followed by curve number (0.87), profile curvature (0.79), Modified Fournier Index (0.69), lithology (0.63), plan curvature (0.58), TWI (0.51), convergence index (0.46), TPI (0.39) and aspect (0.29). The lowest average merit was 0.29, which indicates that all the factors contribute to flash-flood occurrence; therefore, all of them were taken into account in the present analysis.

6.2. AHP Weights Results

Table S1 in the Supplementary Materials shows all pair-wise comparisons between the classes of the selected flash-flood predictors. The weights that indicate the importance of each class/category are also included in Table 1. Among all classes/categories, the highest importance (0.633) was achieved by the profile curvature class between 0.9 and 9.7, followed by the plan curvature class between 0.1 and 0.5 (0.512), slope angles higher than 25° (0.507) and the convergence index class between −96 and −3 (0.485).

As mentioned in Section 4.2, the consistency of the expert judgments was assessed through the Consistency Ratio (CR) values, which were computed for each comparison matrix. According to Table 1, all CR values are less than 0.1, which demonstrates that all the comparisons between the classes/categories within the same predictor are consistent. Table 2 also highlights the values of several parameters used to determine the weights of each factor or the CR value.

6.3. Application of kNN and kNN–AHP Ensemble Model

Based on the cross-validation procedure, the optimal k-value was estimated for training the kNN stand-alone model and the kNN–AHP ensemble. For the kNN stand-alone model, the highest accuracy (82.7%) was reached with 19 nearest neighbors, while for the kNN–AHP ensemble the most accurate model (82.1%) was obtained with 18 nearest neighbors (Figure 8).

In terms of Sensitivity and Specificity, the two kNN-based models achieved good performance. The kNN stand-alone model achieved a Sensitivity of 0.805 for the training sample and 0.79 for the validating sample, while its Specificity reached 0.854 (training sample) and 0.841 (validating sample). For the kNN–AHP ensemble, the Sensitivity was 0.797 for the training sample and 0.805 for the validating sample, and the Specificity was 0.85 for the training sample and 0.811 for the validating sample (Table 2).

As stated in Section 5.7, the importance of each factor was derived for the Flash-Flood Potential Index based on k-Nearest Neighbor (FFPI_kNN) and for the Flash-Flood Potential Index based on the k-Nearest Neighbor–Analytical Hierarchy Process ensemble (FFPI_kNN–AHP). For FFPI_kNN, slope angle was the factor with the highest importance, with a weight of 20.9%. The rest of the factors received the following weights: curve number (19.8%), lithology (12.2%), MFI (10.9%), convergence index (9.9%), TPI (8.2%), TWI (7.4%), plan curvature (6.1%), profile curvature (2.9%) and aspect (1.7%). These importance values were then included in map algebra to calculate the FFPI_kNN. The FFPI_kNN values (0.04–0.96) (Figure 9a) were reclassified into five classes based on the natural breaks (Jenks) method, which is considered the most appropriate method for grouping values into classes [89]. Very low flash-flood potential values range between 0.04 and 0.26 and cover approximately 11.3% of the study area. According to Figure 9a, these surfaces are found mostly along the main river valleys. The second class, corresponding to low flash-flood potential, has a share of approximately 33.9%. The medium FFPI_kNN values (0.43–0.56) cover 31% of the study area and are spread homogeneously across it. Together, the fourth and fifth FFPI_kNN classes, with high and very high flash-flood potential values from 0.57 to 0.96, have a share of 23.8% (Figure 10) and are located mainly in the northern part of the selected basin.
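The natural breaks (Jenks) classification can be illustrated with a small dynamic-programming sketch that minimizes the within-class sum of squared deviations. It is a didactic version, not the ArcGIS implementation; it uses three classes instead of five for brevity, and the sample values are hypothetical.

```python
def jenks_breaks(values, n_classes):
    """Upper bounds of the classes that minimize within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    # Prefix sums give the squared deviation of data[i:j] in constant time.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(data):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def ssd(i, j):
        """Sum of squared deviations from the mean of data[i:j], with j > i."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: lowest cost of splitting data[:j] into c classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    cut = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for c in range(1, n_classes + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    cut[c][j] = i
    # Walk back through the stored cut points to recover the class upper bounds.
    bounds, j = [], n
    for c in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = cut[c][j]
    return bounds[::-1]


# Hypothetical FFPI values forming three obvious groups.
sample = [0.05, 0.06, 0.07, 0.30, 0.31, 0.32, 0.90, 0.91, 0.92]
breaks = jenks_breaks(sample, 3)
print(breaks)
```

The running time grows with the square of the number of values, so GIS software typically applies optimized or sampled variants when classifying full rasters.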

For the kNN–AHP ensemble, the highest weight was again obtained by slope angle (21.4%). The weights of the rest of the factors are as follows: curve number (19.3%), lithology (13.2%), MFI (11.2%), convergence index (8.1%), TPI (7.4%), TWI (6.6%), plan curvature (5.7%), profile curvature (4.3%) and aspect (2.8%). Similar to FFPI_AHP and FFPI_kNN, the flash-flood potential values, which this time vary between 0.03 and 0.35, were classified into five classes based on the natural breaks (Jenks) method (Figure 9b). Values between 0.03 and 0.14 correspond to areas with a very low potential for flash-flooding; these surfaces, where flash-floods are the least probable, cover 7.42% of the study area. Low FFPI_kNN–AHP values have a share of around 26.74%, while medium values have a share of approximately 34.98%. FFPI_kNN–AHP values higher than 0.21 correspond to areas with high to very high flash-flood potential. These areas have a share of 30.8% and are found especially in the northern half of the study area.

6.4. Application of KS and KS–AHP Ensemble Model

According to Table 2, the Accuracy of the KS stand-alone model was around 0.803 for the training sample and 0.818 for the validating sample. For the same model, the Sensitivity was 0.78 (training sample) and 0.845 (validating sample), while the Specificity reached 0.829 (training sample) and 0.788 (validating sample). For the KS–AHP ensemble, the Accuracy was 0.817 for the training sample and 0.815 for the validating sample. Moreover, the Sensitivity reached 0.791 for the training sample and 0.904 for the validating sample, while the Specificity reached 0.848 (training sample) and 0.742 (validating sample). These values indicate that both models performed well.

The training of the two aforementioned models yielded the normalized importance of each flash-flood predictor. In the estimation of the FFPI_KS, slope angle had the highest impact (20.5%), followed by curve number (18.3%), lithology (14.9%), MFI (13.4%), convergence index (10.8%), TPI (9%), TWI (6%), plan curvature (4.9%), profile curvature (1.9%) and aspect (0.4%) (Figure 11).

FFPI_KS values ranged between 0.07 and 0.89 and were divided into five classes using the natural breaks (Jenks) method. The lowest flash-flood potential class had a share of around 7.42%, while the low FFPI_KS values, ranging from 0.23 to 0.38, occupied 26.74% of the total study area (Figure 12a). Medium values of flash-flood potential, determined through the KS algorithm, covered approximately 34.98% of the study area and were distributed homogeneously across it. The high and very high classes of FFPI_KS represented 30.83% of the total area. Their values (0.52 to 0.89) were distributed over large, compact areas within the northern half of the research area.

FFPI_KS–AHP was spatially modelled by assigning the following weights to the flash-flood conditioning factors: slope angle (20.8%), curve number (19.2%), lithology (16.5%), MFI (13.3%), convergence index (11.4%), TPI (5.9%), TWI (5.1%), plan curvature (3.5%), profile curvature (2.7%) and aspect (1.6%). The resulting values, ranging between 0.05 and 0.53, were reclassified into five classes based on the natural breaks (Jenks) method. The interval between 0.05 and 0.14 covered 6.58% of the studied river basin and corresponded to very low flash-flood susceptibility. The second class, with low values of FFPI_KS–AHP, had a share of approximately 21.38% and was spread across almost all regions. Furthermore, 36.03% of the study area corresponded to medium values of FFPI_KS–AHP. As in the case of FFPI_KS, the medium values of FFPI_KS–AHP have a homogeneous spatial distribution. Together, the fourth and fifth classes of flash-flood potential totalled around 36.12% of the study area (Figure 12b).

6.5. Results Validation

6.5.1. Share of Torrential Pixels in FFPI Classes

In terms of the share of training areas in the high and very high FFPI classes, the kNN–AHP ensemble model achieved the best performance, with 85.75%. It was followed by the KS model (84.21%), the KS–AHP ensemble model (81.35%) and the kNN model (79.45%). The most accurate results regarding the share of validating areas in the high and very high FFPI classes were also obtained by the kNN–AHP ensemble model (79.74%), followed by the KS–AHP ensemble model (76.99%), the KS model (75.23%) and the kNN model (74.9%) (Table 3).

6.5.2. Receiver Operating Characteristic (ROC) Curve

Regarding the success rate, the highest performance was reached by the kNN–AHP ensemble model (Area Under Curve (AUC) = 0.901), followed by the KS–AHP model (AUC = 0.886), the KS model (AUC = 0.882) and the kNN model (AUC = 0.873) (Figure 13a). The best prediction rate was also achieved by the kNN–AHP ensemble model (AUC = 0.896), followed by the KS model (AUC = 0.887), the KS–AHP model (AUC = 0.878) and the kNN model (AUC = 0.852) (Figure 13b). Overall, all the models performed very well in the validation of the results.

7. Conclusions

The accurate identification of highly susceptible areas is the basis for adopting the main non-structural measures intended to prevent the negative impacts of flash-floods. In this context, the present article proposed a complex methodology for identifying the areas susceptible to flash-flooding in the upper and middle parts of the Prahova river basin. The methodology was based on the computation of the FFPI using two machine learning models (k-Nearest Neighbor (kNN) and K-Star (KS)) and their novel ensembles with the Analytical Hierarchy Process (AHP). The prediction power of the ensemble models, namely kNN–AHP and KS–AHP, was also tested. The weights determined by the pair-wise comparison matrices were used as input data for the kNN–AHP and KS–AHP ensembles. Ten flash-flood predictors, selected through the IGR method, were used together with 70% of the torrential areas to train the models. The results were validated using the remaining 30% of the torrential areas. The performance of the models was assessed by computing several statistical measures. The results were validated in two steps: (i) by the share of torrential areas in the FFPI classes; and (ii) through the Receiver Operating Characteristic (ROC) Curve. According to the ROC Curve, the best model was the kNN–AHP ensemble for both the success rate (AUC = 0.901) and the prediction rate (AUC = 0.896).

It should be remarked that kNN also achieved very good performance when it was used in combination with other models to determine the susceptibility to different natural hazards. An example in this regard was provided by Bui et al. [90], who calculated the landslide susceptibility in Vietnam through a combination of a fuzzy kNN algorithm with differential evolution optimization. The AUC of the success rate in that case was 0.944, while that of the prediction rate was 0.841.

Areas with high to very high flash-flood potential had a share between 23.8% (FFPI_kNN) and 36.12% (FFPI_KS–AHP). These surfaces were found especially in the north-western part of the study area, which is characterized by the presence of tourist localities and therefore by a high degree of vulnerability to flash-flood occurrence. These localities were frequently affected by floods and flash-floods in 2005, 2006, 2010, 2014, 2017 and 2018 [91].

The main novelty of the present research is the use of the KS model, which, to the best of our knowledge, is applied here for the first time to the assessment of susceptibility to natural hazards, together with the application of the two ensembles for estimating the FFPI. Even though the kNN–AHP method achieved the highest accuracy, the KS was used for FFPI computation due to its advantages presented in Section 4. These advantages are reflected in the higher performance of the KS stand-alone model compared with the kNN stand-alone model. Nevertheless, when the input data were represented by reclassified predictors to which the AHP coefficients had been assigned, the performance of the kNN–AHP exceeded that of the KS–AHP.

This study provides results which are applicable for effective spatial planning as well as for improving the flash-flood forecast and warning activities conducted by the National Institute of Hydrology and Water Management of Romania.

Supplementary Materials

The following are available online at https://0-www-mdpi-com.brum.beds.ac.uk/2072-4292/12/1/106/s1, Table S1: Pair-wise comparison matrix and normalized weights for each factor and class/category.

Author Contributions

Conceptualization, R.C., Q.B.P. and M.V.; Data curation, R.C.; Formal analysis, Q.B.P., S.I.A. and D.N.K.; Methodology, R.C., Q.B.P., N.T.T.L. and M.V.; Software, R.C., E.S. and J.V.; Supervision, R.C., P.T.T.N. and E.S.; Validation, R.C., Q.B.P., N.T.T.L. and P.T.T.N.; Visualization, R.C., Q.B.P., N.T.T.L. and S.I.A.; Writing – original draft, R.C., E.S. and D.N.K.; Writing – review & editing, E.S., M.V., J.V. and D.N.K. All the authors discussed the results and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

Open Access Funding by the University of Vienna.

Conflicts of Interest

The authors declare no conflict of interest.

References

Costache, R. Flood Susceptibility Assessment by Using Bivariate Statistics and Machine Learning Models-A Useful Tool for Flood Risk Management. Water Resour. Manag. 2019, 33, 3239–3256.
Termeh, S.V.R.; Kornejady, A.; Pourghasemi, H.R.; Keesstra, S. Flood susceptibility mapping using novel ensembles of adaptive neuro fuzzy inference system and metaheuristic algorithms. Sci. Total Environ. 2018, 615, 438–451.
Hong, H.; Tsangaratos, P.; Ilia, I.; Liu, J.; Zhu, A.-X.; Chen, W. Application of fuzzy weight of evidence and data mining techniques in construction of flood susceptibility map of Poyang County, China. Sci. Total Environ. 2018, 625, 575–588.
Costache, R. Flash-Flood Potential assessment in the upper and middle sector of Prahova river catchment (Romania). A comparative approach between four hybrid models. Sci. Total Environ. 2019, 659, 1115–1134.
Elkhrachy, I. Flash flood hazard mapping using satellite images and GIS tools: A case study of Najran City, Kingdom of Saudi Arabia (KSA). Egypt. J. Remote Sens. Space Sci. 2015, 18, 261–278.
Jacinto, R.; Grosso, N.; Reis, E.; Dias, L.; Santos, F.; Garrett, P. Continental Portuguese Territory Flood Susceptibility Index: Contribution to a vulnerability index. Nat. Hazards Earth Syst. Sci. 2015, 15, 1907–1919.
Khosravi, K.; Pham, B.T.; Chapi, K.; Shirzadi, A.; Shahabi, H.; Revhaug, I.; Prakash, I.; Bui, D.T. A comparative assessment of decision trees algorithms for flash flood susceptibility modeling at Haraz watershed, northern Iran. Sci. Total Environ. 2018, 627, 744–755.
Youssef, A.M.; Pradhan, B.; Sefry, S.A. Flash flood susceptibility assessment in Jeddah city (Kingdom of Saudi Arabia) using bivariate and multivariate statistical models. Environ. Earth Sci. 2016, 75, 12.
Norbiato, D.; Borga, M.; Degli Esposti, S.; Gaume, E.; Anquetin, S. Flash flood warning based on rainfall thresholds and soil moisture conditions: An assessment for gauged and ungauged basins. J. Hydrol. 2008, 362, 274–290.
Georgakakos, K.P. Modern operational flash flood warning systems based on flash flood guidance theory: Performance evaluation. In Proceedings of the International Conference on Innovation Advances and Implementation of Flood Forecasting Technology, Bergen-Tromsø, Norway, 9–13 October 2005; pp. 9–13.
Sweeney, T.L. Modernized Areal Flash Flood Guidance; NOAA Technical Memorandum NWS HYDRO 44; Hydrology Laboratory, National Weather Service, NOAA: Silver Spring, MD, USA, 1992; pp. 1–21.
Carpenter, T.; Sperfslage, J.; Georgakakos, K.; Sweeney, T.; Fread, D. National threshold runoff estimation utilizing GIS in support of operational flash flood warning systems. J. Hydrol. 1999, 224, 21–44.
Georgakakos, K.P. Analytical results for operational flash flood guidance. J. Hydrol. 2006, 317, 81–103.
Ntelekos, A.A.; Georgakakos, K.P.; Krajewski, W.F. On the uncertainties of flash flood guidance: Toward probabilistic forecasting of flash floods. J. Hydrometeorol. 2006, 7, 896–915.
Petroselli, A.; Vojtek, M.; Vojteková, J. Flood mapping in small ungauged basins: A comparison of different approaches for two case studies in Slovakia. Hydrol. Res. 2019, 50, 379–392.
Costache, R.; Bui, D.T. Spatial prediction of flood potential using new ensembles of bivariate statistics and artificial intelligence: A case study at the Putna river catchment of Romania. Sci. Total Environ. 2019, 691, 1098–1118.
Bui, D.T.; Hoang, N.-D.; Martínez-Álvarez, F.; Ngo, P.-T.T.; Hoa, P.V.; Pham, T.D.; Samui, P.; Costache, R. A novel deep learning neural network approach for predicting flash flood susceptibility: A case study at a high frequency tropical storm area. Sci. Total Environ. 2020, 701, 134413.
Zhao, G.; Pang, B.; Xu, Z.; Yue, J.; Tu, T. Mapping flood susceptibility in mountainous areas on a national scale in China. Sci. Total Environ. 2018, 615, 1133–1142.
Mohammady, M.; Pourghasemi, H.R.; Amiri, M. Assessment of land subsidence susceptibility in Semnan plain (Iran): A comparison of support vector machine and weights of evidence data mining algorithms. Nat. Hazards 2019, 99, 951–971.
Tehrany, M.S.; Pradhan, B.; Jebur, M.N. Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. J. Hydrol. 2014, 512, 332–343.
Tehrany, M.S.; Kumar, L. The application of a Dempster–Shafer-based evidential belief function in flood susceptibility mapping and comparison with frequency ratio and logistic regression methods. Environ. Earth Sci. 2018, 77, 490.
Shafapour Tehrany, M.; Kumar, L.; Neamah Jebur, M.; Shabani, F. Evaluating the application of the statistical index method in flood susceptibility mapping and its comparison with frequency ratio and logistic regression methods. Geomat. Nat. Hazards Risk 2019, 10, 79–101.
Rahmati, O.; Pourghasemi, H.R.; Zeinivand, H. Flood susceptibility mapping using frequency ratio and weights-of-evidence models in the Golastan Province, Iran. Geocarto Int. 2016, 31, 42–70.
Khosravi, K.; Pourghasemi, H.R.; Chapi, K.; Bahri, M. Flash flood susceptibility analysis and its mapping using different bivariate models in Iran: A comparison between Shannon’s entropy, statistical index, and weighting factor models. Environ. Monit. Assess. 2016, 188, 656.
Azareh, A.; Rahmati, O.; Rafiei-Sardooi, E.; Sankey, J.B.; Lee, S.; Shahabi, H.; Ahmad, B.B. Modelling gully-erosion susceptibility in a semi-arid region, Iran: Investigation of applicability of certainty factor and maximum entropy models. Sci. Total Environ. 2019, 655, 684–696.
Souissi, D.; Zouhri, L.; Hammami, S.; Msaddek, M.H.; Zghibi, A.; Dlala, M. GIS-based MCDM-AHP modeling for flood susceptibility mapping of arid areas, southeastern Tunisia. Geocarto Int. 2019, 1–25.
Khosravi, K.; Shahabi, H.; Pham, B.T.; Adamowski, J.; Shirzadi, A.; Pradhan, B.; Dou, J.; Ly, H.-B.; Gróf, G.; Ho, H.L. A comparative assessment of flood susceptibility modeling using Multi-Criteria Decision-Making Analysis and Machine Learning Methods. J. Hydrol. 2019, 573, 311–323.
Kim, T.H.; Kim, B.; Han, K.-Y. Application of Fuzzy TOPSIS to Flood Hazard Mapping for Levee Failure. Water 2019, 11, 592.
Shafizadeh-Moghadam, H.; Valavi, R.; Shahabi, H.; Chapi, K.; Shirzadi, A. Novel forecasting approaches using combination of machine learning and statistical models for flood susceptibility mapping. J. Environ. Manag. 2018, 217, 1–11.
Pham, Q.B.; Abba, S.I.; Usman, A.G.; Linh, N.T.T.; Gupta, V.; Malik, A.; Costache, R.; Vo, N.D.; Tri, D.Q. Potential of Hybrid Data-Intelligence Algorithms for Multi-Station Modelling of Rainfall. Water Resour. Manag. 2019, 1–21.
Choubin, B.; Moradi, E.; Golshan, M.; Adamowski, J.; Sajedi-Hosseini, F.; Mosavi, A. An Ensemble prediction of flood susceptibility using multivariate discriminant analysis, classification and regression trees, and support vector machines. Sci. Total Environ. 2019, 651, 2087–2096.
Tehrany, M.S.; Pradhan, B.; Jebur, M.N. Flood susceptibility analysis and its verification using a novel ensemble support vector machine and frequency ratio method. Stoch. Environ. Res. Risk Assess. 2015, 29, 1149–1165.
Ahmadlou, M.; Karimi, M.; Alizadeh, S.; Shirzadi, A.; Parvinnejhad, D.; Shahabi, H.; Panahi, M. Flood susceptibility assessment using integration of adaptive network-based fuzzy inference system (ANFIS) and biogeography-based optimization (BBO) and BAT algorithms (BA). Geocarto Int. 2019, 34, 1252–1272.
Bui, D.T.; Ngo, P.-T.T.; Pham, T.D.; Jaafari, A.; Minh, N.Q.; Hoa, P.V.; Samui, P. A novel hybrid approach based on a swarm intelligence optimized extreme learning machine for flash flood susceptibility mapping. Catena 2019, 179, 184–196.
Tien Bui, D.; Khosravi, K.; Shahabi, H.; Daggupati, P.; Adamowski, J.F.; Melesse, A.M.; Thai Pham, B.; Pourghasemi, H.R.; Mahmoudi, M.; Bahrami, S. Flood spatial modeling in northern Iran using remote sensing and gis: A comparison between evidential belief functions and its ensemble with a multivariate logistic regression model. Remote Sens. 2019, 11, 1589.
Wang, Y.; Hong, H.; Chen, W.; Li, S.; Pamučar, D.; Gigović, L.; Drobnjak, S.; Bui, D.T.; Duan, H. A hybrid GIS multi-criteria decision-making method for flood susceptibility mapping at Shangyou, China. Remote Sens. 2019, 11, 62.
Costache, R. Flash-flood Potential Index mapping using weights of evidence, decision Trees models and their novel hybrid integration. Stoch. Environ. Res. Risk Assess. 2019, 33, 1375–1402.
Costache, R.; Zaharia, L. Flash-flood potential assessment and mapping by integrating the weights-of-evidence and frequency ratio statistical methods in GIS environment–case study: Bâsca Chiojdului River catchment (Romania). J. Earth Syst. Sci. 2017, 126, 59.
Farr, T.G.; Rosen, P.A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S.; Kobrick, M.; Paller, M.; Rodriguez, E.; Roth, L. The shuttle radar topography mission. Rev. Geophys. 2007, 45, 1–33.
Van Zyl, J.J. The Shuttle Radar Topography Mission (SRTM): A breakthrough in remote sensing of topography. Acta Astronaut. 2001, 48, 559–565.
Costache, R.; Fontanine, I.; Corodescu, E. Assessment of surface runoff depth changes in Sǎrǎţel River basin, Romania using GIS techniques. Open Geosci. 2014, 6, 363–372. [Google Scholar]
Costache, R.; Pravalie, R.; Mitof, I.; Popescu, C. Flood vulnerability assessment in the low sector of Saratel Catchment. Case study: Joseni Village. Carpathian J. Earth Environ. Sci. 2015, 10, 161–169. [Google Scholar]
Zaharia, L.; Costache, R.; Prăvălie, R.; Ioana-Toroimac, G. Mapping flood and flooding potential indices: A methodological approach to identifying areas susceptible to flood and flooding risk. Case study: The Prahova catchment (Romania). Front. Earth Sci. 2017, 11, 229–247.
Shehata, M.; Mizunaga, H. Geospatial analysis of surface hydrological parameters for Kyushu Island, Japan. Nat. Hazards 2019, 96, 33–52.
Costache, R. Using GIS techniques for assessing lag time and concentration time in small river basins. Case study: Pecineaga river basin, Romania. Geogr. Tech. 2014, 9, 31–38.
Costache, R. Estimating multiannual average runoff depth in the middle and upper sectors of Buzău River Basin. Geogr. Tech. 2014, 9, 21–29.
Prăvălie, R.; Costache, R. The vulnerability of the territorial-administrative units to the hydrological phenomena of risk (flash-floods). Case study: The subcarpathian sector of Buzău catchment. Analele Univ. Din Oradea–Seria Geogr. 2013, 23, 91–98.
Prăvălie, R.; Costache, R. The Analysis of the Susceptibility of the Flash-Floods’ Genesis in the Area of the Hydrographical Basin of Bâsca Chiojdului River/Analiza susceptibilitatii Genezei Viiturilor în Aria Bazinului Hidrografic al Râului Bâsca Chiojdului; Department of Geography, University of Craiova: Craiova, Romania, 2014; Volume 13, p. 39.
Zaharia, L.; Costache, R.; Prăvălie, R.; Minea, G. Assessment and mapping of flood potential in the Slănic catchment in Romania. J. Earth Syst. Sci. 2015, 124, 1311–1324.
Kourgialas, N.N.; Karatzas, G.P. Flood management and a GIS modelling method to assess flood-hazard areas—A case study. Hydrol. Sci. J. 2011, 56, 212–225.
Corrao, M.V.; Link, T.E.; Heinse, R.; Eitel, J.U. Modeling of terracette-hillslope soil moisture as a function of aspect, slope and vegetation in a semi-arid environment. Earth Surf. Process. Landf. 2017, 42, 1560–1572.
Dahri, N.; Abida, H. Monte Carlo simulation-aided analytical hierarchy process (AHP) for flood susceptibility mapping in Gabes Basin (southeastern Tunisia). Environ. Earth Sci. 2017, 76, 302.
Minea, G. Assessment of the flash flood potential of Bâsca River Catchment (Romania) based on physiographic factors. Open Geosci. 2013, 5, 344–353.
Pradhan, B.; Lee, S. Delineation of landslide hazard areas on Penang Island, Malaysia, by using frequency ratio, logistic regression, and artificial neural network models. Environ. Earth Sci. 2010, 60, 1037–1054.
Martina, M.L.; Entekhabi, D. Identification of runoff generation spatial distribution using conventional hydrologic gauge time series. Water Resour. Res. 2006, 42, 1–9.
Costache, R.; Prăvălie, R. The analysis of May 29 2012 flood phenomena in the lower sector of Slănic drainage basin (case of Cernăteşti locality area). GEOREVIEW Sci. Ann. Stefan Cel Mare Univ. Suceava Geogr. Ser. 2013, 22, 78–87.
Jenness, J.S. The Effects of Fire on Mexican Spotted Owls in Arizona and New Mexico. Master’s Thesis, Northern Arizona University, Flagstaff, AZ, USA, 2000.
Kirkby, M.; Beven, K. A physically based, variable contributing area model of basin hydrology. Hydrol. Sci. J. 1979, 24, 43–69.
Saaty, T.L. The Analytical Hierarchy Process, Planning, Priority Setting, Resource Allocation (Decision Making Series); McGraw-Hill: New York, NY, USA, 1980.
Shahabi, H.; Khezri, S.; Ahmad, B.B.; Hashim, M. Landslide susceptibility mapping at central Zab basin, Iran: A comparison between analytical hierarchy process, frequency ratio and logistic regression models. Catena 2014, 115, 55–70.
Chen, W.; Li, W.; Chai, H.; Hou, E.; Li, X.; Ding, X. GIS-based landslide susceptibility mapping using analytical hierarchy process (AHP) and certainty factor (CF) models for the Baozhong region of Baoji City, China. Environ. Earth Sci. 2016, 75, 63.
Ghosh, A.; Kar, S.K. Application of analytical hierarchy process (AHP) for flood risk assessment: A case study in Malda district of West Bengal, India. Nat. Hazards 2018, 94, 349–368.
Razandi, Y.; Pourghasemi, H.R.; Neisani, N.S.; Rahmati, O. Application of analytical hierarchy process, frequency ratio, and certainty factor models for groundwater potential mapping using GIS. Earth Sci. Inform. 2015, 8, 867–883.
Saaty, T.L. A scaling method for priorities in hierarchical structures. J. Math. Psychol. 1977, 15, 234–281.
Kumar, R.; Anbalagan, R. Landslide susceptibility mapping using analytical hierarchy process (AHP) in Tehri reservoir rim region, Uttarakhand. J. Geol. Soc. India 2016, 87, 271–286.
Marjanovic, M.; Bajat, B.; Kovacevic, M. Landslide susceptibility assessment with machine learning algorithms. In Proceedings of the IEEE International Conference on Intelligent Networking and Collaborative Systems, Barcelona, Spain, 4–6 November 2009; pp. 273–278.
Arefin, A.S.; Riveros, C.; Berretta, R.; Moscato, P. Gpu-fs-knn: A software tool for fast and scalable knn computation using gpus. PLoS ONE 2012, 7, e44000.
Sabokbar, H.F.; Roodposhti, M.S.; Tazik, E. Landslide susceptibility mapping using geographically-weighted principal component analysis. Geomorphology 2014, 226, 15–24.
James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: New York, NY, USA, 2013; Volume 112.
Thirumuruganathan, S. A detailed introduction to K-nearest neighbor (KNN) algorithm. Retrieved March 2010, 20, 2012.
Kavzoglu, T.; Colkesen, I. Entropic distance based K-Star algorithm for remote sensing image classification. Fresenius Environ. Bull. 2011, 20, 1200–1207.
Morrison, D.; De Silva, L.C. Voting ensembles for spoken affect classification. J. Netw. Comput. Appl. 2007, 30, 1356–1365.
Cleary, J.G.; Trigg, L.E. K*: An instance-based learner using an entropic distance measure. In Machine Learning Proceedings 1995; Elsevier: Amsterdam, The Netherlands, 1995; pp. 108–114.
Sharma, R.; Kumar, S.; Maheshwari, R. Comparative analysis of classification techniques in data mining using different datasets. Int. J. Comput. Sci. Mob. Comput. 2015, 4, 125–134.
Bouckaert, R.R.; Frank, E.; Hall, M.; Kirkby, R.; Reutemann, P.; Seewald, A.; Scuse, D. Weka Manual for Version 3-6-0; University of Waikato: Hamilton, New Zealand, 2008; pp. 1–341.
Chapi, K.; Singh, V.P.; Shirzadi, A.; Shahabi, H.; Bui, D.T.; Pham, B.T.; Khosravi, K. A novel hybrid artificial intelligence approach for flood susceptibility assessment. Environ. Model. Softw. 2017, 95, 229–245.
Tien Bui, D.; Shahabi, H.; Shirzadi, A.; Chapi, K.; Hoang, N.-D.; Pham, B.; Bui, Q.-T.; Tran, C.-T.; Panahi, M.; Bin Ahamd, B. A novel integrated approach of relevance vector machine optimized by imperialist competitive algorithm for spatial modeling of shallow landslides. Remote Sens. 2018, 10, 1538.
Costache, R.; Hong, H.; Wang, Y. Identification of torrential valleys using GIS and a novel hybrid integration of artificial intelligence, machine learning and bivariate statistics. Catena 2019, 183, 104179.
Park, S.-J.; Lee, C.-W.; Lee, S.; Lee, M.-J. Landslide susceptibility mapping and comparison using decision tree models: A Case Study of Jumunjin Area, Korea. Remote Sens. 2018, 10, 1545.
Wang, Y.; Hong, H.; Chen, W.; Li, S.; Panahi, M.; Khosravi, K.; Shirzadi, A.; Shahabi, H.; Panahi, S.; Costache, R. Flood susceptibility mapping in Dingnan County (China) using adaptive neuro-fuzzy inference system with biogeography based optimization and imperialistic competitive algorithm. J. Environ. Manag. 2019, 247, 712–729.
Zhang, S. Nearest neighbor selection for iteratively kNN imputation. J. Syst. Softw. 2012, 85, 2541–2552.
Amiri, M.; Amnieh, H.B.; Hasanipanah, M.; Khanli, L.M. A new combination of artificial neural network and K-nearest neighbors models to predict blast-induced ground vibration and air-overpressure. Eng. Comput. 2016, 32, 631–644.
Nguyen, Q.-K.; Tien Bui, D.; Hoang, N.-D.; Trinh, P.; Nguyen, V.-H.; Yilmaz, I. A novel hybrid approach based on instance based learning classifier and rotation forest ensemble for spatial prediction of rainfall-induced shallow landslides using GIS. Sustainability 2017, 9, 813.
Naji, H.I.; Ali, R.H.; Al-Zubaidi, E.A. Risk Management Techniques. In Strategic Management-a Dynamic View; IntechOpen: London, UK, 2019.
Chen, W.; Peng, J.; Hong, H.; Shahabi, H.; Pradhan, B.; Liu, J.; Zhu, A.-X.; Pei, X.; Duan, Z. Landslide susceptibility modelling using GIS-based machine learning techniques for Chongren County, Jiangxi Province, China. Sci. Total Environ. 2018, 626, 1121–1135.
Alizadeh, M.; Ngah, I.; Hashim, M.; Pradhan, B.; Pour, A. A hybrid analytic network process and artificial neural network (ANP-ANN) model for urban earthquake vulnerability assessment. Remote Sens. 2018, 10, 975.
Kadavi, P.; Lee, C.-W.; Lee, S. Application of ensemble-based machine learning models to landslide susceptibility mapping. Remote Sens. 2018, 10, 1252.
Djeddaoui, F.; Chadli, M.; Gloaguen, R. Desertification susceptibility mapping using logistic regression analysis in the Djelfa area, Algeria. Remote Sens. 2017, 9, 1031.
Chen, J.; Yang, S.; Li, H.; Zhang, B.; Lv, J. Research on geographical environment unit division based on the method of natural breaks (Jenks). Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2013, XL-4/W3, 47–50.
Bui, D.T.; Nguyen, Q.P.; Hoang, N.-D.; Klempe, H. A novel fuzzy K-nearest neighbor inference model with differential evolution for spatial prediction of rainfall-induced shallow landslides in a tropical hilly area using GIS. Landslides 2017, 14, 1–17.
General Inspectorate for Emergency Situation. The Archive of General Inspectorate for Emergency Situation-Prahova County Subsidiary, Romania; General Inspectorate for Emergency Situation: Bucharest, Romania, November 2019.

Figure 1. Location of the study area.

Figure 2. Flash-flood conditioning factors: (a) lithology; (b) convergence index; (c) profile curvature; (d) curve number; (e) plan curvature; (f) Modified Fournier Index.

Figure 3. Flash-flood conditioning factors: (a) slope angle; (b) topographic position index (TPI); (c) topographic wetness index (TWI); (d) aspect.

Figure 4. Relative frequency distribution of torrential phenomena pixels in classes of flash-flood predictors.

Figure 5. Examples of the different number of nearest neighbors used in a k-Nearest Neighbor (kNN) model: (a) 1-nearest neighbor; (b) 2-nearest neighbors; (c) 3-nearest neighbors.

Figure 6. Flowchart of the methodology.

Figure 7. Flash-flood conditioning factors and their predictive ability.

Figure 8. Relation among the number of Nearest Neighbors (k) and accuracy of models.

Figure 9. Spatial distribution of Flash-Flood Potential Index (FFPI) values: (a) Flash-Flood Potential Index—k-Nearest Neighbor (FFPI_kNN); (b) Flash-Flood Potential Index—k-Nearest Neighbor–Analytical Hierarchy Process (FFPI_kNN–AHP).

Figure 10. The weights of FFPI classes.

Figure 11. The importance of flash-flood conditioning factors within FFPI models.

Figure 12. FFPI values and their spatial distribution: (a) Flash-Flood Potential Index—K Star (FFPI_KS); (b) Flash-Flood Potential Index—K Star–Analytical Hierarchy Process (FFPI_KS–AHP).

Figure 13. Receiver Operating Characteristics (ROC) curves and Area Under Curve (AUC) values: (a) success rate, based on the training dataset; (b) prediction rate, based on the validating dataset. kNN—k-Nearest Neighbor; kNN-AHP—k-Nearest Neighbor—Analytical Hierarchy Process; KS—K Star; KS-AHP—K Star—Analytical Hierarchy Process.

Table 1. Properties of the comparison matrices given in Table S1.

Factors	N	λ_max	CI	RI	CR
Slope angle	5	5.196	0.049	1.12	0.040
TPI	5	5.029	0.007	1.12	0.010
TWI	5	5.058	0.014	1.12	0.010
Curve Number	6	6.105	0.021	1.24	0.017
Lithology	12	13.12	0.101	1.53	0.066
Profile curvature	3	3.039	0.019	0.58	0.030
Plan curvature	3	3.109	0.054	0.58	0.090
Slope aspect	9	9.200	0.025	1.45	0.020
Convergence index	5	5.099	0.025	1.12	0.020
Modified Fournier Index	5	5.043	0.011	1.12	0.010

N—number of classes; CI—consistency index; RI—random consistency index; CR—consistency ratio; TPI—topographic position index; TWI—topographic wetness index.
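
The CI and CR columns of Table 1 can be related to N and λ_max through the standard AHP definitions, CI = (λ_max − N)/(N − 1) and CR = CI/RI. The following is a minimal Python sketch under that assumption; the function name `consistency` is illustrative and not taken from the paper, and the RI values are simply those printed in the table.

```python
def consistency(n, lambda_max, ri):
    """Return (CI, CR) for an n x n pair-wise comparison matrix.

    Assumes the standard AHP definitions:
      CI = (lambda_max - n) / (n - 1)
      CR = CI / RI
    """
    ci = (lambda_max - n) / (n - 1)
    cr = ci / ri
    return ci, cr


# Two rows copied from Table 1: (N, lambda_max, RI as printed in the table).
rows = {
    "Slope angle": (5, 5.196, 1.12),
    "Curve Number": (6, 6.105, 1.24),
}

for factor, (n, lam, ri) in rows.items():
    ci, cr = consistency(n, lam, ri)
    # A CR below 0.10 is the usual acceptance threshold for a matrix.
    print(f"{factor}: CI={ci:.3f}, CR={cr:.3f}, consistent={cr < 0.10}")
```

Values recomputed this way may differ from the printed CR column in the last digits, since the table appears to round CR more coarsely than CI.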

Table 2. Model performance based on the training and validating datasets.

Models	Sample	True Positive	True Negative	False Positive	False Negative	Sensitivity	Specificity	Accuracy
kNN	Training	172,362	147,766	25,271	41,875	0.805	0.854	0.827
kNN	Validating	71,373	72,943	13,750	18,917	0.790	0.841	0.815
kNN–AHP	Training	166,392	149,694	26,407	42,468	0.797	0.850	0.821
kNN–AHP	Validating	63,203	63,920	14,862	15,299	0.805	0.811	0.808
KS	Training	154,893	149,026	30,764	43,678	0.780	0.829	0.803
KS	Validating	63,402	53,218	14,313	11,604	0.845	0.788	0.818
KS–AHP	Training	152,754	142,657	25,547	40,419	0.791	0.848	0.817
KS–AHP	Validating	52,347	51,934	18,012	5,589	0.904	0.742	0.815

kNN—k-Nearest Neighbor; kNN-AHP—k-Nearest Neighbor—Analytical Hierarchy Process; KS—K Star; KS-AHP—K Star—Analytical Hierarchy Process.
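
The three metric columns of Table 2 can be recomputed from the four pixel counts in each row. The sketch below assumes the usual confusion-matrix definitions (sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), accuracy = (TP + TN)/total); the function name `confusion_metrics` is hypothetical and only for illustration.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) from pixel counts."""
    sensitivity = tp / (tp + fn)   # share of torrential pixels correctly detected
    specificity = tn / (tn + fp)   # share of non-torrential pixels correctly detected
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Counts for the kNN model on the training sample, copied from Table 2.
sens, spec, acc = confusion_metrics(tp=172_362, tn=147_766, fp=25_271, fn=41_875)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Run on the kNN training row, the rounded results should agree with the values printed in the table for that row.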

Table 3. Share of torrential pixels in Flash-Flood Potential Index (FFPI) classes.

FFPI Class	Training: kNN	Training: kNN–AHP	Training: KS	Training: KS–AHP	Validating: kNN	Validating: kNN–AHP	Validating: KS	Validating: KS–AHP
Very low	1.43%	2.5%	1.57%	2.56%	5.43%	4.21%	3.95%	4.8%
Low	5.67%	4.32%	3.88%	7.44%	7.43%	5.12%	6.43%	6.89%
Medium	13.45%	7.43%	10.34%	8.65%	12.24%	10.93%	14.39%	11.32%
High	27.52%	31.43%	35.43%	25.73%	32.45%	31.73%	29.54%	33.45%
Very high	51.93%	54.32%	48.78%	55.62%	42.45%	48.01%	45.69%	43.54%

FFPI—Flash-Flood Potential Index.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Costache, R.; Pham, Q.B.; Sharifi, E.; Linh, N.T.T.; Abba, S.I.; Vojtek, M.; Vojteková, J.; Nhi, P.T.T.; Khoi, D.N. Flash-Flood Susceptibility Assessment Using Multi-Criteria Decision Making and Machine Learning Supported by Remote Sensing and GIS Techniques. Remote Sens. 2020, 12, 106. https://0-doi-org.brum.beds.ac.uk/10.3390/rs12010106

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.