Article

Landslide Susceptibility Mapping Using Logistic Regression Analysis along the Jinsha River and Its Tributaries Close to Derong and Deqin County, Southwestern China

College of Construction Engineering, Jilin University, Changchun 130026, China
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2018, 7(11), 438; https://0-doi-org.brum.beds.ac.uk/10.3390/ijgi7110438
Received: 5 September 2018 / Revised: 25 October 2018 / Accepted: 4 November 2018 / Published: 8 November 2018
(This article belongs to the Special Issue Natural Hazards and Geospatial Information)

Abstract

The objective of this study was to identify the areas that are most susceptible to landslide occurrence, and to find the key factors associated with landslides along the Jinsha River and its tributaries close to Derong and Deqin County. Thirteen influencing factors, namely (a) lithology, (b) slope angle, (c) slope aspect, (d) TWI, (e) curvature, (f) SPI, (g) STI, (h) topographic relief, (i) rainfall, (j) vegetation, (k) NDVI, (l) distance-to-river, and (m) distance-to-fault, were selected as the landslide conditioning factors for landslide susceptibility mapping. These factors were mainly obtained from the field survey, the digital elevation model (DEM), and Landsat 4–5 imagery using ArcGIS software. A total of 40 landslides were identified in the study area from the field survey and the interpretation of aerial photos. First, the frequency ratio (FR) method was used to clarify the relationship between landslide occurrence and the influencing factors. Then, principal component analysis (PCA) was used to eliminate multicollinearity among the 13 influencing factors and to reduce their dimension. Subsequently, the factors reselected using the PCA were introduced into the logistic regression analysis to produce the landslide susceptibility map. Finally, the receiver operating characteristic (ROC) curve was used to evaluate the accuracy of the logistic regression model. The landslide susceptibility map was divided into the following five classes: very low, low, moderate, high, and very high. The results showed that the five susceptibility classes covered 23.14%, 22.49%, 18.00%, 19.08%, and 17.28% of the study area, respectively, and that the prediction accuracy of the model was 83.4%. The results were also compared with those of the FR method (79.9%) and the AHP method (76.9%), which indicated that the susceptibility model was reasonable. Finally, the key factors of landslide occurrence were determined based on the above results. 
Consequently, this study could serve as an effective guide for further land use planning and for the implementation of development.
Keywords: landslide susceptibility mapping; frequency ratio; principal component analysis; logistic regression analysis; receiver operating characteristic curve

1. Introduction

Landslides have become one of the most destructive disasters in mountainous areas [1]. Landslide susceptibility mapping at regional scales is of great significance to risk mitigation and land planning in mountainous areas [2,3]. Landslide susceptibility can be thought of as the tendency of a region to generate landslides [4,5]. It considers only how strongly the predisposing factors favor landslide occurrence, and does not include the instability process or the return period of landslide occurrence [6,7].
With the development of the geographic information system (GIS), global positioning system (GPS), and remote sensing (RS), many researchers have applied these technologies to landslide susceptibility mapping [8,9,10]. Over the last decades, many statistical methods have been used in landslide susceptibility mapping, such as logistic regression analysis [11], frequency ratio (FR) [10], statistical index (SI) [12], certainty factor (CF) [13], discriminant analysis (DA) [14], evidential belief function [15], and index of entropy [16]. In addition to the statistical methods, many machine learning algorithms, such as artificial neural network models (ANN) [10], support vector machines (SVM) [13], maximum entropy (MaxEnt) [16], and naïve Bayes [17], have also been used for landslide susceptibility mapping. Beyond that, some ensemble methods have been used, such as the ANN–SVM, ANN–MaxEnt, SVM–MaxEnt, ANN–MaxEnt–SVM [18], and ANN–Bayes analyses [19].
In this study, logistic regression, a multivariate statistical method that has been widely used in landslide susceptibility analysis, was applied to produce the landslide susceptibility map of the study area. Although logistic regression is a statistical method, it can also be regarded as a machine learning method whose mathematical expression is known explicitly. Compared with other evaluation methods, it has the following features: (1) it is based on statistical methods, and it has low requirements for the quality and quantity of samples; (2) the independent variables can be continuous or discrete, and they do not have to follow a normal distribution; (3) the method is very mature, with many established tests, and the results are easy to test [5,8,9,10,11]. Our choice of this method was based on the fact that landslide occurrence is controlled by many linear and nonlinear influencing factors. The aims of this study are to identify the areas that are most susceptible to the occurrence of landslides, and to find the key factors associated with landslides. The study is divided into six steps, as follows: (1) Based on the interpretation of remote sensing images and aerial photos and on the field survey, a total of 40 landslides were mapped in the study area (including rock slope deformation, rock planar slide, and rock flexural topple, but not debris flow). (2) Based on the field survey, the mechanism of the landslides, the local geo-environmental conditions, and previous studies, 13 influencing factors were selected to produce the landslide susceptibility map. (3) The frequency ratio (FR) model was used to clarify the relationship between landslide occurrence and the influencing factors. (4) Principal component analysis (PCA) was used to eliminate the multicollinearity among the 13 influencing factors. 
(5) Logistic regression analysis was used to produce the landslide susceptibility map, and was compared with other methods. (6) The receiver operating characteristic (ROC) curve was used for validation.
The purpose of this study is to produce an accurate landslide susceptibility map and to find the key factors associated with landslides, which provides a reasonable tool for landslide risk mitigation in the study area. The landslide susceptibility map could serve as an effective guide for further land use planning and for the implementation of development.

2. Study Area

The study area lies on the border between the Sichuan and Yunnan provinces of China, along the upper reaches of the Jinsha River. Derong County is on the left bank of the Jinsha River, and Deqin County is on the right bank. The study area ranges from 99°12′ E to 99°21′ E longitude and from 28°12′ N to 28°26′ N latitude, covering a total area of 364.10 km² (Figure 1). The elevation of the study area ranges from 1989 to 4888 m, so the maximum elevation difference is 2899 m. The study area belongs to the Tibetan Plateau, which is a rapidly uplifting region. Previous studies have shown that since the Quaternary, neotectonic movement has uplifted the study area at a rate of 5 mm per year [20]. This rapid uplift has caused rapid river incision, and many landslides have occurred along the river because of the combined effect of the uplift and the incision. The annual temperature is 13.8−19.2 °C. Because the study area is under the influence of the southwest and southeast monsoons, its climatic characteristics are complex. The study area has a subtropical dry–hot valley climate. Because of the large elevation difference and the monsoons, the foehn effect here is very significant. The mean annual precipitation is around 300 mm in the low elevation area, but it can reach more than 1000 mm in the high elevation area.

3. Methodology

The methodology of this study is shown in Figure 2. Based on the flow chart, this study is divided into the following three steps: (a) data preparation; (b) landslide susceptibility modeling using logistic regression analysis; and (c) accuracy evaluation using a receiver operating characteristic (ROC) curve.

3.1. Methods

In the analysis of landslide susceptibility, the influencing factors are usually used as the independent variables, and landslide occurrence is expressed as a binary dependent variable (“1” means that a landslide occurred, and “0” means that no landslide occurred). Because the dependent variable is binary and some of the influencing factors are non-continuous (such as lithology), multivariate linear regression is not suitable for deriving the relationship between the independent variables and the dependent variable. The logistic regression (LR) method, however, can solve this problem.
The logistic regression method is a commonly used method for the statistical analysis of dichotomous dependent variables (the dependent variable only takes two values). This method can describe the relationship between a binary dependent variable and a series of independent variables. The independent variable can be continuous or discrete, and it does not have to satisfy the normal distribution. Logistic regression can describe the complex nonlinear relationship between the natural phenomena using simple linear regression, and can be used to predict the probability of an event’s occurrence. The odds ratio estimated using logistic regression can also be used to test the strength of the correlation between the independent variables and the dependent variables. So, this method has been widely used in the analysis of landslide susceptibility.
The main idea of the logistic regression model is to determine the likelihood of future landslide occurrence after each factor is converted to a logical variable. Logistic regression uses the maximum likelihood method to look for the “best fit”. The simplified logistic regression method can be described by the following equation [10,21]:
$$P = \frac{1}{1 + e^{-z}},$$
where P is the probability of landslide occurrence, and the value of P ranges between 0 and 1; z is the linear combination:
$$z = \beta_0 + \beta_1 Y_1 + \beta_2 Y_2 + \cdots + \beta_G Y_G,$$
where β0 is a constant; β1, β2, …, and βG are the regression coefficients; and Y1, Y2, …, and YG are the influencing factors. In the analysis of landslide susceptibility, the pixels with landslides have a value of 1, and the pixels without landslides have a value of 0. By using the logistic regression and the observed data, the probability of the landslide occurrence can be calculated.
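The two equations above can be illustrated with a minimal Python sketch. The function name `landslide_probability` and the coefficient values in the usage note are hypothetical and chosen only for illustration; they are not the coefficients fitted in this study.

```python
import math


def landslide_probability(intercept, coefficients, factors):
    """Return P = 1 / (1 + exp(-z)), with z = b0 + sum(b_i * Y_i)."""
    if len(coefficients) != len(factors):
        raise ValueError("need one coefficient per influencing factor")
    # Linear combination z of the influencing factor values.
    z = intercept + sum(b * y for b, y in zip(coefficients, factors))
    # Use the algebraically equivalent form for negative z so that
    # math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)
```

For example, `landslide_probability(-1.0, [0.5], [2.0])` gives z = 0 and therefore P = 0.5; larger values of z push P toward 1, and smaller values push it toward 0.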

3.2. Landslides and Influencing Factors

The study area is very large, the terrain is very steep, and many areas are inaccessible. A detailed geological survey of the landslides in the study area is very important, but, because of time and manpower constraints, it is impossible to survey every landslide in detail in the field. Therefore, the landslides had to be investigated with the help of aerial photos and remote sensing interpretation. Landslides have certain recognition features in remote sensing images and aerial photos, such as shape, color, shadow, and differences from the surrounding topography and landform [22,23]. During the preliminary remote sensing interpretation, any area showing obvious landslide characteristics was tentatively defined as a landslide. Subsequently, the field survey was used to check the accuracy of the interpreted landslides, and to add those that could not be identified by interpretation.
There are many factors that affect the occurrence of landslides [4,24,25,26,27,28,29,30,31,32]. In order to understand the main factors affecting landslide susceptibility, Hamid Reza Pourghasemi and Mauro Rossi reviewed 220 scientific papers published between 2005 and 2012 in different ISI (International Scientific Indexing) journals [15], and counted how frequently each influencing factor was used in landslide susceptibility mapping. Their statistical results show that the 20 factors with the highest application frequency were the slope degree, lithology, slope aspect, land cover/land use, distance from river, elevation, distance from faults, plan curvature, profile curvature, distance from road, soil type, topographic wetness index (TWI), rainfall, normalized difference vegetation index (NDVI), slope-length, stream power index (SPI), drainage density, geomorphology, soil thickness, and fault density. According to these results and the geological environment characteristics of the study area, 13 influencing factors were selected for landslide susceptibility mapping in this study. They can be divided into the following three categories: lithological, geomorphological, and environmental. The influencing factor maps were converted to a pixel size of 10 × 10 m, and the digital elevation model (DEM) was converted to the same pixel size. All of these maps were produced using ArcGIS 10.2 software. The classification of the continuous influencing factors was based on previous studies [24,25].

3.2.1. Landslide Inventory

By means of the aerial photo and remote sensing interpretation and extensive field survey, a total of 40 landslides were identified and mapped along the Jinsha River and Dingqu River. Fourteen of these landslides are located on the right bank of the Jinsha River and eight are on its left bank. Five landslides are located on the right of the Dingqu River, and three are located on its left (Figure 1). As shown in Figure 1, the landslides in the study area are mainly distributed on the left bank of the Jinsha River and Dingqu River. The reason for this is that landslide occurrence is affected by topography, stratigraphic lithology, geological structure, meteorological conditions, hydrological conditions, human engineering activities, and so on.

3.2.2. Lithology Factor

The influence of formation lithology on the occurrence of landslides is obvious. The rock type, hardness, structural characteristics, and so on greatly influence the physical and mechanical properties, weathering resistance, deformation, and failure modes of the slopes. According to the 1:200,000-scale geological map, the exposed strata in the study area are from the Devonian, Carboniferous, Permian, Triassic, and Quaternary. The Quaternary strata include the landslide accumulation (Qhdel) and the bench gravel and sand layer (Qp3). The Triassic strata include the Jiabila formation (T3j) and the Qugasi formation (T2q1, T2q2, and T2q3). The Jiabila formation is mainly composed of siltstone, volcanic rock, slate, sandstone, and limestone. The Qugasi formation is mainly composed of volcanic rock, slate, sandstone, and limestone. The Permian strata include the Gangdadai formation (P2 and P2g) and the Ranlang formation (P1b, P1a, and P1r), both of which are mainly composed of volcanic rock, slate, sandstone, and limestone. The Carboniferous strata include the Dingpo formation (C3) and the Zhapu formation (C2), both of which are mainly composed of basalt, andesite, rhyolite, and volcanic breccia. The Devonian strata include the Qiongcuo formation (D2q) and the Gerong formation (D1g), both of which are mainly composed of volcanic rock, slate, sandstone, and limestone. The geological map was used to extract a lithology map.

3.2.3. Geomorphological Factors

The statistical results of Hamid Reza Pourghasemi and Mauro Rossi show that geomorphological factors have a significant influence on the occurrence of landslides. According to these results, seven factors were selected in this category: the slope angle, slope aspect, topographic wetness index (TWI), curvature, stream power index (SPI), sediment transport index (STI), and topographic relief.
Slope angle: Slope angle is one of the most important factors for landslide susceptibility mapping [33]. Within a certain range of slope angles, the gravitational stress and shear stress in the slope generally increase as the slope angle increases, so the probability of slope failure also increases [34]. Based on the digital elevation model (DEM) of the study area, with a resolution of 10 m, the slope angle map was extracted using ArcGIS 10.2.
Slope aspect: As another important factor for landslide susceptibility mapping, the slope aspect affects the direction of rainfall and the amount and influence of the solar radiation received by the slope. As a result, moisture and vegetation are unevenly distributed on the slope [15,24,25,35], and different slope aspects influence slope stability differently. The slope aspect map was produced using the DEM.
TWI: The TWI reflects the amount of flow accumulation at any point in the study area [36]. To some extent, the TWI represents the distribution of the soil moisture [37]. The TWI can be calculated using the following equations [38,39,40]:
$$\mathrm{TWI} = \ln\left(\frac{A_S}{\tan\beta}\right),$$
where AS is the upslope contributing area and β is the slope angle.
Curvature: Curvature describes the morphological characteristics of the slope shape, which reflects the formation of surface erosion and surface runoff. The slope shape provides spaces for slope sliding [41]. The curvature map was extracted using the DEM.
SPI: The SPI reflects the erosion capacity of water flow in the study area [42]. The SPI can be calculated using the following equations [43]:
$$\mathrm{SPI} = A_S \times \tan\beta,$$
where AS is the upslope contributing area and β is the slope angle.
STI: The STI is a dimensionless parameter, and it is calculated by combining the length and steepness. It describes the process of erosion and deposition of the study area [44]. The STI can be calculated using the following equations [40]:
$$\mathrm{STI} = \left(\frac{A_S}{22.13}\right)^{0.6} \times \left(\frac{\sin\beta}{0.0896}\right)^{1.3},$$
where AS is the upslope contributing area and β is the slope angle.
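The three terrain indices share the same two inputs, so they can be evaluated together for a single pixel. The following Python sketch is only an illustration: the function name `terrain_indices` is hypothetical, the slope angle is assumed to be given in degrees, and the units of the upslope contributing area are assumed to match those expected by the formulas. The study itself derived these maps in ArcGIS.

```python
import math


def terrain_indices(upslope_area, slope_deg):
    """Return (TWI, SPI, STI) for one pixel.

    upslope_area: upslope contributing area A_S (assumed positive).
    slope_deg: slope angle beta in degrees (assumed strictly between 0 and 90).
    """
    if upslope_area <= 0 or not 0 < slope_deg < 90:
        raise ValueError("A_S must be positive and 0 < beta < 90 degrees")
    beta = math.radians(slope_deg)
    tan_beta = math.tan(beta)
    twi = math.log(upslope_area / tan_beta)        # TWI = ln(A_S / tan(beta))
    spi = upslope_area * tan_beta                  # SPI = A_S * tan(beta)
    sti = ((upslope_area / 22.13) ** 0.6
           * (math.sin(beta) / 0.0896) ** 1.3)     # STI as defined above
    return twi, spi, sti
```

Flat pixels (β = 0) make tan β zero, so the TWI is undefined there; the sketch rejects such input, whereas a raster workflow would typically add a small offset to the slope, which is a design choice not described in the source.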
Topographic relief: Topographic relief reflects the undulation of the slope surface and reveals how the topography changes across an entire area.

3.2.4. Environmental Factors

The environmental factors include the average annual rainfall, vegetation, normalized difference vegetation index (NDVI), distance-to-river, and distance-to-fault.
Average annual rainfall: Rainfall is one of the most important factors that trigger landslides. Rainfall erodes the slope surface, and the infiltrating water increases the weight of the rock mass and reduces the shear strength of the joints, thus inducing landslide hazards. Because of the foehn effect, the precipitation of the study area follows an obvious vertical distribution. Previous studies have shown that the precipitation increases in proportion to the elevation [45]. There are many precipitation stations in the Yunnan and Sichuan provinces (China Meteorological Data Service Center) [24,46], but most of them are located in the county towns rather than along the Jinsha River and Dingqu River. Based on the distance to the Jinsha River and Dingqu River, the climate zone, and other factors, nine precipitation stations were selected to establish the relationship between the average annual rainfall and elevation, as listed in Table 1. In Figure 3, the red points represent the rainfall data collected from the nine precipitation stations. The fitting equation is as follows:
$$P_a = 0.265H - 223.4,$$
where H is the elevation of the precipitation stations (m) and Pa is the average annual rainfall (mm). Caochen [12] reported a precipitation gradient of 24.4 mm/100 m for the Xulong reservoir area, which is similar to the study area. The precipitation gradient obtained in this study is 26.5 mm/100 m, which is close to that value and is therefore considered reasonable for the study area.
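As a quick check of the fitted relationship, the following Python sketch evaluates it at the lowest and highest elevations of the study area (1989 m and 4888 m). The function name `mean_annual_rainfall` is hypothetical; the coefficients are those of the fitting equation above.

```python
def mean_annual_rainfall(elevation_m):
    """Average annual rainfall (mm) from elevation (m): Pa = 0.265 * H - 223.4."""
    return 0.265 * elevation_m - 223.4


# Lowest and highest elevations of the study area.
lowest = mean_annual_rainfall(1989)    # about 303.7 mm
highest = mean_annual_rainfall(4888)   # about 1071.9 mm

# Change in rainfall per 100 m of elevation gain (the precipitation gradient).
gradient_per_100_m = mean_annual_rainfall(2100) - mean_annual_rainfall(2000)
```

The values at the two elevation limits are consistent with the precipitation of around 300 mm in the low elevation area and more than 1000 mm in the high elevation area described for the study area, and the difference per 100 m reproduces the 26.5 mm/100 m gradient.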
Vegetation: Based on the field survey, the vegetation of the study area can be divided into the following five types by elevation: (a) bare soil, from 1989 to 2500 m; (b) brush-forbs, from 2500 to 3300 m; (c) woods, from 3300 to 4200 m; (d) grassland, from 4200 to 4500 m; and (e) snow, above 4500 m.
NDVI: The NDVI can be used to reflect the vegetation coverage of the study area. If the normalized vegetation index is less than zero, it means that the ground is covered with water or snow. If the normalized vegetation index is equal to zero, it means that there is bare land or rock. If the normalized vegetation index is greater than zero, it indicates that there is vegetation cover, and the greater the value, the higher the vegetation coverage. The Landsat 4–5 image was used to extract the NDVI map.
Distance-to-river: The slopes on both sides of a river are usually eroded by the river. Under normal conditions, the closer a slope is to the river, the stronger the erosion and the higher the probability of landslide occurrence [47]. The distance-to-river map was calculated in 300 m intervals.
Distance-to-fault: In the faulted zone, the rock is relatively broken and the joint fracture is developed, which makes the slope of these areas less stable and more prone to landslide occurrence [48]. The distance-to-fault map was calculated in 300 m intervals.
All of the influencing factor maps are shown in Figure 4, Figure 5, Figure 6 and Figure 7.

3.3. Evaluation of Influencing Factors

3.3.1. Probabilistic Relationship Analysis between Landslides and the Influencing Factors

Bivariate statistical methods are commonly used to compute the probabilistic relationship between the dependent and independent variables. In this paper, the frequency ratio (FR) method is used to quantify the relationship between the influencing factors and the occurrence of landslides. In the FR method, the quantitative relationship between landslide occurrence and each conditioning parameter is expressed as an FR value. The FR value is simple to calculate, as follows [49]:
$$\mathrm{FR} = \frac{a/A}{b/B},$$
where a is the number of landslide pixels in a given class of a conditioning factor, A is the total number of landslide pixels in the study area, b is the number of pixels in that class, and B is the total number of pixels in the study area. An FR value greater than 1 indicates a stronger correlation with landslide occurrence, whereas a value less than 1 indicates a weaker correlation [12].
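The FR calculation can be sketched in a few lines of Python. The function name `frequency_ratio` and the pixel counts in the usage note are hypothetical and serve only to illustrate the formula; they are not taken from Table 2.

```python
def frequency_ratio(landslide_pixels_in_class, total_landslide_pixels,
                    pixels_in_class, total_pixels):
    """Return FR = (a / A) / (b / B) for one class of a conditioning factor."""
    if min(total_landslide_pixels, pixels_in_class, total_pixels) <= 0:
        raise ValueError("A, b, and B must be positive")
    landslide_share = landslide_pixels_in_class / total_landslide_pixels  # a / A
    area_share = pixels_in_class / total_pixels                           # b / B
    return landslide_share / area_share
```

For instance, if a class contains half of all landslide pixels but only one tenth of all pixels, `frequency_ratio(50, 100, 1000, 10000)` is approximately 5, meaning that landslides are about five times more frequent in that class than they would be if they were spread evenly over the study area.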

3.3.2. Principal Component Analysis

In general, no independence test is carried out on the selected influencing factors before logistic regression. However, the fitting of the logistic regression model is sensitive to linear correlation among the influencing factors [50], because such correlation increases the variance of the logistic regression coefficients. Some studies verify the mutual independence of the influencing factors using a test such as the variance inflation factor (VIF) [51] or the conditional independence test [52], and then exclude the factors that are highly correlated. Compared with these methods, however, principal component analysis (PCA) not only eliminates the multicollinearity among the influencing factors, but can also be used to evaluate how much the different influencing factors affect the landslide susceptibility of the study area. This is crucial to the subsequent search for the key factors of landslide occurrence. So, in this paper, PCA is used to reduce the dimension of the preselected influencing factors and to transform them into reselected factors that are independent of each other. The reselected factors are then used in the logistic regression, so that linear correlation between the factors does not influence the predicted results.
Principal component analysis uses the linear correlation between the preselected factors to replace them with a small number of “principal components”, which represent most of the information of the preselected factors [53]. The algebraic essentials of PCA are as follows: Let Y(t, x) be a preselected datum at point x (x = 1, …, p) and time t (t = 1, …, m). For each t from 1 to m, the set {Y(t, x): x = 1, …, p} contains all of the values of Y(t, x), and the data are centered on their time averages. These sets can be written as p × 1 column vectors, Y(t) = [Y(t, 1), …, Y(t, p)]T, where “T” denotes the transpose operation. The vectors form a cloud of points around the origin of a p-dimensional Euclidean space, Ep. PCA then transforms the preselected factor system into a new factor system using a linear transformation. It takes the projection of the data with the greatest variance as the first principal component, the projection with the second greatest variance as the second principal component, and so on. Thus, by retaining the characteristics of the data set that contribute most to its variance, PCA can be used to reduce the dimensionality of the data set.
The steps of PCA are as follows:
(1) Use the following equation to normalize the preselected influencing factors:
$$M = \frac{H - H_{min}}{H_{max} - H_{min}},$$
where M is the normalized value of the preselected influencing factor, H is the value of each pixel of that factor, and Hmax and Hmin are the maximum and minimum values of the factor, respectively.
(2) In ArcGIS 10.2 software, a 20 × 20 m fishnet was built to sample the 13 preselected factors.
(3) The Kaiser–Meyer–Olkin (KMO) test and the Bartlett’s test were applied to the sample data to verify the applicability of PCA.
(4) PCA was carried out on the sample data, and the components whose correlation matrix eigenvalues were greater than 0.9 were selected as the principal components.
(5) A new influencing factor system was built from the principal components.
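Steps (1) and (4) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the eigenvalues passed to `retained_components` are assumed to have been computed beforehand from the correlation matrix of the sampled factors (the eigen-decomposition itself is not shown).

```python
def min_max_normalize(values):
    """Step (1): scale one factor's pixel values to [0, 1] via (H - Hmin) / (Hmax - Hmin)."""
    h_min, h_max = min(values), max(values)
    if h_max == h_min:
        raise ValueError("factor is constant; normalization is undefined")
    return [(h - h_min) / (h_max - h_min) for h in values]


def retained_components(eigenvalues, threshold=0.9):
    """Step (4): return the 1-based indices of components whose eigenvalue exceeds the threshold."""
    return [index for index, value in enumerate(eigenvalues, start=1)
            if value > threshold]
```

For example, normalizing the elevations 1989, 3438.5, and 4888 m gives 0, 0.5, and 1, and a hypothetical eigenvalue list `[3.1, 1.4, 0.95, 0.6]` would keep the first three components under the 0.9 threshold.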

3.4. Data for the Logistic Regression Analysis

In order to establish the logistic regression model, which requires pixels both with and without landslides [54], we created a dataset containing 200,000 landslide pixels and an equal number of non-landslide pixels, which were randomly chosen from the study area. The final dataset therefore consisted of 400,000 pixels. Both the landslide pixels and the non-landslide pixels were divided into two sets. The training dataset for the regression analysis contained 90% of the pixels of each type (180,000 pixels), and the validation dataset contained the remaining 10% (20,000 pixels). All of the pixel information was put into a table. One column of the table contained the landslide status: a value of 1 was assigned to the pixels with landslides, and a value of 0 was assigned to the pixels without landslides. The other columns contained the values of the influencing factors. Finally, the datasets were used in the logistic regression analysis to obtain the values of β0, …, βG, which can be used to calculate the value of z.
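The sampling scheme described above (equal numbers of landslide and non-landslide pixels, each split 90%/10% into training and validation sets) can be sketched in Python as follows. The function name `split_balanced`, the use of pixel identifiers, and the fixed random seed are assumptions made for illustration; the source does not describe how the random selection was implemented.

```python
import random


def split_balanced(landslide_ids, non_landslide_ids, train_fraction=0.9, seed=0):
    """Return (training, validation) lists of (pixel_id, label) pairs.

    Both classes are sampled to the same size, and each class is split
    with the same training fraction, so both sets stay balanced.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    landslide_ids = list(landslide_ids)
    non_landslide_ids = list(non_landslide_ids)
    n = min(len(landslide_ids), len(non_landslide_ids))
    cut = int(round(n * train_fraction))
    training, validation = [], []
    for label, ids in ((1, landslide_ids), (0, non_landslide_ids)):
        sample = rng.sample(ids, n)  # random draw without replacement
        training += [(pixel_id, label) for pixel_id in sample[:cut]]
        validation += [(pixel_id, label) for pixel_id in sample[cut:]]
    return training, validation
```

With 200,000 pixel identifiers per class, this would yield 180,000 training and 20,000 validation pixels for each class.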

3.5. Model Development

First, before the PCA, the datasets should be tested using the Kaiser–Meyer–Olkin (KMO) test and the Bartlett’s test. The KMO test is used to test the correlation between the influencing factors, and its values range from 0 to 1. The Bartlett’s test is used to test whether the influencing factors are independent of each other. When the KMO value is greater than 0.6 and the significance level (p-value) of the Bartlett’s test is less than 0.01, the data are suitable for the PCA. Second, the goodness of fit of the logistic regression model was evaluated using the Cox and Snell pseudo R2 test and the Nagelkerke pseudo R2 test [55]. The value of the Cox and Snell pseudo R2 is always less than 1, whereas the value of the Nagelkerke pseudo R2 ranges from 0 to 1 [55,56]. An R2 value of more than 0.2 indicates a good fit [57]. Finally, the landslide susceptibility map produced by the logistic regression model is divided into the following five classes using the natural breaks method: very high, high, moderate, low, and very low.
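The two pseudo R2 statistics mentioned above can be computed from the log-likelihoods of the intercept-only (null) model and the fitted model. The Python sketch below uses the standard definitions of the Cox and Snell and Nagelkerke statistics; the function name `pseudo_r2` is hypothetical, and the log-likelihood values are assumed to come from whatever software fits the logistic regression.

```python
import math


def pseudo_r2(log_likelihood_null, log_likelihood_model, n_samples):
    """Return (Cox and Snell R2, Nagelkerke R2).

    Cox and Snell: 1 - exp(2 * (LL_null - LL_model) / n)
    Nagelkerke:    Cox and Snell divided by its maximum attainable value,
                   1 - exp(2 * LL_null / n)
    """
    cox_snell = 1.0 - math.exp(
        2.0 * (log_likelihood_null - log_likelihood_model) / n_samples)
    max_cox_snell = 1.0 - math.exp(2.0 * log_likelihood_null / n_samples)
    nagelkerke = cox_snell / max_cox_snell
    return cox_snell, nagelkerke
```

For a balanced dataset, the null model assigns a probability of 0.5 to every pixel, so its log-likelihood is n·ln(0.5); a perfect model (log-likelihood 0) would then give a Cox and Snell value of 0.75 and a Nagelkerke value of 1, which illustrates why the former cannot reach 1.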

3.6. Model Validation

It is necessary to validate the accuracy of the model. In this study, receiver operating characteristic (ROC) curve analysis was used to evaluate the prediction power of the model. The ROC curve is drawn with the false positive rate (1 − specificity) as the X-axis and the true positive rate (sensitivity) as the Y-axis. The chance diagonal connects the origin and the point (1, 1), and the area under the curve (AUC) for this diagonal is 0.5. The farther the ROC curve lies above the chance diagonal, the larger the AUC value and the more accurate the prediction. For a model that performs at least as well as chance, the AUC value lies between 0.5 and 1. The AUC is commonly used as a standard to evaluate the goodness of the susceptibility model [25,51,58,59,60,61].
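The AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen landslide pixel receives a higher predicted score than a randomly chosen non-landslide pixel (ties counted as one half). The Python sketch below uses this pairwise formulation; the function name `roc_auc` is hypothetical, and the quadratic loop is only practical for small validation samples.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from binary labels (1/0) and predicted scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # landslide pixel ranked above non-landslide pixel
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))
```

A model that ranks every landslide pixel above every non-landslide pixel has an AUC of 1, and a model that assigns the same score to all pixels has an AUC of 0.5, which corresponds to the chance diagonal.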

4. Results

4.1. Evaluation of Influencing Factors

In order to clarify the relationship between landslide occurrence and the influencing factors, the FR model was applied, and the results are shown in Table 2. The percentages of the landslide area in Qhdel, T2q2, T2q1, P2g, and D2q were 7.32%, 14.00%, 17.40%, 35.67%, and 22.82%, respectively, which means that 97.21% of the landslide area was distributed among these five lithologies. The lithologies with an FR value greater than 1 are Qhdel, Q3P, T2q1, P2g, and D2q; the highest FR value was found for Qhdel (13.04), followed by T2q1 (4.72), P2g (2.48), Q3P (1.37), and D2q (1.26). This means that, after Qhdel, the strength of the relationship between lithology and landslide occurrence decreases in the order T2q1, P2g, Q3P, and D2q. The slope angle was between 0 and 73° in the study area. For the slope angle factor, classes 20−30, 30−40, and 60−70 had FR values greater than 1 (1.11, 1.05, and 2.06, respectively). For the slope angle classes <20, 40−60, and >70, the FR values were less than 1. This means that the slope angle classes of 20−40 and 60–70 were prone to landslide occurrence. As for the slope aspect, 75.83% of the landslide areas were found on slopes facing S, SW, W, and NW. The areas facing SW, W, and NW have higher FR values, which means that they have higher probabilities of landslide occurrence. The TWI of the study area was divided into the following four classes: <6, 6–12, 12–18, and >18. Over 70% of the landslide areas were found in the class of 6–12. The FR values of the four classes were 0.93, 1.01, 1.42, and 0.60, respectively. It can be seen that the TWI classes of 6–12 and 12–18 are prone to landslide occurrence. As for the curvature, the percentages of the landslide area in the concave, flat, and convex classes were 39.71%, 29.00%, and 39.30%, respectively, but the FR values of the three classes were not high.
As for the SPI, the highest FR value was found for the class of 15.78–1432.47 (1.46), followed by the classes of <15.78 (0.98) and >1432.47 (0.93). Over 90% of the landslide area was found in the class of <15.78. The FR values for the STI classes were 0.88, 1.02, 1.39, and 1.11, respectively. Over 69% of the landslide area was found in the class of 35–600. The FR values of the topographic relief classes of 0–10, 10–20, 20–30, 30–40, and >40 were 0.94, 1.05, 0.87, 0.70, and 2.24, respectively, which means that the class of >40 was favorable for landslide hazards. Over 60% of the landslide area was found in the class of 20–30. As for the rainfall, 45.40% of the landslide area was found in the class of 303.68–439.10, 35.19% in the class of 439.10–571.60, and 15.44% in the class of 571.60–704.10. The classes of 303.68–439.10 and 439.10–571.60 had FR values greater than 1 (2.05 and 1.24, respectively), whereas for rainfall above 571.60 the FR value was less than 1. As for the vegetation, 45.40% of the landslide area was found in bare soil zones, and 47.64% was found in brush-forbs zones. The FR values of the vegetation classes of bare soil, brush-forbs, woods, grassland, and snow were 2.05, 1.13, 0.30, 0.00, and 0.00, respectively. The NDVI ranged from −0.378 to 0.705. More than 90% of the landslide area had an NDVI value below 0.272. The classes of −0.378–0.038 and 0.038–0.149 had FR values greater than 1 (1.50 and 1.87, respectively). The landslides were mainly distributed within 0–1500 m of the river and within 0–1200 m of the faults.
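The FR values discussed above can be computed from two area tallies per class. The following is a minimal sketch, assuming the commonly used definition FR = (share of the landslide area in a class) / (share of the study area in that class). The function name and the example areas are hypothetical and are not taken from Table 2.

```python
def frequency_ratio(landslide_area, class_area):
    """Return the FR value of each class of one influencing factor.

    landslide_area[i]: landslide area inside class i
    class_area[i]:     total area of class i
    Assumed definition: FR = landslide share / class share.
    """
    total_landslide = sum(landslide_area)
    total_area = sum(class_area)
    ratios = []
    for ls, area in zip(landslide_area, class_area):
        landslide_share = ls / total_landslide
        class_share = area / total_area
        ratios.append(landslide_share / class_share)
    return ratios


# Hypothetical areas (km2) for three classes of a single factor.
example_fr = frequency_ratio([2.0, 6.0, 2.0], [50.0, 30.0, 20.0])
```

Under this definition an FR value is never negative: a value above 1 means the class holds more landslide area than its share of the study area would suggest, and a value below 1 means it holds less.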

4.2. Result of the PCA

The fit of the logistic regression model is sensitive to linear correlation among the influencing factors. In this study, the PCA was used to eliminate the linear correlation between the influencing factors. First, all of the influencing factors were normalized. Next, a 20 × 20 m fishnet was built over the study area to sample the 13 preselected factors. A total of 12,748,442 sampling points were obtained. Then, the Kaiser–Meyer–Olkin (KMO) test and Bartlett's test were carried out on the sample data. The test results are shown in Table 3. It can be seen from Table 3 that the KMO test value is 0.640 and the p-value is <0.05, which shows that there was a certain correlation between the influencing factors and that the data were suitable for the PCA.
The correlation matrix of the influencing factors is shown in Table 4. As can be seen from Table 4, the correlation coefficient between the slope angle and the topographic relief was 0.90, the correlation coefficient between the SPI and STI was 0.96, and the correlation coefficient between the rainfall and vegetation was 0.95. These results show that there was a high correlation between some of the influencing factors. In other words, the preselected influencing factors contained redundant information.
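As an illustration of how highly correlated factor pairs such as those in Table 4 can be flagged, the following sketch computes Pearson correlation coefficients in plain Python. The function names, the 0.9 threshold, and the toy samples are assumptions made for this example; they are not the sampled data of the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def highly_correlated_pairs(samples, threshold=0.9):
    """List factor pairs whose absolute correlation reaches the threshold."""
    names = list(samples)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson_r(samples[names[i]], samples[names[j]])
            if abs(r) >= threshold:
                pairs.append((names[i], names[j], round(r, 3)))
    return pairs


# Toy values at five sampling points (hypothetical).
toy = {
    "slope": [1.0, 2.0, 3.0, 4.0, 5.0],
    "relief": [2.1, 3.9, 6.2, 8.0, 9.8],
    "aspect": [3.0, 1.0, 4.0, 1.0, 5.0],
}
flagged = highly_correlated_pairs(toy)
```

In this toy data only the slope and relief pair is flagged, which mirrors the kind of redundancy that motivates the PCA step.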
According to the eigenvalues of the correlation matrix, six principal components with eigenvalues greater than 0.9 were selected. In general, the higher the eigenvalue, the more variance the principal component captures, and the more of the actual information in the preselected influencing factors is retained. From Table 5, it can be seen that the sum of the variance contribution rates of the six principal components was 82.36%, which means that they extracted 82.36% of the information from the original data.
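The selection rule described above (keep components whose eigenvalue exceeds 0.9 and report their cumulative variance contribution) can be sketched as follows. The eigenvalues in the example are hypothetical values that sum to 13, as they would for a correlation matrix of 13 factors; they are not the values of Table 5.

```python
def select_components(eigenvalues, threshold=0.9):
    """Return (number of kept components, share of variance they explain)."""
    ordered = sorted(eigenvalues, reverse=True)
    kept = [ev for ev in ordered if ev > threshold]
    explained = sum(kept) / sum(ordered)
    return len(kept), explained


# Hypothetical eigenvalues of a 13 x 13 correlation matrix.
hypothetical_eigenvalues = [3.5, 2.4, 1.6, 1.2, 1.1, 0.95,
                            0.7, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
n_kept, explained_share = select_components(hypothetical_eigenvalues)
```

For a correlation matrix, the variance contribution rate of a component is its eigenvalue divided by the number of factors, so the cumulative rate is the sum of the kept eigenvalues divided by 13.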
PCA was used for the dimension reduction. The newly generated factors are linear combinations of the preselected influencing factors. According to the component score coefficient matrix shown in Table 6, six new factors can be obtained, namely, Factor 1, Factor 2, Factor 3, Factor 4, Factor 5, and Factor 6. In the matrix, the higher the absolute value of the coefficient of a preselected influencing factor, the stronger the correlation between the new factor and that preselected influencing factor. The new factor maps are shown in Figure 8 and Figure 9.
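Because each new factor is a linear combination of the normalized preselected factors, its value at a grid cell is a weighted sum. The sketch below illustrates this with hypothetical coefficients and cell values; the names and numbers are assumptions for the example and are not entries of Table 6.

```python
def component_score(normalized_values, score_coefficients):
    """Weighted sum of normalized factor values for one new factor."""
    return sum(
        score_coefficients[name] * normalized_values[name]
        for name in score_coefficients
    )


# Hypothetical score coefficients and normalized values for one grid cell.
coefficients = {"slope_angle": -0.5, "twi": 0.5, "spi": 0.25}
cell = {"slope_angle": 1.0, "twi": 0.5, "spi": 0.2}
score = component_score(cell, coefficients)
```

In the study the sum would run over all 13 preselected factors for each of the six new factors.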

4.3. Landslide Probability

The six factors that were obtained using the PCA were introduced into the logistic regression analysis, and the significance of each factor was checked. Only the significant factors, that is, those with a p-value less than 0.05, were retained; factors with a p-value greater than 0.05 were excluded from the model. The first regression analysis results (Table 7) show that the p-value of Factor 4 was greater than 0.05, so it was excluded from the logistic regression analysis model.
The Cox and Snell pseudo R² test and the Nagelkerke pseudo R² test were used to evaluate the goodness of fit of the logistic regression model. For the final logistic regression model, the Cox and Snell pseudo R² value was 0.233, and the Nagelkerke pseudo R² value was 0.310 (Table 8). Both the Cox and Snell pseudo R² value and the Nagelkerke pseudo R² value were greater than 0.200, which indicates that the fitting result was good.
In the final logistic regression model, Factor 1, Factor 2, Factor 3, Factor 5, and Factor 6 were introduced into the logistic regression analysis. In this study, the odds ratio was used to assess the relationship between the factors and the landslide susceptibility. If the odds ratio of a factor is greater than 1, the factor is positively related to landslide susceptibility. If the odds ratio is equal to 1, the factor is neutral with respect to landslide susceptibility. If the odds ratio is less than 1, the factor is negatively related to landslide susceptibility. From Table 9, it can be seen that Factor 2 and Factor 5 were positively related to landslide susceptibility, while Factor 1, Factor 3, and Factor 6 were negatively related to landslide susceptibility.
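For reference, the two pseudo R² statistics can be computed from the log-likelihoods of the intercept-only model and of the fitted model. The sketch below assumes the standard definitions; the log-likelihood values and the sample size used in the example are hypothetical, because they are not reported in this section.

```python
import math


def cox_snell_r2(ll_null, ll_model, n):
    """Cox and Snell pseudo R2 from null and fitted log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo R2: Cox and Snell value rescaled by its maximum."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


# Hypothetical balanced sample: half landslide and half non-landslide points.
n_points = 1000
ll_null_example = n_points * math.log(0.5)
# Fitted log-likelihood chosen so that the Cox and Snell value equals 0.233.
ll_model_example = ll_null_example - 0.5 * n_points * math.log(1.0 - 0.233)

cs_value = cox_snell_r2(ll_null_example, ll_model_example, n_points)
nk_value = nagelkerke_r2(ll_null_example, ll_model_example, n_points)
```

The Nagelkerke value is always at least as large as the Cox and Snell value, because it divides by a maximum that is below 1.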
Using Equation (1), we calculated the predicted probability of landslides for the entire study area. The result was a raster map in which the value of each pixel represents the estimated probability of landslide occurrence. The map was divided into the following five classes: very high, high, moderate, low, and very low (Figure 10). Table 10 shows that the areas of the five classes are 62.93, 69.48, 65.55, 81.89, and 84.25 km², respectively.
In order to establish a more accurate model of landslide susceptibility, the FR method and the analytic hierarchy process (AHP) were also used in this study for comparison. As the application of the FR method and the AHP in landslide susceptibility modeling is well known, their theory is not introduced here, and only their evaluation results are listed. The landslide susceptibility maps of the FR method and the AHP method were also divided into five classes using the natural breaks method (Figure 10). Table 10 shows that the areas of the five susceptibility classes of the AHP method (very high, high, moderate, low, and very low) were 40.14, 70.10, 88.99, 103.14, and 61.73 km², respectively. For the FR method, they were 35.08, 74.00, 84.82, 101.67, and 68.53 km², respectively.
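The per-pixel calculation can be sketched as follows, assuming that Equation (1) is the standard logistic function applied to a linear combination of the retained factors. The function names are hypothetical, and the intercept, coefficients, and factor values in the example are illustrative; only the odds ratio of 1.613 is taken from the text (Factor 2).

```python
import math


def landslide_probability(intercept, coefficients, factor_values):
    """Logistic probability P = 1 / (1 + exp(-z)), z = b0 + sum(b_i * x_i)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, factor_values))
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(coefficient):
    """Odds ratio of a factor: exp of its regression coefficient."""
    return math.exp(coefficient)


# A coefficient of ln(1.613) corresponds to the odds ratio 1.613.
beta_factor2 = math.log(1.613)
or_factor2 = odds_ratio(beta_factor2)

# Illustrative pixel: zero intercept, one factor, two different factor values.
p_low = landslide_probability(0.0, [beta_factor2], [0.0])
p_high = landslide_probability(0.0, [beta_factor2], [1.0])
```

An odds ratio above 1 corresponds to a positive coefficient, so increasing that factor raises the predicted probability; an odds ratio below 1 corresponds to a negative coefficient and lowers it.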

5. Discussion

5.1. Validation

Validation is essential for landslide susceptibility mapping; without it, a landslide susceptibility model has no meaning. To verify the quality of the prediction and the stability of the model, the ROC curve was used as a quantitative measure of the model's accuracy. The ROC curves of the models built in this study are shown in Figure 11. From Figure 11, it can be seen that the AUCs of the PCA-LR model, AHP model, and FR model were 0.834, 0.769, and 0.799, respectively. Many studies have applied the traditional academic point system to the accuracy ranking, and they have suggested that an accuracy rate between 0.90 and 1.00 is excellent, between 0.80 and 0.90 is good, between 0.70 and 0.80 is fair, between 0.60 and 0.70 is poor, and between 0.50 and 0.60 is failing [62,63]. Thus, the accuracy rate of the PCA-LR model fell within the "good" category, and the accuracy rates of the FR model and the AHP model fell within the "fair" category. We also compared our results with other studies in similar areas. The landslide susceptibility model of the Xulong reservoir, an area similar to this one, established by Cao et al. [24] from a combination of the information content method and the hierarchical analysis method, has a prediction accuracy of 85.74%. This result is close to that of the PCA-LR model established in this paper (83.4%). Among the three models built in this study, the PCA-LR model has the highest prediction accuracy. Therefore, the subsequent discussion in this paper is based on the PCA-LR model.
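The AUC and the point-system rating can be illustrated with a short sketch. The AUC is computed here as the share of landslide and non-landslide pairs in which the landslide cell receives the higher predicted probability, with ties counted as one half. The function names, the handling of class boundaries, and the toy scores are assumptions made for this example.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def rate_auc(auc):
    """Academic point system; lower bounds are treated as inclusive."""
    if auc >= 0.90:
        return "excellent"
    if auc >= 0.80:
        return "good"
    if auc >= 0.70:
        return "fair"
    if auc >= 0.60:
        return "poor"
    return "failing"


# Toy predicted probabilities and observed labels (1 = landslide).
toy_auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
ratings = {name: rate_auc(value)
           for name, value in {"PCA-LR": 0.834, "AHP": 0.769, "FR": 0.799}.items()}
```

Applied to the AUCs reported above, this rating reproduces the categories stated in the text: "good" for the PCA-LR model and "fair" for the AHP and FR models.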

5.2. Key Factors for Landslide Occurrence

The landslide susceptibility mapping should not only produce the landslide susceptibility map, but also identify the main factors of landslide occurrence, and evaluate the contribution and influence of these factors. In order to establish the landslide susceptibility model, we adopted an FR method to analyze the correlation between landslide occurrence and preselected factors, using PCA to eliminate the multicollinearity between the preselected factors. Finally, the 13 preselected factors were reduced to six factors, and the landslide susceptibility model was established using logistic regression. For the logistic regression model, the odds ratios (Exp(βG)) can be used to measure the correlation between the factors and the landslide occurrence. The component score coefficient of the PCA shows the extent of the correlation between the principal components and the preselected factors. The FR method can reflect the correlation between each class of each preselected factor and landslide occurrence. Based on the above discussion, we can find the combination of the most favorable factors for landslide occurrence.
The odds ratios of Factor 2 (1.613) and Factor 5 (10.215) (Table 9) are greater than 1, which indicates that Factor 2 and Factor 5 play a major role in landslide occurrence in the study area. From Table 6, it can be seen that the slope angle (−0.588), TWI (0.611), SPI (0.719), STI (0.746), and topographic relief (−0.590) are the preselected factors with the highest correlation with Factor 2, and the lithology (0.299), slope aspect (−0.588), vegetation (0.132), rainfall (−0.210), NDVI (−0.251), distance-to-river (0.798), and distance-to-fault (−0.299) are the preselected factors with the highest correlation with Factor 5. This means that these preselected factors have a stronger effect on landslide occurrence than the other factors.
Lithology: From Table 2, it can be seen that the lithologies with an FR value greater than 1 are Qhdel, Q3P, T2q1, P2g, and D2q. The lithology of these strata is mainly limestone, volcanic rock, slate, and green schist. In the parts of the study area where landslides are dense, the rock mass is thin or medium–thin bedded. The rock mass is cut by joints and fractures, and these discontinuities are highly developed. Thus, the local gravity deformation of the rock mass is serious, and bending and tearing deformation of the rock mass are common. These conditions are favorable for landslide occurrence.
Slope angle and distance-to-river: According to the FR values, the slope angle classes that were the most prone to landslides were 20–40° and 60–70°. As for the distance-to-river, the class that was most prone to landslides was 0–1500 m. The study area is located in a rapidly uplifting region [64]. Previous research has shown that the annual uplift rate of the study area was 5.8 ± 1.0 mm from 1970 to 2012 [65]. The rapid uplift gives rise to rapid river incision. Under the combined action of the bedrock uplift and the river incision, the slopes along the river become steeper [24,25,34,66]. In this case, landslides allow the slopes to adjust quickly to the rapid river incision [66]. As a result, a large number of landslides have occurred along the Jinsha River and Dingqu River.
Slope aspect: The FR values of the slope aspect classes show that the areas facing SW, W, and NW have higher probabilities of landslide occurrence. The slope aspect usually affects the slope structure of the rock mass. Rock strata dipping toward the inner slope are more prone to bending, and dip slopes in bedded rock are more prone to landslide occurrence than escarpment slopes.
TWI, SPI, and STI: From Table 2, it can be seen that the TWI classes of 6–12 and 12–18, the SPI class of 15.78–1432.47, and the STI classes of 35–600, 600–9509, and >9509 had a positive effect on landslide occurrence. These factors reflect the hydrologic condition of the study area. The smaller the TWI value, the lower the moisture, and a higher TWI value indicates a higher order water channel. In this study, the TWI classes of 6–12 and 12–18 represent lower order drainage, which is vulnerable to instability. As for the SPI and STI, high values indicate water contributions from upslope, high water flow velocities, and a strong effect of topography on erosion, all of which are directly linked to landslide occurrence [67].
Topographic relief: The FR value of the topographic relief shows that the class of >40 was favorable for landslide hazards.
Rainfall, vegetation, and NDVI: As for the rainfall, vegetation, and NDVI, the classes most prone to landslides were the class of 303.68–439.10 (rainfall), the classes of bare soil and brush-forbs (vegetation), and the class of −0.378–0.149 (NDVI). It can be seen that the areas with high rainfall have fewer landslides. The reason is that the annual rainfall in the study area is generally low (around 300 mm at low elevations and 1000 mm at high elevations), and the precipitation commonly occurs as snowfall in the high elevation area (elevation above 4800 m) [12]. Therefore, short-duration rainfall that is intense enough to trigger landslides is rare. In addition, because the vertical distribution of the precipitation is closely related to the distribution of the vegetation, the vegetation also follows a vertical distribution law. The low elevation areas have little vegetation because of the low rainfall (the low and high elevation areas have low NDVI values, and the moderate elevation areas have high NDVI values). Areas without vegetation cover are more prone to landslides.
Distance-to-fault: The landslides mainly occurred within 0–1200 m of the faults. In the faulted zone, the rock is relatively broken and joints and fractures are well developed, which makes the slopes of these areas less stable and more prone to landslide occurrence [48].
In summary, the factors that are most favorable to landslide occurrence are as follows: (a) lithology: Qhdel, Q3P, T2q1, P2g, and D2q; (b) slope angle: 20–40° and 60–70°; (c) slope aspect: SW, W, and NW; (d) TWI: 6–18; (e) SPI: 15.78–1432.47; (f) STI: >35; (g) topographic relief: >40; (h) rainfall: 303.68–571.60; (i) vegetation: bare soil and brush-forbs; (j) distance-to-river: 0–1500 m; (k) distance-to-fault: 0–1200 m.

5.3. Landslide Susceptibility Mapping

The landslide susceptibility map (PCA-LR model) is shown in Figure 10, and the statistical results of the landslide susceptibility mapping are shown in Table 10. It can be seen from Figure 10 and Table 10 that the very low and the low susceptibility areas had an area of 84.25 km² and 81.89 km², accounting for 23.14% and 22.49% of the total study area, respectively. This region is mainly distributed in the high and moderate elevation area. The strata of this area are T3j1, T2q3, P2, P1b, P1a, and P1r. The vegetation is mainly woods, grassland, and snow. The NDVI value of this area is high, which means that this area has high vegetation coverage. The rainfall is high relative to the whole study area, but it commonly occurs as snowfall in the high elevation area. This area is far away from the rivers and undergoes little river erosion. The moderate, high, and very high susceptibility areas had an area of 65.55 km², 69.48 km², and 62.93 km², accounting for 18.00%, 19.08%, and 17.28% of the total study area, respectively. The vegetation of this area is mainly brush-forbs and bare soil. The NDVI value of this area is low, which indicates that this area has low vegetation coverage. The strata of this area are Qhdel, Q3P, T2q1, P2g, and D2q. This area is close to the rivers and faults. In the faulted zone, the rock is relatively broken and joints and fractures are well developed, which makes the slopes of these areas less stable, and landslides are more likely to occur. The study area belongs to a rapidly uplifting region, and the interaction between the bedrock uplift and river incision has caused landslides to occur widely along the rivers.
As for the landslide occurrence, the landslide area within the very low, low, moderate, high, and very high susceptibility areas was 0.80 km², 1.63 km², 3.39 km², 8.81 km², and 11.85 km², accounting for 3.02%, 6.12%, 12.76%, 33.15%, and 44.59% of the entire landslide area, respectively. The moderate, high, and very high susceptibility areas make up 90.86% of the total landslide area. According to the field survey, the landslides mainly occurred within the high and very high susceptibility ranges. Hence, the landslide susceptibility map that was produced in this study is reasonable.
The landslide susceptibility map shows that the very high, high, and moderate susceptibility areas are mainly distributed in Guxue town, Taentong village, Yongduo village, Rancun village, Deze village, Aluogong village, Jiaxue village, Senen village, Benzilan town, Waka town, and so on, all of which are located along both sides of the Jinsha River and Dingqu River. These villages are densely populated, with a high density of buildings and cultivated land, and some villages have developed industries. Because these villages are located in the areas of high landslide susceptibility, they are exposed to a higher degree of landslide hazard. Therefore, there should be a focus on disaster reduction and prevention in these villages. The low and very low susceptibility areas are mainly distributed in the regions far away from the Jinsha River and Dingqu River. Human activities in these areas are relatively sparse, and even if a landslide occurs, the damage is relatively small.
In general, the areas with very high, high, and moderate susceptibility to landslide occurrence are mainly distributed in the areas with intensive human activities, so disaster prevention and reduction should be emphasized there. Human activities are sparse in the areas with a low and very low susceptibility to landslide occurrence, and the potential threat caused by landslide disasters is small, but disaster prevention and risk reduction should not be neglected in these areas either.
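As a worked check of the arithmetic above, the share of each susceptibility class can be recomputed from the class areas quoted in this paragraph. This is a minimal sketch; the variable and function names are hypothetical.

```python
# Class areas (km2) of the PCA-LR map as quoted in the paragraph above.
areas_km2 = {
    "very low": 84.25,
    "low": 81.89,
    "moderate": 65.55,
    "high": 69.48,
    "very high": 62.93,
}


def area_shares(areas):
    """Percentage of the total area covered by each class, to two decimals."""
    total = sum(areas.values())
    return {name: round(100.0 * value / total, 2) for name, value in areas.items()}


shares = area_shares(areas_km2)
total_area = sum(areas_km2.values())
```

The five areas sum to 364.10 km², and the recomputed shares match the percentages quoted in the paragraph.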

6. Conclusions

Based on the field survey, the mechanism of the landslides, the local geo-environmental conditions, and previous studies, 13 influencing factors, including (a) lithology, (b) slope angle, (c) slope aspect, (d) TWI, (e) curvature, (f) SPI, (g) STI, (h) topographic relief, (i) rainfall, (j) vegetation, (k) NDVI, (l) distance-to-river, and (m) distance-to-fault, were selected to produce the landslide susceptibility map in this study. The FR model was used to clarify the relationship between the landslides and the influencing factors. Because the fit of the logistic regression model is sensitive to linear correlation among the influencing factors, principal component analysis (PCA) was used to reduce the dimension of the preselected influencing factors and to transform them into new factors that are independent of each other. According to the eigenvalues of the correlation matrix, six principal components with eigenvalues greater than 0.9 were selected. The sum of the variance contribution rates of the six principal components was 82.36%, which means that they extracted 82.36% of the information of the original data. As for the logistic regression analysis, the p-value was used to check the significance of the six factors obtained using the PCA. The factors with a p-value greater than 0.05 were excluded from the LR model. Because the p-value of Factor 4 was 0.784, it was excluded from the model. The odds ratio was used to assess the relationship between the retained factors and landslide susceptibility. Factor 2 and Factor 5 were positively related to landslide susceptibility, while Factor 1, Factor 3, and Factor 6 were negatively related to landslide susceptibility.
The slope angle, TWI, SPI, STI, and topographic relief are the preselected factors with the highest correlation with Factor 2, and the lithology, slope aspect, vegetation, rainfall, distance-to-river, and distance-to-fault are the preselected factors with the highest correlation with Factor 5. These factors have been identified as key factors in the occurrence of landslides. The Cox and Snell pseudo R² test and the Nagelkerke pseudo R² test were used to evaluate the goodness of fit of the logistic regression model. Both the Cox and Snell pseudo R² value and the Nagelkerke pseudo R² value were greater than 0.200, which indicates that the fitting result was good. The landslide susceptibility map that was produced by the logistic regression model was divided into the following five classes using the natural breaks method: very low, low, moderate, high, and very high. The ratios of the areas of the susceptibility classes were 23.14%, 22.49%, 18.00%, 19.08%, and 17.28%, respectively. The total proportion of the landslide pixels in the moderate, high, and very high susceptibility areas was 90.86%. The validation result shows that the prediction accuracy of the model was 83.4%, which means that the landslide susceptibility map was reliable and reasonable. Consequently, this study could serve as an effective guide for further land use planning and for the implementation of development.

Author Contributions

X.S. contributed to the data analysis and manuscript writing. J.C. proposed the main structure of this study. Y.B., X.H., J.Z., and W.P. provided useful advice and revised the manuscript. All of the authors read and approved the final manuscript.

Funding

This research was funded by the National Natural Science Fund of China, grant number 41330636, and the Graduate Innovation Fund of Jilin University, grant number 2017137.

Acknowledgments

Thanks to anonymous reviewers for their valuable feedback on the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Aleotti, P.; Chowdhury, R. Landslide hazard assessment: Summary review and new perspectives. Bull. Eng. Geol. Environ. 1999, 58, 21–44.
  2. Saha, A.K.; Gupta, R.P.; Sarkar, I.; Arora, M.K.; Csaplovics, E. An approach for GIS-based statistical landslide susceptibility zonation—With a case study in the himalayas. Landslides 2005, 2, 61–69.
  3. Fell, R.; Corominas, J.; Bonnard, C.; Cascini, L.; Leroi, E.; Savage, W.Z. Guidelines for landslide susceptibility, hazard and risk zoning for land use planning. Eng. Geol. 2008, 102, 85–98.
  4. Guzzetti, F.; Reichenbach, P.; Ardizzone, F.; Cardinali, M.; Galli, M. Estimating the quality of landslide susceptibility models. Geomorphology 2006, 81, 166–184.
  5. Raja, N.B.; Çiçek, I.; Türkoğlu, N.; Aydin, O.; Kawasaki, A. Correction to: Landslide susceptibility mapping of the sera river basin using logistic regression model. Nat. Hazards 2018, 91, 1423–1423.
  6. Brabb, E.E.; Pampeyan, E.H.; Bonilla, M.G. Landslide Susceptibility in San Mateo County, California. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 1972.
  7. Corominas, J.; Westen, C.V.; Frattini, P.; Cascini, L.; Malet, J.P.; Fotopoulou, S. Recommendations for the quantitative analysis of landslide risk. Bull. Eng. Geol. Environ. 2014, 73, 209–263.
  8. Bai, S.B.; Wang, J.; Lü, G.N.; Zhou, P.G.; Hou, S.S.; Xu, S.N. GIS-based logistic regression for landslide susceptibility mapping of the zhongxian segment in the three gorges area, China. Geomorphology 2010, 115, 23–31.
  9. Yesilnacar, E.; Topal, T. Landslide susceptibility mapping: A comparison of logistic regression and neural networks methods in a medium scale study, Hendek region (Turkey). Eng. Geol. 2005, 79, 251–266.
  10. Yilmaz, I. Landslide susceptibility mapping using frequency ratio, logistic regression, artificial neural networks and their comparison: A case study from Kat landslides (Tokat-Turkey). Comput. Geosci. 2009, 35, 1125–1138.
  11. Chen, W.; Pourghasemi, H.R.; Zhao, Z. A GIS-based comparative study of Dempster-Shafer, logistic regression and artificial neural network models for landslide susceptibility mapping. Geocarto Int. 2017, 32, 367–385.
  12. Cao, C.; Xu, P.; Wang, Y.; Chen, J.; Zheng, L.; Niu, C. Flash flood hazard susceptibility mapping using frequency ratio and statistical index methods in coalmine subsidence areas. Sustainability 2016, 8, 948.
  13. Chen, W.; Li, W.; Chai, H.; Hou, E.; Li, X.; Ding, X. GIS-based landslide susceptibility mapping using analytical hierarchy process (AHP) and certainty factor (CF) models for the Baozhong region of Baoji city, China. Environ. Earth Sci. 2016, 75, 1–14.
  14. He, S.; Pan, P.; Dai, L.; Wang, H.; Liu, J. Application of kernel-based fisher discriminant analysis to map landslide susceptibility in the Qinggan river delta, Three Gorges, China. Geomorphology 2012, 171–172, 30–41.
  15. Pourghasemi, H.R.; Rossi, M. Landslide susceptibility modeling in a landslide prone area in Mazandarn Province, north of Iran: A comparison between GLM, GAM, MARS, and M-AHP methods. Theor. Appl. Climatol. 2017, 130, 1–25.
  16. Lombardo, L.; Bachofer, F.; Cama, M.; Märker, M.; Rotigliano, E. Exploiting maximum entropy method and aster data for assessing debris flow and debris slide susceptibility for the Giampilieri catchment (north-eastern Sicily, Italy). Earth Surf. Process. Landf. 2016, 41, 1776–1789.
  17. Pham, B.T.; Bui, D.T.; Pourghasemi, H.R.; Indra, P.; Dholakia, M.B. Landslide susceptibility assesssment in the Uttarakhand area (India) using GIS: A comparison study of prediction capability of naive bayes, multilayer perceptron neural networks, and functional trees methods. Theor. Appl. Climatol. 2015, 122, 1–19.
  18. Chen, W.; Pourghasemi, H.R.; Kornejady, A.; Zhang, N. Landslide spatial modeling: Introducing new ensembles of ANN, MaxEnt, and SVM machine learning techniques. Geocarto Int. 2017, 305, 314–327.
  19. Lee, S.; Ryu, J.H.; Won, J.S.; Park, H.J. Determination and application of the weights for landslide susceptibility mapping using an artificial neural network. Eng. Geol. 2004, 71, 289–302.
  20. Wang, E.; Burchfiel, B.C. Late Cenozoic to Holocene deformation in southwestern Sichuan and Adjacent. Yunnan, China, and its role in formation of the southeastern part of the Tibetan Plateau. Geol. Soc. Am. Bull. 2000, 112, 413–423.
  21. Can, T.; Nefeslioglu, H.A.; Gokceoglu, C.; Sonmez, H.; Duman, T.Y. Susceptibility assessments of shallow earthflows triggered by heavy rainfall at three catchments by logistic regression analyses. Geomorphology 2005, 72, 250–227.
  22. Guzzetti, F.; Mondini, A.C.; Cardinali, M.; Fiorucci, F.; Santangelo, M.; Chang, K.T. Landslide inventory maps: New tools for an old problem. Earth-Sci. Rev. 2012, 112, 42–66.
  23. Yang, X.; Chen, L. Using multi-temporal remote sensor imagery to detect earthquake-triggered landslides. Int. J Appl. Earth Obs. 2010, 12, 487–495.
  24. Cao, C.; Wang, Q.; Chen, J.; Ruan, Y.; Zheng, L.; Song, S.; Niu, C. Landslide susceptibility mapping in vertical distribution law of precipitation area: Case of the Xulong hydropower station reservoir, Southwestern China. Water 2016, 8, 270.
  25. Wang, F.; Xu, P.; Wang, C.; Wang, N.; Jiang, N. Application of a GIS-based slope unit method for landslide susceptibility mapping along the Longzi river, southeastern Tibetan plateau, China. ISPRS Int. J. Geo-Inf. 2017, 6, 172.
  26. Li, J.; Wang, C.; Wang, G.; Liu, W. Analysis of landslide influential factors and coupling intensity based on third theory of quantification. Chin. J. Rock Mech. Eng. 2010, 29, 1206–1213.
  27. Li, J.-X.; Wang, C.M.; Wang, G.C. Landslide risk assessment based on combination weighting-unascertained measure theory. Rock Soil Mech. 2013, 34, 468–474.
  28. Marjanović, M.; Kovačević, M.; Bajat, B.; Voženílek, V. Landslide susceptibility assessment using SVM machine learning algorithm. Eng. Geol. 2011, 123, 225–234.
  29. Kanungo, D.P.; Arora, M.K.; Sarkar, S.; Gupta, R.P. A comparative study of conventional, ANN black box, fuzzy and combined neural and fuzzy weighting procedures for landslide susceptibility zonation in Darjeeling Himalayas. Eng. Geol. 2006, 85, 347–366.
  30. Yalcin, A. GIS-based landslide susceptibility mapping using analytical hierarchy process and bivariate statistics in Ardesen (Turkey): Comparisons of results and confirmations. Catena 2008, 72, 1–12.
  31. Akgun, A.; Sezer, E.A.; Nefeslioglu, H.A.; Gokceoglu, C.; Pradhan, B. An easy-to-use Matlab program (Mamland) for the assessment of landslide susceptibility using a Mamdani fuzzy algorithm. Comput. Geosci. 2012, 38, 23–34.
  32. Mohammady, M.; Pourghasemi, H.R.; Pradhan, B. Landslide susceptibility mapping at Golestan province, Iran: A comparison between frequency ratio, dempster–shafer, and weights-of-evidence models. J. Asian Earth Sci. 2012, 61, 221–236.
  33. Lee, S.; Min, K. Statistical analysis of landslide susceptibility at Yongin, Korea. Environ. Geol. 2001, 40, 1095–1113.
  34. Simons, M. The Morphological Analysis of Landforms: A New Review of the Work of Walther Penck (1888–1923); JSTOR: New York, NY, USA, 1962.
  35. Conforti, M.; Pascale, S.; Robustelli, G.; Sdao, F. Evaluation of prediction capability of the artificial neural networks for mapping landslide susceptibility in the Turbolo river catchment (Northern Calabria, Italy). Catena 2014, 113, 236–250.
  36. Hungr, O.; Leroueil, S.; Picarelli, L. The varnes classification of landslide types, an update. Landslides 2014, 11, 167–194.
  37. Ozdemir, A. Using a binary logistic regression method and GIS for evaluating and mapping the groundwater spring potential in the Sultan Mountains (Aksehir, Turkey). J. Hydrol. 2011, 405, 123–136.
  38. Gokceoglu, C.; Sonmez, H.; Nefeslioglu, H.A.; Duman, T.Y.; Can, T. The 17 March 2005 Kuzulu Landslide (Sivas, Turkey) and landslide-susceptibility map of its near vicinity. Eng. Geol. 2005, 81, 65–83.
  39. Beven, K.; Kirkby, M.J. A physically based, variable contributing area model of basin hydrology. Hydrol. Sci. Bull. 1979, 24, 43–69.
  40. Moore, I.D.; Burch, G.J. Physical Basis of the Length-slope Factor in the Universal Soil Loss Equation. Soil Sci. Soc. Am. J. 1986, 50, 1294–1298.
  41. Oh, H.J.; Pradhan, B. Application of a neuro-fuzzy model to landslide-susceptibility mapping for shallow landslides in a tropical hilly area. Comput. Geosci. 2011, 37, 1264–1276.
  42. Jebur, M.N.; Pradhan, B.; Tehrany, M.S. Optimization of landslide conditioning factors using very high-resolution airborne laser scanning (lidar) data at catchment scale. Remote Sens. Environ. 2014, 152, 150–165.
  43. Moore, I.D.; Grayson, R.B.; Ladson, A.R. Digital terrain modelling: A review of hydrological, geomorphological, and biological applications. Hydrol. Process. 1991, 5, 3–30.
  44. Pradhan, A.M.S.; Kang, H.S.; Lee, S.; Kim, Y.T. Spatial model integration for shallow landslide susceptibility and its runout using a GIS-based approach in Yongin, Korea. Geocarto Int. 2016, 32, 420–441.
  45. Cheng, Q.; Ko, C.; Yuan, Y.; Ge, Y.; Zhang, S. GIS modeling for predicting river runoff volume in ungauged drainages in the greater Toronto area, Canada. Comput. Geosci. 2006, 32, 1108–1119.
  46. Huang, Y.; Chen, S.; Cao, Q.; Hong, Y.; Wu, B.; Huang, M.; Qiao, L.; Zhang, Z.; Li, Z.; Li, W.; et al. Evaluation of version-7 TRMM multi-satellite precipitation analysis product during the Beijing extreme heavy rainfall event of 21 July 2012. Water 2013, 6, 32–44.
  47. Park, S.; Choi, C.; Kim, B.; Kim, J. Landslide susceptibility mapping using frequency ratio, analytic hierarchy process, logistic regression, and artificial neural network methods at the Inje area, Korea. Environ. Earth Sci. 2013, 68, 1443–1464.
  48. Hong, H.; Pradhan, B.; Xu, C.; Bui, D.T. Spatial prediction of landslide hazard at the Yihuang area (China) using two-class kernel logistic regression, alternating decision tree and support vector machines. Catena 2015, 133, 266–281.
  49. Laxton, J. Geographic information systems for geoscientists—Modelling with GIS—Bonhamcarter, GF. Int. J. Geogr. Inf. Syst. 1996, 10, 355–356.
  50. Hosmer, D.W., Jr.; Lemeshow, S.; Sturdivant, R.X. Applied Logistic Regression, 3rd ed.; John Wiley: Hoboken, NJ, USA, 2013. [Google Scholar]
  51. Djeddaoui, F.; Chadli, M.; Gloaguen, R.; Djeddaoui, F.; Chadli, M.; Gloaguen, R. Desertification susceptibility mapping using logistic regression analysis in the Djelfa area, Algeria. Remote Sens. 2017, 9, 1031. [Google Scholar] [CrossRef]
  52. Agterberg, F.P.; Cheng, Q. Conditional independence test for weights-of-evidence modeling. Nat. Resour. Res. 2002, 11, 249–255. [Google Scholar] [CrossRef]
  53. Preisendorfer, R.W.; Mobley, C.D. Principal component analysis in meteorology and oceanography. Dev. Atmos. Sci. 1988, 17, 55–72. [Google Scholar]
  54. Dai, F.C.; Lee, C.F. Landslide characteristics and slope instability modeling using GIS, Lantau Island, Hong Kong. Geomorphology 2002, 42, 213–228. [Google Scholar] [CrossRef]
  55. Bewick, V.; Cheek, L.; Ball, J. Statistics review 14: Logistic regression. Crit. Care 2005, 9, 112–118. [Google Scholar] [CrossRef] [PubMed]
  56. Regmi, N.R.; Giardino, J.R.; McDonald, E.V.; Vitek, J.D. A comparison of logistic regression-based models of susceptibility to landslides in Western Colorado, USA. Landslides 2014, 11, 247–262. [Google Scholar] [CrossRef]
  57. Clark, W.; Hosking, P. Statistical Methods for Geographers; John Wiley & Sons: New York, NY, USA, 1986. [Google Scholar]
  58. Mathew, J.; Jha, V.K.; Rawat, G.S. Landslide susceptibility zonation mapping and its validation in part of Garhwal Lesser Himalaya, India, using binary logistic regression analysis and receiver operating characteristic curve method. Landslides 2009, 6, 17–26. [Google Scholar] [CrossRef]
  59. Othman, A.A.; Gloaguen, R.; Andreani, L.; Rahnama, M. Landslide susceptibility mapping in Mawat area, Kurdistan Region, NE Iraq: A comparison of different statistical models. Nat. Hazards Earth Syst. Sci. Discuss. 2015, 3, 1789–1833. [Google Scholar] [CrossRef]
  60. Devkota, K.C.; Regmi, A.D.; Pourghasemi, H.R.; Yoshida, K.; Pradhan, B.; Ryu, I.C.; Dhital, M.R.; Althuwaynee, O.F. Landslide susceptibility mapping using certainty factor, index of entropy and logistic regression models in GIS and their comparison at Mugling–Narayanghat road section in Nepal Himalaya. Nat. Hazards 2013, 65, 135–165. [Google Scholar] [CrossRef][Green Version]
  61. Lee, S. Application of logistic regression model and its validation for landslide susceptibility mapping using GIS and remote sensing data. Int. J. Remote Sens. 2005, 26, 1477–1491. [Google Scholar] [CrossRef]
  62. Hosmer, D.W., Jr.; Lemeshow, S.; Sturdivant, R.X. Model-building strategies and methods for logistic regression. In Applied Logistic Regression, 3rd ed.; Wiley: Hoboken, NJ, USA, 2000; pp. 89–151. [Google Scholar]
  63. Alatorre, L.C.; Sánchez-Andrés, R.; Cirujano, S.; Beguería, S.; Sánchez-Carrillo, S. Identification of mangrove areas by remote sensing: The ROC curve technique applied to the northwestern Mexico coastal zone using Landsat imagery. Remote Sens. 2011, 3, 1568–1583. [Google Scholar] [CrossRef][Green Version]
  64. Chen, Y.; Booth, D.C. The Wenchuan Earthquake of 2008: Anatomy of a Disaster; Springer Science & Business Media: New York, NY, USA, 2011. [Google Scholar]
  65. Hao, M.; Wang, Q.; Shen, Z.; Cui, D.; Ji, L.; Li, Y.; Qin, S. Present day crustal vertical movement inferred from precise leveling data in eastern margin of Tibetan plateau. Tectonophysics 2014, 632, 281–292. [Google Scholar] [CrossRef]
  66. Burbank, D.W.; Leland, J.; Fielding, E.; Anderson, R.S.; Brozovic, N.; Reid, M.R.; Duncan, C. Bedrock incision, rock uplift and threshold hillslopes in the northwestern Himalayas. Nature 1996, 379, 505–510. [Google Scholar] [CrossRef]
  67. Pradhan, A.M.S.; Kim, Y.T. Relative effect method of landslide susceptibility zonation in weathered granite soil: A case study in Deokjeok-ri Creek, South Korea. Nat. Hazards 2014, 72, 1189–1217. [Google Scholar] [CrossRef]
Figure 1. Geographical position and landslide inventory of the study area.
Figure 2. Flow chart of this study.
Figure 3. The relationship between elevation and average annual precipitation based on the nine precipitation stations.
Figure 4. Influencing factor maps of the study area: (a) lithology; (b) slope angle; (c) slope aspect; and (d) topographic wetness index (TWI).
Figure 5. Influencing factor maps of the study area: (a) curvature; (b) stream power index (SPI); (c) sediment transport index (STI); and (d) topographic relief.
Figure 6. Influencing factor maps of the study area: (a) rainfall; (b) vegetation; (c) normalized difference vegetation index (NDVI); and (d) distance-to-river.
Figure 7. Influencing factor maps of the study area: (a) distance-to-fault and (b) elevation.
Figure 8. Influencing factor maps selected using principal component analysis (PCA): (a) Factor 1; (b) Factor 2; (c) Factor 3; and (d) Factor 4.
Figure 9. Influencing factor maps selected using PCA: (a) Factor 5 and (b) Factor 6.
Figure 10. Landslide susceptibility maps: (a) PCA-logistic regression (LR) method; (b) FR method; and (c) analytic hierarchy process (AHP) method.
Figure 11. Receiver operating characteristic (ROC) curve of the model.
Table 1. Average annual precipitation measured at precipitation stations at different elevations.
| Precipitation Station | Longitude | Latitude | Elevation (m) | Average Annual Precipitation (mm) | Data Period (Years) |
|---|---|---|---|---|---|
| Derong | 99°10.2′ | 28°25.8′ | 2422.9 | 347.1 | 1981−2010 |
| Batang | 99°03.6′ | 30°00.0′ | 2589.2 | 497.0 | 1981−2010 |
| Xiangcheng | 99°28.8′ | 28°33.6′ | 2842.0 | 483.1 | 1981−2010 |
| Xianggelila | 99°25.2′ | 27°30.0′ | 3276.7 | 651.1 | 1981−2010 |
| Deqin | 98°33.0′ | 28°17.4′ | 3319.0 | 696.7 | 1981−2010 |
| Dege | 98°35.0′ | 31°48.0′ | 3184.0 | 622.4 | 1981−2010 |
| Baiyu | 98°50.0′ | 31°13.0′ | 3260.0 | 626.6 | 1981−2010 |
| Benilan | 99°17.0′ | 28°17.0′ | 2023.0 | 308.0 | 1965−1998 |
| Shangqiaotou | 99°24.0′ | 28°10.0′ | 2040.0 | 369.7 | 1961−2004 |
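The elevation and precipitation values in Table 1 can be checked against the trend shown in Figure 3 with a simple least-squares line. The sketch below is illustrative only: the use of an ordinary linear fit and the helper name `fit_line` are assumptions made here, not the authors' stated procedure.

```python
from math import sqrt

# (elevation in m, average annual precipitation in mm) for the nine stations in Table 1
stations = {
    "Derong": (2422.9, 347.1),
    "Batang": (2589.2, 497.0),
    "Xiangcheng": (2842.0, 483.1),
    "Xianggelila": (3276.7, 651.1),
    "Deqin": (3319.0, 696.7),
    "Dege": (3184.0, 622.4),
    "Baiyu": (3260.0, 626.6),
    "Benilan": (2023.0, 308.0),
    "Shangqiaotou": (2040.0, 369.7),
}


def fit_line(points):
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (intercept, slope, r), where r is the Pearson correlation.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r


intercept, slope, r = fit_line(list(stations.values()))
print(f"precipitation = {intercept:.1f} + {slope:.3f} * elevation (r = {r:.2f})")
```

A positive slope with a correlation close to 1 would be consistent with precipitation increasing with elevation across these stations.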
Table 2. Distribution of the training pixels. FR—frequency ratio; TWI—topographic wetness index; SPI—stream power index; STI—sediment transport index; NDVI—normalized difference vegetation index.
| Factors | Class | Landslide Not Occurred (Count) | Landslide Not Occurred (Ratio) | Landslide Occurred (Count) | Landslide Occurred (Ratio) | Total Count | FR |
|---|---|---|---|---|---|---|---|
| Lithology | Qhdel | 1067 | 0.03% | 19,379 | 7.32% | 20,446 | 13.04 |
| | Q3p | 44,741 | 1.33% | 4964 | 1.88% | 49,705 | 1.37 |
| | T3j1 | 82,644 | 2.45% | 0 | 0.00% | 82,644 | 0.00 |
| | T2q3 | 433,458 | 12.84% | 0 | 0.00% | 433,458 | 0.00 |
| | T2q2 | 646,006 | 19.13% | 37,075 | 14.00% | 683,081 | 0.75 |
| | T2q1 | 88,043 | 2.61% | 46,056 | 17.40% | 134,099 | 4.72 |
| | P2 | 265,722 | 7.87% | 639 | 0.24% | 266,361 | 0.03 |
| | P2g | 430,106 | 12.74% | 94,434 | 35.67% | 524,540 | 2.48 |
| | P1b | 461,126 | 13.66% | 0 | 0.00% | 461,126 | 0.00 |
| | P1a | 127,972 | 3.79% | 0 | 0.00% | 127,972 | 0.00 |
| | P1r | 141,015 | 4.18% | 0 | 0.00% | 141,015 | 0.00 |
| | C3 | 55,805 | 1.65% | 1763 | 0.67% | 57,568 | 0.42 |
| | D2q | 598,585 | 17.73% | 60,422 | 22.82% | 659,007 | 1.26 |
| Slope Angle | 0–10 | 99,022 | 2.93% | 3832 | 1.45% | 102,854 | 0.51 |
| | 10–20 | 390,709 | 11.57% | 27,926 | 10.55% | 418,635 | 0.92 |
| | 20–30 | 1,105,152 | 32.73% | 96,973 | 36.63% | 1,202,125 | 1.11 |
| | 30–40 | 1,284,364 | 38.04% | 105,990 | 40.04% | 1,390,354 | 1.05 |
| | 40–50 | 430,872 | 12.76% | 25,518 | 9.64% | 456,390 | 0.77 |
| | 50–60 | 60,726 | 1.80% | 3549 | 1.34% | 64,275 | 0.76 |
| | 60–70 | 5331 | 0.16% | 939 | 0.35% | 6270 | 2.06 |
| | >70 | 114 | 0.00% | 5 | 0.00% | 119 | 0.58 |
| Slope Aspect | Flat | 1938 | 0.06% | 6 | 0.00% | 1944 | 0.04 |
| | N | 433,950 | 12.85% | 7207 | 2.72% | 441,157 | 0.22 |
| | NE | 364,642 | 10.80% | 13,411 | 5.07% | 378,053 | 0.49 |
| | E | 387,480 | 11.48% | 28,779 | 10.87% | 416,259 | 0.95 |
| | SE | 340,425 | 10.08% | 14,563 | 5.50% | 354,988 | 0.56 |
| | S | 435,604 | 12.90% | 31,770 | 12.00% | 467,374 | 0.93 |
| | SW | 466,658 | 13.82% | 60,966 | 23.03% | 527,624 | 1.59 |
| | W | 509,216 | 15.08% | 67,252 | 25.40% | 576,468 | 1.60 |
| | NW | 436,377 | 12.92% | 40,778 | 15.40% | 477,155 | 1.18 |
| TWI | <6 | 812,439 | 24.06% | 58,599 | 22.14% | 871,038 | 0.93 |
| | 6–12 | 2,486,604 | 73.65% | 197,698 | 74.68% | 2,684,302 | 1.01 |
| | 12–18 | 70,336 | 2.08% | 8122 | 3.07% | 78,458 | 1.42 |
| | >18 | 6911 | 0.20% | 313 | 0.12% | 7224 | 0.60 |
| Curvature | Concave | 1,341,951 | 39.75% | 105,133 | 39.71% | 1,447,084 | 1.00 |
| | Flat | 689,221 | 20.41% | 55,595 | 21.00% | 744,816 | 1.03 |
| | Convex | 1,345,078 | 39.84% | 104,044 | 39.30% | 1,449,122 | 0.99 |
| SPI (×10⁴) | <15.78 | 1,029,734 | 30.50% | 2,447,736 | 93.58% | 3,477,470 | 0.98 |
| | 15.78–1432.47 | 138,638 | 4.11% | 16,422 | 6.20% | 155,060 | 1.46 |
| | >1432.47 | 7898 | 0.23% | 574 | 0.22% | 8472 | 0.93 |
| STI | <35 | 887,114 | 26.27% | 60,897 | 23.00% | 948,011 | 0.88 |
| | 35–600 | 2,320,987 | 68.74% | 185,085 | 69.91% | 2,506,072 | 1.02 |
| | 600–9509 | 162,241 | 4.81% | 18,227 | 6.89% | 180,468 | 1.39 |
| | >9509 | 5948 | 0.18% | 523 | 0.20% | 6471 | 1.11 |
| Topographic Relief | 0–10 | 707,752 | 20.96% | 52,211 | 19.72% | 759,963 | 0.94 |
| | 10–20 | 2,052,327 | 60.79% | 170,331 | 64.34% | 2,222,658 | 1.05 |
| | 20–30 | 552,583 | 16.37% | 37,517 | 14.17% | 590,100 | 0.87 |
| | 30–40 | 54,636 | 1.62% | 2928 | 1.11% | 57,564 | 0.70 |
| | >40 | 8992 | 0.27% | 1745 | 0.66% | 10,737 | 2.24 |
| Rainfall | 303.68–439.10 | 684,222 | 20.27% | 120,189 | 45.40% | 804,411 | 2.05 |
| | 439.10–571.60 | 940,470 | 27.86% | 93,156 | 35.19% | 1,033,626 | 1.24 |
| | 571.60–704.10 | 751,015 | 22.24% | 40,886 | 15.44% | 791,901 | 0.71 |
| | 704.10–836.60 | 526,648 | 15.60% | 10,501 | 3.97% | 537,149 | 0.27 |
| | 836.60–969.10 | 364,558 | 10.80% | 0 | 0.00% | 364,558 | 0.00 |
| | 969.10–1071.92 | 109,377 | 3.24% | 0 | 0.00% | 109,377 | 0.00 |
| Vegetation | Bare Soil | 684,222 | 20.27% | 120,189 | 45.40% | 804,411 | 2.05 |
| | Brush-forbs | 1,410,763 | 41.78% | 126,123 | 47.64% | 1,536,886 | 1.13 |
| | Woods | 831,805 | 24.64% | 18,420 | 6.96% | 850,225 | 0.30 |
| | Grassland | 340,123 | 10.07% | 0 | 0.00% | 340,123 | 0.00 |
| | Snow | 109,377 | 3.24% | 0 | 0.00% | 109,377 | 0.00 |
| NDVI | −0.378–0.038 | 369,510 | 10.94% | 45,345 | 17.13% | 414,855 | 1.50 |
| | 0.038–0.149 | 901,201 | 26.69% | 141,964 | 53.63% | 1,043,165 | 1.87 |
| | 0.149–0.272 | 799,708 | 23.69% | 51,241 | 19.36% | 850,949 | 0.83 |
| | 0.272–0.412 | 734,348 | 21.75% | 18,177 | 6.87% | 752,525 | 0.33 |
| | 0.412–0.705 | 571,523 | 16.93% | 8005 | 3.02% | 579,528 | 0.19 |
| Distance-to-river | 0–300 | 260,359 | 7.71% | 30,297 | 11.44% | 290,656 | 1.43 |
| | 300–600 | 222,237 | 6.58% | 52,314 | 19.76% | 274,551 | 2.62 |
| | 600–900 | 210,492 | 6.23% | 46,048 | 17.39% | 256,540 | 2.47 |
| | 900–1200 | 204,869 | 6.07% | 39,290 | 14.84% | 244,159 | 2.21 |
| | 1200–1500 | 198,806 | 5.89% | 31,041 | 11.73% | 229,847 | 1.86 |
| | >1500 | 2,279,527 | 67.52% | 65,742 | 24.83% | 2,345,269 | 0.39 |
| Distance-to-fault | 0–300 | 522,093 | 15.46% | 79,029 | 29.85% | 601,122 | 1.81 |
| | 300–600 | 510,160 | 15.11% | 75,870 | 28.66% | 586,030 | 1.78 |
| | 600–900 | 297,156 | 8.80% | 38,661 | 14.60% | 335,817 | 1.58 |
| | 900–1200 | 533,958 | 15.81% | 46,486 | 17.56% | 580,444 | 1.10 |
| | 1200–1500 | 348,652 | 10.33% | 18,600 | 7.03% | 367,252 | 0.70 |
| | 1500–1800 | 286,178 | 8.48% | 5486 | 2.07% | 291,664 | 0.26 |
| | 1800–2100 | 207,871 | 6.16% | 600 | 0.23% | 208,471 | 0.04 |
| | >2100 | 670,222 | 19.85% | 0 | 0.00% | 670,222 | 0.00 |
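The FR column of Table 2 appears to follow the usual frequency ratio definition: the share of landslide pixels falling in a class divided by the share of all pixels falling in that class. The sketch below recomputes the curvature rows under that assumption; the function name `frequency_ratios` and the dictionary layout are hypothetical and chosen only for illustration.

```python
def frequency_ratios(classes):
    """Compute the frequency ratio for each class of one factor.

    `classes` maps a class name to (non-landslide pixels, landslide pixels).
    FR = (landslide pixels in class / all landslide pixels)
         / (all pixels in class / all pixels).
    """
    total_landslide = sum(ls for _, ls in classes.values())
    total_pixels = sum(non + ls for non, ls in classes.values())
    return {
        name: (ls / total_landslide) / ((non + ls) / total_pixels)
        for name, (non, ls) in classes.items()
    }


# Pixel counts for the curvature factor, copied from Table 2
curvature = {
    "Concave": (1_341_951, 105_133),
    "Flat": (689_221, 55_595),
    "Convex": (1_345_078, 104_044),
}

fr = frequency_ratios(curvature)
for name, value in fr.items():
    print(f"{name}: FR = {value:.2f}")
```

An FR above 1 indicates that landslides are over-represented in a class relative to its area, and an FR below 1 indicates the opposite.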
Table 3. Results of the Kaiser–Meyer–Olkin (KMO) test and Bartlett’s test.
| Test | Value |
|---|---|
| KMO test | 0.640 |
| Bartlett’s test | 8,177,019.716 |
| p-value | 0.000 |
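Table 4. The correlation matrix of the influencing factors.
| Factors | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | F11 | F12 | F13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | 1.00 | 0.12 | −0.08 | −0.03 | 0.00 | 0.01 | 0.02 | 0.12 | −0.40 | −0.38 | −0.27 | −0.34 | −0.41 |
| F2 | 0.12 | 1.00 | −0.01 | −0.25 | 0.01 | −0.02 | 0.00 | 0.90 | −0.06 | −0.07 | −0.04 | −0.13 | −0.02 |
| F3 | −0.08 | −0.01 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | −0.01 | 0.09 | 0.07 | 0.08 | 0.02 | 0.09 |
| F4 | −0.03 | −0.25 | 0.00 | 1.00 | −0.28 | 0.19 | 0.31 | −0.26 | −0.08 | −0.07 | −0.04 | −0.04 | −0.01 |
| F5 | 0.00 | 0.01 | 0.00 | −0.28 | 1.00 | −0.02 | −0.05 | 0.01 | 0.03 | 0.02 | 0.01 | 0.01 | 0.00 |
| F6 | 0.01 | −0.02 | 0.00 | 0.19 | −0.02 | 1.00 | 0.96 | −0.02 | −0.04 | −0.04 | −0.04 | −0.04 | −0.01 |
| F7 | 0.02 | 0.00 | 0.00 | 0.31 | −0.05 | 0.96 | 1.00 | 0.00 | −0.06 | −0.05 | −0.04 | −0.05 | −0.01 |
| F8 | 0.12 | 0.90 | −0.01 | −0.26 | 0.01 | −0.02 | 0.00 | 1.00 | −0.07 | −0.09 | −0.05 | −0.15 | −0.02 |
| F9 | −0.40 | −0.06 | 0.09 | −0.08 | 0.03 | −0.04 | −0.06 | −0.07 | 1.00 | 0.95 | 0.48 | 0.80 | 0.38 |
| F10 | −0.38 | −0.07 | 0.07 | −0.07 | 0.02 | −0.04 | −0.05 | −0.09 | 0.95 | 1.00 | 0.41 | 0.77 | 0.34 |
| F11 | −0.27 | −0.04 | 0.08 | −0.04 | 0.01 | −0.04 | −0.04 | −0.05 | 0.48 | 0.41 | 1.00 | 0.35 | 0.34 |
| F12 | −0.34 | −0.13 | 0.02 | −0.04 | 0.01 | −0.04 | −0.05 | −0.15 | 0.80 | 0.77 | 0.35 | 1.00 | 0.36 |
| F13 | −0.41 | −0.02 | 0.09 | −0.01 | 0.00 | −0.01 | −0.01 | −0.02 | 0.38 | 0.34 | 0.34 | 0.36 | 1.00 |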
Notes: F1—lithology; F2—slope angle; F3—slope aspect; F4—TWI; F5—curvature; F6—SPI; F7—STI; F8—topographic relief; F9—rainfall; F10—vegetation; F11—NDVI; F12—distance-to-river; F13—distance-to-fault.
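Table 5. Total variance explained.
| Component | Initial Eigenvalues: Total | Initial Eigenvalues: % of Variance | Initial Eigenvalues: Cumulative % | Extraction Sums of Squared Loadings: Total | Extraction Sums of Squared Loadings: % of Variance | Extraction Sums of Squared Loadings: Cumulative % |
|---|---|---|---|---|---|---|
| 1 | 3.506 | 26.969 | 26.969 | 3.506 | 26.969 | 26.969 |
| 2 | 2.190 | 16.843 | 43.812 | 2.190 | 16.843 | 43.812 |
| 3 | 1.873 | 14.409 | 58.220 | 1.873 | 14.409 | 58.220 |
| 4 | 1.146 | 8.813 | 67.034 | 1.146 | 8.813 | 67.034 |
| 5 | 1.038 | 7.985 | 75.019 | 1.038 | 7.985 | 75.019 |
| 6 | 0.910 | 6.997 | 82.016 | 0.910 | 6.997 | 82.016 |
| 7 | 0.724 | 5.568 | 87.584 | - | - | - |
| 8 | 0.624 | 4.797 | 92.382 | - | - | - |
| 9 | 0.570 | 4.384 | 96.766 | - | - | - |
| 10 | 0.252 | 1.939 | 98.705 | - | - | - |
| 11 | 0.096 | 0.737 | 99.441 | - | - | - |
| 12 | 0.040 | 0.310 | 99.751 | - | - | - |
| 13 | 0.032 | 0.249 | 100.000 | - | - | - |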
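The percentage columns of Table 5 follow from the eigenvalues alone: each component's share of variance is its eigenvalue divided by the sum of all eigenvalues (about 13 here, one unit per standardized factor). The sketch below recomputes those shares from the rounded eigenvalues; the function name `variance_explained` is hypothetical, and small differences from the table are expected because the published eigenvalues are rounded to three decimals.

```python
# Initial eigenvalues of the 13 components, copied from Table 5
eigenvalues = [3.506, 2.190, 1.873, 1.146, 1.038, 0.910, 0.724,
               0.624, 0.570, 0.252, 0.096, 0.040, 0.032]


def variance_explained(eigs):
    """Return (percent of variance, cumulative percent) for each component."""
    total = sum(eigs)
    percents = [100.0 * e / total for e in eigs]
    cumulative = []
    running = 0.0
    for p in percents:
        running += p
        cumulative.append(running)
    return percents, cumulative


percents, cumulative = variance_explained(eigenvalues)
for i, (p, c) in enumerate(zip(percents, cumulative), start=1):
    print(f"Component {i}: {p:.3f}% of variance, {c:.3f}% cumulative")
```

Read this way, the six retained components together account for roughly 82% of the total variance of the 13 factors.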
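Table 6. Component score coefficient matrix.
| Factors | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| F1 | −0.577 | −0.076 | −0.019 | 0.112 | −0.299 | 0.447 |
| F2 | −0.207 | −0.588 | 0.717 | −0.177 | −0.034 | −0.003 |
| F3 | 0.126 | 0.005 | 0.032 | −0.115 | 0.798 | 0.573 |
| F4 | −0.054 | 0.611 | −0.091 | −0.486 | −0.061 | −0.004 |
| F5 | 0.032 | −0.195 | −0.001 | 0.850 | 0.198 | −0.112 |
| F6 | −0.102 | 0.719 | 0.619 | 0.223 | 0.012 | 0.009 |
| F7 | −0.117 | 0.746 | 0.626 | 0.139 | 0.004 | 0.007 |
| F8 | −0.220 | −0.590 | 0.714 | −0.174 | −0.020 | −0.012 |
| F9 | 0.925 | −0.052 | 0.140 | 0.035 | −0.176 | 0.199 |
| F10 | 0.895 | −0.038 | 0.124 | 0.047 | −0.210 | 0.230 |
| F11 | 0.592 | −0.040 | 0.097 | −0.069 | 0.132 | −0.107 |
| F12 | 0.842 | 0.016 | 0.057 | 0.045 | −0.251 | 0.165 |
| F13 | −0.577 | −0.076 | −0.019 | 0.112 | −0.299 | 0.447 |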
Notes: F1—lithology; F2—slope angle; F3—slope aspect; F4—TWI; F5—curvature; F6—SPI; F7—STI; F8—topographic relief; F9—rainfall; F10—vegetation; F11—NDVI; F12—distance-to-river; F13—distance-to-fault.
Table 7. Results of the first logistic regression analysis.
| Factors | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Factor 6 |
|---|---|---|---|---|---|---|
| p-value | 0.000 | 0.042 | 0.000 | 0.784 | 0.000 | 0.000 |
Table 8. Results of the Cox and Snell pseudo R² test and the Nagelkerke pseudo R² test.
| Pseudo R² test | Value |
|---|---|
| Cox and Snell pseudo R² | 0.233 |
| Nagelkerke pseudo R² | 0.310 |
Table 9. Estimated coefficients of the factors.
| Factors | BG | Standard Error of Estimate | Wald χ² Value | p-Value | Odds Ratio |
|---|---|---|---|---|---|
| Factor 1 | −5.370 | 0.036 | 21,795.771 | 0.000 | 0.005 |
| Factor 2 | 0.478 | 0.168 | 8.081 | 0.004 | 1.613 |
| Factor 3 | −0.859 | 0.131 | 42.868 | 0.000 | 0.424 |
| Factor 5 | 2.324 | 0.019 | 14,953.978 | 0.000 | 10.215 |
| Factor 6 | −0.538 | 0.017 | 991.685 | 0.000 | 0.584 |
| Constant | 0.925 | 0.016 | 3183.937 | 0.000 | 2.522 |
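The coefficients in Table 9 can be read through the standard logistic regression form, P = 1 / (1 + exp(−z)), where z is the constant plus the sum of each coefficient times its factor value. The sketch below applies that form and also shows that exp(coefficient) approximately reproduces the odds ratio column. It assumes the inputs are the PCA factor scores of a pixel; the function name `landslide_probability` and the all-zero example input are hypothetical and are not values from the study.

```python
from math import exp

# Estimated coefficients and constant, copied from Table 9
coefficients = {
    "Factor 1": -5.370,
    "Factor 2": 0.478,
    "Factor 3": -0.859,
    "Factor 5": 2.324,
    "Factor 6": -0.538,
}
constant = 0.925


def landslide_probability(scores):
    """Logistic model: P = 1 / (1 + exp(-z)), z = constant + sum(b_i * x_i).

    `scores` maps each factor name to its value for one pixel
    (assumed here to be a PCA factor score).
    """
    z = constant + sum(b * scores[name] for name, b in coefficients.items())
    return 1.0 / (1.0 + exp(-z))


# exp(b) gives the odds ratio associated with a one-unit increase in a factor
odds_ratios = {name: exp(b) for name, b in coefficients.items()}

# Hypothetical input: a pixel whose factor scores are all zero
p_zero = landslide_probability({name: 0.0 for name in coefficients})

for name, value in odds_ratios.items():
    print(f"{name}: odds ratio = {value:.3f}")
print(f"P(landslide | all scores zero) = {p_zero:.3f}")
```

Factors with a positive coefficient (odds ratio above 1) raise the modelled probability as their score increases, while factors with a negative coefficient lower it.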
Table 10. Statistical results of the landslide susceptibility mapping. PCA-LR—principal component analysis logistic regression; FR—frequency ratio; AHP—analytic hierarchy process.
| Models | Susceptibility | Landslide Occurred: Count | Landslide Occurred: Ratio | Landslide Occurred: Area (km²) | Total Study Area: Count | Total Study Area: Ratio | Total Study Area: Area (km²) | Prediction Accuracy |
|---|---|---|---|---|---|---|---|---|
| PCA-LR | Very Low | 8021 | 3.02% | 0.80 | 842,549 | 23.14% | 84.25 | 83.4% |
| | Low | 1625 | 6.12% | 1.63 | 818,895 | 22.49% | 81.89 | |
| | Moderate | 33,901 | 12.76% | 3.39 | 655,499 | 18.00% | 65.55 | |
| | High | 88,080 | 33.15% | 8.81 | 694,770 | 19.08% | 69.48 | |
| | Very High | 1,184,800 | 44.59% | 11.85 | 629,309 | 17.28% | 62.93 | |
| AHP | Very Low | 2441 | 0.92% | 0.24 | 617,269 | 16.95% | 61.73 | 76.9% |
| | Low | 16,843 | 6.34% | 1.68 | 1,031,436 | 28.33% | 103.14 | |
| | Moderate | 29,421 | 11.07% | 2.94 | 889,896 | 24.44% | 88.99 | |
| | High | 76,814 | 28.91% | 7.68 | 701,028 | 19.25% | 70.10 | |
| | Very High | 139,213 | 52.39% | 13.92 | 401,393 | 11.02% | 40.14 | |
| FR | Very Low | 4774 | 1.80% | 0.48 | 685,253 | 18.82% | 68.53 | 79.9% |
| | Low | 18,598 | 7.00% | 1.86 | 1,016,745 | 27.92% | 101.67 | |
| | Moderate | 44,106 | 16.60% | 4.41 | 848,232 | 23.30% | 84.82 | |
| | High | 101,138 | 38.06% | 10.11 | 740,007 | 20.32% | 74.00 | |
| | Very High | 96,116 | 36.17% | 9.61 | 350,785 | 9.63% | 35.08 | |