Article

Landslide Detection Using the Unsupervised Domain-Adaptive Image Segmentation Method

1 College of Civil Engineering, Sichuan Agricultural University, Dujiangyan 611830, China
2 Sichuan Higher Education Engineering Research Center for Disaster Prevention and Mitigation of Village Construction, Sichuan Agricultural University, Dujiangyan 611830, China
3 School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510640, China
4 State Key Laboratory of Subtropical Building Science, South China University of Technology, Guangzhou 510640, China
* Author to whom correspondence should be addressed.
Submission received: 8 April 2024 / Revised: 22 June 2024 / Accepted: 23 June 2024 / Published: 26 June 2024
(This article belongs to the Section Land Innovations – Data and Machine Learning)

Abstract

After a landslide, swift and precise identification of the affected area is paramount for facilitating urgent rescue operations and damage assessments. This is particularly vital for land use planners and policymakers, enabling them to efficiently address hazard mitigation, resettle those affected by the hazard, and strategize land planning in the impacted regions. Despite its importance, conventional landslide monitoring often falls short due to its restricted scope and the challenges associated with data acquisition. This study proposes a landslide detection method based on unsupervised multisource and target domain adaptive image segmentation (LUDAS) that is capable of achieving robust and generalized landslide mapping across multiple source and target domains. Specifically, LUDAS consists of two phases. In the first phase, we introduce an unsupervised interdomain translation network to align the styles of multiple source domains to multiple target domains, generating pseudotarget domain data. Our interdomain translation network is capable of style transfer between any two domains. Through careful design of the network structure and loss functions, we ensure effective style transfer while preserving the content structure of the source domain images. In the second phase, the landslide segmentation model is trained in a supervised manner using annotated data from multiple source domains and multiple pseudotarget domains, resulting in a model with strong generalization capabilities that can adapt to multiple source and target domains. Finally, through extensive qualitative and quantitative analysis experiments, our study confirms that the proposed domain-adaptive segmentation model not only achieves exceptional landslide segmentation performance across multiple target domains but also, owing to its good generalizability and transferability, has great potential for application in emergency responses to landslides. This capability can provide strong support for post-disaster emergency rescue, disaster assessment, and land planning in areas with scarce data.

1. Introduction

Landslides are extremely common natural hazards in mountainous areas, in which soil, rock, or other material on a slope slides downhill under the influence of natural or human factors [1]. They are mainly triggered by natural factors such as rainfall and earthquakes, as well as by human activities such as overexploitation and mining. Whether they occur in vulnerable regions or in population centers, landslides negatively affect human life in various ways and, in severe cases, may even threaten people's lives [2]. In addition, observing and delineating the scope and impact of mountain landslides is often hindered by the complex terrain and unique topography of the areas where they occur. Therefore, accurately mapping the causes and affected areas of landslides is crucial for developing effective early warning and emergency strategies, which helps reduce casualties and property losses before and after hazards occur. In recent years, the development of airborne and satellite remote sensing imaging technology has made landslide prediction feasible while reducing the time cost and potential risks for field staff, and remote sensing images have become one of the key data sources for landslide detection [3].
Traditional landslide detection mainly relies on the prior knowledge of geological experts or on qualitative or semiquantitative evaluations based on historical statistical data, because the weights of the various influencing factors cannot be determined. This approach is therefore subjective and empirical and lacks reasonable scientific explanation and guidance [4]. In recent years, with the rapid development of computer technology, many scholars have begun exploring the application of machine learning methods, such as random forests [5,6,7,8], deep convolutional neural networks (CNNs), and regression models [9,10,11,12], in the field of landslide detection. Xie et al. [13] proposed the U2PL semi-supervised semantic segmentation method, which automatically identifies landslides in optical remote sensing images and uses unlabeled samples to avoid the performance degradation that traditional supervised learning suffers when training samples are insufficient. The model captures more of the features hidden in landslide images while using only a small number of labeled samples, reducing the amount of manual labeling work. Mezaal et al. [14] applied recurrent neural networks (RNNs) and multilayer perceptron neural networks (MLP-NNs) to landslide detection. The segmentation parameters and feature selection were optimized by a supervised method and correlation-based feature selection, respectively, and a systematic grid search defined the hyperparameters of the network architecture. The proposed model with optimized hyperparameters produced the most accurate classification results. Nava et al. [15] utilized CNNs implemented in TensorFlow to analyze optical data combined with multipolarization SAR data for rapid detection even in extremely harsh situations and achieved an accuracy comparable to that of classical optical change detection methods for landslide identification. Ghorbanzadeh et al. [16] used optical data and terrain factors from the RapidEye satellite to analyze the potential of machine learning methods: twenty different maps were created using ANN, SVM, and RF models, as well as different CNN instances, and compared with extensive fieldwork results by means of the mean intersection over union (mIoU) and other common metrics; they concluded that deep learning can improve landslide detection in the future. Wang et al. [17] performed comparative experiments and reported that among eight machine learning and deep learning models (LR, SVM, RF, Discrete AdaBoost, LogitBoost, Gentle AdaBoost, CNN-6, and CNN-11), CNN-11 is the most promising model for landslide identification, concluding that landslide identification methods combining machine learning and deep learning techniques have good robustness and great potential.
However, despite certain predictive successes achieved by machine learning in landslide detection, there are several drawbacks. First, landslide detection is influenced by multiple unrelated or nonlinearly related factors, and simple machine learning methods have limited predictive effects or are prone to overfitting. Second, different feature factors have varying degrees of influence on landslide detection, leading to some subjectivity and randomness in the selection of influencing factors. Finally, machine learning requires a large amount of sample data, but in many cases, important data are lost or not easily mapped, and the collection and organization of sample data have considerable human and time costs.
Breakthroughs in aerospace technology have also opened new directions for landslide detection. Remote sensing geological images collected by satellites are more comprehensive and alleviate the issue of human cost, and they have become one of the important data sources for landslide detection. Moreover, deep learning possesses strong feature extraction capabilities and can adaptively learn relevant feature information from input data. Therefore, image segmentation techniques based on deep learning have gradually become the mainstream methods for detecting landslides and for accurately identifying and predicting landslide areas and potential risk zones [18,19,20]. Li et al. [21] proposed an improved PSPNet network model for the semantic segmentation of landslides in Linzhi, Tibet, adopting the MobileNetV2 structure as the feature extraction network, which reduced the number of network parameters while improving the convergence speed of the network. Du et al. [22] used six popular semantic segmentation models for landslide detection, and the experimental results showed that the global convolutional network performed better in identifying specific types of landslide images, while DeepLabv3 had better robustness. Li et al. [23] adopted a fully supervised learning approach based on a U-Net model fused with a fully connected conditional random field to segment landslide areas and used the STAPLE algorithm to generate ground truth masks of landslide occurrences. Ullo et al. [24] proposed an improved landslide detection method based on Mask R-CNN and employed transfer learning to train the proposed model, enhancing its generalizability.
A segmentation model based on deep learning can accurately locate landslides and achieve pixel-level prediction of landslide areas. However, landslide characteristics differ significantly among regions, and the generalizability of these models is limited, requiring additional training to adapt to images from other regions. In addition, the above segmentation algorithms are all based on supervised learning. Although supervised learning can achieve general landslide detection through manual annotation and similar methods, it is difficult to avoid the influence of human error. Moreover, owing to the scarcity of landslide datasets, the difficulty of obtaining data, and high annotation costs, these methods not only struggle to accurately segment landslides in different regions but also struggle to meet the demands of complex landslide detection engineering problems.
In recent research, domain adaptation techniques have gradually gained attention. Domain adaptation refers to the process of transferring and adapting a model developed and trained in one domain to a different, new environment. This is particularly important in landslide analysis, where annotated data are scarce. Effective domain adaptation improves the applicability and accuracy of the model in new areas and reduces the dependence on annotated data. Single-source to single-target domain adaptation is the starting point of domain adaptation research, which focuses on how to transfer knowledge learned from a data-rich source domain (e.g., a domain with abundant annotated data) to a specific target domain (usually a domain with scarce or no annotated data) [25]. Wang et al. [26] proposed a novel network structure, Conditional Coupled Generative Adversarial Networks (CoCoGANs), to address the zero-shot domain adaptation (ZSDA) problem, enabling knowledge transfer in situations where only a single-source domain is available and the target domain is inaccessible. To address the generalizability of unseen target domains, Xu et al. [27] introduced a method called simple domain expansion (SimDE), which learns domain-invariant information by generating uncertain samples. As research on domain adaptation has deepened, single-source to multitarget domain adaptation has become a new research hotspot. In this scenario, the model needs to not only learn from a single-source domain but also adapt to multiple different target domains. To connect the shared knowledge across multiple domains, Zeng et al. [28] proposed a multitarget domain adaptation (MTDA) method applied to rotating machinery fault diagnosis using features from multiple target domains to enhance the transfer accuracy. Gholami et al. [29] introduced an information-theoretic approach for domain adaptation in the context of unlabeled multiple target domains and a labeled single-source domain, which can find a shared latent space among all domains while considering specific elements in each domain.
In response to the above issues, this paper proposes a robust landslide detection method based on unsupervised domain-adaptive image segmentation (LUDAS), aiming to build a generalized segmentation model for landslide detection across multiple source and target domains. Specifically, our proposed landslide segmentation method is divided into two stages. The first stage aligns data from multiple source domains to different target domains, yielding source domain data with a target domain style, referred to as pseudotarget domain data. This stage is achieved by training an unsupervised interdomain image translation network. In the second stage, the landslide segmentation model is trained in a supervised manner using the multisource domain data and the pseudotarget domain data generated in the first stage. This results in a model with strong generalization ability that can perform precise and robust segmentation across multiple source and target domains. Finally, a large number of qualitative and quantitative analysis experiments verified that the proposed domain-adaptive segmentation model has good segmentation performance across multiple target domains.

2. Study Area

In this study, two datasets were selected as research subjects: the Bijie Landslide Dataset and the GVLM Dataset [30,31].
Bijie Landslide Dataset [30]. Bijie is located in the northwest of Guizhou Province at the junction of Guizhou, Yunnan, and Sichuan provinces. It occupies 15.25% of Guizhou's land area, covering 26,900 square kilometers, with a width of 158 km from north to south and a length of 310 km from east to west. Bijie is mainly mountainous, with high relief and strongly dissected terrain. Its geographical coordinates range from 26°21′ N to 27°46′ N latitude and from 103°36′ E to 106°43′ E longitude, with altitudes varying from 457 m to 2900 m [32,33,34,35,36,37,38,39]. The geological strata in Bijie City are quite complete, with a variety of stratigraphic units and complex rock types. Strata from the Proterozoic to the Cenozoic are exposed, mainly consisting of sedimentary rocks. Based on rock strength, the rock layers in the area are divided into four categories: the hard rock group, the soft and hard interlayered rock group, the soft rock group, and the loosely accumulated engineering geological rock group [32,33,34]. The hard rock group is mainly composed of limestone, basalt, and dolomite; the soft and hard interlayered rock group is dominated by dolomite and limestone with interlayers of sandstone and mudstone; the soft rock group contains a large proportion of mudstone, siltstone, and shale; and the loosely accumulated engineering geological rock group is mostly composed of Quaternary gravel. There are 193 rivers in Bijie longer than 10 km, belonging to the four major river systems of the Wujiang, Chishui, Beipanjiang, and Jinsha [35]. Groundwater resources are abundant and are divided into carbonate karst water, bedrock fissure water, and local pore water in loose accumulation layers [36,37]. Additionally, the area has a humid subtropical monsoon climate, with annual average precipitation ranging from 849 mm to 1399 mm. Due to these geological and climatic conditions, the region has relatively poor geological stability and is one of the areas most severely affected by landslides in China [38,39]. The dataset includes satellite optical images, landslide area mask images, landslide boundary shapefiles, and digital elevation models. The landslide inventory map of the Bijie dataset is shown in Figure 1; the landslides are distributed over almost the entire city of Bijie. The images were collected by the TripleSat satellite from May to August 2018 and consist of 770 landslide images and 2003 non-landslide images. This dataset includes various types of landslides, such as rock falls, rockslides, and debris slides.
GVLM Dataset [31]. The Global Very-High-Resolution Landslide Mapping (GVLM) dataset is a large-scale landslide dataset covering countries and regions across six continents, including Asia, Africa, North America, South America, Europe, and Oceania, with a total coverage area of 163.77 km². GVLM contains 17 subdatasets, each comprising a pair of bitemporal very-high-resolution (VHR) remote sensing satellite images (pre- and post-landslide) along with the corresponding landslide area mask images. These subdatasets cover landslide sites in various geographical locations, including landslides of different sizes, shapes, occurrence times, spatial distributions, phenological states, and land cover types. This highly diverse dataset aids in assessing the generalization performance of deep learning models due to its spectral heterogeneity and significant intensity variations [40,41]. Table 1 shows the landslide image acquisition time, triggering causes, and example images for each region, reflecting the differences in the types and visual appearance of landslides across regions.
Based on the above two datasets, we randomly divided each area into source and target domains for the purpose of this study. The source domain includes 5 areas, and the target domain includes 13 areas. The geographical distribution of all the study areas is shown in Figure 2. The domain division of our research dataset is detailed in Table 1. The examples of different areas in Table 1 show the differences in landslides between the different areas. These differences include not only geological and vegetation differences between mountains but also the shape and distribution of landslides. Because of these differences, it is difficult to generalize previous single-source-domain and single-target-domain segmentation methods to landslide images in other unknown target domains, which leads to segmentation difficulties. Therefore, aiming at the most common scenes of multisource and multitarget domains, we propose a landslide segmentation method based on multisource domain and multitarget domain adaptation.
Mask-based sliding window cropping. Due to the large size of the images in the GVLM dataset and their relatively limited number, we expanded the dataset by cropping the images. To ensure that the cropped images still contained landslide areas, a mask-based sliding window cropping strategy was adopted. Specifically, the size of the cropped image slices was set to 256 × 256, with both the horizontal and vertical sliding steps set to 64. While cropping the original images, the mask images were cropped at the same locations. Finally, by counting the number of pixels containing landslide areas in the mask images and calculating the proportion of landslide areas to the total image area, images with a proportion less than 0.05 were excluded.
Different scale cropping strategies. To enrich the training data, this study also employed sliding windows of various sizes for dataset augmentation. Sliding windows of different sizes can capture information on different scales. Larger windows can cover broader scenes, completely encompassing larger landslide areas and providing global information. In contrast, smaller windows are beneficial for capturing local details. Utilizing cropping methods with windows of different scales helps the model learn more comprehensive and diverse features, thereby enhancing the model’s generalization ability and enabling it to better adapt to inputs of varying scales and positions.
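To make the two cropping strategies concrete, below is a minimal Python sketch of the mask-based sliding-window cropping with the stated 256 × 256 window, stride of 64, and 0.05 area threshold; the additional window sizes in the multiscale helper (384 and 512) are illustrative assumptions, since the text does not list the sizes actually used.

```python
import numpy as np

def sliding_window_crops(image, mask, patch=256, stride=64, min_ratio=0.05):
    """Crop aligned image/mask patches with a sliding window, keeping only
    patches whose landslide (mask > 0) proportion is at least min_ratio."""
    crops = []
    h, w = mask.shape[:2]
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            m = mask[y:y + patch, x:x + patch]
            if (m > 0).mean() >= min_ratio:  # proportion of landslide pixels
                crops.append((image[y:y + patch, x:x + patch], m))
    return crops

def multiscale_crops(image, mask, patch_sizes=(256, 384, 512), stride=64):
    """Repeat the cropping at several window sizes: larger windows provide
    global context, smaller ones capture local detail."""
    return {p: sliding_window_crops(image, mask, patch=p, stride=stride)
            for p in patch_sizes}

# Example usage on random data:
image = np.random.rand(512, 512, 3)
mask = (np.random.rand(512, 512) > 0.9).astype(np.uint8)
print(len(sliding_window_crops(image, mask)))
```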

3. Materials and Methods

3.1. Implementation Details

In our experiments, the hardware platform consisted of an Intel Core i7 CPU (Intel, CA, USA) with 16 GB of memory and an NVIDIA A100 GPU (NVIDIA, Santa Clara, CA, USA) with 80 GB of video memory. The software platform was Ubuntu 16.04 (Canonical, London, UK), using the PyTorch (Meta, Menlo Park, CA, USA) GPU framework and the Python programming language. During training, the Adam algorithm was used to update and optimize the model parameters, with the initial learning rate set to 0.001. The model was trained for 100 epochs, and the learning rate was multiplied by 0.2 when the epoch count reached 30, 50, and 80, ensuring stable convergence in the later stages of training. The batch size was set to 32, making full and reasonable use of the hardware resources to improve computational efficiency and training speed while also maintaining the stability and convergence of model training.
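The optimizer and schedule described above map directly onto PyTorch's Adam optimizer and MultiStepLR scheduler; a minimal sketch (the linear layer is only a placeholder for the actual network):

```python
import torch

model = torch.nn.Linear(3, 1)  # placeholder for the actual segmentation network
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# Multiply the learning rate by 0.2 at epochs 30, 50, and 80.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30, 50, 80], gamma=0.2)

for epoch in range(100):  # 100 training epochs
    # ... iterate over batches of size 32: forward, loss, backward, step ...
    scheduler.step()      # advance the schedule once per epoch
```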

3.2. Multiple Source and Target Domain-Adaptive Landslide Segmentation Framework

In this paper, we propose a landslide segmentation method based on multisource and multitarget domain adaptation, which includes two stages, as illustrated in Figure 3. The first stage involves training an interdomain translation network, aiming to achieve style transfer between any two domains. This stage is an unsupervised learning process that does not require annotated data. As shown in Figure 3, there is a shift in the feature distribution between two different domains. Through the image translation model, we align the feature distribution of the source domain to that of the target domain, generating source domain data with the feature distribution of the target domain, which we refer to as pseudotarget domain data. In other words, through the interdomain image translation model of this stage, we can easily generate images that have the content and structure of one domain and the style of another domain. Since the translated images maintain the content structure of the source domain and the style of the target domain, the content and structure of the images remain largely consistent; hence, the translated data come with annotation labels from the source domain.
An important motivation for domain adaptation is the lack of labeled data in the target domain for supervised learning. Through the interdomain translation model of the first stage, we can generate annotated image data with the distribution of any target domain for supervised training of downstream tasks. Therefore, the second stage of our method (LUDAS) combines data from the source domain itself and the pseudotarget domain data generated in the first stage as our training data to supervise the training of a robust landslide image segmentation model. This model can perform precise landslide segmentation in multiple unseen target domains. We will detail the interdomain image translation model and the landslide segmentation model in the following sections.
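The two-stage data flow can be summarized in a schematic Python sketch; the `translator` callable, the domain containers, and the label pass-through are assumed interfaces for illustration, not the paper's actual code:

```python
def build_stage2_training_set(translator, source_domains, target_domains):
    """Assemble the stage-2 training data: every labeled source image keeps
    its annotation, and a restyled copy is generated per target domain."""
    train_pairs = []
    for domain in source_domains:               # each: list of (image, label)
        for image, label in domain:
            train_pairs.append((image, label))            # original source pair
            for style_pool in target_domains:             # each: list of images
                fake = translator(image, style_pool[0])   # pseudotarget image
                train_pairs.append((fake, label))         # label carried over
    return train_pairs
```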

3.3. Unsupervised Lightweight Interdomain Image Translation Network

In this section, we detail the network architecture and training specifications of the proposed interdomain image translation model. Similar to most existing adversarial-based works, our method (LUDAS) follows an adversarial generative training process. The interdomain image translation model consists of an interdomain translation network based on an encoder–decoder structure and a multidomain discriminator. Our proposed interdomain image translation network is unsupervised and does not require additional annotations. At the same time, to use this translation model as an online data augmenter for online domain adaptation in training the landslide segmentation model, our proposed interdomain image translation network is designed to be very concise and lightweight.

3.3.1. Interdomain Translation Network

Our proposed interdomain translation network aims to achieve style transfer between images from any two different domains, allowing annotated source domain data to be aligned and transferred to the distribution of the target domain. For ease of description, we refer to these two domains as Domain A and Domain B.
Early works on style transfer directly trained an encoder–decoder network using only content images as input, which limited the network to single-style transfer. Inspired by AdaIN (adaptive instance normalization), we propose a method that can achieve the transfer of arbitrary styles, as illustrated in Figure 4 [42]. Our proposed style transfer network accepts images from two different domains as inputs, where an image from Domain A serves as the content image and an image from Domain B serves as the style image. The network generates a stylized image by combining the content features of the content image with the style information of the style image. Specifically, we use a pre-trained VGG-19 model as the encoder of the network, which is responsible for extracting multiscale feature representations $f_i$ [41]. The decoder uses a structure symmetrical to the encoder and is responsible for upsampling and decoding the encoded features. To better decouple the content and style features of the input images, we use the adaptive instance normalization module to transfer feature statistics, particularly the per-channel means and variances, for style transfer in feature space [42]. The AdaIN module takes two inputs, the content image features $x_{content}$ and the style image features $y_{style}$, and its computation is as follows:
$$\mathrm{AdaIN}(x_{content}, y_{style}) = \sigma(y_{style})\,\frac{x_{content} - \mu(x_{content})}{\sigma(x_{content})} + \mu(y_{style})$$
where $\mu(\cdot)$ and $\sigma(\cdot)$ represent the channel-wise mean and standard deviation, respectively. Through the AdaIN module, the mean and standard deviation of the content image features are transformed to match those of the style image features. This preserves the content structure of the content image features while effectively transferring the style information of the style image features to the content image. AdaIN is a computational process without learnable parameters, allowing the modulation of content and style features between images from any two domains at a low computational cost. Unlike [43], which added only a single AdaIN module between the encoder and decoder, we add an AdaIN-based skip connection module between the encoding and decoding modules at symmetric scales to achieve style transfer of features at different scales. The computational process of the AdaIN-based skip connection module is as follows:
$$\mathrm{Skip\text{-}AdaIN}(f_i^c, f_i^s, d_{i+1}) = \mathrm{AdaIN}(f_i^c, f_i^s) + \mathrm{Upsample}(d_{i+1})$$
where $f_i^c$ and $f_i^s$ represent the content and style image features at the $i$-th scale of the encoder, respectively, and $d_{i+1}$ represents the decoded features at the $(i+1)$-th scale of the decoder. $\mathrm{Upsample}$ denotes the upsampling operation, which is implemented by bilinear interpolation. With this structure, our IDTNet, by combining high-resolution low-level features, can retain more of the content and structural details of the content image while also exploiting high-level features that contain global information for effective style transfer.
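A minimal PyTorch sketch of the AdaIN operation and the AdaIN-based skip connection defined above, assuming (N, C, H, W) feature tensors and bilinear upsampling for the Upsample term:

```python
import torch
import torch.nn.functional as F

def adain(x_content, y_style, eps=1e-5):
    """AdaIN: shift the content features' channel-wise statistics
    to match those of the style features."""
    mu_c = x_content.mean(dim=(2, 3), keepdim=True)
    sd_c = x_content.std(dim=(2, 3), keepdim=True) + eps
    mu_s = y_style.mean(dim=(2, 3), keepdim=True)
    sd_s = y_style.std(dim=(2, 3), keepdim=True)
    return sd_s * (x_content - mu_c) / sd_c + mu_s

def skip_adain(f_c, f_s, d_next):
    """AdaIN-based skip connection: stylized encoder features plus the
    bilinearly upsampled decoder features of the next-deeper stage."""
    up = F.interpolate(d_next, size=f_c.shape[2:],
                       mode="bilinear", align_corners=False)
    return adain(f_c, f_s) + up
```

Because AdaIN has no learnable parameters, this skip connection adds style modulation at every scale at negligible cost.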

3.3.2. Multidomain Discriminator

After generating style-transferred images through the aforementioned interdomain translation model, we need to ensure that the feature distributions of the style image (real image in domain B) and the stylized image (fake image in domain B) become more closely aligned. To achieve this goal, similar to most style transfer works, we employ adversarial learning to align the data distribution of stylized images with the data distribution of style images [44,45,46]. However, previous methods require a separate discriminator for each target domain to achieve style transfer in n target domains. When there are a large number of target domains, the model becomes more complex and redundant, increasing the cost and difficulty of training. To address this issue, we propose a multidomain discriminator, as shown in Figure 5. The multidomain discriminator mainly consists of a shared feature extractor and multiple domain classification heads. The shared feature extractor is used to extract feature representations of the input images for subsequent domain classification. Then, for different target domains that require transfer, the corresponding domain classifier determines whether the input data are real, belonging to that domain, or fake, generated by the above interdomain translation model. By combining the interdomain translation model and multidomain discriminator, we can achieve style transfer across multiple target domains through an adversarial learning training strategy.
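The shared-backbone, one-head-per-domain layout can be sketched as follows; the PatchGAN-style convolutional backbone and the channel widths are assumptions, as the text does not specify them:

```python
import torch.nn as nn

class MultiDomainDiscriminator(nn.Module):
    """Shared feature extractor followed by one real/fake classification
    head per target domain (schematic sketch)."""
    def __init__(self, num_domains, in_ch=3, width=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, width, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(width, width * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(width * 2, width * 4, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        # One lightweight per-domain classification head over shared features.
        self.heads = nn.ModuleList(
            [nn.Conv2d(width * 4, 1, 3, padding=1) for _ in range(num_domains)])

    def forward(self, x, domain):
        return self.heads[domain](self.backbone(x))
```

Adding a new target domain then only requires adding one small head rather than a whole new discriminator.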

3.3.3. Training

During the training process, in addition to the adversarial loss mentioned above to ensure that the data distribution of the stylized images aligns with the data distribution of the style images, we also combined content and style losses, as shown in Figure 4 [46]. The content loss helps the model learn how to better preserve the content and structural information of the source domain image, while the style loss assists the model in learning how to make the generated image’s style more similar to that of the target domain image.
The definition of the adversarial loss is as follows:
$$Loss_{adv} = -\log D\big(G(I_{content}, I_{style})\big)$$
where $Loss_{adv}$ represents the adversarial loss, $I_{content}$ and $I_{style}$ represent the input content image and style image, respectively, and $G$ and $D$ represent the interdomain image translation model and the multidomain discriminator, respectively.
For the content loss, following [47], we adopted a combined loss of a self-similarity loss and a normalized perceptual loss, which guide the model in effectively maintaining the consistency of content and structure between the content image and the stylized image. Given a content image $I_{content} \in \mathbb{R}^{h_c \times w_c \times c}$ and the generated stylized image $O_{stylized} \in \mathbb{R}^{h_o \times w_o \times c}$, the content loss is calculated as follows:
$$Loss_{ss} = \frac{1}{(h_c w_c)^2} \sum_{i,j} \left| \frac{D_{ij}^c}{\sum_i D_{ij}^c} - \frac{D_{ij}^o}{\sum_i D_{ij}^o} \right|$$
$$Loss_{np} = \big\| \mathrm{norm}(I_{content}) - \mathrm{norm}(O_{stylized}) \big\|_2$$
$$Loss_{content} = Loss_{ss} + Loss_{np}$$
where $Loss_{content}$ represents the content loss, and $Loss_{ss}$ and $Loss_{np}$ represent the self-similarity loss and the normalized perceptual loss, respectively; $D_{ij}^c$ and $D_{ij}^o$ denote the pairwise feature self-similarity matrices of the content image and the stylized image.
For the style loss, we combined the relaxed Earth Mover's Distance (rEMD) loss and the mean-variance loss [48]. The rEMD loss measures the distance between the feature distributions of the style image and the stylized image, achieving domain alignment between the two images, while the mean-variance loss brings the style image and the stylized image closer in color style. Given a style image $I_{style} \in \mathbb{R}^{h_s \times w_s \times c}$ and the generated stylized image $O_{stylized} \in \mathbb{R}^{h_o \times w_o \times c}$, the style loss is calculated as follows:
$$Loss_{rEMD} = \max\left( \frac{1}{h_s w_s} \sum_{i=1}^{h_s w_s} \min_j Cos_{ij},\; \frac{1}{h_o w_o} \sum_{j=1}^{h_o w_o} \min_i Cos_{ij} \right)$$
$$Loss_{mv} = \big\| \mu(I_{style}) - \mu(O_{stylized}) \big\|_2 + \big\| \sigma(I_{style}) - \sigma(O_{stylized}) \big\|_2$$
$$Loss_{style} = Loss_{rEMD} + Loss_{mv}$$
where $Loss_{style}$ represents the style loss, $Loss_{rEMD}$ and $Loss_{mv}$ represent the relaxed Earth Mover's Distance loss and the mean-variance loss, respectively, and $Cos_{ij}$ denotes the cosine distance between features $i$ and $j$.
Finally, the total loss function can be expressed as
$$Loss_{total} = Loss_{adv} + Loss_{content} + Loss_{style}$$
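To illustrate the style terms, here is a hedged PyTorch sketch of the mean-variance loss and the relaxed EMD loss over flattened feature maps; the choice of feature layers and any term weighting are not specified above and are left out:

```python
import torch
import torch.nn.functional as F

def mean_variance_loss(style_feat, stylized_feat):
    """Loss_mv: match channel-wise means and standard deviations of the
    style and stylized feature maps ((N, C, H, W) tensors)."""
    return (torch.norm(style_feat.mean(dim=(2, 3)) - stylized_feat.mean(dim=(2, 3)), dim=1)
            + torch.norm(style_feat.std(dim=(2, 3)) - stylized_feat.std(dim=(2, 3)), dim=1)).mean()

def remd_loss(style_feat, stylized_feat):
    """Loss_rEMD: relaxed Earth Mover's Distance with a cosine-distance
    cost matrix between the two flattened feature sets."""
    a = F.normalize(style_feat.flatten(2).transpose(1, 2), dim=-1)     # (N, hw_s, C)
    b = F.normalize(stylized_feat.flatten(2).transpose(1, 2), dim=-1)  # (N, hw_o, C)
    cost = 1.0 - torch.bmm(a, b.transpose(1, 2))                       # (N, hw_s, hw_o)
    return torch.maximum(cost.min(dim=2).values.mean(dim=1),
                         cost.min(dim=1).values.mean(dim=1)).mean()
```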

3.4. Robust Landslide Segmentation Network

3.4.1. Architecture Overview

U-Net is a milestone in the field of segmentation, and its encoder–decoder structure and skip connections are key to its success [49]. Our proposed landslide segmentation model follows a U-Net-like encoder–decoder structure, as illustrated in Figure 6. Below, we will introduce the model’s encoder and decoder separately.
Encoder. Numerous studies have shown that using pre-trained models and fine-tuning them for downstream tasks is more efficient and stable, as their feature extractors already possess a good general feature extraction capability. For this reason, we use ResNet18 as the encoder part of the network, employing pre-trained weights. The encoder includes four residual blocks, each connected by a downsampling module to capture features at different scales. The encoder and decoder have four symmetric stages, with the scales of the encoded and decoded features at each stage being the same. As shown in Figure 6, the encoded features of each stage are combined with the corresponding stage’s decoded features through skip connections, compensating for the loss of local detail information during downsampling.
Decoder. Recent works have shown that achieving good segmentation requires the model to extract not only local fine details but also global semantic information [50,51]. To this end, we propose a global–local-based decoder. The decoder consists of four Transformer–Convolution Blocks, each connected by an upsampling module to restore the scale of the features to be consistent with the original image. The Transformer–Convolution Block is used to extract both global and local features simultaneously. Finally, the decoder is connected to a classification head to implement pixel-level classification prediction tasks.
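The overall layout described above can be sketched in PyTorch as follows; plain convolutional blocks stand in here for the Transformer–Convolution blocks detailed in Section 3.4.2, and the decoder channel widths are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class LandslideUNet(nn.Module):
    """U-shaped sketch: pretrained ResNet18 encoder, four symmetric
    decoder stages joined by skip connections, and a pixel classifier."""

    def __init__(self, num_classes=2):
        super().__init__()
        r = resnet18(weights="IMAGENET1K_V1")
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu)  # 1/2 scale
        self.enc = nn.ModuleList([
            nn.Sequential(r.maxpool, r.layer1),            # 1/4,  64 ch
            r.layer2,                                      # 1/8,  128 ch
            r.layer3,                                      # 1/16, 256 ch
            r.layer4,                                      # 1/32, 512 ch
        ])
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec3 = self._block(512 + 256, 256)
        self.dec2 = self._block(256 + 128, 128)
        self.dec1 = self._block(128 + 64, 64)
        self.head = nn.Conv2d(64, num_classes, 1)

    @staticmethod
    def _block(cin, cout):
        return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                             nn.BatchNorm2d(cout), nn.ReLU())

    def forward(self, x):
        x = self.stem(x)
        e1 = self.enc[0](x); e2 = self.enc[1](e1)
        e3 = self.enc[2](e2); e4 = self.enc[3](e3)
        d = self.dec3(torch.cat([self.up(e4), e3], dim=1))  # skip connections
        d = self.dec2(torch.cat([self.up(d), e2], dim=1))
        d = self.dec1(torch.cat([self.up(d), e1], dim=1))
        return F.interpolate(self.head(d), scale_factor=4,
                             mode="bilinear", align_corners=False)
```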

3.4.2. Transformer–Convolution Block

Landslide areas vary greatly in shape and size. Large landslides can span almost the entire image, while some areas have very small and dispersed landslides. This characteristic poses a significant challenge in the field of landslide segmentation. To address this issue, we propose a transformer and convolution feature extraction module, as illustrated in Figure 7. When segmenting large landslide areas, global context information interaction is performed through the transformer to achieve long-range information sharing between landslide pixels, thus aiding in segmentation [52]. Convolution effectively extracts local detail features and is capable of effectively segmenting and identifying small landslides [53].
The computational process of the Transformer–Convolution Block can be represented as follows:
$$Z_g = \mathrm{Transformer}(X)$$
$$Z_l = \mathrm{Convolution}(X)$$
$$Z = \mathrm{DWC}(Z_g + Z_l)$$
where $X$ and $Z$ represent the input and output features of the Transformer–Convolution block, respectively, $Z_g$ and $Z_l$ represent the global features extracted by the Transformer module and the local features extracted by the convolution module, respectively, and $\mathrm{DWC}$ denotes the depthwise convolution module, which fuses the local and global features.
Vision Transformer is a pioneering work that applies the standard Transformer directly to two-dimensional images [54]. The core of the standard Transformer is its self-attention mechanism. However, the computational complexity of multihead self-attention (MSA) is quadratically proportional to the length of the input sequence, making the computational cost prohibitive when applied to images [55]. Therefore, following the work of [56], we use the Swin Transformer module as our model’s global feature extraction module. The Swin Transformer restricts MSA computations within nonoverlapping local windows and alternates with a window-shifting strategy to ensure information interaction between adjacent windows, significantly reducing computational costs. Through the parallel feature extraction module of the Transformer and Convolution, the model can perceive both global and local information of the image, helping the segmentation model to perform more accurate segmentation on landslide images.
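A simplified PyTorch sketch of the Transformer–Convolution block follows. Standard multihead self-attention stands in for the Swin window attention described above (full attention is quadratic in the number of tokens, which is precisely the cost the windowed variant avoids), so this is an illustrative approximation rather than the paper's implementation:

```python
import torch
import torch.nn as nn

class TransformerConvBlock(nn.Module):
    """Parallel global/local feature extraction fused by a depthwise
    convolution: Z = DWC(Z_g + Z_l)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1),
                                  nn.BatchNorm2d(dim), nn.ReLU())
        self.dwc = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # depthwise

    def forward(self, x):                                # x: (N, C, H, W)
        n, c, h, w = x.shape
        seq = self.norm(x.flatten(2).transpose(1, 2))    # (N, HW, C)
        z_g, _ = self.attn(seq, seq, seq)                # global branch
        z_g = z_g.transpose(1, 2).reshape(n, c, h, w)
        z_l = self.conv(x)                               # local branch
        return self.dwc(z_g + z_l)                       # depthwise fusion
```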

3.5. Evaluation Metric

In this work, we used the intersection over union (IoU), accuracy, and F1-score as the evaluation metrics for our experiments. The IoU emphasizes the degree of area overlap, the accuracy focuses on the correctness of the overall classification, and the F1-score provides a balance between the precision and recall. By combining these metrics, researchers can comprehensively assess and compare different image segmentation models.
IoU. In the realm of deep learning-driven image segmentation, the IoU is widely adopted as a primary metric for evaluation. The IoU offers a quantifiable standard for measuring the overlap between the predicted segmentation area and the actual segmentation area. Specifically, the IoU is defined as the ratio of the area of intersection to the area of union between the prediction and ground truth. The mathematical expression is:
$$IoU = \frac{TP}{TP + FP + FN}$$
where TP represents true positives, which are the correctly predicted segmented areas; FP (false positives) denotes areas wrongly predicted as positive; and FN (false negatives) indicates areas wrongly marked as negative. The IoU ranges from 0 to 1, where higher values indicate better segmentation performance.
Accuracy. The accuracy (ACC) is another core metric used to evaluate the overall performance of a model across an entire dataset. The calculation of accuracy involves comparing the number of correctly classified pixels to the total number of pixels. Its formula is as follows:
$$Acc = \frac{TP + TN}{TP + TN + FP + FN}$$
where TN denotes the true negatives, referring to correctly classified background or nontarget areas. As an intuitive performance indicator, accuracy is applicable for analyzing the overall effect of a model.
F1-score. The F1 score is the harmonic mean of precision and recall, offering an effective solution to balance these two metrics. It is particularly useful in scenarios with an imbalanced distribution of classes. The F1-score is computed using the following formula:
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
$$F1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
The range of the F1-score is also 0 to 1, where higher values indicate better performance in maintaining a balance between precision and recall.
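All three metrics follow directly from the confusion-matrix counts; a compact NumPy sketch for binary masks:

```python
import numpy as np

def segmentation_metrics(pred, gt, eps=1e-9):
    """IoU, pixel accuracy, and F1 from binary prediction/ground-truth masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)    # correctly predicted landslide pixels
    fp = np.sum(pred & ~gt)   # background predicted as landslide
    fn = np.sum(~pred & gt)   # landslide predicted as background
    tn = np.sum(~pred & ~gt)  # correctly predicted background
    iou = tp / (tp + fp + fn + eps)
    acc = (tp + tn) / (tp + tn + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return iou, acc, f1
```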

4. Results

4.1. Interdomain Translation Network

To assess the effectiveness of our proposed interdomain translation network, IDTNet, we performed image translations of Askja and Big Sur, transformed their imagery to match the style of Bijie, and compared the results of our algorithm with those of other networks. As shown in Figure 8, we compared the details of AdaptSegNet, BDL, DAugNet and our method for generating faux source domain images with the target domain style. It is evident that the faux images generated by AdaptSegNet and BDL significantly differ from the real source images in terms of semantic information. BDL produces somewhat blurry faux source domain images, which could adversely affect the performance of subsequent image segmentation algorithms. AdaptSegNet’s faux source images contain many details not present in the target domain, leading to segmentation algorithms learning incorrect weights. DAugNet performs relatively well, but there are still some differences in color restoration compared to the algorithm proposed in this paper. The results clearly demonstrate that our proposed IDTNet outperforms its competitors in detail preservation, style consistency, and visual coherence. Particularly for complex textures and structural details, our network can capture the characteristics of the target domain more precisely while preserving the structural integrity of the source images.
To demonstrate our method’s performance in multisource and multitarget domain adaptive translation, we selected Bijie (China), Santa Catarina (Brazil), Los Lagos (Chile), Taitung (China), and Osh (Kyrgyzstan) as source domains and A Luoi (Vietnam), Kodagu (India), Asakura (Japan), Kupang (Indonesia), Askja (Iceland), Kurucasile (Turkey), Big Sur (United States), Shimen (China), Chimanimani (Zimbabwe), Tbilisi (Georgia), Jiuzhaigou (China), Tenejapa (Mexico), and Kaikoura (New Zealand) as target domains. The satellite remote sensing images of these 18 domains were mutually translated, resulting in a 19 × 18 grid image, as shown in Figure 9. The leftmost image is the source image, with each column representing faux remote sensing images of different domains translated to the same domain style and each row showing faux remote sensing images of a domain translated to other domain styles. Each grid cell represents an instance of interdomain translation. This image offers a comprehensive perspective for observing the ability of IDTNet to handle multisource and multitarget domain translation tasks. According to Figure 9, our proposed IDTNet exhibits strong robustness and coherence in interdomain translation among multiple domains.
Furthermore, as shown in Table 2, our proposed interdomain translation network executes quickly and is compact, with a running time of 0.00563 s and 12.458 M parameters, improving over the other style transfer networks in both running speed and parameter count.
In summary, the aforementioned experiments demonstrate the exceptional ability of IDTNet to perform multisource and multitarget domain translation tasks. Compared to other algorithms, IDTNet has significant advantages in detail preservation and style consistency, especially in processing complex textures and structural details with greater precision. Additionally, a performance evaluation revealed that IDTNet significantly outperforms other networks in terms of running speed, algorithm parameters, and number of FLOPs, highlighting its advantages in terms of efficiency and practicality.

4.2. Multisource and Target Domain Adaptation Segmentation

4.2.1. Quantitative Evaluation

Table 3 summarizes the generalization ability of the proposed segmentation algorithm compared with other segmentation algorithms on data of unknown style. In this set of experiments, the proposed algorithm and the comparison algorithms were first trained on the five sets of source domain data without interdomain translation and then tested directly on the thirteen target domains; the results are shown in Table 3. The proposed method significantly outperforms the other methods, with improvements in the PixAcc, IoU, and F1 metrics of 2.11–16.90%, 4.77–22.17%, and 4.91–14.68%, respectively. Consequently, it is evident that the proposed method has strong generalizability and robustness.
Table 4 displays the performance of the proposed method and other cross-domain segmentation methods in the target domains. In this group of experiments, the proposed method first undergoes domain adaptation, transferring each target domain style to the source domain dataset and generating faux source domain data in 13 different target domain styles for each group. These source and faux source domains, totaling 18 groups of data, are used for training, and testing is then performed on the target domains. To reflect the overall performance of the model in cross-domain segmentation tasks, all target domains are treated as a whole, and the IoU, F1, and Acc metrics are calculated over all target domains. As shown in Table 4, after training on domain-adapted data, most methods show improved performance, with the proposed domain-adaptive landslide segmentation algorithm achieving 83.69%, 74.47%, and 82.31% in the PixAcc, IoU, and F1 metrics, respectively, surpassing the second-best method by 2.00%, 2.96%, and 2.22%. The quantitative evaluation demonstrates the good performance of the proposed method in cross-domain image segmentation.

4.2.2. Qualitative Evaluation

In the qualitative evaluation, the experimental results are shown in Figure 10 and Figure 11, corresponding to Table 3 and Table 4 in the quantitative evaluation, respectively. Figure 10 presents the comparative results of the proposed image segmentation algorithm against other image segmentation algorithms, based on training in the source domains and direct application to the target domains. U-Net and DeepLabv3 incorrectly identified large non-landslide areas as landslide regions. The segmentation results of the EfficientUNet++ method included false landslide areas. The segmentation results of the FCN and DANet methods were relatively better, but they lost many details along the segmented edges. In comparison, the landslide segmentation results obtained by the proposed method were closer to the ground truth (GT). Overall, the segmentation results of all methods are not satisfactory, which may be attributed to the algorithms' inability to adapt to unseen target domain images.
Figure 11 shows the comparative experimental results of the method proposed in this paper with other cross-domain segmentation methods, where the algorithm is trained on source domain data styled as both source and target domains and tested in the target domain. The AdaptSegNet and BDL methods exhibited noticeable omissions in the segmentation results in the Askja area and produced larger false landslide areas in the Big Sur region. The DAugNet method performed well in segmentation in the Askja area but still missed some details at the edges and showed false landslide phenomena in the Big Sur region, which has larger landslide areas. The landslide areas delineated by the domain-adaptive landslide segmentation algorithm proposed in this paper were closest to the ground truth (GT). Furthermore, compared with Figure 10, it is evident that the segmentation results of the algorithm in this paper, after training on a domain-adapted dataset, are significantly better than those trained only in the source domain.

4.2.3. Ablation Studies

We designed an ablation study to determine the optimal number of Transformer–Convolution modules. As Table 5 shows, performance improves as the number of modules increases from 2 to 4 but degrades rapidly once the number exceeds 4. A possible reason is that a deeper network can better extract context information, whereas a network that is too deep causes context information loss. We therefore set the number of Transformer–Convolution modules to 4.

5. Discussion

5.1. Delving into Unsupervised Domain Adaptation

Unsupervised domain adaptation (UDA) is an adaptation method that targets different scenarios and aims to address the weak generalizability of models caused by regional differences. Traditional single-domain adaptation methods primarily focus on how to learn from a source domain (a domain with abundant labeled data) and adapt to a target domain (a domain with scarce labeled data) to enhance the model’s performance on the target domain. However, with the increasing demands of practical applications, researchers have begun to explore more complex multisource domain adaptation and multitarget domain adaptation issues.

5.1.1. Adaptation from a Single-Source Domain to Multiple Target Domains

The TAD module proposed by Lee et al. and the common subspace learning method proposed by Yang et al. are both designed to transfer knowledge from a single-source domain to multiple target domains, aiming to improve the model’s performance across various domains [63,64].

5.1.2. Adaptation from Multiple Source Domains to a Single Target Domain

The multisource unsupervised domain adaptation method by Luo et al. and the source model and pseudolabeling combination method by Ahmed et al. integrate information from multiple source domains to adapt to a single target domain, addressing the issue of scarce labeled data in the target domain [65,66].

5.1.3. Adaptation from Multiple Source Domains to Multiple Target Domains

This is the most complex scenario, involving learning and transferring knowledge from multiple source domains to multiple target domains. The AMDA method by Wang et al. and the DGWA method by Lu et al. are both aimed at addressing the complex issue of multisource multitarget domain adaptation, capturing transferable information between different domains and dynamically adjusting the parameters of the feature generator to enhance the model’s generalization capability on multiple source and target domains. This redefines the traditional domain adaptation problem as a novel multisource multitarget domain adaptation problem [67,68].

5.2. Expansion to Landslide Identification

The successful application of unsupervised domain adaptation (UDA) technology in the field of biomedical engineering indeed provides insights for other areas, particularly in environmental monitoring and disaster management fields such as landslide hazard identification.
Fortunately, since the visual style of the vast majority of landslide remote sensing images is relatively uniform, these images lend themselves well to unsupervised multisource and multitarget domain adaptation techniques. This can greatly alleviate the difficulties faced by landslide identification in practical applications, such as data scarcity, regional differences in image features, and insufficient model generalization, creating more possibilities for regions where landslide data are extremely scarce.
Due to the influence of factors such as landslide surface type, lighting conditions, and vegetation recovery, source domain and target domain images often have different image styles and visual effects. This is one of the main factors hindering cross-scene landslide detection and identification. Li et al. [69] attempted to transform a source domain landslide into a type similar to the style of the target domain. The style transfer image not only retains the original content of the source domain but also retains the style of the target domain, filling the domain gap and increasing the diversity of landslide images. At the feature level, they combined adversarial learning and domain distance minimization to reduce the large feature distribution differences and learn domain-invariant information. In addition, to avoid information loss, they improved the U-Net3+ model by integrating complete landslide features at different scales, demonstrating the great potential of unsupervised domain adaptation semantic segmentation for landslide detection.
However, such single-source to single-target domain adaptation often utilizes only limited target domain information. Can we expand single-source to single-target domain adaptation into multisource to multitarget domain adaptation, thereby synergistically exploiting information from multiple source and target domains?
Specifically, our proposed multiple source and target domain-adaptive landslide segmentation method includes two phases. In the first phase, we propose an unsupervised interdomain translation network aiming to achieve style transfer between any two domains. This phase is an unsupervised learning process that does not require annotated data. Through this phase's image translation model, we can align the feature distribution of any source domain with any target domain to generate source domain landslide images with the feature distribution of the target domain. The main idea of our interdomain translation network is to generate stylized images by combining the content features of the content image and the style information of the style image. We integrate the adaptive instance normalization module to effectively decouple the content and style features of the input images and to achieve feature distribution transfer in the feature space. Furthermore, we added AdaIN modules between the encoder and decoder at each scale to achieve style transfer of features at different scales. Through the first phase's interdomain translation network, we can generate source domain image data with the distribution of any target domain, along with annotation labels, as pseudotarget domain data. In the second phase, we use the pseudotarget domain data obtained from the first phase together with the source domain data to supervise the training of a landslide segmentation model that can generalize across multiple source and target domains. Additionally, considering the image characteristics of landslides, we propose a U-shaped segmentation network based on the Transformer–Convolution module. The Transformer–Convolution module uses the Transformer for global context interaction to achieve long-range information sharing between landslide pixels, thus aiding in segmenting images with large or dispersed landslides, while the convolution effectively extracts local detail features and can effectively segment and identify small-area landslides. Finally, we validated our proposed segmentation method through qualitative and quantitative comparative experiments.

5.3. Limitations and Future Work

The limitation of our proposed method is that it is not an end-to-end approach; hence, the training process of this method needs to be carried out in two phases to train the interdomain translation model and the landslide segmentation model separately. At the same time, our current work lacks the ability to perceive new domains. In our future research, we will explore the perception of new domains to continuously generalize to newly added unseen domains. Furthermore, LUDAS has important potential significance and application prospects in the detection of landslides in multitemporal landslide inventories. By applying the LUDAS method, it is possible to perform uniform style transformation and landslide detection on remote sensing images from different periods, thus realizing continuous monitoring of landslide areas. This dynamic monitoring capability helps to capture the occurrence, development and expansion process of landslides and provides a scientific basis for timely disaster prevention and mitigation measures. In addition, the establishment of multiperiod landslide detection methods can provide important data support for early warning and risk assessment of landslides [70,71]. By analyzing the historical records and trends of landslide activities, potential landslide risk areas can be better predicted [72,73,74]. Finally, we plan to conduct field surveys to collect high-quality and complete landslide inventory data to validate and continuously optimize our model. For this reason, in our future work, we will combine more remote sensing images from different periods to further validate and optimize the performance of the LUDAS method and explore the effectiveness of its application on a larger scale, in more diverse terrains, and under multiple temporal conditions.

6. Conclusions

In this study, we propose a multisource and multitarget domain-adaptive image segmentation method (LUDAS). In the first stage, we propose an unsupervised interdomain translation network that generates stylized images by combining the content features of the content images with the style information of the style images. It can generate source domain image data in the style of any target domain, labeled as pseudotarget domain data. In the second stage, we use the pseudotarget domain data obtained from the first stage together with the source domain data to supervise the training of the landslide segmentation model. The Transformer branch helps segment images of large or scattered landslides through long-range information sharing among landslide pixels, while convolution effectively extracts local detail features to segment and identify small landslides. Finally, a segmentation model with strong generalization ability is obtained that can adapt to multiple source and target domains.
The two main conclusions can be summarized as follows:
From a quantitative point of view, LUDAS performs excellently in terms of PixAcc, IoU, F1, and other metrics and performs well in cross-domain image segmentation, fully demonstrating its strong generalizability and robustness.
Qualitatively, compared with other popular methods, the landslide regions delineated by the proposed domain-adaptive landslide segmentation algorithm are clearly the closest to the ground truth (GT). Moreover, the segmentation results of the proposed algorithm after training on the domain-adapted dataset are clearly better than those obtained by training only on the source domain.

Author Contributions

W.C.: writing—original draft, writing—review and editing, data curation, visualization; Z.C.: writing—review and editing, data curation, visualization; D.S.: supervision, software, formal analysis; H.H.: supervision, formal analysis; H.L.: software, formal analysis; Y.Z.: formal analysis. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (52208359), the Natural Science Foundation of Sichuan Province (2024NSFSC0925), the National Natural Science Foundation of China (52109125), the Fundamental Research Funds for the Central Universities (2023ZYGXZRx2tjD2231010), and the Sichuan Agricultural University “Innovation and Entrepreneurship Training Program for College Students in Sichuan Province” (S202310626053).

Data Availability Statement

The data or code used in this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Cruden, D.M.; Novograd, S.; Pilot, G.A.; Krauter, E.; Bhandari, R.K.; Cotecchia, V.; Nakamura, H.; Okagbue, C.O.; Zhang, Z.; Hutchinson, J.N.; et al. Suggested nomenclature for landslides. Bull. Int. Assoc. Eng. Geol. 1990, 41, 13–16. [Google Scholar]
  2. Alimohammadlou, Y.; Najafi, A.; Yalcin, A. Landslide process and impacts: A proposed classification method. CATENA 2013, 104, 219–232. [Google Scholar]
  3. Sharma, A.; Sharma, K.K. A Review on Satellite Image Processing for Landslides Detection. In Artificial Intelligence and Machine Learning in Satellite Data Processing and Services, Proceedings of the International Conference on Small Satellites, Punjab, India, 29–30 April 2022; Springer Nature: Singapore, 2023; pp. 123–129. [Google Scholar]
  4. Hou, H.; Chen, M.; Tie, Y.; Li, W. A Universal Landslide Detection Method in Optical Remote Sensing Images Based on Improved YOLOX. Remote Sens. 2022, 14, 4939. [Google Scholar] [CrossRef]
  5. Chen, F.; Yu, B.; Li, B. A practical trial of landslide detection from single-temporal Landsat8 images using contour-based proposals and random forest: A case study of national Nepal. Landslides 2018, 15, 453–464. [Google Scholar] [CrossRef]
  6. Chen, W.; Li, X.; Wang, Y.; Chen, G.; Liu, S. Forested landslide detection using LiDAR data and the random forest algorithm: A case study of the Three Gorges, China. Remote Sens. Environ. 2014, 152, 291–301. [Google Scholar] [CrossRef]
  7. Meena, S.R.; Soares, L.P.; Grohmann, C.H.; van Westen, C.; Bhuyan, K.; Singh, R.P.; Floris, M.; Catani, F. Landslide detection in the Himalayas using machine learning algorithms and U-Net. Landslides 2022, 19, 1209–1229. [Google Scholar] [CrossRef]
  8. Uehara, T.D.T.; Corrêa, S.P.L.P.; Quevedo, R.P.; Körting, T.S.; Dutra, L.V.; Rennó, C.D. Landslide scars detection using remote sensing and pattern recognition techniques: Comparison among artificial neural networks, gaussian maximum likelihood, random forest, and support vector machine classifiers. Rev. Bras. Cartogr. 2020, 72, 665–680. [Google Scholar] [CrossRef]
  9. Yu, H.; Ma, Y.; Wang, L.; Zhai, Y.; Wang, X. A landslide intelligent detection method based on CNN and RSG_R. In Proceedings of the 2017 IEEE International Conference on Mechatronics and Automation (ICMA), Takamatsu, Japan, 6–9 August 2017; pp. 40–44. [Google Scholar]
  10. Ding, A.; Zhang, Q.; Zhou, X.; Dai, B. Automatic recognition of landslide based on CNN and texture change detection. In Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 444–448. [Google Scholar]
  11. Sameen, M.I.; Pradhan, B. Landslide detection using residual networks and the fusion of spectral and topographic information. IEEE Access 2019, 7, 114363–114373. [Google Scholar] [CrossRef]
  12. Phakdimek, S.; Komori, D.; Chaithong, T. Combination of optical images and SAR images for detecting landslide scars, using a classification and regression tree. Int. J. Remote Sens. 2023, 44, 3572–3606. [Google Scholar] [CrossRef]
  13. Xie, D.; Yang, R.; Qiao, Y.; Zhang, J. Intelligent Identification of Landslide Based on Deep Semi-supervised Learning. In Proceedings of the 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), Chengdu, China, 19–21 August 2022; pp. 264–269. [Google Scholar]
  14. Mezaal, M.R.; Pradhan, B.; Sameen, M.I.; Mohd Shafri, H.Z.; Yusoff, Z.M. Optimized Neural Architecture for Automatic Landslide Detection from High-Resolution Airborne Laser Scanning Data. Appl. Sci. 2017, 7, 730. [Google Scholar] [CrossRef]
  15. Nava, L.; Monserrat, O.; Catani, F. Improving Landslide Detection on SAR Data Through Deep Learning. IEEE Geosci. Remote Sens. Lett. 2021, 19, 4020405. [Google Scholar] [CrossRef]
  16. Ghorbanzadeh, O.; Blaschke, T.; Gholamnia, K.; Meena, S.R.; Tiede, D.; Aryal, J. Evaluation of Different Machine Learning Methods and Deep-Learning Convolutional Neural Networks for Landslide Detection. Remote Sens. 2019, 11, 196. [Google Scholar] [CrossRef]
  17. Wang, H.; Zhang, L.; Yin, K.; Luo, H.; Li, J. Landslide identification using machine learning. Geosci. Front. 2021, 12, 351–364. [Google Scholar] [CrossRef]
  18. Minaee, S.; Boykov, Y.; Porikli, F.; Plaza, A.; Kehtarnavaz, N.; Terzopoulos, D. Image segmentation using deep learning: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3523–3542. [Google Scholar] [CrossRef] [PubMed]
  19. Kotaridis, I.; Lazaridou, M. Remote sensing image segmentation advances: A meta-analysis. ISPRS J. Photogramm. Remote Sens. 2021, 173, 309–322. [Google Scholar] [CrossRef]
  20. Yuan, J.; Wang, D.; Li, R. Remote sensing image segmentation by combining spectral and texture features. IEEE Trans. Geosci. Remote Sens. 2013, 52, 16–24. [Google Scholar] [CrossRef]
  21. Li, Z.; Guo, Y. Semantic segmentation of landslide images in Nyingchi region based on PSPNet network. In Proceedings of the 2020 7th International Conference on Information Science and Control Engineering (ICISCE), Changsha, China, 18–20 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1269–1273. [Google Scholar]
  22. Du, B.; Zhao, Z.; Hu, X.; Wu, G.; Han, L.; Sun, L.; Gao, Q. Landslide susceptibility prediction based on image semantic segmentation. Comput. Geosci. 2021, 155, 104860. [Google Scholar] [CrossRef]
  23. Li, H.; He, Y.; Xu, Q.; Deng, J.; Li, W.; Wei, Y.; Zhou, J. Sematic segmentation of loess landslides with STAPLE mask and fully connected conditional random field. Landslides 2023, 20, 367–380. [Google Scholar] [CrossRef]
  24. Ullo, S.L.; Mohan, A.; Sebastianelli, A.; Ahamed, S.E.; Kumar, B.; Dwivedi, R.; Sinha, G.R. A new mask R-CNN-based method for improved landslide detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3799–3810. [Google Scholar] [CrossRef]
  25. Zhao, S.; Yue, X.; Zhang, S.; Li, B.; Zhao, H.; Wu, B.; Krishna, R.; Gonzalez, J.E.; Sangiovanni-Vincentelli, A.L.; Seshia, S.A.; et al. A review of single-source deep unsupervised visual domain adaptation. IEEE Trans. Neural Netw. Learn. Syst. 2020, 33, 473–493. [Google Scholar] [CrossRef]
  26. Wang, J.; Jiang, J. Learning across tasks for zero-shot domain adaptation from a single source domain. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 6264–6279. [Google Scholar] [CrossRef] [PubMed]
  27. Xu, Q.; Zhang, R.; Wu, Y.Y.; Zhang, Y.; Liu, N.; Wang, Y. SimDE: A Simple Domain Expansion Approach for Single-Source Domain Generalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 4797–4807. [Google Scholar]
  28. Zeng, M.; Li, S.; Li, R.; Lu, J.; Xu, K.; Gu, J.; Chen, Y. A multi-target domain adaptive method for intelligent transfer fault diagnosis. Measurement 2023, 207, 112352. [Google Scholar] [CrossRef]
  29. Gholami, B.; Sahu, P.; Rudovic, O.; Bousmalis, K.; Pavlovic, V. Unsupervised multi-target domain adaptation: An information theoretic approach. IEEE Trans. Image Process. 2020, 29, 3993–4002. [Google Scholar] [CrossRef]
  30. Ji, S.; Yu, D.; Shen, C.; Li, W.; Xu, Q. Landslide detection from an open satellite imagery and digital elevation model dataset using attention boosted convolutional neural networks. Landslides 2020, 17, 1337–1352. [Google Scholar] [CrossRef]
  31. Zhang, X.; Yu, W.; Pun, M.-O.; Shi, W. Cross-domain landslide mapping from large-scale remote sensing images using prototype-guided domain-aware progressive representation learning. ISPRS J. Photogramm. Remote Sens. 2023, 197, 1–17. [Google Scholar] [CrossRef]
  32. Zhai, K.L. Study on the Evaluation of Geological Hazard Susceptibility of Collapse and Landslide in Bijie City, Guizhou Province. Master's Thesis, Jilin University, Changchun, China, 2020. (In Chinese). [Google Scholar]
  33. Tao, T.; Shi, W.; Liang, F.; Wang, X. Failure mechanism and evolution of the Jinhaihu landslide in Bijie City, China, on January 3, 2022. Landslides 2022, 19, 2727–2736. [Google Scholar] [CrossRef]
  34. Guo, B.; Yang, F.; Fan, Y.; Zang, W. The dominant driving factors of rocky desertification and their variations in typical mountainous karst areas of Southwest China in the context of global change. CATENA 2023, 220, 106674. [Google Scholar] [CrossRef]
  35. Zhang, J.P. Soil erosion in Guizhou province of China: A case study in Bijie prefecture. Soil Use Manag. 1999, 15, 68–70. [Google Scholar]
  36. Yuan, J.; Xu, F.; Deng, G.; Tang, Y.; Li, P. Hydrogeochemistry of Shallow Groundwater in a Karst Aquifer System of Bijie City, Guizhou Province. Water 2017, 9, 625. [Google Scholar] [CrossRef]
  37. Yuan, Z.; Yao, J.; Wang, F.; Guo, Z.; Dong, Z.; Chen, F.; Hu, Y.; Sunahara, G. Potentially toxic trace element contamination, sources, and pollution assessment in farmlands, Bijie City, southwestern China. Environ. Monit. Assess. 2017, 189, 25. [Google Scholar] [CrossRef]
  38. Zhang, Y.; Shen, C.; Zhou, S.; Luo, X. Analysis of the Influence of Forests on Landslides in the Bijie Area of Guizhou. Forests 2022, 13, 1136. [Google Scholar] [CrossRef]
  39. Shen, C.; Zhou, S.; Luo, X.; Zhang, Y.; Liu, H. Using DInSAR to inventory landslide geological disaster in Bijie, Guizhou, China. Front. Earth Sci. 2023, 10, 1024710. [Google Scholar] [CrossRef]
  40. Wang, L.; Zhang, M.; Gao, X.; Shi, W. Advances and Challenges in Deep Learning-Based Change Detection for Remote Sensing Images: A Review through Various Learning Paradigms. Remote Sens. 2024, 16, 804. [Google Scholar] [CrossRef]
  41. Asadi, A.; Baise, L.G.; Chatterjee, S.; Koch, M.; Moaveni, B. Regional landslide mapping model developed by a deep transfer learning framework using post-event optical imagery. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2024, 18, 186–210. [Google Scholar] [CrossRef]
  42. Huang, X.; Belongie, S. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision 2017, Venice, Italy, 22–29 October 2017; pp. 1501–1510. [Google Scholar]
  43. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  44. Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of stylegan. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2020, Seattle, WA, USA, 13–19 June 2020; pp. 8110–8119. [Google Scholar]
  45. Lin, T.; Ma, Z.; Li, F.; He, D.; Li, X.; Ding, E.; Wang, N.; Li, J.; Gao, X. Drafting and revision: Laplacian pyramid network for fast high-quality artistic style transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021, Nashville, TN, USA, 20–25 June 2021; pp. 5141–5150. [Google Scholar]
  46. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
  47. Kolkin, N.; Salavon, J.; Shakhnarovich, G. Style transfer by relaxed optimal transport and self-similarity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 15–20 June 2019; pp. 10051–10060. [Google Scholar]
  48. Sahana, M.; Pham, B.T.; Shukla, M.; Costache, R.; Thu, D.X.; Chakrabortty, R.; Satyam, N.; Nguyen, H.D.; Van Phong, T.; Van Le, H.; et al. Rainfall induced landslide susceptibility mapping using novel hybrid soft computing methods based on multi-layer perceptron neural network classifier. Geocarto Int. 2022, 37, 2747–2771. [Google Scholar] [CrossRef]
  49. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; Part III; Springer International Publishing: Berlin/Heidelberg, Germany, 2015; pp. 234–241. [Google Scholar]
  50. Liang, J.; Yang, C.; Zeng, M.; Wang, X. TransConver: Transformer and convolution parallel network for developing automatic brain tumor segmentation in MRI images. Quant. Imaging Med. Surg. 2022, 12, 2397. [Google Scholar] [CrossRef]
  51. Li, S.; Sui, X.; Luo, X.; Xu, X.; Liu, Y.; Goh, R. Medical image segmentation using squeeze-and-expansion transformers. arXiv 2021, arXiv:2105.09511. [Google Scholar]
  52. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010. [Google Scholar]
  53. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [Google Scholar]
  54. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  55. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision 2021, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022. [Google Scholar]
  56. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
  57. Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017, arXiv:1706.05587. [Google Scholar]
  58. Silva, J.L.; Menezes, M.N.; Rodrigues, T.; Silva, B.; Pinto, F.J.; Oliveira, A.L. Encoder-decoder architectures for clinically relevant coronary artery segmentation. arXiv 2021, arXiv:2106.11447. [Google Scholar]
  59. Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual attention network for scene segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 15–20 June 2019; pp. 3146–3154. [Google Scholar]
  60. Tsai, Y.H.; Hung, W.C.; Schulter, S.; Sohn, K.; Yang, M.H.; Chandraker, M. Learning to adapt structured output space for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7472–7481. [Google Scholar]
  61. Li, Y.; Lu, Y.; Nuno, V. Bidirectional learning for domain adaptation of semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 15–20 June 2019; pp. 6936–6945. [Google Scholar]
  62. Tasar, O.; Giros, A.; Tarabalka, Y.; Alliez, P.; Clerc, S. DAugNet: Unsupervised, multisource, multitarget, and life-long domain adaptation for semantic segmentation of satellite images. IEEE Trans. Geosci. Remote Sens. 2020, 59, 1067–1081. [Google Scholar] [CrossRef]
  63. Lee, S.; Choi, W.; Kim, C.; Choi, M.; Im, S. Adas: A direct adaptation strategy for multi-target domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022, New Orleans, LA, USA, 18–24 June 2022; pp. 19196–19206. [Google Scholar]
  64. Yang, D.; Wang, H.; Zou, Y. Unsupervised multi-target domain adaptation for acoustic scene classification. arXiv 2021, arXiv:2105.10340. [Google Scholar]
  65. Luo, Z.; Zhang, X.; Lu, S.; Yi, S. Domain consistency regularization for unsupervised multi-source domain adaptive classification. Pattern Recognit. 2022, 132, 108955. [Google Scholar] [CrossRef]
  66. Ahmed, S.M.; Raychaudhuri, D.S.; Paul, S.; Oymak, S.; Roy-Chowdhury, A.K. Unsupervised multi-source domain adaptation without access to source data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021, Nashville, TN, USA, 20–25 June 2021; pp. 10103–10112. [Google Scholar]
  67. Wang, Y.; Zhang, Z.; Hao, W.; Song, C. Attention Guided Multiple Source and Target Domain Adaptation. IEEE Trans. Image Process. 2020, 30, 892–906. [Google Scholar] [CrossRef]
  68. Lu, Y.; Huang, H.; Zeng, B.; Lai, Z.; Li, X. Multi-Source and Multi-Target Domain Adaptation Based on Dynamic Generator with Attention. IEEE Trans. Multimed. 2024, 26, 6891–6905. [Google Scholar] [CrossRef]
  69. Li, P.; Wang, Y.; Si, T.; Ullah, K.; Han, W.; Wang, L. DSFA: Cross-scene domain style and feature adaptation for landslide detection from high spatial resolution images. Int. J. Digit. Earth 2023, 16, 2426–2447. [Google Scholar] [CrossRef]
  70. Bhuyan, K.; Tanyaş, H.; Nava, L.; Puliero, S.; Meena, S.R.; Floris, M.; van Westen, C.; Catani, F. Generating multi-temporal landslide inventories through a general deep transfer learning strategy using HR EO data. Sci. Rep. 2023, 13, 162. [Google Scholar] [CrossRef]
  71. Samia, J.; Temme, A.; Bregt, A.; Wallinga, J.; Guzzetti, F.; Ardizzone, F.; Rossi, M. Do landslides follow landslides? Insights in path dependency from a multi-temporal landslide inventory. Landslides 2017, 14, 547–558. [Google Scholar] [CrossRef]
  72. Luino, F.; Barriendos, M.; Gizzi, F.T.; Glaser, R.; Gruetzner, C.; Palmieri, W.; Porfido, S.; Sangster, H.; Turconi, L. Historical Data for Natural Hazard Risk Mitigation and Land Use Planning. Land 2023, 12, 1777. [Google Scholar] [CrossRef]
  73. Marr, P.; Jiménez Donato, Y.A.; Carraro, E.; Kanta, R.; Glade, T. The Role of Historical Data to Investigate Slow-Moving Landslides by Long-Term Monitoring Systems in Lower Austria. Land 2023, 12, 659. [Google Scholar] [CrossRef]
  74. Bentivenga, M.; Gizzi, F.T.; Palladino, G.; Piccarreta, M.; Potenza, M.R.; Perrone, A.; Bellanova, J.; Calamita, G.; Piscitelli, S. Multisource and Multilevel Investigations on a Historical Landslide: The 1907 Servigliano Earth Flow in Montemurro (Basilicata, Southern Italy). Land 2022, 11, 408. [Google Scholar] [CrossRef]
Figure 1. The landslide inventory map of the Bijie dataset (modified after [30]).
Figure 2. Location distribution of our study areas.
Figure 3. Illustration of our proposed multiple-source and target domain-adaptive landslide segmentation framework.
Figure 4. The architecture of the interdomain translation network.
Figure 5. The architecture of the multidomain discriminator.
Figure 6. Overview of our proposed landslide segmentation network.
Figure 7. The architecture of the transformer–convolution block.
Figure 8. Results of interdomain translation comparative experiments.
Figure 9. Results of multisource and multitarget domain translation (the first column shows the raw images; the other columns show the translated images). The top 5 rows are the source domains, and the remaining rows are the target domains. The images, from top to bottom and left to right, are from Bijie, Los Lagos, Osh, Santa Catarina, Taitung, A Luoi, Asakura, Askja, Big Sur, Chimanimani, Jiuzhaigou, Kaikoura, Kodagu, Kupang, Kurucasile, Shimen, Tbilisi, and Tenejapa.
Figure 10. Effectiveness of source domain training and target domain segmentation.
Figure 11. Effectiveness of domain-adaptive data training and target domain segmentation.
Table 1. Domain division and details of our study data (example images not shown).

Domain   Area (Country)              Time       Triggers
Source   Bijie (China)               Aug 2018   Rainfall
Source   Los Lagos (Chile)           Jan 2018   Glacier melting, rainfall
Source   Osh (Kyrgyzstan)            Feb 2018   Melting snow, rainfall
Source   Santa Catarina (Brazil)     Feb 2021   Torrential rain
Source   Taitung (China)             Oct 2011   Typhoon, rainfall
Target   A Luoi (Vietnam)            Feb 2021   Rainfall
Target   Asakura (Japan)             Sep 2017   Earthquake
Target   Askja (Iceland)             Aug 2020   Snow, glacier melting
Target   Big Sur (United States)     Jun 2017   Loose soil, rock splitting
Target   Chimanimani (Zimbabwe)      Mar 2019   Tropical cyclone
Target   Jiuzhaigou (China)          Aug 2017   Earthquake
Target   Kaikoura (New Zealand)      Nov 2016   Earthquake
Target   Kodagu (India)              Mar 2017   Rainfall
Target   Kupang (Indonesia)          Apr 2021   Rainfall
Target   Kurucasile (Turkey)         Jun 2017   Flood
Target   Shimen (China)              Nov 2020   Rainfall
Target   Tbilisi (Georgia)           Jun 2015   Flood
Target   Tenejapa (Mexico)           Feb 2021   Hurricane
Table 2. Parameters of the interdomain translation algorithm.

Method                  Inference Time (s)   Parameters (M)
U-Net [49]              0.00975              31.038
FCN [56]                0.00572              15.307
DeepLabV3 [57]          0.02528              41.812
EfficientUNet++ [58]    0.00413              6.126
DANet [59]              0.03522              49.617
AdaptSegNet [60]        0.02472              42.721
BDL [61]                0.02846              45.337
DAugNet [62]            0.01854              42.545
LUDAS                   0.00563              12.458
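As an aside, quantities like those in Table 2 are commonly measured in PyTorch along the following lines; the input size, warm-up scheme, and helper name are illustrative assumptions rather than the paper's actual benchmarking protocol.

```python
import time
import torch

def profile_model(model, input_shape=(1, 3, 256, 256), warmup=5, runs=20):
    """Report parameter count (millions) and mean forward-pass time (seconds)."""
    params_m = sum(p.numel() for p in model.parameters()) / 1e6
    x = torch.randn(*input_shape)
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):   # discard the first passes (allocator warm-up)
            model(x)
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        mean_s = (time.perf_counter() - start) / runs
    return params_m, mean_s
```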
Table 3. Source domain training and target domain segmentation results.

Method                  PixAcc (%)   IoU (%)   F1 (%)
U-Net [49]              58.60        57.54     58.24
FCN [56]                73.39        57.40     62.40
DeepLabV3 [57]          67.45        40.14     48.16
EfficientUNet++ [58]    71.23        50.35     52.63
DANet [59]              69.85        52.41     58.44
LUDAS                   75.50        62.31     67.31
Table 4. Domain-adaptive data training and target domain segmentation results.

Method             PixAcc (%)   IoU (%)   F1 (%)
AdaptSegNet [60]   78.34        63.46     76.93
BDL [61]           77.99        65.34     75.93
DAugNet [62]       81.69        71.51     80.09
LUDAS              83.69        74.47     82.31
Table 5. Transformer–convolution module number ablation study.

Transformer–Convolution Module Number   F1 (%)
2                                       77.68
3                                       80.37
4                                       82.31
5                                       78.19
6                                       74.35
