
Artificial Intelligence and Machine Learning with Applications in Remote Sensing

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Engineering Remote Sensing".

Deadline for manuscript submissions: closed (15 November 2022) | Viewed by 55250

Special Issue Editors


Guest Editor
Department of Computer Science & Information Engineering, National Central University, Taoyuan 32001, Taiwan
Interests: remote sensing; artificial intelligence; machine learning; image processing

Guest Editor
Department of Electrical Engineering, National Taipei University of Technology, Taipei 10608, Taiwan
Interests: remote sensing; high performance computing; deep learning; pattern recognition; image processing

Guest Editor
Center for Space and Remote Sensing Research, National Central University, Taoyuan 32001, Taiwan
Interests: hyperspectral; multispectral signal processing; machine learning; deep learning; image processing

Special Issue Information

Dear Colleagues,

With recent advances in sensor technology, ever more data with higher spectral, spatial, and temporal resolutions are being acquired from active and passive sensors, and applications of remote sensing data in the environmental, commercial, and military fields are becoming increasingly widespread. This poses challenges for effectively and efficiently processing big remote sensing data. In recent years, feature mining algorithms, deep learning algorithms, and decision tree-inspired algorithms for remote sensing data processing have attracted many researchers and gained unprecedented popularity. Even with so many works and algorithms devoted to this topic, much remains to be done in artificial intelligence, machine learning, and deep learning. This Special Issue of Remote Sensing therefore aims to present state-of-the-art work employing artificial intelligence, machine learning, and deep learning algorithms for effective and efficient remote sensing applications. Papers are solicited in, but not limited to, the following areas:

  • Hyperspectral, multispectral applications with machine learning, deep learning algorithms
  • Remote sensing data processing based on artificial intelligence and machine learning
  • Hyperspectral, multispectral image processing
  • AI/Deep learning/Machine learning for big hyperspectral, multispectral data analysis
  • Remote sensing data for disasters, weather, water and climate applications based on AI/DL/ML algorithms
  • Deep learning-based transfer learning
  • Feature extraction with machine learning or deep learning for remote sensing data

Prof. Dr. Kuo-Chin Fan
Prof. Dr. Yang-Lang Chang
Prof. Dr. Toshifumi Moriyama
Dr. Ying-Nong Chen
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Remote sensing data
  • Artificial intelligence
  • Machine learning
  • Deep learning
  • Hyperspectral images

Published Papers (12 papers)


Editorial

Jump to: Research, Review, Other

10 pages, 633 KiB  
Editorial
Special Issue Review: Artificial Intelligence and Machine Learning Applications in Remote Sensing
by Ying-Nong Chen, Kuo-Chin Fan, Yang-Lang Chang and Toshifumi Moriyama
Remote Sens. 2023, 15(3), 569; https://doi.org/10.3390/rs15030569 - 18 Jan 2023
Cited by 6 | Viewed by 3503
Abstract
Remote sensing is used in an increasingly wide range of applications. Models and methodologies based on artificial intelligence (AI) are commonly used to increase the performance of remote sensing technologies. Deep learning (DL) models are the most widely researched AI-based models because of their effectiveness and high performance. Therefore, we organized a Special Issue of Remote Sensing titled “Artificial Intelligence and Machine Learning Applications in Remote Sensing.” In this paper, we review nine articles included in this Special Issue, most of which report studies based on satellite data and DL, reflecting the most prevalent trends in remote sensing research; how DL architectures function and how DL models can be analyzed and explained is also a hot topic in AI research. DL methods can outperform conventional machine learning methods in remote sensing; however, DL remains a black box, and understanding the details of the mechanisms through which DL models make decisions is difficult. Therefore, researchers must continue to investigate how explainable DL methods can be developed for use in the field of remote sensing.

Research

Jump to: Editorial, Review, Other

21 pages, 5532 KiB  
Article
Combining Object-Oriented and Deep Learning Methods to Estimate Photosynthetic and Non-Photosynthetic Vegetation Cover in the Desert from Unmanned Aerial Vehicle Images with Consideration of Shadows
by Jie He, Du Lyu, Liang He, Yujie Zhang, Xiaoming Xu, Haijie Yi, Qilong Tian, Baoyuan Liu and Xiaoping Zhang
Remote Sens. 2023, 15(1), 105; https://doi.org/10.3390/rs15010105 - 25 Dec 2022
Cited by 6 | Viewed by 1927
Abstract
Soil erosion is a global environmental problem. The rapid monitoring of changes in the coverage and spatial patterns of photosynthetic vegetation (PV) and non-photosynthetic vegetation (NPV) at regional scales can help improve the accuracy of soil erosion evaluations. Three deep learning semantic segmentation models, DeepLabV3+, PSPNet, and U-Net, are often used to extract features from unmanned aerial vehicle (UAV) images; however, their extraction processes are highly dependent on the assignment of massive data labels, which greatly limits their applicability. At the same time, numerous shadows are present in UAV images, and it is not clear whether shaded features can be further classified, nor how much accuracy can be achieved. This study took the Mu Us Desert in northern China as an example with which to explore the feasibility and efficiency of shadow-sensitive PV/NPV classification using the three models. Using the object-oriented classification technique alongside manual correction, 728 labels were produced for deep learning PV/NPV semantic segmentation. ResNet-50 was selected as the backbone network with which to train the sample data. The overall accuracy (OA), the kappa coefficient, and the orthogonal statistic were applied to evaluate the accuracy and efficiency of the three models. The results showed that, for six characteristics, the three models achieved OAs of 88.3–91.9% and kappa coefficients of 0.81–0.87. The DeepLabV3+ model was superior, and its accuracy for PV and bare soil (BS) under light conditions exceeded 95%; for the three categories of PV/NPV/BS, it achieved an OA of 94.3% and a kappa coefficient of 0.90, performing slightly better (by ~2.6% (OA) and ~0.05 (kappa coefficient)) than the other two models. The DeepLabV3+ model and corresponding labels were tested at other sites for the same types of features: it achieved OAs of 93.9–95.9% and kappa coefficients of 0.88–0.92. Compared with traditional machine learning methods, such as random forest, the proposed method not only offers a marked improvement in classification accuracy but also realizes the semiautomatic extraction of PV/NPV areas. The results will be useful for land-use planning and land resource management in these areas.
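The OA and kappa coefficient reported above are standard summaries of a confusion matrix. A minimal sketch (not the authors' code; the 3-class toy matrix below is invented for illustration):

```python
import numpy as np

def overall_accuracy_and_kappa(cm: np.ndarray):
    """Compute overall accuracy (OA) and Cohen's kappa from a
    confusion matrix (rows = reference, columns = predicted)."""
    cm = cm.astype(float)
    n = cm.sum()
    oa = np.trace(cm) / n                                 # observed agreement
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2   # chance agreement
    return oa, (oa - pe) / (1.0 - pe)

# Toy 3-class matrix (e.g., PV / NPV / bare soil)
cm = np.array([[50, 3, 2],
               [4, 45, 1],
               [2, 2, 41]])
oa, kappa = overall_accuracy_and_kappa(cm)
```

Kappa discounts the agreement expected by chance, which is why it is reported alongside OA for imbalanced class maps.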

16 pages, 1082 KiB  
Article
Consolidated Convolutional Neural Network for Hyperspectral Image Classification
by Yang-Lang Chang, Tan-Hsu Tan, Wei-Hong Lee, Lena Chang, Ying-Nong Chen, Kuo-Chin Fan and Mohammad Alkhaleefah
Remote Sens. 2022, 14(7), 1571; https://doi.org/10.3390/rs14071571 - 24 Mar 2022
Cited by 49 | Viewed by 3954
Abstract
The performance of hyperspectral image (HSI) classification is highly dependent on spatial and spectral information and is heavily affected by factors such as data redundancy and insufficient spatial resolution. To overcome these challenges, many convolutional neural networks (CNNs), especially 2D-CNN-based methods, have been proposed for HSI classification. However, these methods produce insufficient results compared to 3D-CNN-based methods, while the high computational complexity of 3D-CNN-based methods remains a major concern. Therefore, this study introduces a consolidated convolutional neural network (C-CNN) to overcome these issues. The proposed C-CNN is composed of a three-dimensional CNN (3D-CNN) joined with a two-dimensional CNN (2D-CNN). The 3D-CNN is used to represent spatial–spectral features from the spectral bands, and the 2D-CNN is used to learn abstract spatial features. Principal component analysis (PCA) was first applied to the original HSIs to reduce spectral band redundancy before they were fed to the network. Moreover, image augmentation techniques, including rotation and flipping, were used to increase the number of training samples and reduce the impact of overfitting. The proposed C-CNN trained using the augmented images is named C-CNN-Aug. Additionally, both Dropout and L2 regularization were used to further reduce model complexity and prevent overfitting. The experimental results show that the proposed model provides an optimal trade-off between accuracy and computational time compared to related methods on the Indian Pines, Pavia University, and Salinas Scene hyperspectral benchmark datasets.
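The PCA preprocessing step described above projects the spectral axis onto a few principal components before the cube reaches the network. A minimal NumPy sketch of this idea (not the authors' implementation; the cube size and component count are arbitrary):

```python
import numpy as np

def pca_reduce_bands(hsi: np.ndarray, n_components: int) -> np.ndarray:
    """Project an HSI cube of shape (H, W, B) onto its first
    n principal components along the spectral axis."""
    h, w, b = hsi.shape
    x = hsi.reshape(-1, b).astype(float)
    x -= x.mean(axis=0)                          # center each band
    # Principal axes via SVD of the centered pixel-by-band matrix
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return (x @ vt[:n_components].T).reshape(h, w, n_components)

cube = np.random.rand(8, 8, 30)      # toy 30-band image
reduced = pca_reduce_bands(cube, 5)  # keep 5 components
```

The components come out ordered by explained variance, so the first few retain most of the spectral information.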

18 pages, 96409 KiB  
Article
BO-DRNet: An Improved Deep Learning Model for Oil Spill Detection by Polarimetric Features from SAR Images
by Dawei Wang, Jianhua Wan, Shanwei Liu, Yanlong Chen, Muhammad Yasir, Mingming Xu and Peng Ren
Remote Sens. 2022, 14(2), 264; https://doi.org/10.3390/rs14020264 - 07 Jan 2022
Cited by 24 | Viewed by 2895
Abstract
Oil spill pollution at sea causes significant damage to marine ecosystems. Quad-polarimetric Synthetic Aperture Radar (SAR) has become an essential technology since it can provide polarization features for marine oil spill detection, and oil spill detection can be achieved using deep learning models based on polarimetric features. However, existing models suffer from insufficient feature extraction due to limited model depth, loss of target information caused by small receptive fields, and fixed hyperparameters, so oil spill regions are still detected incompletely or misclassified. To solve these problems, we propose an improved deep learning model named BO-DRNet. The model obtains fuller, more sufficient features by using ResNet-18 as the backbone in the encoder of DeepLabv3+, and Bayesian Optimization (BO) was used to optimize the model's hyperparameters. Experiments were conducted on ten prominent polarimetric features extracted from three quad-polarimetric SAR images obtained by RADARSAT-2. Experimental results show that, compared with other deep learning models, BO-DRNet performs best, with a mean accuracy of 74.69% and a mean dice of 0.8551. This paper provides a valuable tool to manage upcoming disasters effectively.
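The mean dice figure quoted above refers to the standard Dice score for segmentation masks, 2|A∩B| / (|A| + |B|). A minimal sketch with invented binary masks (illustrative only, not the authors' evaluation code):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice score for two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * inter / denom if denom else 1.0  # both empty → perfect

pred   = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_coefficient(pred, target)  # inter 2, denom 6 → 2/3
```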

33 pages, 22312 KiB  
Article
A Natural Images Pre-Trained Deep Learning Method for Seismic Random Noise Attenuation
by Haixia Zhao, Tingting Bai and Zhiqiang Wang
Remote Sens. 2022, 14(2), 263; https://doi.org/10.3390/rs14020263 - 07 Jan 2022
Cited by 6 | Viewed by 1884
Abstract
Seismic field data are usually contaminated by random or complex noise, which seriously degrades the quality of the data and contaminates seismic imaging and interpretation. Improving the signal-to-noise ratio (SNR) of seismic data has always been a key step in seismic data processing. Deep learning approaches have been successfully applied to suppress seismic random noise. Training examples are essential in deep learning methods, especially for geophysical problems, where complete training data are not easy to obtain due to the high cost of acquisition. In this work, we propose a deep learning method pre-trained on natural images to suppress seismic random noise, drawing on the insight of transfer learning. Our network contains pre-trained and post-trained networks: the former is trained on natural images to obtain preliminary denoising results, while the latter is trained on a small amount of seismic images to fine-tune the denoising effects by semi-supervised learning and to enhance the continuity of geological structures. The results on four types of synthetic seismic data and six field datasets demonstrate that our network performs well in seismic random noise suppression in terms of both quantitative metrics and visual effects.
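The SNR mentioned above is conventionally measured in decibels against a clean reference. A rough sketch (our own illustration, not the paper's code, with a synthetic sine "trace" standing in for seismic data):

```python
import numpy as np

def snr_db(clean: np.ndarray, noisy: np.ndarray) -> float:
    """SNR in dB: 10*log10(signal power / noise power), where the
    noise is taken as the residual noisy - clean."""
    noise = noisy - clean
    return 10.0 * np.log10(np.sum(clean**2) / np.sum(noise**2))

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 8 * np.pi, 1000))      # toy trace
noisy = clean + 0.1 * rng.standard_normal(1000)      # add random noise
snr_before = snr_db(clean, noisy)
```

A denoising network would be judged by how much it raises this value on held-out traces.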

25 pages, 8428 KiB  
Article
MDPrePost-Net: A Spatial-Spectral-Temporal Fully Convolutional Network for Mapping of Mangrove Degradation Affected by Hurricane Irma 2017 Using Sentinel-2 Data
by Ilham Jamaluddin, Tipajin Thaipisutikul, Ying-Nong Chen, Chi-Hung Chuang and Chih-Lin Hu
Remote Sens. 2021, 13(24), 5042; https://doi.org/10.3390/rs13245042 - 12 Dec 2021
Cited by 11 | Viewed by 3985
Abstract
Mangroves grow in intertidal zones in tropical and subtropical climate areas and provide many benefits for humans and ecosystems, so knowledge of mangrove conditions is essential. Recently, satellite imagery has been widely used to generate mangrove and degradation maps. Sentinel-2 provides free satellite image data with a temporal resolution of 5 days. When Hurricane Irma hit the southwest Florida coastal zone in 2017, it caused mangrove degradation. The relationship between satellite images taken before and after the hurricane can provide a deeper understanding of the degraded mangrove areas affected by Hurricane Irma. This study proposes MDPrePost-Net, which considers images before and after the hurricane to classify non-mangrove, intact/healthy mangrove, and degraded mangrove classes affected by Hurricane Irma in southwest Florida using Sentinel-2 data. MDPrePost-Net is an end-to-end fully convolutional network (FCN) that consists of two main sub-models. The first sub-model is a pre-post deep feature extractor used to extract the spatial–spectral–temporal relationship between the pre-hurricane, post-hurricane, and mangrove conditions from the satellite images, and the second sub-model is an FCN classifier that performs classification from the extracted spatial–spectral–temporal deep features. Experimental results show that the accuracy and Intersection over Union (IoU) score of the proposed MDPrePost-Net for degraded mangrove are 98.25% and 96.82%, respectively. Based on the experimental results, MDPrePost-Net outperforms state-of-the-art FCN models (e.g., U-Net, LinkNet, FPN, and FC-DenseNet) in terms of accuracy metrics. In addition, this study found that 26.64% (41,008.66 ha) of the mangrove area was degraded due to Hurricane Irma along the southwest Florida coastal zone and the other 73.36% (112,924.70 ha) remained intact.
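The IoU score reported above is the standard intersection-over-union measure for class masks. A minimal sketch (the masks below are invented; this is not the authors' evaluation code):

```python
import numpy as np

def iou_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over Union for two binary class masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, target).sum() / union

pred   = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 1]])
score = iou_score(pred, target)  # intersection 1, union 3 → 1/3
```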

23 pages, 13547 KiB  
Article
Tree Recognition on the Plantation Using UAV Images with Ultrahigh Spatial Resolution in a Complex Environment
by Xuzhan Guo, Qingwang Liu, Ram P. Sharma, Qiao Chen, Qiaolin Ye, Shouzheng Tang and Liyong Fu
Remote Sens. 2021, 13(20), 4122; https://doi.org/10.3390/rs13204122 - 14 Oct 2021
Cited by 10 | Viewed by 2253
Abstract
The survival rate of seedlings is a decisive factor in afforestation assessment. Generally, ground checking is more accurate than other methods. However, the survival rate of seedlings can be higher in the growing season, and it can be estimated over a larger area at a relatively lower cost by extracting tree crowns from unmanned aerial vehicle (UAV) images, which provides an opportunity for monitoring afforestation over extensive areas. At present, studies on extracting individual tree crowns under complex ground vegetation conditions are limited. Based on afforestation images obtained by airborne consumer-grade cameras in central China, this study proposes a method that extracts and fuses multi-radius morphological features to obtain candidate crowns. A random forest (RF) was used to identify the regions extracted from the images, and the recognized crown regions were then fused selectively according to their distance. A low-cost individual crown recognition framework was constructed for the rapid checking of planted trees. The method was tested in two afforestation areas of 5950 m2 and 5840 m2, with a population of 2418 trees (Koelreuteria) in total. Due to the complex terrain of the sample plots and high weed coverage, the crown widths of trees and the spacing of saplings vary greatly, which increases both the difficulty and complexity of crown extraction. Nevertheless, the precision, recall, and F-score of the proposed method reached 93.29%, 91.22%, and 92.24%, respectively, and 2212 trees were correctly recognized and located. The results show that the proposed method is robust to changes in brightness and to the splitting up of multi-directional tree crowns, and offers an automatic solution for afforestation verification.

22 pages, 11371 KiB  
Article
Change Detection from SAR Images Based on Convolutional Neural Networks Guided by Saliency Enhancement
by Liangliang Li, Hongbing Ma and Zhenhong Jia
Remote Sens. 2021, 13(18), 3697; https://doi.org/10.3390/rs13183697 - 16 Sep 2021
Cited by 8 | Viewed by 4542
Abstract
Change detection is an important task for identifying land cover change across different periods. In synthetic aperture radar (SAR) images, the inherent speckle noise leads to falsely detected change points, which affects the performance of change detection. To improve accuracy, a novel automatic SAR image change detection algorithm based on saliency detection and convolutional-wavelet neural networks is proposed. The log-ratio operator is adopted to generate the difference image, and speckle-reducing anisotropic diffusion is used to enhance the original multitemporal SAR images and the difference image. To reduce the influence of speckle noise, the salient area that probably belongs to the changed object is obtained from the difference image. The saliency analysis step removes small noise regions by thresholding the saliency map, while regions of interest are preserved. An enhanced difference image is then generated by combining the binarized saliency map and the two input images. A hierarchical fuzzy c-means model is applied to the enhanced difference image to classify pixels into changed, unchanged, and intermediate regions. The convolutional-wavelet neural networks are used to generate the final change map. Experimental results on five SAR datasets indicate that the proposed approach performs well compared to state-of-the-art techniques, with significant improvements in the computed metrics.
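The log-ratio operator mentioned above is a standard way to build a difference image from two co-registered SAR intensity images; because speckle is multiplicative, a ratio (in log space) is preferred over a subtraction. A minimal sketch (the epsilon guard and arrays are our own illustration, not the paper's code):

```python
import numpy as np

def log_ratio(img1: np.ndarray, img2: np.ndarray,
              eps: float = 1e-6) -> np.ndarray:
    """Log-ratio difference image for two co-registered SAR
    intensity images; eps guards against log(0)."""
    return np.abs(np.log((img2 + eps) / (img1 + eps)))

t1 = np.array([1.0, 1.0])   # before: two pixels
t2 = np.array([1.0, 4.0])   # after: second pixel brightened
diff = log_ratio(t1, t2)    # ~0 where unchanged, large where changed
```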

Review

Jump to: Editorial, Research, Other

22 pages, 3941 KiB  
Review
Accuracy Assessment in Convolutional Neural Network-Based Deep Learning Remote Sensing Studies—Part 2: Recommendations and Best Practices
by Aaron E. Maxwell, Timothy A. Warner and Luis Andrés Guillén
Remote Sens. 2021, 13(13), 2591; https://doi.org/10.3390/rs13132591 - 02 Jul 2021
Cited by 41 | Viewed by 7537
Abstract
Convolutional neural network (CNN)-based deep learning (DL) has a wide variety of applications in the geospatial and remote sensing (RS) sciences, and consequently has been a focus of many recent studies. However, a review of accuracy assessment methods used in recently published RS DL studies, focusing on scene classification, object detection, semantic segmentation, and instance segmentation, indicates that RS DL papers appear to follow an accuracy assessment approach that diverges from that of traditional RS studies. Papers reporting on RS DL studies have largely abandoned traditional RS accuracy assessment terminology; they rarely reported a complete confusion matrix; and sampling designs and analysis protocols generally did not provide a population-based confusion matrix, in which the table entries are estimates of the probabilities of occurrence of the mapped landscape. These issues indicate the need for the RS community to develop guidance on best practices for accuracy assessment for CNN-based DL thematic mapping and object detection. As a first step in that process, we explore key issues, including the observation that accuracy assessments should not be biased by the CNN-based training and inference processes that rely on image chips. Furthermore, accuracy assessments should be consistent with prior recommendations and standards in the field, should support the estimation of a population confusion matrix, and should allow for assessment of model generalization. This paper draws from our review of the RS DL literature and the rich record of traditional remote sensing accuracy assessment research while considering the unique nature of CNN-based deep learning to propose accuracy assessment best practices that use appropriate sampling methods, training and validation data partitioning, assessment metrics, and reporting standards.
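The population-based confusion matrix discussed above can be estimated from a sample confusion matrix by weighting each mapped class by its mapped-area proportion, following the good-practice accuracy assessment literature. A minimal sketch (the counts and weights are invented for illustration):

```python
import numpy as np

def population_confusion_matrix(cm: np.ndarray,
                                area_weights: np.ndarray) -> np.ndarray:
    """Scale a sample confusion matrix (rows = mapped class,
    cols = reference class) to estimated population proportions:
    p_ij = W_i * n_ij / n_i, where W_i is the mapped-area
    proportion of class i."""
    cm = np.asarray(cm, dtype=float)
    w = np.asarray(area_weights, dtype=float)
    return w[:, None] * cm / cm.sum(axis=1, keepdims=True)

cm = np.array([[45, 5],
               [10, 40]])        # sample counts per mapped class
w = np.array([0.7, 0.3])        # mapped-area proportions
p = population_confusion_matrix(cm, w)  # entries sum to 1.0
```

Unlike the raw counts, the entries of `p` estimate landscape-level probabilities, which is what area and accuracy estimators should be built from.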

27 pages, 4410 KiB  
Review
Accuracy Assessment in Convolutional Neural Network-Based Deep Learning Remote Sensing Studies—Part 1: Literature Review
by Aaron E. Maxwell, Timothy A. Warner and Luis Andrés Guillén
Remote Sens. 2021, 13(13), 2450; https://doi.org/10.3390/rs13132450 - 23 Jun 2021
Cited by 100 | Viewed by 12548
Abstract
Convolutional neural network (CNN)-based deep learning (DL) is a powerful, recently developed image classification approach. With origins in the computer vision and image processing communities, the accuracy assessment methods developed for CNN-based DL use a wide range of metrics that may be unfamiliar to the remote sensing (RS) community. To explore the differences between traditional RS and DL RS methods, we surveyed a random selection of 100 papers from the RS DL literature. The results show that RS DL studies have largely abandoned traditional RS accuracy assessment terminology, though some of the accuracy measures typically used in DL papers, most notably precision and recall, have direct equivalents in traditional RS terminology. Some of the DL accuracy terms have multiple names, or are equivalent to another measure. In our sample, DL studies only rarely reported a complete confusion matrix, and when they did so, it was even rarer that the confusion matrix estimated population properties. On the other hand, some DL studies are increasingly paying attention to the role of class prevalence in designing accuracy assessment approaches. DL studies that evaluate the decision boundary threshold over a range of values tend to use the precision-recall (P-R) curve and the associated area under the curve (AUC) measures of average precision (AP) and mean average precision (mAP), rather than the traditional receiver operating characteristic (ROC) curve and its AUC. DL studies are also notable for testing the generalization of their models on entirely new datasets, including data from new areas, new acquisition times, or even new sensors.
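The average precision (AP) measure discussed above summarizes the P-R curve; one common step-wise definition is AP = Σ (R_k − R_{k−1}) · P_k over the ranked detections. A minimal sketch (illustrative only, not from any of the surveyed papers):

```python
import numpy as np

def average_precision(scores: np.ndarray, labels: np.ndarray) -> float:
    """Step-wise average precision over detections ranked by score.
    labels: 1 for a true positive detection, 0 for a false positive."""
    order = np.argsort(-scores)          # rank by descending confidence
    labels = labels[order].astype(float)
    tp = np.cumsum(labels)
    precision = tp / np.arange(1, len(labels) + 1)
    recall = tp / labels.sum()
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):  # sum P at each recall step
        ap += (r - prev_r) * p
        prev_r = r
    return ap

scores = np.array([0.9, 0.8, 0.7, 0.6])
labels = np.array([1, 0, 1, 0])
ap = average_precision(scores, labels)   # → 5/6 ≈ 0.833
```

In traditional RS terms, precision is user's accuracy and recall is producer's accuracy, which is the terminological equivalence the review points out.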

Other

19 pages, 67459 KiB  
Technical Note
Surround-Net: A Multi-Branch Arbitrary-Oriented Detector for Remote Sensing
by Junkun Luo, Yimin Hu and Jiadong Li
Remote Sens. 2022, 14(7), 1751; https://doi.org/10.3390/rs14071751 - 06 Apr 2022
Cited by 2 | Viewed by 1617
Abstract
With the development of oriented object detection technology, especially in the area of remote sensing, significant progress has been made and multiple excellent detection architectures have emerged. Oriented detection architectures can be broadly divided into five-parameter systems and eight-parameter systems, which encounter the periodicity problem of angle regression and the discontinuity problem of vertex regression during training, respectively. Therefore, we propose Surround-Net, a new multi-branch anchor-free one-stage model that can effectively alleviate these corner cases when representing rotating objects. The contribution of this paper includes three main aspects. Firstly, a multi-branch strategy is adopted to let the detector adaptively choose the best regression path for the discontinuity problem. Secondly, to address the inconsistency between classification and quality (location) estimation, a modified high-dimensional Focal Loss and a new Surround IoU Loss are proposed to enhance the consistency of the features. Thirdly, in the refinement process after backbone feature extraction, a center vertex attention mechanism is adopted to deal with the environmental noise introduced in remote sensing images; this auxiliary module focuses the model's attention on the boundary of the bounding box. Finally, extensive experiments were carried out on the DOTA dataset, and the results demonstrate that Surround-Net can solve regression boundary problems and achieves more competitive performance (e.g., 75.875 mAP) than other anchor-free one-stage detectors, with higher speed.
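The angle periodicity problem described above arises because a rotated rectangle maps onto itself every 180°, so raw angle regression can heavily penalize nearly identical boxes (e.g., 89° vs. −89°). One generic mitigation (a sketch of the general idea, not Surround-Net's multi-branch approach) is to wrap the angular error into a symmetric range before computing the loss:

```python
def angle_diff_deg(pred: float, target: float,
                   period: float = 180.0) -> float:
    """Smallest signed angular difference under the given period;
    180 deg is the symmetry period of a rectangle."""
    d = (pred - target) % period
    return d - period if d >= period / 2 else d

err = angle_diff_deg(89.0, -89.0)  # → -2.0, not 178.0
```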

17 pages, 25406 KiB  
Technical Note
Remote Sensing Image Target Detection: Improvement of the YOLOv3 Model with Auxiliary Networks
by Zhenfang Qu, Fuzhen Zhu and Chengxiao Qi
Remote Sens. 2021, 13(19), 3908; https://doi.org/10.3390/rs13193908 - 30 Sep 2021
Cited by 31 | Viewed by 2590
Abstract
Remote sensing image target detection is widely used for both civil and military purposes. However, two factors need to be considered: real-time performance, and accuracy in detecting targets that occupy only a few pixels. With these issues in mind, the main research objective of this paper is to improve the performance of the YOLO algorithm in remote sensing image target detection, since YOLO models can guarantee both detection speed and accuracy. More specifically, the YOLOv3 model with an auxiliary network is further improved in this paper. Our improvements consist of four main components. Firstly, an image blocking module is used to feed fixed-size images to the YOLOv3 network; secondly, DIoU is used to speed up convergence and thus the training of YOLOv3; thirdly, the Convolutional Block Attention Module (CBAM) is used to connect the auxiliary network to the backbone network, making it easier for the network to notice specific features so that key information is not easily lost during training; and finally, the adaptively spatial feature fusion (ASFF) method is applied to our network model with the aim of improving detection speed by reducing the inference overhead. Experiments were conducted on the DOTA dataset to validate the effectiveness of our model. Our model achieves satisfactory detection performance on remote sensing images and performs significantly better than the unimproved YOLOv3 model with an auxiliary network. The experimental results show that the mAP of the optimized network is 5.36% higher than that of the original YOLOv3 model with the auxiliary network, and the detection frame rate is also increased by 3.07 FPS.
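The DIoU measure mentioned above augments IoU with a normalized center-distance penalty, DIoU = IoU − d²/c², where d is the distance between box centers and c is the diagonal of the smallest enclosing box. A minimal sketch for axis-aligned boxes in (x1, y1, x2, y2) form (illustrative only, not the paper's training code):

```python
def diou(box1, box2):
    """DIoU for axis-aligned boxes (x1, y1, x2, y2):
    IoU minus (center distance)^2 / (enclosing diagonal)^2."""
    # intersection and plain IoU
    ix1, iy1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix2, iy2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    a2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    iou = inter / (a1 + a2 - inter)
    # squared distance between box centers
    d2 = ((box1[0] + box1[2] - box2[0] - box2[2]) ** 2 +
          (box1[1] + box1[3] - box2[1] - box2[3]) ** 2) / 4.0
    # squared diagonal of the smallest enclosing box
    c2 = ((max(box1[2], box2[2]) - min(box1[0], box2[0])) ** 2 +
          (max(box1[3], box2[3]) - min(box1[1], box2[1])) ** 2)
    return iou - d2 / c2

b1, b2 = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
score = diou(b1, b2)  # IoU 1/7 minus penalty 2/18
```

Used as a loss (1 − DIoU), the distance term keeps gradients informative even when boxes do not overlap, which is why it speeds up convergence.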
