DFP-Net: A Crack Segmentation Method Based on a Feature Pyramid Network

Li, Linjing; Liu, Ran; Ali, Rashid; Chen, Bo; Lin, Haitao; Li, Yonglong; Zhang, Hua

doi:10.3390/app14020651

by Linjing Li ¹, Ran Liu ¹,², Rashid Ali ¹,³, Bo Chen ¹, Haitao Lin ¹, Yonglong Li ⁴ and Hua Zhang ¹,⁵,*

¹ School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, China

² Engineering Product Development, Singapore University of Technology and Design, Singapore 487372, Singapore

³ Department of Computer Science, University of Turbat, Turbat 92600, Pakistan

⁴ Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

⁵ Tianfu Institute of Research and Innovation, Southwest University of Science and Technology, Chengdu 610299, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(2), 651; https://0-doi-org.brum.beds.ac.uk/10.3390/app14020651

Submission received: 4 December 2023 / Revised: 4 January 2024 / Accepted: 8 January 2024 / Published: 12 January 2024

(This article belongs to the Special Issue Deep Learning and Computer Vision for Object Recognition)


Abstract:

Timely detection of defects is essential for ensuring the safe and stable operation of concrete buildings. Automatic segmentation of cracks on concrete building surfaces is challenging due to the high diversity of crack appearance, the abundance of fine detail, and the unbalanced proportion of crack pixels to background pixels. In this work, the Double Feature Pyramid Network (DFP-Net) is designed for high-precision crack segmentation. Our work reached the state-of-the-art level in crack segmentation, with key contributions outlined as follows. Firstly, considering the diversity of crack shapes, the network constructs a feature pyramid containing three feature extraction backbones to extract the global feature map from input images at three scales. Secondly, because the biggest challenge is the large share of crack area that is only a single pixel wide, a targeted feature pyramid based on the high-resolution input is added to extract adequate shallow semantic information. Lastly, a cascade feature fusion unit is designed to aggregate the extracted multi-dimensional feature maps and obtain the final prediction. Compared with existing crack detection methods, the superior performance of this method has been verified in extensive experiments, with a Pixel Accuracy of 65.99%, an Intersection over Union of 44.71%, and a Recall of 62.95%, providing a reliable and efficient solution for the health monitoring and maintenance of concrete structures. This work contributes to the advancement of research and practical applications in related fields.

Keywords:

semantic segmentation; crack localization; feature pyramid; image cascade; feature fusion

1. Introduction

In recent years, the construction industry, known for its high energy consumption and intensive labor [1], has given rise to an urgent demand for transformation towards digitization and greenization. This includes leading the green development with digitization, driving the intelligent transformation with greenization, promoting the construction of digital technology and digital infrastructure, promoting intelligent construction, and creating digital buildings. Consequently, the overall transformation and upgrading of the construction industry has become an inevitable aspect of the development of this era [2,3].

When the cement reaction happens, cracks can result from the shrinkage and deformation of concrete structures. The occurrence of cracks can trigger additional problems such as corrosion of the internal steel structure of buildings [4,5]. In concrete structures, the cracks will impact the visual aesthetics of buildings, and most importantly, they diminish the load-bearing capacity of their internal framework [6]. This, in turn, shortens the service life of the buildings. Therefore, to control the harmful consequences of cracks and extend the lifespan of buildings, it is crucial to detect and treat concrete cracks in a timely manner. This is of great significance for protecting the personal safety of citizens and insuring social property against significant losses.

In concrete structures, variations and uneven distribution of carrying capacities can lead to deformations and the formation of cracks, especially when the building is subjected to excessive loads. Additionally, heat of hydration is released during the construction process, and non-uniform dissipation of this heat between the interior and the exterior can result in concrete deformation. The tensile strength of concrete is fixed, and when the tensile stress on the surface of the concrete exceeds its tensile strength, cracks form and grow. Moreover, if the concrete is not properly maintained after forming [7], significant differences in moisture content between the surface and the interior can result in shrinkage cracks [8,9]. These cracks typically take elongated and branched shapes, with the specific shape influenced by the direction of tensile stress, which results in high diversity. The cracks taper from coarse to fine toward the far end, often accompanied by small branches. Crack width varies widely, and the cracks at the furthest end may occupy only a single pixel in the sample.

Currently, there are various methods for detecting cracks in concrete structures [10]. Traditional manual screening and identification are still widely used, but suffer from subjectivity, slow speed, and a heavy reliance on the subjective judgment of researchers [11]. In crack detection tasks, early non-vision-based crack detection techniques, including stress waves [12,13,14,15,16,17,18] and electrical characteristics [19,20], necessitated sophisticated monitoring instruments with elevated capabilities and incurred higher costs compared with digital image processing. Therefore, visual technology is commonly used in this field. Edge detection involves fewer computational parameters, offering high real-time capability, but it struggles to accurately extract edge information from complex cracks, resulting in low accuracy [21,22]. With the continuous development of deep learning, it is gradually applied to inspection tasks in the field of concrete architecture. Object detection, in particular, enables the swift and accurate identification of areas containing cracks. However, it cannot precisely capture the actual pixel points of the cracks, hindering the foundation for subsequent quantitative analysis of cracks [23,24]. In terms of accuracy, reliability, and precision in crack detection, semantic segmentation based on deep learning emerges as an effective and robust approach. As advancements in science and technology unfold, researchers are actively exploring more sophisticated methods to improve the pixel accuracy and detection efficiency in crack detection [25,26].

In this study, we conducted extensive experiments on crack datasets, specifically Structure [27]. Compared with typical segmentation networks, the experiments validate the superior pixel accuracy of our work. The research presented in this paper primarily addresses two main challenges:

(i) Timely crack monitoring in a concrete building is the foundation of aging buildings’ safety. Object detection methods are of limited use because they can only locate crack regions, not crack shapes. Hence, it is essential to integrate crack monitoring with deep-learning-based semantic segmentation to locate cracks precisely, pixel by pixel.

(ii) Considering the significant variations in the width and length of cracks, with some cracks occupying only a few pixels in minimum area, accurate identification of cracks is challenging when the extracted information is insufficiently detailed. Another challenging issue is the need to increase the attention to detail of information to enhance segmentation accuracy.

The subsequent sections are outlined as follows: Section 2 provides a comprehensive review of crack localization methods. Section 3 details the DFP-Net, encompassing the network structure, the image cascade mechanism, the cascade feature fusion unit, and the loss function. In Section 4, we present the experiments and results conducted using the STRUCTURE dataset. Finally, the advantages and limitations of the proposed method are discussed in Section 5.

2. Related Work

Cracks in concrete buildings are usually caused by an external force, and due to the influence of materials and structure, cracks are often narrow and slender, branch-like, unstable, and inconsistent [28]. And the occurrence of cracks is not independent; there are often many associated cracks that are smaller and are difficult to distinguish [29]. In the case of irregular shapes, cracks have very steep edges with much more detailed information than ordinary segmentation objects such as vehicles and pedestrians [30,31,32]. A challenging problem that arises in this domain is increasing the attention to detail of information to improve the segmentation accuracy.

Traditional crack segmentation methods mainly include image segmentation algorithms that use specific tools [33], edge detection, and digital image processing. The detection results of methods using the original Canny operator [34], the Sobel operator [35], and other common edge operators are greatly affected by background noise. These traditional crack segmentation methods [36,37,38] have some deficiencies, such as lower precision, higher computational cost, and a slower prediction speed. The continued advancement of machine vision has made the integration of deep learning into crack localization a natural step for the upkeep of concrete structures. Provided that sufficient samples are available, deep learning performs better at crack localization than traditional crack segmentation methods [39,40,41].

2.1. Digital Image Processing

In 2014, Liu et al. [42] employed a four-wheel guide rail cart equipped with a high-definition industrial camera for crack image acquisition on concrete surfaces of the ground and bridge. The image processing phase involved obtaining the binary image of cracks through edge detection, morphological processing, and subsequent extraction based on statistical analysis of feature values combined with crack characteristics. Simultaneously, in 2014, R. S. Adhikari et al. [43] simulated defect detection through models based on digital image processing, which included crack detection, quantification and 3D visualization.

In 2015, H Oliveira et al. [44] proposed an algorithm dedicated to detecting cracks on road surfaces. The toolbox included modules for crack detection algorithms, pattern recognition, image processing-based algorithms, crack feature description algorithms, and crack detection and evaluation solution performance. Subsequently, in 2018, H Cho et al. [45] proposed an edge-based crack detection technique, utilizing the crack width transform (CWT) algorithm to address the issue of inaccurate edge pixel extraction in traditional methods based on line-enhanced filtering.

In 2019, C. Shao et al. [46] presented a crack detection method for pavements. The method involves initial gray-level processing and gray-level correction of acquired crack images. Subsequently, it removes noise from the crack samples through a weighted median filter. Finally, a mixed ion swarm optimization-based image segmentation method is designed to obtain the crack area. The process concludes with pixel size calibration and extraction of characteristic parameters of the crack. In 2020, N. Gehri et al. [47] proposed a method that utilizes digital image correlation (DIC) technology to achieve automatic crack detection and crack movement detection. This approach enhances the accuracy and reliability of crack identification, although the results are dependent on DIC configuration, and there are limitations when cracks are in close proximity.

Algorithms based on digital image processing offer the advantage that their design does not heavily depend on extensive image data support. Algorithm designers only need to comprehend specific feature recognition rules and establish corresponding discrimination conditions. However, the disadvantage lies in their relatively low robustness: they cannot be guaranteed to eliminate the interference of various complex noises, which leads to unstable crack identification, and automatic identification without manual assistance remains deficient.

2.2. Object Detection

Due to issues like edge connection instability and sensitivity to noise in digital image processing, deep-learning-based crack recognition methods have achieved widespread use.

In 2016, L. Zhang et al. [48] pioneered the application of object detection based on deep learning in pavement surface defect detection, demonstrating the superiority of convolutional neural networks over support vector machines in classifying crack images. However, the primitive network design and incomplete training data scene led to a slow final recognition speed and low accuracy. In 2018, Vishal Mandal et al. [49] designed an automatic road defect detection and analysis method. This method takes YOLOv2 as the core and uses the evaluation indicator F1-score as the rating standard to evaluate the condition of road surface defects.

In 2020, Zhang et al. [50] presented an improved algorithm based on YOLOv3 for the bridge surface crack detection tasks. This algorithm combined MobileNets and a convolutional block attention module to achieve real-time detection of bridge surface cracks. Additionally, in 2020, J. Yang et al. [51] used a single-shot multi-box detector as the basis, expanded the receptive field of convolutional layers, improved the feature extraction ability of the model’s contextual information, and improved the accuracy of road crack detection. Furthermore, in 2021, K. Yan et al. [52] used VGG16 as the feature extraction backbone network and designed a deformable SSD network. By introducing deformable convolution, they improved the detection ability of the model (Figure 1).

Compared to traditional digital image processing, methods related to target detection offer advantages in multi-target processing and real-time crack detection [53]. However, challenges such as target category and shape limitations and sensitivity to complex backgrounds still exist.

2.3. Image Segmentation

To address the limitation of pixel-level accuracy in target detection when dealing with cracks in concrete buildings and achieve more precise crack identification and localization, the intelligent identification field of concrete building cracks has incorporated the semantic segmentation method. By using semantic segmentation, each pixel in the image can be divided more meticulously to provide more accurate crack boundary information, effectively solving the problem of object detection not being able to obtain crack pixel positions.

In 2019, König J et al. [54] combined U-Net and full convolution in the semantic segmentation task of cracks, introduced residual modules to optimize the network structure, and added attention mechanisms. In the same year, CV Dung [55] used VGG16 as the backbone network to extract target features and constructed a network structure based on deep fully convolutional networks to obtain sufficient sample context information. This method can effectively achieve crack localization and segmentation.

In 2021, W Wang et al. [56] designed a semi-supervised segmentation model to reduce the workload of sample labeling in dataset preparation. This model extracts crack features using EfficientUNet and introduces a teacher–student network to improve the detection efficiency of cracks while ensuring the segmentation accuracy of the model. Also, in the same year, H Fu et al. [57] have made improvements to DeeplabV3+. By introducing spatial pyramid pooling, the receptive field of the convolutional layer is expanded while reducing the computational parameters of the model. This allows the model to extract more global semantic information while minimizing the size of the model. In 2022, C Xiang et al. [58] improved the input sample resolution through super-resolution reconstruction and achieved quantification of cracks.

At present, crack identification methods based on semantic segmentation have made remarkable progress in improving pixel-level accuracy. These methods successfully locate cracks in images pixel by pixel. However, existing methods still face some problems. For example, they may miss small or very small cracks. In addition, there is still room to improve the robustness of the model when dealing with complex scenes, light changes, etc. To alleviate these problems, a method that adapts better to different crack types and scenarios is needed to achieve a more comprehensive and accurate identification of cracks.

In this paper, the proposed work Double Feature Pyramid Network (DFP-Net) mainly makes the following three contributions.

(i) First, the cascade image input mechanism (i.e., low, medium, full, and high resolution) is utilized in the six feature branches which extract different scale features, and the input image with a higher resolution corresponds to the feature extract branch, which extracts lower-dimension semantic information.

(ii) Then, the inner feature pyramid is established in the input image with the highest resolution to obtain copious shallow semantic information; meanwhile, the outer feature pyramid is constructed as a three-resolution image to extract multi-dimension semantic information.

(iii) Additionally, we introduce a Cascade Feature Fusion unit (CFF) designed to sequentially fuse feature maps from cascade branches with varying resolutions. This innovative unit aims to enhance the integration of global semantic information.

3. Methods

The overview of the entire crack segmentation model is illustrated schematically in Figure 2. The architecture of DFP-Net mainly consists of a deep feature pyramid and a shallow feature pyramid, containing six feature extraction branches to obtain multi-dimensional crack semantic information. The deep feature pyramid has three backbones for feature extraction, corresponding to input images of three sizes. Based on the extracted multi-level deep semantic feature information of cracks, the deep feature pyramid ensures that the model can capture feature information of cracks with different shapes.

At the same time, considering that the detailed information on cracks is more abundant than that of conventional targets, a shallow feature pyramid is designed. This feature pyramid also includes three feature extraction branches. To obtain a large amount of crack edge information and to compensate for the shortcomings of conventional models in small crack segmentation, its feature extraction branches operate on the input image at twice the original resolution. Finally, a multi-level feature fusion module for feature maps of multiple scales and depths is designed for the network structure, so that the prediction results perform well not only in capturing crack branches but also at the far ends and in the details of cracks.

3.1. Deep Feature Pyramid

In the semantic segmentation model, the multi-dimension semantic information from deep to shallow is extracted as joint sample context information, accurately predicting the target. When extracting deep semantic information, the branch outputs rough predictions to realize the main shape of the target, while when extracting shallow semantic information, the branch focuses on details such as the edge lines of the target.

To enhance segmentation accuracy while mitigating computational overhead, we integrate the image pyramid into the network. The image pyramid is a structure that scales images to different resolution sizes. The low-resolution image input for the rough prediction branch intuitively reduces the calculation parameters while ensuring that the output prediction is unaffected. Then, a high-resolution image is introduced for the detail prediction branch, reducing the difficulty of fine prediction by amplifying the detail information to capture more accurate edge details. Based on the image pyramid, the deep feature pyramid extracts features from three input images of different scales. This module has a top-down hierarchical feature extraction architecture, which obtains feature maps with progressively increasing scales. Then, to obtain lower-level feature maps that aggregate contextual information, feature maps of different scales are gradually fused from bottom to top.

Given the concrete crack samples and corresponding ground truths, the images are first resized to three image scales from small to large through an image pyramid module, then a feature pyramid is established on the image pyramid, and features are extracted from different dimensions from the bottom up. Each feature extraction branch of the deep feature pyramid corresponds to a level of crack semantic information. At each branch, a feature map fusion operation is performed based on the top-down architecture, gradually fusing higher-dimensional feature maps into lower-dimensional feature maps layer by layer, enabling lower-level feature maps to obtain higher-level contextual semantic information. At each branch, the feature maps undergo the hierarchical feature fusion module, ensuring consistency in dimensionality and size between high-dimensional and low-dimensional feature maps. The ultimate feature map from the deep feature pyramid undergoes two fusion operations to incorporate ample contextual information. Ultimately, the fused feature map of the deep feature pyramid is fed into the final fused feature map of the shallow feature pyramid for conclusive fusion, resulting in the generation of a crack prediction map.

In this work, the original input image (e.g., 256 × 256 in the structure crack dataset) is up-sampled by a factor of two and down-sampled by a factor of two and a factor of four, forming a cascade input with four image scales. In the low-resolution input branch, a 1/4-sized image is input into the initial self-designed extractor with a down-sampling rate of 8, producing a 1/32-resolution feature map containing high-dimensional semantic information. In the medium-resolution input branch, a 1/2-sized image is used to obtain a 1/16-resolution feature map with the same self-designed backbone. In the full-resolution input branch, we convolve the original image to obtain a 1/4-resolution feature map. Finally, the 2×-resolution image is utilized in the shallow information extractors.
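The scale bookkeeping in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' implementation: the function name `cascade_shapes` and the `BRANCHES` table are hypothetical, and the down-sampling rates of the medium and full branches are inferred from the stated output resolutions rather than given explicitly in the text.

```python
from fractions import Fraction

# (input scale relative to the original image, down-sampling rate of the branch).
# Only the low branch's rate of 8 is stated in the text; the other two rates
# are inferred from the stated 1/16 and 1/4 output resolutions.
BRANCHES = {
    "low": (Fraction(1, 4), 8),
    "medium": (Fraction(1, 2), 8),
    "full": (Fraction(1, 1), 4),
}


def cascade_shapes(size=256):
    """Return {branch: (input side length, feature-map side length)}."""
    shapes = {}
    for name, (scale, rate) in BRANCHES.items():
        input_side = int(size * scale)
        shapes[name] = (input_side, input_side // rate)
    return shapes


for name, (inp, feat) in cascade_shapes(256).items():
    print(f"{name}: input {inp}x{inp} -> feature map {feat}x{feat}")
```

For a 256 × 256 sample this yields feature maps with side lengths 8, 16, and 64, that is, 1/32, 1/16, and 1/4 of the original resolution, while the 2×-resolution input (512 × 512) goes to the shallow extractors.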

3.2. Shallow Feature Pyramid

The inherent attribute of cracks is their diverse shape and rich details. Currently, many networks use high-dimensional features to classify targets or pixels. When the target is tiny and has much detailed information, it is easy to lose the target while extracting high-dimensional features, and the accuracy is poor when there is a significant difference in crack size. The proposed method utilizes small-scale changes in the input image scale to construct an image pyramid, based on the concept of multi-scale change enhancement, which alleviates the problem of a low recognition rate caused by crack shape differentiation. However, high-resolution image pyramids increase the computational cost. A shallow feature pyramid is therefore constructed on the high-resolution input image to extract shallow semantic information of cracks, alleviating the problem of low accuracy caused by differences in crack size.

The shallow feature pyramid consists of the last three feature extraction branches of DFP-Net. It is established on the 2-times-resolution input image, and its branches have fewer convolution layers and a shallower network depth. In the shallow feature pyramid, the receptive fields of the three feature extraction branches are 43 × 43, 9 × 9, and 7 × 7, respectively. Compared with the deep feature pyramid, their receptive fields are smaller.

The smaller the receptive field, the more focused the region of interest, leading to the extraction of finer edge information. The three outputs of the internal FPN are fused in cascade to obtain a feature map containing abundant shallow semantic information, which is subsequently used as the last part of the external FPN to enrich the details of the prediction.

The external FPN consists of the outputs of the first three branches and the fused feature map of the last branches. The formula of the receptive field is as follows.

l_{k} = l_{k - 1} + [(f_{k} - 1) \times \prod_{i = 1}^{k - 1} s_{i}]

(1)

where l_{k - 1} is the receptive field of layer k - 1, f_{k} is the filter size, and s_{i} is the stride of layer i. The feature fusion scheme is shown in Figure 2. The receptive fields of the first two feature maps are both 43 × 43, and the third feature map has a small receptive field of 7 × 7. The first two branches are used to obtain the deeper feature, and the third branch is used to capture the shallow feature. Thus, the first branch extracts the deeper global semantic information and the second branch enriches it.
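As a worked illustration of Equation (1), the recursion can be evaluated layer by layer. This is a minimal sketch: the function name `receptive_field` and the example layer stack are hypothetical, since the exact layer configuration of each branch is not listed in the text.

```python
def receptive_field(layers):
    """Receptive field of a stack of layers given as (filter_size, stride) pairs.

    Implements l_k = l_{k-1} + (f_k - 1) * prod_{i<k} s_i, starting from l_0 = 1.
    """
    field = 1  # l_0: a single input pixel
    jump = 1   # running product of the strides of all earlier layers
    for filter_size, stride in layers:
        field += (filter_size - 1) * jump
        jump *= stride
    return field


# Hypothetical branch: three 3x3 convolutions with stride 1.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))
```

With this assumed stack the recursion gives 3, 5, and then 7, which is the size of the smallest receptive field quoted above; whether the third branch actually uses such a stack is not stated in the source.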

The result of fusing the top three feature maps cannot meet the requirement of crack segmentation which has too much detailed information. Therefore, the integration of both internal FPN and external FPN can significantly enhance the detailed information related to cracks, thereby improving the segmentation accuracy of the model.

3.3. Cascade Feature Fusion Unit

In this paper, we designed a Cascade Feature Fusion (CFF) module to amalgamate the feature maps of different scales, which were extracted from the cascade branches. Figure 3 illustrates the architecture of this module.

The input of the CFF module contains two feature maps and a ground truth. The feature map F_{L} has lower resolution, and its size is C_{1} \times H_{1} \times W_{1}. The feature map F_{H} has higher resolution, and its size is C_{2} \times H_{2} \times W_{2}. Besides, the scale of F_{L} is 1/2 of the scale of F_{H}.

LABEL is the ground truth of the public dataset [Structure]. The size of LABEL is 1 \times H_{2} \times W_{2}.

For feature F_{L}, we first adopt an up-sampling rate of four, resulting in F_{L}^{'}, which has the same size as F_{H}. Next, a dilated convolution layer with a kernel size of C_{3} \times 3 \times 3 and a dilation of 2 is applied to F_{L}^{'}. The size of F_{L}^{'} is then C_{3} \times H_{2} \times W_{2}. A batch normalization layer is used to normalize F_{L}^{'}.

For effective feature fusion, we employ a projection convolution with a kernel size of C_{3} \times 1 \times 1 on F_{H}, resulting in F_{H}^{'}. F_{H}^{'} has the same number of channels as F_{L}^{'}. A batch normalization layer is added, too.

A sum layer and a ReLU layer are used to fuse F_{H}^{'} and F_{L}^{'}, resulting in F_{H}. The size of F_{H} is C_{3} \times H_{2} \times W_{2}. In order to improve the learning of F_{L}, a ground truth label of size 1 \times H_{2} \times W_{2} is used for supervised learning.
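The shape constraints of the fusion steps above can be traced without any deep learning framework. The sketch below is only a shape check under stated assumptions, not the authors' module: the names `conv_out` and `cff_shapes` are hypothetical, a padding of 2 is assumed for the dilated convolution so that the spatial size is preserved, the example channel counts are made up, and the up-sampling rate is left as a parameter because it depends on the scale ratio between the two maps.

```python
def conv_out(size, kernel, stride=1, padding=0, dilation=1):
    """Standard output-size formula for a convolution along one axis."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def cff_shapes(f_low, f_high, c3, up_rate):
    """Trace (channels, height, width) through the fusion steps."""
    c1, h1, w1 = f_low
    c2, h2, w2 = f_high
    # Step 1: up-sample F_L so that it matches the spatial size of F_H.
    up = (c1, h1 * up_rate, w1 * up_rate)
    if up[1:] != (h2, w2):
        raise ValueError("up-sampled F_L must match the spatial size of F_H")
    # Step 2: 3x3 dilated convolution (dilation 2) to C3 channels.
    # Padding 2 is an assumption that keeps H2 x W2 unchanged.
    low = (c3,
           conv_out(up[1], 3, padding=2, dilation=2),
           conv_out(up[2], 3, padding=2, dilation=2))
    # Step 3: 1x1 projection convolution on F_H to C3 channels.
    high = (c3, conv_out(h2, 1), conv_out(w2, 1))
    # Step 4: the element-wise sum followed by ReLU needs identical shapes.
    if low != high:
        raise ValueError("the two branches disagree in shape")
    return low


# Hypothetical sizes: F_L is 64 x 16 x 16, F_H is 32 x 32 x 32, C3 = 128.
print(cff_shapes((64, 16, 16), (32, 32, 32), c3=128, up_rate=2))
```

The check makes explicit why both branches are projected to C_{3} channels before the sum: element-wise addition is only defined when the two maps agree in every dimension.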

3.4. Image Pre-Processing

To enhance the model’s generalization ability and reduce its dependency on specific attributes, sample expansion becomes imperative. Considering the characteristics of fracture samples, this paper introduces three image preprocessing methods with which to preprocess the dataset. They are horizontal flip, random adjustment of brightness and contrast, and geometric deformation.

The brightness of the training image X is expressed as the mean gray level u. The contrast of the training image X is expressed as the standard deviation σ, obtained from the variance σ^{2}. The calculation is depicted in Formulas (2) and (3).

u = \frac{1}{m} \sum_{i = 1}^{m} x_{i}

(2)

σ^{2} = \frac{1}{m} \sum_{i = 1}^{m} (x_{i} - u)^{2}

(3)

x_{i} represents the gray level of the i-th pixel of the training sample, and m represents the number of pixels contained in a single training sample. To improve the discrimination of the enhanced sample, a transformation factor η is added, and 255η is taken as the mean of the normal distribution curve. The standard deviation is set to 2η to generate a new contrast value 2ησ and a new brightness value u^{'} that obey this distribution, resulting in the enhanced sample X^{'}.

η = \frac{1}{1 + exp (5 - u / 25)}

(4)

X^{'} = 2 η σ \frac{X - u}{\sqrt{σ^{2} + ε}} + u^{'}

(5)
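Formulas (2) to (5) can be combined into a short routine. This is a minimal sketch operating on a flat list of gray levels: the function name `adjust_brightness_contrast` and its parameters are hypothetical, and drawing u^{'} from a normal distribution with mean 255η and standard deviation 2η is one assumed reading of the description above. Clipping the result to the valid gray-level range is not described in the source and is therefore left out.

```python
import math
import random


def adjust_brightness_contrast(pixels, u_new=None, eps=1e-5, rng=random):
    """Apply the brightness/contrast transform of Formulas (2)-(5)."""
    m = len(pixels)
    u = sum(pixels) / m                               # Formula (2): mean gray level
    var = sum((x - u) ** 2 for x in pixels) / m       # Formula (3): variance
    sigma = math.sqrt(var)
    eta = 1.0 / (1.0 + math.exp(5.0 - u / 25.0))      # Formula (4)
    if u_new is None:
        # Assumed reading of the text: u' ~ N(255 * eta, 2 * eta).
        u_new = rng.gauss(255.0 * eta, 2.0 * eta)
    scale = 2.0 * eta * sigma / math.sqrt(var + eps)  # Formula (5)
    return [scale * (x - u) + u_new for x in pixels]


# Two-pixel toy sample with a fixed new brightness for reproducibility.
print(adjust_brightness_contrast([100, 150], u_new=127.5))
```

For the toy sample, u = 125 and σ = 25, so η = 0.5 and the contrast scale is very close to 1; the output is therefore the input re-centred on the new brightness value.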

According to the characteristics of cracks with various shapes and irregular edges, the pixel positions of the training image are transformed to produce a deformation effect on the image, and the deformation formula is as follows.

y^{'} = y - 10 sin (\frac{2 π w}{152} + 10)

(6)

x^{'} = x - 10 sin (\frac{2 π h}{152} + 10)

(7)

in which x and y are the horizontal and vertical coordinate positions of each pixel before deformation, and x^{'} and y^{'} represent the position after deformation.
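Formulas (6) and (7) can be sketched as a coordinate mapping. The symbols w and h are not defined in the text; the sketch below assumes they are the pixel's own horizontal and vertical coordinates, which gives the usual wave-like distortion. The function name `deform` and its keyword parameters are hypothetical.

```python
import math


def deform(x, y, amplitude=10.0, period=152.0, phase=10.0):
    """Map a pixel position (x, y) to its deformed position (x', y')."""
    # Assumption: w and h in Formulas (6)-(7) are the pixel's own
    # horizontal and vertical coordinates, respectively.
    w, h = x, y
    y_new = y - amplitude * math.sin(2.0 * math.pi * w / period + phase)
    x_new = x - amplitude * math.sin(2.0 * math.pi * h / period + phase)
    return x_new, y_new


print(deform(30, 40))
```

Under this reading, every pixel moves by at most 10 pixels along each axis, and the displacement pattern repeats every 152 pixels.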

3.5. Loss Function

The proposed method aims at crack segmentation. Simple cracks have a clean edge, while complex cracks have many small branches, irregular shapes, and pseudo-cracks.

Considering the variety of crack shapes and the imbalance in their distribution and difficulty, we select Focal Loss to deal with it. The formula for focal loss is presented in Equation (8).

L_{fl} = \begin{cases} - α (1 - y^{'})^{γ} \log y^{'}, & y = 1 \\ - (1 - α) (y^{'})^{γ} \log (1 - y^{'}), & y = 0 \end{cases}

(8)

γ makes the model pay more attention to complex samples and reduces the influence of simple samples. α alleviates the problem resulting from the uneven pixel proportion of object and background [14]. This paper sets α = 0.25 and γ = 2.
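The effect of α and γ is easiest to see on single pixels. The following is a minimal per-pixel sketch of Equation (8) with α = 0.25 and γ = 2; the function name `focal_loss` is hypothetical, and the clamping of the predicted probability is a numerical safeguard added here, not part of the formula.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Per-pixel focal loss; p is the predicted crack probability y', y is 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0); an added safeguard
    if y == 1:
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)


# A confidently correct crack pixel versus a badly missed one.
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

The factor (1 - y^{'})^{γ} shrinks the loss of the well-classified pixel far more than that of the missed one, which is how the loss shifts attention toward hard samples.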

4. Experiment Results

The structure of this section is designed to provide a comprehensive understanding of the experiments. First, we introduce the evaluation indicators used in experiments. And then, we delve into detailed ablation studies conducted on the STRUCTURE dataset. These studies will offer insights into the individual impact of each component. To conclude, we will present comparisons with four segmentation networks to showcase the advantages of our work.

This paper conducts five experiments on the dataset with the same hardware configuration and parameter settings, with 300 images tested each time. All indicators in this article are the average results of the indicator parameters obtained after testing 1500 test images.

During the training stage, Adam optimizer is employed with a learning rate of 0.00001 and minibatch size of 4 for the optimization of all segmentation networks. The model with the minimum average loss value on validation datasets is selected as the final model after 100 epochs of training. In the inference stage, pixels with an output probability value no less than 0.5 are classified as crack, while others are identified as background.

4.1. Evaluation Index

All experiments in this paper were run on hardware with an Intel Xeon E5-2620 v3 processor and an Nvidia TITAN Xp graphics card. Recall (Rc), Pixel Accuracy (PA), Intersection over Union (IoU), and F1-score were selected as the evaluation indicators for each model in this experiment. The calculations are as follows:

PA = \frac{TP}{TP + FP}

(9)

Rc = \frac{TP}{TP + FN}

(10)

F_{1} = \frac{2 \times PA \times Rc}{PA + Rc}

(11)

IoU = \frac{TP}{TP + FN + FP}

(12)

TP (True Positive): a crack pixel correctly predicted as crack.

TN (True Negative): a background pixel correctly predicted as background.

FP (False Positive): a background pixel incorrectly predicted as crack.

FN (False Negative): a crack pixel incorrectly predicted as background.
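To make Equations (9)–(12) concrete, the following sketch counts TP, FP, FN, and TN on a pair of binary masks and then evaluates the four indicators exactly as defined above. The function names, the toy masks, and the convention of returning 0.0 for an empty denominator are assumptions for illustration, not part of the paper.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over two binary masks (1 = crack, 0 = background)."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def crack_metrics(tp, fp, fn):
    """Evaluate PA, Rc, F1 and IoU as in Equations (9)-(12)."""
    pa = tp / (tp + fp) if tp + fp else 0.0      # Eq. (9)
    rc = tp / (tp + fn) if tp + fn else 0.0      # Eq. (10)
    f1 = 2 * pa * rc / (pa + rc) if pa + rc else 0.0  # Eq. (11)
    iou = tp / (tp + fn + fp) if tp + fn + fp else 0.0  # Eq. (12)
    return {"PA": pa, "Rc": rc, "F1": f1, "IoU": iou}


pred = [[1, 1, 0], [0, 1, 0]]    # toy prediction
target = [[1, 0, 0], [1, 1, 0]]  # toy ground truth
tp, fp, fn, tn = confusion_counts(pred, target)
print(crack_metrics(tp, fp, fn))
```

TN does not appear in any of the four formulas, so the large background area does not inflate the scores.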

4.2. Dataset

In this paper, we conduct extensive experiments on the public dataset [Structure] [27] to verify the effect of DFP-Net. The dataset contains different types of defects, such as mottled walls. In the crack sub-dataset of [Structure], the cracks have the following characteristics: a simple crack has smooth edges, whereas a complex crack has steep edges and many small branches. Within a single sample image, the foreground occupies a small area, while the background occupies most of the image. Crack samples are shown in Figure 4.

Because the proposed method targets crack localization, we processed the dataset before conducting experiments. First, we selected the crack samples from the dataset. Then, we pre-processed the selected samples to increase their diversity. The three image processing methods described in Section 3.4 are used in the training stage, which improves the diversity of the training data and the robustness of the segmentation method. The final effect of image processing is shown in Figure 5. The resulting sub-dataset contains 3000 samples, divided into training, validation, and test sets in an 8:1:1 ratio.
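As a small worked example of the 8:1:1 division, the sketch below shuffles 3000 sample indices and cuts them into three parts. The function name `split_indices`, the fixed seed, and the use of a random shuffle are assumptions; the text states only the sample count and the ratio.

```python
import random


def split_indices(n_samples, ratios=(8, 1, 1), seed=0):
    """Shuffle sample indices and split them by the given integer ratios."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


train, val, test = split_indices(3000)
print(len(train), len(val), len(test))
```

For 3000 samples this yields 2400 training, 300 validation, and 300 test images, consistent with the 300 images tested per experiment.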

4.3. Cascade Feature Fusion

We evaluate the architecture on a sub-dataset of the dataset [Structure] with an image resolution of 256 × 256. In the proposed method, only low-resolution images are fed to the heavy backbone, while the light backbone processes the high-resolution images. The low- and mid-resolution input branches (top branch in Figure 2) extract deep segmentation information while yielding a coarse prediction. To verify that the method improves accuracy, Table 1 compares the cascade branches.

The result of primary fusion contains the semantic information of the first two branches. The secondary fusion adds the information extracted from the third branch to the primary fusion, and the tertiary fusion fuses all the semantic information. The logical structure of this module is shown in Figure 3. All indexes improved significantly after each fusion, and the tertiary fusion gave the best result, which indicates that combining multi-layer feature information improves the model’s segmentation ability. For a visual comparison, Figure 6 shows the results of DFP-Net on the [Structure] dataset.

The second branch can capture only coarse objects because of its low-resolution input. The full-resolution branch provides more detailed information, so the predictions of the secondary fusion show the crack shapes, although the edges remain incoherent. After the feature map of the high-resolution branch is added, the results show distinct crack shapes and consistent edge information.

In summary, combining the cascade branches better satisfies the requirements of crack segmentation and improves the segmentation accuracy.
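Comparing branch predictions of different resolutions requires bringing them to a common size first; the caption of Figure 6 gives the branch output sizes as 16 × 16, 64 × 64, and 256 × 256. The sketch below does this with nearest-neighbor upsampling, which is an assumption, since the interpolation method is not stated. The helper `upsample_nearest` and the toy map are hypothetical.

```python
def upsample_nearest(grid, factor):
    """Nearest-neighbor upsampling: repeat every cell `factor` times along both axes."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]  # stretch columns
        out.extend([list(wide) for _ in range(factor)])          # stretch rows
    return out


coarse = [[0.0] * 16 for _ in range(16)]  # toy 16 x 16 branch output
coarse[3][5] = 1.0                        # a single "crack" response
full = upsample_nearest(coarse, 256 // 16)
print(len(full), len(full[0]))
```

Each coarse cell becomes a 16 × 16 block at full resolution, which illustrates why a low-resolution branch alone can only produce blocky, coarse predictions.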

4.4. Comparison

To further validate the advantages of DFP-Net on crack segmentation tasks, we compared it with BiseNet, ICNet, FCN-8S, and SegNet, which are representative semantic segmentation networks. The following paragraphs give the key observations and comparative analysis for each model on crack localization.

BiseNet preserves spatial information and high-resolution features by employing global average pooling to capture global information and achieve a sufficient receptive field through downsampling feature learning. However, in complex environments with imbalanced foreground-to-background ratios, the model’s ability to capture targets may be compromised. Simultaneously, global average pooling inevitably results in the loss of shallow semantic information. Although the segmentation visualization results reflect crack shapes, the model struggles with crack edge information, as shown in Figure 7e. Nevertheless, its use of average pooling effectively ensures a relatively small computational cost, making it the fastest detection model.

ICNet incorporates the pyramid pooling module from PSPNet and reduces the model’s inference time through a cascade guidance structure. Despite introducing a cascade feature fusion module to enhance the segmentation performance, the model remains lightweight and is limited in handling tasks with high demands for detail processing. Therefore, the model performs exceptionally well in terms of speed, nearly reaching the detection speed of BiseNet. However, it is evident that such a network structure faces challenges in the feature extraction of finer cracks, only capturing broader areas of cracks, as shown in Figure 7f. The model does not exhibit a strong crack segmentation performance, with all four representative evaluation metrics being the lowest among all of the models.

The encoder of FCN extracts high-level feature information from images through operations such as convolutional layers, which play a pivotal role in tasks like image classification, object detection, and semantic segmentation. Although the encoder–decoder architecture can restore image details and has a relatively simple structure, it gradually loses image information through repeated convolution and pooling operations. Details such as edges and textures therefore cannot be recovered, as shown in Figure 7c. The method identifies prominent cracks but struggles with finer cracks, which appear fragmented or missing. The model’s extraction of crack features is insufficient, so more background pixels are recognized as cracks, which lowers its Intersection over Union despite a comparatively high recall rate.

SegNet effectively preserves contours and geometric information, showcasing superior pixel segmentation accuracy among the compared models. However, when dealing with low-resolution feature maps, detailed information tends to be overlooked. As illustrated in Figure 7d, while SegNet segments crack shapes with completeness, the edges are excessively smoothed, rendering the small cracks nearly invisible. The model excels in learning deep features of cracks, with a low probability of recognizing the background as cracks, performing well in pixel accuracy metrics. Nevertheless, the model falls short when learning shallow features of cracks, resulting in a higher probability of identifying less obvious crack regions as the background, limiting the recall rate.

Our proposed method constructs a deep feature pyramid on low-resolution images and a shallow feature pyramid on high-resolution images. This dual-feature pyramid approach captures rich contextual semantic information for cracks, enhancing the feature extraction ability of the model and ensuring the extraction of crack shape and edge information. However, high-precision crack segmentation requires detailed information, which results in a higher computational cost. Although the model is only average in recognition speed on a single image, its key strength lies in effectively capturing cracks of different shapes and sizes. Figure 7g shows that the predictions accurately capture crack shapes, retain complete edge details, and highlight fine crack branches. Compared with the baseline models, it performs best on the four metrics representing segmentation capability: pixel accuracy, recall, Intersection over Union, and F1-score. All of the model evaluation metrics are summarized in Table 2.

5. Conclusions

Targeting crack localization tasks in concrete buildings, we designed a double feature pyramid network to obtain more refined prediction results. A significant departure from previous approaches lies in our construction of a shallow feature pyramid to capture detailed information comprehensively, coupled with a deep feature pyramid to extract multi-dimensional semantic information. Additionally, we feed images of different scales to branches that extract semantic information of different dimensions, which reduces unnecessary computational cost. Ablation experiments verified the effectiveness of each module, and comparative experiments demonstrated the advantage of our work. Our method achieved the best performance among the compared networks on the STRUCTURE dataset.

Considering the importance of crack detection accuracy over detection speed in the maintenance task of concrete buildings, this method focuses on improving the reliability of crack detection. Compared to lightweight network models, this model’s performance is mediocre in terms of speed, but its speed can meet real engineering needs. This method is practical in concrete crack detection tasks and appropriate in the safety inspection task of concrete buildings. Through its highly accurate crack capture, it can effectively prevent hazardous accidents in buildings.

Author Contributions

Writing—original draft preparation, L.L.; Methodology, R.L.; Data curation, R.A. and B.C.; Investigation, H.L.; Funding acquisition, Y.L.; Writing—review and editing, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Key Research and Development Program of Heilongjiang No. 2023ZX01A18 and the National Key Research and Development Program of China No. 2022YFB4703404.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: https://github.com/OSUPCVLab/StructureCrackDataset (accessed on 7 January 2023).

Acknowledgments

The authors would like to thank all the researchers for their extensive work on machine vision. Further, the authors would like to thank the Robot Technology used for special environment key laboratory of Sichuan Province for financially supporting this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

Liu, H.; Liu, Y.; Wang, H.; Yang, J.; Zhou, X. Research on the coordinated development of greenization and urbanization based on system dynamics and data envelopment analysis—A case study of Tianjin. J. Clean. Prod. 2019, 214, 195–208.
Wei, Z.; Yuguo, J.; Jiaping, W. Greenization of venture capital and green innovation of Chinese entity industry. Ecol. Indic. 2015, 51, 31–41.
Wei, Z.; Yuguo, J.; Jiaping, W. Dynamic correlation between industry greenization development and ecological balance in China. Sustainability 2020, 12, 8329.
Erlin, B.; Jana, D. Forces of hydration that can cause havoc in concrete. Concr. Int. 2003, 25, 51–57.
Golewski, G.L. The phenomenon of cracking in cement concretes and reinforced concrete structures: The mechanism of cracks formation, causes of their initiation, types and places of occurrence, and methods of detection—A review. Buildings 2023, 13, 765.
Zandi, K. Load-Carrying Capacity of Damaged Concrete Structures. Ph.D. Thesis, Chalmers University of Technology, Gothenburg, Sweden, 2008.
Biondini, F.; Bontempi, F.; Frangopol, D.M.; Malerba, P.G. Probabilistic service life assessment and maintenance planning of concrete structures. J. Struct. Eng. 2006, 132, 810–825.
Zhang, J.; Qi, K.; Huang, Y. Calculation of moisture distribution in early-age concrete. J. Eng. Mech. 2009, 135, 871–880.
Jinlong, S.; Yanan, Q.; Tian, Y. Analysis and Control of Cracks in Concrete Buildings. Eng. Constr. Des. 2021, 2021, 149–151.
Zawad, M.R.S.; Zawad, M.F.S.; Rahman, M.A.; Priyom, S.N. A comparative review of image processing based crack detection techniques on civil engineering structures. J. Soft Comput. Civ. Eng. 2021, 5, 58–74.
Fujita, Y.; Mitani, Y.; Hamamoto, Y. A method for crack detection on a concrete structure. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR’06), Hong Kong, China, 20–24 August 2006; Volume 3, pp. 901–904.
Das, A.K.; Leung, C.K.Y. ICD: A methodology for real time onset detection of overlapped acoustic emission waves. Autom. Constr. 2020, 119, 103341.
Das, A.K.; Suthar, D.; Leung, C.K.Y. Machine learning based crack mode classification from unlabeled acoustic emission waveform features. Cem. Concr. Res. 2019, 121, 42–57.
Yang, Y.; Lepech, M.D.; Yang, E.; Li, V.C. Autogenous healing of engineered cementitious composites under wet-dry cycles. Cem. Concr. Res. 2009, 39, 382–390.
Das, A.K.; Leung, C.K.Y. Fast Tomography: A greedy, heuristic, mesh size–independent methodology for local velocity reconstruction for AE waves in distance decaying environment in semi real-time. Struct. Health Monit. 2021, 21, 1555–1573.
Das, A.K.; Mishra, D.K.; Yu, J.; Leung, C.K.Y. Smart Self-Healing and Self-Sensing Cementitious Composites—Recent Developments, Challenges and Prospects. Adv. Civ. Eng. Mater. 2019, 8, 554–578.
Das, A.K.; Leung, C.K.Y. A new power-based method to determine the first arrival information of an acoustic emission wave. Struct. Health Monit. 2018, 18, 1620–1632.
Hou, T.; Lynch, J.P. Electrical impedance tomographic methods for sensing strain fields and crack damage in cementitious structures. J. Intell. Mater. Syst. Struct. 2009, 20, 1363–1379.
Ranade, R.; Zhang, J.; Lynch, J.P.; Li, V.C. Influence of micro-cracking on the composite resistivity of Engineered Cementitious Composites. Cem. Concr. Res. 2014, 58, 1–12.
Zijl, G.P.A.G.; Slowik, V.; Filho, R.D.T.; Wittmann, F.H.; Mihashi, H. Comparative testing of crack formation in strain-hardening cement-based composites (SHCC). Mater. Struct. 2015, 49, 1175–1189.
Das, A.K.; Leung, C.K.Y.; Wan, K.T. Application of deep convolutional neural networks for automated and rapid identification and computation of crack statistics of thin cracks in strain hardening cementitious composites (SHCCs). Cem. Concr. Res. 2021, 122, 104159.
Das, A.K.; Leung, C.K.Y. A Novel Deep Learning Model for End-to-End Characterization of Thin Cracking in SHCCs. In International Conference on Strain-Hardening Cement-Based Composites; Springer: Cham, Switzerland, 2022; pp. 188–198.
Prasanna, P.; Dana, K.J.; Gucunski, N.; Basily, B.B.; La, H.M.; Lim, R.S.; Parvardeh, H. Automated crack detection on concrete bridges. IEEE Trans. Autom. Sci. Eng. 2014, 13, 591–599.
Huihui, Z.; Di, Z. A method for identifying and measuring early cracks in concrete buildings. Shanxi Archit. 2023, 49, 183–187.
Yamane, T.; Chun, P. Crack detection from a concrete surface image based on semantic segmentation using deep learning. J. Adv. Concr. Technol. 2020, 18, 493–504.
Wang, J.J.; Liu, Y.F.; Nie, X.; Mo, Y.L. Deep convolutional neural networks for semantic segmentation of cracks. Struct. Control. Health Monit. 2022, 29, E2850.
Bai, Y.; Zha, B.; Sezen, H.; Yilmaz, A. Deep cascaded neural networks for automatic detection of structural damage and cracks from images. ISPRS Ann. Photogramm. Remote. Sens. Spat. Inf. Sci. 2020, 2, 411–417.
Feng, C.; Zhang, H.; Wang, S.; Li, Y.; Wang, H. Research on Intelligent Detection Method for Surface Crack Damage of Overflow Dams in Hydroelectric Power Stations. Autom. Instrum. 2021, 36, 55–60.
Lattanzi, D.; Miller, G.R. Robust automated concrete damage detection algorithms for field applications. J. Comput. Civ. Eng. 2014, 28, 253–262.
Budiansky, B.; O’Connell, R. Elastic moduli of a cracked solid. Int. J. Solids Struct. 1976, 12, 81–97.
Aboudi, J. Stiffness reduction of cracked solids. Eng. Fract. Mech. 1987, 26, 637–650.
Hu, D.; Tian, T.; Yang, H.; Xu, S.; Wang, X. Wall crack detection based on image processing. In Proceedings of the 2012 Third International Conference on Intelligent Control and Information Processing, Dalian, China, 15–17 July 2012; pp. 597–600.
Zhou, J. Research on Crack Detection Technology Based on Convolution Neural Network. Master’s Thesis, Xi’an University of Technology, Xi’an, China, 2020.
Canny, J. A Computational Approach To Edge Detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, PAMI-8, 679–698.
Gao, W.; Zhang, X.; Yang, L.; Liu, H. An improved Sobel edge detection. In Proceedings of the 2010 3rd International Conference on Computer Science and Information Technology, Chengdu, China, 9–11 July 2010; Volume 1, pp. 67–71.
Li, Y.; Liu, B. Improved edge detection algorithm for canny operator. ITAIC 2022, 10, 1–5.
Wu, F.; Zhu, C.; Xu, J.; Bhatt, M.W.; Sharma, A. Research on image text recognition based on canny edge detection algorithm and k-means algorithm. Int. J. Syst. Assur. Eng. Manag. 2022, 13, 72–80.
Kazemi, M.F.; Mazinan, A.H. Neural network based CT-Canny edge detector considering watermarking framework. Evol. Syst. 2022, 13, 145–157.
Chen, B.; Zhang, H.; Li, Y.; Wang, S.; Zhou, H.; Lin, H. Quantify pixel-level detection of dam surface crack using deep learning. Meas. Sci. Technol. 2022, 33, 065402.
Pang, J.; Zhang, H.; Zhao, H.; Li, L. DcsNet: A real-time deep network for crack segmentation. Signal Image Video Process. 2022, 16, 911–919.
Pang, J.; Zhang, H.; Feng, C.; Li, L. Research on crack segmentation method of hydro-junction project based on target detection network. KSCE J. Civ. Eng. 2020, 24, 2731–2741.
Liu, X. Detection of Bottom Surface Cracks in Concrete Bridges Based on Digital Image Processing. Master’s Thesis, Wuhan University of Technology, Wuhan, China, 2014.
Adhikari, R.; Moselhi, O.; Bagchi, A. Image-based retrieval of concrete crack properties for bridge inspection. Autom. Constr. 2014, 39, 180–194.
Oliveira, H.; Correia, P.L. CrackIT—An image processing toolbox for crack detection and characterization. In Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP), Paris, France, 27–30 October 2014; pp. 798–802.
Cho, H.; Yoon, H.J.; Jung, J.Y. Image-based crack detection using crack width transform (CWT) algorithm. IEEE Access 2018, 6, 60100–60114.
Shao, C.; Chen, Y.; Xu, F.; Wang, S. A kind of pavement crack detection method based on digital image processing. In Proceedings of the 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chengdu, China, 20–22 December 2019; pp. 397–401.
Gehri, N.; Mata-Falcón, J.; Kaufmann, W. Automated crack detection and measurement based on digital image correlation. Constr. Build. Mater. 2020, 256, 119383.
Zhang, L.; Yang, F.; Zhang, Y.D.; Zhu, Y.J. Road crack detection using deep convolutional neural network. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3708–3712.
Mandal, V.; Uong, L.; Adu-Gyamfi, Y. Automated road crack detection using deep convolutional neural networks. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 5212–5215.
Zhang, Y.; Huang, J.; Cai, F. On bridge surface crack detection based on an improved YOLO v3 algorithm. IFAC-PapersOnLine 2020, 53, 8205–8210.
Yang, J.; Fu, Q.; Nie, M. Road crack detection using deep neural network with receptive field block. IOP Conf. Ser. Mater. Sci. Eng. 2020, 782, 042033.
Yan, K.; Zhang, Z. Automated asphalt highway pavement crack detection based on deformable single shot multi-box detector under a complex environment. IEEE Access 2021, 9, 150925–150938.
Li, L.; Zhang, H.; Pang, J.; Huang, J. Dam surface crack detection based on deep learning. In Proceedings of the 2019 International Conference on Robotics, Intelligent Control and Artificial Intelligence, Shanghai, China, 20–22 September 2019; pp. 738–743.
König, J.; Jenkins, M.D.; Barrie, P.; Mannion, M.; Morison, G. A convolutional neural network for pavement surface crack segmentation using residual connections and attention gating. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1460–1464.
Dung, C.V. Autonomous concrete crack detection using deep fully convolutional neural network. Autom. Constr. 2019, 99, 52–58.
Wang, W.; Su, C. Semi-supervised semantic segmentation network for surface crack detection. Autom. Constr. 2021, 128, 103786.
Fu, H.; Meng, D.; Li, W.; Wang, Y. Bridge crack semantic segmentation based on improved Deeplabv3+. J. Mar. Sci. Eng. 2021, 9, 671.
Xiang, C.; Wang, W.; Deng, L.; Shi, P.; Kong, X. Crack detection algorithm for concrete structures based on super-resolution reconstruction and segmentation network. Autom. Constr. 2022, 140, 104346.

Figure 1. The detection effect of improved YOLOv3 on the crack data set. The red box marks a part of the crack, and the position of the crack is somewhat clear through a large number of small marking boxes.

Figure 2. Network architecture of DFP-Net.

Figure 3. Cascade feature fusion unit.

Figure 4. Concrete crack samples in various scenarios.

Figure 5. Pre-processing effects of the three methods applied independently and in random combinations. Column (a) presents the original images. Columns (b,c,e) show the results of the three separate methods. Columns (d,f,g) show combinations of two methods. Column (h) shows the effect of all three methods applied together.

Figure 6. Comparison of prediction results of the second FPN branches. The three branch outputs have sizes of 16 × 16, 64 × 64, and 256 × 256; we resize them to the same size for comparison.

Figure 7. Visual prediction comparison of models on crack test set.

Table 1. Comparison of each feature fusion. The result of primary fusion contains semantic information of the first two branches. The secondary fusion added information extracted from the third branch based on primary fusion. And the tertiary fusion fused all the semantic information.

Branch	Recall (%)	PA (%)	IoU (%)	F1
primary fusion	37.85	23.00	16.79	0.29
secondary fusion	58.10	42.29	33.86	0.51
tertiary fusion	62.95	65.99	44.71	0.62

Table 2. Results on structure crack test set with image resolution 256 × 256.

Model	Recall (%)	PA (%)	IoU (%)	F1	FPS
BiseNet	56.59	41.58	31.52	0.48	166.8
IC-Net	45.34	27.32	24.50	0.35	160.5
FCN-8S	58.74	51.12	37.61	0.55	69.1
SegNet	56.71	64.33	43.14	0.60	36.5
Our method	62.95	65.99	44.71	0.62	43.2
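The speed column of Table 2 can be read as a per-image latency. The sketch below converts the reported frames per second into milliseconds per image, assuming that the column measures single-image inference throughput; the helper name `latency_ms` is hypothetical.

```python
# Frames per second as reported in Table 2
fps = {
    "BiseNet": 166.8,
    "IC-Net": 160.5,
    "FCN-8S": 69.1,
    "SegNet": 36.5,
    "Our method": 43.2,
}


def latency_ms(frames_per_second):
    """Average time per image in milliseconds for a given throughput."""
    return 1000.0 / frames_per_second


latencies = {name: round(latency_ms(value), 1) for name, value in fps.items()}
for name, ms in latencies.items():
    print(f"{name}: {ms} ms per image")
```

Under this assumption, the proposed method needs about 23 ms per 256 × 256 image, between FCN-8S (about 14 ms) and SegNet (about 27 ms).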

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
