Article

A Novel Ground Filtering Method for Point Clouds in a Forestry Area Based on Local Minimum Value and Machine Learning

School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
*
Author to whom correspondence should be addressed.
Submission received: 29 July 2022 / Revised: 3 September 2022 / Accepted: 7 September 2022 / Published: 10 September 2022
(This article belongs to the Special Issue Methods and Applications of Lidar Remote Sensing)

Abstract

Lidar point cloud filtering is the process of separating ground points from non-ground points and is a particularly important part of point cloud data processing. Forest filtering has always been a difficult topic in point cloud filtering research. Because vegetation cannot be fully characterized by the structural rules used for man-made objects, and because the terrain in woodland areas is diverse and complex, filtering in forest areas is a particularly difficult task. Moreover, only a few studies have tested the application of point cloud filtering methods in forest areas; the parameter settings of existing filtering methods are highly complex, and their terrain adaptability is weak. This paper proposes a new filtering method for forest areas that effectively combines iterative minima with machine learning, thereby greatly reducing the degree of manual participation. Through filtering tests on three types of woodland, the filtering results were evaluated based on the filtering error definitions proposed by the ISPRS and were compared with the results of other classical methods. Experimental results highlight the advantages of the proposed method, including its high accuracy, strong terrain universality, and limited number of parameters.

1. Introduction

Light detection and ranging (LiDAR) is a laser-based remote sensing measurement technology. Its principle is to aim a laser at the surface of an object, measure the time it takes for the laser pulse to return to the sensor, and then calculate the distance [1,2,3]. This technology is commonly used in geographic information systems (GIS) to generate digital elevation models (DEMs) or digital surface models (DSMs) for 3D modeling [4,5,6,7,8]. As an active remote sensing method, LiDAR can be used for carbon dioxide retrieval [9], aerosol detection [10], boundary layer detection [11], and target classification [12]. With the maturity of hardware production technology and the rapid development of related data processing software, LiDAR has made great research progress and has shown great advantages in many aspects. Compared with traditional surveying methods, which require extensive field control and aerial triangulation and suffer from long mapping cycles and limited precision when mapping the topographic surface, LiDAR can observe the ground around the clock [13,14,15]. In addition, when LiDAR scans a work area that is occluded or shaded by vegetation, it can more directly and effectively obtain 3D information on surface targets [16,17,18,19].
As the cost and size of scanners have decreased, LiDAR has found new applications. In forestry science [20,21], LiDAR data have been used for tree and canopy height measurements and large-scale biomass estimation. In geological disaster monitoring [22,23], airborne LiDAR can be used to predict areas prone to landslides, providing a reference for safety assessments. In route planning [24,25], LiDAR can also be used for pedestrian wayfinding to ensure the accessibility of routes in urban environments. In archaeology [26,27], researchers have enhanced the visualization of high-resolution digital elevation models generated by airborne LiDAR by combining perception and interpretation, discovering numerous new settlement sites of ancient communities and new archaeological ground features.
Point cloud ground filtering is an important step in LiDAR data processing that can effectively distinguish ground points from non-ground points to obtain real ground point coordinates and improve the accuracy of DTMs [28,29,30]. However, point cloud filtering is generally difficult due to the diversity of terrain and features, and the features exhibited by LiDAR data are often very complex [31]. The International Society for Photogrammetry and Remote Sensing (ISPRS) team analyzed the point cloud data structure and, when testing filtering methods, listed some objects that are difficult to filter [32]. Traditional point cloud filtering methods include the mathematical morphology method [33], terrain slope method [34], and irregular triangulation method [35]. Many studies have explored the optimization and improvement of traditional filtering methods [36,37,38,39,40]. With the wide application of various machine learning methods, some scholars have gradually integrated point cloud filtering with machine learning. For instance, Rizaldy et al. [41] used fully convolutional networks for ground and multiclass classification of airborne laser scanner point clouds, providing highly accurate classification results in a computationally efficient manner. Yi et al. [42] proposed an airborne LiDAR building-extraction method based on the naive Bayes–RANSAC method for proportional segmentation of quantitative features. Xue et al. [43] proposed a point cloud classification method based on ICSF and weighted weakly correlated random forests, with a particular focus on the extraction of buildings from urban LiDAR point clouds, which improved the accuracy and efficiency of point cloud classification.
Many mature methods are used in urban point cloud filtering research [44,45]. However, these methods have a poor filtering effect in forest areas. Vegetation is a special class of objects that cannot be completely generalized according to the structure of ground objects, and in some steep areas the ground points on a slope may be at the same height as the vegetation points. Therefore, in complex forest areas, filtering methods still need to be improved. The purpose of our research was to improve the accuracy and terrain applicability of point cloud filtering in forest areas. In this paper, a point cloud filtering method is proposed for forest areas that combines machine learning with iterative minima. Section 2 introduces the principles of the method and the experimental data, Section 3 verifies the filtering results and tests the method on dense point clouds, Section 4 compares the results of this method with those of other methods and discusses its advantages and disadvantages, and Section 5 presents the conclusions of this study.

2. Method

The main idea of the proposed method is to combine an iterative minima calculation of point cloud height difference with machine learning to classify raw points into ground and non-ground points by prediction. This method is mainly divided into three parts. First, the variable factors are obtained from the original point cloud data. Second, the variable factors are selected using the mRMR [46] feature selection method. Third, the ground and non-ground points are predicted using a decision tree model [47]. The flow of the method is shown in Figure 1.

2.1. Calculation of Point Cloud Height Difference

In this paper, a filtering reference surface is defined, which is obtained by means of irregular triangulation interpolation [48]. The filtering reference surface simulates the ground and helps distinguish ground points from non-ground points. Two height differences are defined for the point cloud: the partial height difference of the point cloud, and the height difference between the point cloud and the filtering reference surface. Selecting both height differences as variable factors at the same time can reduce the errors caused by some special terrains. Figure 2 shows the difference between the two height differences.
Step 1: Calculation of the local height difference of the point cloud. First, the point clouds are preprocessed: the data are read, a KD tree [49] is built from the position information of the point cloud, a KNN [50] search is used to process the point cloud hierarchically from bottom to top, and the point with the smallest elevation in each layer is extracted as the partial lowest point. A "layer" here means that, after a KNN search returns a set of points, their elevations are constrained so that the point cloud is divided into layers. The partial height difference is defined as the difference between the height of a point in a layer and the partial lowest point of that layer; this difference is then calculated layer by layer to obtain the partial height difference of all points.
Step 2: Calculation of the height difference between the point cloud and the filtering reference surface. First, a height threshold is set and compared with the local height difference obtained in Step 1 to obtain preliminary ground points. The function of the height threshold here is to roughly distinguish ground points from non-ground points; it does not need to be very precise and can be changed according to the specific terrain. Generally speaking, 1 m is suitable for most terrains. If a point's height difference is within the height threshold, the point is regarded as a ground point; if it is greater than the height threshold, the point is regarded as a non-ground point. Subsequently, the ground points within the height threshold are interpolated in the form of an irregular triangulation [48] to obtain the filtering reference surface. Finally, the distance between each point and the filtering reference surface is extracted, yielding the height difference between the point cloud and the filtering reference surface.
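The two height-difference calculations above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the neighborhood size `k`, the use of SciPy's `cKDTree` for the KNN search, and `LinearNDInterpolator` as the irregular-triangulation (TIN) interpolation are assumptions made for the example.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.interpolate import LinearNDInterpolator

def height_difference_features(points, k=30, height_threshold=1.0):
    """Sketch of Section 2.1. `points` is an (N, 3) array of x, y, z."""
    xy, z = points[:, :2], points[:, 2]

    # Step 1: for each point, take the lowest of its k nearest
    # neighbours in the horizontal plane as the local lowest point.
    tree = cKDTree(xy)
    _, idx = tree.query(xy, k=k)
    local_dh = z - z[idx].min(axis=1)        # local (partial) height difference

    # Step 2: points within the height threshold are preliminary ground
    # points; interpolating them as a TIN gives the filtering reference
    # surface, and the vertical distance to it is the second feature.
    ground = local_dh <= height_threshold
    tin = LinearNDInterpolator(xy[ground], z[ground])
    surface_z = tin(xy)
    surface_z = np.where(np.isnan(surface_z), z, surface_z)  # outside hull
    surface_dh = z - surface_z               # height difference to surface

    return local_dh, surface_dh
```

For a point whose horizontal neighbourhood contains true ground, `local_dh` approximates its height above the ground, so vegetation returns stand out even before the reference surface is built.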

2.2. Selection of Variable Factors

Feature selection plays an important role in machine learning. While the variables obtained in the experiment contain very important classification information, they also contain some weakly relevant and redundant information. To avoid classifier over-fitting and to improve the accuracy and efficiency of the classifier, feature selection is necessary. Many feature selection methods currently exist [51,52]. The advantage of the mRMR [46] algorithm is that it considers not only the correlation between the input variables and the output variable but also the correlation among the input variables themselves. Through the mRMR algorithm, the feature subset with the largest correlation with the output variable and the smallest correlation among the input variables can be found from the initial input variables. Wang et al. [53] proved by experiment that, regardless of the criterion adopted, mRMR yields encouraging feature selection results compared with the traditional K-BEST feature selection method. Therefore, we used the mRMR algorithm to screen the nine obtained variable factors (namely, the 3D coordinates of the point cloud, the normal vector of the point cloud in the x, y, and z directions, the curvature of the point cloud, the partial height difference of the point cloud, and the height difference between the point cloud and the filtering reference surface).
The mRMR method is a feature selection method that uses mutual information as the subset evaluation criterion; it can effectively measure the correlation between features and categories as well as the redundancy within a feature set. One goal of feature selection is to find a subset of features S that contains m features and has the highest correlation with class c, called max-relevance. The other goal is to reduce the redundancy among the features in the subset S, called minimum redundancy.
The correlation between the feature subset S and class c is measured by Equation (1); the larger the value of D, the higher the relevance between the feature subset and class c. The redundancy of the features in the subset S is calculated by Equation (2); the smaller the value of R, the less redundant the feature subset.
$$\max D(S, c), \qquad D = I(\{x_i, i = 1, \ldots, m\}; c) \quad (1)$$
$$\min R(S), \qquad R = \frac{1}{|S|^2} \sum_{x_i, x_j \in S} I(x_i, x_j) \quad (2)$$
Combining the above two indicators, the criterion for measuring the quality of a feature subset under the mRMR framework is determined by Equation (3). The higher the value of $\Phi$, the better the feature subset.
$$\max \Phi(D, R), \qquad \Phi = D - R \quad (3)$$
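To make Equations (1)–(3) concrete, the selection can be written as a greedy loop that, at each step, adds the feature maximizing relevance minus mean redundancy with the features already chosen. This is a hedged sketch, not the authors' code: scikit-learn's mutual-information estimators stand in for the I(·;·) terms, and the incremental mean-redundancy form is the common way Equation (3) is applied in practice.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_rank(X, y, n_select, random_state=0):
    """Greedy mRMR: repeatedly pick the feature maximizing Phi = D - R."""
    relevance = mutual_info_classif(X, y, random_state=random_state)  # I(x_i; c)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_select):
        scores = []
        for j in remaining:
            # Mean redundancy I(x_j, x_s) with the already-selected features.
            redundancy = np.mean([
                mutual_info_regression(X[:, [j]], X[:, s],
                                       random_state=random_state)[0]
                for s in selected]) if selected else 0.0
            scores.append(relevance[j] - redundancy)                  # Phi = D - R
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

A feature strongly related to the class labels is selected first; a near-duplicate of an already-selected feature is penalized by its high redundancy term.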
According to the evaluation results, each variable factor is scored and ranked from high to low. Figure 3 shows the scores of all variable factors: the 3D coordinates x, y, and z of the point cloud have scores of 0.0463, 0.082, and 0.086, respectively; the point cloud y-normal vector scores 0.1554; the point cloud partial height difference scores 0.2923; the point cloud x-normal vector scores 0.3676; the point cloud curvature scores 0.3725; the point cloud z-normal vector scores 0.3771; and the height difference between the point cloud and the filtering reference surface scores 0.5218.
Selecting “excellent” variable factors from the candidate variable factors can improve model performance and the accuracy and efficiency of filtering. According to the score ranking of each variable, the scores of the three variable factors of the 3D coordinates x, y, and z of the point cloud are all less than 0.1, and the impact on the final result is minimal. Therefore, these three variable factors are eliminated, and the remaining six variable factors are retained.

2.3. Model Validation and Analysis

A decision tree model [47] was used to train on and classify the six variable factors mentioned above (the point cloud y-normal vector, the point cloud partial height difference, the point cloud x-normal vector, the point cloud curvature, the point cloud z-normal vector, and the height difference between the point cloud and the filtering reference surface).
As a classification algorithm, the decision tree algorithm uses an induction algorithm to generate a series of visible rules and decision trees after processing the data, and then uses the decision tree to distinguish new data by rules [54,55]. Among them, the processed data are called the training set, and the analyzed data are called the test set. In short, the goal of the decision tree algorithm is to use the training set data to automatically build a decision tree model, and then use the decision tree to classify or predict the test data.
The main task of the decision tree algorithm is to find the most critical node and the most suitable branch decision rule; impurity is the key indicator used to measure the quality of splits in a classification tree [56]. In general, the lower the impurity, the better the decision tree fits the training set. Impurity can be characterized by information entropy, calculated as in Equation (4).
$$\mathrm{Entropy}(t) = -\sum_{i=0}^{c-1} p(i \mid t) \log_2 p(i \mid t) \quad (4)$$
Here, t represents a given node, i represents any class label, and p(i|t) represents the proportion of class i at node t. Note that when information entropy is used, the information gain is calculated, that is, the difference between the entropy of the parent node and the weighted entropy of its child nodes. A decision tree grown using information entropy is more "meticulous", especially when the model training effect is poor or the weights of the factors are not obvious.
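Equation (4), and the information gain computed from it, can be written in a few lines; this is a generic sketch of the textbook definitions rather than code from the paper.

```python
import numpy as np

def entropy(labels):
    """Node impurity, Equation (4): -sum_i p(i|t) * log2 p(i|t)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, children):
    """Entropy of the parent node minus the weighted entropy of the
    child nodes produced by a candidate split."""
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)
```

A split that separates the two classes perfectly, e.g. `information_gain([0, 0, 1, 1], [[0, 0], [1, 1]])`, achieves the maximum gain of 1 bit.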
The proposed method trains the model, performs cross-validation, and uses the filtering error evaluation standard proposed by the ISPRS [32] to verify the filtering accuracy. The error definitions are shown in Table 1, where a represents the number of ground points correctly classified as ground after filtering, b denotes the number of ground points misclassified as non-ground points, c represents the number of non-ground points misclassified as ground points, and d represents the number of non-ground points correctly classified after filtering. The total error reflects the filtering quality and feasibility of the method; a smaller total error indicates a more accurate filtering result.
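The error definitions of Table 1 translate directly into proportions. The sketch below assumes the standard ISPRS formulas: type I error b/(a + b), type II error c/(c + d), and total error (b + c)/(a + b + c + d).

```python
def isprs_errors(a, b, c, d):
    """ISPRS filtering errors from the four counts of Table 1:
    a -- ground points correctly classified as ground
    b -- ground points misclassified as non-ground (type I)
    c -- non-ground points misclassified as ground (type II)
    d -- non-ground points correctly classified as non-ground
    """
    type_1 = b / (a + b)                  # share of rejected ground points
    type_2 = c / (c + d)                  # share of accepted non-ground points
    total = (b + c) / (a + b + c + d)     # overall misclassification rate
    return type_1, type_2, total
```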

2.4. Experimental Data

Considering that different terrains may affect the filtering results, and to ensure the terrain applicability of the filtering method, we used three groups of different types of forest data to train the filtering model, with each group consisting of three datasets. Group I contains data 1~3; the terrain in this group is flat or gently sloping, with no steep slopes. Group II includes data 4~6, with steep or terraced slopes. Group III includes data 7~9, with high, steep slopes.
The LiDAR data used in this method come from the OpenTopography website. Data 1, data 4, and data 7 are from the dataset "Hydrologic Effects of Forest Restoration, WA 2021". The study area of this LiDAR dataset consists of sections located northwest of Ellensburg, WA, and covers approximately 62.7 km2. Data 2, data 3, and data 5 are from the dataset "Mapping the Kaibab Plateau, AZ". This dataset provides LiDAR data for portions of the Kaibab National Forest and Grand Canyon National Park on the Kaibab Plateau; the study area covers about 1882.3 km2. Data 6, data 8, and data 9 are from the dataset "Study in Erosion Response: Horse Linto, CA 2019". The study area of this LiDAR dataset covers approximately 63 km2 near Horse Linto, in Trinity National Forest, California. Information about the collected data can be found in Table 2.

3. Experiment and Results

3.1. Validation of the Filtering Results

Cross-validation is used to verify the filtering accuracy as follows. All experimental data of Groups I–III are pooled; each group has three datasets, so there are nine experimental datasets (data 1~data 9) in total. Specifically, one dataset is withheld, the remaining eight datasets are used for model training, and the withheld dataset is then filtered and verified. For example, data 1 is withheld, the remaining data 2~data 9 are used for model training, and data 1 is then used to verify the filtering accuracy; the verification of all nine datasets is completed in turn. The verification results are shown in Figure 4, Figure 5 and Figure 6.
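This hold-one-dataset-out scheme corresponds to scikit-learn's LeaveOneGroupOut splitter. The sketch below illustrates only the cross-validation structure; the variable factors, labels, and group indices are random placeholders standing in for the real nine datasets.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Placeholder stand-ins for the stacked variable factors, the
# ground/non-ground labels, and the dataset index (0..8) of each point.
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 6))                      # six variable factors
y = (X[:, 0] + 0.1 * rng.normal(size=900) > 0).astype(int)
groups = np.repeat(np.arange(9), 100)              # nine datasets

# Each split trains on eight datasets and validates on the ninth.
scores = cross_val_score(DecisionTreeClassifier(random_state=0),
                         X, y, groups=groups, cv=LeaveOneGroupOut())
```

`scores` holds one accuracy per held-out dataset; averaging them gives an overall cross-validation accuracy of the kind reported in this section.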
The filtering verification results of Group I are shown in Figure 4. The terrain in this group is relatively flat, and some areas have gentle slopes. The filtering accuracy is 92.18% for data 1, 82.25% for data 2, and 81.19% for data 3. The filtering verification results of Group II are shown in Figure 5. This group includes gentle and steep slopes, and the overall terrain is relatively simple. The filtering accuracy is 91.56% for data 4, 81.69% for data 5, and 83.32% for data 6. The filtering verification results of Group III are shown in Figure 6. The terrain of this group consists of steep and high, steep slopes, with some ravines and complex terrain. The filtering accuracy is 94.32% for data 7, 89.03% for data 8, and 90.25% for data 9.
As can be seen from the filtering result graphs, this method achieves good performance in Groups I–III, and the filtering accuracy of each dataset exceeds 80%. In Group III in particular, with its steep and high, steep slopes, the classification effect of the filtering is excellent. At the same time, judging from the misclassifications in each result map, the misclassified points mainly lie near the ground points, very close to the terrain surface.
The filtering accuracy of each of Groups I–III, as well as the overall filtering accuracy, was calculated to better demonstrate the accuracy and terrain applicability of this method. The results are shown in Table 3. The filtering accuracy of Group III is the highest, reaching 91.26%. As shown in Figure 7, the overall filtering cross-validation accuracy is 87.30%, and the final training model accuracy is 87.25%, which shows that the algorithm performs well in ground point filtering under different terrain features and can effectively extract ground points.

3.2. Testing with Dense Point Cloud

3.2.1. Test Data

Point clouds from other forest datasets on the OpenTopography website were used for filtering tests; three forest datasets with different terrains, Test 1–Test 3, were selected. Information about the test data is shown in Table 4.
Test 1 is from the dataset "Characterizing Forest Structure Changes and Effects on Snowpack, AZ 2017", located approximately 8 km west and north of Flagstaff, AZ, within the Coconino National Forest; the terrain is relatively flat, and the data contain 10,339,269 points. Test 2 is from the dataset "2014 USFS Tahoe National Forest Lidar", located in the Tahoe National Forest; the terrain includes slopes, and the data contain 7,797,662 points. Test 3 is from the dataset "2013 USFS Tahoe National Forest Lidar", located in the Tahoe National Forest, with high, steep slopes and complex terrain features; the data contain 8,942,695 points.

3.2.2. Test Results

The filtering accuracy evaluation results are shown in Table 5. Test 1 is a flat terrain with a total filtering error of 3.46%. Type I and II errors are very small and evenly distributed. Test 2 has a relatively steep terrain with a total filtering error of 11.50%. The type I error is large, and the misclassification rate of ground points is high. Test 3 is a steep slope terrain including ravines. The total filtering error is 9.91%, and the type I and II errors are relatively close. The results are shown in Figure 8.
The total error represents the feasibility of the filtering method. Judging from the total filtering errors on the three different terrains, the filtering effect of the proposed method is relatively good, and the method is feasible.

3.3. The Effectiveness of Feature Selection

In this research, feature selection was performed on the variable factors, and the three variable factors with scores less than 0.1 were excluded from the nine variable factors. To prove the necessity of feature selection, relevant experiments were carried out: the test data Test 1–Test 3 were classified using all variable factors and using only the variable factors retained after feature selection. The compared results are shown in Table 6.
According to the test results on Test 1–Test 3, the filtering accuracy without feature selection is 96.48%, 88.39%, and 90.02%, and the filtering accuracy after feature selection is 96.54%, 88.50%, and 90.09%. At the same time, the time consumed by the filtering model without feature selection is 203 min, while that of the model after feature selection is 96 min. The time consumption of the model after feature selection is clearly much shorter, being less than half of the former.
It can be concluded that the filtering results using all variable factors differ little from those using the variable factors retained after feature selection, and the point cloud classification accuracy after feature selection is slightly higher, indicating that the eliminated variable factors (the point cloud 3D coordinates) are dispensable for the accuracy of the results. However, the time required to run the model without feature selection is far greater than that of the model after feature selection; the total time consumed after feature selection is greatly reduced, which greatly improves the filtering efficiency. This indicates that the three excluded variables do not improve the filtering results but only increase the time consumption of the experiment. Therefore, in this research, we used feature selection to screen the variable factors.

4. Discussion

4.1. Comparison with Other Methods

To facilitate comparison with other methods, three other filtering methods were applied to the test data Test 1–Test 3. These methods include two traditional methods, namely, the morphological filtering method (MF) [57] and the slope-based filtering method (SF) [34], as well as the recently proposed cloth simulation filtering method (CSF) [58]. The results of the methods are compared in Table 7.
A type I error is the misclassification of original ground points into the non-ground point set, and a type II error is the misclassification of non-ground points into the ground point set. The total error is computed as the proportion of type I and type II errors in the total number of points. The smaller the total error, the more accurate the filtering result, so the total error is the most important indicator. According to the total errors, all methods perform well in Test 1, while in Test 2 and Test 3 the total errors of the MF and SF filtering methods are high, and the total errors of CSF and the proposed method are low. This means that the proposed method and the CSF method have higher accuracy and stronger terrain applicability.
In addition, concerning the overall total error, the average total error of CSF is 7.74%, that of MF is 12.70%, that of SF is 15.53%, and that of the proposed method is 7.91%. Compared with the MF and SF filtering methods, the accuracy of the proposed method is improved. Although the average total error of the proposed method is 0.17% higher than that of CSF, the CSF method needs its parameters set according to the terrain type. Compared with the CSF method, the advantage of the proposed method is that fewer parameters need to be set, and there is no need to distinguish the type of terrain.
Compared with other filtering methods, the proposed method shows certain advantages in all aspects and can achieve good filtering effects in three different terrains, thereby showing its high precision and strong universality.

4.2. Strength of the Proposed Method

The test results show that the proposed method has excellent filtering accuracy for forest land, relatively uniform type I and type II errors, and a small total filtering error. Experiments on forest point cloud data with different terrain features show that the proposed filtering method performs well across all tested terrains, especially in steep terrain, areas with multiple ravines, and complex scenes, and shows strong terrain applicability in forest areas. The proposed filtering method was also applied to urban areas for testing, but the experimental results showed that the method is not effective there, as it misjudges roofs as ground points, indicating that the proposed method is more suitable for scenes with an uneven terrain surface.
Results of the filtering comparison experiment show that the morphological filtering method must set several parameters, such as window size, altitude threshold, and elevation scale coefficient. The slope-based filtering method must set grid size and slope threshold, and the CSF must set terrain mode, distance threshold between the point cloud and cloth simulation point, and cloth rigidity. Meanwhile, the proposed method requires less manual participation and only needs to set a universal height threshold when calculating the height difference at the early stage. Moreover, the subsequent experimental steps do not require any parameter setting, thereby greatly improving the convenience of the method.

4.3. Misclassification Analysis

The distribution of the misclassified points in the filtered test data is shown in Figure 9. As can be seen from the figure, this method still produces many misclassified points. Since the original data are all from forest areas, the ground points in the filtering results are almost entirely covered by forest vegetation. Judging from the misclassifications in each result map, the misclassified points mainly lie near the ground points, very close to the terrain surface.
According to the terrain characteristics of the forest and the principle of the proposed filtering method, the above misclassified points may be caused by the following:
(1)
Vegetation located close to the ground in the forest area is difficult to distinguish. The forest area contains much low vegetation, and the shape, extent, and location of this vegetation show a great degree of randomness and uncertainty. Moreover, because very low vegetation lies very close to the ground, it can easily be misidentified as ground.
(2)
The fitting of the filter reference surface is not good enough. The proposed method uses an irregular triangular network to interpolate the ground points, but some errors may be generated in the actual interpolation process, thereby leading to a filtering reference surface that cannot be completely fitted with the actual ground.
(3)
The decision tree model was used for training, and this model has an accuracy of 87.25%. In the future, a model that is more suitable for point cloud classification can be selected for training to improve the model accuracy.
In future research, we plan to further optimize and improve the proposed method considering the above aspects.

5. Conclusions

We propose a point cloud filtering method suitable for forest areas that combines iterative minima and machine learning, providing a new idea for point cloud filtering. The method uses the trained filtering model to predict ground and non-ground points, with a cross-validation accuracy of 87.3%, and its type II error is generally lower than its type I error. The method performs better in forest areas with steep slopes; in Test 2 in particular, its total error is 1.63% lower than that of the cloth simulation filtering method, 11.04% lower than that of the morphological filtering method, and 21.52% lower than that of the slope-based filtering method. The experimental results highlight several advantages of this method, including its limited number of parameters, low manual participation, high filtering accuracy, strong terrain applicability, and excellent performance compared with other filtering methods.
This study has several shortcomings. In follow-up work, to optimize the proposed method we plan to examine low vegetation located near the surface, improve the fitting accuracy of the filtering reference surface, and adopt a training model that is more suitable for point cloud classification.

Author Contributions

Conceptualization, Y.W. and W.W.; methodology, Y.W. and W.W.; software, Y.W., W.W. and M.S.; validation, Y.W. and W.W.; formal analysis, Y.W., W.W. and M.S.; investigation, Y.W., W.W. and M.S.; resources, Y.W., W.W. and Y.W.; data curation, Y.W.; writing—original draft preparation, Y.W. and W.W.; writing—review and editing, Y.W. and W.W.; visualization, Y.W., W.W. and M.S.; supervision, Y.W. and W.W.; project administration, W.W.; funding acquisition, W.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Key Research and Development Program of China (2018YFB0504500), the National Natural Science Foundation of China (41901295), Basic Science-Center Project of National Natural Science Foundation of China (72088101), the Natural Science Foundation of Hunan Province, China (2020JJ5708), and the Key Program of the National Natural Science Foundation of China (41930108).

Data Availability Statement

All data sets of this experiment can be found from Open Topography (www.opentopography.org accessed on 6 March 2022), which include data 1,4,7 (https://0-doi-org.brum.beds.ac.uk/10.5069/G989142F; accessed on 5 April 2022), data 2,3,5 (https://0-doi-org.brum.beds.ac.uk/10.5069/G9TX3CH3; accessed on 6 April 2022), data 6,8,9 (https://0-doi-org.brum.beds.ac.uk/10.5069/G9NK3C62; accessed on 7 April 2022), test 1 (https://0-doi-org.brum.beds.ac.uk/10.5069/G90Z716B; accessed on 8 April 2022), test 2 (https://0-doi-org.brum.beds.ac.uk/10.5069/G9V122Q1; accessed on 8 April 2022), test 3 (https://0-doi-org.brum.beds.ac.uk/10.5069/G9ZS2TF2; accessed on 8 April 2022). We express our sincere gratitude to the data providers and related agencies.

Acknowledgments

We sincerely thank the people who provided support and advice for this article.

Conflicts of Interest

The authors have declared that no competing interests exist.

Figure 1. Flow of the method.
Figure 2. Illustration of two height differences: (a) partial height difference of the point cloud; (b) height difference of the point cloud filtering reference surface.
Figure 3. Feature importance scores ranked using the mRMR method (features 1–3: point cloud 3D coordinates; feature 4: y-normal vector; feature 5: partial height difference; feature 6: x-normal vector; feature 7: curvature; feature 8: z-normal vector; feature 9: height difference of the filtering reference surface).
Figure 4. Filtering results for group I.
Figure 5. Filtering results for group II.
Figure 6. Filtering results for group III.
Figure 7. Cross-validation results.
Figure 8. Filtering results of the test data.
Figure 9. Filtering misclassification results of test data.
Table 1. Definition of filtering error.

| Category | Ground Points | Non-Ground Points | Total Number | Errors |
|---|---|---|---|---|
| Ground points | a | b | e = a + b | b/e (T.I) |
| Non-ground points | c | d | f = c + d | c/f (T.II) |
| Total number | | | n = e + f | (b + c)/n (T.E.) |
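The error definitions of Table 1 can be sketched directly from the four confusion-matrix counts. The following is our own illustration, not the authors' code; the function name and example counts are hypothetical.

```python
# Illustrative sketch: ISPRS filtering errors (Table 1) from the four counts.
# a: ground classified as ground        b: ground classified as non-ground
# c: non-ground classified as ground    d: non-ground classified as non-ground

def filtering_errors(a: int, b: int, c: int, d: int) -> dict:
    e = a + b  # total true ground points
    f = c + d  # total true non-ground points
    n = e + f  # all points
    return {
        "type_I": b / e,       # T.I: ground points rejected as non-ground
        "type_II": c / f,      # T.II: non-ground points accepted as ground
        "total": (b + c) / n,  # T.E.: overall misclassification rate
    }

# Hypothetical counts for illustration only.
errors = filtering_errors(a=9000, b=1000, c=500, d=4500)
print(errors)  # {'type_I': 0.1, 'type_II': 0.1, 'total': 0.1}
```

Note that the total error is a point-count-weighted combination of the two error types, which is why a method with a very low Type I error can still have a high total error when its Type II error is large (as seen for some baselines in Table 7).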
Table 2. Basic information of the collected data.

| Group | Data | Number of Points | Feature | Dataset Short Name |
|---|---|---|---|---|
| I | data 1 | 14,745,103 | Flat terrain or gentle slope, no steep slopes | WA21_Lumbrazo |
| I | data 2 | 10,727,129 | | AZ12_USFS |
| I | data 3 | 9,803,227 | | AZ12_USFS |
| II | data 4 | 9,403,664 | With steep or terraced slopes | WA21_Lumbrazo |
| II | data 5 | 13,036,927 | | AZ12_USFS |
| II | data 6 | 10,348,378 | | CA19_Reed |
| III | data 7 | 9,772,423 | High and steep slopes | WA21_Lumbrazo |
| III | data 8 | 8,343,649 | | CA19_Reed |
| III | data 9 | 11,162,971 | | CA19_Reed |
Table 3. Model and cross-validation accuracy.

| Training Dataset | Model Accuracy | Cross-Validation Accuracy |
|---|---|---|
| Group I | 0.8730 | 0.8611 |
| Group II | 0.8703 | 0.8504 |
| Group III | 0.8743 | 0.9126 |
| Total | 0.8725 | 0.8730 |
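The cross-validation accuracy in Table 3 is the mean hold-out accuracy over k folds, retraining the classifier on the remaining folds each time. The sketch below is our own minimal illustration of that procedure, not the authors' implementation; the trainer used in the usage example (a trivial majority-class classifier) is purely hypothetical.

```python
# Hedged sketch of k-fold cross-validation (illustration, not the authors' code).
# `train(train_samples, train_labels)` must return a predict function.

def kfold_accuracy(samples, labels, train, k=5):
    """Mean hold-out accuracy over k contiguous folds."""
    n = len(samples)
    fold = n // k
    accs = []
    for i in range(k):
        test = set(range(i * fold, (i + 1) * fold))
        tr_s = [samples[j] for j in range(n) if j not in test]
        tr_l = [labels[j] for j in range(n) if j not in test]
        predict = train(tr_s, tr_l)  # retrain on the other folds
        correct = sum(predict(samples[j]) == labels[j] for j in test)
        accs.append(correct / len(test))
    return sum(accs) / k

# Toy usage: a "majority class" trainer on dummy labels (hypothetical data).
majority = lambda s, l: (lambda x, m=max(set(l), key=l.count): m)
acc = kfold_accuracy(list(range(10)), [0] * 8 + [1] * 2, majority, k=5)
print(acc)  # 0.8
```

In practice one would shuffle or stratify the folds; contiguous folds are used here only to keep the sketch short.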
Table 4. Characteristics of the testing datasets.

| Data | Number of Points | Feature | Dataset Short Name |
|---|---|---|---|
| Test 1 | 10,339,269 | Flat | AZ17_Donager |
| Test 2 | 7,797,662 | Slope | CA14_Guo |
| Test 3 | 8,942,695 | High and steep slopes | CA13_Guo |
Table 5. Accuracy evaluation of the test data.

| Data | Type I Error | Type II Error | Total Error |
|---|---|---|---|
| Test 1 | 0.0361 | 0.0327 | 0.0346 |
| Test 2 | 0.1804 | 0.0742 | 0.1150 |
| Test 3 | 0.1287 | 0.0965 | 0.0991 |
Table 6. Comparison of filtering accuracy before and after feature selection.

| Data | Without Feature Selection | After Feature Selection |
|---|---|---|
| Test 1 | 96.48% | 96.54% |
| Test 2 | 88.39% | 88.50% |
| Test 3 | 90.02% | 90.09% |
| Model training time | 203 min | 96 min |
Table 7. Error comparison of the filtering method results (unit: %).

| Data | Error | CSF | MF | SF | Ours |
|---|---|---|---|---|---|
| Test 1 | Type I | 4.15 | 4.30 | 4.36 | 3.61 |
| Test 1 | Type II | 0.93 | 4.57 | 0.58 | 3.27 |
| Test 1 | Total | 2.80 | 4.45 | 2.79 | 3.46 |
| Test 2 | Type I | 23.24 | 3.40 | 8.36 | 18.04 |
| Test 2 | Type II | 5.56 | 56.53 | 33.95 | 7.42 |
| Test 2 | Total | 13.13 | 22.54 | 33.02 | 11.50 |
| Test 3 | Type I | 25.60 | 0.32 | 3.92 | 12.87 |
| Test 3 | Type II | 5.92 | 84.95 | 15.10 | 9.65 |
| Test 3 | Total | 8.76 | 13.67 | 15.01 | 9.91 |
| Total | Total | 7.74 | 12.70 | 15.53 | 7.91 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Wu, Y.; Sang, M.; Wang, W. A Novel Ground Filtering Method for Point Clouds in a Forestry Area Based on Local Minimum Value and Machine Learning. Appl. Sci. 2022, 12, 9113. https://0-doi-org.brum.beds.ac.uk/10.3390/app12189113
