Article

Consistent Roof Geometry Encoding for 3D Building Model Retrieval Using Airborne LiDAR Point Clouds

Department of Geomatics, National Cheng-Kung University, Tainan 70101, Taiwan
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2017, 6(9), 269; https://0-doi-org.brum.beds.ac.uk/10.3390/ijgi6090269
Submission received: 4 July 2017 / Revised: 17 August 2017 / Accepted: 17 August 2017 / Published: 28 August 2017

Abstract

A 3D building model retrieval method that uses airborne LiDAR point clouds as input queries is introduced. Based on the concept of data reuse, building models available on the Internet that have geometric shapes similar to a user-specified point cloud query are retrieved and reused for data extraction and building modeling. To retrieve models efficiently, point cloud queries and building models are consistently and compactly encoded by the proposed method. The encoding focuses on the geometry of building roofs, which are the most informative part of a building in airborne LiDAR acquisitions. Spatial histograms of geometric features that describe the shapes of building roofs are utilized as the shape descriptor, which provides shape distinguishability, encoding compactness, rotation invariance, and noise insensitivity. These properties make the proposed approach feasible for efficient and accurate model retrieval. Analyses of LiDAR data and building model databases, together with the implementation of a web-based retrieval system available at http://pcretrieval.dgl.xyz, demonstrate the feasibility of retrieving polygon models using point clouds.

1. Introduction

Recent developments in 3D scanning and modeling technologies have led to an increasing number of 3D models, most of which are available on web-based platforms with data-sharing services. In this context, the question of “How to generate 3D building models?” may evolve into “How to find them in model databases and on the Internet?” [1]. This concept motivates this study to encode unorganized, noisy, and incomplete building point clouds acquired by airborne LiDAR and to efficiently retrieve 3D models from databases and the Internet. A set of complete or semi-complete building models is retrieved and reused instead of reconstructing point clouds with complicated modeling techniques [2,3,4,5].
The naive approach, called text-based retrieval, uses keywords in metadata to search for the desired 3D models. This method is simple and efficient, but using keywords as queries suffers from difficulties caused by inappropriate annotations and language varieties. By contrast, content-based retrieval is a promising approach that encodes the geometries of queries and models using a shape descriptor and matches the encoded coefficients for data retrieval. Most previous studies on content-based 3D model retrieval take polygon models as input queries [6,7]. These methods can efficiently and accurately extract models from databases. However, obtaining a desired polygon model as an input query is difficult, which limits the usage of retrieval systems. With the aid of airborne LiDAR, which is capable of efficient spatial data acquisition, model retrieval using LiDAR point clouds as input queries is proposed. By consistently encoding point clouds and polygon models, a set of building models similar in geometric shape to an input point cloud can be retrieved for the purposes of data extraction and efficient cyber city modeling. Based on the concepts of data reuse and crowdsourcing, the proposed system can efficiently construct a 3D city model, which is one of the key components of a virtual geographic environment.
Content-based model retrieval methods can be classified into two categories, model-based retrieval and view-based retrieval, based on the shape descriptor used in encoding. In model-based retrieval, shape similarities are measured using various geometric descriptors, including shape distribution [6], shape spectral [8], and shape topology [9]. In methods based on shape distribution, geometric features defined over feature spaces are accumulated in bins [6]. A histogram of these bins is utilized as the signature of a 3D model. In shape spectral methods, geometric shapes are transformed to a spectral domain and the spectral coefficients are used in shape matching and retrieval [8]. In topology-based methods, model topologies are represented as skeletons, and retrieval is performed based on the assumption that similar shapes have similar skeletons [9]. Unlike model-based methods, view-based methods represent 3D geometric shapes as a set of 2D projections, and 3D models are matched using their visual similarities instead of geometric similarities [7,10,11]. Each projection is described by image descriptors, and shape matching is reduced to measurement of similarities among the views of the query object and models in the database. Model-based and view-based methods perform well on existing benchmarks for polygon model encoding and retrieval. However, these methods are not designed for unorganized, noisy, sparse, and incomplete point clouds. Recently, Chen et al. [12] proposed point cloud encoding using a set of low-frequency spherical harmonic functions (SHs). With the preprocessing of data resampling and filling, the approach can alleviate the difficulties caused by sparse and incomplete sampling of point clouds. However, the use of low-frequency SHs decreases the ability to distinguish objects with similar geometric shapes thereby leading to ambiguity in shape description.
To improve shape distinguishability, a roof geometry encoding that integrates shape distribution with visual similarity is proposed. The main idea is to represent point clouds and polygon models using top-view depth images that describe the shapes of building roofs and avoid disturbances from the insufficient sampling of building side views. The depth images are further encoded by geometric features with spatial histograms, which introduce the properties of compact description, rotation invariance, noise insensitivity, and consistent encoding of point clouds and polygon models. These properties lead to a compact storage size and a real-time retrieval response. Furthermore, the combination of visual similarity in depth images and shape distribution in spatial histograms increases the distinguishability of geometric shapes. The remainder of this paper is organized as follows. Section 2 describes the methodology of point cloud encoding and building model retrieval. Section 3 introduces the properties of the proposed encoding method. Section 4 discusses experimental results, and Section 5 presents conclusions and future work.

2. Methodology

2.1. System Overview

Figure 1 schematically illustrates the proposed system, which consists of three main steps: depth image generation, data encoding, and data retrieval. In the first step, the building models and the point cloud (i.e., the input query) are represented by top-view depth images for the geometric description of building roofs. An interpolation process is then applied to the depth image of the point cloud for hole filling to facilitate consistent encoding. In data encoding, a set of geometric features is extracted from the interpolated depth images. The extracted features are encoded using a spatial histogram with a determined origin, which provides a rotation-invariant shape description. During data retrieval, the encoded coefficients of the input query are matched with those of the models in a database using a shape similarity metric, and a set of best-matched models is extracted. The generation of depth images is described in Section 2.2, and the data encoding and retrieval are described in Section 2.3, Section 2.4 and Section 2.5.

2.2. Generation of Depth Image

Top-view projection is selected in depth image generation because of the characteristics of airborne LiDAR scanning. The roof is generally the most informative and identifiable part of a building in airborne LiDAR acquisitions. In a top-view projection, the pixel value is generally defined as the height difference between the ground and the pixel on the building roof. However, with this definition the pixel intensity is dominated by the building height, and the roof geometry becomes insignificant, especially for tall buildings. In this study, the pixel value of the depth image, denoted as D(x, y), is defined as the distance between the maximal height of the building and the height at that position, that is,

D(x, y) = Max_B − H_B(x, y), (1)

where Max_B represents the maximal height of a building B, and H_B(x, y) denotes the building height at position (x, y). In this definition, the pixel value of the depth image corresponds to a relative height difference, which enhances the roof geometry in the depth image.
The depth image generation is a process of rasterization. The spatial resolution of the geometric description depends on the grid size used in rasterization. A large grid size implies a low probability of missing depth information and efficient rasterization, but results in a low-resolution depth image and a rough geometric representation of the point cloud. By contrast, a small grid size yields a high-resolution depth image with a fine geometric description, but results in time-consuming computation and a large number of empty pixels. In this study, a small grid size is preferred because data interpolation can be performed to alleviate the problem of missing information. In addition, the grid size is set according to the point cloud density with the expectation that, on average, each grid cell in the depth image contains one point. Therefore, the grid size is defined as GS = 1/√(AvgPD), where AvgPD represents the average point density of a building point cloud. Figure 2 shows an example of point cloud rasterization. The average point density of the building is 16.67 pts/m², and the grid size GS is set to 0.245 m in rasterization. With this setting, the geometric detail of the building roof is preserved and only small holes are present in the depth image. Searching for an optimal grid size is difficult because the determination is data-sensitive; any advanced approach, such as the method proposed by Awrangjeb et al. [13], can be adopted and integrated into the system.
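As an illustration of this rasterization step, the following sketch (Python/NumPy) builds a depth image with the density-derived grid size; the function name and the choice of keeping the highest point per cell are assumptions, not the authors' implementation.

```python
# Illustrative sketch of depth-image rasterization (assumed helper, not the
# authors' code): grid size GS = 1 / sqrt(AvgPD), pixel value D = Max_B - H_B.
import numpy as np

def rasterize_depth_image(points, avg_point_density):
    """points: (N, 3) array of building LiDAR points (x, y, z) in metres."""
    gs = 1.0 / np.sqrt(avg_point_density)              # e.g. 0.245 m for 16.67 pts/m^2
    xy_min = points[:, :2].min(axis=0)
    extent = points[:, :2].max(axis=0) - xy_min
    n_cols, n_rows = np.ceil(extent / gs).astype(int) + 1
    max_b = points[:, 2].max()                          # Max_B, the maximal building height

    depth = np.full((n_rows, n_cols), np.nan)           # NaN marks empty pixels (holes)
    cols = ((points[:, 0] - xy_min[0]) / gs).astype(int)
    rows = ((points[:, 1] - xy_min[1]) / gs).astype(int)
    for r, c, z in zip(rows, cols, points[:, 2]):
        d = max_b - z                                    # relative depth below the roof top
        if np.isnan(depth[r, c]) or d < depth[r, c]:
            depth[r, c] = d                              # keep the highest point in each cell
    return depth, gs
```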
Holes are generally present in depth images because of rasterization of irregularly distributed LiDAR point clouds. To address this issue, a hole filling and completion process is performed using grayscale morphological operators. A grayscale morphological closing, which consists of dilation and erosion operators with a flat structuring element, is adopted to fill gaps in depth images. The proposed encoding method requires the preservation of point cloud topology and geometry, but it does not require a perfect depth image, that is, an image without holes. Therefore, a simple and efficient morphology-based approach is adopted. With this hole filling, only holes that are smaller than the structuring element are filled; thus, the geometric topology, such as hollow shape, can be maintained. In the implementation, a 7 × 7 structuring element is used.
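A minimal sketch of this hole-filling step is given below, assuming SciPy's grey-scale morphology and the convention that roof pixels are bright and empty pixels are dark; only the flat 7 × 7 structuring element is taken from the text.

```python
# Sketch of morphological hole filling (assumed conventions). Empty pixels are
# treated as dark gaps in a bright roof image, so a grey-scale closing
# (dilation followed by erosion) with a flat 7x7 structuring element fills
# holes smaller than the element while larger hollow structures are kept.
import numpy as np
from scipy import ndimage

def fill_small_holes(depth, size=7):
    max_d = np.nanmax(depth)
    roof = max_d - np.where(np.isnan(depth), max_d, depth)   # bright roof, dark holes
    closed = ndimage.grey_closing(roof, size=(size, size))   # flat structuring element
    filled = max_d - closed                                   # back to the depth convention
    return np.where(np.isnan(depth), filled, depth)           # keep measured pixels as-is
```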

2.3. Data Encoding

The first step of data encoding is determining a translation- and rotation-invariant origin of a depth image. Chen et al. [12] suggested selecting the center of the minimal bounding box of an object as the origin. However, the origin obtained by this approach is slightly sensitive to rotation, especially for non-symmetric buildings. For instance, in Figure 3, the cyan boxes are the minimal bounding boxes of rotated depth images. These bounding boxes differ in height and width, and thus the obtained origins, represented by cyan dots, are inconsistent. In this study, a depth image is regarded as a 2.5D model, and the barycenter of the 2.5D model is selected as the origin because this center is unchanged under similarity transformations. Examples of obtained origins are shown in Figure 3. The origins displayed in red remain constant as the object is rotated, which implies that this origin is invariant to rotation and translation.
Formally, given a depth image D(x, y), the shape origin (x_0, y_0) is defined as the weighted position in the depth image, that is,

(x_0, y_0) = (1/n) Σ_{i=1}^{n} (x_i, y_i) × D_i, (2)

where n represents the number of roof pixels, and D_i denotes the value of D(x_i, y_i), which is used as the weight of the vector (x_i, y_i).
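A short sketch of the origin computation in (2), assuming the depth image stores NaN in empty pixels (the function name is hypothetical):

```python
import numpy as np

def shape_origin(depth):
    """Depth-weighted shape origin (x0, y0) over the n roof pixels, as in (2)."""
    rows, cols = np.where(~np.isnan(depth))   # roof pixel coordinates (y_i, x_i)
    weights = depth[rows, cols]               # pixel values D_i used as weights
    n = weights.size
    x0 = np.sum(cols * weights) / n
    y0 = np.sum(rows * weights) / n
    return x0, y0
```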
Before data encoding, the point cloud and the models are aligned to their origins to facilitate rotation- and translation-invariant encoding. Three geometric features, namely, the height feature, edge feature, and planarity feature, as illustrated in Figure 4, are used to describe the shape of a building roof in the depth image domain. These three features are described as follows.
Height Feature. A building roof is represented as a depth image, and its pixel intensity denotes the relative height of the roof at that position. Therefore, using the pixel intensities of a depth image to represent roof geometry is intuitive and efficient. Following [14], the height feature, denoted as F_h(x, y), is defined as the pixel intensity of the depth image, that is, F_h(x, y) = D(x, y).
Edge Feature. Sharp edges in an image correspond to high-frequency components that represent details in the image. Therefore, sharp edges are used as geometric features to represent roof geometries. Inspired by [15], a Laplacian of Gaussian (LoG) filter is adopted to extract sharp edges while alleviating disturbances from noise. The LoG is composed of Gaussian and Laplacian filters, where the low-pass kernel of the Gaussian filter suppresses noise and the second-derivative kernel of the Laplacian filter extracts sharp edges. The LoG is applied to the depth images, which contain inherent noise, and the resulting data are used as the edge feature, denoted by F_e(x, y).
Planarity Feature. In contrast to sharp edges, which are high-frequency components, planes in an image relate to low-frequency components that represent coarse information in an image. Therefore, planarity is selected as a geometric feature. The planarity measure derived from the principal component analysis of a point set is a useful descriptor that describes the local geometry of the point set and indicates whether the local geometry is planar. Given a 2.5D point set P = {(x_i, y_i, D_i)}, i = 1, …, n, from the pixels in a depth image, where the points are located within a circle of diameter r centered at a position p_c = (x_c, y_c, D_c), a simple method for computing the principal components of P is to diagonalize its covariance matrix. In matrix form, the covariance matrix of P is written as C(P) = Σ_{p_i ∈ P} (p_i − p_c)^T (p_i − p_c). The eigenvectors and eigenvalues of the covariance matrix are computed using matrix diagonalization, that is, V^{−1} C V = D, where D is the diagonal matrix containing the eigenvalues {λ_1, λ_2, λ_3} (λ_1 ≥ λ_2 ≥ λ_3) of the covariance matrix C, and the matrix V contains the corresponding eigenvectors. Geometrically, the eigenvalues relate to an ellipsoid that represents the local geometric structure of the point set, and combinations of these eigenvalues provide discriminant geometric features. For instance, λ_1 ≈ λ_2 ≫ λ_3 indicates a flat ellipsoid that represents planar structures, whereas λ_1 ≈ λ_2 ≈ λ_3 corresponds to a volumetric structure, such as the corners of buildings. Following the definitions in [16,17], the planarity feature is defined as (λ_2 − λ_3)/λ_1, which enhances and describes planar structures. The planarity feature F_λ(x, y) obtained from principal component analysis is used as a geometric feature.
The height, edge, and planarity features provide descriptions of the height, high-frequency, and low-frequency components of a building roof, respectively. Together, these three features constitute a complete geometric description of a depth image and, hence, of a building roof. An example of the geometric features is shown in Figure 5. To visualize the features, the feature values are normalized to the range [0, 255] and displayed as a gray image.
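The sketch below illustrates the three features on a (hole-filled) depth image; the LoG scale sigma, the neighbourhood size r, and the helper names are assumptions, with SciPy's gaussian_laplace used as one possible LoG implementation.

```python
# Sketch of the height, edge (LoG), and planarity features on a depth image.
import numpy as np
from scipy import ndimage

def height_feature(depth):
    return depth                                             # F_h(x, y) = D(x, y)

def edge_feature(depth, sigma=1.5):
    return ndimage.gaussian_laplace(depth, sigma=sigma)      # F_e: Laplacian of Gaussian

def planarity_at(depth, xc, yc, r=3):
    """Planarity (lambda2 - lambda3) / lambda1 at pixel (xc, yc)."""
    rows, cols = np.where(~np.isnan(depth))
    near = (cols - xc) ** 2 + (rows - yc) ** 2 <= r ** 2     # circular neighbourhood
    pts = np.column_stack((cols[near], rows[near], depth[rows[near], cols[near]]))
    p_c = np.array([xc, yc, depth[yc, xc]])                  # centre point p_c
    diff = pts - p_c
    cov = diff.T @ diff                                      # C(P) = sum (p_i - p_c)^T (p_i - p_c)
    l1, l2, l3 = np.sort(np.linalg.eigvalsh(cov))[::-1]      # lambda1 >= lambda2 >= lambda3
    return (l2 - l3) / l1 if l1 > 0 else 0.0                 # F_lambda(x, y)
```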

2.4. Spatial Histogram

The purposes of using a spatial histogram in encoding are to reduce the data size of the geometric features and to achieve rotation-invariant encoding. In spatial histogram generation, the feature image is partitioned into several disjoint circular subspaces, called bins, which are all centered on the determined origin. Similar to an image histogram, the intensities of the pixels in the feature image are accumulated in bins according to their positions. The accumulated value in each bin is then normalized by the sum of pixel intensities, so that each bin value lies in [0.0, 1.0] and the bin values sum to 1.0. The distribution of pixel intensities and the spatial positions of the pixels are thus encoded in the histogram, which yields an encoding with geometric distinguishability. In the implementation, the feature image is circularly partitioned into k bins of equal width, and the maximum radius of the depth image is set to the distance between the farthest pixel and the determined origin. The number of bins k is set to 30 to balance encoding compactness and shape distinguishability. A small k may lead to a rough and insufficient description of a geometric shape, whereas a large k may cause inefficient retrieval and increase the sensitivity to noise.
With the spatial histogram, the height, edge, and planarity features are encoded as (h_1, …, h_k), (e_1, …, e_k), and (p_1, …, p_k), respectively, so the total number of components in an encoded coefficient is 90. Examples of spatial histograms with 10 bins are shown in Figure 6. Because the height, edge, and planarity features describe different geometric characteristics, they produce different spatial histograms.
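A sketch of the spatial-histogram encoding, assuming non-negative feature values and NaN-marked empty pixels:

```python
import numpy as np

def spatial_histogram(feature, origin, k=30):
    """Accumulate feature intensities into k ring-shaped bins centred on the origin."""
    rows, cols = np.where(~np.isnan(feature))
    values = feature[rows, cols]                             # assumed non-negative
    radii = np.hypot(cols - origin[0], rows - origin[1])     # distance to the origin
    r_max = radii.max()                                      # farthest pixel sets the radius
    bins = np.minimum((radii / r_max * k).astype(int), k - 1)
    hist = np.bincount(bins, weights=values, minlength=k)
    return hist / hist.sum()                                 # normalised: bins sum to 1.0
```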
To further distinguish buildings with similar roof geometries but different sizes, the roof area, the building height, and the maximum radius are integrated into the encoding, which leads to a scale-variant shape representation. The maximal height of a building, denoted as M_h, is defined as

M_h = max(H_B(x, y)) − H_min, (3)

where the function max(·) returns the maximum of its input, and H_min represents the minimum of the roof height H_B(x, y). The maximum radius, denoted as M_r, is defined as

M_r = max_radius(H̃_B(x, y) | H_B(x, y) ≠ H_min), (4)

where the function max_radius(·) returns the maximal distance in the xy-space between a roof pixel and the origin. The building outliers, that is, pixels with H_B(x, y) = H_min, are excluded from the distance calculation because they generally correspond to protruding exterior facades lying near the roof boundaries. The roof area, denoted by A_f, is defined as

A_f = count(H̃_B(x, y) | H_B(x, y) ≠ H_min), (5)

where the function count(·) returns the total number of pixels in the input, excluding pixels with the minimal value H_min.
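A sketch of the scale terms in (3)–(5), computed here from a roof-height raster H_B in which outlier pixels equal to H_min are excluded (the raster layout and NaN handling are assumptions):

```python
import numpy as np

def scale_features(height, origin):
    """Return [M_h, M_r, A_f] from a roof-height raster H_B and the shape origin."""
    h_min = np.nanmin(height)
    valid = ~np.isnan(height) & (height != h_min)              # exclude outliers H_B = H_min
    rows, cols = np.where(valid)
    m_h = np.nanmax(height) - h_min                             # M_h, Equation (3)
    m_r = np.hypot(cols - origin[0], rows - origin[1]).max()   # M_r, Equation (4)
    a_f = rows.size                                             # A_f, Equation (5): pixel count
    return np.array([m_h, m_r, a_f])
```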

2.5. Data Retrieval

By combining the spatial histograms of three geometric features, a point cloud or polygon model is encoded as
F_1 = {(h_1, …, h_k), (e_1, …, e_k), (p_1, …, p_k)}. (6)
In addition, the maximal height, maximum radius, and roof area for scale-variant encoding are combined as

F_2 = {M_h, M_r, A_f}. (7)
Given the encodings in (6) and (7), the measurement of shape similarity is formulated as a combination of F_1 and F_2. For a point cloud P and a building model M, the shape similarity is defined as

dist(P, M) = |F_1(P) − F_1(M)| × (1 + 2 × |F_2(P) − F_2(M)| / |F_2(P) + F_2(M)|), (8)

where the first part of the equation, |F_1(P) − F_1(M)|, measures the geometric similarity between the objects P and M, and the second part, 1 + 2 × |F_2(P) − F_2(M)| / |F_2(P) + F_2(M)|, which ranges from 1.0 (no penalty) to 2.0 (maximal penalty), denotes the penalty for different object scales. No penalty is given for buildings with the same scale, and the maximal penalty is given for buildings with an extremely large difference in scale. In (8), a small dist represents a high similarity between the point cloud P and the polygon model M, and vice versa.
The building models downloaded from the Internet are encoded in a preprocessing step using (6) and (7), and the encoded coefficients are stored in a database. When a point cloud is selected as an input query in the online retrieval system (http://pcretrieval.dgl.xyz), the point cloud is also encoded using (6) and (7). The obtained coefficients are then matched with the coefficients in the database using (8). The shape similarities are sorted, and several top-ranking models are extracted as the query response.
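The matching step can be sketched as follows, assuming the L1 norm for the vector differences in (8) and a simple linear scan over the precomputed database coefficients (function and variable names are hypothetical):

```python
import numpy as np

def shape_distance(f1_p, f2_p, f1_m, f2_m):
    """dist(P, M) as in (8); f1_* are the 90-component codes, f2_* = [M_h, M_r, A_f]."""
    geometric = np.abs(f1_p - f1_m).sum()                    # |F1(P) - F1(M)|, L1 norm assumed
    penalty = 1.0 + 2.0 * np.abs(f2_p - f2_m).sum() / np.abs(f2_p + f2_m).sum()
    return geometric * penalty                               # scale penalty for size differences

def retrieve(query_f1, query_f2, database, top=10):
    """database: iterable of (model_id, f1, f2) tuples with precomputed coefficients."""
    scored = sorted((shape_distance(query_f1, query_f2, f1, f2), model_id)
                    for model_id, f1, f2 in database)
    return scored[:top]                                      # smallest distances rank first
```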

3. Encoding Properties

The proposed encoding method possesses several properties that make it feasible for a retrieval system to extract polygon building models using point clouds as input queries. First, point clouds and polygon models are consistently encoded with the aid of morphological hole filling and consistent origin determination. To demonstrate this property, a building model and its corresponding point cloud were tested. The encoding results in Figure 7 show that the encoded height, edge, and planarity coefficients of the point cloud and the polygon model are similar. This result indicates consistent encoding and the feasibility of retrieving polygon models using point clouds. Second, the proposed approach provides a shape similarity metric in which similar shapes have small distances and dissimilar ones have large distances. This property implies that the encoding can distinguish objects with different geometric shapes, which is the foundation of data retrieval.
Third, the encoded coefficients are rotation invariant because of the spatial histogram and the rotation-invariant origin. Figure 8 shows that the coefficients remain unchanged when a 45-degree rotation is applied to the object. Fourth, the proposed encoding scheme is insensitive to noise because the noise problem is alleviated by the statistics-based histogram encoding. For example, in Figure 9, a point cloud with added Gaussian noise is tested; the standard deviation of the Gaussian noise is set to 0.1. The results show that the encoded coefficients exhibit only slight differences when the Gaussian noise is added to the data. Fifth, the building models in the database are compactly encoded. The encoded coefficient for a building model is a set of 90 real numbers, which is smaller than the size of the original model data. With the aforementioned properties, the proposed retrieval method can efficiently and accurately retrieve building models using point clouds.

4. Experimental Results and Discussion

4.1. Datasets

A database containing about one million 3D building models was tested. The models were downloaded from the Internet using web crawlers that systematically browse the Internet to search for 3D building models. The study area covers the campus of National Cheng-Kung University, Taiwan, and the Taipei 101 building. The airborne LiDAR point cloud of the study area was acquired by an Optech ALTM 30/70 in 2011 and was extracted from a combination of three overlapping scanning strips. The number of points in the test LiDAR point cloud is 2,772,880, and the average density is 10.72 pts/m², corresponding to a point spacing of 0.31 m. The LiDAR point cloud of the study area and its corresponding aerial image are shown in Figure 10. Trees near buildings were removed, and building point clouds were segmented and extracted from the test data manually for simplicity. Any advanced point cloud segmentation or building extraction algorithm [18,19,20] can be adopted to extract building point clouds automatically or semi-automatically.

4.2. Computational Performance

The encoding time for a point cloud with 22,440 points covering an area of about 2000 m² is 150 ms, and the retrieval response time is 600 ms for a database containing one million models. The information on the tested building point clouds, grid sizes, and computation times is shown in Table 1. The grid sizes are automatically calculated according to the average point density of each building. The encoding time mainly depends on the number of points and the xy-area of a building, and the retrieval time is linearly dependent on the size of the model database.

4.3. Evaluation of Model Encoding

To evaluate the proposed method, two sets of building models with different levels of detail (LoDs) on their roofs were tested. The purpose of this experiment is to evaluate the distinguishability of encoding approaches on building models with different geometric roof details. The first dataset, which contains four models, is based on a building with hollow geometry. The original model with hollow geometry, denoted by LoD_1a, is gradually reduced to a non-hollow model with simplified roof geometry, denoted as LoD_4a. The simplified and detached parts of the buildings are marked in red. The second dataset contains five models, where the original model with detailed roof geometry, denoted by LoD_1b, is gradually simplified to a simple box, denoted by LoD_5b. The aerial photos, LiDAR point clouds, and well-constructed building models, that is, LoD_1a and LoD_1b, are shown in Figure 11. The point clouds and the models are encoded using the proposed method, and the shape similarities between the models and their corresponding point clouds are measured using (8). The models and their similarity values are shown in Figure 12. For the first dataset, the results indicate that the proposed encoding method can separate the model with hollow geometry from the models without hollow geometry. The results for the second dataset show that the shape similarities decrease nearly linearly as the geometry of the original model is gradually simplified. These experiments demonstrate the ability to describe hollow geometric shapes and to distinguish models with different roof details, which is superior to the related method [12] that encodes models using a set of low-frequency spherical harmonic functions. Because of the use of low-frequency spherical basis functions, the method in [12] is unable to distinguish models with similar geometric details or models with and without hollow geometries.

4.4. Evaluation of Model Retrieval

The proposed method was compared with the method in [12] using a database that contains around one million 3D building models downloaded from the Internet. Three building point clouds were tested as input queries. To evaluate the retrieval accuracy, a commonly used measurement, namely the root mean square deviation (RMSD), is adopted to calculate the shape similarities between the input query and the retrieved models. Before RMSD calculation, the retrieved models are aligned with the input query semi-automatically using the standard registration algorithm iterative closest point [21]. The shape similarities obtained by RMSD are used as reference values in the retrieval evaluation. Figure 13 shows the retrieval results, including the retrieved models, ranks, and reference RMSD values. The visual comparisons show that the models extracted by the two methods are similar to the queries in shape, and the ranking generated by the proposed method is better than that of [12], especially for the third query. This experiment shows that the proposed method, which combines shape distribution with visual similarity, improves the ability to distinguish geometric shapes compared with [12]. However, the ranks obtained by the two approaches do not match the RMSD references perfectly. For instance, for the first test data, the RMSD of the model ranked 5th is larger than that of the model ranked 7th, which means that the ranking is inconsistent with the reference. The imperfect ranking is caused by the use of a compact shape description: to achieve efficient retrieval, data are compactly encoded, and only 90 coefficients are used in the proposed method to encode a point cloud or a polygon model. To address this problem, the ranks of the extracted models can be further refined using the RMSD measurement if a perfect ranking is required.
To further analyze retrieval performance, the ranking of all extracted models by RMSD is used as the reference. The measurement for ranking is then defined as the difference between the obtained ranking and the reference, that is, Diff_r = |R_method − R_ref|, where R_ref and R_method represent the reference ranking and the ranking from a method, respectively. In this measure, the average of Diff_r, denoted as Avg.Diff_r, is used to estimate ranking quality; a small Avg.Diff_r means a high-quality ranking. The statistical analysis of Avg.Diff_r is shown in Table 2. In addition, the commonly used measurements precision η_s and recall η_n were adopted to evaluate the retrieval accuracy using Data #3 in Figure 13. These measurements are defined as η_s = TP / (TP + FP) and η_n = TP / (TP + FN), where TP, FP, and FN represent the numbers of true positives, false positives, and false negatives, respectively. The results are shown in Table 3. From the statistical analysis, the retrieval results of the proposed method are superior to those of [12].
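For reference, the two evaluation measures can be transcribed directly as a short sketch (not the authors' evaluation code):

```python
import numpy as np

def avg_rank_difference(method_ranks, reference_ranks):
    """Avg.Diff_r: mean absolute difference between obtained and reference ranks."""
    return np.mean(np.abs(np.asarray(method_ranks) - np.asarray(reference_ranks)))

def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)
```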

5. Conclusions and Future Work

This study proposed a new method for 3D building model retrieval using LiDAR point clouds as input queries. To achieve consistent encoding of polygon models and point clouds, a rotation-invariant origin determination is adopted, which utilizes the object volume to determine the origin instead of using the minimal bounding box of an object. In addition, a morphological closing operator is applied to fill the holes in the depth images of point clouds. Filling holes effectively alleviates the difficulties caused by the sparse and incomplete sampling of point clouds and facilitates consistent encoding. The proposed encoding, which integrates shape distribution with visual similarity, increases the distinguishability of geometric shapes. The experiments on building models with different LoDs show that ambiguous shape descriptions are avoided. The experiments also demonstrate that using the spatial histogram of geometric features as the shape descriptor introduces noise insensitivity and rotation invariance to the retrieval. These properties make the proposed approach feasible for efficient and accurate model extraction. Based on the qualitative and quantitative analyses of LiDAR data and building models and on the implemented web-based retrieval system, we conclude that retrieving building models using point clouds is feasible. In the future, we plan to extend the proposed method to point clouds from photogrammetric techniques and terrestrial LiDAR, which also have a great need for 3D modeling.

Acknowledgments

The authors would like to thank the anonymous reviewers for their insightful comments. This work was supported in part by the National Science Council (Contracts MOST 104-2628-M-006-002-MY3 and MOST 104-2628-E-006-003-MY3), Taiwan.

Author Contributions

Yi-Chen Chen conducted the research, performed the experiments and implemented the retrieval system; Bo-Yi Lin designed the website interface of retrieval system; Chao-Hung Lin contributed to the analysis and to writing.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Funkhouser, T.; Min, P.; Kazhdan, M.; Chen, J.; Halderman, A.; Dobkin, D.; Jacobs, D. A search engine for 3D models. ACM Trans. Graph. 2003, 22, 83–105. [Google Scholar] [CrossRef]
  2. Huang, H.; Brenner, C.; Sester, M. A generative statistical approach to automatic 3d building roof reconstruction from laser scanning data. ISPRS J. Photogramm. Remote Sens. 2013, 79, 29–43. [Google Scholar] [CrossRef]
  3. Jarząbek-Rychard, M.; Borkowski, A. 3D building reconstruction from ALS data using unambiguous decomposition into elementary structures. ISPRS J. Photogramm. Remote Sens. 2016, 118, 1–12. [Google Scholar] [CrossRef]
  4. Perera, G.S.N.; Maas, H.-G. Cycle graph analysis for 3D roof structure modelling: concepts and performance. ISPRS J. Photogramm. Remote Sens. 2014, 93, 213–216. [Google Scholar] [CrossRef]
  5. Xiong, B.; Elberink, S.O.; Vosselman, G. A graph edit dictionary for correcting errors in roof topology graphs reconstructed from point clouds. ISPRS J. Photogramm. Remote Sens. 2014, 93, 227–242. [Google Scholar] [CrossRef]
  6. Akgul, C.B.; Sankur, B.; Yemez, Y.; Schmitt, F. 3D model retrieval using probability density-based shape descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 31, 1117–1133. [Google Scholar] [CrossRef] [PubMed]
  7. Gao, Y.; Wang, M.; Zha, Z.-J.; Tian, Q.; Dai, Q.-H.; Zhang, N.-Y. Less is more: efficient 3D object retrieval with query view selection. IEEE Trans. Multimedia 2011, 13, 1007–1018. [Google Scholar] [CrossRef]
  8. Mademlis, A.; Daras, P.; Tzovaras, D.; Strintzis, M.G. Ellipsoidal harmonics for 3-D shape description and retrieval. IEEE Trans. Multimedia 2009, 11, 1422–1433. [Google Scholar] [CrossRef]
  9. Tam, K.L.; Lau, W.H. Deformable model retrieval based on topological and geometric signatures. IEEE Trans. Vis. Comput. Graph. 2007, 13, 470–482. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  10. Chen, D.Y.; Tian, X.P.; Shen, Y.T. On visual similarity based 3D model retrieval. Comput. Graph. Forum 2003, 22, 223–232. [Google Scholar] [CrossRef]
  11. Papadakis, P.; Pratikakis, I.; Perantonis, S.; Theoharis, T. Efficient 3D shape matching and retrieval using a concrete radialized spherical projection representation. Pattern Recognit. 2007, 40, 2437–2452. [Google Scholar] [CrossRef]
  12. Chen, J.-Y.; Lin, C.-H.; Hsu, P.-C.; Chen, C.-H. Point cloud encoding for 3D building model retrieval. IEEE Trans. Multimedia 2014, 16, 337–345. [Google Scholar] [CrossRef]
  13. Awrangjeb, M.; Zhang, C.; Fraser, C.S. Building Detection in Complex Scenes Thorough Effective Separation of Buildings from Trees. Photogramm. Eng. Remote Sens. 2012, 78, 729–745. [Google Scholar]
  14. Mallet, C.; Bretar, F.; Roux, M.; Soergel, U.; Heipke, C. Relevance assessment of full-waveform lidar data for urban area classification. ISPRS J. Photogramm. Remote Sens. 2011, 66, s71–s84. [Google Scholar] [CrossRef]
  15. Maltezos, E.; Ioannidis, C. Automatic Detection of Building Points from LIDAR and Dense Image Matching Point Clouds. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, II-3/W5, 33–40. [Google Scholar] [CrossRef]
  16. Gross, H.; Thoennessen, U. Extraction of lines from laser point clouds. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2006, 36, 86–91. [Google Scholar]
  17. Lin, C.-H.; Chen, J.-Y.; Su, P.-L.; Chen, C.-H. Eigen-feature analysis of weighted covariance matrices for LiDAR point cloud classification. ISPRS J. Photogramm. Remote Sens. 2014, 94, 70–79. [Google Scholar] [CrossRef]
  18. Li, Y.; Hu, Q.; Wu, M.; Liu, J.; Wu, X. Extraction and Simplification of Building Facade Pieces from Mobile Laser Scanner Point Clouds for 3D Street View Services. ISPRS Int. J. Geoinf. 2016, 5, 231. [Google Scholar] [CrossRef]
  19. Serna, A.; Marcotegui, B.; Hernandez, J. Segmentation of Facades from Urban 3D Point Clouds Using Geometrical and Morphological Attribute-Based Operators. ISPRS Int. J. Geoinf. 2016, 5, 6. [Google Scholar] [CrossRef]
  20. Mongus, D.; Lukač, N.; Žalik, B. Ground and building extraction from LiDAR data based on differential morphological profiles and locally fitted surfaces. ISPRS J. Photogramm. Remote Sens. 2014, 93, 145–156. [Google Scholar] [CrossRef]
  21. Besl, P.J.; McKay, N.D. Method for registration of 3D shapes. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 239–256. [Google Scholar] [CrossRef]
Figure 1. System overview. The system consists of three main steps, depth image generation, data encoding, and data retrieval.
Figure 2. Depth image. (Left, Middle): perspective and top views of point cloud, respectively; (Right): top view of depth image.
Figure 3. Comparisons between the origins obtained by using the minimal bounding box (the cyan dots) and the building volume (the red dots).
Figure 4. Illustration of roof geometric features. (Left): height feature; (Middle): edge feature; (Right): planarity feature. The features are marked in red.
Figure 5. Results of geometric features. (Left): original point cloud; (Right): results of height, edge, and planarity features of the point cloud, respectively.
Figure 6. Results of the spatial histograms of height (left); edge (middle); and planarity features (right); Feature images (top) and corresponding spatial histograms of 10 bins (bottom).
Figure 7. Consistent encoding of point cloud (top) and building model (bottom). 1st row: original data; 2nd–4th rows: results of height, edge and planarity features and their corresponding spatial histograms.
Figure 8. Demonstration of rotation invariance. (Top): results of geometric features and corresponding spatial histograms; (Bottom): results of rotated data (45-degree rotation).
Figure 9. Demonstration of noise insensitivity. (Top): point cloud encoding results; (Bottom): encoding results of a point cloud with Gaussian noise addition.
Figure 10. Study area. The aerial image (left) and corresponding point cloud (right).
Figure 11. Tested building models. (Left): orthogonal image; (Middle): airborne LiDAR point clouds; (Right): 3D models downloaded from the Internet.
Figure 12. Tested models with different LODs. The numbers shown below the models represent the shape similarities between the models and their corresponding point clouds.
Figure 13. Comparisons among the retrieval results obtained by our method (Method A) and the related method [12] (Method B).
Table 1. Time performance. 1st column: point clouds of the tested building; 2nd column: number of points; 3rd column: ground area of a building; 4th column: average point density; 5th column: grid size; 6th column: encoding time (En.); 7th column: retrieval time (Re.).
| Point Clouds | #Points | Area (m²) | Avg. PD (pts/m²) | GS (m) | T_en (ms) | T_re (ms) |
| Ijgi 06 00269 i001 | 50,843 | 3050.1 | 16.67 | 0.245 | 155 | 606 |
| Ijgi 06 00269 i002 | 183,719 | 9848.6 | 18.65 | 0.232 | 413 | 597 |
| Ijgi 06 00269 i003 | 26,077 | 2141.750 | 12.18 | 0.287 | 155 | 579 |
| Ijgi 06 00269 i004 | 55,061 | 3829.500 | 14.38 | 0.264 | 205 | 637 |
| Ijgi 06 00269 i005 | 29,698 | 1342.438 | 22.12 | 0.212 | 155 | 606 |
Table 2. Comparisons between retrieved rankings using the proposed method and the method by Chen et al. (2014). 1st column: RMSD of the retrieved model; 2nd column: reference ranking; 3rd column: Diffr value. The two numbers in each pair denote the performances of the compared methods.
The Proposed Method / Chen et al. (2014)

|      | Data #1 |      |       | Data #2 |      |       | Data #3 |      |       |
| Rank | RMSD | Ref. | Diffr | RMSD | Ref. | Diffr | RMSD | Ref. | Diffr |
| 1 | 0.38/0.38 | 1/1 | 0/0 | 2.16/2.16 | 1/1 | 0/0 | 3.60/4.95 | 7/13 | 6/12 |
| 2 | 2.15/1.87 | 9/4 | 7/2 | 7.18/5.84 | 12/5 | 10/3 | 2.07/4.95 | 3/12 | 1/10 |
| 3 | 2.07/2.21 | 7/10 | 4/7 | 8.40/5.21 | 15/3 | 12/0 | 2.07/2.26 | 3/5 | 0/2 |
| 4 | 2.38/2.21 | 14/11 | 10/7 | 5.83/8.88 | 4/17 | 0/13 | 1.97/4.08 | 2/10 | 2/6 |
| 5 | 2.23/3.48 | 12/18 | 7/13 | 7.60/8.52 | 13/16 | 8/11 | 1.90/5.65 | 1/17 | 4/12 |
| 6 | 1.40/3.82 | 2/19 | 4/13 | 5.96/7.64 | 6/14 | 0/8 | 3.87/6.60 | 9/19 | 3/13 |
| 7 | 1.91/2.44 | 6/15 | 1/8 | 5.96/7.00 | 7/10 | 0/3 | 3.87/5.59 | 8/16 | 1/9 |
| 8 | 2.47/1.91 | 16/5 | 8/3 | 5.96/9.34 | 7/18 | 1/10 | 4.08/6.33 | 10/18 | 2/10 |
| 9 | 1.59/2.50 | 3/17 | 6/8 | 6.04/5.14 | 9/2 | 0/7 | 4.08/5.30 | 10/14 | 1/5 |
| 10 | 2.30/2.08 | 13/8 | 3/2 | 7.17/10.42 | 11/19 | 1/9 | 3.10/5.30 | 6/14 | 4/5 |
| Avg.Diffr | 1.89/2.30 |  |  | 6.23/7.01 |  |  | 3.06/5.10 |  |  |
Table 3. Precision and recall of our method (Method A) and the related method [12] (Method B). Data #3 in Figure 13 is used as the test data.
| # of queries | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| Method A Precision | 0.00 | 0.00 | 0.33 | 0.50 | 0.40 | 0.33 | 0.29 | 0.25 | 0.22 | 0.20 | 0.18 | 0.25 | 0.23 | 0.21 |
| Method A Recall | 0.00 | 0.00 | 0.07 | 0.14 | 0.14 | 0.14 | 0.14 | 0.14 | 0.14 | 0.14 | 0.14 | 0.21 | 0.21 | 0.21 |
| Method B Precision | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.93 |
| Method B Recall | 0.07 | 0.14 | 0.21 | 0.29 | 0.36 | 0.43 | 0.50 | 0.57 | 0.64 | 0.71 | 0.79 | 0.86 | 0.93 | 0.93 |
