Article

Depth Image Super Resolution Based on Edge-Guided Method

Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, Dalian University, Dalian 116622, China
*
Author to whom correspondence should be addressed.
Submission received: 26 December 2017 / Revised: 10 February 2018 / Accepted: 12 February 2018 / Published: 18 February 2018
(This article belongs to the Section Optics and Lasers)

Abstract

Depth image super-resolution (SR) is a technique that reconstructs a high-resolution (HR) depth image from a low-resolution (LR) depth image. Its purpose is to obtain HR details to meet the needs of various applications in computer vision. Conventional depth image SR methods often cause the edges in the final HR image to be blurred or ragged. To solve this problem, an edge-guided method for depth image SR is presented in this paper. To obtain high-quality edge information, a pair of sparse dictionaries is used to reconstruct the edges of the depth image. Then, with the guidance of these high-quality edges, the depth image is interpolated using a modified joint bilateral filter. The edge-guided method preserves edge sharpness and effectively avoids generating blurry and ragged edges during SR. Experiments show that the proposed method obtains better results in both subjective and objective evaluations, and that its reconstruction performance is superior to that of conventional depth image SR methods.

1. Introduction and Related Works

In recent years, with the rapid development of computer vision technology, the depth information of scenes has become increasingly essential for many applications, such as 3D reconstruction [1,2], augmented reality [3], and robot navigation [4]. Some active sensors [5], such as the Kinect and PMD (Photonic Mixer Device) cameras, can easily acquire the depth information of a scene, which is then used to create a depth image. However, due to theoretical and practical limitations, the achievable resolution of depth imaging devices is usually too low to meet the needs of many practical applications, so improving depth image resolution is an urgent problem. One way to solve this problem is to use more sophisticated vision sensors, but such sensors are usually very expensive. Another way is to use a super-resolution (SR) algorithm. Compared with expensive sensors, an SR algorithm does not rely on hardware and is evidently a low-cost approach. Inspired by the idea of color image SR, researchers have proposed many promising depth image SR methods [6,7,8] in recent years, which can improve the resolution of depth images effectively.
According to the reference information they use, depth SR methods can be divided into four main categories: (1) interpolation; (2) SR from LR depth image frames; (3) SR by fusing a depth image with an HR color image; and (4) example-based SR.
(1) Interpolation: There are many analytic methods for image interpolation, including nearest-neighbor, bilinear, and cubic interpolation [9]. However, when the interpolation factor is large, these analytic methods produce ragged and blurred edges because of the large difference in pixel values across edges. To solve this problem, Pang [10] presents an SR method based on bilinear interpolation and an adaptive sharpening filter, which effectively suppresses edge blurring. Ning [11] proposes an improved cubic interpolation algorithm, which uses cubic interpolation to compute the pixels in smooth areas and edge-vector interpolation to compute the pixels near edges. Xie [8] presents an edge-guided approach, which first reconstructs sharp HR edges through a Markov random field and then interpolates the SR depth image under the guidance of these edges. With the help of the edge-guided information, the sharpness of edges is well preserved in the final SR image. Some bilateral filtering methods [12,13] can also preserve edges well.
(2) SR from LR depth image frames: The HR image can be reconstructed by fusing the complementary information among several LR depth images captured from the same scene. Schuon [14] uses an optimization framework with a bilateral total-variation regularization term to solve such an SR problem. Rajagopalan [15] constructs an energy function based on a Markov random field and minimizes it to obtain the HR image. Ismaeil [16] proposes a dynamic-scene SR method for depth images to deal with non-rigid motion between the LR frames. Gevrekci [17] uses a convex projection method to construct the imaging model of a depth image sequence for depth image SR.
(3) SR by fusing a depth image with an HR color image: Most commercial depth cameras capture a depth image and a color image of the same scene simultaneously, and the resolution of the color image is usually higher than that of the depth image. Thus, the HR depth image can be reconstructed with the help of the HR color image. Ferstl [18] calculates an anisotropic total variation diffusion tensor from the HR color image, which is then used to reconstruct the SR depth image. Yang [19] combines a bilateral filter with a median filter to adaptively compute weights from the HR color image, and the depth image is then interpolated according to these weights. Lo [20] proposes a depth image SR method based on a joint trilateral filter, which considers not only the distance weight but also the pixel-value and gradient weights.
(4) Example-based SR: This kind of method learns the transformation between LR and HR images from an example database, and an HR depth image can then be reconstructed through the learned transformation when an LR depth image is input. Yang [21] uses a sparse coding method to learn the transformation, so that HR image patches can be represented by a sparse linear combination of HR dictionary atoms. Zeyde [6] modifies this sparse coding method and uses K-SVD [22] and orthogonal matching pursuit (OMP) [23] to train an LR-HR dictionary pair. Xie [24] proposes a pairwise dictionary training method with local coordinate constraints for depth image SR. Timofte [7] clusters the dictionary atoms into sub-dictionaries using a K-NN algorithm, so that HR patches can be represented by the most suitable sub-dictionary. Kim [25] presents an accurate color image SR method based on VGG-Net [26], which can also be applied to the depth image SR problem.
Although the above methods can effectively reconstruct an SR depth image from an LR input, several problems remain. The methods of the first category can make discontinuous regions jagged and blurred. The methods of the second category rely on the strict assumption that adjacent frames undergo only slight movements in the plane parallel to the camera's focal plane, which is usually difficult to satisfy in practical scenarios. Depth image SR based on fusing a depth image with an HR color image first requires an HR color image that is fully registered with the depth image. The example-based methods depend strongly on the training database; that is, differences in training databases may greatly affect the experimental results.
To address these problems, an edge-guided method for depth image SR is presented in this study. We first train a pair of sparse dictionaries to recover high-quality edge information, and an HR depth image is then interpolated with the guidance of these high-quality edges. This method is a mixture of the example-based and interpolation approaches and makes full use of the advantages of both. In this way, the proposed method achieves improved results that are comparable to current state-of-the-art methods. Our approach needs neither strict assumptions nor the assistance of an HR color image, so it can conveniently be used to improve depth image resolution. At the same time, our approach not only preserves sharp edges in depth image SR, but can also produce better results when applied to color image SR.
The remainder of this paper is organized as follows. A detailed overview of the proposed method is presented in Section 2. Section 3 reports and discusses the results of the experiments. Finally, Section 4 concludes the paper.

2. Proposed Method

In this section, we first present the general steps of our method. Then, the way we build the LR and HR edge dictionaries is discussed. Finally, we give the details of how the HR image is interpolated by a joint bilateral filter.
To keep blurred and jagged edges out of the final SR result, we present a novel depth image SR method, which employs a joint bilateral filter with edge guidance for LR-to-HR reconstruction. The general steps of the proposed method are summarized in Figure 1.
To avoid the computational complexity caused by the size difference between the LR image and the final HR image, we first use a simple interpolation algorithm (bicubic interpolation) to magnify the input LR image I_l to the size of the final HR image. However, interpolation can cause blurring and jagged artifacts near edges, so we apply a shock filter [27] to sharpen the magnified image before further processing.
Edges provide essential structural information for describing the objects in a scene. Thus, we first recover the HR edges before reconstructing the whole image. As illustrated in Figure 1, the LR edge map E_l is extracted from the preprocessed LR image. The edge map preserves only the primary structural information and discards the large smooth areas, so it is highly sparse. We therefore choose a sparse coding method to recover the HR edge map E_h.
After obtaining the HR edge map E_h, the depth image I_h is interpolated by a modified joint bilateral filter under the guidance of this high-quality edge map. The bilateral filter not only preserves edge sharpness but also further suppresses noise.
In summary, the proposed method consists of two main parts: (1) edge recovery using sparse coding; and (2) edge-guided depth interpolation using a bilateral filter. The details of these two parts are discussed in the following subsections.

2.1. Edge Recovery Using Sparse Coding

In this section, we first present some notation for our work. Then, the way we build the LR and HR dictionaries for edge map recovery is discussed.

2.1.1. Sparse Dictionary Training

The LR and HR images are represented as z_l ∈ R^(N_l) and y_h ∈ R^(N_h), where N_h = s²N_l and s > 1 is an integer scale-up factor. The blur operator is denoted by H : R^(N_h) → R^(N_h), and the decimation operator for a factor s is denoted by D : R^(N_h) → R^(N_l). The acquisition model describing how an LR image is generated from an HR image is

$$z_l = D H y_h + v \qquad (1)$$

where v is additive noise in the acquisition process. Given z_l, the problem is to find ŷ ∈ R^(N_h) such that ŷ ≈ y_h, i.e., ‖ŷ − y_h‖_2 tends to zero. To avoid the complexities caused by the different resolutions of z_l and y_h, it is assumed that the image z_l is scaled up by a simple interpolation operator Q : R^(N_l) → R^(N_h) (e.g., bicubic interpolation) that fills in the missing pixels between the original pixels of the input LR image. The scaled-up image is denoted by y_l and satisfies

$$y_l = Q z_l = Q ( D H y_h + v ) = Q D H y_h + Q v = U y_h + \bar{v} \qquad (2)$$

where U = QDH and \bar{v} = Qv. The reconstruction problem is now cast as processing y_l ∈ R^(N_h) to produce a result ŷ_h ∈ R^(N_h) that is as close as possible to the original HR image y_h ∈ R^(N_h).
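For intuition, the following minimal sketch (not the authors' code) simulates the acquisition model of Equations (1) and (2) in Python with NumPy/SciPy; the Gaussian blur width, the noise level, and the use of a cubic-spline zoom as a stand-in for bicubic interpolation are assumptions made only for illustration.

```python
# Illustrative sketch of Equations (1)-(2): z_l = D H y_h + v and y_l = Q z_l.
# Assumes a depth map whose dimensions are multiples of s.
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def degrade(y_h, s=4, blur_sigma=1.0, noise_sigma=0.5):
    """Generate the interpolated LR image y_l from an HR depth image y_h."""
    z_small = gaussian_filter(y_h.astype(np.float64), blur_sigma)[::s, ::s]  # H then D
    z_l = z_small + noise_sigma * np.random.randn(*z_small.shape)            # add v
    y_l = zoom(z_l, s, order=3)          # Q: cubic-spline scale-up (bicubic stand-in)
    return y_l[:y_h.shape[0], :y_h.shape[1]]                                 # crop to HR size
```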
The proposed algorithm operates on patches extracted from y_l, aiming to estimate the corresponding patches of y_h. Let p^k = R_k y be an image patch of size n × n centered at location k and extracted from the image y by the linear operator R_k; a stride d is used for the spatial shifting of the patches. Hence, the LR and HR patches are extracted as

$$p_l^k = R_k y_l, \qquad p_h^k = R_k y_h \qquad (3)$$

It is further assumed that p_l^k and p_h^k can be represented sparsely by the same coefficient vector q^k over the dictionary pair A_l and A_h, respectively, namely

$$p_l^k = A_l q^k, \qquad p_h^k = A_h q^k \qquad (4)$$
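A small sketch of the patch extraction in Equation (3) is given below; it assumes the patch size n = 9 and stride d = 2 later chosen in Section 3.1, and the function name is illustrative only.

```python
# Extract matching n x n patch pairs (p_l^k, p_h^k) with stride d (Equation (3)).
import numpy as np

def extract_patch_pairs(y_l, y_h, n=9, d=2):
    """Return matrices whose columns are vectorized LR/HR patches at the same locations k."""
    p_l, p_h = [], []
    rows, cols = y_l.shape
    for i in range(0, rows - n + 1, d):
        for j in range(0, cols - n + 1, d):
            p_l.append(y_l[i:i + n, j:j + n].ravel())   # R_k y_l
            p_h.append(y_h[i:i + n, j:j + n].ravel())   # R_k y_h
    return np.array(p_l).T, np.array(p_h).T
```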
To acquire such a dictionary pair A_l and A_h, we apply the joint dictionary training method proposed in ref. [7].
The flow of the dictionary pair training is summarized in Algorithm 1. The first step is to construct the training set: a set of HR training images {y_h^j}_j is collected, the corresponding LR images {y_l^j}_j are generated using the operator U, and pairs of matching patches {p_h^k, p_l^k}_k are extracted to form the training database. Once the training database is prepared, dictionary learning begins.
For the LR dictionary A_l, the K-SVD dictionary training procedure [22] is applied to the LR patches {p_l^k}_k, resulting in the dictionary A_l:

$$\{A_l, \{q^k\}\} = \arg\min_{A_l, \{q^k\}} \sum_k \left\| p_l^k - A_l q^k \right\|_2^2 \quad \text{s.t.} \quad \left\| q^k \right\|_0 \le L \qquad (5)$$

A side product of this training is the set of sparse representation coefficient vectors {q^k}_k corresponding to the training patches {p_l^k}_k. Here ‖·‖_0 is the ℓ_0 pseudo-norm, so that ‖q^k‖_0 counts the nonzero entries of the vector q^k, and L is a constant that controls the sparsity.
The next step is the construction of the high-resolution dictionary. Recall the assumption that the HR patch p_h^k can be approximated as p_h^k ≈ A_h q^k. The dictionary A_h is therefore sought such that this approximation is as exact as possible, i.e.,

$$A_h = \arg\min_{A_h} \sum_k \left\| p_h^k - A_h q^k \right\|_2^2 = \arg\min_{A_h} \left\| P_h - A_h Q \right\|_F^2 \qquad (6)$$

where the matrix P_h has the HR training patches {p_h^k}_k as its columns and, similarly, Q contains the {q^k}_k as its columns (given that Q has full row rank). ‖·‖_F is the Frobenius norm [28]. The solution of this least-squares problem is given by

$$A_h = P_h Q^T ( Q Q^T )^{-1} \qquad (7)$$
Algorithm 1.
  Input: A set of HR training images {y_h^j}_j
  Output: LR-HR dictionary pair {A_l, A_h}
  Step 1. Construct the training set: use the operator U to construct the LR images {y_l^j}_j from the HR training images {y_h^j}_j, and extract pairs of matching patches {p_h^k, p_l^k}_k from {y_h^j}_j and {y_l^j}_j to form the training database.
  Step 2. LR dictionary training: apply the K-SVD [22] dictionary training procedure to the LR patches {p_l^k}_k, resulting in the LR dictionary A_l and the sparse representation coefficient vectors {q^k}_k.
  Step 3. HR dictionary training: the HR dictionary A_h is trained using the sparse representation coefficient vectors {q^k}_k so as to match the corresponding LR dictionary.
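The sketch below illustrates Algorithm 1 under simplifying assumptions: the full K-SVD update is replaced by a simpler coding/least-squares alternation, scikit-learn's OMP solver is used for the sparse coding step, and the number of atoms, sparsity level, and iteration count are illustrative values rather than the authors' settings.

```python
# Compact stand-in for Algorithm 1: learn A_l from LR patches, then A_h by
# least squares (Equations (5)-(7)).
import numpy as np
from sklearn.linear_model import orthogonal_mp

def train_dictionary_pair(P_l, P_h, n_atoms=512, sparsity=3, n_iter=20):
    """P_l, P_h: matrices whose columns are matching LR/HR training patches."""
    rng = np.random.default_rng(0)
    A_l = P_l[:, rng.choice(P_l.shape[1], n_atoms, replace=False)].copy()
    A_l /= np.linalg.norm(A_l, axis=0) + 1e-12
    for _ in range(n_iter):
        # Sparse coding step (Equation (5)): each q^k has at most `sparsity` nonzeros.
        Q = orthogonal_mp(A_l, P_l, n_nonzero_coefs=sparsity)
        # Dictionary update: plain least-squares step standing in for the K-SVD update [22].
        A_l = P_l @ np.linalg.pinv(Q)
        A_l /= np.linalg.norm(A_l, axis=0) + 1e-12
    # HR dictionary (Equations (6)-(7)): A_h = P_h Q^T (Q Q^T)^-1, with a small ridge term.
    Q = orthogonal_mp(A_l, P_l, n_nonzero_coefs=sparsity)
    A_h = P_h @ Q.T @ np.linalg.inv(Q @ Q.T + 1e-8 * np.eye(n_atoms))
    return A_l, A_h
```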

2.1.2. Edge Map Recovery

Once the LR-HR dictionary pair {A_l, A_h} is obtained, the high-quality edge map E_h can be represented by a sparse linear combination of HR dictionary atoms. Before starting the reconstruction, we first process the input LR image I_l to obtain an LR edge map E_l. The process consists of the following three steps (a code sketch is given after the list):
(1) The input image I_l is interpolated to the size of the desired HR image using bicubic interpolation, producing an enlarged LR image Î_l.
(2) A shock filter [27] is applied to suppress the zigzag effect produced by the up-sampling interpolation.
(3) The Canny operator is used to extract the edge map E_l from Î_l.
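A sketch of these three preprocessing steps follows. It assumes 8-bit depth maps and OpenCV for resizing and edge detection, and it uses a classic Osher-Rudin shock-filter iteration as a simple stand-in for the complex-diffusion filter of [27]; the iteration count, time step, and Canny thresholds are illustrative values only.

```python
# Build the LR edge map E_l: bicubic up-sampling, shock filtering, Canny.
import cv2
import numpy as np

def lr_edge_map(I_l, s=4, shock_iters=10, dt=0.1):
    # (1) Bicubic interpolation to the target HR size.
    I_hat = cv2.resize(I_l.astype(np.float32), None, fx=s, fy=s,
                       interpolation=cv2.INTER_CUBIC)
    # (2) Shock filter: I <- I - dt * sign(laplacian(I)) * |grad(I)|.
    for _ in range(shock_iters):
        gy, gx = np.gradient(I_hat)
        lap = cv2.Laplacian(I_hat, cv2.CV_32F)
        I_hat -= dt * np.sign(lap) * np.hypot(gx, gy)
    # (3) Canny operator on the sharpened image to extract E_l.
    E_l = cv2.Canny(cv2.convertScaleAbs(I_hat), 50, 150) > 0
    return I_hat, E_l
```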
Then, the HR edge map E_h can be reconstructed from E_l using the LR-HR dictionary pair {A_l, A_h}. The process is described in Algorithm 2.
Algorithm 2.
  Input: LR-HR dictionary pair {A_l, A_h} and the edge map E_l
  Output: High-quality edge map E_h
  Step 1. Extract patches {b_l^k}_k from the edge map E_l;
  Step 2. Represent the patches {b_l^k}_k over the atoms of the LR dictionary A_l, obtaining the corresponding sparse coefficients {c^k}_k;
  Step 3. Multiply the obtained sparse coefficients {c^k}_k by the HR dictionary A_h to obtain the HR patches {b_h^k}_k;
  Step 4. Construct the high-quality edge map E_h by merging these HR patches {b_h^k}_k; the overlapping regions of the patches are processed by the method of Zeyde [6].
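A minimal sketch of Algorithm 2 follows. It codes each LR edge patch over A_l with OMP, maps the coefficients through A_h, and merges the HR patches by simple averaging of the overlaps, used here only as a stand-in for the merging scheme of Zeyde [6]; the binarization threshold is an assumption.

```python
# Recover the HR edge map E_h from E_l with the trained dictionary pair.
import numpy as np
from sklearn.linear_model import orthogonal_mp

def recover_edge_map(E_l, A_l, A_h, n=9, d=2, sparsity=3):
    E_l = E_l.astype(np.float64)
    rows, cols = E_l.shape
    acc = np.zeros_like(E_l)
    cnt = np.zeros_like(E_l)
    for i in range(0, rows - n + 1, d):
        for j in range(0, cols - n + 1, d):
            b_l = E_l[i:i + n, j:j + n].ravel()                      # Step 1: LR patch
            c = orthogonal_mp(A_l, b_l, n_nonzero_coefs=sparsity)    # Step 2: sparse code
            b_h = (A_h @ c).reshape(n, n)                            # Step 3: HR patch
            acc[i:i + n, j:j + n] += b_h                             # Step 4: merge overlaps
            cnt[i:i + n, j:j + n] += 1
    E_h = acc / np.maximum(cnt, 1)
    return E_h > 0.5    # binarize the merged result back to an edge map
```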

2.2. Edge-Guided Depth Interpolation

In this section, we first introduce the notation used during interpolation. Then, the method for determining the pixel distribution is discussed.

2.2.1. Modified Joint Bilateral Filter

After obtaining the HR edge map E_h, the HR depth image I_h can be interpolated through a modified joint bilateral filter under the guidance of E_h. For each pixel p in the target HR depth image I_h, its value is interpolated from a local neighborhood of the LR image:

$$I_h(p) = \frac{1}{k_p} \sum_{q \in N(p)} I_l(q) \, f_s\!\left( \left\| p - q \right\| \right) f_r\!\left( E_h, p, q \right) \qquad (8)$$

where N(p) is an s × s neighborhood window centered at pixel p, and p and q denote the pixel coordinates corresponding to pixels p and q in the LR depth image I_l (only integer coordinates are considered). f_s(·) is a Gaussian kernel with standard deviation σ and zero mean, which weights the correlation of the different pixels in the neighborhood, and k_p is a normalizing factor. f_r(·) is a binary indicator that determines whether two pixels lie on the same side of the edge:

$$f_r(E_h, p, q) = \begin{cases} 1 & \text{if pixels } p \text{ and } q \text{ are on the same side of } E_h \\ 0 & \text{otherwise} \end{cases} \qquad (9)$$

The concrete form of f_r(·) is obtained by determining the distribution of pixels p and q.
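The sketch below illustrates Equations (8) and (9). It assumes the LR image has already been interpolated to the HR grid (so that I_l can be sampled at every HR coordinate), and it represents the same-side indicator f_r by a precomputed per-pixel region label map, whose construction is sketched in Section 2.2.2; the window size 7 and σ = 0.5 follow the settings reported in Section 3.1.

```python
# Edge-guided joint bilateral interpolation (Equations (8)-(9)).
import numpy as np

def edge_guided_filter(I_l_up, labels, win=7, sigma=0.5):
    """I_l_up: LR depth interpolated to HR size; labels: same-side region map."""
    rows, cols = I_l_up.shape
    r = win // 2
    I_h = np.zeros_like(I_l_up, dtype=np.float64)
    for y in range(rows):
        for x in range(cols):
            y0, y1 = max(0, y - r), min(rows, y + r + 1)
            x0, x1 = max(0, x - r), min(cols, x + r + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            f_s = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma ** 2))  # spatial kernel
            f_r = (labels[y0:y1, x0:x1] == labels[y, x])                       # same-side indicator
            w = f_s * f_r
            k_p = w.sum()
            I_h[y, x] = (w * I_l_up[y0:y1, x0:x1]).sum() / k_p if k_p > 0 else I_l_up[y, x]
    return I_h
```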

2.2.2. Discrimination of Pixel Distribution

First, the set C_e is used to store the pixels on the edge, and the pixels on the line segment between pixels p and q are stored in a set L. Pixels p and q are on the same side of the edge if the intersection of C_e and L is empty, as shown in Figure 2a. When the intersection of C_e and L is not empty, the distribution of pixels p and q may correspond to two situations: pixels p and q are not on the same side of the edge in Figure 2b, but they are on the same side in Figure 2c. In this case, we divide each neighborhood window into several sets according to the edge. The process is summarized in Algorithm 3.
As shown in Figure 2, the white lines represent the edge pixels and the black portion is the area to be divided; an image patch is partitioned according to the connectivity of the black area. In addition, some special edge-curve configurations need to be stated clearly. As shown in Figure 3, if the edge curve does not traverse the entire image patch, we treat this as a special form of connectivity, namely one in which the interior space enclosed by the edge pixels is zero. The details of the algorithm are as follows.
Algorithm 3.
  Input: An image patch A with edge pixels and the set C_e
  Output: Pixel sets C_i (i = 1, 2, ..., n), where i is the index of a set and n is the total number of sets.
  Step 1. An initial pixel r is chosen randomly from A; subsequent pixels are visited sequentially according to their coordinates;
  Step 2. If r ∉ C_e, it is assigned to C_1; if r ∈ C_e, the algorithm returns to Step 1;
  Step 3. The pixels adjacent to the newly added pixels of set C_1 are examined; if an adjacent pixel does not belong to C_e, it is added to C_1;
  Step 4. Repeat Step 3 until the set C_1 no longer changes;
  Step 5. The remaining pixels are processed in the same way as for C_1, producing further sets;
  Step 6. The area A is thus divided into pixel sets C_i (i = 1, 2, ..., n).
After the pixel distribution has been determined, it is easy to decide whether pixels p and q are on the same side of the edge: they are on the same side when they belong to the same pixel set; otherwise, they are not. Once the kernel functions of the bilateral filter are determined, the HR depth image can be interpolated using Equation (8). During interpolation, the Gaussian kernel f_s(·) also suppresses some noise in the depth values. In addition, with the guidance provided by the indicator f_r(·), only pixels on the same side of the edge are considered during interpolation, so that the edges are well preserved. A minimal sketch of this region partitioning and of the resulting same-side test is given below.
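As an illustration of Algorithm 3, the sketch below labels the connected components of the non-edge pixels with 4-connectivity, which plays the role of the flood-fill loop in Steps 2-5; for simplicity the labeling is done once over the whole edge map rather than per neighborhood window, and SciPy is an assumed dependency.

```python
# Partition pixels by the edge map and test whether two pixels share a side.
import numpy as np
from scipy import ndimage

def partition_by_edges(edge_map):
    """Label map: pixels with the same nonzero label are on the same side of
    the edge; edge pixels (the set C_e) receive label 0."""
    labels, n_sets = ndimage.label(~edge_map.astype(bool))   # sets C_1 ... C_n
    return labels, n_sets

def same_side(labels, p, q):
    """Binary indicator of Equation (9) for pixel coordinates p and q."""
    return int(labels[p] != 0 and labels[p] == labels[q])
```

With `labels` computed this way, the f_r term in the filter sketch of Section 2.2.1 reduces to a simple label comparison.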

3. Experiments and Analysis

In this section, we analyze the performance of the proposed depth image SR method and benchmark it quantitatively and qualitatively against other state-of-the-art methods. All experiments were run in the same experimental environment.

3.1. Test Environment and Parameter Setting

In our experiments, the programming tool was MATLAB (v.2016a) [29], and the test environment was as follows. The processor was an Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40 GHz, and the computer memory size was 64.0 GB. Multithreading was used in the experiments. The proposed algorithm supports GPU computing, but this was not used. The test images were from the Middlebury Stereo database [30,31].
Some parameters were selected based on the smallest Root Mean Square Error (RMSE). We calculated the average RMSE of 10 test images while varying the size of the image patches from n × n = 3 × 3 to n × n = 13 × 13. The results are depicted in Figure 4a; based on this comparison, we chose n × n = 9 × 9 as the patch size. Similarly, we compared the stride d of the patch selection in Figure 4b, and the stride was set to 2. The size of the neighborhood window was s × s = 7 × 7 when the joint bilateral filter was applied; the reliability of this value of s has been confirmed in [8]. The standard deviation was σ = 0.5 for f_s(·) in Equation (8). The dictionaries were trained using the database from Yang [32], which consists of 100,000 patches extracted from 30 training images.

3.2. Experimental Results and Comparative Analysis

To compare the proposed method quantitatively, we chose RMSE, Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Percent of Error (PE) [8] to evaluate the experimental results. Table 1, Table 2, Table 3 and Table 4 show the results of 10 test images from the Middlebury Stereo database using different SR methods. These methods include: Neighbor Embedding with Locally Linear Embedding (NE + LLE) [33], Neighbor Embedding with Least Squares (NE + LS) [34], Neighbor Embedding with Non-Negative Least Squares (NE + NNLS) [35], Global Regression (GR) and Anchored Neighborhood Regression (ANR) [36], Adjusted Anchored Neighborhood Regression for Fast Super-Resolution (AANR) [7], Accurate Image Super-Resolution Using Very Deep Convolutional Networks (CNN) [25], the sparse coding method of Yang [21], the modified sparse coding method of Zeyde [6], and the edge-guided method based on Markov random fields of Xie [8].
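For reference, RMSE and PSNR can be computed as in the short sketch below; a peak value of 255 is assumed for 8-bit depth maps, while SSIM and PE follow their standard definitions (PE as defined in [8]) and are omitted here.

```python
# Two of the evaluation metrics used in Tables 1 and 3.
import numpy as np

def rmse(gt, est):
    return np.sqrt(np.mean((gt.astype(np.float64) - est.astype(np.float64)) ** 2))

def psnr(gt, est, peak=255.0):
    return 20.0 * np.log10(peak / rmse(gt, est))
```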
To make the tables easier to read, we mark the top three reconstruction methods in the four tables: the best value is shown in bold, the second-best value is marked with a single underline, and the third-best value is marked with a double underline. From the tables, we can see that the RMSE and PSNR values of our method rank first for all test images. For SSIM, seven values of our method rank first and three rank third; for PE, three values of our method rank first and seven rank second. These objective measurements show that our method achieves good performance compared with the other methods.
We also provide visual assessments for the test images "cones" and "tsukuba". The ground-truth HR images and the final SR images of the top five methods from the objective evaluation tables (4× scaling factor) are shown in Figure 5 and Figure 6. Note that, except for Figure 5a and Figure 6a, all of the experimental images used for comparison were generated by ourselves by re-running the original algorithms.
From these figures, we can see that the SR result of Kim et al. [25] exhibits a serious zigzag effect. Zeyde [6] and Timofte [7] relieve the zigzags using sparse coding, but still introduce many artifacts around the edges. The method of Xie [8] obtains good results, but the edge details are not reconstructed as well as with our method. In Figure 5f and Figure 6f, we can clearly see that our reconstructed depth images not only avoid blurred edges, but also reduce the zigzags near edges and preserve edge sharpness.

4. Conclusions and Future Work

Conventional SR methods can cause edges to become blurred and jagged. To solve this problem, this paper proposes an edge-guided SR method. First, high-quality edge information is reconstructed based on a dictionary pair generated from HR edge patches and their corresponding LR edge patches. Then, with the guidance of these recovered edges, the SR depth image is interpolated by a joint bilateral filter. The guidance of high-quality edge information improves the performance of the SR algorithm, resulting in sharper SR depth images. The quantitative and qualitative analyses of the experimental results show the superiority of the proposed technique over conventional and state-of-the-art techniques.
The proposed method still has some shortcomings. Its running time is higher than that of several of the methods shown in Table 5, and the dictionary pair training requires a database acquired from external HR-LR images. In the future, we will improve the proposed method in the following ways. (1) Database construction: we will construct an image pyramid by interpolating the input image across different scales, so that the training database can be extracted from the image pyramid itself. (2) Dictionary training: we will use an optimized approach to train the sparse dictionary so that the running time can be reduced.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Nos. 61751203, 61603066, and 61300015), the Program for Dalian High-level Talent’s Innovation (2015R088), Program for Changjiang Scholars and Innovative Research Team in University (No. IRT_15R07), and Program for Liaoning Distinguished Professor.

Author Contributions

Ruyi Wang wrote this manuscript. Dongsheng Zhou, Jian Lu and Qiang Zhang contributed to the writing, direction, and content, and revised the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Pan, H.; Guan, T.; Luo, Y.; Duan, L.; Tian, Y.; Yi, L.; Zhao, Y.; Yu, J. Dense 3D reconstruction combining depth and RGB information. Neurocomputing 2016, 75, 644–651. [Google Scholar] [CrossRef]
  2. Henry, P.; Krainin, M.; Herbst, E.; Ren, X.; Fox, D. RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments. In Proceedings of the 12th International Symposium on Experimental Robotics, Delhi, India, 18–21 December 2010; Springer: Berlin/Heidelberg, Germany, 2014; pp. 647–663. [Google Scholar]
  3. Lee, K.; Hou, X. Augmented Reality Using 3D Depth Sensor and 3D Projection. U.S. Patent Appl. 15/172,723, 3 June 2016. [Google Scholar]
  4. Correa, D.S.O.; Sciotti, D.F.; Prado, M.G.; Sales, D.O.; Wolf, D.F.; Osorio, F.S. Mobile robots navigation in indoor environments using kinect sensor. In Proceedings of the 2012 Second Brazilian Conference on Critical Embedded Systems, Campinas, Brazil, 20–25 May 2012; pp. 36–41. [Google Scholar]
  5. Zhen-Liang, N.; Zhen, S.; Chao, G.; Xiong, G.; Nyberg, T.; Shang, X.; Li, S.; Wang, Y. The application of the depth camera in the social manufacturing: A review. In Proceedings of the IEEE International Conference on Service Operations and Logistics, Beijing, China, 10–12 July 2016; pp. 66–70. [Google Scholar]
  6. Zeyde, R.; Elad, M.; Protter, M. On single image scale-up using sparse-representations. In Proceedings of the International Conference on Curves and Surfaces, Avignon, France, 24–30 June 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 711–730. [Google Scholar]
  7. Timofte, R.; Smet, V.D.; Gool, L.V. A+: Adjusted Anchored Neighborhood Regression for Fast Super-Resolution. In Proceedings of the Asian Conference on Computer Vision, Singapore, 1–5 November 2014; Springer: Cham, Switzerland, 2014; pp. 111–126. [Google Scholar]
  8. Xie, J.; Feris, R.S.; Sun, M.T. Edge-guided single depth image super resolution. IEEE Trans. Image Process. 2016, 25, 428–438. [Google Scholar] [CrossRef] [PubMed]
  9. Prajapati, A.; Naik, S.; Mehta, S. Evaluation of Different Image Interpolation Algorithms. Int. J. Comput. Appl. 2012, 58, 466–476. [Google Scholar] [CrossRef]
  10. Pang, Z.; Dai, H.; Tan, H.; Chan, D. An improved low-cost adaptive bilinear image interpolation algorithm. In Proceedings of the 2nd International Conference on Green Communications and Networks, Chongqing, China, 14–16 December 2012; Springer: Berlin/Heidelberg, Germany, 2013; pp. 691–699. [Google Scholar]
  11. Ning, L.; Luo, K. An Interpolation Based on Cubic Interpolation Algorithm. In Proceedings of the International Conference Information Computing and Automation, Chengdu, China, 20–22 December 2007; pp. 1542–1545. [Google Scholar]
  12. Tomasi, C.; Manduchi, R. Bilateral filtering for gray and color images. In Proceedings of the Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271), Bombay, India, 7 January 1998; pp. 839–846. [Google Scholar]
  13. Kopf, J.; Cohen, M.F.; Lischinski, D.; Uyttendaele, M. Joint Bilateral Upsampling. ACM Trans. Graph. 2007, 26, 5–9. [Google Scholar] [CrossRef]
  14. Schuon, S.; Theobalt, C.; Davis, J.; Thrun, S. LidarBoost: Depth superresolution for ToF 3D shape scanning. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 343–350. [Google Scholar]
  15. Rajagopalan, A.N.; Bhavsar, A.; Wallhoff, F.; Rigoll, G. Resolution enhancement of pmd range maps. In Proceedings of the Joint Pattern Recognition Symposium, Munich, Germany, 10–13 June 2008; Springer: Berlin/Heidelberg, Germany, 2008; pp. 304–313. [Google Scholar]
  16. Al Ismaeil, K.; Aouada, D.; Mirbach, B.; Ottersten, B. Dynamic super resolution of depth sequences with non-rigid motions. In Proceedings of the 2013 20th IEEE International Conference on Image Processing (ICIP), Melbourne, Australia, 15–18 September 2013; pp. 660–664. [Google Scholar]
  17. Gevrekci, M.; Pakin, K. Depth map super resolution. In Proceedings of the 2011 18th IEEE International Conference on Image Processing (ICIP), Brussels, Belgium, 11–14 September 2011; pp. 3449–3452. [Google Scholar]
  18. Ferstl, D.; Reinbacher, C.; Ranftl, R.; Ruether, M.; Bischof, H. Image guided depth upsampling using anisotropic total generalized variation. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, NSW, Australia, 1–8 December 2013; pp. 993–1000. [Google Scholar]
  19. Yang, Q.; Yang, R.; Davis, J.; Nister, D. Spatial-depth super resolution for range images. In Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8. [Google Scholar]
  20. Lo, K.H.; Wang, Y.F.; Hua, K.L. Edge-Preserving Depth Map Upsampling by Joint Trilateral Filter. IEEE Trans. Cybern. 2017, 13, 1–14. [Google Scholar] [CrossRef] [PubMed]
  21. Yang, J.; Wright, J.; Huang, T.S.; Ma, Y. Image super-resolution via sparse representation. IEEE Trans. Image Process. 2010, 19, 2861–2873. [Google Scholar] [CrossRef] [PubMed]
  22. Aharon, M.; Elad, M.; Bruckstein, A. K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation. IEEE Trans. Signal Process. 2006, 54, 4311–4322. [Google Scholar] [CrossRef]
  23. Tropp, J.A.; Gilbert, A.C. Signal Recovery from Random Measurements Via Orthogonal Matching Pursuit. IEEE Trans. Inf. Theory 2007, 53, 4655–4666. [Google Scholar] [CrossRef]
  24. Xie, J.; Chou, C.C.; Feris, R.; Sun, M.-T. Single depth image super resolution and denoising via coupled dictionary learning with local constraints and shock filtering. In Proceedings of the 2014 IEEE International Conference on Multimedia and Expo (ICME), Chengdu, China, 14–18 July 2014; pp. 1–6. [Google Scholar]
  25. Kim, J.; Kwon Lee, J.; Mu Lee, K. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654. [Google Scholar]
  26. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  27. Gilboa, G.; Sochen, N.; Zeevi, Y.Y. Image enhancement and denoising by complex diffusion processes. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 1020–1036. [Google Scholar] [CrossRef] [PubMed]
  28. Böttcher, A.; Wenzel, D. The Frobenius norm and the commutator. Linear Algebra Appl. 2008, 8, 1864–1885. [Google Scholar] [CrossRef]
  29. MATLAB. 2016. Available online: www.mathworks.com/products/matlab (accessed on 12 December 2017).
  30. Baker, S.; Scharstein, D.; Lewis, J.P.; Roth, S.; Black, M.J.; Szeliski, R. A database and evaluation methodology for optical flow. Int. J. Comput. Vis. 2011, 92, 1–31. [Google Scholar] [CrossRef]
  31. Scharstein, D.; Hirschmüller, H.; Kitajima, Y.; Krathwohl, G.; Nešić, N.; Wang, X.; Westling, P. High-resolution stereo datasets with Subpixel-accurate ground truth. In Proceedings of the German Conference on Pattern Recognition, Münster, Germany, 2–5 September 2014; Springer: Cham, Switzerland, 2014; pp. 31–42. [Google Scholar]
  32. Yang, J.; Wright, J.; Huang, T.; Ma, Y. Image super-resolution as sparse representation of raw image patches. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
  33. Chang, H.; Yeung, D.Y.; Xiong, Y. Super-resolution through neighbor embedding. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 27 June–2 July 2004; pp. 1275–1282. [Google Scholar]
  34. Chan, T.M.; Zhang, J.; Pu, J.; Huang, H. Neighbor embedding based super-resolution algorithm through edge detection and feature selection. Pattern Recognit. Lett. 2009, 30, 494–502. [Google Scholar] [CrossRef]
  35. Bevilacqua, M.; Roumy, A.; Guillemot, C.; Alberi Morel, M.-L. Neighbor embedding based single-image super-resolution using Semi-Nonnegative Matrix Factorization. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, 25–30 March 2012; pp. 1289–1292. [Google Scholar]
  36. Timofte, R.; De Smet, V.; Van Gool, L. Anchored neighborhood regression for fast example-based super-resolution. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, NSW, Australia, 1–8 December 2013; pp. 1920–1927. [Google Scholar]
  37. Scharstein, D.; Szeliski, R. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis. 2002, 47, 7–42. [Google Scholar] [CrossRef]
  38. Hirschmüller, H.; Scharstein, D. Evaluation of cost functions for stereo matching. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8. [Google Scholar]
Figure 1. Pipeline of the edge-guided depth image SR.
Figure 2. Distinguishing two pixels near an edge.
Figure 3. Some special forms of the edge curves.
Figure 4. The sensitivity of patch size n and stride d.
Figure 5. Visual comparisons of "cones": (a) ground truth (reprinted with permission from [37]); (b) Kim [25]; (c) AANR [7]; (d) Zeyde [6]; (e) Xie [8]; and (f) our method.
Figure 6. Visual comparisons of "tsukuba": (a) ground truth (reprinted with permission from [38]); (b) Kim [25]; (c) AANR [7]; (d) Zeyde [6]; (e) Xie [8]; and (f) our method.
Table 1. RMSE values on the Middlebury Stereo database with a scaling factor of 4.

RMSE ×4      Bowling  Artl    Aloe    Cones   Indian  Venus   Warrior  Tsukuba  Hand    Dove
NE + LLE     1.841    2.567   2.370   1.380   0.820   0.654   3.645    2.861    1.828   1.004
NE + LS      1.811    2.512   2.332   1.357   0.802   0.634   3.630    2.860    1.839   1.000
NE + NNLS    1.812    2.523   2.326   1.356   0.803   0.637   3.627    2.867    1.820   1.012
GR           1.986    2.838   2.585   1.492   0.906   0.726   3.864    3.142    2.004   1.088
ANR          1.807    2.634   2.429   1.393   0.836   0.662   3.723    2.946    1.885   1.022
AANR         1.855    2.713   2.478   1.456   0.855   0.674   3.707    2.972    1.925   1.043
CNN          2.238    3.798   3.245   1.778   0.987   0.845   4.424    3.505    2.174   1.214
Yang         2.112    3.361   2.865   1.514   0.908   0.733   4.360    3.186    2.026   1.098
Zeyde        1.803    2.494   2.329   1.338   0.798   0.635   3.620    2.844    1.832   0.989
Xie          1.766    2.935   2.583   1.240   0.771   0.617   4.081    3.009    1.926   1.010
Ours         1.662    2.368   2.242   1.230   0.705   0.541   3.325    2.768    1.796   0.935
Table 2. SSIM values on the Middlebury Stereo database with a scaling factor of 4.

SSIM ×4      Bowling  Artl    Aloe    Cones   Indian  Venus   Warrior  Tsukuba  Hand    Dove
NE + LLE     0.916    0.712   0.875   0.874   0.987   0.937   0.893    0.827    0.983   0.988
NE + LS      0.923    0.735   0.884   0.890   0.988   0.951   0.903    0.814    0.984   0.989
NE + NNLS    0.923    0.728   0.884   0.887   0.989   0.948   0.902    0.829    0.984   0.989
GR           0.899    0.656   0.847   0.850   0.984   0.931   0.875    0.833    0.978   0.985
ANR          0.916    0.711   0.873   0.879   0.987   0.944   0.892    0.782    0.983   0.988
AANR         0.924    0.750   0.880   0.891   0.987   0.953   0.906    0.801    0.985   0.990
CNN          0.922    0.745   0.865   0.880   0.987   0.951   0.904    0.843    0.983   0.988
Yang         0.909    0.677   0.857   0.861   0.985   0.940   0.884    0.795    0.980   0.986
Zeyde        0.925    0.740   0.885   0.893   0.988   0.950   0.905    0.839    0.984   0.989
Xie          0.946    0.791   0.908   0.916   0.992   0.971   0.931    0.855    0.989   0.993
Ours         0.953    0.804   0.912   0.917   0.991   0.968   0.936    0.879    0.990   0.992
Table 3. PSNR values on the Middlebury Stereo database with a scaling factor of 4.

PSNR ×4      Bowling  Artl    Aloe    Cones   Indian  Venus   Warrior  Tsukuba  Hand    Dove
NE + LLE     42.827   39.942  40.634  45.328  49.852  51.818  36.896   38.999   42.901  48.096
NE + LS      42.972   40.099  40.776  45.475  50.045  52.084  36.932   39.001   42.850  48.126
NE + NNLS    42.966   40.091  40.798  45.480  50.034  52.038  36.938   38.980   42.938  48.130
GR           42.171   39.069  39.878  44.653  48.984  50.912  36.389   38.186   42.104  47.396
ANR          42.691   39.716  40.419  45.250  49.678  51.713  36.712   38.759   42.632  47.935
AANR         42.761   39.460  40.245  44.864  49.485  51.553  36.748   38.669   42.451  47.762
CNN          40.667   36.538  37.764  43.131  48.237  49.587  35.214   37.237   41.396  46.444
Yang         41.016   37.600  38.655  44.526  48.960  50.821  35.340   38.064   42.008  47.314
Zeyde        43.008   40.193  40.784  45.599  50.085  52.071  36.954   39.049   42.882  48.219
Xie          42.124   38.778  39.332  46.260  50.384  52.317  35.915   38.560   42.447  48.044
Ours         44.255   41.015  41.110  46.332  51.163  53.459  37.694   39.605   43.552  48.707
Table 4. PE values on the Middlebury Stereo database with a scaling factor of 4.

PE ×4        Bowling  Artl    Aloe    Cones   Indian  Venus   Warrior  Tsukuba  Hand    Dove
NE + LLE     6.725    24.519  16.890  9.050   2.403   2.690   9.427    15.243   4.293   2.730
NE + LS      6.216    22.944  15.948  8.155   2.230   2.412   8.638    14.415   4.260   2.584
NE + NNLS    6.482    23.725  16.246  8.478   2.303   2.583   8.918    14.730   4.332   2.782
GR           8.438    30.248  20.611  11.200  3.191   3.539   10.776   18.229   5.137   3.719
ANR          7.017    25.183  17.360  9.052   2.506   2.781   9.912    15.182   4.534   2.911
AANR         5.274    21.084  14.741  7.385   2.047   1.967   7.640    12.816   3.293   2.248
CNN          4.232    19.169  13.454  6.993   1.795   1.696   8.077    11.340   2.624   1.690
Yang         8.208    30.547  19.799  9.955   2.867   2.996   12.692   17.170   5.341   3.430
Zeyde        6.040    22.714  15.751  7.968   2.208   2.447   8.228    14.240   4.036   2.544
Xie          2.405    10.694  8.299   2.829   0.951   0.505   2.575    4.239    0.918   0.608
Ours         2.384    10.676  8.257   3.294   1.076   0.641   2.603    4.586    1.089   0.751
Table 5. Running time on the Middlebury Stereo database with a scaling factor of 4.

Time (s)     Bowling  Artl    Aloe    Cones   Indian  Venus   Warrior  Tsukuba  Hand    Dove
GR           2.7      0.7     1.6     1.3     5.2     1.3     3.9      0.8      5.3     5.5
ANR          3.2      0.8     1.8     1.5     5.8     1.4     4.8      1.0      6.1     6.4
NE + LS      9.1      2.5     4.9     4.2     17.0    4.1     13.1     2.4      17.3    18.5
NE + NNLS    56.3     15.1    29.6    27.0    104.8   26.0    66.6     10.4     104.3   110.4
NE + LLE     11.7     3.0     6.6     5.4     21.5    5.3     15.7     2.9      21.8    21.9
AANR         3.4      0.9     2.0     1.6     6.4     1.6     4.5      0.9      6.4     6.2
CNN          6.7      2.1     4.8     3.5     12.9    3.6     10.0     2.3      13.7    12.9
Zeyde        5.3      1.4     3.1     2.7     9.9     2.4     6.7      1.3      9.9     10.6
Yang         986.2    260.4   726.9   541.5   1937.3  485.1   864.7    177.5    1786.7  1419.8
Xie          594.9    517.4   864.6   608.9   913.7   141.3   759.9    373.8    469.5   417.7
Ours         92.1     74.7    105.0   101.3   177.4   56.9    115.6    51.7     76.4    75.3
