Article

Highlight Removal of Multi-View Facial Images

School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
* Author to whom correspondence should be addressed.
Submission received: 8 August 2022 / Revised: 28 August 2022 / Accepted: 31 August 2022 / Published: 2 September 2022
(This article belongs to the Section Sensing and Imaging)

Abstract

Highlight removal is a fundamental and challenging task that has been an active research field for decades. Although several methods have recently been tailored to facial images, they are typically designed for a single image. This paper presents a lightweight optimization method for removing the specular highlight reflections of multi-view facial images. This is achieved by taking full advantage of Lambertian consistency, which states that the diffuse component does not vary with the viewing angle, whereas the specular component does. We impose non-negative constraints on light and shading in all directions, rather than only the normal directions contained in the face, to obtain physically reliable properties. The removal of highlights is further facilitated by estimating the illumination chromaticity through orthogonal subspace projection. An important practical feature of the proposed method is that it does not require face reflectance priors. A dataset with ground truth for highlight removal of multi-view facial images is captured to quantitatively evaluate the performance of our method. We demonstrate the robustness and accuracy of our method through comparisons to existing specular highlight removal methods and through improvements in applications such as reconstruction.

1. Introduction

The removal of specular highlight reflection is an important problem in computer graphics, computer vision, and image processing since it provides useful information for the applications that need consistent object surface appearance [1], such as stereo reconstruction [2], visual recognition [3,4], augmented reality [5,6], object re-illumination [7] and dichromatic editing [8], many of which are multi-view issues. This is particularly significant in facial issues [9,10,11,12,13] since highlights are inevitable in facial images due to the oily skin surface of the human face and often show high intensity.
Nevertheless, previous highlight removal methods are typically based on a single image. Their results cannot be consistent between images from different viewpoints, even when the methods are tailored to facial images [14,15,16]. To facilitate multi-view applications, the specular highlight reflections of multi-view images need to be removed consistently. Moreover, the extracted specular highlight can provide useful information for inferring scene properties such as surface normals and lighting directions.
To address this problem, we present a highlight removal method for multi-view facial images that jointly estimates the lighting environment. Based on the Lambertian consistency and the mirror reflection model, respectively, we model the diffuse and specular reflection of facial skin in three-dimensional space and render them into each viewpoint, taking advantage of a prior on the geometry of the human face, which can be obtained via existing multi-view 3D face reconstruction methods. Since the mirror reflection model is sensitive to the accuracy of the face geometry and to the irregularities of the micro-facet structure, which cannot be characterized accurately at the available model precision, we further employ the dichromatic reflectance model and define a specular factor in each viewpoint. To obtain a physically reliable lighting environment, we impose non-negative constraints on light and shading in all directions, rather than only the normal directions contained in the face. The removal of highlights is further facilitated by estimating the illumination chromaticity through orthogonal subspace projection.
The main contributions of our study are summarized as follows:
  • We propose a lightweight optimization solution for removing the specular highlight reflections of multi-view facial images and jointly estimate the lighting environment.
  • We provide physically reliable intrinsic image properties through the non-negative constraints and the orthogonal subspace projection rather than priors.
  • A dataset with ground truth for highlight removal of multi-view facial images is captured to quantitatively evaluate our method’s performance and demonstrate our method’s effectiveness.
The rest of this paper is organized as follows. Section 2 reviews the motivation of our method and the related work. Section 3 introduces the formulation, and Section 4 presents our method. Section 5 reports experiments on laboratory-captured images with ground-truth highlight removal data, as shown in Figure 1. Finally, Section 6 concludes the paper.

2. Related Work

Highlight removal is a problem that has been studied for decades, as reviewed in [1,17]. Existing works for color images mostly fall into two categories: one is based on a single input image, and the other is based on multiple images. In this section, we briefly review previous works on highlight removal of general objects in these two categories, as well as works particularly targeting facial images.

2.1. General Highlight Removal from a Single Image

Early approaches to general highlight removal from a single input image used color space analysis and treated an image pixel by pixel. Klinker et al. [18] analyzed the color histogram distributions using a convex polygon fitting technique and proposed methods linking color space with the dichromatic reflection model [19,20] to separate diffuse and specular reflections. Techniques that transform images into other color spaces were later proposed [21,22,23,24]. Color space analysis was extended to spatial information analysis, enabling the handling of surface textures that can be inpainted [25] or that have a repetitive structure [26]. Tan et al. [27] introduced the maximum chromaticity-intensity space to differentiate between the maximum intensity and the maximum chromaticity; a pseudo-diffuse component image was created and later utilized to separate the specular reflection from the image. Variants of this approach have employed a dark channel prior [28]. Yang et al. [29,30] treated the specular pixels as noise and used a bilateral filter to smooth the maximum fraction of color components. An et al. [31] proposed the pure diffuse pixel distribution model. Many researchers have recently applied data-driven deep learning to single-image highlight removal. Funke et al. [32] presented a GAN-based method for automatic specular highlight removal from a single endoscopic image. Lin et al. [33] proposed a fully convolutional neural network that removes specular highlights from a single image by generating its diffuse component automatically and consistently. Wu et al. [34] presented a generative adversarial network that introduces detected specular reflection information as guidance.
A fundamental problem with methods based on a single image is that they either rely on image statistics or are based on strong prior assumptions. Therefore, such methods are not robust to changes in imaging objects, viewing angles, or lighting environments.

2.2. General Highlight Removal from Multiple Images

Multi-image methods use the information in an image sequence of the same scene, taken from different viewpoints or under different lighting. Physically, the degree of light polarization can be considered a strong indicator of specular reflection, while diffuse reflection is considered unpolarized. Therefore, many polarization-based methods with hardware assistance have been proposed, such as [35,36,37,38,39,40]. Specular highlights exhibit varying behaviors under different illumination directions or from different views [41]. Based on this property, highlight removal techniques based on multiple images have been proposed in the literature. Sato et al. [42] introduced temporal-color space analysis using a moving light source. Lin et al. [43] used different illuminations of the same scene and proposed linear basis functions for separating diffuse and specular components. Lin et al. [44] presented a method based on color analysis and multi-baseline stereo that uses a sequence of images to separate the specular reflection. Prinet et al. [45] proposed generating a specularity map from a video sequence. Wang et al. [46] used three cameras to take images of a transparent plastic package containing tablets.
To the best of our knowledge, the method most closely related to our work is the multi-baseline stereo approach proposed by Lin et al. [44]. Nevertheless, they assumed that scene points exhibiting specular reflection appear purely diffuse in some other views, an assumption that is usually not satisfied in facial images.

2.3. Highlight Removal of Facial Images

Recently, several highlight removal techniques have been proposed that particularly target facial images. Li et al. [14] decomposed a single face photograph into its intrinsic components by utilizing human face priors, including statistics on skin reflectance and facial geometry. They also utilized a physically based model, the bidirectional surface-scattering reflectance distribution function, to model skin translucency, together with the Planckian locus constraint for light color estimation. Later, Li et al. [15] improved the algorithm by adopting a skin model based on melanin and hemoglobin and an illumination model based on spherical harmonics. Unlike the optimization methods mentioned above, Yi et al. [10] proposed a deep neural network to extract facial highlights; the network is pre-trained with synthetic images and fine-tuned using an unsupervised strategy on real photos. Zhu et al. [16] adopted the structure of a conditional generative adversarial network to generate highlight-free images. Limited by the lack of diversity in the training data, this method produces results with an overall yellow tint similar to the skin color of Asians.
The above methods, however, are also based on a single image and share the limitations of general single-image highlight removal methods; that is, they are not robust to changes in viewing angle. As a result, they cannot provide consistent intrinsic properties between images from different viewpoints to facilitate multi-view applications.

2.4. Our Work

In this paper, we develop a lightweight optimization-based highlight removal method for multi-view facial images, as shown in Figure 2. Making use of the property that the specular component varies with the viewing direction while the diffuse component reflects light equally in all directions, we remove the highlights of multi-view facial images consistently across viewpoints and effectively reduce the ambiguity in separating specular highlights from diffuse reflection that occurs when the illumination chromaticity is similar to the diffuse chromaticity. Experiments show that our approach provides superior performance and significantly facilitates multi-view tasks such as 3D face reconstruction.

3. Formulation

According to the well-known dichromatic reflection model [48], the surface radiance is decomposed into a diffuse component and a specular one. Assuming that the image intensity I is calibrated to be linearly related to the image irradiance, and hence to the surface radiance, the dichromatic reflection model can be expressed as follows:
$$I_j = D_j + H_j. \tag{1}$$
Here, $j$ indexes the images of the $j$th viewpoint, and $D$ and $H$ represent the diffuse and specular reflection, respectively. We formulate the problem of highlight removal of multi-view facial images as decomposing a set of input facial images $\{I_j\}_{j=1}^{N}$ into their specular reflections $\{H_j\}_{j=1}^{N}$ and diffuse reflections $\{D_j\}_{j=1}^{N}$.

3.1. Illumination Model

Considering a distant lighting environment $L$ with uniform illumination chromaticity $C$ incident on the face, we employ spherical harmonics, which form a complete set of orthogonal functions on the surface of a sphere analogous to the Fourier series on a circle, to model the angular distribution of lighting over the range of incident directions. Assuming that the diffuse reflection of the human face adheres to the Lambertian model, we can express $L$ by its coefficients $L_{lm}$ in the spherical harmonic expansion as follows [49,50]:
$$L(\mathbf{n}) = \sum_{l,m} L_{lm}\, Y_{lm}(\mathbf{n}). \tag{2}$$
Here, $Y_{lm}$ denotes the real-form basis of the spherical harmonic function of degree $l$ and order $m$, with $l \geq 0$ and $-l \leq m \leq l$, and $\mathbf{n}$ is a unit direction vector (e.g., a surface normal).
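To make the lighting model concrete, the short sketch below (ours, not code released with the paper) evaluates the real spherical harmonic basis up to degree 2 and reconstructs $L(\mathbf{n})$ from a coefficient vector $L_{lm}$; the numerical constants are the standard ones tabulated by Ramamoorthi and Hanrahan [49], while the function names and tensor shapes are our own assumptions.

```python
import torch

def sh_basis_deg2(n: torch.Tensor) -> torch.Tensor:
    """Real spherical harmonic basis Y_lm up to degree 2.

    n: (..., 3) unit direction vectors.
    Returns a (..., 9) tensor ordered as
    (l,m) = (0,0), (1,-1), (1,0), (1,1), (2,-2), (2,-1), (2,0), (2,1), (2,2).
    """
    x, y, z = n[..., 0], n[..., 1], n[..., 2]
    return torch.stack([
        0.282095 * torch.ones_like(x),        # Y_00
        0.488603 * y,                          # Y_1,-1
        0.488603 * z,                          # Y_1,0
        0.488603 * x,                          # Y_1,1
        1.092548 * x * y,                      # Y_2,-2
        1.092548 * y * z,                      # Y_2,-1
        0.315392 * (3.0 * z ** 2 - 1.0),       # Y_2,0
        1.092548 * x * z,                      # Y_2,1
        0.546274 * (x ** 2 - y ** 2),          # Y_2,2
    ], dim=-1)

def lighting(L_lm: torch.Tensor, n: torch.Tensor) -> torch.Tensor:
    """Equation (2): L(n) = sum_lm L_lm Y_lm(n) for a degree-2 expansion.

    L_lm: (9,) spherical harmonic coefficients of the environment (one channel).
    n:    (..., 3) unit directions.
    """
    return sh_basis_deg2(n) @ L_lm
```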

3.2. Reflection Model

When a distant light reaches the surface of the human face at 3D position p, some portion of the light is reflected at the boundary, resulting in the specular reflection
$$h(p, \omega_o) = \int_{\Omega} f_s(p, \omega_i, \omega_o)\, C\, L(\omega_i)\, \mathrm{d}\omega_i, \tag{3}$$
and the rest is refracted into the facial skin and exits as diffuse reflection:
$$d(p) = \int_{\Omega} f_d(p, \omega_i, \mathbf{n}_p)\, C\, L(\omega_i)\, (\mathbf{n}_p \cdot \omega_i)\, \mathrm{d}\omega_i. \tag{4}$$
Here, $f_s$ and $f_d$ are the bidirectional reflectance distribution functions (BRDFs) that relate the radiance exiting in direction $\omega_o$ to the incoming light from direction $\omega_i$ for the specular and diffuse reflection, respectively, and $\mathbf{n}_p$ is the surface normal vector at point $p$.

3.2.1. Diffuse Reflection

As mentioned in Section 3.1, we assume that the diffuse reflection of facial skin adheres to the Lambertian model. As a result, the diffuse reflection at point p can be expressed as follows:
$$d(p) = a(p)\, C\, s(\mathbf{n}_p). \tag{5}$$
Here, $a(p)$ is the diffuse albedo of facial skin, which quantifies the fraction of incident light reflected by the skin surface, and $s(\mathbf{n}_p)$ is the geometric shading factor that governs the proportion of diffuse light reflected from the skin surface:
$$s(\mathbf{n}_p) = \int_{\Omega} L(\omega_i)\, (\mathbf{n}_p \cdot \omega_i)\, \mathrm{d}\omega_i. \tag{6}$$
According to [49], the diffuse shading $s(\mathbf{n}_p)$ can likewise be represented by the coefficients $L_{lm}$ of the lighting environment in the spherical harmonic expansion:
$$s(\mathbf{n}_p) = \sum_{l,m} \hat{A}_l\, L_{lm}\, Y_{lm}(\mathbf{n}_p). \tag{7}$$
Here, the analytic formula for $\hat{A}_l$ has been derived in [51].
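Continuing the sketch above, the diffuse shading of Equation (7) can be evaluated with the clamped-cosine coefficients $\hat{A}_0 = \pi$, $\hat{A}_1 = 2\pi/3$, $\hat{A}_2 = \pi/4$ derived in [51]; `sh_basis_deg2` is the helper defined in the previous snippet, and the shapes are our assumptions.

```python
import math
import torch

# Clamped-cosine coefficients \hat{A}_l for l = 0, 1, 2 (derived in [51]),
# repeated over the 9 basis functions of a degree-2 expansion.
A_HAT = torch.tensor([math.pi,
                      2.0 * math.pi / 3.0, 2.0 * math.pi / 3.0, 2.0 * math.pi / 3.0,
                      math.pi / 4.0, math.pi / 4.0, math.pi / 4.0,
                      math.pi / 4.0, math.pi / 4.0])

def diffuse_shading(L_lm: torch.Tensor, normals: torch.Tensor) -> torch.Tensor:
    """Equation (7): s(n_p) = sum_lm A_hat_l L_lm Y_lm(n_p).

    L_lm:    (9,) lighting coefficients (one channel).
    normals: (V, 3) per-vertex unit normals.
    Returns: (V,) shading factor per vertex.
    """
    return sh_basis_deg2(normals) @ (A_HAT * L_lm)
```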

3.2.2. Specular Reflection

To represent the specular reflection of facial skin, we first employ the physically based mirror reflection model as:
$$h(p, \omega_o) = k(p)\, C\, L(\omega_p), \tag{8}$$
where $k$ is the specular coefficient related to the Fresnel reflection coefficients, which in turn depend on the material refractive index [52], and $\omega_p$ is the incident light direction for which the surface normal $\mathbf{n}_p$ at point $p$ is the half-angle direction between $\omega_p$ and the viewing direction $\omega_o$:
$$\omega_p = 2\,\langle \mathbf{n}_p, \omega_o \rangle\, \mathbf{n}_p - \omega_o. \tag{9}$$
The value of $\omega_p$ depends on the viewpoint and results in different specular reflections for different viewpoints, denoted as $h_j$ for the $j$th viewpoint.
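For illustration, a per-viewpoint evaluation of Equations (8) and (9) might look as follows (our sketch; the tensor shapes are assumptions and `lighting` is the degree-2 evaluator sketched in Section 3.1):

```python
import torch
import torch.nn.functional as F

def mirror_specular(k: torch.Tensor, C: torch.Tensor, L_lm: torch.Tensor,
                    normals: torch.Tensor, view_dirs: torch.Tensor) -> torch.Tensor:
    """Equations (8)-(9): h_j(p) = k(p) * C * L(omega_p) for one viewpoint.

    k:         (V,)   per-vertex specular coefficient.
    C:         (3,)   illumination chromaticity.
    L_lm:      (9,)   lighting coefficients (intensity).
    normals:   (V, 3) unit surface normals n_p.
    view_dirs: (V, 3) unit directions omega_o from the surface towards the camera.
    Returns:   (V, 3) per-vertex specular reflection h_j.
    """
    n = F.normalize(normals, dim=-1)
    wo = F.normalize(view_dirs, dim=-1)
    # omega_p = 2 <n_p, omega_o> n_p - omega_o (mirror of the view direction about the normal)
    wp = 2.0 * (n * wo).sum(-1, keepdim=True) * n - wo
    L_wp = lighting(L_lm, wp)                      # (V,)
    return (k * L_wp).unsqueeze(-1) * C            # (V, 3)
```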
Note that the mirror reflection model is sensitive to the accuracy of the face geometry and to the irregularities of the micro-facet structure, which cannot be characterized accurately at the available model precision. Moreover, the face geometry reconstructed from specular-contaminated images is often inaccurate, especially in the high-intensity regions around the nose and between the eyebrows, which is also part of the motivation of our method. In order to remove the dependency on the rough face geometry prior and remove the specular highlights more precisely, we take the specular component estimated under the mirror reflection model as an initialization and further optimize a specular factor in the image space of each viewpoint under the dichromatic reflectance model. Figure 3 shows examples of the specular reflection under the two models.
According to the dichromatic reflection model, the specular component has the same chromaticity C as that of the light source [48]:
$$H_j(x) = g_j(x)\, C. \tag{10}$$
The pixel-wise parameter $g$ plays the role of $k(p)\,L(\omega_p)$ and models the intensity of the specular reflection in each viewpoint. It is determined not only by the specular reflection coefficient but also by the intensity of the light source that causes the specular reflection at pixel $x$.

3.2.3. Rendering

The reflection components can be rendered into each viewpoint,
$$H_j = R(m, \mathbf{n}, c_j, h_j), \qquad D_j = R(m, \mathbf{n}, c_j, d), \tag{11}$$
as can other per-vertex attributes such as albedo, shading, and the specular coefficient, if necessary. Here, $R(\cdot)$ is the traditional rasterizer rendering function that generates a rendered image from 3D properties, including the triangle mesh $m$, mesh normals $\mathbf{n}$, camera parameters $c$, and per-vertex appearance (specular reflection, diffuse reflection, albedo, shading, etc.). A sparse multi-view camera system is used to capture multi-view facial images, from which we obtain the above 3D properties with AgiSoft Metashape. Existing 3D face reconstruction algorithms from multi-view images, such as [53,54,55], can also provide these parameters.
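The rasterizer itself is outside the scope of this paper; as a rough sketch, assuming the rasterization has already been resolved into hypothetical per-pixel triangle indices and barycentric weights (as a differentiable rasterizer could provide), the $R(\cdot)$ of Equation (11) reduces to a per-pixel interpolation of vertex attributes:

```python
import torch

def render_attribute(attr: torch.Tensor, faces: torch.Tensor,
                     face_idx: torch.Tensor, bary: torch.Tensor,
                     mask: torch.Tensor) -> torch.Tensor:
    """Rasterize a per-vertex attribute into image space (the R(.) of Eq. (11)).

    attr:     (V, C)    per-vertex appearance (diffuse, specular, albedo, shading, ...).
    faces:    (F, 3)    vertex indices of each triangle of mesh m.
    face_idx: (H, W)    index of the triangle visible at each pixel (assumed precomputed).
    bary:     (H, W, 3) barycentric weights of each pixel inside its triangle.
    mask:     (H, W)    boolean face-region mask; background pixels are zeroed.
    Returns:  (H, W, C) rendered image for one viewpoint.
    """
    tri = faces[face_idx.clamp(min=0)]                # (H, W, 3) vertex ids per pixel
    per_pixel = attr[tri]                             # (H, W, 3, C) corner attributes
    img = (bary.unsqueeze(-1) * per_pixel).sum(-2)    # barycentric interpolation
    return img * mask.unsqueeze(-1)
```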
Substituting Equations (5), (8) and (11) into Equation (1), we can formulate the multi-view facial images as follows:
$$I_j = D_j + H_j = R(m, \mathbf{n}, c_j, s\,a) + R(m, \mathbf{n}, c_j, h_j). \tag{12}$$
As a result, the highlight removal of multi-view facial images can be transformed into the reflection components separation of the human face in 3D space.
When we optimize the specular factor in the image space for each viewpoint under the dichromatic reflectance model, we can formulate the reflection model as:
$$I_j = D_j + H_j = R(m, \mathbf{n}, c_j, s\,a) + g_j\, C, \tag{13}$$
by substituting Equations (5), (10), and (11) into Equation (1).

4. Multi-View Facial Images Highlight Removal

The objective function of our method for highlight removal of multi-view facial images is:
$$\arg\min\; E_0 + \lambda_{NN}\, E_{NN} + \lambda_{OSP}\, E_{OSP}. \tag{14}$$

4.1. Data Term

The data term $E_0$ measures the difference between the reflectance model and the captured input images $I_j$:
$$E_0^1(L_{lm}, C, a, k) = \sum_{j=1}^{N} \left\| R(m, \mathbf{n}, c_j, s\,a) + R(m, \mathbf{n}, c_j, h_j) - I_j \right\|^2, \tag{15}$$
where the shading factor $s$ is a function of $L_{lm}$ computed according to Equation (7).
When we separate the specular reflection for each viewpoint, the data term becomes:
$$E_0^2(L_{lm}, C, a, g) = \sum_{j=1}^{N} \left\| R(m, \mathbf{n}, c_j, s\,a) + g_j\, C - I_j \right\|^2. \tag{16}$$
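For illustration only, the two data terms could be assembled as below (our sketch; the inputs are assumed to be lists of per-view image tensors produced by the rendering step described above):

```python
import torch

def data_term_mirror(images, rendered_diffuse, rendered_specular):
    """E_0^1 of Eq. (15): photometric error of the mirror-model rendering.

    images, rendered_diffuse, rendered_specular: lists of (H, W, 3) tensors,
    one entry per viewpoint j = 1..N.
    """
    return sum(((d + h - i) ** 2).sum()
               for i, d, h in zip(images, rendered_diffuse, rendered_specular))

def data_term_dichromatic(images, rendered_diffuse, g, C):
    """E_0^2 of Eq. (16): photometric error with the pixel-wise specular factor.

    g: list of (H, W) per-pixel specular intensities g_j; C: (3,) light chromaticity.
    """
    return sum(((d + g_j.unsqueeze(-1) * C - i) ** 2).sum()
               for i, d, g_j in zip(images, rendered_diffuse, g))
```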

4.2. Non-Negative Constraint Term

In this study, we employ several physically meaningful parameters, such as the lighting environment, shading factor, specular coefficient, and albedo, which should be strictly non-negative in all conditions.
As for the parameters in the illumination model, while the shading tends to be positive along the normal directions of the human face, it could become negative in other directions, as could the lighting environment. Thus, we compute the lighting environment and shading factor in all directions:
$$L_0 = \sum_{l,m} L_{lm}\, Y_{lm}(\mathbf{n}_0), \qquad s_0 = \sum_{l,m} \hat{A}_l\, L_{lm}\, Y_{lm}(\mathbf{n}_0). \tag{17}$$
Here, $\mathbf{n}_0$ denotes the mesh normals of a unit sphere.
Together with the albedo and specular coefficient, we define the non-negative constraint (NN) term as follows:
$$E_{NN}^1(L_{lm}, C, a, k) = \|L_0\|^2 + \|s_0\|^2 + \|a\|^2 + \|k\|^2. \tag{18}$$
Similarly, when we separate the specular reflection for each viewpoint, the non-negative constraint term becomes:
$$E_{NN}^2(L_{lm}, C, a, g) = \|L_0\|^2 + \|s_0\|^2 + \|a\|^2 + \|g\|^2. \tag{19}$$
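A possible realization of this constraint is sketched below. Note two assumptions of ours that the text does not spell out: the penalty is applied only to the negative part of each quantity (via a clamp), and the directions $\mathbf{n}_0$ are approximated by Fibonacci sampling of the unit sphere rather than an actual sphere mesh; `lighting` and `diffuse_shading` are the helpers sketched earlier.

```python
import math
import torch

def unit_sphere_normals(num: int = 512) -> torch.Tensor:
    """Roughly uniform unit directions n_0 (Fibonacci sphere sampling, an assumed
    stand-in for the mesh normals of a unit sphere)."""
    i = torch.arange(num, dtype=torch.float32) + 0.5
    phi = math.pi * (3.0 - math.sqrt(5.0)) * i
    z = 1.0 - 2.0 * i / num
    r = torch.sqrt(1.0 - z ** 2)
    return torch.stack([r * torch.cos(phi), r * torch.sin(phi), z], dim=-1)

def negative_part_sq(x: torch.Tensor) -> torch.Tensor:
    # Penalize only values below zero; non-negative values incur no cost.
    return torch.clamp(x, max=0.0).pow(2).sum()

def nn_term(L_lm, albedo, spec, n0):
    """Soft non-negativity penalty on lighting, shading, albedo, and the
    specular parameter (k in Eq. (18) or g in Eq. (19))."""
    L0 = lighting(L_lm, n0)            # Eq. (17): lighting in all directions
    s0 = diffuse_shading(L_lm, n0)     # Eq. (17): shading in all directions
    return (negative_part_sq(L0) + negative_part_sq(s0)
            + negative_part_sq(albedo) + negative_part_sq(spec))
```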

4.3. Orthogonal Subspace Projection Term

By adopting orthogonal subspace projection (OSP) [56], the radiance of a specular-contaminated image can be projected onto two orthogonal subspaces: one parallel and the other orthogonal to the light chromaticity $C$. Based on the theory of matrix projections, we can design the orthogonal projector:
$$P = E - \frac{C\,C^{T}}{C^{T} C}, \tag{20}$$
where $E$ is the identity matrix. Substituting Equation (8) or Equation (10) into Equation (1) and multiplying both sides by the projector $P$ yields:
$$P I_j = P D_j + P H_j = P D_j. \tag{21}$$
As a result, the specular component is removed, while the diffuse component is preserved at the cost of losing one dimension of information:
$$E_{OSP}(L_{lm}, C, a) = \left\| P I_j - P D_j \right\|^2 = \left\| P I_j - P\, R(m, \mathbf{n}, c_j, s\,a \times C) \right\|^2. \tag{22}$$
This term measures the difference between the reflectance model and the captured input images $I_j$ under orthogonal subspace projection, which constrains the illumination chromaticity.
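A compact sketch of the projector of Equation (20) and the resulting term follows (ours; the per-view loop and tensor shapes are assumptions):

```python
import torch

def osp_projector(C: torch.Tensor) -> torch.Tensor:
    """Eq. (20): projector onto the subspace orthogonal to the light chromaticity C."""
    C = C.reshape(3, 1)
    return torch.eye(3) - (C @ C.T) / (C.T @ C)

def osp_term(images, rendered_diffuse, C):
    """Eq. (22): difference between projected inputs and projected diffuse renderings;
    the specular component is annihilated by P as in Eq. (21)."""
    P = osp_projector(C)
    loss = 0.0
    for img, dif in zip(images, rendered_diffuse):
        # Apply P to every pixel's RGB vector: (H, W, 3) times P^T.
        loss = loss + ((img @ P.T - dif @ P.T) ** 2).sum()
    return loss
```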

4.4. Optimization

We use PyTorch to implement the model and minimize the objective function of Equation (14) sequentially with the Adam optimizer [47]. The initial learning rate is $1 \times 10^{-2}$, and the number of training epochs is 20. The optimization is iterated until the change in the objective energy falls below a threshold of $1 \times 10^{-3}$. We set the regularization weights $\lambda_{NN}$ and $\lambda_{OSP}$ to 10 and 1, respectively, in our experiments.
We initialize the illumination as white ambient light, where $C$ equals $(1/\pi, 1/\pi, 1/\pi)$ and the spherical harmonic coefficients are all zero except $L_{00} = 1$, i.e., ambient lighting only. The albedo $a$ is initialized by the color of the 3D face model divided by $C$, and the specular coefficients $k$ are all zero.
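This initialization could be written roughly as follows (our sketch; the per-vertex color source and the shapes are assumptions):

```python
import torch

def initialize_parameters(vertex_colors: torch.Tensor):
    """Initialization described above: white ambient light, albedo from the
    3D face model color, zero specular coefficients.

    vertex_colors: (V, 3) per-vertex colors of the reconstructed face model.
    """
    C = torch.full((3,), 1.0 / torch.pi)        # white illumination chromaticity
    L_lm = torch.zeros(9)
    L_lm[0] = 1.0                                # ambient lighting only (L_00 = 1)
    albedo = vertex_colors / C                   # per-vertex albedo a
    k = torch.zeros(vertex_colors.shape[0])      # specular coefficients
    for t in (L_lm, C, albedo, k):
        t.requires_grad_(True)                   # make them optimizable leaves
    return L_lm, C, albedo, k
```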
We first estimate the parameters L l m , C , a , k based on the mirror reflection model by minimizing:
$$\underset{L_{lm},\, C,\, a,\, k}{\arg\min}\; E_0^1 + \lambda_{NN}\, E_{NN}^1 + \lambda_{OSP}\, E_{OSP}. \tag{23}$$
After that, we render the result of $k$ into each viewpoint by Equation (11) and take the rendered images $\{K_j\}_{j=1}^{N}$ as the initialization of the pixel-wise parameter $g$. The specular component is further optimized by minimizing:
$$\underset{L_{lm},\, C,\, a,\, g}{\arg\min}\; E_0^2 + \lambda_{NN}\, E_{NN}^2 + \lambda_{OSP}\, E_{OSP}. \tag{24}$$
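Putting the pieces together, the two-stage minimization could look roughly like this (our sketch; the loss callables stand for the hypothetical term functions above, and the stopping rule simply mirrors the learning rate, epoch count, and threshold stated in the text):

```python
import torch

def optimize(params_stage1, params_stage2, loss_stage1, loss_stage2,
             lr=1e-2, epochs=20, tol=1e-3):
    """Two-stage minimization of Eq. (23) and then Eq. (24) with Adam.

    params_stage*: lists of leaf tensors with requires_grad=True
                   ({L_lm, C, a, k}, then {L_lm, C, a, g}).
    loss_stage*:   callables returning the scalar objective of each stage.
    """
    for params, loss_fn in ((params_stage1, loss_stage1),
                            (params_stage2, loss_stage2)):
        opt = torch.optim.Adam(params, lr=lr)
        prev = None
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn()
            loss.backward()
            opt.step()
            # Stop early once the objective changes by less than the threshold.
            if prev is not None and abs(prev - loss.item()) < tol:
                break
            prev = loss.item()
```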

5. Results

In this section, experiments are performed to evaluate the proposed method. First, we introduce the laboratory-captured dataset and the FaceScape dataset [57] used in the experiments. Then, we test our method and present quantitative results and visual comparisons against previous methods. After that, we report ablation studies. Lastly, we evaluate the improvement our method brings to several applications.

5.1. Dataset

5.1.1. Laboratory Captured Dataset with Ground Truth

The input multi-view facial images are captured with a sparse multi-view camera system consisting of a set of metal brackets, 12 DSLR cameras (Canon EOS 80D), and photography lighting. Four arc-shaped column brackets are used to mount the cameras. In the center of the arc, we place a forehead bracket accessory (equipment from a computerized refractometer) to fix the position of the human face, so that a stable and high-precision facial skin appearance can be captured.
We use a classic 24-patch color chart (X-Rite ColorChecker) to radiometrically calibrate the camera system so that image intensity is linearly related to image radiance. Polarizers are placed in front of the cameras and the light source, and a non-metallic dielectric sphere is used to calibrate the polarizers. We then switch between cross and parallel polarization to filter out the highlights while capturing, so that pairs of real facial photos with and without highlights can be collected. As a result, ground-truth highlight removal results are obtained through cross-polarization for each viewpoint.
AgiSoft Metashape, developed by AgiSoft LLC, is used to generate the initial 3D properties, including the triangle mesh $m$, mesh normals $\mathbf{n}$, and camera parameters $c$. Existing multi-view 3D face reconstruction algorithms [53,54,55] can also provide these 3D properties.
We recruited 20 Asian participants (17 male, 3 female) and captured 20 × 12 linear raw images with an effective resolution of 6024 × 4020. Taking advantage of the 3D properties, we resize and crop the original image data with an 800 × 400 window that preserves the facial region. For data augmentation, we apply horizontal image flips. All participants fully read and understood the informed consent form and signed a portrait rights authorization agreement. All data are used only for non-profit academic research.

5.1.2. FaceScape Dataset

To qualitatively analyze the performance of our approach, we employ the FaceScape dataset [57], an open-source 3D face dataset consisting of multi-view facial images, camera parameters, and 3D face models with texture maps. We resize and crop the image data using the same strategy as for the laboratory-captured dataset.

5.2. Experiments

5.2.1. Comparisons

Quantitative comparisons of multi-view highlight removal on the Laboratory Captured Dataset are presented in Table 1. We compare the specular highlight removal results of our method using either the mirror reflection model (M.) or the dichromatic reflection model (D.) against three existing techniques [29,58,59]. To cover both absolute intensity errors and structural similarity, we use the root-mean-square error (RMSE; smaller is better, denoted ↓) and the structural similarity (SSIM) [60] (larger is better, denoted ↑) as error measures. Note that the RMSE is rescaled by 100 for better readability, and the best results are shown in bold. According to Table 1, our method obtains highlight removal images closer to the ground truth than the other methods. Our method under the mirror reflection model already obtains good results with a single set of parameters shared across viewpoints, while the method under the dichromatic model is optimized pixel-wise in each viewpoint to obtain more accurate results.
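For reference, the two error measures can be computed per view as in the following sketch (ours; it assumes float images in [0, 1] and uses scikit-image for SSIM):

```python
import numpy as np
from skimage.metrics import structural_similarity

def rmse_x100(pred: np.ndarray, gt: np.ndarray) -> float:
    """Root-mean-square error, rescaled by 100 as in Table 1."""
    return 100.0 * float(np.sqrt(np.mean((pred - gt) ** 2)))

def ssim(pred: np.ndarray, gt: np.ndarray) -> float:
    """Structural similarity [60] on color images in [0, 1]."""
    return float(structural_similarity(pred, gt, channel_axis=-1, data_range=1.0))
```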
To explore the consistency of the results from different viewpoints, we calculate the standard deviation (SD) and relative standard deviation (RSD) of our method and of the three existing techniques [29,58,59], as shown in Table 2. Note that the SD of the RMSE is also rescaled by 100 for better readability. Regarding robustness to changes of viewpoint, our methods show a clear improvement over single-image methods. It is worth noting that, under the mirror reflection model, we use the same parameters for all viewpoints to ensure the consistency of the results, and this consistency is well maintained after switching to the dichromatic reflection model.
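These consistency statistics can be computed per method from the per-view metric values, e.g. (our sketch; RSD is taken as the standard deviation divided by the mean, expressed in percent, which is consistent with the values reported in Table 2):

```python
import numpy as np

def sd_rsd(per_view_values: np.ndarray):
    """Consistency of a per-view metric (e.g., RMSE or SSIM over the 12 views):
    standard deviation and relative standard deviation (SD / mean, in percent)."""
    sd = float(per_view_values.std())
    rsd = 100.0 * sd / float(per_view_values.mean())
    return sd, rsd
```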
Figures 4 and 5 qualitatively evaluate the performance of our method on the laboratory-captured and FaceScape datasets. As shown in the first four rows of Figure 4 and the last four rows of Figure 5, the results of [29,58,59] are not consistent across viewing angles since they rely on a single input image, and specular reflections remain in their results, especially around the cheeks and forehead. Benefiting from the multi-view information, we can process facial images with large-area, high-intensity specular highlights and obtain results that are robust to changes in viewing angle. As shown in Figure 4c,d and Figure 5b,c, our results based on the mirror reflection model show a significant improvement, and the results based on the dichromatic reflection model improve further. By making use of pixel-wise optimization, our method based on the dichromatic reflection model can separate the specular highlights as much as possible while preserving the view-independent diffuse reflection initialized by the mirror reflection model, offsetting the effects caused by inaccuracies in the face geometry and irregularities of the micro-facet structure.

5.2.2. Ablation Studies

Our method has two optimization steps, and the objective function of each step consists of a data term, a non-negative constraint term, and an orthogonal subspace projection term. These terms have similar but distinct formulas and share the same weights across the optimization stages. In the initial optimization step, view-independent parameters help the optimization obtain a consistent diffuse reflection. The pixel-wise optimization step helps offset the effects caused by the inaccuracy of the face geometry and the irregularities of the micro-facet structure. The orthogonal subspace projection term constrains the chromaticity of the light, while the non-negative constraint term guides the optimization toward physically stable properties. The weights of these terms are adjusted according to our experiments.
Table 3 and Figure 6 present the results of the ablation studies. The complete loss function leads to the best results. As for the regularization weights, a small weight for the orthogonal subspace projection term degrades the estimated light chromaticity, and if the weight for the non-negative constraint term is too small, many areas are filled with dark colors.
Since we use multi-view images, an experiment on the number of input images is necessary. Because the number of viewpoint combinations is too large, we experiment on one set of faces whose error is above the median. Table 4 and Figure 7 report the mean RMSE (R.) and SSIM (S.) of our methods under different numbers of viewpoints. As the number of viewpoints increases, our method under the mirror reflection model improves slightly, while the method under the dichromatic reflection model remains essentially unchanged. It is worth mentioning that our method performs better than the other methods even when only two viewpoints are used.

5.2.3. Applications

Since the motivation of our method is to facilitate multi-view tasks, we demonstrate face reconstruction using the result of our method, as shown in Figure 8. The face geometry reconstructed from images heavily affected by specular highlights is shown in Figure 8b, which is also the rough face geometry prior used in our method. The face model reconstructed from the highlight-removed images of our method, shown in Figure 8c, exhibits a significant improvement, especially in the high-intensity regions around the nose and between the eyebrows. Note that the improved reconstruction can, in turn, further facilitate our highlight removal method for multi-view images.
Moreover, we demonstrate the potential of our method under the mirror reflection model for synthesizing novel views from a sparse set of input viewpoints. The experimental results with different numbers of input viewpoints are shown in Figure 9, which produces photorealistic renderings at new viewpoints.

6. Conclusions

This paper presents a lightweight optimization-based highlight removal method for multi-view facial images. The proposed method can facilitate many applications, such as 3D face reconstruction from multi-view images, relighting, and AR effects on face images. A particular problem for multi-view facial tasks such as 3D face reconstruction, which we address here, is that the presence of highlights makes the color of the same point inconsistent between images from different viewpoints. We proposed a lightweight optimization-based framework for multi-view facial images to solve this problem. We took the view-independent diffuse results obtained under the mirror reflection model as initialization and performed pixel-wise optimization in each viewpoint using non-negative constraints and orthogonal subspace projection.
In our experiments, we showed that our method can separate the specular highlights as much as possible while preserving the view-independent diffuse reflection and offsetting the effects caused by the inaccuracy of the face geometry and the irregularities of the micro-facet structure. We captured a dataset with ground truth for the highlight removal of multi-view facial images to evaluate quantitative performance, and a qualitative evaluation on the publicly available FaceScape dataset was also performed. Finally, we showed robustness with respect to the number of input images and improvements on the reconstruction task.
Future research could combine our method with specific facial problems for joint optimization, such as reconstructing the face model from the highlight removal results and, in turn, improving the highlight removal results with the refined geometry. In addition, collecting more facial image datasets, especially with ground truth, is essential for obtaining better results.

Author Contributions

Conceptualization, T.S., Y.Z. and Y.Y.; methodology, T.S.; software, T.S.; validation, T.S.; formal analysis, T.S.; investigation, T.S.; resources, T.S.; data curation, T.S.; writing—original draft preparation, T.S.; writing—review and editing, Y.Z., Y.Y. and S.D.; visualization, T.S.; supervision, Y.Z., Y.Y. and S.D.; project administration, T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We acknowledge the members of the Visual Sensing and Graphic Lab, Nanjing University for cooperating in capturing the multi-view facial image dataset and giving permission for academic research and publication in journals.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Artusi, A.; Banterle, F.; Chetverikov, D. A survey of specularity removal methods. Comput. Graph. Forum 2011, 30, 2208–2230. [Google Scholar] [CrossRef]
  2. Guo, Y.; Zhang, J.; Cai, J.; Jiang, B.; Zheng, J. CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 1294–1307. [Google Scholar] [CrossRef] [PubMed]
  3. Wang, M.; Deng, W. Deep face recognition: A survey. Neurocomputing 2021, 429, 215–244. [Google Scholar] [CrossRef]
  4. Adjabi, I.; Ouahabi, A.; Benzaoui, A.; Taleb-Ahmed, A. Past, present, and future of face recognition: A review. Electronics 2020, 9, 1188. [Google Scholar] [CrossRef]
  5. Minaee, S.; Liang, X.; Yan, S. Modern Augmented Reality: Applications, Trends, and Future Directions. arXiv 2022, arXiv:2202.09450. [Google Scholar]
  6. Jachnik, J.; Newcombe, R.A.; Davison, A.J. Real-time surface light-field capture for augmentation of planar specular surfaces. In Proceedings of the 2012 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Atlanta, GA, USA, 5–8 November 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 91–97. [Google Scholar]
  7. Nestmeyer, T.; Lalonde, J.F.; Matthews, I.; Lehrmann, A. Learning physics-guided face relighting under directional light. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5124–5133. [Google Scholar]
  8. Innamorati, C.; Ritschel, T.; Weyrich, T.; Mitra, N.J. Decomposing single images for layered photo retouching. Comput. Graph. Forum 2017, 36, 15–25. [Google Scholar] [CrossRef]
  9. Shu, Z.; Hadap, S.; Shechtman, E.; Sunkavalli, K.; Paris, S.; Samaras, D. Portrait lighting transfer using a mass transport approach. ACM Trans. Graph. 2017, 36, 1. [Google Scholar] [CrossRef]
  10. Yi, R.; Zhu, C.; Tan, P.; Lin, S. Faces as lighting probes via unsupervised deep highlight extraction. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 317–333. [Google Scholar]
  11. Li, C.; Zhou, K.; Wu, H.T.; Lin, S. Physically-based simulation of cosmetics via intrinsic image decomposition with facial priors. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1455–1469. [Google Scholar] [CrossRef]
  12. Sun, T.; Barron, J.T.; Tsai, Y.T.; Xu, Z.; Yu, X.; Fyffe, G.; Rhemann, C.; Busch, J.; Debevec, P.E.; Ramamoorthi, R. Single image portrait relighting. ACM Trans. Graph. 2019, 38, 79. [Google Scholar] [CrossRef]
  13. Song, G.; Zheng, J.; Cai, J.; Cham, T.J. Recovering facial reflectance and geometry from multi-view images. Image Vis. Comput. 2020, 96, 103897. [Google Scholar] [CrossRef]
  14. Li, C.; Zhou, K.; Lin, S. Intrinsic face image decomposition with human face priors. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 218–233. [Google Scholar]
  15. Li, C.; Lin, S.; Zhou, K.; Ikeuchi, K. Specular highlight removal in facial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3107–3116. [Google Scholar]
  16. Zhu, T.; Xia, S.; Bian, Z.; Lu, C. Highlight Removal in Facial Images. In Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision (PRCV), Nanjing, China, 16–18 October 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 422–433. [Google Scholar]
  17. Khan, H.A.; Thomas, J.B.; Hardeberg, J.Y. Analytical survey of highlight detection in color and spectral images. In Proceedings of the International Workshop on Computational Color Imaging, Milano, Italy, 29–31 March 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 197–208. [Google Scholar]
  18. Klinker, G.J.; Shafer, S.A.; Kanade, T. Using a color reflection model to separate highlights from object color. In Proceedings of the International Conference on Computer Vision (ICCV), London, UK, 8–11 June 1987; Volume 87, pp. 145–150. [Google Scholar]
  19. Klinker, G.J.; Shafer, S.A.; Kanade, T. The measurement of highlights in color images. Int. J. Comput. Vis. 1988, 2, 7–32. [Google Scholar] [CrossRef]
  20. Klinker, G.J.; Shafer, S.A.; Kanade, T. A physical approach to color image understanding. Int. J. Comput. Vis. 1990, 4, 7–38. [Google Scholar] [CrossRef]
  21. Bajcsy, R.; Lee, S.W.; Leonardis, A. Detection of diffuse and specular interface reflections and inter-reflections by color image segmentation. Int. J. Comput. Vis. 1996, 17, 241–272. [Google Scholar] [CrossRef]
  22. Schlüns, K.; Teschner, M. Analysis of 2d color spaces for highlight elimination in 3d shape reconstruction. In Proceedings of the Proc. ACCV, Singapore, 5–8 December 1995; Citeseer: Princeton, NJ, USA, 1995; Volume 2, pp. 801–805. [Google Scholar]
  23. Schlüns, K.; Teschner, M. Fast separation of reflection components and its application in 3D shape recovery. In Proceedings of the Color and Imaging Conference, Scottsdale, AZ, USA, 7–10 November 1995; Society for Imaging Science and Technology: Springfield, VI, USA, 1995; Volume 1995, pp. 48–51. [Google Scholar]
  24. Mallick, S.P.; Zickler, T.; Belhumeur, P.N.; Kriegman, D.J. Specularity removal in images and videos: A PDE approach. In Proceedings of the European Conference on Computer Vision, Graz, Austria, 7–13 May 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 550–563. [Google Scholar]
  25. Tan, P.; Lin, S.; Quan, L.; Shum, H.Y. Highlight removal by illumination-constrained inpainting. In Proceedings of the Ninth IEEE International Conference on Computer Vision, Nice, France, 13–16 October 2003; Volume 1, pp. 164–169. [Google Scholar] [CrossRef]
  26. Tan, P.; Quan, L.; Lin, S. Separation of highlight reflections on textured surfaces. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; IEEE: Piscataway, NJ, USA, 2006; Volume 2, pp. 1855–1860. [Google Scholar]
  27. Tan, R.T.; Ikeuchi, K. Separating reflection components of textured surfaces using a single image. In Digitally Archiving Cultural Objects; Springer: Berlin/Heidelberg, Germany, 2008; pp. 353–384. [Google Scholar]
  28. Kim, H.; Jin, H.; Hadap, S.; Kweon, I. Specular reflection separation using dark channel prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 1460–1467. [Google Scholar]
  29. Yang, Q.; Wang, S.; Ahuja, N. Real-time specular highlight removal using bilateral filtering. In Proceedings of the European Conference on Computer Vision, Crete, Greece, 5–11 September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 87–100. [Google Scholar]
  30. Yang, Q.; Tang, J.; Ahuja, N. Efficient and robust specular highlight removal. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 1304–1311. [Google Scholar] [CrossRef] [PubMed]
  31. Suo, J.; An, D.; Ji, X.; Wang, H.; Dai, Q. Fast and high quality highlight removal from a single image. IEEE Trans. Image Process. 2016, 25, 5441–5454. [Google Scholar] [CrossRef]
  32. Funke, I.; Bodenstedt, S.; Riediger, C.; Weitz, J.; Speidel, S. Generative adversarial networks for specular highlight removal in endoscopic images. In Proceedings of the Medical Imaging 2018: Image-Guided Procedures, Robotic Interventions, and Modeling, Houston, TX, USA, 12–15 February 2018; SPIE: Bellingham, DC, USA, 2018; Volume 10576, pp. 8–16. [Google Scholar]
  33. Lin, J.; Amine Seddik, M.E.; Tamaazousti, M.; Tamaazousti, Y.; Bartoli, A. Deep multi-class adversarial specularity removal. In Proceedings of the Scandinavian Conference on Image Analysis, Norrkoping, Sweden, 11–13 June 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 3–15. [Google Scholar]
  34. Wu, Z.; Zhuang, C.; Shi, J.; Guo, J.; Xiao, J.; Zhang, X.; Yan, D.M. Single-image specular highlight removal via real-world dataset construction. IEEE Trans. Multimed. 2021, 24, 3782–3793. [Google Scholar] [CrossRef]
  35. Wolff, L.B. Polarization-based material classification from specular reflection. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 1059–1071. [Google Scholar] [CrossRef]
  36. Nayar, S.K.; Fang, X.S.; Boult, T. Separation of reflection components using color and polarization. Int. J. Comput. Vis. 1997, 21, 163–186. [Google Scholar] [CrossRef]
  37. Kim, D.W.; Lin, S.; Hong, K.S.; Shum, H.Y. Variational Specular Separation Using Color and Polarization. In Proceedings of the MVA, Nara, Japan, 11–13 December 2002; pp. 176–179. [Google Scholar]
  38. Umeyama, S.; Godin, G. Separation of diffuse and specular components of surface reflection by use of polarization and statistical analysis of images. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 639–647. [Google Scholar] [CrossRef]
  39. Lamond, B.; Peers, P.; Debevec, P.E. Fast image-based separation of diffuse and specular reflections. SIGGRAPH Sketches 2007, 6. Available online: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.227.9256&rep=rep1&type=pdf (accessed on 7 August 2022).
  40. Zhang, L.; Hancock, E.R.; Atkinson, G.A. Reflection component separation using statistical analysis and polarisation. In Proceedings of the Iberian Conference on Pattern Recognition and Image Analysis, Las Palmas de Gran Canaria, Spain, 8–10 June 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 476–483. [Google Scholar]
  41. Lee, S.W.; Bajcsy, R. Detection of specularity using color and multiple views. In Proceedings of the European Conference on Computer Vision, Santa Margherita Ligure, Italy, 19–22 May 1992; Springer: Berlin/Heidelberg, Germany, 1992; pp. 99–114. [Google Scholar]
  42. Sato, Y.; Ikeuchi, K. Temporal-color space analysis of reflection. JOSA A 1994, 11, 2990–3002. [Google Scholar] [CrossRef]
  43. Lin, S.; Shum, H.Y. Separation of diffuse and specular reflection in color images. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, 8–14 December 2001; CVPR 2001. IEEE: Piscataway, NJ, USA, 2001; Volume 1, p. I. [Google Scholar]
  44. Lin, S.; Li, Y.; Kang, S.B.; Tong, X.; Shum, H.Y. Diffuse-specular separation and depth recovery from image sequences. In Proceedings of the European Conference on Computer Vision, Copenhagen, Denmark, 28–31 May 2002; Springer: Berlin/Heidelberg, Germany, 2002; pp. 210–224. [Google Scholar]
  45. Prinet, V.; Werman, M.; Lischinski, D. Specular highlight enhancement from video sequences. In Proceedings of the 2013 IEEE International Conference on Image Processing, Melbourne, Australia, 15–18 September 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 558–562. [Google Scholar]
  46. Wang, C.; Kamata, S.i.; Ma, L. A fast multi-view based specular removal approach for pill extraction. In Proceedings of the 2013 IEEE International Conference on Image Processing, Melbourne, Australia, 15–18 September 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 4126–4130. [Google Scholar]
  47. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015. Conference Track Proceedings. [Google Scholar]
  48. Shafer, S.A. Using color to separate reflection components. Color Res. Appl. 1985, 10, 210–218. [Google Scholar] [CrossRef]
  49. Ramamoorthi, R.; Hanrahan, P. An efficient representation for irradiance environment maps. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, Los Angeles, CA, USA, 12–17 August 2001; pp. 497–500. [Google Scholar]
  50. Ramamoorthi, R.; Hanrahan, P. A signal-processing framework for inverse rendering. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, Los Angeles, CA, USA, 12–17 August 2001; pp. 117–128. [Google Scholar]
  51. Ramamoorthi, R.; Hanrahan, P. On the relationship between radiance and irradiance: Determining the illumination from images of a convex Lambertian object. JOSA A 2001, 18, 2448–2459. [Google Scholar] [CrossRef]
  52. Robles-Kelly, A.; Huynh, C.P. Imaging Spectroscopy for Scene Analysis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
  53. Wu, F.; Bao, L.; Chen, Y.; Ling, Y.; Song, Y.; Li, S.; Ngan, K.N.; Liu, W. Mvf-net: Multi-view 3d face morphable model regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 959–968. [Google Scholar]
  54. Bai, Z.; Cui, Z.; Rahim, J.A.; Liu, X.; Tan, P. Deep facial non-rigid multi-view stereo. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5850–5860. [Google Scholar]
  55. Shang, J.; Shen, T.; Li, S.; Zhou, L.; Zhen, M.; Fang, T.; Quan, L. Self-supervised monocular 3d face reconstruction by occlusion-aware multi-view geometry consistency. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XV 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 53–70. [Google Scholar]
  56. Fu, Z.; Tan, R.T.; Caelli, T. Specular free spectral imaging using orthogonal subspace projection. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR’06), Hong Kong, China, 20–24 August 2006; IEEE: Piscataway, NJ, USA, 2006; Volume 1, pp. 812–815. [Google Scholar]
  57. Yang, H.; Zhu, H.; Wang, Y.; Huang, M.; Shen, Q.; Yang, R.; Cao, X. Facescape: A large-scale high quality 3d face dataset and detailed riggable 3d face prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 601–610. [Google Scholar]
  58. Yamamoto, T.; Nakazawa, A. General improvement method of specular component separation using high-emphasis filter and similarity function. ITE Trans. Media Technol. Appl. 2019, 7, 92–102. [Google Scholar] [CrossRef]
  59. Shen, H.L.; Zheng, Z.H. Real-time highlight removal using intensity ratio. Appl. Opt. 2013, 52, 4483–4493. [Google Scholar] [CrossRef]
  60. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef]
Figure 1. Examples of highlight removal of multi-view facial images. The first column presents facial images with highlights. The second and third columns show the corresponding highlight removal and specular images obtained through cross-polarization. Note that the images of the first two columns are not calibrated radiometrically for better visualization.
Figure 2. The schematic diagram of the proposed method. The left column shows the input multi-view images and the prior face geometry, which can also be obtained roughly from the specular-contaminated input images. We first decompose the reflection components in 3D space under the Lambertian and mirror reflection models, respectively, and then render the result into 2D image space. The specular component is further pixel-wise optimized under the dichromatic reflection model in 2D image space. Parameters are estimated by minimizing the objective function using the Adam optimization algorithm [47]. The change in the objective function determines the convergence criterion.
Figure 3. Examples of specular component. (a) Ground truth. Results by our method under (b) dichromatic model and (c) mirror reflection model.
Figure 4. Qualitative evaluation of highlight removal on the Laboratory Captured Dataset. (a) Input images. (b) Ground truth diffuse and specular components obtained by cross-polarization. (c–g) Separated diffuse and specular components by (c) our method based on the dichromatic reflection model, (d) our method based on the mirror reflection model, (e) [58], (f) [59], and (g) [29]. The specular component is rescaled by a factor of two for better visualization.
Figure 5. Qualitative evaluation of highlight removal on the FaceScape Dataset. (a) Input images. (b–f) Separated diffuse and specular components by (b) our method based on the dichromatic reflection model, (c) our method based on the mirror reflection model, (d) [58], (e) [59], and (f) [29]. The specular component is rescaled by a factor of two for better visualization.
Figure 6. Ablation studies of the proposed multi-view highlight removal on the Laboratory Captured Dataset. (a) Input image. (b) Ground truth. (c) Ours with the full loss. (d) Ours with only the initialization by the mirror reflection model. (e) Ours without initialization by the mirror reflection model. Panels (c–e) are set with λNN = 10 and λOSP = 1. (f) Ours with λOSP = 0.1. (g) Ours with λNN = 1. The specular component is rescaled by a factor of two for better visualization.
Figure 7. Comparison of experimental results on the number of viewpoints.
Figure 8. Improvement in the 3D face reconstruction. (a) A sample of input images. (b) Face reconstruction results obtained by highlight-contaminated facial images. (c) Face reconstruction results obtained by the highlight removal results of our method.
Figure 9. Experiment of synthesizing novel views. (a) Ground truth. (b) Synthesizing results using two input viewpoints. (c) Synthesizing results using eleven input viewpoints.
Table 1. Quantitative evaluation of multi-view highlight removal on the Laboratory Captured Dataset.

Method       RMSE Mean ↓   RMSE Median ↓   SSIM Mean ↑   SSIM Median ↑
[29]         4.4932        4.4276          0.9287        0.9323
[59]         5.9070        5.5094          0.8953        0.9294
[58]         5.8736        5.4703          0.8961        0.9304
Ours (M.)    2.0048        1.6245          0.9788        0.9808
Ours (D.)    1.7220        1.5618          0.9800        0.9823
Table 2. Standard deviation and relative standard deviation of multi-view highlight removal.

Method       RMSE SD ↓   RMSE RSD ↓   SSIM SD ↓   SSIM RSD ↓
[29]         0.0668      1.4864       1.6285      1.7534
[59]         0.3230      5.4681       8.7117      9.7304
[58]         0.3239      5.5141       8.7467      9.7611
Ours (M.)    0.0404      2.0133       0.6412      0.6551
Ours (D.)    0.0202      1.1722       0.5962      0.6063
Table 3. Ablation studies of multi-view highlight removal on the Laboratory Captured Dataset.

                            Mean RMSE   Median RMSE   Mean SSIM   Median SSIM
Full loss                   1.7220      1.5618        0.9800      0.9823
Only M. initialization      2.0048      1.6245        0.9788      0.9808
Without M. initialization   2.2306      2.1598        0.9652      0.9671
Without OSP term            1.8530      1.8179        0.9801      0.9805
Without NN term             1.8986      1.8587        0.9781      0.9785
Table 4. Quantitative evaluation of the experiment on the number of viewpoints on the Laboratory Captured Dataset.

Number of Viewpoints   Ours (D.) RMSE   Ours (D.) SSIM   Ours (M.) RMSE   Ours (M.) SSIM
2                      2.0061           0.9730           3.3379           0.9499
3                      1.9013           0.9740           3.2522           0.9505
4                      1.8797           0.9733           3.2629           0.9397
5                      1.8168           0.9740           3.1570           0.9446
6                      2.0133           0.9693           3.0350           0.9462
7                      1.9738           0.9694           2.6980           0.9511
8                      1.9635           0.9695           2.9307           0.9486
9                      1.9703           0.9696           2.7753           0.9590
10                     2.0406           0.9678           2.9241           0.9527
11                     2.0354           0.9681           2.9136           0.9519
12                     2.0409           0.9675           2.7964           0.9606
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
