3. Proposed Method
The main idea of the 2D/3D approach is to represent a 3D model by a set of characteristic views. The 3D object is scaled with respect to its bounding sphere, and principal component analysis (PCA) is applied to normalize the pose of the model by estimating the principal axes of the 3D object, which are used to determine its orientation [21].
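The normalization step can be sketched as follows (a minimal sketch assuming the model is given as a point cloud; `normalize_pose` is an illustrative name, not from the paper):

```python
import numpy as np

def normalize_pose(points):
    """Normalize a 3D point cloud: center it, scale it to its bounding
    sphere, and align it with its principal axes via PCA."""
    pts = np.asarray(points, dtype=float)
    pts = pts - pts.mean(axis=0)
    # Scale with respect to the bounding sphere (max distance to the centroid).
    radius = np.linalg.norm(pts, axis=1).max()
    pts = pts / radius
    # PCA: eigenvectors of the covariance matrix give the principal axes.
    eigvals, eigvecs = np.linalg.eigh(np.cov(pts.T))
    order = np.argsort(eigvals)[::-1]      # sort axes by decreasing variance
    return pts @ eigvecs[:, order]         # rotate into the canonical pose
```

After this step, every model fits in the unit sphere and its axes of largest variance are aligned with the coordinate axes, so views taken from fixed directions are comparable across models.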
Let V be the space of views of the 3D model, discretized to M viewpoints distributed around the model, and let d be the distance metric of the space. In order to select optimum views, we propose to use pivot selection techniques from proximity search in metric spaces: the set of pivots selected from the space V of views will constitute the optimum views of the 3D object treated. We propose to select the optimum views using the incremental algorithm described in [8], one of the most suitable strategies for real-world metric spaces [22], in which the authors suggest an efficiency criterion that compares two sets of pivots and designates the better of the two.
Let {v_{1}, v_{2}, …, v_{k}} be a set of k pivots, with v_{i} $\in $ V. These pivots define a space ${\rm P}$ of tuples of distances between the pivots and the views of V. The mapping of a view u $\in $ V to ${\rm P}$, which will be denoted [u], is carried out as [u] = (d(u,v_{1}), d(u,v_{2}), …, d(u,v_{k})). Defining the metric D_{{v1, v2, …, vk}}([x],[y]) = max_{{1 ≤ i ≤ k}}|d(x,v_{i}) − d(y,v_{i})|, it follows that $\left({\rm P},D\right)$ is a metric space, which turns out to be (${R}^{k}$, L_{∞}).
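Under the assumption that views are represented as feature vectors compared with the Euclidean distance (an illustrative choice, not fixed by the paper), the mapping [u] and the metric D can be sketched as:

```python
import numpy as np

def project(u, pivots, d):
    """Map a view u to P: [u] = (d(u, v1), ..., d(u, vk))."""
    return np.array([d(u, v) for v in pivots])

def D(x_proj, y_proj):
    """L-infinity distance in the pivot space:
    max_i |d(x, v_i) - d(y, v_i)|."""
    return float(np.max(np.abs(x_proj - y_proj)))

# Hypothetical setup: views as feature vectors, d the Euclidean distance.
d = lambda a, b: float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
```

By the triangle inequality, D([x],[y]) ≤ d(x,y) for any pivot set, which is what makes D usable as a cheap lower bound in proximity search.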
Let μ_{V} be the mean of the distribution of D over V. Given two sets of pivots P_{1} = {p_{1}, p_{2}, …, p_{k}} and P_{2} = {p_{1}′, p_{2}′, …, p_{k}′}, the efficiency criterion of [8] designates P_{1} as better than P_{2} if μ_{{p1, p2, …, pk}} > μ_{{p1′, p2′, …, pk′}}.
An estimate of the value of μ_{V} is obtained as follows: A pairs of views {(a_{1}, a′_{1}), (a_{2}, a′_{2}), …, (a_{A}, a′_{A})} from V are chosen at random. All of the pairs of views are mapped to the space ${\rm P}$, obtaining the set {D_{1}, D_{2}, …, D_{A}} of distances D between every pair of views. The value of μ_{V} is estimated as ${\mu}_{V}=\frac{1}{A}{\sum}_{1\le i\le A}{D}_{i}$.
The process of selecting the optimum views of a 3D object works as follows: the first view v_{1} is selected from a sample of N views of V, such that the view alone has the maximum μ_{V} value. Then, a second view v_{2} is chosen from another sample of N views of V, such that {v_{1}, v_{2}} has the maximum μ_{V} value, with v_{1} kept fixed. The third view v_{3} is chosen from another sample of N views of V, such that {v_{1}, v_{2}, v_{3}} has the maximum μ_{V} value, with v_{1} and v_{2} kept fixed. The process is repeated until k views have been chosen. The set of views {v_{1}, v_{2}, …, v_{k}} selected from V is the optimum set of views of the 3D object. For each iteration i of the algorithm, we propose to draw the sample of N views from the i-th cluster c_{i} of views of the 3D object, obtained by applying the k-means algorithm. The N views selected include the center of cluster c_{i}.
Let EvaluationSetA be a function that returns the set of A pairs of views from V, and let CandidatePivot be a function that returns a sample of N views that includes the center of cluster c_{i}. The steps of the proposed algorithm are illustrated in Algorithm 1.
Algorithm 1. The proposed algorithm. 
Input: 
V: the set of views of the 3D object 
d: the distance metric used to compare two views 
NP: the number of optimum views 
Output: 
SetPivot: set of optimum views 
Variables: 
setA: the set of A pairs of views used to estimate μ_{V} 
setC: the sample of N views, including the center of cluster c_{i} // $setC\subset cluste{r}_{i}$ 
Begin 
Classify the views of V into NP clusters using the k-means clustering algorithm 
setA = EvaluationSetA(V, d); 
SetPivot = ∅; 
for i := 1 to NP do // NP is the number of clusters 
setC = CandidatePivot(cluster_{i}, d); // c_{i} $\in $ setC 
bestValue = 0; 
for (each x in setC) 
value = getValueUv(setA, d, SetPivot ∪ {x}); 
if (value > bestValue) 
bestValue = value; 
bestView = x; 
endif 
endfor 
SetPivot = SetPivot ∪ {bestView}; 
endfor 
Return SetPivot; 
End 
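Algorithm 1 can be sketched in Python as follows. This is a minimal sketch, assuming views are feature vectors under the Euclidean distance; `kmeans`, `mu_estimate`, and `select_views` are illustrative names, and the tiny k-means is a stand-in for any clustering implementation:

```python
import numpy as np

def euclid(a, b):
    return float(np.linalg.norm(a - b))

def kmeans(views, k, iters=20, seed=0):
    """Tiny k-means; returns (clusters, centers), clusters as arrays of views."""
    rng = np.random.RandomState(seed)
    centers = views[rng.choice(len(views), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(
            [[euclid(v, c) for c in centers] for v in views], axis=1)
        centers = np.array([views[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return [views[labels == j] for j in range(k)], centers

def mu_estimate(pairs, d, pivots):
    """GetValueUv (Algorithm 2): mean of D over the evaluation pairs."""
    total = 0.0
    for x, y in pairs:
        total += max(abs(d(x, p) - d(y, p)) for p in pivots)
    return total / len(pairs)

def select_views(views, d, NP, A=50, N=5, seed=0):
    """Algorithm 1: pick one pivot view per cluster, greedily maximizing mu."""
    rng = np.random.RandomState(seed)
    clusters, centers = kmeans(views, NP, seed=seed)
    pairs = [(views[i], views[j])
             for i, j in rng.randint(0, len(views), size=(A, 2))]
    pivots = []
    for ci, center in zip(clusters, centers):
        # Candidate set: N random views of the cluster plus its center.
        idx = (rng.choice(len(ci), size=min(N, len(ci)), replace=False)
               if len(ci) else [])
        candidates = [ci[t] for t in idx] + [center]
        best_val, best_view = -1.0, None
        for x in candidates:
            val = mu_estimate(pairs, d, pivots + [x])
            if val > best_val:
                best_val, best_view = val, x
        pivots.append(best_view)   # the chosen pivot stays fixed afterwards
    return pivots
```

Note that previously chosen pivots stay fixed: each iteration only searches over the candidates of the current cluster, exactly as in the incremental scheme above.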
The function GetValueUv used to estimate the value of μ_{V} exploits a mapping of the form $\psi :\left(V,d\right)\mapsto \left({\mathbb{R}}^{k},{L}_{\infty}\right)$, defined by the k pivots v_{i} as $\psi \left(x\right)=\left[d\left({v}_{1},x\right),d\left({v}_{2},x\right),\dots ,d\left({v}_{k},x\right)\right]$, such that D(x,y) = ${L}_{\infty}\left(\psi \left(x\right),\psi \left(y\right)\right)$ = max_{{1 ≤ i ≤ k}}|d(x,v_{i}) − d(y,v_{i})|. The steps of GetValueUv are illustrated in Algorithm 2.
Algorithm 2. The function GetValueUv. 
Input: 
setE: the set of A pairs of views 
d: the distance metric used to compare two views 
setP: the set of pivots {v_{1}, v_{2}, …, v_{k}} 
Output: 
${\mu}_{V}$: the mean of the distribution 
Begin 
Map every pair (x, y) of setE into the vector space associated with the set of pivots setP, using the mapping function $\psi (.)$ 
For every pair (x, y), compute the distance between x and y in the feature space, that is, D_{r} = ${L}_{\infty}\left(\psi \left(x\right),\psi \left(y\right)\right)$ = max_{{1 ≤ i ≤ k}}|d(x,v_{i}) − d(y,v_{i})| 
Compute ${\mu}_{V}$ as the mean of these distances, i.e., ${\mu}_{V}=\frac{1}{A}{\sum}_{1\le r\le A}{D}_{r}$ 
End 
If the distances D_{{p1, …, pi−1}}([a_{r}], [a′_{r}]), 1 ≤ r ≤ A, are kept in an array, it is not necessary to redo all of the distance computations to estimate μ_{V} when the i-th pivot is added. It is enough to calculate D_{{pi}}([a_{r}], [a′_{r}]), 1 ≤ r ≤ A, because D_{{p1, …, pi}}([a_{r}], [a′_{r}]) = max(D_{{p1, …, pi−1}}([a_{r}], [a′_{r}]), D_{{pi}}([a_{r}], [a′_{r}])). Therefore, only 2NA distance computations are needed per iteration (2A for each of the N candidate views), where N denotes the number of candidate views per sample and A the number of pairs of views. Since the process is repeated k times, the total cost is 2kNA distance computations.
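The caching scheme can be sketched as follows (hypothetical data; `add_pivot` is an illustrative name). Each update costs only two distance computations per pair, namely the distances of both views to the new pivot:

```python
import numpy as np

def add_pivot(cached, pairs, d, new_pivot):
    """Update the cached pair distances when pivot p_i is added:
    D_{p1..pi}([a],[a']) = max(D_{p1..pi-1}([a],[a']), D_{pi}([a],[a']))."""
    updated = []
    for (a, a2), prev in zip(pairs, cached):
        d_new = abs(d(a, new_pivot) - d(a2, new_pivot))  # D_{pi} on this pair
        updated.append(max(prev, d_new))
    return updated

# Usage sketch (hypothetical data): start with no pivots (all distances 0),
# add pivots one by one, and read mu_V off the running maxima.
d = lambda a, b: float(np.linalg.norm(a - b))
rng = np.random.RandomState(0)
views = rng.randn(30, 4)
pairs = [(views[i], views[j]) for i, j in rng.randint(0, 30, size=(10, 2))]
cached = [0.0] * len(pairs)
for pivot in views[:3]:
    cached = add_pivot(cached, pairs, d, pivot)
mu_V = sum(cached) / len(cached)
```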
We have indexed each view using four well-established descriptors from the MPEG-7 standard [23,24] that capture various image characteristics.
The scalable color descriptor (SCD) is derived from a color histogram in the hue-saturation-value color space with fixed space quantization. We used the 64-coefficient version of this descriptor.
The color structure descriptor (CSD) aims at identifying localized color distributions using an 8 × 8 pixel structuring element that slides over the image. This descriptor can distinguish between two images containing a similar number of pixels of a specific color when the spatial structure of those pixels differs.
The color layout descriptor (CLD) is obtained by applying the discrete cosine transform on a 2D array of local representative colors in three color channels.
The edge histogram descriptor (EHD) represents the local edge distribution in the image. The image is subdivided into 4 × 4 sub-images, and the edges in each sub-image are categorized into five types: vertical, horizontal, 45° diagonal, 135° diagonal, and non-directional edges. This results in 80 coefficients representing the local edge histograms. Furthermore, semi-global and global histograms can be computed from the local histograms.
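A simplified sketch of the EHD computation is given below. The exact MPEG-7 filter coefficients, block sizes, thresholds, and quantization differ; this only illustrates the 4 × 4 sub-image, five-bin, 80-coefficient structure described above:

```python
import numpy as np

# 2x2 edge filters in the spirit of the MPEG-7 EHD (simplified sketch):
# vertical, horizontal, 45° diagonal, 135° diagonal, non-directional.
FILTERS = {
    "vertical":       np.array([[1.0, -1.0], [1.0, -1.0]]),
    "horizontal":     np.array([[1.0, 1.0], [-1.0, -1.0]]),
    "diag_45":        np.array([[np.sqrt(2), 0.0], [0.0, -np.sqrt(2)]]),
    "diag_135":       np.array([[0.0, np.sqrt(2)], [-np.sqrt(2), 0.0]]),
    "nondirectional": np.array([[2.0, -2.0], [-2.0, 2.0]]),
}

def edge_histogram(img, threshold=10.0):
    """Simplified EHD: split a grayscale image into 4x4 sub-images, classify
    each 2x2 block by its strongest filter response, and return the 16 local
    five-bin histograms concatenated (80 coefficients)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    sh, sw = h // 4, w // 4
    hist = np.zeros((4, 4, 5))
    for si in range(4):
        for sj in range(4):
            sub = img[si * sh:(si + 1) * sh, sj * sw:(sj + 1) * sw]
            for bi in range(0, sh - 1, 2):
                for bj in range(0, sw - 1, 2):
                    block = sub[bi:bi + 2, bj:bj + 2]
                    resp = [abs((block * f).sum()) for f in FILTERS.values()]
                    if max(resp) > threshold:   # keep only clear edges
                        hist[si, sj, int(np.argmax(resp))] += 1
    # Normalize each local histogram by the number of blocks per sub-image.
    n_blocks = max((sh // 2) * (sw // 2), 1)
    return (hist / n_blocks).reshape(80)
```

Semi-global and global histograms can then be obtained by summing subsets of these 16 local histograms (rows, columns, quadrants, and the whole image, respectively).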