Article

A Note on a Minimal Irreducible Adjustment Pagerank

1 School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
2 School of Economics and Management, Tongji University, Shanghai 200092, China
* Author to whom correspondence should be addressed.
Submission received: 17 July 2022 / Revised: 4 August 2022 / Accepted: 6 August 2022 / Published: 9 August 2022

Abstract

The stochastic modification and the irreducibility modification in PageRank both change the web link structure substantially. To obtain a minimal irreducible web link adjustment, Li, Chen, and Song proposed a PageRank model of minimal irreducible adjustment together with a lumping method. In this paper, we provide alternative proofs for the minimal irreducible PageRank results by means of a new class of similarity transformation matrices. To derive further theorems and fast algorithms on a reduced matrix, a 4 × 4 block partition of the minimal irreducible PageRank model is analyzed. As a practical application of our results, a lumping algorithm for speeding up the PageRank vector computation is also presented. Numerical results are reported to show the efficiency of the proposed algorithm.
MSC:
15A06; 15A18; 15A21; 65F10; 65F15; 65B99

1. Introduction

In modern Internet search engines and web information retrieval, a good ranking method is extremely important because it reveals the relative importance of each web page. As one application of Markov chains in web information retrieval, PageRank is a famous web ranking method: it assumes that the importance of a web page depends mainly on the importance of the pages linking to it [1]. In matrix form, the iterative procedure of the raw PageRank computation is defined by
$$\pi^{(k+1)T} = \pi^{(k)T} H, \quad \text{where } H \in \mathbb{R}^{n\times n},\ \pi^{(k)} \geq 0,\ \pi^{(k)T} e = 1,\ k = 0, 1, 2, \ldots, \tag{1}$$
where the matrix $H \in \mathbb{R}^{n\times n}$ models the web link structure, and $\pi^{(k)}$, $k = 0, 1, 2, \ldots$, is an iteration vector whose entries represent the importance or weight of each page [2,3]. The web link structure matrix H is given by
$$h_{ij} = \begin{cases} \dfrac{1}{n_i}, & \text{if page } i \text{ links to page } j,\\ 0, & \text{otherwise}, \end{cases}$$
where $n_i$, $i = 1, 2, \ldots, n$, stands for the number of outlinks of page i. As (1) cannot ensure the uniqueness of the PageRank vector, Page and Brin designed an $n \times n$ matrix through two rank-1 modification procedures [1]
$$G = \alpha S + (1-\alpha) e v^T, \quad \text{where } S = H + d w^T, \ \text{the damping factor } \alpha \in (0,1), \tag{2}$$
where $w \in \mathbb{R}^n$ is the nonnegative dangling node vector with unit 1-norm, $e \in \mathbb{R}^n$ is a vector of all ones, the nonnegative vector $v \in \mathbb{R}^n$ with $\|v\|_1 = 1$ is a personalization vector, and the entries of the dangling node indicator vector d are given by
$$d_i = \begin{cases} 1, & \text{if page } i \text{ does not link to other pages},\\ 0, & \text{otherwise}. \end{cases}$$
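For concreteness, the following small sketch (illustrative only, not the authors' code; the toy link structure and all variable names are assumptions) builds H, the dangling indicator d, the stochastic modification $S = H + dw^T$, and the Google matrix G of (2) for a four-page web in NumPy.

import numpy as np

# assumed toy web: page 3 has no outlinks, i.e., it is a dangling node
links = {0: [1, 2], 1: [2], 2: [0, 3], 3: []}
n = 4
H = np.zeros((n, n))
for i, outs in links.items():
    if outs:
        H[i, outs] = 1.0 / len(outs)          # h_ij = 1/n_i for linked pages
d = (H.sum(axis=1) == 0).astype(float)        # dangling node indicator vector
w = np.full(n, 1.0 / n)                       # dangling node vector
v = np.full(n, 1.0 / n)                       # personalization vector
e = np.ones(n)
alpha = 0.85                                  # damping factor
S = H + np.outer(d, w)                        # stochastic modification
G = alpha * S + (1 - alpha) * np.outer(e, v)  # maximal irreducible adjustment (2)
assert np.allclose(G.sum(axis=1), 1.0)        # G is row stochastic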
Some web pages have no links to other web pages; such pages are called dangling nodes, while the remaining pages are called nondangling nodes. Videos, jpg files, and pdf files are common dangling nodes, so the stochastic modification above is necessary. The heart of obtaining a unique eigenvector is the irreducibility issue; therefore, the matrix $ev^T$ (where e and v are defined in (2)) is further used to handle this uniqueness problem. The classical PageRank vector is defined as the principal eigenvector of the Google matrix G with unit 1-norm, which is also the stationary probability distribution vector of a Markov chain. In other words, the PageRank computation problem is defined by
$$\pi^T G = \pi^T, \quad G \in \mathbb{R}^{n\times n},\ \pi \geq 0,\ \pi^T e = 1.$$
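This eigenproblem is usually solved with the power method. The following hedged sketch (the function name and dense NumPy input are assumptions) iterates $\pi^{(k+1)T} = \pi^{(k)T} G$ without ever forming the dense matrix $ev^T$, using the identity $\pi^T G = \alpha \pi^T H + \alpha(\pi^T d) w^T + (1-\alpha) v^T$, which holds because $\pi^T e = 1$.

import numpy as np

def pagerank_power(H, d, w, v, alpha=0.85, tol=1e-10, maxit=1000):
    # power iteration for pi^T G = pi^T; H is the raw link matrix, d the dangling
    # indicator, w the dangling node vector, v the personalization vector
    n = H.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(maxit):
        new = alpha * (pi @ H) + alpha * (pi @ d) * w + (1 - alpha) * v
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new
    return pi

With the toy data from the previous sketch, pi = pagerank_power(H, d, w, v) returns the classical PageRank vector of G.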
However, computing a large PageRank vector can be very time consuming, since the hyperlink matrices involved may represent the link relationships of over a billion pages [1,4]. Hence, faster methods have become an increasingly active topic. The Arnoldi method can estimate the first few eigenvalues of nonsymmetric eigenvalue problems, but each of its iterations may be expensive. Inverse iteration is not suitable for PageRank, since inverting a large web link matrix may require $O(n^3)$ operations. Instead, methods based on the power method have attracted increasing attention. The damping factor represents the coupling degree of the corresponding Markov chain: a large damping factor corresponds to a nearly coupled Markov chain, and as the damping factor increases, the power method converges slowly [3,5,6]. The lumping or reordering method takes full advantage of the zero entries in the Google matrix [2]. Some recent methods for computing PageRank effectively involve Chebyshev polynomial techniques, matrix splitting methods [7,8], the lumping method [9,10], the coupled iteration algorithm [11], a Hessenberg-type method [12], a simpler GMRES accelerated by Chebyshev polynomials [13], and hybrid methods [14]. Other accelerated methods and further theoretical and numerical results are available; see [7,15,16]. Adding links from the dangling nodes to all other web sites changes the links between web pages maximally, and as the number of web pages grows, there are more dangling node web pages. Moreover, by adding the matrix $ev^T$, every node becomes directly connected to every other node, so irreducibility is enforced trivially; the resulting web link change is very large, which is why the adjustment (2) is called the maximal irreducible adjustment. For more details of the PageRank model and computation, see [1,2,3,17].
Our recent work provides relatively easy alternative proofs for classical PageRank theoretical analyses by means of a class of new similarity matrices [18,19]. One may wonder whether the minimal irreducible PageRank model admits similar theoretical results. To further reduce the computational overhead, we discuss a 4-level partition of the minimal irreducible PageRank model matrix, in which we obtain a smaller lumped matrix and take full advantage of the zero entries of the minimal irreducible Google matrix.
The remainder of this paper is organized as follows: Section 2 revisits the main theorems of the minimal irreducible PageRank model of Li, Chen, and Song [20]. Section 3 presents a class of new similarity matrices that can be used in the theoretical analysis. Section 4 provides alternative proofs of the main results in [20]. Section 5 discusses the 4 × 4 block case, in which a solution vector expression is derived by further partitioning the minimal irreducible Google matrix; the dimension of the matrices involved is thereby reduced and the corresponding computation is sped up. Finally, Section 6 gives some conclusions. Throughout the paper, Matlab notation is used where necessary.

2. Some Theoretical Results on a Minimal Irreducible PageRank Model

In this section, we focus on the underlying theoretical and computational contributions of the lumping method for a kind of minimal irreducible PageRank. Hence, the minimal irreducible PageRank model is described below.

2.1. A Minimal Irreducible PageRank Model

If all dangling nodes are lumped into a single node, then the Google matrix is said to be lumpable [9,21]. In general, a matrix M is lumpable with respect to a partition if
$$M_{ij} e = \mu e, \quad \text{where the scalar } \mu \geq 0,\ e = \begin{pmatrix} 1 & 1 & \cdots & 1 \end{pmatrix}^T,$$
where $i \neq j$, $1 \leq i, j \leq k+1$, the matrix
$$P M P^T = \begin{pmatrix} M_{11} & \cdots & M_{1,k+1} \\ \vdots & \ddots & \vdots \\ M_{k+1,1} & \cdots & M_{k+1,k+1} \end{pmatrix},$$
and P is a proper permutation matrix.
Suppose that $H \in \mathbb{R}^{n\times n}$ has k nondangling nodes and $n-k$ dangling nodes; then there exists a permutation matrix $\Pi \in \mathbb{R}^{n\times n}$ such that
$$\Pi H \Pi^T = \begin{pmatrix} H_{11} & H_{12} \\ 0 & 0 \end{pmatrix}.$$
Hence,
$$A = \Pi G \Pi^T = \alpha \bar S + (1-\alpha) e v^T = \alpha \begin{pmatrix} H_{11} & H_{12} \\ e w_1^T & e w_2^T \end{pmatrix} + (1-\alpha) \begin{pmatrix} e v_1^T & e v_2^T \\ e v_1^T & e v_2^T \end{pmatrix} = \begin{pmatrix} \alpha H_{11} + (1-\alpha) e v_1^T & \alpha H_{12} + (1-\alpha) e v_2^T \\ e(\alpha w_1^T + (1-\alpha) v_1^T) & e(\alpha w_2^T + (1-\alpha) v_2^T) \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ e u_1^T & e u_2^T \end{pmatrix}, \quad \text{where } \bar S = \Pi S \Pi^T, \tag{3}$$
where the dangling node vector w and the personalization vector v are partitioned into $w_1 \in \mathbb{R}^k$, $w_2 \in \mathbb{R}^{n-k}$ and $v_1 \in \mathbb{R}^k$, $v_2 \in \mathbb{R}^{n-k}$. Moreover, if
$$\bar\pi^T = \bar\pi^T A = \bar\pi^T \Pi G \Pi^T, \quad \text{with } \bar\pi \geq 0,\ \|\bar\pi\|_1 = 1, \tag{4}$$
where $A = \Pi G \Pi^T$, $G \in \mathbb{R}^{n\times n}$ is the classical Google matrix, and $\bar\pi$ is the stationary distribution of A, then the PageRank vector corresponding to (4) is defined by
$$\pi^T = \bar\pi^T \Pi.$$
In [20], Li, Chen, and Song discussed a minimal irreducible Google matrix by changing S ¯ to a bordered matrix
$$\tilde S = \begin{pmatrix} \frac{n}{n+1}\bar S & \frac{1}{n+1} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} \end{pmatrix} = \begin{pmatrix} \frac{n}{n+1} H_{11} & \frac{n}{n+1} H_{12} & \frac{1}{n+1} e \\ \frac{n}{n+1} e w_1^T & \frac{n}{n+1} e w_2^T & \frac{1}{n+1} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} e^T & \frac{1}{n+1} \end{pmatrix}. \tag{5}$$
The new PageRank vector of order $n+1$ is defined as the principal eigenvector of the new Google matrix $\tilde S$,
$$\tilde\pi^T = \tilde\pi^T \tilde S, \quad \text{or} \quad \tilde\pi = \tilde S^T \tilde\pi, \quad \text{with } \tilde\pi \geq 0,\ \|\tilde\pi\|_1 = 1.$$
We note that this modification adds only one extra row and column to make the link matrix strongly connected and thus the new Google matrix $\tilde S$ irreducible. In this way, the Perron–Frobenius theorem guarantees the uniqueness of a positive principal eigenvector $\tilde\pi$ with unit length ($\|\tilde\pi\|_1 = 1$). Note that the decomposition presented in (5) has been studied before in various forms; see, e.g., [22,23]. In Section 4.3, we mainly study the first two layers of such decompositions. In addition, another way of creating a minimal irreducible Google matrix is defined by [24,25]
$$\tilde S = \begin{pmatrix} \alpha \bar S & (1-\alpha) e \\ v^T & 0 \end{pmatrix} = \begin{pmatrix} \alpha H_{11} & \alpha H_{12} & (1-\alpha) e \\ \alpha e w_1^T & \alpha e w_2^T & (1-\alpha) e \\ v_1^T & v_2^T & 0 \end{pmatrix}. \tag{6}$$
In this paper, we mainly discuss the minimal irreducible Google matrix (5). In addition, we will briefly discuss our results on (6) and leave it as a further question.
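The following short sketch (illustrative, with assumed names, not the authors' code) forms the bordered matrix $\tilde S$ of (5) from a given row-stochastic $\bar S$ and checks that only one extra row and column are added and that the result is again row stochastic.

import numpy as np

def minimal_irreducible(S_bar):
    # border the n x n stochastic matrix S_bar as in (5)
    n = S_bar.shape[0]
    S_tilde = np.zeros((n + 1, n + 1))
    S_tilde[:n, :n] = n / (n + 1.0) * S_bar
    S_tilde[:n, n] = 1.0 / (n + 1)     # every page links weakly to the new node
    S_tilde[n, :] = 1.0 / (n + 1)      # the new node links to every node (and itself)
    assert np.allclose(S_tilde.sum(axis=1), 1.0)
    return S_tilde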
In the following, in order to lump the stochastic matrix and present our theoretical results conveniently, we permute the rows and columns of $\tilde S$ simultaneously, such that
$$\hat S = \tilde\Pi \tilde S \tilde\Pi^T = \begin{pmatrix} \frac{n}{n+1} H_{11} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12} \\ \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{1}{n+1} e^T \\ \frac{n}{n+1} e w_1^T & \frac{1}{n+1} e & \frac{n}{n+1} e w_2^T \end{pmatrix},$$
where the corresponding permutation matrix
$$\tilde\Pi = \begin{pmatrix} I_k & 0 & 0 \\ 0 & 0 & 1 \\ 0 & I_{n-k} & 0 \end{pmatrix}.$$

2.2. Some Related Theorems on the Minimal Irreducible PageRank Model

The similarity transformation matrix
$$L = I_{n-k} - \frac{1}{n-k} \hat e e^T, \quad \text{where } \hat e = e - e_1 = (0, 1, \ldots, 1)^T \in \mathbb{R}^{n-k} \tag{7}$$
is employed by Ipsen and Selee to analyze the theoretical results, and the identity matrix of order n is denoted by $I_n = \begin{pmatrix} e_1 & e_2 & \cdots & e_n \end{pmatrix}$, where $e_i$, $i = 1, 2, \ldots, n$, is the ith column vector of $I_n$. If a large-scale computation problem can be reduced to a relatively small scale, the computation can be simplified and accelerated; lumping is one such method. In [20], Li, Chen, and Song obtained the following theorems for the new Google matrix, showing that the minimal irreducible Google matrix can also be lumped. In addition, they gave the spectral distribution of the new Google matrix and the relationship between the eigenvector part associated with the dangling nodes and the part associated with the nondangling nodes, as follows.
Theorem 1
([20,26]). Given the stochastic matrix $\bar S$ with spectrum $\{1, \lambda_2, \lambda_3, \ldots, \lambda_n\}$. Then, the spectrum of the matrix
$$\tilde S = \begin{pmatrix} \frac{n}{n+1}\bar S & \frac{1}{n+1} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} \end{pmatrix}$$
is $\{1, \frac{n}{n+1}\lambda_2, \frac{n}{n+1}\lambda_3, \ldots, \frac{n}{n+1}\lambda_n, 0\}$.
Theorem 2
([20]). With the above notation, let
$$X = \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & L \end{pmatrix}, \quad \text{where } L = I_{n-k} - \frac{1}{n-k}\hat e e^T \ \text{and}\ \hat e = e - e_1 = \begin{pmatrix} 0 & 1 & \cdots & 1 \end{pmatrix}^T.$$
Then,
$$X \hat S X^{-1} = \begin{pmatrix} \hat S^{(1)} & \hat S^{(2)} \\ 0 & 0 \end{pmatrix}, \quad \text{where } \hat S^{(1)} = \begin{pmatrix} \frac{n}{n+1} H_{11} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{n-k}{n+1} \\ \frac{n}{n+1} w_1^T & \frac{1}{n+1} & \frac{n}{n+1} w_2^T e \end{pmatrix},$$
so that $\hat S$ has at least $n-k-1$ zero eigenvalues.
Theorem 3
([20]). With the above notation, let
$$\sigma^T \begin{pmatrix} \frac{n}{n+1} H_{11} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{n-k}{n+1} \\ \frac{n}{n+1} w_1^T & \frac{1}{n+1} & \frac{n}{n+1} w_2^T e \end{pmatrix} = \sigma^T, \quad \sigma \in \mathbb{R}^{k+2},\ \sigma \geq 0,\ \|\sigma\|_1 = 1,$$
and partition $\sigma^T = \begin{pmatrix} \sigma_{1:k}^T & \sigma_{k+1} & \sigma_{k+2} \end{pmatrix}$, where $\sigma_{k+1}$ and $\sigma_{k+2}$ are scalars. Then, the PageRank vector of $\tilde S$ equals
$$\tilde\pi^T = \begin{pmatrix} \sigma_{1:k}^T & \sigma^T \begin{pmatrix} \frac{n}{n+1} H_{12} \\ \frac{1}{n+1} e^T \\ \frac{n}{n+1} w_2^T \end{pmatrix} & \sigma_{k+1} \end{pmatrix}.$$
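A small numerical check of Theorem 3 can be carried out directly. The following sketch uses assumed random toy data (not the authors' experiment): it builds $\hat S^{(1)}$, computes its stationary vector σ, recovers $\tilde\pi$ by the formula above, and compares with the stationary vector of $\tilde S$ computed directly.

import numpy as np

rng = np.random.default_rng(0)
k, nd = 3, 2                                   # k nondangling, nd = n - k dangling nodes
n = k + nd
H11, H12 = rng.random((k, k)), rng.random((k, nd))
row = np.hstack([H11, H12]).sum(axis=1, keepdims=True)
H11, H12 = H11 / row, H12 / row                # nondangling rows are stochastic
w = np.full(n, 1.0 / n); w1, w2 = w[:k], w[k:]
c, e_d = n / (n + 1.0), np.ones(nd)
# the (k+2) x (k+2) lumped matrix S_hat^(1) of Theorem 2
S1 = np.block([
    [c * H11, np.full((k, 1), 1 / (n + 1)), (c * (H12 @ e_d))[:, None]],
    [np.full((1, k), 1 / (n + 1)), np.array([[1 / (n + 1)]]), np.array([[nd / (n + 1)]])],
    [c * w1[None, :], np.array([[1 / (n + 1)]]), np.array([[c * (w2 @ e_d)]])],
])
vals, vecs = np.linalg.eig(S1.T)
sigma = np.real(vecs[:, np.argmax(vals.real)]); sigma /= sigma.sum()
# recovery formula of Theorem 3
s1 = np.vstack([c * H12, np.full((1, nd), 1 / (n + 1)), c * w2[None, :]])
pi_rec = np.concatenate([sigma[:k], sigma @ s1, [sigma[k]]])
# stationary vector of S_tilde computed directly, for comparison
S_bar = np.block([[H11, H12], [np.outer(e_d, w1), np.outer(e_d, w2)]])
S_til = np.block([[c * S_bar, np.full((n, 1), 1 / (n + 1))],
                  [np.full((1, n), 1 / (n + 1)), np.array([[1 / (n + 1)]])]])
vals2, vecs2 = np.linalg.eig(S_til.T)
pi_dir = np.real(vecs2[:, np.argmax(vals2.real)]); pi_dir /= pi_dir.sum()
assert np.allclose(pi_rec, pi_dir)             # the recovered and direct vectors agree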

3. New Similarity Transformation Matrix

In order to further study the lumping method and contribute to the PageRank computation, we derive alternative proofs for the related theorems by using new similarity transformation matrices [18,19] and try to establish efficient algorithms to compute PageRank. To discuss this topic, we begin by introducing some related work.
Suppose that $\lambda_i$, $i = 1, \ldots, n$, are the eigenvalues of $A \in \mathbb{R}^{n\times n}$, ordered as $|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n| \geq 0$. In [27], Ding and Zhou proposed the following rank-1 updated spectral theorem.
Theorem 4
([27]). Let u and v be two n-dimensional column vectors such that u is an eigenvector of A associated with the eigenvalue $\lambda_1$. Then, the eigenvalues of $A + uv^T$ are $\lambda_1 + v^T u, \lambda_2, \ldots, \lambda_n$, counting algebraic multiplicity.
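A quick numerical illustration of Theorem 4 is given below (a sketch with assumed random data): perturbing A by $uv^T$, where u is an eigenvector for $\lambda_1$, moves only $\lambda_1$ to $\lambda_1 + v^T u$ and leaves the other eigenvalues unchanged.

import numpy as np

rng = np.random.default_rng(1)
A = rng.random((5, 5))                           # entrywise positive, so lambda_1 is real and simple
vals, vecs = np.linalg.eig(A)
u = np.abs(np.real(vecs[:, np.argmax(vals.real)]))   # positive Perron eigenvector of A
v = rng.random(5)
vals_new = np.linalg.eigvals(A + np.outer(u, v))
# all eigenvalues agree except the dominant one, which is shifted by v^T u
assert np.allclose(np.sort_complex(vals_new)[:-1], np.sort_complex(vals)[:-1])
assert np.isclose(np.max(vals_new.real), np.max(vals.real) + v @ u)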
Langville and Meyer in their book [3] analyze the spectral distribution of the classical Google matrix by employing a similarity transformation method. To find the relationship between the PageRank of the nondangling nodes $\pi_1$ and that of the dangling nodes $\pi_2$, Ipsen and Selee in [9] employed the similarity transformation matrix
$$X = \begin{pmatrix} I_k & 0 \\ 0 & L \end{pmatrix}, \quad L = I_{n-k} - \frac{1}{n-k} \hat e e^T, \quad \hat e = e - e_1 = \begin{pmatrix} 0 & 1 & \cdots & 1 \end{pmatrix}^T.$$
As pointed out in [18], an invertible matrix L exists such that the related theorem holds; however, such a matrix L is not unique. In the general case of the Google matrix, if the matrix L satisfies the condition
$$L e = e_1, \quad \text{where } e \text{ is a vector of all ones}, \tag{8}$$
then the proof process can be made simpler and shorter. Motivated by this observation, we further study the case of the new PageRank model. In Theorem 2, if we exploit the matrix [18]
$$\tilde X = \begin{pmatrix} I_k & 0 \\ 0 & \tilde L \end{pmatrix}, \quad \text{where } \tilde L = I_{n-k} - \hat e e_1^T,\ \hat e = e - e_1 = \begin{pmatrix} 0 & 1 & \cdots & 1 \end{pmatrix}^T, \tag{9}$$
and we separate the first row and column of L ˜
$$\tilde L = \begin{pmatrix} 1 & 0 \\ -e & I_{n-k-1} \end{pmatrix}, \tag{10}$$
and partition L in (7) conformally with L ˜ ,
$$L = \begin{pmatrix} 1 & 0 \\ -\frac{1}{n-k} e & I_{n-k-1} - \frac{1}{n-k} e e^T \end{pmatrix}, \tag{11}$$
where the (2,2) block of L in (11) is
$$I_{n-k-1} - \frac{1}{n-k} e e^T = \begin{pmatrix} \frac{n-k-1}{n-k} & -\frac{1}{n-k} & \cdots & -\frac{1}{n-k} \\ -\frac{1}{n-k} & \frac{n-k-1}{n-k} & \cdots & -\frac{1}{n-k} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{1}{n-k} & -\frac{1}{n-k} & \cdots & \frac{n-k-1}{n-k} \end{pmatrix}.$$
Unfortunately, this is a typical dense matrix; therefore, it is complicated to analyze and expensive to compute or store. Furthermore, the matrix (10) is actually [18]
$$\tilde L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix},$$
which is preferable because $\tilde L$ is a triangular matrix while L is a dense matrix. Inspired by the simpler similarity transformation matrix $\tilde L$, we consider replacing L with $\tilde L$ in X so that an alternative, simplified, and shortened proof can be presented. Moreover, if
$$\tilde L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix}$$
in (9), then the proof process of the above theorems still holds. In other words, this similarity transformation matrix is by no means unique. Instead of the matrix L used in [9], we propose a class of invertible matrices $\hat L$ that satisfy the condition (8). In particular, if $\hat L = L$, it becomes the similarity transformation matrix proposed by Ipsen and Selee in [9]. In the following, as we mainly discuss the minimal irreducible PageRank matrix (5), one may wonder whether our methods can also be applied to the results in [24,25]. Thus, we provide a remark below.
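The condition (8) is easy to verify numerically for the choices above. The following sketch (with an assumed illustrative size m = n − k) checks that the matrix L of (7), the matrix $\tilde L$ of (10), and the bidiagonal matrix just displayed all map the all-ones vector e to $e_1$, so any of them can serve as the similarity block.

import numpy as np

m = 5                                                 # illustrative size, m = n - k
e = np.ones(m)
e1 = np.eye(m)[:, 0]
e_hat = e - e1
L_ipsen = np.eye(m) - np.outer(e_hat, e) / m          # L in (7), dense
L_tilde = np.eye(m) - np.outer(e_hat, e1)             # L_tilde in (10), lower triangular
L_bidiag = np.eye(m) - np.diag(np.ones(m - 1), -1)    # the bidiagonal choice above
for L in (L_ipsen, L_tilde, L_bidiag):
    assert np.allclose(L @ e, e1)                     # condition (8): L e = e_1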
Remark 1.
If we permute $\tilde S$ by the permutation matrix $\tilde\Pi$, then $\tilde S$ is reordered as
$$\begin{pmatrix} \alpha H_{11} & (1-\alpha) e & \alpha H_{12} \\ v_1^T & 0 & v_2^T \\ \alpha e w_1^T & (1-\alpha) e & \alpha e w_2^T \end{pmatrix}.$$
Performing a similarity transformation on it yields
$$\begin{aligned}
\begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & L \end{pmatrix} \begin{pmatrix} \alpha H_{11} & (1-\alpha) e & \alpha H_{12} \\ v_1^T & 0 & v_2^T \\ \alpha e w_1^T & (1-\alpha) e & \alpha e w_2^T \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & L \end{pmatrix}^{-1}
&= \begin{pmatrix} \alpha H_{11} & (1-\alpha) e & \alpha H_{12} L^{-1} \\ v_1^T & 0 & v_2^T L^{-1} \\ \alpha L e w_1^T & (1-\alpha) L e & \alpha L e w_2^T L^{-1} \end{pmatrix} \\
&= \begin{pmatrix} \alpha H_{11} & (1-\alpha) e & \alpha H_{12} L^{-1} e_1 & \alpha H_{12} L^{-1}\begin{pmatrix} e_2 & \cdots & e_{n-k}\end{pmatrix} \\ v_1^T & 0 & v_2^T L^{-1} e_1 & v_2^T L^{-1}\begin{pmatrix} e_2 & \cdots & e_{n-k}\end{pmatrix} \\ \alpha w_1^T & 1-\alpha & \alpha w_2^T L^{-1} e_1 & \alpha w_2^T L^{-1}\begin{pmatrix} e_2 & \cdots & e_{n-k}\end{pmatrix} \\ 0 & 0 & 0 & 0 \end{pmatrix},
\end{aligned}$$
where the invertible matrix L satisfies $L e = e_1$, $e_1 = \begin{pmatrix} 1 & 0 & \cdots & 0 \end{pmatrix}^T$, and e is a column vector of all ones. Therefore, we expect that the new similarity matrices and the corresponding theoretical results also apply to this form of the minimal irreducible Google matrix. More detailed work is under consideration.

4. Alternative Proofs

In this section, we analyze the theoretical results of the minimal irreducible PageRank model based on the similarity transformation matrix (9). Here, we mainly provide alternative proofs for the minimal irreducible PageRank model.

4.1. An Alternative Proof of Theorem 1

In [20], Li, Chen, and Song completed the proof of Theorem 1 by using a determinant property; that is,
$$\det\left( \begin{pmatrix} \frac{n}{n+1}\bar S & \frac{1}{n+1} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} \end{pmatrix} - \lambda I_{n+1} \right) = \lambda(\lambda - 1)\left(\frac{n}{n+1}\lambda_2 - \lambda\right)\cdots\left(\frac{n}{n+1}\lambda_n - \lambda\right).$$
Thus, one can conclude that the eigenvalues of $\tilde S$ are $\{1, \frac{n}{n+1}\lambda_2, \ldots, \frac{n}{n+1}\lambda_n, 0\}$. Here, we show an alternative proof for Theorem 1 by using a similarity transformation matrix.
Proof. 
A similarity transformation is done as follows:
$$\begin{pmatrix} I & \frac{1}{n} e \\ 0 & 1 \end{pmatrix} \tilde S \begin{pmatrix} I & -\frac{1}{n} e \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} I & \frac{1}{n} e \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{n}{n+1}\bar S & \frac{1}{n+1} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} \end{pmatrix} \begin{pmatrix} I & -\frac{1}{n} e \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \frac{n}{n+1}\bar S + \frac{1}{n(n+1)} e e^T & 0 \\ \frac{1}{n+1} e^T & 0 \end{pmatrix}.$$
The eigenvalues of the stochastic matrix $\bar S$ are denoted by $1, \lambda_2, \lambda_3, \ldots, \lambda_n$. By Theorem 4, we obtain that the eigenvalues of $\tilde S$ are
$$\frac{n}{n+1} + \frac{1}{n(n+1)}\cdot n = 1, \quad \frac{n}{n+1}\lambda_2, \quad \frac{n}{n+1}\lambda_3, \quad \ldots, \quad \frac{n}{n+1}\lambda_n,$$
and the remaining eigenvalue is 0. Thus, we complete the proof.    □
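A numerical check of Theorem 1 is straightforward. The sketch below uses an assumed random row-stochastic $\bar S$: the eigenvalues of the bordered matrix $\tilde S$ consist of 1, the nonunit eigenvalues of $\bar S$ scaled by n/(n+1), and one extra zero.

import numpy as np

rng = np.random.default_rng(2)
n = 6
S_bar = rng.random((n, n))
S_bar /= S_bar.sum(axis=1, keepdims=True)        # random row-stochastic matrix
S_til = np.block([[n / (n + 1) * S_bar, np.full((n, 1), 1 / (n + 1))],
                  [np.full((1, n), 1 / (n + 1)), np.array([[1 / (n + 1)]])]])
lam = np.sort_complex(np.linalg.eigvals(S_bar))  # the last entry is the Perron root 1
mu = np.sort_complex(np.linalg.eigvals(S_til))
expected = np.sort_complex(np.append(n / (n + 1) * lam[:-1], [1.0, 0.0]))
assert np.allclose(mu, expected)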

4.2. An Alternative Proof of Theorem 2

Li, Chen, and Song [20] presented the proof of Theorem 2 by using the similarity transformation matrix (7). Now, we validate Theorem 2 below with a general invertible matrix $\hat L$ satisfying $\hat L e = e_1$.
Proof. 
Since
$$\tilde X^{-1} = \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tilde L^{-1} \end{pmatrix}, \quad \text{where } \tilde L = I_{n-k} - \hat e e_1^T \ \text{and}\ \tilde L^{-1} = I_{n-k} + \hat e e_1^T,$$
then
$$\begin{aligned}
\tilde X \hat S \tilde X^{-1} &= \begin{pmatrix} \frac{n}{n+1} H_{11} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12} \tilde L^{-1} \\ \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{1}{n+1} e^T \tilde L^{-1} \\ \frac{n}{n+1} \tilde L e w_1^T & \frac{1}{n+1} \tilde L e & \frac{n}{n+1} \tilde L e w_2^T \tilde L^{-1} \end{pmatrix} \\
&= \begin{pmatrix} \frac{n}{n+1} H_{11} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12} \tilde L^{-1} e_1 & \frac{n}{n+1} H_{12} \tilde L^{-1}\begin{pmatrix} e_2 & e_3 & \cdots & e_{n-k}\end{pmatrix} \\ \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{1}{n+1} e^T \tilde L^{-1} e_1 & \frac{1}{n+1} e^T \tilde L^{-1}\begin{pmatrix} e_2 & e_3 & \cdots & e_{n-k}\end{pmatrix} \\ \frac{n}{n+1} (e_1^T \tilde L e) w_1^T & \frac{1}{n+1} e_1^T \tilde L e & \frac{n}{n+1} (e_1^T \tilde L e) w_2^T \tilde L^{-1} e_1 & \frac{n}{n+1} (e_1^T \tilde L e) w_2^T \tilde L^{-1}\begin{pmatrix} e_2 & e_3 & \cdots & e_{n-k}\end{pmatrix} \\ 0 & 0 & 0 & 0 \end{pmatrix} \\
&= \begin{pmatrix} \frac{n}{n+1} H_{11} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12} e & \frac{n}{n+1} H_{12} \tilde L^{-1}\begin{pmatrix} e_2 & e_3 & \cdots & e_{n-k}\end{pmatrix} \\ \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{n-k}{n+1} & \frac{1}{n+1} e^T \tilde L^{-1}\begin{pmatrix} e_2 & e_3 & \cdots & e_{n-k}\end{pmatrix} \\ \frac{n}{n+1} w_1^T & \frac{1}{n+1} & \frac{n}{n+1} w_2^T e & \frac{n}{n+1} w_2^T \tilde L^{-1}\begin{pmatrix} e_2 & e_3 & \cdots & e_{n-k}\end{pmatrix} \\ 0 & 0 & 0 & 0 \end{pmatrix},
\end{aligned}$$
where $e = \begin{pmatrix} 1 & 1 & \cdots & 1 \end{pmatrix}^T \in \mathbb{R}^{n-k}$, $e^T e = n-k$, $\tilde L^{-1} e_1 = e$, and the last equality follows from $\tilde L e = e_1$ (thus, $e_1^T \tilde L e = 1$).    □
Remark 2.
During the above proof of Theorem 2, if X ˜ is generalized to
$$\hat X = \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \hat L \end{pmatrix}, \quad \text{where an invertible matrix } \hat L \text{ satisfies } \hat L e = e_1,$$
the proof also holds.

4.3. An Alternative Proof of Theorem 3

In this alternative proof, the similarity matrix is chosen as $\hat X$. A general invertible matrix $\hat L$ satisfying $\hat L e = e_1$ is applied to validate Theorem 3.
Proof. 
If
$$\hat X \hat S \hat X^{-1} = \begin{pmatrix} \hat S^{(1)} & \hat S^{(2)} \\ 0 & 0 \end{pmatrix}, \quad \text{where } \hat S^{(1)} = \begin{pmatrix} \frac{n}{n+1} H_{11} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{n-k}{n+1} \\ \frac{n}{n+1} w_1^T & \frac{1}{n+1} & \frac{n}{n+1} w_2^T e \end{pmatrix},$$
then the stochastic matrix S ^ ( 1 ) of order k + 2 has the same nonzero eigenvalues as S ^ . We obtain that
$$\begin{pmatrix} \sigma^T & \sigma^T \hat S^{(2)} \end{pmatrix}$$
is a left eigenvector of $\hat X \hat S \hat X^{-1}$ associated with the eigenvalue $\lambda = 1$. This follows from
$$\begin{pmatrix} \sigma^T & \sigma^T \hat S^{(2)} \end{pmatrix} \begin{pmatrix} \hat S^{(1)} & \hat S^{(2)} \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} \sigma^T \hat S^{(1)} & \sigma^T \hat S^{(2)} \end{pmatrix} = \begin{pmatrix} \sigma^T & \sigma^T \hat S^{(2)} \end{pmatrix}.$$
Therefore,
$$\hat\pi^T = \begin{pmatrix} \sigma^T & \sigma^T \hat S^{(2)} \end{pmatrix} \hat X$$
is an eigenvector of $\hat S$ associated with $\lambda = 1$. Since $\hat S$ and $\hat S^{(1)}$ have the same nonzero eigenvalues, and the principal eigenvalue of $\tilde S$ (and hence of $\hat S$) is simple, the stationary probability distribution σ of $\hat S^{(1)}$ is unique. We repartition
$$\hat\pi^T = \begin{pmatrix} \sigma_{1:k}^T & \sigma_{k+1} & \begin{pmatrix} \sigma_{k+2} & \sigma^T \hat S^{(2)} \end{pmatrix} \end{pmatrix} \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \hat L \end{pmatrix}.$$
Multiplying out yields
$$\hat\pi^T = \begin{pmatrix} \sigma_{1:k}^T & \sigma_{k+1} & \begin{pmatrix} \sigma_{k+2} & \sigma^T \hat S^{(2)} \end{pmatrix} \hat L \end{pmatrix}.$$
Since
$$\begin{pmatrix} \sigma_{k+2} & \sigma^T \hat S^{(2)} \end{pmatrix} \hat L = \begin{pmatrix} \sigma^T \hat s^{(1)} e & \sigma^T \hat s^{(1)} \hat L^{-1} \begin{pmatrix} e_2 & \cdots & e_{n-k} \end{pmatrix} \end{pmatrix} \hat L = \begin{pmatrix} \sigma^T \hat s^{(1)} \hat L^{-1} e_1 & \sigma^T \hat s^{(1)} \hat L^{-1} \begin{pmatrix} e_2 & \cdots & e_{n-k} \end{pmatrix} \end{pmatrix} \hat L = \sigma^T \hat s^{(1)},$$
the first equation uses the fact that
$$\hat s^{(1)} = \begin{pmatrix} \frac{n}{n+1} H_{12} \\ \frac{1}{n+1} e^T \\ \frac{n}{n+1} w_2^T \end{pmatrix}, \quad \sigma_{k+2} = \sigma^T \hat s^{(1)} e, \quad \text{and} \quad \hat S^{(2)} = \hat s^{(1)} \hat L^{-1} \begin{pmatrix} e_2 & \cdots & e_{n-k} \end{pmatrix},$$
and the second equation uses $e = \hat L^{-1} e_1$. Hence,
$$\hat\pi^T = \begin{pmatrix} \sigma_{1:k}^T & \sigma_{k+1} & \sigma^T \hat s^{(1)} \end{pmatrix}.$$
Furthermore,
$$\tilde\pi^T = \hat\pi^T \tilde\Pi = \begin{pmatrix} \sigma_{1:k}^T & \sigma^T \hat s^{(1)} & \sigma_{k+1} \end{pmatrix}, \quad \text{with} \quad \hat s^{(1)} = \begin{pmatrix} \frac{n}{n+1} H_{12} \\ \frac{1}{n+1} e^T \\ \frac{n}{n+1} w_2^T \end{pmatrix}.$$
Since $\tilde\pi$ is unique, we conclude that $\tilde\pi$ is the stationary probability distribution of the PageRank model provided $e^T \tilde\pi = 1$. Therefore, we complete the proof.    □

5. Results for 4-Level Matrix Partition

We will show that the nondangling nodes can be further classified into two classes. The nodes which are nondangling but point only to dangling nodes are called “weakly nondangling” nodes, and the other nondangling nodes are referred to as “strongly nondangling” nodes [10]. Therefore, the reduced matrix obtained by lumping the dangling nodes can be further reduced by lumping the weakly nondangling nodes into another single node. The further reduced matrix is also stochastic, with the same nonzero eigenvalues as the Google matrix.
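This classification can be computed directly from the link matrix; the sketch below (assumed dense NumPy input and function name) returns boolean masks for the three classes.

import numpy as np

def classify_nodes(H):
    # dangling: no outlinks; weakly nondangling: outlinks only to dangling nodes;
    # strongly nondangling: all remaining nodes
    dangling = (H != 0).sum(axis=1) == 0
    weak = ~dangling & ((H[:, ~dangling] != 0).sum(axis=1) == 0)
    strong = ~dangling & ~weak
    return strong, weak, dangling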

5.1. Derivation

To lump the dangling nodes and the weakly nondangling nodes [10] in a 4-level matrix partition, we permute the matrix S in (3) such that
$$\bar S = \Pi S \Pi^T = \begin{pmatrix} H_{11}^{(11)} & H_{11}^{(12)} & H_{12}^{(1)} \\ 0 & 0 & H_{12}^{(2)} \\ e w_{k_1}^T & e w_{k_2}^T & e w_2^T \end{pmatrix},$$
where $H_{11}^{(11)} \in \mathbb{R}^{k_1 \times k_1}$, $H_{11}^{(12)} \in \mathbb{R}^{k_1 \times k_2}$, $H_{12}^{(1)} \in \mathbb{R}^{k_1 \times (n-k)}$, $H_{12}^{(2)} \in \mathbb{R}^{k_2 \times (n-k)}$, $w_2 \in \mathbb{R}^{n-k}$, and $k_1$ and $k_2$ are the numbers of strongly nondangling nodes and weakly nondangling nodes, respectively. Moreover, $k = k_1 + k_2$ and $\bar S e = e$. After the adjustment of Li, Chen, and Song [20], the new Google matrix in $4 \times 4$ block form is
$$G = \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{11}^{(12)} & \frac{n}{n+1} H_{12}^{(1)} & \frac{1}{n+1} e \\ 0 & 0 & \frac{n}{n+1} H_{12}^{(2)} & \frac{1}{n+1} e \\ \frac{n}{n+1} e w_{k_1}^T & \frac{n}{n+1} e w_{k_2}^T & \frac{n}{n+1} e w_2^T & \frac{1}{n+1} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} e^T & \frac{1}{n+1} e^T & \frac{1}{n+1} \end{pmatrix}. \tag{12}$$
We define the permutation matrix
$$\Pi_1 = \begin{pmatrix} I & 0 & 0 & 0 \\ 0 & I & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & I & 0 \end{pmatrix}$$
to permute the third and fourth block rows and columns of G simultaneously. Performing $\Pi_1 G \Pi_1^T$ yields
$$\Pi_1 G \Pi_1^T = \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{11}^{(12)} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(1)} \\ 0 & 0 & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(2)} \\ \frac{1}{n+1} e^T & \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{1}{n+1} e^T \\ \frac{n}{n+1} e w_{k_1}^T & \frac{n}{n+1} e w_{k_2}^T & \frac{1}{n+1} e & \frac{n}{n+1} e w_2^T \end{pmatrix}.$$
Let
$$X = \begin{pmatrix} I_{k_1} & 0 & 0 & 0 \\ 0 & I_{k_2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \hat L \end{pmatrix}, \quad \text{where } \hat L \in \mathbb{R}^{(n-k)\times(n-k)} \text{ is an invertible matrix with } \hat L e = e_1,$$
then
$$\begin{aligned}
X \Pi_1 G \Pi_1^T X^{-1} &= \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{11}^{(12)} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(1)} \hat L^{-1} \\ 0 & 0 & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(2)} \hat L^{-1} \\ \frac{1}{n+1} e^T & \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{1}{n+1} e^T \hat L^{-1} \\ \frac{n}{n+1} \hat L e w_{k_1}^T & \frac{n}{n+1} \hat L e w_{k_2}^T & \frac{1}{n+1} \hat L e & \frac{n}{n+1} \hat L e w_2^T \hat L^{-1} \end{pmatrix} \\
&= \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{11}^{(12)} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(1)} \hat L^{-1} \\ 0 & 0 & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(2)} \hat L^{-1} \\ \frac{1}{n+1} e^T & \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{1}{n+1} e^T \hat L^{-1} \\ \frac{n}{n+1} e_1 w_{k_1}^T & \frac{n}{n+1} e_1 w_{k_2}^T & \frac{1}{n+1} e_1 & \frac{n}{n+1} e_1 w_2^T \hat L^{-1} \end{pmatrix} \\
&= \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{11}^{(12)} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(1)} e & \frac{n}{n+1} H_{12}^{(1)} \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{n-k}\end{pmatrix} \\ 0 & 0 & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(2)} e & \frac{n}{n+1} H_{12}^{(2)} \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{n-k}\end{pmatrix} \\ \frac{1}{n+1} e^T & \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{n-k}{n+1} & \frac{1}{n+1} e^T \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{n-k}\end{pmatrix} \\ \frac{n}{n+1} w_{k_1}^T & \frac{n}{n+1} w_{k_2}^T & \frac{1}{n+1} & \frac{n}{n+1} w_2^T e & \frac{n}{n+1} w_2^T \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{n-k}\end{pmatrix} \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.
\end{aligned}$$
Denote by
$$G_1 = \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{11}^{(12)} & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(1)} e \\ 0 & 0 & \frac{1}{n+1} e & \frac{n}{n+1} H_{12}^{(2)} e \\ \frac{1}{n+1} e^T & \frac{1}{n+1} e^T & \frac{1}{n+1} & \frac{n-k}{n+1} \\ \frac{n}{n+1} w_{k_1}^T & \frac{n}{n+1} w_{k_2}^T & \frac{1}{n+1} & \frac{n}{n+1} w_2^T e \end{pmatrix},$$
and then we permute the second and fourth block rows and columns of $G_1$ simultaneously by a permutation matrix $\Pi_2$,
$$\Pi_2 G_1 \Pi_2^T = \begin{pmatrix} I & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & I & 0 & 0 \end{pmatrix} G_1 \begin{pmatrix} I & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & I & 0 & 0 \end{pmatrix}^T = \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{12}^{(1)} e & \frac{1}{n+1} e & \frac{n}{n+1} H_{11}^{(12)} \\ \frac{n}{n+1} w_{k_1}^T & \frac{n}{n+1} w_2^T e & \frac{1}{n+1} & \frac{n}{n+1} w_{k_2}^T \\ \frac{1}{n+1} e^T & \frac{n-k}{n+1} & \frac{1}{n+1} & \frac{1}{n+1} e^T \\ 0 & \frac{n}{n+1} H_{12}^{(2)} e & \frac{1}{n+1} e & 0 \end{pmatrix}.$$
Denote by
$$X_1 = \begin{pmatrix} I & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \hat L \end{pmatrix}, \quad \text{where } \hat L \in \mathbb{R}^{k_2 \times k_2} \text{ is an invertible matrix with } \hat L e = e_1.$$
Then,
$$\begin{aligned}
X_1 \Pi_2 G_1 \Pi_2^T X_1^{-1} &= \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{12}^{(1)} e & \frac{1}{n+1} e & \frac{n}{n+1} H_{11}^{(12)} \hat L^{-1} \\ \frac{n}{n+1} w_{k_1}^T & \frac{n}{n+1} w_2^T e & \frac{1}{n+1} & \frac{n}{n+1} w_{k_2}^T \hat L^{-1} \\ \frac{1}{n+1} e^T & \frac{n-k}{n+1} & \frac{1}{n+1} & \frac{1}{n+1} e^T \hat L^{-1} \\ 0 & \frac{n}{n+1} \hat L H_{12}^{(2)} e & \frac{1}{n+1} \hat L e & 0 \end{pmatrix} \\
&= \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{12}^{(1)} e & \frac{1}{n+1} e & \frac{n}{n+1} H_{11}^{(12)} \hat L^{-1} e_1 & \frac{n}{n+1} H_{11}^{(12)} \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{k_2}\end{pmatrix} \\ \frac{n}{n+1} w_{k_1}^T & \frac{n}{n+1} w_2^T e & \frac{1}{n+1} & \frac{n}{n+1} w_{k_2}^T \hat L^{-1} e_1 & \frac{n}{n+1} w_{k_2}^T \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{k_2}\end{pmatrix} \\ \frac{1}{n+1} e^T & \frac{n-k}{n+1} & \frac{1}{n+1} & \frac{1}{n+1} e^T \hat L^{-1} e_1 & \frac{1}{n+1} e^T \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{k_2}\end{pmatrix} \\ 0 & \frac{n}{n+1} & \frac{1}{n+1} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \\
&= \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{12}^{(1)} e & \frac{1}{n+1} e & \frac{n}{n+1} H_{11}^{(12)} e & \frac{n}{n+1} H_{11}^{(12)} \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{k_2}\end{pmatrix} \\ \frac{n}{n+1} w_{k_1}^T & \frac{n}{n+1} w_2^T e & \frac{1}{n+1} & \frac{n}{n+1} w_{k_2}^T e & \frac{n}{n+1} w_{k_2}^T \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{k_2}\end{pmatrix} \\ \frac{1}{n+1} e^T & \frac{n-k}{n+1} & \frac{1}{n+1} & \frac{k_2}{n+1} & \frac{1}{n+1} e^T \hat L^{-1}\begin{pmatrix} e_2 & \cdots & e_{k_2}\end{pmatrix} \\ 0 & \frac{n}{n+1} & \frac{1}{n+1} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix},
\end{aligned}$$
due to the facts that $H_{12}^{(2)} e = e$ and $e = \hat L^{-1} e_1$.
Let
$$G_2 = \begin{pmatrix} \frac{n}{n+1} H_{11}^{(11)} & \frac{n}{n+1} H_{12}^{(1)} e & \frac{1}{n+1} e & \frac{n}{n+1} H_{11}^{(12)} e \\ \frac{n}{n+1} w_{k_1}^T & \frac{n}{n+1} w_2^T e & \frac{1}{n+1} & \frac{n}{n+1} w_{k_2}^T e \\ \frac{1}{n+1} e^T & \frac{n-k}{n+1} & \frac{1}{n+1} & \frac{k_2}{n+1} \\ 0 & \frac{n}{n+1} & \frac{1}{n+1} & 0 \end{pmatrix}. \tag{13}$$
Then, we obtain a PageRank vector expression for this level 4 partition by using Theorem 3.
Theorem 5.
With the above notation and $G_2$ defined by (13), $G_2$ is a stochastic matrix of order $k_1 + 3$ with the same nonzero eigenvalues as the full new Google matrix G. Let
$$\hat\sigma^T = \hat\sigma^T G_2, \quad \hat\sigma \geq 0,\ \hat\sigma^T e = 1, \quad \text{where } G_2 \in \mathbb{R}^{(k_1+3)\times(k_1+3)}.$$
Partition $\hat\sigma^T = \begin{pmatrix} \hat\sigma_{1:k_1+1}^T & \hat\sigma_{k_1+2} & \hat\sigma_{k_1+3} \end{pmatrix}$, where $\hat\sigma_{k_1+2}$ and $\hat\sigma_{k_1+3}$ are scalars. Then, we define the vector
$$\sigma^T = \begin{pmatrix} \hat\sigma_{1:k_1+1}^T & \hat\sigma^T \begin{pmatrix} \frac{n}{n+1} H_{11}^{(12)} \\ \frac{n}{n+1} w_{k_2}^T \\ \frac{1}{n+1} e^T \\ 0 \end{pmatrix} & \hat\sigma_{k_1+2} \end{pmatrix} \in \mathbb{R}^{k_1+k_2+2},$$
and the vector σ satisfies $\sigma^T G_1 = \sigma^T$. Thus, the PageRank vector $\pi_{n+1}$ for $G \in \mathbb{R}^{(n+1)\times(n+1)}$ is given by
$$\pi_{n+1}^T = \begin{pmatrix} \sigma_{1:k_1+k_2}^T & \sigma^T \begin{pmatrix} \frac{n}{n+1} H_{12}^{(1)} \\ \frac{n}{n+1} H_{12}^{(2)} \\ \frac{1}{n+1} e^T \\ \frac{n}{n+1} w_2^T \end{pmatrix} & \sigma_{k_1+k_2+1} \end{pmatrix}.$$
Proof. 
The proof is a technical modification of the proof in Section 4.3; therefore, it is omitted here.   □
Thus, we claim that the new similarity matrix $\begin{pmatrix} I & 0 \\ 0 & L \end{pmatrix}$, where L satisfies (8), leads to a dimension reduction of the minimal irreducible web link matrix involved in its 4-level matrix partition form. After obtaining the smaller stationary vector $\hat\sigma^T$ of $G_2$, the PageRank vector can be recovered according to Theorem 5. The corresponding procedure is presented in Algorithm 1.
Algorithm 1 Lumping algorithm for the minimal irreducible PageRank model (5)
  • Input: an initial nonnegative vector $\hat\sigma^T = \begin{pmatrix} \hat\sigma_{1:k_1}^T & \hat\sigma_{k_1+1} & \hat\sigma_{k_1+2} & \hat\sigma_{k_1+3} \end{pmatrix} \in \mathbb{R}^{k_1+3}$ with $\hat\sigma \geq 0$ and $\|\hat\sigma\|_1 = 1$, and a prescribed tolerance $\epsilon$.
  • Output: the PageRank vector $\pi_{n+1}^T$.
  •  Set $\tau = 1$.
  • while $\tau \geq \epsilon$ do
  •       $\tilde\sigma_{1:k_1}^T = \frac{n}{n+1} \hat\sigma_{1:k_1}^T H_{11}^{(11)} + \frac{n}{n+1} \hat\sigma_{k_1+1} w_{k_1}^T + \frac{1}{n+1} \cdot \frac{1}{n+1} e^T$;
  •       $\tilde\sigma_{k_1+1} = \frac{n}{n+1} \hat\sigma_{1:k_1}^T H_{12}^{(1)} e + \frac{n}{n+1} (w_2^T e)\, \hat\sigma_{k_1+1} + \frac{n-k}{n+1} \cdot \frac{1}{n+1} + \frac{n}{n+1} \hat\sigma_{k_1+3}$;
  •     % $\hat\sigma_{k_1+2} = \frac{1}{n+1}$ (by direct computation), so it is not updated;
  •       $\tilde\sigma_{k_1+3} = 1 - \tilde\sigma_{1:k_1}^T e - \tilde\sigma_{k_1+1} - \frac{1}{n+1}$;
  •       $\tau = \|\tilde\sigma - \hat\sigma\|_1$;
  •       $\hat\sigma = \tilde\sigma$;
  • end while
  •  Compute $x^T = \hat\sigma^T \begin{pmatrix} \frac{n}{n+1} H_{11}^{(12)} \\ \frac{n}{n+1} w_{k_2}^T \\ \frac{1}{n+1} e^T \\ 0 \end{pmatrix}$;
  •  Recover $\sigma^T$ by $\sigma^T = \begin{pmatrix} \hat\sigma_{1:k_1+1}^T & x^T & \hat\sigma_{k_1+2} \end{pmatrix}$;
  •  Compute $y^T = \sigma^T \begin{pmatrix} \frac{n}{n+1} H_{12}^{(1)} \\ \frac{n}{n+1} H_{12}^{(2)} \\ \frac{1}{n+1} e^T \\ \frac{n}{n+1} w_2^T \end{pmatrix}$;
  •  Obtain the PageRank vector $\pi_{n+1}^T = \begin{pmatrix} \sigma_{1:k_1+k_2}^T & y^T & \sigma_{k_1+k_2+1} \end{pmatrix}$.
The power method applied to $G_2$ involves a sparse matrix–vector product with a small matrix of order $k_1 + 3$ as well as several vector operations. The final steps in Algorithm 1 recover the minimal irreducible PageRank vector according to Theorem 5. As the dangling nodes and the weakly nondangling nodes are excluded from each iteration of the power method, Algorithm 1 has the same convergence rate as the power method applied to G but is much faster due to the reduced matrix computation.
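The reduction can be illustrated with a short sketch (assumed helper names and toy blocks, not the authors' code) that assembles the small matrix $G_2$ of (13) from the blocks of the partition (12) and applies the power method to it; the full PageRank vector is then recovered from the resulting $\hat\sigma$ by the last four steps of Algorithm 1.

import numpy as np

def build_G2(H11_11, H11_12, H12_1, w_k1, w_k2, w_2, n, k):
    # assemble the (k1+3) x (k1+3) reduced matrix G_2 of (13); the blocks are assumed
    # to come from the 4 x 4 partition (12), so strong rows and the vector w sum to 1
    k1, k2 = H11_11.shape[0], H11_12.shape[1]
    c = n / (n + 1.0)
    one = lambda x: np.array([[x]])
    G2 = np.block([
        [c * H11_11, c * H12_1.sum(axis=1)[:, None], np.full((k1, 1), 1/(n+1)), c * H11_12.sum(axis=1)[:, None]],
        [c * w_k1[None, :], one(c * w_2.sum()), one(1/(n+1)), one(c * w_k2.sum())],
        [np.full((1, k1), 1/(n+1)), one((n - k)/(n+1)), one(1/(n+1)), one(k2/(n+1))],
        [np.zeros((1, k1)), one(c), one(1/(n+1)), one(0.0)],
    ])
    assert np.allclose(G2.sum(axis=1), 1.0)       # G_2 is row stochastic
    return G2

def power_method(M, tol=1e-10, maxit=1000):
    x = np.full(M.shape[0], 1.0 / M.shape[0])
    for _ in range(maxit):
        y = x @ M
        if np.abs(y - x).sum() < tol:
            return y
        x = y
    return x

Since $G_2$ has order $k_1 + 3$ rather than $n + 1$, each iteration costs far fewer operations than applying the power method to the full matrix G, which is the source of the savings reported in Table 2.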

5.2. Numerical Experiments

In this subsection, the web link matrix “wiki-Vote”, which has 8297 web pages and 103,689 hyperlinks, is used in our numerical example. There are 2187 dangling nodes and 905 “weakly nondangling” nodes in this web link matrix. We compare the power method, LDNA [20], and Algorithm 1 for computing (5), as follows:
  • The original method [17], i.e., the power method applied to a full matrix G in (12).
  • LDNA algorithm in [20]. Lumping all dangling nodes into a single node, the power method is applied to the ( k + 2 ) × ( k + 2 ) matrix S ^ ( 1 ) in Theorem 2.
  • Algorithm 1. Lumping two classes of nodes, the power method is applied to the $(k_1+3) \times (k_1+3)$ matrix $G_2$ in (13).
The parameters v and w are chosen as $v = w = \frac{1}{n} e$ in the above algorithms, where e is a vector of all ones. The overall termination criterion is triggered once the residual norm falls below $\epsilon = 10^{-10}$. Suppose that the three algorithms all need N iterations to reach the convergence criterion. Table 1 and Table 2 give the residual norms and operation costs of the three algorithms, respectively. Here, nnz(H) denotes the number of nonzero entries of the matrix H; a sparse matrix–vector product requires O(nnz(H)) operations. From Table 1, we observe that Algorithm 1 produces the smallest residual norms of the three algorithms, and that the three algorithms have almost the same convergence rate. From Table 2, the LDNA algorithm saves about $O((n-k)(N-1))$ operations, and Algorithm 1 saves about $O((n-k_1)(N-1))$ operations. Because Algorithm 1 computes the PageRank vector by lumping the dangling and “weakly nondangling” nodes and applies the power method to the smallest matrix, it needs the fewest operations.

6. Conclusions

In this paper, we have presented some new proofs and provided a novel efficient algorithm for computing a kind of minimal irreducible PageRank. Firstly, the spectral distribution of the new Google matrix is validated by a similarity transformation method in addition to the determinant method; the simple and generalized similarity transformation matrices make it easier to validate Theorems 2 and 3 in the new PageRank model. Secondly, the new Google matrix in a 4 × 4 block partition is further lumped to reduce the order of the computation matrix involved, and some theoretical results are discussed. Thirdly, as one real application of our results, the power method can operate on a smaller matrix than before, so the computational cost is reduced. For further work, we can consider how to choose an extrapolation method operating on a reduced matrix to accelerate the computation of web information retrieval models. The extrapolation method combined with a lumping method is also worth studying.

Author Contributions

Conceptualization, Y.D.; methodology, Y.F. and Y.D.; software, Y.F.; validation, Y.D. and J.Y.; formal analysis, Y.F. and Y.D.; investigation, Y.D.; resources, Y.F.; data curation, Y.F. and Y.D.; writing—original draft preparation, Y.D. and J.Y.; writing—review and editing, Y.F.; supervision, J.Y.; project administration, Y.F. and J.Y.; funding acquisition, Y.F. and J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant Nos. 12001363, 71671125).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the editor and referees who kindly reviewed the earlier version of this manuscript and provided valuable suggestions and comments.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Page, L.; Brin, S.; Motwani, R.; Winograd, T. The PageRank Citation Ranking: Bringing Order to the Web; Stanford Digital Libraries, 1999. Available online: http://dbpubs.stanford.edu:8090/pub/1999–66 (accessed on 16 July 2022).
  2. Feng, Y.H.; You, J.X.; Dong, Y.X. An extrapolation iteration and its lumped type iteration for computing PageRank. Bull. Iran. Math. Soc. 2021.
  3. Langville, A.N.; Meyer, C.D. Google’s PageRank and Beyond: The Science of Search Engine Rankings; Princeton University Press: Princeton, NJ, USA, 2006.
  4. Brezinski, C.; Redivo-Zaglia, M. The PageRank Vector: Properties, Computation, Approximation, and Acceleration. SIAM J. Matrix Anal. Appl. 2006, 28, 551–575.
  5. Bai, Z.Z.; Wu, W.T.; Muratova, G.V. The power method and beyond. Appl. Numer. Math. 2021, 164, 29–42.
  6. Berkhin, P. A survey on PageRank computing. Internet Math. 2005, 2, 73–120.
  7. Miao, C.Q.; Tan, X.Y. Accelerating the Arnoldi method via Chebyshev polynomials for computing PageRank. J. Comput. Appl. Math. 2020, 377, 112891.
  8. Tian, Z.L.; Zhang, Y.; Wang, J.X.; Gu, C.Q. Several relaxed iteration methods for computing PageRank. J. Comput. Appl. Math. 2021, 388, 113295.
  9. Ipsen, I.C.F.; Selee, T.M. PageRank computation with special attention to dangling nodes. SIAM J. Matrix Anal. Appl. 2007, 29, 1281–1296.
  10. Lin, Y.Q.; Shi, X.H.; Wei, Y.M. On computing PageRank via lumping the Google matrix. J. Comput. Appl. Math. 2009, 224, 702–708.
  11. Tian, Z.L.; Liu, Z.Y.; Dong, Y.H. The coupled iteration algorithms for computing PageRank. Numer. Algorithms 2021, 89, 1603–1637.
  12. Gu, X.M.; Lei, S.L.; Zhang, K.; Shen, Z.L.; Wen, C.; Carpentieri, B. A Hessenberg-type algorithm for computing PageRank Problems. Numer. Algorithms 2022, 89, 1845–1863.
  13. Jin, Y.; Wen, C.; Shen, Z.L.; Gu, X.M. A simpler GMRES algorithm accelerated by Chebyshev polynomials for computing PageRank. J. Comput. Appl. Math. 2022, 413, 114395.
  14. Hu, Q.Y.; Wen, C.; Huang, T.Z.; Shen, Z.L.; Gu, X.M. A variant of the Power-Arnoldi algorithm for computing PageRank. J. Comput. Appl. Math. 2021, 381, 113034.
  15. Yu, Q.; Miao, Z.K.; Wu, G.; Wei, Y.M. Lumping Algorithms for Computing Google’s PageRank and Its Derivative, with Attention to Unreferenced Nodes. Inf. Retr. 2012, 15, 503–526.
  16. Dong, Y.X.; Gu, C.Q.; Chen, Z.B. An Arnoldi-Inout method accelerated with a two-stage matrix splitting iteration for computing PageRank. Calcolo 2017, 54, 857–879.
  17. Gleich, D.F. PageRank beyond the web. SIAM Rev. 2015, 57, 321–363.
  18. Dong, Y.X.; You, J.X.; Feng, Y.H. Remarks on lumping PageRank results of Ipsen and Selee. arXiv 2021, arXiv:2107.11080.
  19. Dong, Y.X.; Feng, Y.H.; You, J.X. On computing HITS ExpertRank via lumping the hub matrix. arXiv 2021, arXiv:2112.13173.
  20. Li, L.L.; Chen, X.; Song, Y.Z. The PageRank model of minimal irreducible adjustment and its lumping method. J. Appl. Math. Comput. 2013, 42, 297–308.
  21. Lee, C.P.C.; Golub, G.H.; Zenios, S.A. A Fast Two-Stage Algorithm for Computing PageRank and Its Extensions; Technical Report; Stanford University: Palo Alto, CA, USA, 2003.
  22. Avrachenkov, K.; Litvak, N.; Pham, K.S. A singular perturbation approach for choosing the PageRank damping factor. Internet Math. 2008, 5, 47–69.
  23. Engström, C.; Silvestrov, S. Using graph partitioning to calculate PageRank in a changing network. Data Anal. Appl. 2 Util. Results Eur. Other Top. 2019, 3, 179–191.
  24. Wu, G. Eigenvalues of Certain Augmented Complex Stochastic Matrices with Application in PageRank. Recent Advances in Scientific Computing and Matrix Analysis. 2011, pp. 85–92. Available online: https://www.researchgate.net/profile/Gang-Wu-18/publication/266862168_Eigenvalues_of_certain_augmented_complex_stochastic_matrices_with_applications_to_PageRank/links/571587b708ae1a840264fe1a/Eigenvalues-of-certain-augmented-complex-stochastic-matrices-with-applications-to-PageRank.pdf (accessed on 16 July 2022).
  25. Wu, G. On the convergence of the minimally irreducible Markov chain method with applications to PageRank. Calcolo 2017, 54, 267–279.
  26. Brauer, A. Limits for the characteristic roots of a matrix. IV: Applications to stochastic matrices. Duke Math. J. 1952, 19, 75–91.
  27. Ding, J.; Zhou, A.H. Eigenvalues of rank-one updated matrices with some applications. Appl. Math. Lett. 2007, 20, 1223–1226.
Table 1. Comparison of residual norms for the three algorithms.

Iteration   Original       LDNA           Algorithm 1
10          3.0192e-04     1.9117e-04     1.2928e-04
20          1.2539e-06     7.0788e-07     4.5348e-07
30          6.0013e-09     3.3144e-09     1.9865e-09
36          2.4952e-10     1.3619e-10     7.8191e-11
Table 2. Comparison of operation costs for the three algorithms.

Algorithm     Bound for operations
Original      O((nnz(H) + n) N)
LDNA          O((nnz(H11) + k) N) + O(nnz(H12) + k)
Algorithm 1   O((nnz([H11^(11) H12^(1)]) + k1) N) + O(nnz([H11^(12) H12]) + k1)
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Feng, Y.; Dong, Y.; You, J. A Note on a Minimal Irreducible Adjustment Pagerank. Symmetry 2022, 14, 1640. https://0-doi-org.brum.beds.ac.uk/10.3390/sym14081640