Article

A Modified Liu and Storey Conjugate Gradient Method for Large Scale Unconstrained Optimization Problems

1 Department of Mathematics, Faculty of Ocean Engineering Technology and Informatics, Universiti Malaysia Terengganu, Kuala Nerus 21030, Terengganu, Malaysia
2 Department of Mathematics and Statistics, College of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 13318, Saudi Arabia
3 Department of Mathematics, College of Science, Jazan University, Jazan 45142, Saudi Arabia
* Author to whom correspondence should be addressed.
Submission received: 21 June 2021 / Revised: 23 July 2021 / Accepted: 25 July 2021 / Published: 28 July 2021

Abstract

The conjugate gradient (CG) method is one of the most popular methods for solving large-scale unconstrained optimization problems because, unlike Newton's method and its approximations, it does not require the second derivative. Moreover, the CG method can be applied in many fields, such as neural networks and image restoration. Many complicated two- and three-term methods have been proposed to solve such optimization problems. In this paper, we propose a simple, efficient, and robust CG method. The new method is constructed based on the Liu and Storey method to overcome its convergence and descent-property shortcomings. Under some assumptions, the new modified method satisfies the convergence properties and the sufficient descent condition. The numerical results, reported in terms of the number of iterations and CPU time, show that the new method outperforms well-known CG methods such as CG-Descent 5.3, Liu and Storey, and Dai and Liao.

1. Introduction

The optimization problem that we want to solve takes the following form:
\min f(x), \quad x \in \mathbb{R}^n,
where $f : \mathbb{R}^n \to \mathbb{R}$ is a continuous and differentiable function whose gradient $\nabla f(x)$ is available, and $x_1 \in \mathbb{R}^n$ is an arbitrary initial point. The CG method generates a sequence of iterates $x_k$ as follows:
x_{k+1} = x_k + \alpha_k d_k, \quad k = 1, 2, \ldots,
where $x_k$ is the current iterate and $\alpha_k > 0$ is a step size obtained from a line search, such as an exact or inexact line search. The search direction $d_k$ in the CG method requires only the first derivative of the objective function and is defined as follows:
d_k = \begin{cases} -g_k, & \text{if } k = 1, \\ -g_k + \beta_k d_{k-1}, & \text{if } k \geq 2, \end{cases}
where $g(x) = \nabla f(x)$, $g_k = g(x_k)$, and $\beta_k$ is known as the CG formula or CG parameter. To obtain the step size $\alpha_k$, we can use one of the following line searches:
Exact line search, which is given by the following equation
f(x_k + \alpha_k d_k) = \min_{\alpha \geq 0} f(x_k + \alpha d_k).
However, Equation (3) is computationally expensive if the function has multiple local minima or stationary points.
Inexact line search; normally, we use the strong Wolfe–Powell (SWP) line search [1,2], which is defined as follows:
f(x_k + \alpha_k d_k) \leq f(x_k) + \delta \alpha_k g_k^T d_k,
and
|g(x_k + \alpha_k d_k)^T d_k| \leq \sigma |g_k^T d_k|,
where $0 < \delta < \sigma < 1$.
From Equation (4), we can note that
f(x_k + \alpha_k d_k) - f(x_k) \leq \delta \alpha_k g_k^T d_k,
i.e.,
f(x_{k+1}) - f(x_k) \leq \delta \alpha_k g_k^T d_k.
If Equation (7) holds, then the following inequality holds:
f(x_{k+1}) \leq f(x_k).
Also, note that Equation (5) can be written as follows:
\sigma g_k^T d_k \leq g(x_k + \alpha_k d_k)^T d_k \leq -\sigma g_k^T d_k.
Another version of the Wolfe–Powell (WP) line search is the weak Wolfe–Powell (WWP) line search, which is defined by Equation (4) together with the following inequality:
g(x_k + \alpha_k d_k)^T d_k \geq \sigma g_k^T d_k.
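As a concrete reading of the conditions above, the short Python check below verifies whether a trial step size satisfies the SWP conditions (4) and (5). This is a minimal sketch; the function and variable names are ours, and the default values of delta and sigma simply mirror the settings used later in the numerical experiments.

import numpy as np

def satisfies_swp(f, grad, x_k, d_k, alpha, delta=0.01, sigma=0.1):
    # Strong Wolfe-Powell check for a trial step length alpha along d_k.
    g_k = grad(x_k)
    slope = g_k @ d_k                    # g_k^T d_k, negative for a descent direction
    x_new = x_k + alpha * d_k
    sufficient_decrease = f(x_new) <= f(x_k) + delta * alpha * slope      # condition (4)
    curvature = abs(grad(x_new) @ d_k) <= sigma * abs(slope)              # condition (5)
    return sufficient_decrease and curvature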
The most famous classical formulas of the CG methods are those of Hestenes–Stiefel (HS) [3], Polak–Ribière–Polyak (PRP) [4], Liu and Storey (LS) [5], Fletcher–Reeves (FR) [6], the conjugate descent (CD) method of Fletcher [7], and Dai and Yuan (DY) [8], given as follows:
\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{LS} = -\frac{g_k^T y_{k-1}}{d_{k-1}^T g_{k-1}},
\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{CD} = -\frac{\|g_k\|^2}{d_{k-1}^T g_{k-1}}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}},
where $y_{k-1} = g_k - g_{k-1}$.
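For quick reference, the classical parameters above can be evaluated in a few lines of code. The NumPy sketch below is illustrative only; the function name and arguments (current gradient g_k, previous gradient g_km1, previous direction d_km1) are our own choices, not part of the cited methods.

import numpy as np

def classical_betas(g_k, g_km1, d_km1):
    # Classical CG parameters; y_{k-1} = g_k - g_{k-1} as in the text.
    y_km1 = g_k - g_km1
    return {
        "HS":  (g_k @ y_km1) / (d_km1 @ y_km1),
        "PRP": (g_k @ y_km1) / (g_km1 @ g_km1),
        "LS":  -(g_k @ y_km1) / (d_km1 @ g_km1),
        "FR":  (g_k @ g_k) / (g_km1 @ g_km1),
        "CD":  -(g_k @ g_k) / (d_km1 @ g_km1),
        "DY":  (g_k @ g_k) / (d_km1 @ y_km1),
    }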
The first group of classical methods, i.e., HS, PRP, and LS, is efficient in practice. However, Powell [9] constructed a non-convex counterexample showing that PRP, HS, and LS may fail to satisfy the convergence properties even when the exact line search is used. Powell also pointed out that, to achieve the convergence properties of the PRP, HS, and LS methods, these parameters should be kept non-negative. Gilbert and Nocedal [10] showed that the non-negative PRP or HS method, i.e., $\beta_k = \max\{\beta_k^{HS}, 0\}$, is convergent under several different line searches. The second group of classical CG methods, i.e., FR, CD, and DY, is robust and convergent; however, it is less efficient in practice.
The descent (downhill) condition plays an important role in the convergence of the CG method and in its robustness; it is defined as follows:
g_k^T d_k < 0.
Al-Baali [11] proposed a stronger version of the descent condition, called the sufficient descent condition, which plays a significant role in the convergence of the CG method; the author used it to establish the convergence properties of $\beta_k^{FR}$. It is defined as follows: there exists a constant $c > 0$ such that
g_k^T d_k \leq -c \|g_k\|^2, \quad \forall k \in \mathbb{N}.
If the objective function is quadratic, i.e., $f(x) = g^T x + \frac{1}{2} x^T H x$, and the step size is obtained by the exact line search (3), then the CG method satisfies the conjugacy condition $d_i^T H d_j = 0$, $i \neq j$. Using the mean value theorem and Equation (2), one can obtain $\beta_k^{HS}$. Motivated by the quasi-Newton BFGS method and the limited-memory BFGS (L-BFGS) method, and using Equation (2), Dai and Liao [12] proposed the following conjugacy condition:
d_k^T y_{k-1} = -t g_k^T s_{k-1},
where $s_{k-1} = x_k - x_{k-1}$ and $t \geq 0$; the numerical experiments in [12] use $t = 0.1$. In the case $t = 0$, Equation (8) becomes the classical conjugacy condition. By using Equations (2) and (8), Dai and Liao [12] proposed the following CG formula:
\beta_k^{DL} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}} - t \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}.
However, $\beta_k^{DL}$ faces the same problem as $\beta_k^{PRP}$ and $\beta_k^{HS}$; that is, $\beta_k^{DL}$ is not non-negative in general. Thus, Dai and Liao [12] replaced Equation (9) by
\beta_k^{DL+} = \max\{\beta_k^{HS}, 0\} - t \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}.
Hager and Zhang [13,14] presented a modified CG parameter that satisfies the sufficient descent property $g_k^T d_k \leq -(7/8)\|g_k\|^2$ for any inexact line search. This version of the CG method is convergent whenever the line search satisfies the WP conditions. The formula is given as follows:
\beta_k^{HZ} = \max\{\beta_k^{N}, \eta_k\},
where $\beta_k^{N} = \frac{1}{d_k^T y_k}\left(y_k - 2 d_k \frac{\|y_k\|^2}{d_k^T y_k}\right)^T g_k$, $\eta_k = \frac{-1}{\|d_k\| \min\{\eta, \|g_k\|\}}$, and $\eta > 0$ is a constant.
Note that if $t = 2\frac{\|y_k\|^2}{s_k^T y_k}$, then $\beta_k^{N} = \beta_k^{DL}$. Zhang et al. [15] and Zhang and Xu [16] used Taylor series expansions and presented the following modified secant relation:
B_k s_{k-1} = \bar{y}_{k-1}, \quad \text{where } \bar{y}_{k-1} = y_{k-1} + \frac{\vartheta_{k-1}}{\|s_{k-1}\|^2} s_{k-1} \text{ and } \vartheta_{k-1} = 6(f_{k-1} - f_k) + 3(g_{k-1} + g_k)^T s_{k-1}.
From (12), Yabe and Takano [17] proposed the following formula:
\beta_k^{YT} = \frac{g_k^T (z_{k-1} - t s_{k-1})}{d_{k-1}^T z_{k-1}}, \quad \text{where } z_{k-1} = y_{k-1} + \rho \frac{\vartheta_{k-1}}{\|s_{k-1}\|^2} s_{k-1} \text{ and } \rho \geq 0.
Moreover, Dehghani et al. [18] proposed the following formula based on a modified secant relation:
\beta_k = \frac{g_k^T (y_{k-1}^{*} - t s_{k-1})}{d_{k-1}^T y_{k-1}^{*}}.
Jiang et al. [19] proposed a CG method obtained by replacing $d_{k-1}$ with $g_{k-1}$ in $\beta_k^{HS}$, as follows:
\beta_k^{JJSL} = \frac{g_k^T y_{k-1}}{g_{k-1}^T y_{k-1}}.
Furthermore, Wei et al. [20] proposed a non-negative formula, referred to as the WYL coefficient, defined as follows:
\beta_k^{WYL} = \frac{\|g_k\|^2 - \frac{\|g_k\|}{\|g_{k-1}\|} g_k^T g_{k-1}}{\|g_{k-1}\|^2}.
This parameter is similar to the PRP coefficient; it converges with both exact and inexact line searches and provides the sufficient descent condition. Many modifications of the WYL coefficient have been suggested, and the reader can refer to [21]. Here we mention two of them:
\beta_k^{VHS} = \frac{\|g_k\|^2 - \frac{\|g_k\|}{\|g_{k-1}\|} g_k^T g_{k-1}}{d_{k-1}^T (g_k - g_{k-1})} \quad \text{and} \quad \beta_k^{NPRP} = \frac{\|g_k\|^2 - \frac{\|g_k\|}{\|g_{k-1}\|} |g_k^T g_{k-1}|}{\|g_{k-1}\|^2},
which were presented in [22,23], respectively.
Al-Baali et al. [24] proposed a new CG version, called G3TCG, that offers many choices of the CG parameter. The same study [24] found that the G3TCG method is more efficient than $\beta_k^{HZ}$ in some cases and competitive in others.
The convergence rate of the CG method is linear unless the iterative procedure is restarted occasionally, at least every $n$ iterations [25]. Beale [26] suggested using a two-term CG direction, instead of the steepest descent (SD) direction ($d_k = -g_k$, $k \geq 1$), as the restart search direction; he also extended this method to a three-term method for the non-restart direction. Furthermore, Powell [25] suggested restarting the search direction using Beale's method if
|g_k^T g_{k-1}| > 0.2 \|g_k\|^2
holds, or after every $n$ iterations, whichever occurs first. Dai and Yuan [27] extended the Powell restart criterion to the following form:
|g_k^T g_{k-1}| > \tau \|g_k\|^2, \quad \tau \in (0, 1).
Besides, Alhawarat et al. [28] presented the following simple formula:
\beta_k^{AZPRP} = \begin{cases} \frac{\|g_k\|^2 - \mu_k |g_k^T g_{k-1}|}{\|g_{k-1}\|^2}, & \text{if } \|g_k\|^2 > \mu_k |g_k^T g_{k-1}|, \\ 0, & \text{otherwise}, \end{cases}
where $\|\cdot\|$ represents the Euclidean norm and $\mu_k$ is defined as follows:
\mu_k = \frac{\|x_k - x_{k-1}\|}{\|y_k\|}.
Lemma 3.1 in [28] shows that, if Assumption 1 holds and the Beale–Powell restart condition is violated for the non-restart search direction, then $\|g_k\|^2 > \mu_k |g_k^T g_{k-1}|$ holds.
Moreover, Kaelo et al. [29] proposed a non-negative CG formula with convergence properties and compared it with $\beta_k^{AZPRP}$. The proposed formula is given as follows:
\beta_k^{PKT} = \begin{cases} \frac{\|g_k\|^2 - g_k^T g_{k-1}}{\max\{d_{k-1}^T y_{k-1},\, -g_{k-1}^T d_{k-1}\}}, & \text{if } 0 < g_k^T g_{k-1} < \|g_k\|^2, \\ \frac{\|g_k\|^2}{\max\{d_{k-1}^T y_{k-1},\, -g_{k-1}^T d_{k-1}\}}, & \text{otherwise}. \end{cases}
To solve non-linear monotone equations with convex constraints, the authors of [30] applied the Powell symmetrical technique to the Liu–Storey conjugate gradient method. For more about non-linear monotone equations with convex constraints and their applications, the reader can refer to [31,32,33,34].

2. The Proposed CG Formula and Its Motivation

The CG method with $\beta_k^{LS}$ does not, in general, satisfy the descent condition. On the other hand, $\beta_k^{LS}$ belongs to the group of efficient CG methods described above, and we would like to inherit this efficiency. To avoid the convergence problem, we use $\beta_k^{AZPRP}$ [28] and $\beta_k^{DL}$ [12] to construct a new non-negative CG method that satisfies the sufficient descent condition with the SWP line search; moreover, its convergence properties are established. The new formula is a modification of $\beta_k^{LS}$, $\beta_k^{DL}$, and $\beta_k^{AZPRP}$, defined as follows:
\beta_k^{LS+} = \begin{cases} \frac{\|g_k\|^2 - \mu_k |g_k^T g_{k-1}|}{-d_{k-1}^T g_{k-1}}, & \text{if } \|g_k\|^2 > \mu_k |g_k^T g_{k-1}|, \\ \beta_k^{DLHS}, & \text{otherwise}, \end{cases}
where $\|\cdot\|$ represents the Euclidean norm and
\beta_k^{DLHS} = -\mu_k \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}.
The proposed method in Equation (16) has the following features (a computational sketch follows this list):
  • It satisfies the descent condition.
  • It satisfies the convergence properties.
  • Equation (16) is constructed based on $\beta_k^{LS}$ with a restart condition, so that it is non-negative and avoids the convergence problems.
  • Dai and Liao [12] suggested using $-t\,(g_k^T s_{k-1} / d_{k-1}^T y_{k-1})$, rather than the steepest descent direction, as the restart term; Equation (16) is restarted with $-\mu_k\,(g_k^T s_{k-1} / d_{k-1}^T y_{k-1})$, which is more efficient than a steepest descent restart.
  • The numerical results demonstrate the efficiency and robustness of Equation (16) compared with the other CG methods, including CG-Descent, LS, and DL+.
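As a complement to the list above, the following Python sketch shows how Equation (16) could be evaluated at a single iteration. It is an illustrative reading under the sign conventions adopted above (and it takes the denominator of mu_k to be the norm of y_{k-1}); the function and variable names are ours, and this is not the authors' reference implementation.

import numpy as np

def beta_LS_plus(g_k, g_km1, d_km1, x_k, x_km1):
    # Illustrative evaluation of the proposed LS+ parameter, Equation (16).
    y_km1 = g_k - g_km1                       # y_{k-1} = g_k - g_{k-1}
    s_km1 = x_k - x_km1                       # s_{k-1} = x_k - x_{k-1}
    mu_k = np.linalg.norm(s_km1) / np.linalg.norm(y_km1)
    gk_sq = g_k @ g_k
    if gk_sq > mu_k * abs(g_k @ g_km1):
        # LS-type term with the AZPRP-style non-negativity safeguard
        return (gk_sq - mu_k * abs(g_k @ g_km1)) / (-(d_km1 @ g_km1))
    # Dai-Liao-type restart term
    return -mu_k * (g_k @ s_km1) / (d_km1 @ y_km1)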

3. Convergence Analysis of $\beta_k^{LS+}$

In the analysis below, convergence means obtaining a stationary point. If $g_k = 0$, then a stationary point has already been obtained; thus, we assume $g_k \neq 0$ for all $k \geq 1$ in the subsequent analysis. For more about local and global minimum points, the reader can refer to the following useful references [35,36].
To establish the convergence of the modified CG method, the following assumption is required.
Assumption 1. 
A. The level set $\Phi = \{x \mid f(x) \leq f(x_1)\}$ is bounded; that is, a positive constant $\lambda$ exists such that
\|x\| \leq \lambda, \quad \forall x \in \Phi.
B. In some neighborhood $P$ of $\Phi$, $f$ is continuously differentiable and its gradient is Lipschitz continuous; that is, there exists a constant $L > 0$ such that, for all $x, y \in P$,
\|g(x) - g(y)\| \leq L \|x - y\|.
This assumption implies that there exists a positive constant $\bar{\gamma}$ such that
\|g(x)\| \leq \bar{\gamma}, \quad \forall x \in P.
The following algorithm (Algorithm 1) summarizes the steps of the CG method for obtaining a stationary point using Equation (16) and the SWP line search.
Algorithm 1. The steps to obtain the optimum point of the optimization problem by using the CG method.
Step 1 Provide a starting point $x_1$. Set the initial search direction $d_1 = -g_1$. Let $k = 1$.
Step 2 If a stopping criterion is satisfied, then stop.
Step 3 Compute $d_k$ based on (2) with (16).
Step 4 Compute $\alpha_k$ using (4) and (5).
Step 5 Update $x_{k+1}$ based on (1).
Step 6 Set $k := k + 1$ and go to Step 2.
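For illustration only, the loop below sketches Algorithm 1 in Python, with SciPy's Wolfe line search standing in for the SWP conditions (4) and (5). The helper beta_LS_plus is the sketch given after Equation (16); the tolerance, iteration cap, and fallback step are our own choices, not the settings of the paper.

import numpy as np
from scipy.optimize import line_search, rosen, rosen_der

def cg_ls_plus(f, grad, x1, tol=1e-6, max_iter=10000):
    # Sketch of Algorithm 1 with the LS+ parameter (illustrative only).
    x_k = np.asarray(x1, dtype=float)
    g_k = grad(x_k)
    d_k = -g_k                                      # Step 1: d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g_k) <= tol:              # Step 2: stopping criterion
            break
        # Step 4: Wolfe line search (c1, c2 play the roles of delta, sigma)
        alpha = line_search(f, grad, x_k, d_k, gfk=g_k, c1=0.01, c2=0.1)[0]
        if alpha is None:                           # line search failed: restart from -g_k
            d_k, alpha = -g_k, 1e-4
        x_next = x_k + alpha * d_k                  # Step 5: Equation (1)
        g_next = grad(x_next)
        beta = beta_LS_plus(g_next, g_k, d_k, x_next, x_k)   # Equation (16)
        d_k = -g_next + beta * d_k                  # Step 3: Equation (2)
        x_k, g_k = x_next, g_next
    return x_k

# Example: minimize the Rosenbrock function from a standard starting point.
x_star = cg_ls_plus(rosen, rosen_der, np.array([-1.2, 1.0]))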
In the following subsections, we establish the descent condition and the convergence properties of the CG method with the formula in Equation (16); the proofs are similar to those presented in [12].

3.1. The Descent Property of the CG Method with $\beta_k^{LS+}$

Theorem 1. 
Let the sequences $\{x_k\}$ and $\{d_k\}$ be obtained using (1), (2), and $\beta_k^{LS+}$, where $\alpha_k$ is computed using the SWP line search (4) and (5). If $\sigma \in (0, \tfrac{1}{2}]$, then the descent condition given in (7) holds.
Proof. 
By multiplying (2) by $g_k^T$, we obtain
g_k^T d_k = g_k^T(-g_k + \beta_k d_{k-1}) = -\|g_k\|^2 + \beta_k g_k^T d_{k-1} = -\|g_k\|^2 + \frac{\|g_k\|^2 - \mu_k |g_k^T g_{k-1}|}{-d_{k-1}^T g_{k-1}}\, g_k^T d_{k-1}.
We use proof by induction. For $k = 1$ the claim is true, since $g_1^T d_1 = -\|g_1\|^2$. We now assume that it is true up to $k - 1$, so that $-d_{k-1}^T g_{k-1} > 0$; thus we can bound Equation (16) as follows:
\beta_k^{LS+} = \frac{\|g_k\|^2 - \mu_k |g_k^T g_{k-1}|}{-d_{k-1}^T g_{k-1}} \leq \frac{\|g_k\|^2}{-d_{k-1}^T g_{k-1}} \leq \frac{\|g_k\|^2}{\|g_{k-1}\|^2}.
Dividing both sides of Equation (18) by $\|g_k\|^2$ and using (5), we obtain
-1 + \sigma \frac{g_{k-1}^T d_{k-1}}{\|g_{k-1}\|^2} \leq \frac{g_k^T d_k}{\|g_k\|^2} \leq -1 - \sigma \frac{g_{k-1}^T d_{k-1}}{\|g_{k-1}\|^2}.
Repeating the process for the left-hand side of Equation (18), we obtain
-1 - \sum_{j=1}^{k-1} \sigma^j = -\sum_{j=0}^{k-1} \sigma^j.
Repeating the process for the right-hand side of Equation (18) yields
-1 + \sum_{j=1}^{k-1} \sigma^j = -2 + \sum_{j=0}^{k-1} \sigma^j.
Thus, Equation (18) can be written as follows:
-\sum_{j=0}^{k-1} \sigma^j \leq \frac{g_k^T d_k}{\|g_k\|^2} \leq -2 + \sum_{j=0}^{k-1} \sigma^j.
Since
\sum_{j=0}^{k-1} \sigma^j = \frac{1 - \sigma^k}{1 - \sigma},
it follows that
-\frac{1 - \sigma^k}{1 - \sigma} \leq \frac{g_k^T d_k}{\|g_k\|^2} \leq -2 + \frac{1 - \sigma^k}{1 - \sigma}.
When $\sigma \leq 1/2$, we obtain $\frac{1 - \sigma^k}{1 - \sigma} < 2$. Let $c = 2 - \frac{1 - \sigma^k}{1 - \sigma}$; then
c - 2 \leq \frac{g_k^T d_k}{\|g_k\|^2} \leq -c.
The proof is complete. □

3.2. The Convergence Property of the CG Method with $\beta_k^{LS+}$ for Obtaining a Stationary Point

Theorem 2. 
Let Assumption 1 hold. Consider any method of the form (1) and (2) with the new formula (16), in which $\alpha_k$ is obtained from the SWP line search (4) and (5) with $\sigma \leq 1/2$. Then,
\liminf_{k \to \infty} \|g_k\| = 0.
Proof. 
We will prove the theorem by contradiction. Assume that the conclusion is not true.
Then, a constant ε > 0 exists such that
\|g_k\| \geq \varepsilon, \quad \forall k \geq 1.
Upon squaring both sides of (2), we obtain
\|d_k\|^2 = \|g_k\|^2 - 2 \beta_k g_k^T d_{k-1} + \beta_k^2 \|d_{k-1}\|^2.
Dividing Equation (21) by $\|g_k\|^4$ yields
\frac{\|d_k\|^2}{\|g_k\|^4} = \frac{1}{\|g_k\|^2} - \frac{2 \beta_k g_k^T d_{k-1}}{\|g_k\|^4} + \frac{\beta_k^2 \|d_{k-1}\|^2}{\|g_k\|^4}.
Using Equations (5) and (16), we now obtain
\frac{\|d_k\|^2}{\|g_k\|^4} \leq \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{1}{\|g_k\|^2} + \frac{2 \sigma |g_{k-1}^T d_{k-1}|}{\|g_{k-1}\|^2 \|g_k\|^2} \leq \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{1 + 2 \sigma (2 - c)}{\|g_k\|^2}.
Repeating the process for (22) and using the relationship $\|g_1\| = \|d_1\|$ yields
\frac{\|d_k\|^2}{\|g_k\|^4} \leq \left(1 + 2 \sigma (2 - c)\right) \sum_{i=1}^{k} \frac{1}{\|g_i\|^2}.
From (20), we obtain
\frac{\|g_k\|^4}{\|d_k\|^2} \geq \frac{\varepsilon^2}{k \left(1 + 2 \sigma (2 - c)\right)}.
Therefore,
\sum_{k=0}^{\infty} \frac{\|g_k\|^4}{\|d_k\|^2} = \infty,
which contradicts the Zoutendijk-type condition (24) established in Lemma 1 below; hence, $\liminf_{k \to \infty} \|g_k\| = 0$. □
Theorem 3. 
Suppose the sequences $\{x_k\}$ and $\{d_k\}$ are obtained by using (1), (2), and $\beta_k^{DLHS}$, where $\alpha_k$ is computed via the SWP line search (4) and (5). Then, the descent condition holds.
Proof. 
Let
\beta_k^{DLHS} = -\mu_k \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}.
By multiplying (2) by $g_k^T$ and substituting $\beta_k^{DLHS}$, we obtain
g_k^T d_k = -\|g_k\|^2 - \mu_k \alpha_{k-1} \frac{(g_k^T d_{k-1})^2}{d_{k-1}^T y_{k-1}} < 0,
since $s_{k-1} = \alpha_{k-1} d_{k-1}$ and $d_{k-1}^T y_{k-1} > 0$ under the Wolfe line search.
This completes the proof. □
The condition in (23) is called the Zoutendijk condition [37]; it plays an important role in proving the convergence properties of CG methods.
Lemma 1. 
Assume that Assumption 1 holds. Consider any method of the form (1) and (2) in which the step size $\alpha_k$ satisfies the WWP line search and the search direction $d_k$ is a descent direction. Then, we have
\sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty.
Moreover, (23) holds for the exact and SWP line searches. Substituting the sufficient descent condition into (23), it follows that
\sum_{k=0}^{\infty} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty.
Theorem 4. 
Assume that Assumption 1 holds. Consider the conjugate gradient method (1) and (2) with $\beta_k^{DLHS}$, where $d_k$ is a descent direction and $\alpha_k$ is obtained by the strong Wolfe line search. Then, $\liminf_{k \to \infty} \|g_k\| = 0$.
Proof. 
We will prove this theorem by contradiction. Suppose Theorem 4 is not true.
Then, a constant ε > 0 exists such that
\|g_k\| \geq \varepsilon, \quad \forall k \geq 1.
By squaring both sides of (2), we obtain
\|d_k\|^2 = \|g_k\|^2 - 2 \beta_k g_k^T d_{k-1} + \beta_k^2 \|d_{k-1}\|^2
\leq \|g_k\|^2 + 2 |\beta_k| |g_k^T d_{k-1}| + \beta_k^2 \|d_{k-1}\|^2
\leq \|g_k\|^2 + \frac{2 L \|g_k\| \|s_k\|}{(1 - \sigma)(-g_{k-1}^T d_{k-1})} \left(-\sigma g_{k-1}^T d_{k-1}\right) + \frac{1}{L^2} \frac{\left(\sigma g_{k-1}^T d_{k-1}\right)^2 \|s_{k-1}\|^2}{\left((1 - \sigma) g_{k-1}^T d_{k-1}\right)^2}
\leq \|g_k\|^2 + \frac{2 L \|g_k\| \|s_k\| \sigma}{1 - \sigma} + \frac{1}{L^2} \frac{\sigma^2 \|s_{k-1}\|^2}{(1 - \sigma)^2}.
Dividing by $\|g_k\|^4$ gives
\frac{\|d_k\|^2}{\|g_k\|^4} \leq \frac{1}{\|g_k\|^2} + \frac{2 L \|s_k\| \sigma}{(1 - \sigma) \|g_k\|^3} + \frac{1}{L^2} \frac{\sigma^2 \|s_{k-1}\|^2}{(1 - \sigma)^2 \|g_k\|^4}.
Let
\|g_k\|^m = \min\{\|g_k\|^2, \|g_k\|^3, \|g_k\|^4\}, \quad m \in \mathbb{N};
then
\frac{\|d_k\|^2}{\|g_k\|^4} \leq \frac{1}{\|g_k\|^m} \left(1 + \frac{2 L \lambda \sigma}{1 - \sigma} + \frac{1}{L^2} \frac{\sigma^2 \lambda^2}{(1 - \sigma)^2}\right).
Also, let
R = 1 + \frac{2 L \lambda \sigma}{1 - \sigma} + \frac{1}{L^2} \frac{\sigma^2 \lambda^2}{(1 - \sigma)^2};
then
\frac{\|d_k\|^2}{\|g_k\|^4} \leq \frac{R}{\|g_k\|^m} \leq R \sum_{i=1}^{k} \frac{1}{\|g_i\|^m}, \quad \text{and hence} \quad \frac{\|g_k\|^4}{\|d_k\|^2} \geq \frac{\varepsilon^m}{k R}.
Therefore,
\sum_{k=0}^{\infty} \frac{\|g_k\|^4}{\|d_k\|^2} = \infty.
This result contradicts (24). Therefore, $\liminf_{k \to \infty} \|g_k\| = 0$, completing the proof. □

4. Numerical Results and Discussion

To improve the efficiency of the proposed CG method, we modified Equation (16) as follows:
\beta_k^{LS+} = \begin{cases} \frac{\|g_k\|^2 - |g_k^T g_{k-1}|}{-d_{k-1}^T g_{k-1}}, & \text{if } \|g_k\|^2 > |g_k^T g_{k-1}|, \\ \frac{\|g_k\|^2 - \mu_k |g_k^T g_{k-1}|}{-d_{k-1}^T g_{k-1}}, & \text{if } \|g_k\|^2 > \mu_k |g_k^T g_{k-1}|, \\ -\mu_k \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}, & \text{otherwise}. \end{cases}
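The practical rule above differs from Equation (16) only in the extra first case. As a minimal sketch under the same assumptions and naming as the earlier snippets, it could be evaluated as follows.

import numpy as np

def beta_LS_plus_practical(g_k, g_km1, d_km1, x_k, x_km1):
    # Illustrative evaluation of the practical three-case LS+ rule.
    y_km1 = g_k - g_km1
    s_km1 = x_k - x_km1
    mu_k = np.linalg.norm(s_km1) / np.linalg.norm(y_km1)
    gk_sq = g_k @ g_k
    inner = abs(g_k @ g_km1)
    if gk_sq > inner:                    # first case: unscaled non-negativity test
        return (gk_sq - inner) / (-(d_km1 @ g_km1))
    if gk_sq > mu_k * inner:             # second case: scaled test, as in Equation (16)
        return (gk_sq - mu_k * inner) / (-(d_km1 @ g_km1))
    return -mu_k * (g_k @ s_km1) / (d_km1 @ y_km1)    # third case: restart term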
To analyze the efficiency of the new formula, we selected several test problems from the CUTEr collection [38]; they are listed in Appendix A. We compared the LS+, CG-Descent, LS, and DL+ coefficients based on the CPU time and the number of iterations. We employed a modified CG-Descent 6.8 code [39] with the SWP line search, with $\delta = 0.01$, $\sigma = 0.1$, and memory = 0, for the LS, LS+, and DL+ algorithms. To obtain the results of CG-Descent 5.3, we employed CG-Descent 6.3 with memory = 0. The norm of the gradient was employed as the stopping criterion, specifically $\|g_k\| \leq 10^{-6}$, for all algorithms. The host computer has an AMD A4 processor and 4 GB of RAM. The results are shown in Figure 1 and Figure 2. Note that the performance measure introduced by Dolan and Moré [40] was employed. This performance measure was introduced to compare a set of solvers $S$ on a set of problems $\rho$. Assuming $n_s$ solvers and $n_p$ problems in $S$ and $\rho$, respectively, the measure $t_{p,s}$ is defined as the computation cost (e.g., the number of iterations or the CPU time) required for solver $s$ to solve problem $p$.
To create a baseline for comparison, the performance of solver $s$ on problem $p$ is scaled by the best performance of any solver in $S$ on that problem, using the ratio
r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,s} : s \in S\}}.
A parameter $r_M \geq r_{p,s}$ for all $p, s$ is selected, and it is further assumed that $r_{p,s} = r_M$ if and only if solver $s$ does not solve problem $p$. To obtain an overall assessment of the performance of a solver, we define the following measure:
P_s(t) = \frac{1}{n_p} \, \text{size}\{p \in \rho : r_{p,s} \leq t\}.
Thus, $P_s(t)$ is the probability, for solver $s \in S$, that the performance ratio $r_{p,s}$ is within a factor $t \in \mathbb{R}$ of the best possible ratio. If we regard $P_s$ as the cumulative distribution function of the performance ratio, then $P_s : \mathbb{R} \to [0,1]$ is non-decreasing, piecewise constant, and continuous from the right. The value of $P_s(1)$ is the probability that the solver achieves the best performance among all solvers. In general, a solver with high values of $P_s(t)$, i.e., one whose curve appears in the upper right of the figure, is preferable.
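Since the profile P_s(t) is just a cumulative count of scaled costs, it is easy to reproduce. The sketch below (with a hypothetical cost matrix costs of shape n_p x n_s, where np.inf marks a failure) is illustrative and is not the script used to generate Figure 1 and Figure 2.

import numpy as np

def performance_profile(costs, t_grid):
    # Dolan-More performance profile P_s(t) for each solver s.
    # costs:  (n_p, n_s) array of t_{p,s}; np.inf marks an unsolved problem.
    # t_grid: factors t at which to evaluate the profile.
    n_p, n_s = costs.shape
    best = np.min(costs, axis=1, keepdims=True)        # best cost per problem
    ratios = costs / best                              # performance ratios r_{p,s}
    return np.array([np.sum(ratios <= t, axis=0) / n_p for t in t_grid])

# Example with made-up numbers: 3 problems, 2 solvers.
costs = np.array([[10.0, 12.0], [5.0, np.inf], [7.0, 6.0]])
print(performance_profile(costs, np.array([1.0, 2.0, 4.0])))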
Figure 1 shows that LS+ outperforms CG-Descent 5.3, LS, and DL+ in the number of iterations. It is worth noting that LS fails to reach a stationary point for some functions and therefore solves only about 95% of the test functions. According to Figure 2, which presents the CPU time, LS+ also outperforms CG-Descent, LS, and DL+. We can also note that CG-Descent outperforms DL+ and LS in both figures. Thus, the proposed method is more efficient than DL+ and CG-Descent, and it is more efficient and robust than the original LS. It is also worth noting that the proposed method can be extended to a three-term CG method by using a modified DL+ CG formula, which we plan to investigate in future work.

5. Conclusions

In this paper, we proposed a modification of the Liu and Storey CG method with the following main features:
  • The proposed method is a non-negative two-term CG method.
  • The proposed method satisfies the descent condition.
  • The convergence properties are satisfied; that is, the method obtains a stationary point.
  • The modified method is restarted based on the suggestion presented in [12], instead of using the steepest descent direction, which improves its efficiency.
  • The numerical results show that the new method outperformed CG-Descent, DL+, and LS in terms of the number of iterations and the CPU time. Moreover, the modified method is more robust than the original LS.
In future work, we will try to improve the line search to reduce the number of function evaluations and the number of gradient evaluations. For an application of the CG method to image restoration, the reader can refer to [41]. In addition, we will focus on applications of the CG method such as machine learning, deep learning, and image restoration. The proposed method is also of great interest for solving nonlinear coefficient inverse problems for partial differential equations; for more, the reader can refer to [42,43,44,45,46,47].

Author Contributions

Conceptualization, Z.S.; supervision, G.A. and I.M.; methodology, A.A.; software and original draft preparation. All authors have read and agreed to the published version of the manuscript.

Funding

This study has been partially supported by Universiti Malaysia Terengganu, Center of Research and Innovation Management.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data generated and analyzed during this study are included within the article.

Acknowledgments

The authors are grateful to the editor and the anonymous reviewers for their valuable comments and suggestions, which improved our paper. We would like to thank William W. Hager for publishing his code for implementing the CG method.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. The set of test functions. Each row lists the function name and its dimension, followed by the number of iterations, number of function evaluations, number of gradient evaluations, and CPU time for LS+, and then the same four measures for CG-Descent.
AKIVA2820150.021021110.02
ALLINITU41128200.021229180.02
ARGLINA2001320.021320.02
ARWHEAD200716120.0271580.02
BARD50001436250.021633170.02
BDQRTIC31402992560.441362732370.58
BEALE50001230220.021531160.02
BIGGS622355370.022757310.02
BOX361023140.021124130.02
BOX3724200.11825190.08
BRKMCC100051160.0251160.02
BROWNAL2619150.02925180.02
BROWNBS2001227190.021326150.02
BROWNDEN21638310.021631190.02
BROYDN7D466120900.311411281014295.47
BRYBND50003899690.285174900.38
CHAINWOO50002575333120.673186193730.866
CHNROSNB40002985963170.022875642990.02
CLIFF50943360.021870540.02
COSINE21962500.251139320.19
CRAGGLVY10,000911821420.371031971470.45
CUBE50001743300.023277470.02
CURLY10256,62877,97792,03720247,80867,29476,156173.7
CURLY2010,00078,784101,319135,082426.9766,58789,245110,540383.94
CURLY3010,00084,712111,549142,85663779,030102,516134,682639.63
DECONVU10,0001814152350.024008014010.02
DENSCHNA63616120.02919100.02
DENSCHNB2618150.0271580.02
DENSCHNC21136310.021226140.02
DENSCHND21745340.024798510.02
DENSCHNE31444360.021849320.02
DENSCHNF3928230.0281790.02
DIXMAANA2616130.0271580.02
DIXMAANB3000819130.0261370.02
DIXMAANC300061490.0261370.02
DIXMAAND30001126170.0271580.02
DIXMAANE30002753736270.422222394290.33
DIXMAANF3000441401060.061613231620.13
DIXMAANG30001163022740.21573151580.12
DIXMAANH300044669510380.781733471740.22
DIXMAANI300052567010750.633856392676444.25
DIXMAANJ3000591661170.083276553280.36
DIXMAANK3000862241510.162835672840.28
DIXMAANL3000621751270.082374752380.2
DIXON3DQ30003334017250.7810,00010,00719,99519.12
DJTL10,00082120311760.02829178800.02
DQDRTIC251160.0251160.02
DQRTIC50001532180.021737210.03
EDENSCH50002656450.032652380.03
EG2200061370.0251160.02
EIGENALS1000949217,16711,32716610,08318,02012,244178.36
EIGENBLS255024,63049,38624,81037315,30130,60315,302237
EIGENCLS255014,46828,95814,501228.2510,13619,29211,118174.19
ENGVAL126522247350.052750360.06
ENGVAL250002464480.022661370.02
ERRINROS33358677435120.083807734250.02
EXPFIT501027210.021329160.02
EXTROSNB22113457026070.83808775939821.25
FLETCBV210001110.021110.02
FLETCHCR50002645172870.091522901760.05
FMINSRF2100061161120.253466933471.09
FMINSURF562561161120.224739474741.51
FREUROTH56252151400.092551380.11
GENHUMPS500021091080.28647512,964649320.11
GENROSE50001149235012440.251078216711010.17
GROWTHLS5001134133470.021564563190.02
GULF33794650.023784480.02
HAIRY3819110.023699650.02
HATFLDD21543350.022043240.02
HATFLDE31232250.023072450.02
HATFLDFL3762231680.023992550.02
HEART6LS339110968220.0268415769410.02
HEART8LS62365703780.022495242780.02
HELIX82659370.022349270.02
HIELOW31331210.051430160.02
HILBERTA32530.022530.02
HILBERTB24950.024950.02
HIMMELBB10418180.021028210.02
HIMMELBF22256400.022660360.02
HIMMELBG4823170.02820130.02
HIMMELBH251390.0271690.02
HUMPS2853713190.02521861440.02
JENSMP21552460.021533220.02
JIMACK2830616,61383071168.56831416,62983151182.25
KOWOSB35,4491744320.021739230.02
LIARWHD41649400.052145250.03
LOGHAIRY5000431551270.022781580.02
MANCINO21123120.081123120.08
MARATOSB100779326428570.021145365727790.02
MEXHAT214545002056390.02
MOREBV21611683170.31611683170.41
MSQRTALS500010243224645432322905581529118.64
MSQRTBLS10242414470525487.532280452523266.91
NCB20B102430864922640548.920354694600646.36
NCB20500742171210569.178791511146311.83
NONCVXU25010----661012,833699915.89
NONDIA5000821140.02725200.03
NONDQUAR5000----1942388819472.45
OSBORNEA5000671671150.02942131240.02
OSBORNEB558132820.0262127650.02
OSCIPATH11296,950712,333463,3382.09310,990670,953367,3252.08
PALMER1C101227280.021126260.02
PALMER1D81024230.021125250.02
PALMER2C71121220.021121210.02
PALMER3C81121210.021120200.02
PALMER4C81121210.021120200.02
PALMER5C861370.0261370.02
PALMER6C61124240.021124240.02
PALMER7C81120200.021120200.02
PALMER8C81119190.021118170.02
PARKCH86821406119431.176721385112829.45
PENALTY11525806402869440.02
PENALTY210001922283660.031912213540.05
PENALTY3200993012401.94992852191.78
POWELLSG2003480540.052653270.02
POWER50003477574330.593727543840.76
QUARTC10,0001532180.051737210.03
ROSENBR50002877580.023477440.02
S3082721170819120.02
SCHMVETT22015497732.444373600.23
SENSORS50003890590.52150340.25
SINEVAL100521451050.0264144880.02
SINQUAD21545390.091440330.09
SISSER5000519190.02618140.02
SNAIL21747330.021002301320.02
SPARSINE218,17718,44636,08971.818,35818,64736,43173
SPARSQUR5000852672650.312861350.31
SPMSRTLS10,0002094282230.52034122100.59
SROSENBR4999922150.021123120.02
STRATEC50001663742395.19462104379619.98
TESTQUAD101494150129831.251577158431491.52
TOINTGOR50001382371790.021352331740.02
TOINTGSS503740.024950.02
TOINTPSP50001142431760.021432791820.02
TOINTQOR502936530.022936530.02
TQUARTIC501343360.031440270.03
TRIDIA500078178815570.8878278915590.84
VARDIM50001126200.021125160.02
VAREIGVL-2451290.022347240.02
VIBRBEAM501353372670.021383231990.02
WATSON857129830.0249102540.02
WOODS122355360.022251300.06
YFITU4000692001570.02841971180.02
ZANGWIL231320.021320.02

References

  1. Wolfe, P. Convergence Conditions for Ascent Methods. SIAM Rev. 1969, 11, 226–235.
  2. Wolfe, P. Convergence Conditions for Ascent Methods. II: Some Corrections. SIAM Rev. 1971, 13, 185–188.
  3. Hestenes, M.; Stiefel, E. Methods of conjugate gradients for solving linear systems. J. Res. Natl. Inst. Stand. Technol. 1952, 49, 409.
  4. Polak, E.; Ribière, G. Note sur la convergence de méthodes de directions conjuguées. ESAIM: Math. Model. Numer. Anal. 1969, 3 (R1), 35–43.
  5. Liu, Y.; Storey, C. Efficient generalized conjugate gradient algorithms, part 1: Theory. J. Optim. Theory Appl. 1991, 69, 129–137.
  6. Fletcher, R.; Reeves, C.M. Function minimization by conjugate gradients. Comput. J. 1964, 7, 149–154.
  7. Fletcher, R. Practical Methods of Optimization, Unconstrained Optimization; Wiley: New York, NY, USA, 1987.
  8. Dai, Y.H.; Yuan, Y. A non-linear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 1999, 10, 177–182.
  9. Powell, M.J.D. Non-convex minimization calculations and the conjugate gradient method. In Numerical Analysis (Dundee, 1983); Lecture Notes in Mathematics; Springer: Berlin, Germany, 1984; Volume 1066, pp. 122–141.
  10. Gilbert, J.C.; Nocedal, J. Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim. 1992, 2, 21–42.
  11. Al-Baali, M. Descent property and global convergence of the Fletcher–Reeves method with inexact line search. IMA J. Numer. Anal. 1985, 5, 121–124.
  12. Dai, Y.H.; Liao, L.Z. New conjugacy conditions and related non-linear conjugate gradient methods. Appl. Math. Optim. 2001, 43, 87–101.
  13. Hager, W.W.; Zhang, H. A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 2005, 16, 170–192.
  14. Hager, W.W.; Zhang, H. The limited memory conjugate gradient method. SIAM J. Optim. 2013, 23, 2150–2168.
  15. Zhang, J.Z.; Deng, N.Y.; Chen, L.H. New quasi-Newton equation and related methods for unconstrained optimization. J. Optim. Theory Appl. 1999, 102, 147–167.
  16. Zhang, J.Z.; Xu, C.X. Properties and numerical performance of quasi-Newton methods with modified quasi-Newton equation. J. Comput. Appl. Math. 2001, 137, 269–278.
  17. Yabe, H.; Takano, M. Global convergence properties of non-linear conjugate gradient methods with modified secant relation. Comput. Optim. Appl. 2004, 28, 203–225.
  18. Dehghani, R.; Bidabadi, N.; Fahs, H.; Hosseini, M.M. A Conjugate Gradient Method Based on a Modified Secant Relation for Unconstrained Optimization. Numer. Funct. Anal. Optim. 2020, 41, 621–634.
  19. Jiang, X.; Jian, J.; Song, D.; Liu, P. An improved Polak–Ribière–Polyak conjugate gradient method with an efficient restart direction. Comput. Appl. Math. 2021, 40, 1–24.
  20. Wei, Z.; Yao, S.; Liu, L. The convergence properties of some new conjugate gradient methods. Appl. Math. Comput. 2006, 183, 1341–1350.
  21. Qu, A.; Li, M.; Xiao, Y.; Liu, J. A modified Polak–Ribière–Polyak descent method for unconstrained optimization. Optim. Methods Softw. 2014, 29, 177–188.
  22. Shengwei, Y.; Wei, Z.; Huang, H. A note about WYL's conjugate gradient method and its applications. Appl. Math. Comput. 2007, 191, 381–388.
  23. Zhang, L. An improved Wei–Yao–Liu non-linear conjugate gradient method for optimization computation. Appl. Math. Comput. 2009, 215, 2269–2274.
  24. Al-Baali, M.; Narushima, Y.; Yabe, H. A family of three-term conjugate gradient methods with sufficient descent property for unconstrained optimization. Comput. Optim. Appl. 2015, 60, 89–110.
  25. Powell, M.J.D. Restart procedures for the conjugate gradient method. Math. Program. 1977, 12, 241–254.
  26. Beale, E.M.L. A derivative of conjugate gradients. In Numerical Methods for Nonlinear Optimization; Lootsma, F.A., Ed.; Academic Press: London, UK, 1972; pp. 39–43.
  27. Dai, Y.; Yuan, Y. Convergence properties of the Beale–Powell restart algorithm. Sci. China Ser. A Math. 1998, 41, 1142–1150.
  28. Alhawarat, A.; Salleh, Z.; Mamat, M.; Rivaie, M. An efficient modified Polak–Ribière–Polyak conjugate gradient method with global convergence properties. Optim. Methods Softw. 2017, 32, 1299–1312.
  29. Kaelo, P.; Mtagulwa, P.; Thuto, M.V. A globally convergent hybrid conjugate gradient method with strong Wolfe conditions for unconstrained optimization. Math. Sci. 2019, 14, 1–9.
  30. Liu, J.K.; Xu, J.L.; Zhang, L.Q. Partially symmetrical derivative-free Liu–Storey projection method for convex constrained equations. Int. J. Comput. Math. 2019, 96, 1787–1798.
  31. Gao, P.; He, C.; Liu, Y. An adaptive family of projection methods for constrained monotone non-linear equations with applications. Appl. Math. Comput. 2019, 359, 1–16.
  32. Zheng, L.; Yang, L.; Liang, Y. A Modified Spectral Gradient Projection Method for Solving Non-linear Monotone Equations with Convex Constraints and Its Application. IEEE Access 2020, 8, 92677–92686.
  33. Zheng, L.; Yang, L.; Liang, Y. A conjugate gradient projection method for solving equations with convex constraints. J. Comput. Appl. Math. 2020, 375, 112781.
  34. Benrabia, N.; Laskri, Y.; Guebbai, H.; Al-Baali, M. Applying the Powell's Symmetrical Technique to Conjugate Gradient Methods with the Generalized Conjugacy Condition. Numer. Funct. Anal. Optim. 2016, 37, 839–849.
  35. Klibanov, M.V.; Li, J.; Zhang, W. Convexification for an inverse parabolic problem. Inverse Probl. 2020, 36, 085008.
  36. Beilina, L.; Klibanov, M.V. A Globally Convergent Numerical Method for a Coefficient Inverse Problem. SIAM J. Sci. Comput. 2008, 31, 478–509.
  37. Zoutendijk, G. Non-linear programming, computational methods. Integer Nonlinear Program. 1970, 143, 37–86.
  38. Bongartz, I.; Conn, A.R.; Gould, N.; Toint, P.L. CUTE: Constrained and unconstrained testing environment. ACM Trans. Math. Softw. 1995, 21, 123–160.
  39. Available online: http://users.clas.ufl.edu/hager/papers/Software/ (accessed on 20 May 2021).
  40. Dolan, E.D.; Moré, J.J. Benchmarking optimization software with performance profiles. Math. Program. 2002, 91, 201–213.
  41. Alhawarat, A.; Salleh, Z.; Masmali, I.A. A Convex Combination between Two Different Search Directions of Conjugate Gradient Method and Application in Image Restoration. Math. Probl. Eng. 2021, 2021.
  42. Kaltenbacher, B.; Rundell, W. The inverse problem of reconstructing reaction–diffusion systems. Inverse Probl. 2020, 36, 065011.
  43. Kaltenbacher, B.; Rundell, W. On the identification of a nonlinear term in a reaction–diffusion equation. Inverse Probl. 2019, 35, 115007.
  44. Averós, J.C.; Llorens, J.P.; Uribe-Kaffure, R. Numerical Simulation of Non-Linear Models of Reaction–Diffusion for a DGT Sensor. Algorithms 2020, 13, 98.
  45. Lukyanenko, D.; Yeleskina, T.; Prigorniy, I.; Isaev, T.; Borzunov, A.; Shishlenin, M. Inverse Problem of Recovering the Initial Condition for a Nonlinear Equation of the Reaction–Diffusion–Advection Type by Data Given on the Position of a Reaction Front with a Time Delay. Mathematics 2021, 9, 342.
  46. Lukyanenko, D.V.; Borzunov, A.A.; Shishlenin, M.A. Solving coefficient inverse problems for nonlinear singularly perturbed equations of the reaction-diffusion-advection type with data on the position of a reaction front. Commun. Nonlinear Sci. Numer. Simul. 2021, 99, 105824.
  47. Egger, H.; Engl, H.W.; Klibanov, M.V. Global uniqueness and Hölder stability for recovering a nonlinear source term in a parabolic equation. Inverse Probl. 2004, 21, 271–290.
Figure 1. Performance measure based on the number of iterations.
Figure 2. Performance measure based on the CPU time.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

Salleh, Z.; Alhamzi, G.; Masmali, I.; Alhawarat, A. A Modified Liu and Storey Conjugate Gradient Method for Large Scale Unconstrained Optimization Problems. Algorithms 2021, 14, 227. https://0-doi-org.brum.beds.ac.uk/10.3390/a14080227
