Article

Time-Consistent Strategies for the Generalized Multiperiod Mean-Variance Portfolio Optimization Considering Benchmark Orientation

1
School of Business, Hunan Normal University, Changsha 410081, China
2
School of Business Administration, Hunan University, Changsha 410082, China
*
Author to whom correspondence should be addressed.
Submission received: 7 July 2019 / Revised: 6 August 2019 / Accepted: 7 August 2019 / Published: 9 August 2019

Abstract

In this paper, we propose a generalized multiperiod mean-variance portfolio optimization model that accounts for benchmark orientation and intertemporal restrictions, in which investors not only focus on their own performance but also compare their performance with that of a benchmark. We aim to find the time-consistent strategy under the generalized mean-variance criterion such that the investors' relative performance is maximized. We derive the time-consistent strategy for the proposed model with and without a risk-free asset by using the backward induction approach. The results show that, when a risk-free asset exists, the time-consistent strategy is a feedback strategy on the benchmark process. Without a risk-free asset, however, the time-consistent strategy is a double feedback strategy on both the benchmark process and the wealth process. Finally, we carry out numerical simulations to show the evolution of the time-consistent strategy. These simulations indicate that the proposed strategy can not only reduce the investment risk in the intermediate periods but also imitate the return of the benchmark process.

1. Introduction

Portfolio optimization is one of the most important topics in asset management; it mainly concerns how to allocate investors’ wealth among different assets. The classical mean-variance portfolio selection theory was first introduced by Markowitz [1] and was limited to the single-period investment situation. The multiperiod portfolio optimization problem is regarded as one of the most significant extensions of the pioneering work of Markowitz [1], and it has received considerable attention in recent years (e.g., Li and Ng [2], Leippold et al. [3], Wei and Ye [4], Yao et al. [5], Chen et al. [6], Cui et al. [7], Liu and Chen [8], Zhou et al. [9] and so on). Most of the existing studies assume that the investors only focus on their own performance and formulate the corresponding investment strategy accordingly. This assumption is more consistent with the investment behavior of individual investors. However, in the real financial market, institutional investors (e.g., fund managers and insurance companies) not only focus on their own performance but also compare their performance with that of competitors or benchmarks. Some researchers also point out that this investment approach is sensible in that fund investors expect their portfolios to maintain a performance level that is close to a desirable benchmark (e.g., Roll [10] and Zhao [11]). To describe this investment behavior, we propose a multiperiod portfolio optimization problem in which the investors consider their performance relative to a given benchmark.
However, Zhao [11] considered only the maximization of the relative performance, i.e., of the gap between the investors’ own wealth and the benchmark, and neglected the maximization of the performance of the investors’ own wealth. Espinosa and Touzi [12] proposed a more general approach to measuring the relative performance, which considers not only the utility of the gap between the investors’ own wealth and the benchmark but also that of their own wealth. In addition, the work of Zhao [11] considered only the terminal performance and ignored the intermediate performance. Moreover, Zhu et al. [13] noted that the number of investment bankruptcies that occur in the earlier periods is larger than the number that occur in the later periods. That is, the intermediate performance of the portfolio cannot be ignored. To address this problem, Costa and Nabholz [14] considered a generalized mean-variance model with intertemporal restrictions (i.e., the investors have restrictions on the intermediate expectations and intermediate variances of the portfolio). Under this generalized mean-variance criterion, the investors consider not only the terminal performance but also the intermediate performance of their portfolio. For other kinds of generalized mean-variance portfolio optimization problems, readers may refer to Costa and Araujo [15], Costa and de Oliveira [16], Cui et al. [17] and Zhou et al. [18]. Motivated by the works of Costa and Nabholz [14] and Espinosa and Touzi [12], we construct a generalized mean-variance portfolio optimization model with intertemporal restrictions, in which the investors are also concerned about their performance relative to a given benchmark.
Similar to the classical multiperiod mean-variance portfolio optimization problems, our proposed generalized model is a time-inconsistent optimization problem, in that the variance measure does not satisfy the iterated-expectations property. That is, the proposed model cannot be solved directly by using the traditional dynamic programming approach. As far as we know, the precommitment and time-consistent strategies are the two most representative strategies for these multiperiod mean-variance portfolio optimization problems. Li and Ng [2] first derived the precommitment strategy by using the embedding scheme. Since then, this approach has been widely applied to different portfolio optimization problems (e.g., Leippold et al. [3], Celikyurt and Özekici [19], Yao et al. [20] and Zhou et al. [9]). However, some researchers have noted that this strategy does not satisfy time consistency. The reason is that the precommitment strategy is made at the initial time, and it depends not only on the current wealth but also on the initial capital. In this situation, the optimal strategy determined at time $t_1$ does not agree with that determined at time $t_2$, where $t_2 > t_1$; that is, the global and local objectives are not consistent. To address this problem, Björk and Murgoci [21] derived the time-consistent strategy by using a game approach. This solution methodology treats a time-inconsistent problem as a noncooperative game, in which the strategies at different time points are made by different players who seek to maximize their own utilities. The Nash equilibrium of these strategies is then used to define the time-consistent strategy for the original optimization problem. Compared with the precommitment strategy, the time-consistent strategy might be adopted by investors who are more rational and sophisticated, since these decision-makers take possible future revisions into account (e.g., Basak and Chabakauri [22], Wu and Chen [23], Cui et al. [24] and so on).
For this research topic, readers may refer to Basak and Chabakauri [22], Bensoussan et al. [25], Björk and Murgoci [26], Wu and Chen [23], Zhou et al. [27], Wang and Chen [28] and so on. Most of the existing research on time-consistent strategies for multiperiod portfolio optimization problems is only concerned with a capital pool containing both risky assets and one risk-free asset. In real applications, it is not difficult to identify cases in which investors only invest in risky assets. Although Zhou et al. [27] derived the time-consistent strategy for the classical multiperiod mean-variance portfolio optimization with and without the risk-free asset, their analysis is still limited to the framework of the classical mean-variance model and does not consider benchmark orientation or intertemporal restrictions. In this paper, we aim to investigate the time-consistent strategy for a generalized multiperiod mean-variance portfolio optimization with and without a risk-free asset.
Along the aforementioned lines of research, we propose a generalized mean-variance portfolio optimization model that considers both the intertemporal restrictions and the benchmark orientation. We use the generalized approach of Espinosa and Touzi [12] to measure the relative performance of the portfolio, in which the investors’ own performance and the performance relative to the benchmark are both considered. We derive the time-consistent strategy for the proposed model with and without the risk-free asset by using the backward induction approach; this strategy can be regarded as a suitable investment strategy for rational and sophisticated investors. We find that the time-consistent strategies for both investment situations are feedback strategies. Finally, we provide numerical simulations to show the evolution of the proposed time-consistent strategy. These simulations indicate that the proposed time-consistent strategy can not only reduce the investment risk in the intermediate periods but also imitate the return of the benchmark process.
In contrast to the existing literature, this paper makes three contributions. (a) We extend the work of Zhao [11] to a generalized mean-variance criterion, where the intertemporal restrictions are considered in the proposed model. The proposed model not only covers many classical models but also depicts the behavior of investors imitating the benchmark process. (b) Compared with Zhao [11], we focus on both the investors’ own performance and the performance relative to the given benchmark. The investors can weigh their own wealth against the gap between their wealth and the benchmark. (c) We derive the corresponding time-consistent strategy for the proposed model both with and without a risk-free asset, while most existing studies ignore the latter case. The results show that the time-consistent strategies are both feedback strategies. The difference is that, when a risk-free asset exists, the time-consistent strategy is a feedback strategy on the benchmark process; when no risk-free asset exists, the time-consistent strategy is a double feedback strategy on both the benchmark process and the wealth process.
The remainder of this paper is organized as follows. In Section 2, we introduce the assumptions on the investment market and then construct a generalized multiperiod mean-variance portfolio optimization model. In Section 3, we first give the definition of the time-consistent strategy and the solution methodology. We then derive the time-consistent strategy for the proposed model with and without the risk-free asset. In Section 4, we carry out numerical simulations to illustrate the results derived in Section 3. Finally, some concluding remarks are given.

2. Generalized Multiperiod Mean-Variance Portfolio Optimization Considering Benchmark

In this section, we assume that the investors enter the capital market with initial wealth $R_0$. The investors can invest their wealth in one risk-free asset and $n$ risky assets within the time horizon $T$. We suppose that the risk-free asset has a deterministic return $s_t$ and the $i$-th risky asset has a random return $e_t^i$ in time period $t$, where $i = 1, 2, \ldots, n$ and $t = 0, 1, \ldots, T-1$. Let $R_t$ be the wealth at time period $t$ and $u_t^i$ be the amount invested in the $i$-th risky asset at the beginning of time period $t$; then the amount invested in the risk-free asset is $R_t - \sum_{i=1}^{n} u_t^i$, $t = 0, 1, \ldots, T-1$. Based on these assumptions, the wealth dynamics can be expressed as
$$R_{t+1} = \sum_{i=1}^{n} e_t^i u_t^i + s_t\Big(R_t - \sum_{i=1}^{n} u_t^i\Big) = s_t R_t + P_t' u_t, \quad t = 0, 1, \ldots, T-1, \tag{1}$$
where $P_t = [e_t^1 - s_t, e_t^2 - s_t, \ldots, e_t^n - s_t]'$ denotes the vector of excess rates of return, and $u_t = [u_t^1, u_t^2, \ldots, u_t^n]'$, $t = 0, 1, \ldots, T-1$.
In addition, we assume that the investors’ decision-making refers to the return process of a given benchmark (e.g., a stock index or an investment fund), since the investors hope that their portfolio can outperform this benchmark, or want to replicate the return process of the benchmark with their own portfolio. Let the return of the benchmark be $r_t$, and let $B_t$ denote the wealth of this benchmark at time period $t$, where $t = 0, 1, \ldots, T-1$. Then, the wealth process of this benchmark can be expressed as
$$B_{t+1} = r_t B_t, \quad t = 0, 1, \ldots, T-1. \tag{2}$$
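As a minimal sketch of the two recursions above, the following Python snippet rolls the wealth and benchmark processes forward for a given sequence of risky-asset positions. The function name `simulate_paths`, the array layout, and all numerical values are illustrative assumptions and not part of the paper; returns are treated as gross returns, as the recursions imply.

```python
import numpy as np

def simulate_paths(R0, B0, s, e, r, u):
    """Roll forward R_{t+1} = s_t R_t + P_t' u_t and B_{t+1} = r_t B_t.

    s: (T,) risk-free gross returns
    e: (T, n) realized risky-asset gross returns
    r: (T,) realized benchmark gross returns
    u: (T, n) amounts invested in the risky assets in each period
    """
    T = len(s)
    R = np.empty(T + 1)
    B = np.empty(T + 1)
    R[0], B[0] = R0, B0
    for t in range(T):
        P_t = e[t] - s[t]                  # excess returns over the risk-free asset
        R[t + 1] = s[t] * R[t] + P_t @ u[t]
        B[t + 1] = r[t] * B[t]
    return R, B

# Hypothetical two-period, two-asset example (all numbers are made up).
s = np.array([1.01, 1.01])
e = np.array([[1.05, 0.98], [1.02, 1.03]])
r = np.array([1.03, 1.00])
u = np.array([[0.5, 0.3], [0.4, 0.4]])
R, B = simulate_paths(1.0, 1.0, s, e, r, u)
print(R, B)
```

The excess-return form is equivalent to the explicit form in which the remainder of the wealth, after the risky positions are funded, earns the risk-free return.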
In this paper, we assume that the investors consider not only their own wealth but also their wealth relative to this benchmark. Additionally, we use a generalized mean-variance utility to measure the relative performance of the portfolio; that is, the intertemporal restrictions are considered in this optimization problem. Therefore, we can construct the following multiperiod portfolio optimization model:
$$\begin{aligned} \max_{u}\ & \sum_{t=1}^{T} w_t \big\{ \xi_t E[(1-\theta_t) R_t + \theta_t (R_t - B_t)] - \eta_t \mathrm{Var}[(1-\theta_t) R_t + \theta_t (R_t - B_t)] \big\} \\ \text{s.t.}\ & R_{t+1} = s_t R_t + P_t' u_t, \quad t = 0, 1, \ldots, T-1, \\ & B_{t+1} = r_t B_t, \quad t = 0, 1, \ldots, T-1. \end{aligned} \tag{3}$$
Note that $u := \{u_0, u_1, \ldots, u_{T-1}\}$, and that $\xi_t$ and $\eta_t$ denote the weights for the expectation $E(R_t)$ and the variance $\mathrm{Var}(R_t)$ at each time period $t$ ($t = 1, 2, \ldots, T$), which can be regarded as the trade-off parameters between maximizing the investment return and minimizing the investment risk. For real investors, the choice of these two weights mainly depends on their preferences for return and risk. Typically, the investors first fix one of the two weights and then adjust the other according to their preferences; e.g., for a given weight $\xi_t$, more risk-averse investors choose a larger weight $\eta_t$ at time period $t$, $t = 1, 2, \ldots, T$. In addition, as shown in Zhu et al. [13], the number of investment bankruptcies that occur in the earlier periods is larger than the number that occur in the later periods. This leads the investors to give a larger weight $\eta_t$ to the earlier risk restrictions in the optimization objective; that is, for these risk-averse investors, the weight $\eta_t$ might be a decreasing function of the time period $t$, $t = 1, 2, \ldots, T$ (in fact, the form of $\eta_t$ depends on the investor’s preference and can be described by linear or nonlinear functions; in Section 4, we assume that the weight $\eta_t$ decreases exponentially with the time period $t$). Further, $w_t$ denotes the weight for the mean-variance objective $\xi_t E(R_t) - \eta_t \mathrm{Var}(R_t)$ at time period $t$, $t = 1, 2, \ldots, T$. In this paper, we assume that $w_t$ is a 0-1 variable, where $w_t = 1$ denotes that the investors consider the intertemporal restriction at time period $t$ and $w_t = 0$ indicates that the intertemporal restriction is not considered at time period $t$, $t = 1, 2, \ldots, T$.
As shown in Model (3), the investors consider not only their own investment performance but also their performance relative to this benchmark, where $\theta_t$ denotes the sensitivity of the investors to the performance of this benchmark at time period $t$, $t = 1, 2, \ldots, T$. Since $(1-\theta_t) R_t + \theta_t (R_t - B_t) = R_t - \theta_t B_t$, Model (3) can be rewritten as
$$\begin{aligned} \max_{u}\ & \sum_{t=1}^{T} w_t \big\{ \xi_t E[R_t - \theta_t B_t] - \eta_t \mathrm{Var}[R_t - \theta_t B_t] \big\} \\ \text{s.t.}\ & R_{t+1} = s_t R_t + P_t' u_t, \quad t = 0, 1, \ldots, T-1, \\ & B_{t+1} = r_t B_t, \quad t = 0, 1, \ldots, T-1. \end{aligned} \tag{4}$$
Let $\check{e}_t = (e_t^1, e_t^2, \ldots, e_t^n, r_t)'$ and $e_t = (e_t^1, e_t^2, \ldots, e_t^n)'$, $t = 0, 1, \ldots, T-1$. Suppose that the $\check{e}_t$ are statistically independent random vectors (i.e., $\check{e}_k$ and $\check{e}_l$ are independent for $k, l = 0, 1, \ldots, T-1$ if $k \neq l$). However, within each period the benchmark return $r_t$ is dependent on the random vector $e_t$, $t = 0, 1, \ldots, T-1$. Let $\mu_t = E(P_t)$, $\lambda_t = E(e_t)$, $\Omega_t = \mathrm{Cov}(e_t)$, $\nu_t = E(r_t)$, $\sigma_t = \mathrm{Var}(r_t)$, $\phi_t = E(r_t e_t)$, $\Xi_t = E(e_t e_t')$ and $Q_t = [\mathrm{Cov}(e_t^1, r_t), \ldots, \mathrm{Cov}(e_t^n, r_t)]'$, $t = 0, 1, \ldots, T-1$. Here, we assume that $\Omega_t$ and $\Xi_t$ are both positive definite matrices, $t = 0, 1, \ldots, T-1$. In addition, for convenience, we define $\sum_{t=k}^{l}(\cdot) = 0$ and $\prod_{t=k}^{l}(\cdot) = 1$ for $k > l$. Since the variance measure does not have the iterated-expectations property, Model (4) is a time-inconsistent optimization problem. In the following, we derive the time-consistent solution of Model (4) by using the backward induction approach.
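To make the notation concrete, the sketch below computes sample counterparts of the one-period moments defined above from a matrix of observed returns. It is only an illustration under stated assumptions: the function name `estimate_moments`, the use of population (1/N) estimators, and the synthetic data are choices made here and are not taken from the paper.

```python
import numpy as np

def estimate_moments(e, r, s):
    """Sample versions of the one-period moments used in Model (4).

    e: (N, n) array of risky-asset gross returns (one row per observation)
    r: (N,) array of benchmark gross returns
    s: scalar risk-free gross return for the period
    """
    lam = e.mean(axis=0)                        # lambda_t = E(e_t)
    nu = r.mean()                               # nu_t = E(r_t)
    phi = (e * r[:, None]).mean(axis=0)         # phi_t = E(r_t e_t)
    return {
        "lambda": lam,
        "mu": lam - s,                          # mu_t = E(P_t) = E(e_t) - s_t
        "Omega": np.cov(e, rowvar=False, bias=True),  # Omega_t = Cov(e_t)
        "nu": nu,
        "sigma": r.var(),                       # sigma_t = Var(r_t)
        "phi": phi,
        "Xi": e.T @ e / len(e),                 # Xi_t = E(e_t e_t')
        "Q": phi - nu * lam,                    # Q_t: Cov(e_t^i, r_t) for each i
    }

# Synthetic data, purely illustrative (not the data set used in Section 4).
rng = np.random.default_rng(0)
e = 1.0 + 0.05 * rng.standard_normal((500, 3))
r = 1.0 + 0.5 * (e.mean(axis=1) - 1.0) + 0.01 * rng.standard_normal(500)
m = estimate_moments(e, r, s=1.002)
```

With these estimators the identities $\Xi_t = \Omega_t + \lambda_t \lambda_t'$ and $Q_t = \phi_t - \nu_t \lambda_t$ hold exactly, which is a convenient consistency check on estimated inputs.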

3. Time-Consistent Strategy for the Generalized Portfolio Optimization Problem

As far as we know, Li and Ng [2] first applied the embedding scheme to solve the classical multi-period mean-variance portfolio optimization problem. However, the optimal investment strategy shown in Li and Ng [2] has been criticized for not satisfying time consistency. Similarly, Model (4) is a time-inconsistent problem, which cannot be directly solved by using the dynamic programming approach. Inspired by Björk and Murgoci [21], in the following, we will investigate the time-consistent strategy for Model (4) under the two investment situations: (i) there exists a risk-free asset and n risky assets in the capital pool; and (ii) there only exist n risky assets in the capital pool.
To this end, we first provide the definition of the time-consistent strategy. Similar to Björk and Murgoci [21], we regard this investment decision-making process as a noncooperative game and assume that there exists a decision-maker, called “decision-maker $k$”, for each time period $k$. Then, we can define the corresponding sub-objective as follows.
$$J_k(R_k, B_k, u) = \sum_{t=k+1}^{T} w_t \big[ \xi_t E_k(R_t - \theta_t B_t) - \eta_t \mathrm{Var}_k(R_t - \theta_t B_t) \big]. \tag{5}$$
According to the Definition 2.2 presented in Björk and Murgoci [21], the time-consistent strategy for Model (4) can be defined as follows.
Definition 1.
Consider a fixed control law $\hat{u} = (\hat{u}_0, \hat{u}_1, \ldots, \hat{u}_{T-1})$. For $k = 0, 1, \ldots, T-1$, we let
$$u(k) = (u_k, \hat{u}_{k+1}, \ldots, \hat{u}_{T-1}),$$
$$\hat{u}(k) = (\hat{u}_k, \hat{u}_{k+1}, \ldots, \hat{u}_{T-1}),$$
where $u_k$ is an arbitrary control value. Then, $\hat{u}$ is called a time-consistent strategy if, for all $k = 0, 1, \ldots, T-1$, it satisfies the following condition:
$$\max_{u_k} J_k(R_k, B_k, u(k)) = J_k(R_k, B_k, \hat{u}(k)). \tag{6}$$
In addition, if the time-consistent strategy $\hat{u}$ exists, the corresponding value function is defined as
$$V_k(R_k, B_k) = J_k(R_k, B_k, \hat{u}(k)). \tag{7}$$
Definition 1 shows that the solution methodology for the time-consistent strategy is essentially a backward induction approach. Based on Definition 1, the recursive formula of the above value function can be derived.
Proposition 1.
The value function satisfies the following recursive formula:
$$\begin{aligned} V_k(R_k, B_k) &= \max_{u_k} J_k(R_k, B_k, u(k)) \\ &= \max_{u_k} \Big\{ E_k[V_{k+1}(R_{k+1}, B_{k+1})] - \sum_{t=k+2}^{T} w_t \eta_t \mathrm{Var}_k[f_{k+1,t}(R_{k+1}, B_{k+1})] \\ &\qquad + w_{k+1} \big[ \xi_{k+1} E_k(R_{k+1} - \theta_{k+1} B_{k+1}) - \eta_{k+1} \mathrm{Var}_k(R_{k+1} - \theta_{k+1} B_{k+1}) \big] \Big\}, \quad k = 0, 1, \ldots, T-2, \\ V_{T-1}(R_{T-1}, B_{T-1}) &= \max_{u_{T-1}} w_T \big[ \xi_T E_{T-1}(R_T - \theta_T B_T) - \eta_T \mathrm{Var}_{T-1}(R_T - \theta_T B_T) \big], \end{aligned} \tag{8}$$
where
$$f_{k,\tau}(R_k, B_k) = \begin{cases} E_k[f_{k+1,\tau}(R_{k+1}, B_{k+1})], & \tau > k,\ \tau, k = 0, 1, \ldots, T-1, \\ R_k - \theta_k B_k, & \tau = k,\ k = 0, 1, \ldots, T-1. \end{cases}$$
Proof. 
See Appendix A. □
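The key step behind this recursion can be sketched with the law of total variance. The following is only an outline of the standard argument and is not a reproduction of Appendix A; the shorthand $X_t = R_t - \theta_t B_t$ is introduced here for brevity. Under the fixed future controls $\hat{u}_{k+1}, \ldots, \hat{u}_{T-1}$, the function $f_{k+1,t}(R_{k+1}, B_{k+1})$ equals $E_{k+1}[X_t]$, so for $t \geq k+2$,

$$\begin{aligned} E_k[X_t] &= E_k\big[E_{k+1}[X_t]\big], \\ \mathrm{Var}_k[X_t] &= E_k\big[\mathrm{Var}_{k+1}[X_t]\big] + \mathrm{Var}_k\big[E_{k+1}[X_t]\big] = E_k\big[\mathrm{Var}_{k+1}[X_t]\big] + \mathrm{Var}_k\big[f_{k+1,t}(R_{k+1}, B_{k+1})\big]. \end{aligned}$$

Substituting both identities into $J_k$ and collecting the terms that make up $E_k[V_{k+1}]$ gives

$$J_k = E_k[V_{k+1}(R_{k+1}, B_{k+1})] - \sum_{t=k+2}^{T} w_t \eta_t \mathrm{Var}_k\big[f_{k+1,t}(R_{k+1}, B_{k+1})\big] + w_{k+1}\big[\xi_{k+1} E_k(X_{k+1}) - \eta_{k+1} \mathrm{Var}_k(X_{k+1})\big].$$

The extra variance term is exactly what prevents a direct application of dynamic programming: the conditional variance does not satisfy the iterated-expectations property, so the expected future value function alone does not capture the risk seen from period $k$.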
Based on Proposition 1, in the following, we will investigate the time-consistent solution of the optimization problem (4) with and without a risk-free asset.

3.1. Time-Consistent Strategy for the Generalized Portfolio Optimization with Multiple Risky Assets and a Risk-Free Asset

In this section, we will discuss the time-consistent strategy for this generalized model with both n risky assets and a risk-free asset. According to Definition 1 and Proposition 1, we can derive the corresponding time-consistent strategy and value function by using the backward induction approach, and the main conclusions are as follows.
Theorem 1.
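When there exist a risk-free asset and $n$ risky assets, for the multiperiod mean-variance portfolio optimization problem (4), the time-consistent strategy can be described as
$$\hat{u}_t = \hat{a}_t B_t + \hat{b}_t, \quad t = 0, 1, \ldots, T-1, \tag{9}$$
and the corresponding value function $V_t(R_t, B_t)$ and $f_{t,\tau}(R_t, B_t)$ are given by
$$V_t(R_t, B_t) = \sum_{m=t+1}^{T} w_m \xi_m \prod_{k=t}^{m-1} s_k\, R_t + \hat{m}_t B_t^2 + \hat{n}_t B_t + \hat{\gamma}_t, \quad t = 0, 1, \ldots, T-1, \tag{10}$$
$$f_{t,\tau}(R_t, B_t) = \prod_{m=t}^{\tau-1} s_m\, R_t + \hat{\rho}_{t,\tau} B_t + \hat{\kappa}_{t,\tau}, \quad \tau \geq t,\ \tau, t = 0, 1, \ldots, T-1. \tag{11}$$
Here, we define $\hat{\rho}_{t,\tau} = -\theta_\tau$ and $\hat{\kappa}_{t,\tau} = 0$ for $t = \tau$. In addition, the above parameters (i.e., $\hat{a}_t$, $\hat{b}_t$, $\hat{m}_t$, $\hat{n}_t$, $\hat{\gamma}_t$, $\hat{\rho}_{t,\tau}$ and $\hat{\kappa}_{t,\tau}$, where $t = 0, 1, \ldots, T-2$ and $\tau = t+1, t+2, \ldots, T-1$) satisfy the following iteration formulas:
$$\begin{aligned} \hat{a}_t &= -\frac{\sum_{m=t+1}^{T} \big( w_m \eta_m \hat{\rho}_{t+1,m} \prod_{k=t+1}^{m-1} s_k \big)\, \Omega_t^{-1} Q_t}{\sum_{m=t+1}^{T} \big( w_m \eta_m \prod_{k=t+1}^{m-1} s_k^2 \big)}, \\ \hat{b}_t &= \frac{\sum_{m=t+1}^{T} \big( w_m \xi_m \prod_{k=t+1}^{m-1} s_k \big)\, \Omega_t^{-1} \mu_t}{2 \sum_{m=t+1}^{T} \big( w_m \eta_m \prod_{k=t+1}^{m-1} s_k^2 \big)}, \\ \hat{m}_t &= \hat{m}_{t+1} (\sigma_t + \nu_t^2) - 2 \sum_{m=t+1}^{T} \Big( w_m \eta_m \hat{\rho}_{t+1,m} \prod_{k=t+1}^{m-1} s_k \Big) Q_t' \hat{a}_t - \sum_{m=t+1}^{T} \Big( w_m \eta_m \prod_{k=t+1}^{m-1} s_k^2 \Big) \hat{a}_t' \Omega_t \hat{a}_t - \sum_{m=t+1}^{T} \big( w_m \eta_m \hat{\rho}_{t+1,m}^2 \big) \sigma_t, \\ \hat{n}_t &= \hat{n}_{t+1} \nu_t - w_{t+1} \xi_{t+1} \theta_{t+1} \nu_t + \Big( \sum_{m=t+1}^{T} w_m \xi_m \prod_{k=t+1}^{m-1} s_k \Big) \mu_t' \hat{a}_t - 2 \sum_{m=t+1}^{T} \Big( w_m \eta_m \hat{\rho}_{t+1,m} \prod_{k=t+1}^{m-1} s_k \Big) Q_t' \hat{b}_t - 2 \sum_{m=t+1}^{T} \Big( w_m \eta_m \prod_{k=t+1}^{m-1} s_k^2 \Big) \hat{a}_t' \Omega_t \hat{b}_t, \\ \hat{\gamma}_t &= \hat{\gamma}_{t+1} + \Big( \sum_{m=t+1}^{T} w_m \xi_m \prod_{k=t+1}^{m-1} s_k \Big) \mu_t' \hat{b}_t - \Big( \sum_{m=t+1}^{T} w_m \eta_m \prod_{k=t+1}^{m-1} s_k^2 \Big) \hat{b}_t' \Omega_t \hat{b}_t, \\ \hat{\rho}_{t,\tau} &= \prod_{m=t+1}^{\tau-1} s_m\, \mu_t' \hat{a}_t + \hat{\rho}_{t+1,\tau} \nu_t, \\ \hat{\kappa}_{t,\tau} &= \prod_{m=t+1}^{\tau-1} s_m\, \mu_t' \hat{b}_t + \hat{\kappa}_{t+1,\tau}, \end{aligned} \tag{12}$$
as well as the boundary conditions
$$\begin{aligned} \hat{a}_{T-1} &= \theta_T \Omega_{T-1}^{-1} Q_{T-1}, \qquad \hat{b}_{T-1} = \frac{\xi_T \Omega_{T-1}^{-1} \mu_{T-1}}{2 \eta_T}, \\ \hat{m}_{T-1} &= w_T \eta_T \big( 2 \theta_T Q_{T-1}' \hat{a}_{T-1} - \hat{a}_{T-1}' \Omega_{T-1} \hat{a}_{T-1} - \theta_T^2 \sigma_{T-1} \big), \\ \hat{n}_{T-1} &= w_T \xi_T \big( \mu_{T-1}' \hat{a}_{T-1} - \theta_T \nu_{T-1} \big) - 2 w_T \eta_T \big( \hat{a}_{T-1}' \Omega_{T-1} \hat{b}_{T-1} - \theta_T Q_{T-1}' \hat{b}_{T-1} \big), \\ \hat{\gamma}_{T-1} &= w_T \xi_T \mu_{T-1}' \hat{b}_{T-1} - w_T \eta_T \hat{b}_{T-1}' \Omega_{T-1} \hat{b}_{T-1}, \\ \hat{\rho}_{T-1,T} &= \mu_{T-1}' \hat{a}_{T-1} - \theta_T \nu_{T-1}, \qquad \hat{\kappa}_{T-1,T} = \mu_{T-1}' \hat{b}_{T-1}. \end{aligned} \tag{13}$$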
Proof. 
See Appendix B. □
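The backward recursion for the strategy coefficients $\hat{a}_t$ and $\hat{b}_t$ in Theorem 1 can be sketched numerically as follows. This is a hedged illustration rather than the authors' implementation: the function name `theorem1_coefficients`, the 1-based layout of the weight arrays, and all input values are assumptions made here, and the code relies on the convention $\hat{\rho}_{\tau,\tau} = -\theta_\tau$ implied by $f_{\tau,\tau} = R_\tau - \theta_\tau B_\tau$. It assumes $w_T \eta_T > 0$ so that the denominators are nonzero.

```python
import numpy as np

def theorem1_coefficients(w, xi, eta, theta, s, mu, Omega, Q, nu):
    """Backward recursion for a_t, b_t in u_t = a_t * B_t + b_t.

    w, xi, eta, theta: arrays of length T+1, entries 1..T used (index 0 ignored)
    s, nu: (T,) arrays; mu, Q: (T, n) arrays; Omega: (T, n, n) array
    """
    T, n = mu.shape
    a = np.zeros((T, n))
    b = np.zeros((T, n))
    rho = np.zeros(T + 1)        # rho[m] holds rho_{t+1,m} inside iteration t
    for t in range(T - 1, -1, -1):
        rho[t + 1] = -theta[t + 1]            # rho_{t+1,t+1} = -theta_{t+1}
        S = np.ones(T + 1)                    # S[m] = prod_{k=t+1}^{m-1} s_k
        for m in range(t + 2, T + 1):
            S[m] = S[m - 1] * s[m - 1]
        ms = np.arange(t + 1, T + 1)
        A = np.sum(w[ms] * xi[ms] * S[ms])
        C = np.sum(w[ms] * eta[ms] * S[ms] ** 2)
        D = np.sum(w[ms] * eta[ms] * rho[ms] * S[ms])
        a[t] = -(D / C) * np.linalg.solve(Omega[t], Q[t])
        b[t] = A / (2.0 * C) * np.linalg.solve(Omega[t], mu[t])
        # rho_{t,m} = S[m] * mu_t' a_t + rho_{t+1,m} * nu_t for m = t+1..T
        rho[ms] = S[ms] * (mu[t] @ a[t]) + rho[ms] * nu[t]
    return a, b

# Hypothetical inputs: T = 3 periods, n = 2 risky assets, stationary moments.
T = 3
s = np.full(T, 1.01)
nu = np.full(T, 1.02)
mu = np.tile([0.03, 0.02], (T, 1))
Omega = np.tile(np.array([[0.04, 0.01], [0.01, 0.03]]), (T, 1, 1))
Q = np.tile([0.015, 0.010], (T, 1))
w = np.array([0.0, 1.0, 1.0, 1.0])
xi = np.array([0.0, 1.0, 1.0, 1.0])
eta = np.array([0.0, 4.0, 2.0, 1.0])      # larger risk weight in earlier periods
theta = np.array([0.0, 0.5, 0.5, 0.5])
a, b = theorem1_coefficients(w, xi, eta, theta, s, mu, Omega, Q, nu)
```

Given the coefficients, the position in period $t$ is `a[t] * B_t + b[t]`, so only the current benchmark wealth is needed at decision time. Setting all $\theta_t = 0$ makes every `a[t]` vanish, and the strategy no longer depends on the benchmark.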
Theorem 1 shows that, when the performance of the benchmark is taken into account in the investment decision-making, the corresponding time-consistent strategy depends on the current wealth of the benchmark, in contrast to the results shown in Zhou et al. [18]. That is, the proposed time-consistent strategy (9) is a feedback strategy, while the time-consistent strategy provided by Zhou et al. [18] is a nonfeedback one. Additionally, Model (4) is a generalized model that can recover some classical models presented in the existing studies. In the following, we discuss the time-consistent strategies under some special settings.
Remark 1.
When the investors only consider the performance of terminal wealth (i.e., $w_t = 0$ for $t = 1, 2, \ldots, T-1$ and $w_T = 1$, and $\xi_t = 1$ for $t = 1, 2, \ldots, T$), the time-consistent strategy (9) reduces to
$$\hat{u}_t = \hat{a}_t B_t + \hat{b}_t, \quad t = 0, 1, \ldots, T-1, \tag{14}$$
where $\hat{a}_t$ and $\hat{b}_t$ are given by
$$\hat{a}_t = \frac{\theta_T \prod_{k=t+1}^{T-1} \big( \nu_k - \mu_k' \Omega_k^{-1} Q_k \big)\, \Omega_t^{-1} Q_t}{\prod_{k=t+1}^{T-1} s_k}, \qquad \hat{b}_t = \frac{\Omega_t^{-1} \mu_t}{2 \eta_T \prod_{k=t+1}^{T-1} s_k}. \tag{15}$$
In Remark 1, the investors only consider the performance of terminal wealth, and the intertemporal expectations and variances are ignored. Comparing (9) with (14), we find that the latter only involves the terminal risk aversion coefficient $\eta_T$, while the former involves both the intertemporal and terminal risk aversion coefficients.
Remark 2.
When the investors do not consider the performance of the benchmark process (i.e., $\theta_t = 0$ for $t = 1, 2, \ldots, T$), the time-consistent strategy (9) reduces to
$$\hat{u}_t = \frac{\sum_{m=t+1}^{T} \big( w_m \xi_m \prod_{k=t+1}^{m-1} s_k \big)\, \Omega_t^{-1} \mu_t}{2 \sum_{m=t+1}^{T} \big( w_m \eta_m \prod_{k=t+1}^{m-1} s_k^2 \big)}, \quad t = 0, 1, \ldots, T-1. \tag{16}$$
Remark 2 shows that the investors’ decision only considers the performance of the assets they want to invest in, while the performance of the benchmark is ignored. However, the intertemporal restrictions are still embedded in this time-consistent strategy. In this case, the time-consistent strategy (16) is a nonfeedback strategy, which is consistent with the result shown in Zhou et al. [18].
Remark 3.
When the investors do not consider the performance of the benchmark and also ignore the impact of the intertemporal restrictions (i.e., $w_t = 0$ for $t = 1, 2, \ldots, T-1$ and $w_T = 1$, and $\xi_t = 1$ and $\theta_t = 0$ for $t = 1, 2, \ldots, T$), the time-consistent strategy (9) is
$$\hat{u}_t = \frac{\Omega_t^{-1} \mu_t}{2 \eta_T \prod_{k=t+1}^{T-1} s_k}, \quad t = 0, 1, \ldots, T-1. \tag{17}$$
Under the special setting presented in Remark 3, Model (4) degenerates into the classical multiperiod mean-variance model, and the time-consistent strategy (17) is consistent with the result shown in Björk and Murgoci [21].

3.2. Time-Consistent Strategy for the Generalized Portfolio Optimization with Only Risky Assets

Section 3.1 investigates the time-consistent strategy for the generalized portfolio optimization with both $n$ risky assets and a risk-free asset. This is also the common investment assumption in previous studies. However, in some situations, the investors might only treat the risky assets as investment targets. Therefore, it is necessary to investigate the time-consistent solution of Model (4) when the capital pool only contains $n$ risky assets. Mathematically, we only need to add the condition $R_t - \sum_{i=1}^{n} u_t^i = 0$ to Model (4). Under this assumption, Model (4) can be written as follows.
$$\begin{aligned} \max_{u}\ & \sum_{t=1}^{T} w_t \big\{ \xi_t E[R_t - \theta_t B_t] - \eta_t \mathrm{Var}[R_t - \theta_t B_t] \big\} \\ \text{s.t.}\ & R_{t+1} = e_t' u_t, \quad t = 0, 1, \ldots, T-1, \\ & R_t = I' u_t, \quad t = 0, 1, \ldots, T-1, \\ & B_{t+1} = r_t B_t, \quad t = 0, 1, \ldots, T-1, \end{aligned} \tag{18}$$
where $I = [1, 1, \ldots, 1]' \in \mathbb{R}^n$.
According to Definition 1 and Proposition 1, we can derive the time-consistent strategy for Model (18); see Theorem 2 for details.
Theorem 2.
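When there exist only risky assets, the time-consistent strategy and the corresponding value function for Model (18) can be expressed as follows:
$$\hat{u}_t = \tilde{a}_t B_t + \tilde{b}_t R_t + \tilde{c}_t, \quad t = 0, 1, \ldots, T-1, \tag{19}$$
$$V_t(R_t, B_t) = \tilde{m}_t B_t^2 + \tilde{\alpha}_t R_t^2 + \tilde{\varphi}_t B_t R_t + \tilde{n}_t B_t + \tilde{\beta}_t R_t + \tilde{\gamma}_t, \quad t = 0, 1, \ldots, T-1, \tag{20}$$
where
$$f_{t,\tau}(R_t, B_t) = \tilde{\vartheta}_{t,\tau} R_t + \tilde{\rho}_{t,\tau} B_t + \tilde{\kappa}_{t,\tau}, \quad \tau \geq t,\ \tau, t = 0, 1, \ldots, T-1. \tag{21}$$
Here, we define $\tilde{\vartheta}_{t,\tau} = 1$, $\tilde{\rho}_{t,\tau} = -\theta_\tau$ and $\tilde{\kappa}_{t,\tau} = 0$ for $t = \tau$. In addition, the above parameters ($\tilde{a}_t$, $\tilde{b}_t$, $\tilde{c}_t$, $\tilde{m}_t$, $\tilde{\alpha}_t$, $\tilde{n}_t$, $\tilde{\beta}_t$, $\tilde{\vartheta}_{t,\tau}$, $\tilde{\rho}_{t,\tau}$ and $\tilde{\kappa}_{t,\tau}$, $t = 0, 1, \ldots, T-2$ and $\tau = t+1, t+2, \ldots, T-1$) satisfy the following iteration equations:
$$\begin{aligned} \tilde{a}_t &= \frac{\tilde{\Omega}_t^{-1} \big[ \tilde{\varphi}_{t+1} \phi_t - 2 \big( \sum_{m=t+1}^{T} w_m \eta_m \tilde{\vartheta}_{t+1,m} \tilde{\rho}_{t+1,m} \big) Q_t \big]}{2} + \frac{2 \big( \sum_{m=t+1}^{T} w_m \eta_m \tilde{\vartheta}_{t+1,m} \tilde{\rho}_{t+1,m} \big) I' \tilde{\Omega}_t^{-1} Q_t - \tilde{\varphi}_{t+1} I' \tilde{\Omega}_t^{-1} \phi_t}{2 I' \tilde{\Omega}_t^{-1} I}\, \tilde{\Omega}_t^{-1} I, \\ \tilde{b}_t &= \frac{\tilde{\Omega}_t^{-1} I}{I' \tilde{\Omega}_t^{-1} I}, \\ \tilde{c}_t &= \frac{(w_{t+1} \xi_{t+1} + \tilde{\beta}_{t+1})\, \tilde{\Omega}_t^{-1} \lambda_t}{2} - \frac{(w_{t+1} \xi_{t+1} + \tilde{\beta}_{t+1})\, I' \tilde{\Omega}_t^{-1} \lambda_t}{2 I' \tilde{\Omega}_t^{-1} I}\, \tilde{\Omega}_t^{-1} I, \\ \tilde{m}_t &= \tilde{m}_{t+1} (\sigma_t + \nu_t^2) - \Big( \sum_{m=t+1}^{T} w_m \eta_m \tilde{\rho}_{t+1,m}^2 \Big) \sigma_t - \tilde{a}_t' \tilde{\Omega}_t \tilde{a}_t - 2 \Big( \sum_{m=t+1}^{T} w_m \eta_m \tilde{\vartheta}_{t+1,m} \tilde{\rho}_{t+1,m} \Big) Q_t' \tilde{a}_t + \tilde{\varphi}_{t+1} \phi_t' \tilde{a}_t, \\ \tilde{\alpha}_t &= -\tilde{b}_t' \tilde{\Omega}_t \tilde{b}_t, \\ \tilde{\varphi}_t &= -2 \Big( \sum_{m=t+1}^{T} w_m \eta_m \tilde{\vartheta}_{t+1,m} \tilde{\rho}_{t+1,m} \Big) Q_t' \tilde{b}_t + \tilde{\varphi}_{t+1} \phi_t' \tilde{b}_t, \\ \tilde{n}_t &= \tilde{n}_{t+1} \nu_t - w_{t+1} \xi_{t+1} \theta_{t+1} \nu_t - 2 \tilde{a}_t' \tilde{\Omega}_t \tilde{c}_t - 2 \Big( \sum_{m=t+1}^{T} w_m \eta_m \tilde{\vartheta}_{t+1,m} \tilde{\rho}_{t+1,m} \Big) Q_t' \tilde{c}_t + w_{t+1} \xi_{t+1} \lambda_t' \tilde{a}_t + \tilde{\beta}_{t+1} \lambda_t' \tilde{a}_t + \tilde{\varphi}_{t+1} \phi_t' \tilde{c}_t, \\ \tilde{\beta}_t &= w_{t+1} \xi_{t+1} \lambda_t' \tilde{b}_t + \tilde{\beta}_{t+1} \lambda_t' \tilde{b}_t, \\ \tilde{\gamma}_t &= \tilde{\gamma}_{t+1} - \tilde{c}_t' \tilde{\Omega}_t \tilde{c}_t + w_{t+1} \xi_{t+1} \lambda_t' \tilde{c}_t + \tilde{\beta}_{t+1} \lambda_t' \tilde{c}_t, \\ \tilde{\vartheta}_{t,\tau} &= \tilde{\vartheta}_{t+1,\tau} \lambda_t' \tilde{b}_t, \\ \tilde{\rho}_{t,\tau} &= \tilde{\vartheta}_{t+1,\tau} \lambda_t' \tilde{a}_t + \tilde{\rho}_{t+1,\tau} \nu_t, \\ \tilde{\kappa}_{t,\tau} &= \tilde{\vartheta}_{t+1,\tau} \lambda_t' \tilde{c}_t + \tilde{\kappa}_{t+1,\tau}, \\ \tilde{\Omega}_t &= \Big( \sum_{m=t+1}^{T} w_m \eta_m \tilde{\vartheta}_{t+1,m}^2 \Big) \Omega_t - \tilde{\alpha}_{t+1} \Xi_t, \end{aligned} \tag{22}$$
where $\tilde{\Omega}_t$ ($t = 0, 1, \ldots, T-2$) is a positive definite matrix. Additionally, the above iteration equations satisfy the following boundary conditions:
$$\begin{aligned} \tilde{a}_{T-1} &= \theta_T \Omega_{T-1}^{-1} Q_{T-1} - \frac{\theta_T I' \Omega_{T-1}^{-1} Q_{T-1}}{I' \Omega_{T-1}^{-1} I}\, \Omega_{T-1}^{-1} I, \qquad \tilde{b}_{T-1} = \frac{\Omega_{T-1}^{-1} I}{I' \Omega_{T-1}^{-1} I}, \\ \tilde{c}_{T-1} &= \frac{\xi_T \Omega_{T-1}^{-1} \lambda_{T-1}}{2 \eta_T} - \frac{\xi_T I' \Omega_{T-1}^{-1} \lambda_{T-1}}{2 \eta_T I' \Omega_{T-1}^{-1} I}\, \Omega_{T-1}^{-1} I, \\ \tilde{m}_{T-1} &= -w_T \eta_T \theta_T^2 \sigma_{T-1} - w_T \eta_T \tilde{a}_{T-1}' \Omega_{T-1} \tilde{a}_{T-1} + 2 w_T \eta_T \theta_T Q_{T-1}' \tilde{a}_{T-1}, \\ \tilde{\alpha}_{T-1} &= -w_T \eta_T \tilde{b}_{T-1}' \Omega_{T-1} \tilde{b}_{T-1}, \qquad \tilde{\varphi}_{T-1} = 2 w_T \eta_T \theta_T Q_{T-1}' \tilde{b}_{T-1}, \\ \tilde{n}_{T-1} &= w_T \xi_T \big( \lambda_{T-1}' \tilde{a}_{T-1} - \theta_T \nu_{T-1} \big) - 2 w_T \eta_T \big( \tilde{a}_{T-1}' \Omega_{T-1} \tilde{c}_{T-1} - \theta_T Q_{T-1}' \tilde{c}_{T-1} \big), \\ \tilde{\beta}_{T-1} &= w_T \xi_T \lambda_{T-1}' \tilde{b}_{T-1}, \qquad \tilde{\gamma}_{T-1} = w_T \xi_T \lambda_{T-1}' \tilde{c}_{T-1} - w_T \eta_T \tilde{c}_{T-1}' \Omega_{T-1} \tilde{c}_{T-1}, \\ \tilde{\vartheta}_{T-1,T} &= \lambda_{T-1}' \tilde{b}_{T-1}, \qquad \tilde{\rho}_{T-1,T} = \lambda_{T-1}' \tilde{a}_{T-1} - \theta_T \nu_{T-1}, \qquad \tilde{\kappa}_{T-1,T} = \lambda_{T-1}' \tilde{c}_{T-1}. \end{aligned} \tag{23}$$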
Proof. 
See Appendix C. □
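As a small numerical check on the structure of Theorem 2, the sketch below evaluates the last-period coefficients $\tilde{a}_{T-1}$, $\tilde{b}_{T-1}$ and $\tilde{c}_{T-1}$ and confirms that the resulting position always spends exactly the current wealth. The function name `terminal_step_risky_only` and every input value are hypothetical, and only the terminal step with $w_T = 1$ is covered, not the full recursion.

```python
import numpy as np

def terminal_step_risky_only(xi_T, eta_T, theta_T, lam, Omega, Q):
    """Last-period coefficients of u = a * B + b * R + c without a risk-free asset."""
    ones = np.ones(len(lam))
    Oinv_1 = np.linalg.solve(Omega, ones)
    Oinv_Q = np.linalg.solve(Omega, Q)
    Oinv_lam = np.linalg.solve(Omega, lam)
    b = Oinv_1 / (ones @ Oinv_1)                  # weights of the minimum-variance portfolio
    a = theta_T * (Oinv_Q - (ones @ Oinv_Q) * b)  # benchmark-hedging part, sums to zero
    c = xi_T / (2.0 * eta_T) * (Oinv_lam - (ones @ Oinv_lam) * b)  # return-seeking part
    return a, b, c

# Hypothetical one-period moments for three risky assets.
lam = np.array([1.010, 1.012, 1.008])
Omega = np.array([[0.0040, 0.0010, 0.0005],
                  [0.0010, 0.0030, 0.0008],
                  [0.0005, 0.0008, 0.0020]])
Q = np.array([0.0020, 0.0015, 0.0010])
xi_T, eta_T, theta_T = 1.0, 2.0, 0.5
a, b, c = terminal_step_risky_only(xi_T, eta_T, theta_T, lam, Omega, Q)

R_now, B_now = 1.2, 0.9
u = a * B_now + b * R_now + c
print(u.sum())   # equals R_now up to rounding: the budget constraint I'u = R holds
```

Because $I'\tilde{a}_{T-1} = 0$, $I'\tilde{b}_{T-1} = 1$ and $I'\tilde{c}_{T-1} = 0$, the constraint $R_t = I'u_t$ is satisfied for every pair $(R_t, B_t)$. This is why the wealth enters the strategy only through the term $\tilde{b}_t R_t$.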
As shown in Theorem 2, when there are only risky assets in the capital pool, the time-consistent strategy (19) depends on both the benchmark process $B_t$ and the wealth process $R_t$, in contrast to the time-consistent strategy (9). That is, the time-consistent strategy (19) is a double feedback strategy on both the benchmark process $B_t$ and the wealth process $R_t$, while the time-consistent strategy (9) is only a feedback strategy on the benchmark process $B_t$.
Remark 4.
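When the investors only consider the performance of terminal wealth (i.e., $w_t = 0$ for $t = 1, 2, \ldots, T-1$ and $w_T = 1$, and $\xi_t = 1$ for $t = 1, 2, \ldots, T$), the time-consistent strategy (19) reduces to
$$\hat{u}_t = \tilde{a}_t B_t + \tilde{b}_t R_t + \tilde{c}_t, \quad t = 0, 1, \ldots, T-1. \tag{24}$$
The above parameters (i.e., $\tilde{a}_t$, $\tilde{b}_t$ and $\tilde{c}_t$, $t = 0, 1, \ldots, T-2$) satisfy the following iteration equations:
$$\begin{aligned} \tilde{a}_t &= \frac{\tilde{\Omega}_t^{-1} \big[ \tilde{\varphi}_{t+1} \phi_t - 2 \eta_T \tilde{\vartheta}_{t+1,T} \tilde{\rho}_{t+1,T} Q_t \big]}{2} + \frac{2 \eta_T \tilde{\vartheta}_{t+1,T} \tilde{\rho}_{t+1,T} I' \tilde{\Omega}_t^{-1} Q_t - \tilde{\varphi}_{t+1} I' \tilde{\Omega}_t^{-1} \phi_t}{2 I' \tilde{\Omega}_t^{-1} I}\, \tilde{\Omega}_t^{-1} I, \\ \tilde{b}_t &= \frac{\tilde{\Omega}_t^{-1} I}{I' \tilde{\Omega}_t^{-1} I}, \\ \tilde{c}_t &= \frac{(w_{t+1} \xi_{t+1} + \tilde{\beta}_{t+1})\, \tilde{\Omega}_t^{-1} \lambda_t}{2} - \frac{(w_{t+1} \xi_{t+1} + \tilde{\beta}_{t+1})\, I' \tilde{\Omega}_t^{-1} \lambda_t}{2 I' \tilde{\Omega}_t^{-1} I}\, \tilde{\Omega}_t^{-1} I, \\ \tilde{\alpha}_t &= -\tilde{b}_t' \tilde{\Omega}_t \tilde{b}_t, \\ \tilde{\varphi}_t &= -2 \eta_T \tilde{\vartheta}_{t+1,T} \tilde{\rho}_{t+1,T} Q_t' \tilde{b}_t + \tilde{\varphi}_{t+1} \phi_t' \tilde{b}_t, \\ \tilde{\beta}_t &= \tilde{\beta}_{t+1} \lambda_t' \tilde{b}_t, \\ \tilde{\vartheta}_{t,T} &= \tilde{\vartheta}_{t+1,T} \lambda_t' \tilde{b}_t, \\ \tilde{\rho}_{t,T} &= \tilde{\vartheta}_{t+1,T} \lambda_t' \tilde{a}_t + \tilde{\rho}_{t+1,T} \nu_t, \\ \tilde{\Omega}_t &= \eta_T \tilde{\vartheta}_{t+1,T}^2 \Omega_t - \tilde{\alpha}_{t+1} \Xi_t, \end{aligned} \tag{25}$$
as well as the boundary conditions
$$\begin{aligned} \tilde{a}_{T-1} &= \theta_T \Omega_{T-1}^{-1} Q_{T-1} - \frac{\theta_T I' \Omega_{T-1}^{-1} Q_{T-1}}{I' \Omega_{T-1}^{-1} I}\, \Omega_{T-1}^{-1} I, \qquad \tilde{b}_{T-1} = \frac{\Omega_{T-1}^{-1} I}{I' \Omega_{T-1}^{-1} I}, \\ \tilde{c}_{T-1} &= \frac{\xi_T \Omega_{T-1}^{-1} \lambda_{T-1}}{2 \eta_T} - \frac{\xi_T I' \Omega_{T-1}^{-1} \lambda_{T-1}}{2 \eta_T I' \Omega_{T-1}^{-1} I}\, \Omega_{T-1}^{-1} I, \\ \tilde{\alpha}_{T-1} &= -w_T \eta_T \tilde{b}_{T-1}' \Omega_{T-1} \tilde{b}_{T-1}, \qquad \tilde{\varphi}_{T-1} = 2 w_T \eta_T \theta_T Q_{T-1}' \tilde{b}_{T-1}, \qquad \tilde{\beta}_{T-1} = w_T \xi_T \lambda_{T-1}' \tilde{b}_{T-1}, \\ \tilde{\vartheta}_{T-1,T} &= \lambda_{T-1}' \tilde{b}_{T-1}, \qquad \tilde{\rho}_{T-1,T} = \lambda_{T-1}' \tilde{a}_{T-1} - \theta_T \nu_{T-1}. \end{aligned} \tag{26}$$
Similarly, the time-consistent strategy (24) only involves the terminal risk aversion coefficient $\eta_T$, while the time-consistent strategy (19) involves both the intertemporal and terminal risk aversion coefficients. In addition, compared with Remark 1, when there exist only $n$ risky assets in the capital pool, the time-consistent strategy (24) is a double feedback strategy on the current benchmark wealth $B_t$ and the current wealth $R_t$.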
Remark 5.
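When the investors do not consider the performance of the benchmark process (i.e., $\theta_t = 0$ for $t = 1, 2, \ldots, T$), the time-consistent strategy (19) reduces to
$$
\hat{u}_t = \tilde{b}_t R_t + \tilde{c}_t, \quad t = 0, 1, \ldots, T-1. \tag{27}
$$
Here, we also define $\tilde{\vartheta}_{t,\tau} = 1$, $\tilde{\rho}_{t,\tau} = -\theta_\tau$ and $\tilde{\kappa}_{t,\tau} = 0$ for $t = \tau$. The above parameters (i.e., $\tilde{b}_t$ and $\tilde{c}_t$, $t = 0, 1, \ldots, T-2$) satisfy the following iteration equations
$$
\begin{aligned}
\tilde{b}_t &= \frac{\tilde{\Omega}_t^{-1}I}{I'\tilde{\Omega}_t^{-1}I},\\
\tilde{c}_t &= \frac{(w_{t+1}\xi_{t+1}+\tilde{\beta}_{t+1})\,\tilde{\Omega}_t^{-1}\lambda_t}{2} - \frac{(w_{t+1}\xi_{t+1}+\tilde{\beta}_{t+1})\,I'\tilde{\Omega}_t^{-1}\lambda_t}{2\,I'\tilde{\Omega}_t^{-1}I}\,\tilde{\Omega}_t^{-1}I,\\
\tilde{\alpha}_t &= -\tilde{b}_t'\tilde{\Omega}_t\tilde{b}_t,\\
\tilde{\beta}_t &= -2\tilde{b}_t'\tilde{\Omega}_t\tilde{c}_t + w_{t+1}\xi_{t+1}\,\lambda_t'\tilde{b}_t + \tilde{\beta}_{t+1}\,\lambda_t'\tilde{b}_t,\\
\tilde{\Omega}_t &= \Big(\sum_{m=t+1}^{T} w_m\eta_m\tilde{\vartheta}_{t+1,m}^2\Big)\Omega_t - \tilde{\alpha}_{t+1}\,\Xi_t,\\
\tilde{\vartheta}_{t,\tau} &= \tilde{\vartheta}_{t+1,\tau}\,\lambda_t'\tilde{b}_t,
\end{aligned}
$$
where $\tau = t+1, t+2, \ldots, T-1$, and the boundary conditions of the above parameters can be expressed as
$$
\begin{aligned}
\tilde{b}_{T-1} &= \frac{\Omega_{T-1}^{-1}I}{I'\Omega_{T-1}^{-1}I},\\
\tilde{c}_{T-1} &= \frac{\xi_T\,\Omega_{T-1}^{-1}\lambda_{T-1}}{2\eta_T} - \frac{\xi_T\,I'\Omega_{T-1}^{-1}\lambda_{T-1}}{2\eta_T\,I'\Omega_{T-1}^{-1}I}\,\Omega_{T-1}^{-1}I,\\
\tilde{\alpha}_{T-1} &= -w_T\eta_T\,\tilde{b}_{T-1}'\Omega_{T-1}\tilde{b}_{T-1},\\
\tilde{\beta}_{T-1} &= w_T\xi_T\,\lambda_{T-1}'\tilde{b}_{T-1},\\
\tilde{\vartheta}_{T-1,T} &= \lambda_{T-1}'\tilde{b}_{T-1}.
\end{aligned}
$$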
Remark 5 shows that, unlike the time-consistent strategy (16), the time-consistent strategy (27) is a feedback strategy on the current wealth $R_t$. This is the main difference between the time-consistent strategies with and without the risk-free asset.
Remark 6.
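When the investors do not consider the performance of the benchmark and ignore the impact of the intertemporal restrictions (i.e., $w_t = 0$ for $t = 1, 2, \ldots, T-1$ and $w_T = 1$, $\xi_t = 1$ and $\theta_t = 0$ for $t = 1, 2, \ldots, T$), the time-consistent strategy (19) reduces to
$$
\hat{u}_t = \tilde{b}_t R_t + \tilde{c}_t,
$$
where the parameters (i.e., $\tilde{b}_t$ and $\tilde{c}_t$, $t = 0, 1, \ldots, T-2$) satisfy the following iteration equations
$$
\begin{aligned}
\tilde{b}_t &= \frac{\tilde{\Omega}_t^{-1}I}{I'\tilde{\Omega}_t^{-1}I},\\
\tilde{c}_t &= \frac{\tilde{\beta}_{t+1}\,\tilde{\Omega}_t^{-1}\lambda_t}{2} - \frac{\tilde{\beta}_{t+1}\,I'\tilde{\Omega}_t^{-1}\lambda_t}{2\,I'\tilde{\Omega}_t^{-1}I}\,\tilde{\Omega}_t^{-1}I,\\
\tilde{\alpha}_t &= -\frac{1}{I'\tilde{\Omega}_t^{-1}I},\\
\tilde{\beta}_t &= \tilde{\beta}_{t+1}\,\frac{I'\tilde{\Omega}_t^{-1}\lambda_t}{I'\tilde{\Omega}_t^{-1}I},\\
\tilde{\vartheta}_{t,T} &= \tilde{\vartheta}_{t+1,T}\,\frac{I'\tilde{\Omega}_t^{-1}\lambda_t}{I'\tilde{\Omega}_t^{-1}I},\\
\tilde{\Omega}_t &= \eta_T\tilde{\vartheta}_{t+1,T}^2\,\Omega_t - \tilde{\alpha}_{t+1}\,\Xi_t,
\end{aligned}
$$
as well as the boundary conditions
$$
\begin{aligned}
\tilde{b}_{T-1} &= \frac{\Omega_{T-1}^{-1}I}{I'\Omega_{T-1}^{-1}I},\\
\tilde{c}_{T-1} &= \frac{\xi_T\,\Omega_{T-1}^{-1}\lambda_{T-1}}{2\eta_T} - \frac{\xi_T\,I'\Omega_{T-1}^{-1}\lambda_{T-1}}{2\eta_T\,I'\Omega_{T-1}^{-1}I}\,\Omega_{T-1}^{-1}I,\\
\tilde{\alpha}_{T-1} &= -\frac{w_T\eta_T}{I'\Omega_{T-1}^{-1}I},\\
\tilde{\beta}_{T-1} &= w_T\xi_T\,\frac{I'\Omega_{T-1}^{-1}\lambda_{T-1}}{I'\Omega_{T-1}^{-1}I},\\
\tilde{\vartheta}_{T-1,T} &= \lambda_{T-1}'\tilde{b}_{T-1}.
\end{aligned}
$$
Remark 6 covers the case in which the investors are concerned only with the performance of the terminal wealth and do not consider the relative performance against the benchmark. In this case, the strategy coincides with the results in Zhou et al. [27].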
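The backward recursion of Remark 6 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: it uses the estimates of $\lambda_t$ and $\Omega_t$ reported in Section 4, a short hypothetical horizon $T = 12$, and it assumes $\Xi_t = \Omega_t + \lambda_t\lambda_t'$ (the second-moment matrix of the gross returns). The terminal values $\tilde{\alpha}_T = 0$, $\tilde{\beta}_T = \xi_T$ and $\tilde{\vartheta}_{T,T} = 1$ are a convenience chosen so that the step $t = T-1$ reproduces the boundary conditions; the helper `solve` is a hypothetical name.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Estimates reported in Section 4 (AIG, GE, INTC, PEP).
lam = [1.0044, 0.9967, 1.0047, 1.0063]
Omega = [[0.0505, 0.0061, 0.0041, 0.0017],
         [0.0061, 0.0064, 0.0025, 0.0010],
         [0.0041, 0.0025, 0.0094, 0.0007],
         [0.0017, 0.0010, 0.0007, 0.0020]]
n = len(lam)
# Assumption: Xi_t = Omega_t + lambda_t lambda_t'.
Xi = [[Omega[i][j] + lam[i] * lam[j] for j in range(n)] for i in range(n)]

T, eta_T, xi_T = 12, 2.0, 1.0      # hypothetical horizon; w_T = 1, xi_T = 1
alpha, beta, vth = 0.0, xi_T, 1.0  # alpha_T, beta_T, vartheta_{T,T}
b, c = {}, {}
for t in range(T - 1, -1, -1):
    # Omega~_t = eta_T * vartheta_{t+1,T}^2 * Omega_t - alpha_{t+1} * Xi_t
    Om = [[eta_T * vth ** 2 * Omega[i][j] - alpha * Xi[i][j]
           for j in range(n)] for i in range(n)]
    x1 = solve(Om, [1.0] * n)      # Omega~_t^{-1} I
    xl = solve(Om, lam)            # Omega~_t^{-1} lambda_t
    s1, sl = sum(x1), sum(xl)      # I' Omega~^{-1} I and I' Omega~^{-1} lambda
    b[t] = [v / s1 for v in x1]
    c[t] = [beta * (xl[i] - sl / s1 * x1[i]) / 2 for i in range(n)]
    alpha = -1.0 / s1              # alpha~_t
    beta = beta * sl / s1          # beta~_t
    vth = vth * sl / s1            # vartheta~_{t,T}

print("b_0 =", [round(v, 4) for v in b[0]])
print("c_0 =", [round(v, 4) for v in c[0]])
```

By construction $I'\tilde{b}_t = 1$ and $I'\tilde{c}_t = 0$, so the strategy $\hat{u}_t = \tilde{b}_t R_t + \tilde{c}_t$ invests exactly the current wealth $R_t$ at every step.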

4. Numerical Analysis

In this section, we provide some numerical simulations to illustrate the results presented in Section 3. Suppose that $R_0 = 1$ and $B_0 = 1$. We randomly select four stocks from the American financial market, with ticker symbols AIG, GE, INTC and PEP, and we regard the S&P 500 index as the benchmark process. The monthly returns from January 2000 to December 2018, downloaded from Yahoo Finance (https://finance.yahoo.com/), are used to estimate the parameters of the risky assets. The detailed estimates are given as follows.
$$
\lambda_t = \begin{pmatrix} 1.0044 \\ 0.9967 \\ 1.0047 \\ 1.0063 \end{pmatrix}, \quad t = 0, 1, \ldots, T-1,
$$
$$
\phi_t = \begin{pmatrix} 1.0119 \\ 1.0027 \\ 1.0111 \\ 1.0109 \end{pmatrix}, \quad t = 0, 1, \ldots, T-1,
$$
$$
Q_t = \begin{pmatrix} 0.0037 \\ 0.0022 \\ 0.0025 \\ 0.0008 \end{pmatrix}, \quad t = 0, 1, \ldots, T-1,
$$
$$
\nu_t = 1.0038, \quad \sigma_t = 0.0018, \quad t = 0, 1, \ldots, T-1,
$$
$$
\Omega_t = \begin{pmatrix}
0.0505 & 0.0061 & 0.0041 & 0.0017 \\
0.0061 & 0.0064 & 0.0025 & 0.0010 \\
0.0041 & 0.0025 & 0.0094 & 0.0007 \\
0.0017 & 0.0010 & 0.0007 & 0.0020
\end{pmatrix}, \quad t = 0, 1, \ldots, T-1.
$$
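As a quick sanity check on these estimates, the short sketch below tests whether the reported $\phi_t$ agrees with $Q_t + \nu_t\lambda_t$. The identity is an assumption made here for illustration: it holds if $\phi_t$ is the mixed second moment of the benchmark return and the asset returns and $Q_t$ is their covariance. Small gaps are expected because the inputs are rounded to four decimals.

```python
# Reported monthly estimates (AIG, GE, INTC, PEP; benchmark: S&P 500).
lam = [1.0044, 0.9967, 1.0047, 1.0063]   # lambda_t
phi = [1.0119, 1.0027, 1.0111, 1.0109]   # phi_t
Q = [0.0037, 0.0022, 0.0025, 0.0008]     # Q_t
nu = 1.0038                              # nu_t

# Assumed identity: phi_t = Q_t + nu_t * lambda_t.
phi_implied = [q + nu * l for q, l in zip(Q, lam)]
max_gap = max(abs(a - b) for a, b in zip(phi, phi_implied))

for name, rep, imp in zip(["AIG", "GE", "INTC", "PEP"], phi, phi_implied):
    print(f"{name}: reported {rep:.4f}, implied {imp:.4f}")
print(f"largest gap: {max_gap:.5f}")
```

Under this assumption the reported vectors agree to within rounding error of the four-decimal inputs.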
In this section, we treat the 3-month Treasury bill as the risk-free asset; its annual returns can be downloaded from Federal Reserve Economic Data (https://fred.stlouisfed.org/series/TB3MS). We use the mean of the historical returns from January 2000 to December 2018 as the return of the risk-free asset, that is, $s_t = 1 + 0.0161/12 = 1.00134$. In the following, we investigate the evolution of the time-consistent strategy and discuss the impact of the intertemporal restrictions and the benchmark orientation on it. To show the evolution of the investment strategy more clearly, we choose a relatively large investment horizon in the following simulations, namely $T = 200$. The evolution of the investment strategy can be explored for any given investment horizon $T$; the corresponding results are omitted for space reasons. We show the evolution of the time-consistent strategies under the following settings.
  • Case I. The proposed time-consistent strategy considers all the intertemporal restrictions, and it also relies on the benchmark orientation. Since the weight $w_t$ is a 0–1 variable, this means that $w_t = 1$ for $t = 1, 2, \ldots, T$. In addition, we assume that the investors consider their own wealth (i.e., $R_t$) and the gap between their own wealth and the benchmark (i.e., $R_t - B_t$) to be equally important, that is, $\theta_t = 0.5$ for $t = 1, 2, \ldots, T$;
  • Case II. The proposed time-consistent strategy does not consider the intertemporal restrictions, and it depends only on the benchmark orientation. In this case, the investors consider only the performance of the terminal wealth, and the intermediate performance of the portfolio is ignored, that is, $w_t = 0$ for $t = 1, 2, \ldots, T-1$ and $w_T = 1$. As in Case I, we assume that $\theta_t = 0.5$ for $t = 1, 2, \ldots, T$;
  • Case III. The proposed time-consistent strategy considers all the intertemporal restrictions, but it does not depend on the benchmark process. As in Case I, $w_t = 1$ for $t = 1, 2, \ldots, T$. Additionally, in this case, the investors consider only the performance of their own wealth, that is, $\theta_t = 0$ for $t = 1, 2, \ldots, T$.
Zhu et al. [13] showed that the number of investment bankruptcies that occur in the earlier periods is larger than the number that occur in the later periods. In this situation, we should impose a larger penalty on the earlier intertemporal restrictions in the mathematical formulation. That is, the investors have a higher risk aversion coefficient at the beginning of the investment period. In order to discuss the impact of the intertemporal restrictions on the time-consistent strategies, a reasonable weight function $\eta_t$ should be given first. As shown in Zhou et al. [27], the time-consistent strategy for the traditional multiperiod mean-variance model, i.e., the time-consistent strategy (17), can be derived by optimizing the following single-period problem with the time-varying risk aversion coefficient $\hat{\eta}_t = \eta_T \prod_{k=t}^{T-1} s_k$, $t = 1, 2, \ldots, T$.
$$
\begin{aligned}
\max_{u_t}\ & E(R_{t+1}) - \hat{\eta}_{t+1}\,\mathrm{Var}(R_{t+1})\\
\text{s.t.}\ & R_{t+1} = s_t R_t + P_t' u_t, \quad t = 0, 1, \ldots, T-1.
\end{aligned} \tag{38}
$$
Further, if the risk-free return does not change over time, that is, $s_t = r_f$, the time-varying risk aversion coefficient $\hat{\eta}_t$ can be written as $\hat{\eta}_t = \eta_T \times r_f^{T-t}$, $t = 1, 2, \ldots, T$. Motivated by this time-varying risk aversion coefficient, in this paper we assume that the investors' risk aversion coefficient changes exponentially, i.e., $\eta_t = \eta_T \times q^{T-t}$ for $t = 1, 2, \ldots, T$, where $q$ is a fixed parameter. Compared with the traditional time-consistent strategy (17), the proposed time-consistent strategies account for the intertemporal restrictions, so the investors who adopt the proposed strategies might be more risk-averse than those in Model (38). To this end, we let $q$ and $r_f$ satisfy $q > r_f$. In the following, we discuss the evolution of the time-consistent strategy under two investment situations: (i) there exist a risk-free asset and 4 risky assets in the capital pool; (ii) there exist only 4 risky assets in the capital pool.
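To make the two risk aversion schedules concrete, the following minimal sketch evaluates $\eta_t = \eta_T q^{T-t}$ and $\hat{\eta}_t = \eta_T r_f^{T-t}$. It uses $T = 200$ and $r_f = 1 + 0.0161/12$ from this section, and borrows $\eta_T = 2$ and $q = 1.005$ from the simulations in Section 4.1; the function names are hypothetical.

```python
T = 200                  # investment horizon used in the simulations
eta_T = 2.0              # terminal risk aversion coefficient
q = 1.005                # growth parameter of the assumed schedule
r_f = 1 + 0.0161 / 12    # monthly gross risk-free return, about 1.00134

def eta(t):
    """Assumed exponential schedule eta_t = eta_T * q**(T - t)."""
    return eta_T * q ** (T - t)

def eta_hat(t):
    """Benchmark schedule eta_hat_t = eta_T * r_f**(T - t)."""
    return eta_T * r_f ** (T - t)

for t in (1, 50, 100, 150, 200):
    print(f"t={t:3d}  eta_t={eta(t):.4f}  eta_hat_t={eta_hat(t):.4f}")
```

Because $q > r_f$, the assumed schedule lies above $\hat{\eta}_t$ in every period before $T$ and the two coincide at $t = T$, so the penalty on variance is largest in the earliest periods.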

4.1. The Time-Consistent Strategy with Both Risky Assets and a Risk-Free Asset

In this section, we discuss the evolution of the time-consistent strategy with both risky assets and a risk-free asset. Using the monthly returns of the risky assets from May 2002 to December 2018 as the investment sample, we can derive the corresponding path of the time-consistent strategy. The details are shown in Figure 1, Figure 2, Figure 3 and Figure 4.
Suppose that $\eta_T = 2$ and $q = 1.005$. We compare the time-consistent strategies with and without intertemporal restrictions (i.e., Case I and Case II) to show the impact of the intertemporal restrictions on the time-consistent strategy. As shown in Figure 1, when the intertemporal restrictions are considered in the investment decision, the investors shrink their positions in the risky assets (i.e., they shrink the long positions $\hat{u}_t^2$, $\hat{u}_t^3$ and $\hat{u}_t^4$ as well as the short position $\hat{u}_t^1$) compared with the investment strategy without intertemporal restrictions. This means that the amount invested in the risk-free asset ($R_t - \sum_{i=1}^{n} u_t^i$) increases for a fixed time period, indicating that the investors adopt a conservative strategy to reduce the investment risk in the earlier periods. Additionally, as time passes, the difference between the positions of the time-consistent strategies with and without intertemporal restrictions decreases.
Using the parameters shown in Figure 2, i.e., $\eta_T = 2$ and $q = 1.005$, we compare the time-consistent strategies with and without benchmark orientation (i.e., Case I and Case III) to show the impact of the benchmark on the time-consistent strategy. As shown in Figure 2, the time-consistent strategies with and without benchmark orientation almost coincide. In other words, the benchmark has little impact on the time-consistent strategy when the risk aversion coefficient of the investors is small.
As shown in Figure 3, when the investors have a larger risk aversion coefficient, $\eta_T = 100$, the benchmark process has a significant impact on the time-consistent strategy, especially in the later periods. In this situation, the investors might tend to choose a conservative investment strategy to imitate the return of the benchmark process.
To evaluate whether the time-consistent strategy that considers the benchmark can imitate the return of the benchmark, we give a more intuitive simulation. In addition to the conditions of Case I, we also suppose that $\eta_T = 100$; the return of the portfolio in the different time periods can then be derived. As shown in Figure 4, the return of the benchmark has almost the same trend as that of the proposed portfolio. This result indicates that, when the investors have a larger risk aversion coefficient, the proposed time-consistent strategy can indeed imitate the return of the benchmark.

4.2. The Time-Consistent Strategy with Only Risky Assets

In this section, we discuss the evolution of the time-consistent strategy with only risky assets. As in Section 4.1, we can derive the corresponding path of the time-consistent strategy. The details are shown in Figure 5, Figure 6, Figure 7 and Figure 8.
Suppose that $\eta_T = 2$ and $q = 1.005$. We compare the time-consistent strategies with and without intertemporal restrictions (i.e., Case I and Case II). As shown in Figure 5, when the intertemporal restrictions are considered in the investment decision, the investors adjust their positions in the risky assets (i.e., they shrink the short position $\hat{u}_t^1$ and the long position $\hat{u}_t^2$, while increasing the long positions $\hat{u}_t^3$ and $\hat{u}_t^4$) compared with the investment strategy without intertemporal restrictions. Unlike the case with both multiple risky assets and a risk-free asset (where the investors can reduce the portfolio risk by increasing the amount invested in the risk-free asset), when there are only risky assets, the investors can reduce the investment risk in the earlier periods only by adjusting their positions among the risky assets. Similarly, as time passes, the difference between the positions of the time-consistent strategies with and without intertemporal restrictions decreases.
As in Figure 2 and Figure 3, we also suppose that $\eta_T = 2$ or $100$ and $q = 1.005$. In the following, we compare the time-consistent strategies with and without benchmark orientation (i.e., Case I and Case III) to show the impact of the benchmark on the time-consistent strategy. Figure 6 and Figure 7 show that the benchmark has a significant impact on the time-consistent strategy regardless of whether the investors have small or large risk aversion coefficients. Additionally, comparing Figure 2 and Figure 3 with Figure 6 and Figure 7, we find that the benchmark has a larger impact on the time-consistent strategy with only risky assets than on the time-consistent strategy with both a risk-free asset and multiple risky assets.
In addition to the conditions of Case I, we also suppose that $\eta_T = 100$. As shown in Figure 8, the return of the benchmark and the return of the proposed portfolio have almost the same trend, which is consistent with the conclusion shown in Figure 4. This result indicates that, when the investors have a larger risk aversion coefficient, the proposed time-consistent strategy can also imitate the return of the benchmark.

5. Conclusions

In this paper, we investigate a generalized multiperiod mean-variance portfolio optimization with consideration of benchmark orientation and intertemporal restrictions. Since the proposed model is a time-inconsistent problem, we cannot solve it directly by using the traditional dynamic programming approach. Although this problem can be solved indirectly by the embedding scheme, this approach cannot guarantee that the derived strategy (i.e., the precommitment strategy) is time-consistent. Thus, the precommitment strategy has been criticized by some researchers for lacking rationality. In this paper, we adopt a game approach to solve the proposed model, in which the investment decision-making process is treated as a noncooperative game. We assume that there exist T players who stand in different time periods, each of whom aims to maximize their own generalized mean-variance sub-objective. The Nash equilibrium solution of this game is then defined as the time-consistent strategy for the proposed model. In this framework, we derive the time-consistent strategies for the proposed model with and without a risk-free asset by using the backward induction approach. We find that, when there exists a risk-free asset in the capital pool, the time-consistent strategy is a feedback strategy on the benchmark process; when the capital pool contains only risky assets, the time-consistent strategy is a double feedback strategy on both the benchmark process and the wealth process. Finally, we provide some numerical simulations to illustrate the conclusions derived in this study. These results indicate that the proposed time-consistent strategy can not only reduce the risk that exists in the intermediate periods of the investment but also imitate the return of the benchmark process.
The above game approach can be extended to many time-inconsistent dynamic optimization problems. More importantly, this approach can provide a more suitable strategy for sophisticated decision-makers, since it takes possible future revisions into account. The current work can be extended in the following two directions. First, this paper assumes that the risk aversion coefficient is independent of current wealth; however, in some cases, the risk aversion coefficient of investors also depends on their level of wealth. Intuitively, the greater the wealth of investors, the less risk averse they are likely to be. Therefore, the case in which the risk aversion depends dynamically on current wealth is worth investigating in future work. Second, we can introduce a Markov chain into the proposed model to investigate the time-consistent strategy in a regime-switching environment.

Author Contributions

The contributions of the authors are as follows: writing–original draft preparation, H.X.; data curation, T.R.; supervision, Z.Z.

Funding

This research is supported by the National Natural Science Foundation of China (Nos. 71771082 and 71801091) and Hunan Provincial Natural Science Foundation of China (No. 2017JJ1012).

Acknowledgments

The authors are grateful to the anonymous reviewers and the editor for the valuable comments and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. The Proof of Proposition 1

When $k = T-1$, according to the definition of $J_k(R_k, B_k, u)$, we have
$$
V_{T-1}(R_{T-1}, B_{T-1}) = \max_{u_{T-1}} w_T\left[\xi_T E_{T-1}(R_T - \theta_T B_T) - \eta_T \mathrm{Var}_{T-1}(R_T - \theta_T B_T)\right].
$$
This indicates that Proposition 1 holds for $k = T-1$. When $k = 0, 1, \ldots, T-2$, the function $J_k(R_k, B_k, u)$ can be expressed as
$$
\begin{aligned}
J_k(R_k, B_k, u) &= \sum_{t=k+1}^{T} w_t\left[\xi_t E_k(R_t - \theta_t B_t) - \eta_t \mathrm{Var}_k(R_t - \theta_t B_t)\right]\\
&= \sum_{t=k+2}^{T} w_t\left[\xi_t E_k(R_t - \theta_t B_t) - \eta_t \mathrm{Var}_k(R_t - \theta_t B_t)\right]\\
&\quad + w_{k+1}\left[\xi_{k+1} E_k(R_{k+1} - \theta_{k+1} B_{k+1}) - \eta_{k+1} \mathrm{Var}_k(R_{k+1} - \theta_{k+1} B_{k+1})\right].
\end{aligned}
$$
By using the law of iterated expectations and the law of total variance, we have
$$
E_k(R_t - \theta_t B_t) = E_k\left[E_{k+1}(R_t - \theta_t B_t)\right],
$$
$$
\mathrm{Var}_k(R_t - \theta_t B_t) = E_k\left[\mathrm{Var}_{k+1}(R_t - \theta_t B_t)\right] + \mathrm{Var}_k\left[E_{k+1}(R_t - \theta_t B_t)\right].
$$
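The two identities used here can be checked numerically on a small two-stage tree. The sketch below is purely illustrative: the states, probabilities and outcomes are made-up numbers, with the first-stage state playing the role of the information available at time $k+1$.

```python
# state -> (probability of the state, [(outcome, conditional probability), ...])
tree = {
    "up": (0.6, [(1.10, 0.5), (1.02, 0.5)]),
    "down": (0.4, [(0.98, 0.3), (0.90, 0.7)]),
}

def mean(pairs):
    """Expectation of a discrete distribution given as (value, prob) pairs."""
    return sum(v * p for v, p in pairs)

def var(pairs):
    """Variance of a discrete distribution given as (value, prob) pairs."""
    m = mean(pairs)
    return sum(p * (v - m) ** 2 for v, p in pairs)

# Unconditional distribution of the outcome.
flat = [(v, ps * pc) for ps, kids in tree.values() for v, pc in kids]

# Conditional means and variances, weighted by the state probabilities.
cond_means = [(mean(kids), ps) for ps, kids in tree.values()]
cond_vars = [(var(kids), ps) for ps, kids in tree.values()]

lhs_mean, rhs_mean = mean(flat), mean(cond_means)               # iterated expectations
lhs_var, rhs_var = var(flat), mean(cond_vars) + var(cond_means)  # total variance

print(f"E[X]   = {lhs_mean:.6f}   E[E[X|S]] = {rhs_mean:.6f}")
print(f"Var[X] = {lhs_var:.6f}   E[Var[X|S]] + Var[E[X|S]] = {rhs_var:.6f}")
```

The term $\mathrm{Var}_k[E_{k+1}(\cdot)]$ is exactly what prevents the objective from satisfying the usual Bellman recursion, which is why it appears as a separate correction in the recursion that follows.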
Then, $J_k(R_k, B_k, u)$ can be rewritten as
$$
\begin{aligned}
J_k(R_k, B_k, u) &= E_k\Big\{\sum_{t=k+2}^{T} w_t\left[\xi_t E_{k+1}(R_t - \theta_t B_t) - \eta_t \mathrm{Var}_{k+1}(R_t - \theta_t B_t)\right]\Big\} - \sum_{t=k+2}^{T} w_t\eta_t \mathrm{Var}_k\left[E_{k+1}(R_t - \theta_t B_t)\right]\\
&\quad + w_{k+1}\left[\xi_{k+1} E_k(R_{k+1} - \theta_{k+1} B_{k+1}) - \eta_{k+1} \mathrm{Var}_k(R_{k+1} - \theta_{k+1} B_{k+1})\right]\\
&= E_k\left[J_{k+1}(R_{k+1}, B_{k+1}, u)\right] - \sum_{t=k+2}^{T} w_t\eta_t \mathrm{Var}_k\left[E_{k+1}(R_t - \theta_t B_t)\right]\\
&\quad + w_{k+1}\left[\xi_{k+1} E_k(R_{k+1} - \theta_{k+1} B_{k+1}) - \eta_{k+1} \mathrm{Var}_k(R_{k+1} - \theta_{k+1} B_{k+1})\right].
\end{aligned}
$$
Let $f_{k,t}(R_k, B_k) = E_k[R_t - \theta_t B_t]\big|_{\hat{u}(k)} = E_k[f_{k+1,t}(R_{k+1}, B_{k+1})]$. Additionally, since $V_k(R_k, B_k) = \max_{u_k} J_k(R_k, B_k, u(k)) = J_k(R_k, B_k, \hat{u}(k))$, we have
$$
\begin{aligned}
V_k(R_k, B_k) = \max_{u_k}\Big\{& E_k\left[V_{k+1}(R_{k+1}, B_{k+1})\right] - \sum_{t=k+2}^{T} w_t\eta_t \mathrm{Var}_k\left[f_{k+1,t}(R_{k+1}, B_{k+1})\right]\\
& + w_{k+1}\left[\xi_{k+1} E_k(R_{k+1} - \theta_{k+1} B_{k+1}) - \eta_{k+1} \mathrm{Var}_k(R_{k+1} - \theta_{k+1} B_{k+1})\right]\Big\}, \quad k = 0, 1, \ldots, T-2.
\end{aligned}
$$
This completes the proof of Proposition 1.

Appendix B. The Proof of Theorem 1

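When $k = T-1$, we have
$$
\begin{aligned}
V_{T-1}(R_{T-1}, B_{T-1}) &= \max_{u_{T-1}} J_{T-1}(R_{T-1}, B_{T-1}, u_{T-1})\\
&= \max_{u_{T-1}} w_T\left[\xi_T E_{T-1}(R_T - \theta_T B_T) - \eta_T \mathrm{Var}_{T-1}(R_T - \theta_T B_T)\right]\\
&= \max_{u_{T-1}} w_T\xi_T\left[s_{T-1}R_{T-1} + \mu_{T-1}'u_{T-1} - \theta_T\nu_{T-1}B_{T-1}\right]\\
&\quad - w_T\eta_T\left[u_{T-1}'\Omega_{T-1}u_{T-1} + \theta_T^2\sigma_{T-1}B_{T-1}^2 - 2\theta_T Q_{T-1}'u_{T-1}B_{T-1}\right].
\end{aligned} \tag{A7}
$$
Since (A7) is a convex programming problem, the first-order necessary optimality condition gives
$$
\hat{u}_{T-1} = \frac{\Omega_{T-1}^{-1}\left[\xi_T\mu_{T-1} + 2\eta_T\theta_T Q_{T-1}B_{T-1}\right]}{2\eta_T} = \hat{a}_{T-1}B_{T-1} + \hat{b}_{T-1}. \tag{A8}
$$
Substituting (A8) into (A7), $V_{T-1}(R_{T-1}, B_{T-1})$ can be expressed as
$$
\begin{aligned}
V_{T-1}(R_{T-1}, B_{T-1}) &= w_T\xi_T s_{T-1}R_{T-1} + \left[2w_T\eta_T\theta_T Q_{T-1}'\hat{a}_{T-1} - w_T\eta_T\hat{a}_{T-1}'\Omega_{T-1}\hat{a}_{T-1} - w_T\eta_T\theta_T^2\sigma_{T-1}\right]B_{T-1}^2\\
&\quad + \left[w_T\xi_T\mu_{T-1}'\hat{a}_{T-1} - w_T\xi_T\theta_T\nu_{T-1} - 2w_T\eta_T\hat{a}_{T-1}'\Omega_{T-1}\hat{b}_{T-1} + 2w_T\eta_T\theta_T Q_{T-1}'\hat{b}_{T-1}\right]B_{T-1}\\
&\quad + w_T\xi_T\mu_{T-1}'\hat{b}_{T-1} - w_T\eta_T\hat{b}_{T-1}'\Omega_{T-1}\hat{b}_{T-1}\\
&= w_T\xi_T s_{T-1}R_{T-1} + \hat{m}_{T-1}B_{T-1}^2 + \hat{n}_{T-1}B_{T-1} + \hat{\gamma}_{T-1},
\end{aligned}
$$
and
$$
\begin{aligned}
f_{T-1,T}(R_{T-1}, B_{T-1}) &= s_{T-1}R_{T-1} + \left[\mu_{T-1}'\hat{a}_{T-1} - \theta_T\nu_{T-1}\right]B_{T-1} + \mu_{T-1}'\hat{b}_{T-1}\\
&= s_{T-1}R_{T-1} + \hat{\rho}_{T-1,T}B_{T-1} + \hat{\kappa}_{T-1,T},
\end{aligned}
$$
where
$$
\begin{aligned}
\hat{a}_{T-1} &= \theta_T\Omega_{T-1}^{-1}Q_{T-1}, \qquad \hat{b}_{T-1} = \frac{\xi_T\Omega_{T-1}^{-1}\mu_{T-1}}{2\eta_T},\\
\hat{m}_{T-1} &= w_T\eta_T\left(2\theta_T Q_{T-1}'\hat{a}_{T-1} - \hat{a}_{T-1}'\Omega_{T-1}\hat{a}_{T-1} - \theta_T^2\sigma_{T-1}\right),\\
\hat{n}_{T-1} &= w_T\xi_T\left(\mu_{T-1}'\hat{a}_{T-1} - \theta_T\nu_{T-1}\right) - 2w_T\eta_T\left(\hat{a}_{T-1}'\Omega_{T-1}\hat{b}_{T-1} - \theta_T Q_{T-1}'\hat{b}_{T-1}\right),\\
\hat{\gamma}_{T-1} &= w_T\xi_T\mu_{T-1}'\hat{b}_{T-1} - w_T\eta_T\hat{b}_{T-1}'\Omega_{T-1}\hat{b}_{T-1},\\
\hat{\rho}_{T-1,T} &= \mu_{T-1}'\hat{a}_{T-1} - \theta_T\nu_{T-1}, \qquad \hat{\kappa}_{T-1,T} = \mu_{T-1}'\hat{b}_{T-1}.
\end{aligned}
$$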
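The terminal-step solution (A8) can be verified numerically by checking that the gradient of the objective in (A7) vanishes at $\hat{u}_{T-1} = \hat{a}_{T-1}B_{T-1} + \hat{b}_{T-1}$. The sketch below is an illustration under assumptions rather than the authors' code: it takes $\Omega$, $Q$ and $\lambda$ from Section 4, assumes the excess-return mean is $\mu = \lambda - s$ with $s = 1.00134$, and uses the illustrative values $\xi_T = 1$, $\theta_T = 0.5$, $\eta_T = 2$ and $B_{T-1} = 1$. The helper `solve` is a hypothetical name.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

Omega = [[0.0505, 0.0061, 0.0041, 0.0017],
         [0.0061, 0.0064, 0.0025, 0.0010],
         [0.0041, 0.0025, 0.0094, 0.0007],
         [0.0017, 0.0010, 0.0007, 0.0020]]
Q = [0.0037, 0.0022, 0.0025, 0.0008]
lam = [1.0044, 0.9967, 1.0047, 1.0063]
s = 1.00134
mu = [l - s for l in lam]          # assumption: mean excess return = lambda - s

xi, theta, eta, B = 1.0, 0.5, 2.0, 1.0   # illustrative parameter values

a_hat = [theta * v for v in solve(Omega, Q)]            # theta * Omega^{-1} Q
b_hat = [xi * v / (2 * eta) for v in solve(Omega, mu)]  # xi * Omega^{-1} mu / (2 eta)
u = [a * B + b for a, b in zip(a_hat, b_hat)]

# Gradient of the objective in (A7), up to the common factor w_T:
# xi*mu - 2*eta*Omega*u + 2*eta*theta*Q*B
Omega_u = [sum(Omega[i][j] * u[j] for j in range(4)) for i in range(4)]
grad = [xi * mu[i] - 2 * eta * Omega_u[i] + 2 * eta * theta * Q[i] * B
        for i in range(4)]

print("u_hat =", [round(v, 4) for v in u])
print("max |gradient| =", max(abs(g) for g in grad))
```

Note that $\hat{u}_{T-1}$ does not depend on $R_{T-1}$, which is the sense in which the strategy with a risk-free asset is a feedback strategy on the benchmark only.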
Suppose that Theorem 1 holds for $k = j+1, j+2, \ldots, T-1$. Then, when $k = j$, we have
$$
\begin{aligned}
V_j(R_j, B_j) &= \max_{u_j}\Big\{E_j\left[V_{j+1}(R_{j+1}, B_{j+1})\right] - \sum_{m=j+2}^{T} w_m\eta_m \mathrm{Var}_j\left[f_{j+1,m}(R_{j+1}, B_{j+1})\right]\\
&\qquad + w_{j+1}\left[\xi_{j+1}E_j(R_{j+1} - \theta_{j+1}B_{j+1}) - \eta_{j+1}\mathrm{Var}_j(R_{j+1} - \theta_{j+1}B_{j+1})\right]\Big\}\\
&= \max_{u_j}\Big\{\Big(\sum_{m=j+1}^{T} w_m\xi_m\prod_{k=j}^{m-1}s_k\Big)R_j + \hat{m}_{j+1}(\sigma_j + \nu_j^2)B_j^2 + \hat{n}_{j+1}\nu_j B_j + \hat{\gamma}_{j+1}\\
&\qquad + \Big(\sum_{m=j+1}^{T} w_m\xi_m\prod_{k=j+1}^{m-1}s_k\Big)\mu_j'u_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\hat{\rho}_{j+1,m}\prod_{k=j+1}^{m-1}s_k\Big)Q_j'u_j B_j\\
&\qquad - \Big(\sum_{m=j+1}^{T} w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)u_j'\Omega_j u_j - \Big(\sum_{m=j+1}^{T} w_m\eta_m\hat{\rho}_{j+1,m}^2\Big)\sigma_j B_j^2 - w_{j+1}\xi_{j+1}\theta_{j+1}\nu_j B_j\Big\}.
\end{aligned} \tag{A12}
$$
Similarly, since (A12) is also a convex programming problem, the first-order necessary optimality condition gives
$$
\hat{u}_j = \frac{\sum_{m=j+1}^{T}\Big(w_m\xi_m\prod_{k=j+1}^{m-1}s_k\Big)\Omega_j^{-1}\mu_j - 2\sum_{m=j+1}^{T}\Big(w_m\eta_m\hat{\rho}_{j+1,m}\prod_{k=j+1}^{m-1}s_k\Big)\Omega_j^{-1}Q_j B_j}{2\sum_{m=j+1}^{T}\Big(w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)} = \hat{a}_j B_j + \hat{b}_j. \tag{A13}
$$
Substituting (A13) into (A12), $V_j(R_j, B_j)$ can be expressed as
$$
\begin{aligned}
V_j(R_j, B_j) &= \Big(\sum_{m=j+1}^{T} w_m\xi_m\prod_{k=j}^{m-1}s_k\Big)R_j\\
&\quad + \Big[\hat{m}_{j+1}(\sigma_j + \nu_j^2) - 2\sum_{m=j+1}^{T}\Big(w_m\eta_m\hat{\rho}_{j+1,m}\prod_{k=j+1}^{m-1}s_k\Big)Q_j'\hat{a}_j - \sum_{m=j+1}^{T}\Big(w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)\hat{a}_j'\Omega_j\hat{a}_j - \sum_{m=j+1}^{T}\Big(w_m\eta_m\hat{\rho}_{j+1,m}^2\Big)\sigma_j\Big]B_j^2\\
&\quad + \Big[\hat{n}_{j+1}\nu_j - w_{j+1}\xi_{j+1}\theta_{j+1}\nu_j + \Big(\sum_{m=j+1}^{T} w_m\xi_m\prod_{k=j+1}^{m-1}s_k\Big)\mu_j'\hat{a}_j - 2\sum_{m=j+1}^{T}\Big(w_m\eta_m\hat{\rho}_{j+1,m}\prod_{k=j+1}^{m-1}s_k\Big)Q_j'\hat{b}_j - 2\sum_{m=j+1}^{T}\Big(w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)\hat{a}_j'\Omega_j\hat{b}_j\Big]B_j\\
&\quad + \hat{\gamma}_{j+1} + \Big(\sum_{m=j+1}^{T} w_m\xi_m\prod_{k=j+1}^{m-1}s_k\Big)\mu_j'\hat{b}_j - \Big(\sum_{m=j+1}^{T} w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)\hat{b}_j'\Omega_j\hat{b}_j\\
&= \Big(\sum_{m=j+1}^{T} w_m\xi_m\prod_{k=j}^{m-1}s_k\Big)R_j + \hat{m}_j B_j^2 + \hat{n}_j B_j + \hat{\gamma}_j,
\end{aligned}
$$
and the function $f_{j,\tau}(R_j, B_j)$ can be written as
$$
\begin{aligned}
f_{j,\tau}(R_j, B_j) &= E_j\left[f_{j+1,\tau}(R_{j+1}, B_{j+1})\right]\\
&= \prod_{m=j}^{\tau-1}s_m R_j + \Big[\prod_{m=j+1}^{\tau-1}s_m\,\mu_j'\hat{a}_j + \hat{\rho}_{j+1,\tau}\nu_j\Big]B_j + \prod_{m=j+1}^{\tau-1}s_m\,\mu_j'\hat{b}_j + \hat{\kappa}_{j+1,\tau}\\
&= \prod_{m=j}^{\tau-1}s_m R_j + \hat{\rho}_{j,\tau}B_j + \hat{\kappa}_{j,\tau},
\end{aligned}
$$
where
$$
\begin{aligned}
\hat{a}_j &= -\frac{\sum_{m=j+1}^{T}\Big(w_m\eta_m\hat{\rho}_{j+1,m}\prod_{k=j+1}^{m-1}s_k\Big)\Omega_j^{-1}Q_j}{\sum_{m=j+1}^{T}\Big(w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)}, \qquad
\hat{b}_j = \frac{\sum_{m=j+1}^{T}\Big(w_m\xi_m\prod_{k=j+1}^{m-1}s_k\Big)\Omega_j^{-1}\mu_j}{2\sum_{m=j+1}^{T}\Big(w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)},\\
\hat{m}_j &= \hat{m}_{j+1}(\sigma_j + \nu_j^2) - 2\sum_{m=j+1}^{T}\Big(w_m\eta_m\hat{\rho}_{j+1,m}\prod_{k=j+1}^{m-1}s_k\Big)Q_j'\hat{a}_j - \sum_{m=j+1}^{T}\Big(w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)\hat{a}_j'\Omega_j\hat{a}_j - \sum_{m=j+1}^{T}\Big(w_m\eta_m\hat{\rho}_{j+1,m}^2\Big)\sigma_j,\\
\hat{n}_j &= \hat{n}_{j+1}\nu_j - w_{j+1}\xi_{j+1}\theta_{j+1}\nu_j + \Big(\sum_{m=j+1}^{T} w_m\xi_m\prod_{k=j+1}^{m-1}s_k\Big)\mu_j'\hat{a}_j - 2\sum_{m=j+1}^{T}\Big(w_m\eta_m\hat{\rho}_{j+1,m}\prod_{k=j+1}^{m-1}s_k\Big)Q_j'\hat{b}_j - 2\sum_{m=j+1}^{T}\Big(w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)\hat{a}_j'\Omega_j\hat{b}_j,\\
\hat{\gamma}_j &= \hat{\gamma}_{j+1} + \Big(\sum_{m=j+1}^{T} w_m\xi_m\prod_{k=j+1}^{m-1}s_k\Big)\mu_j'\hat{b}_j - \Big(\sum_{m=j+1}^{T} w_m\eta_m\prod_{k=j+1}^{m-1}s_k^2\Big)\hat{b}_j'\Omega_j\hat{b}_j,\\
\hat{\rho}_{j,\tau} &= \prod_{m=j+1}^{\tau-1}s_m\,\mu_j'\hat{a}_j + \hat{\rho}_{j+1,\tau}\nu_j, \qquad
\hat{\kappa}_{j,\tau} = \prod_{m=j+1}^{\tau-1}s_m\,\mu_j'\hat{b}_j + \hat{\kappa}_{j+1,\tau}.
\end{aligned}
$$
According to the above proof, we can conclude that Theorem 1 holds for all $t = 0, 1, \ldots, T-1$.

Appendix C. The Proof of Theorem 2

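For $k = T-1$, we have
$$
\begin{aligned}
V_{T-1}(R_{T-1}, B_{T-1}) &= \max_{u_{T-1}} w_T\left[\xi_T E_{T-1}(R_T - \theta_T B_T) - \eta_T \mathrm{Var}_{T-1}(R_T - \theta_T B_T)\right]\\
&= \max_{\{u_{T-1}:\,R_{T-1} = I'u_{T-1}\}} w_T\xi_T\left[\lambda_{T-1}'u_{T-1} - \theta_T\nu_{T-1}B_{T-1}\right]\\
&\qquad - w_T\eta_T\left[u_{T-1}'\Omega_{T-1}u_{T-1} + \theta_T^2 B_{T-1}^2\sigma_{T-1} - 2\theta_T B_{T-1}Q_{T-1}'u_{T-1}\right]\\
&= \max_{\{u_{T-1}:\,R_{T-1} = I'u_{T-1}\}} -w_T\xi_T\theta_T\nu_{T-1}B_{T-1} - w_T\eta_T\theta_T^2 B_{T-1}^2\sigma_{T-1} + w_T\xi_T\lambda_{T-1}'u_{T-1}\\
&\qquad - w_T\eta_T u_{T-1}'\Omega_{T-1}u_{T-1} + 2w_T\eta_T\theta_T B_{T-1}Q_{T-1}'u_{T-1}.
\end{aligned} \tag{A17}
$$
Here, we can construct the following Lagrange function
$$
L_{T-1}(u_{T-1}, \zeta_{T-1}) = w_T\xi_T\lambda_{T-1}'u_{T-1} - w_T\eta_T u_{T-1}'\Omega_{T-1}u_{T-1} + 2w_T\eta_T\theta_T B_{T-1}Q_{T-1}'u_{T-1} - \zeta_{T-1}\left(I'u_{T-1} - R_{T-1}\right).
$$
Since (A17) is a convex programming problem, the first-order necessary optimality condition gives
$$
\begin{cases}
w_T\xi_T\lambda_{T-1} - 2w_T\eta_T\Omega_{T-1}u_{T-1} + 2w_T\eta_T\theta_T B_{T-1}Q_{T-1} - \zeta_{T-1}I = 0,\\
I'u_{T-1} - R_{T-1} = 0.
\end{cases} \tag{A19}
$$
By solving Equation (A19), we can conclude that
$$
\begin{aligned}
\hat{u}_{T-1} &= \frac{w_T\xi_T\Omega_{T-1}^{-1}\lambda_{T-1} + 2w_T\eta_T\theta_T B_{T-1}\Omega_{T-1}^{-1}Q_{T-1}}{2w_T\eta_T} - \frac{w_T\xi_T I'\Omega_{T-1}^{-1}\lambda_{T-1} + 2w_T\eta_T\theta_T I'\Omega_{T-1}^{-1}Q_{T-1}B_{T-1} - 2w_T\eta_T R_{T-1}}{2w_T\eta_T\,I'\Omega_{T-1}^{-1}I}\,\Omega_{T-1}^{-1}I\\
&= \frac{\Omega_{T-1}^{-1}I}{I'\Omega_{T-1}^{-1}I}R_{T-1} + \Big[\theta_T\Omega_{T-1}^{-1}Q_{T-1} - \frac{\theta_T I'\Omega_{T-1}^{-1}Q_{T-1}}{I'\Omega_{T-1}^{-1}I}\,\Omega_{T-1}^{-1}I\Big]B_{T-1} + \frac{\xi_T\Omega_{T-1}^{-1}\lambda_{T-1}}{2\eta_T} - \frac{\xi_T I'\Omega_{T-1}^{-1}\lambda_{T-1}}{2\eta_T\,I'\Omega_{T-1}^{-1}I}\,\Omega_{T-1}^{-1}I\\
&= \tilde{a}_{T-1}B_{T-1} + \tilde{b}_{T-1}R_{T-1} + \tilde{c}_{T-1},
\end{aligned}
$$
where
$$
\begin{aligned}
\tilde{a}_{T-1} &= \theta_T\Omega_{T-1}^{-1}Q_{T-1} - \frac{\theta_T I'\Omega_{T-1}^{-1}Q_{T-1}}{I'\Omega_{T-1}^{-1}I}\,\Omega_{T-1}^{-1}I,\\
\tilde{b}_{T-1} &= \frac{\Omega_{T-1}^{-1}I}{I'\Omega_{T-1}^{-1}I},\\
\tilde{c}_{T-1} &= \frac{\xi_T\Omega_{T-1}^{-1}\lambda_{T-1}}{2\eta_T} - \frac{\xi_T I'\Omega_{T-1}^{-1}\lambda_{T-1}}{2\eta_T\,I'\Omega_{T-1}^{-1}I}\,\Omega_{T-1}^{-1}I.
\end{aligned}
$$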
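A small numerical check makes the structure of these coefficients visible: $I'\tilde{a}_{T-1} = 0$, $I'\tilde{b}_{T-1} = 1$ and $I'\tilde{c}_{T-1} = 0$, so the budget constraint $I'\hat{u}_{T-1} = R_{T-1}$ holds for any wealth and benchmark level. The sketch below uses the Section 4 estimates with the illustrative values $\xi_T = 1$, $\theta_T = 0.5$, $\eta_T = 2$, $R_{T-1} = 1.3$ and $B_{T-1} = 0.9$; these values and the helper `solve` are assumptions made for the example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

Omega = [[0.0505, 0.0061, 0.0041, 0.0017],
         [0.0061, 0.0064, 0.0025, 0.0010],
         [0.0041, 0.0025, 0.0094, 0.0007],
         [0.0017, 0.0010, 0.0007, 0.0020]]
Q = [0.0037, 0.0022, 0.0025, 0.0008]
lam = [1.0044, 0.9967, 1.0047, 1.0063]
xi, theta, eta = 1.0, 0.5, 2.0    # illustrative parameter values
n = 4

x1 = solve(Omega, [1.0] * n)      # Omega^{-1} I
xq = solve(Omega, Q)              # Omega^{-1} Q
xl = solve(Omega, lam)            # Omega^{-1} lambda
s1, sq, sl = sum(x1), sum(xq), sum(xl)

a = [theta * (xq[i] - sq / s1 * x1[i]) for i in range(n)]
b = [x1[i] / s1 for i in range(n)]
c = [xi / (2 * eta) * (xl[i] - sl / s1 * x1[i]) for i in range(n)]

R, B = 1.3, 0.9                   # arbitrary wealth and benchmark levels
u = [a[i] * B + b[i] * R + c[i] for i in range(n)]

# a' Omega b, which should vanish because Omega b is proportional to I.
a_Omega_b = sum(a[i] * Omega[i][j] * b[j] for i in range(n) for j in range(n))

print("sum(a), sum(b), sum(c) =", sum(a), sum(b), sum(c))
print("sum(u) - R =", sum(u) - R)
print("a' Omega b =", a_Omega_b)
```

The coefficient $\tilde{b}_{T-1}$ is the minimum-variance portfolio, while $\tilde{a}_{T-1}$ and $\tilde{c}_{T-1}$ are self-financing adjustments that tilt the portfolio toward the benchmark and toward higher expected return, respectively.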
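Thus, the value function $V_{T-1}(R_{T-1}, B_{T-1})$ can be expressed as
$$
\begin{aligned}
V_{T-1}(R_{T-1}, B_{T-1}) &= -w_T\xi_T\theta_T\nu_{T-1}B_{T-1} - w_T\eta_T\theta_T^2\sigma_{T-1}B_{T-1}^2 + w_T\xi_T\lambda_{T-1}'\left(\tilde{a}_{T-1}B_{T-1} + \tilde{b}_{T-1}R_{T-1} + \tilde{c}_{T-1}\right)\\
&\quad - w_T\eta_T\left(\tilde{a}_{T-1}B_{T-1} + \tilde{b}_{T-1}R_{T-1} + \tilde{c}_{T-1}\right)'\Omega_{T-1}\left(\tilde{a}_{T-1}B_{T-1} + \tilde{b}_{T-1}R_{T-1} + \tilde{c}_{T-1}\right)\\
&\quad + 2w_T\eta_T\theta_T B_{T-1}Q_{T-1}'\left(\tilde{a}_{T-1}B_{T-1} + \tilde{b}_{T-1}R_{T-1} + \tilde{c}_{T-1}\right).
\end{aligned}
$$
Since $\tilde{a}_{T-1}'\Omega_{T-1}\tilde{b}_{T-1} = 0$ and $\tilde{b}_{T-1}'\Omega_{T-1}\tilde{c}_{T-1} = 0$, we have
$$
\begin{aligned}
V_{T-1}(R_{T-1}, B_{T-1}) &= \left[-w_T\eta_T\theta_T^2\sigma_{T-1} - w_T\eta_T\tilde{a}_{T-1}'\Omega_{T-1}\tilde{a}_{T-1} + 2w_T\eta_T\theta_T Q_{T-1}'\tilde{a}_{T-1}\right]B_{T-1}^2 + \left[-w_T\eta_T\tilde{b}_{T-1}'\Omega_{T-1}\tilde{b}_{T-1}\right]R_{T-1}^2\\
&\quad + 2w_T\eta_T\theta_T Q_{T-1}'\tilde{b}_{T-1}B_{T-1}R_{T-1}\\
&\quad + \left[-w_T\xi_T\theta_T\nu_{T-1} + w_T\xi_T\lambda_{T-1}'\tilde{a}_{T-1} - 2w_T\eta_T\tilde{a}_{T-1}'\Omega_{T-1}\tilde{c}_{T-1} + 2w_T\eta_T\theta_T Q_{T-1}'\tilde{c}_{T-1}\right]B_{T-1}\\
&\quad + w_T\xi_T\lambda_{T-1}'\tilde{b}_{T-1}R_{T-1} + \left[w_T\xi_T\lambda_{T-1}'\tilde{c}_{T-1} - w_T\eta_T\tilde{c}_{T-1}'\Omega_{T-1}\tilde{c}_{T-1}\right]\\
&= \tilde{m}_{T-1}B_{T-1}^2 + \tilde{\alpha}_{T-1}R_{T-1}^2 + \tilde{\varphi}_{T-1}B_{T-1}R_{T-1} + \tilde{n}_{T-1}B_{T-1} + \tilde{\beta}_{T-1}R_{T-1} + \tilde{\gamma}_{T-1},
\end{aligned}
$$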
where
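$$
\begin{aligned}
\tilde{m}_{T-1} &= -w_T\eta_T\theta_T^2\sigma_{T-1} - w_T\eta_T\tilde{a}_{T-1}'\Omega_{T-1}\tilde{a}_{T-1} + 2w_T\eta_T\theta_T Q_{T-1}'\tilde{a}_{T-1},\\
\tilde{\alpha}_{T-1} &= -w_T\eta_T\tilde{b}_{T-1}'\Omega_{T-1}\tilde{b}_{T-1},\\
\tilde{\varphi}_{T-1} &= 2w_T\eta_T\theta_T Q_{T-1}'\tilde{b}_{T-1},\\
\tilde{n}_{T-1} &= -w_T\xi_T\theta_T\nu_{T-1} + w_T\xi_T\lambda_{T-1}'\tilde{a}_{T-1} - 2w_T\eta_T\tilde{a}_{T-1}'\Omega_{T-1}\tilde{c}_{T-1} + 2w_T\eta_T\theta_T Q_{T-1}'\tilde{c}_{T-1},\\
\tilde{\beta}_{T-1} &= w_T\xi_T\lambda_{T-1}'\tilde{b}_{T-1},\\
\tilde{\gamma}_{T-1} &= w_T\xi_T\lambda_{T-1}'\tilde{c}_{T-1} - w_T\eta_T\tilde{c}_{T-1}'\Omega_{T-1}\tilde{c}_{T-1}.
\end{aligned}
$$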
Additionally, the function $f_{T-1,T}(R_{T-1}, B_{T-1})$ can be expressed as
$$
\begin{aligned}
f_{T-1,T}(R_{T-1}, B_{T-1}) &= \lambda_{T-1}'\left(\tilde{a}_{T-1}B_{T-1} + \tilde{b}_{T-1}R_{T-1} + \tilde{c}_{T-1}\right) - \theta_T\nu_{T-1}B_{T-1}\\
&= \left[\lambda_{T-1}'\tilde{a}_{T-1} - \theta_T\nu_{T-1}\right]B_{T-1} + \lambda_{T-1}'\tilde{b}_{T-1}R_{T-1} + \lambda_{T-1}'\tilde{c}_{T-1}\\
&= \tilde{\vartheta}_{T-1,T}R_{T-1} + \tilde{\rho}_{T-1,T}B_{T-1} + \tilde{\kappa}_{T-1,T},
\end{aligned}
$$
where
$$
\tilde{\vartheta}_{T-1,T} = \lambda_{T-1}'\tilde{b}_{T-1}, \quad \tilde{\rho}_{T-1,T} = \lambda_{T-1}'\tilde{a}_{T-1} - \theta_T\nu_{T-1}, \quad \tilde{\kappa}_{T-1,T} = \lambda_{T-1}'\tilde{c}_{T-1}.
$$
Here, we define
$$
\tilde{\Omega}_{T-2} = \Big(\sum_{m=T-1}^{T} w_m\eta_m\tilde{\vartheta}_{T-1,m}^2\Big)\Omega_{T-2} - \tilde{\alpha}_{T-1}\Xi_{T-2}.
$$
Since $\tilde{\alpha}_{T-1} = -w_T\eta_T\tilde{b}_{T-1}'\Omega_{T-1}\tilde{b}_{T-1} = -\dfrac{w_T\eta_T}{I'\Omega_{T-1}^{-1}I}$, we have
$$
\tilde{\Omega}_{T-2} = \left(w_{T-1}\eta_{T-1} + w_T\eta_T\tilde{\vartheta}_{T-1,T}^2\right)\Omega_{T-2} + \frac{w_T\eta_T}{I'\Omega_{T-1}^{-1}I}\,\Xi_{T-2}.
$$
Because $\Omega_t$ and $\Xi_t$ are both positive definite matrices for $t = 0, 1, \ldots, T-1$, $\Omega_t^{-1}$ is also positive definite. Since $w_{T-1}\eta_{T-1} + w_T\eta_T\tilde{\vartheta}_{T-1,T}^2 > 0$ and $\dfrac{w_T\eta_T}{I'\Omega_{T-1}^{-1}I} > 0$, we can conclude that $\tilde{\Omega}_{T-2}$ is also a positive definite matrix. The above results show that Theorem 2 holds for $t = T-1$.
Suppose that Theorem 2 holds for $t = j+1, j+2, \ldots, T-1$. This indicates that, for $t = j+1, j+2, \ldots, T-2$, the matrices $\tilde{\Omega}_t = \Big(\sum_{m=t+1}^{T} w_m\eta_m\tilde{\vartheta}_{t+1,m}^2\Big)\Omega_t - \tilde{\alpha}_{t+1}\Xi_t$ are all positive definite. Then, for $t = j$, we have
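$$
\begin{aligned}
V_j(R_j, B_j) &= \max_{u_j}\Big\{E_j\left[\tilde{m}_{j+1}B_{j+1}^2 + \tilde{\alpha}_{j+1}R_{j+1}^2 + \tilde{\varphi}_{j+1}B_{j+1}R_{j+1} + \tilde{n}_{j+1}B_{j+1} + \tilde{\beta}_{j+1}R_{j+1} + \tilde{\gamma}_{j+1}\right]\\
&\qquad - \sum_{m=j+2}^{T} w_m\eta_m\mathrm{Var}_j\left[\tilde{\vartheta}_{j+1,m}R_{j+1} + \tilde{\rho}_{j+1,m}B_{j+1} + \tilde{\kappa}_{j+1,m}\right]\\
&\qquad + w_{j+1}\left[\xi_{j+1}E_j(R_{j+1} - \theta_{j+1}B_{j+1}) - \eta_{j+1}\mathrm{Var}_j(R_{j+1} - \theta_{j+1}B_{j+1})\right]\Big\}\\
&= \max_{\{u_j:\,R_j = I'u_j\}}\Big\{\tilde{m}_{j+1}(\sigma_j + \nu_j^2)B_j^2 + \tilde{n}_{j+1}\nu_j B_j + \tilde{\gamma}_{j+1} - w_{j+1}\xi_{j+1}\theta_{j+1}\nu_j B_j - \Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\rho}_{j+1,m}^2\Big)\sigma_j B_j^2\\
&\qquad - \Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}^2\Big)u_j'\Omega_j u_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)B_j Q_j'u_j\\
&\qquad + w_{j+1}\xi_{j+1}\lambda_j'u_j + \tilde{\beta}_{j+1}\lambda_j'u_j + \tilde{\alpha}_{j+1}u_j'\Xi_j u_j + \tilde{\varphi}_{j+1}B_j\phi_j'u_j\Big\}.
\end{aligned}
$$
Similarly, we can construct the following Lagrange function
$$
\begin{aligned}
L_j(u_j, \zeta_j) &= -\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}^2\Big)u_j'\Omega_j u_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)B_j Q_j'u_j + w_{j+1}\xi_{j+1}\lambda_j'u_j\\
&\quad + \tilde{\beta}_{j+1}\lambda_j'u_j + \tilde{\alpha}_{j+1}u_j'\Xi_j u_j + \tilde{\varphi}_{j+1}B_j\phi_j'u_j - \zeta_j\left(I'u_j - R_j\right).
\end{aligned}
$$
Let $\tilde{\Omega}_j = \Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}^2\Big)\Omega_j - \tilde{\alpha}_{j+1}\Xi_j$. Since $\tilde{\Omega}_{j+1}$ is a positive definite matrix, $\tilde{\Omega}_{j+1}^{-1}$ is also positive definite and $I'\tilde{\Omega}_{j+1}^{-1}I > 0$. Additionally, since $\tilde{\alpha}_{j+1} = -\tilde{b}_{j+1}'\tilde{\Omega}_{j+1}\tilde{b}_{j+1} = -\dfrac{1}{I'\tilde{\Omega}_{j+1}^{-1}I}$ and $\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}^2 > 0$, we can conclude that $\tilde{\Omega}_j$ is also a positive definite matrix. Further, by the first-order necessary optimality condition, we have
$$
\begin{cases}
-2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}^2\Big)\Omega_j u_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)B_j Q_j + w_{j+1}\xi_{j+1}\lambda_j + \tilde{\beta}_{j+1}\lambda_j + 2\tilde{\alpha}_{j+1}\Xi_j u_j + \tilde{\varphi}_{j+1}B_j\phi_j - \zeta_j I = 0,\\
I'u_j - R_j = 0.
\end{cases} \tag{A31}
$$
By solving the system (A31), we can conclude that
$$
\begin{aligned}
\hat{u}_j &= \Bigg\{\frac{\tilde{\Omega}_j^{-1}\Big[\tilde{\varphi}_{j+1}\phi_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)Q_j\Big]}{2} + \frac{2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)I'\tilde{\Omega}_j^{-1}Q_j - \tilde{\varphi}_{j+1}I'\tilde{\Omega}_j^{-1}\phi_j}{2\,I'\tilde{\Omega}_j^{-1}I}\,\tilde{\Omega}_j^{-1}I\Bigg\}B_j\\
&\quad + \frac{\tilde{\Omega}_j^{-1}I}{I'\tilde{\Omega}_j^{-1}I}R_j + \frac{(w_{j+1}\xi_{j+1} + \tilde{\beta}_{j+1})\tilde{\Omega}_j^{-1}\lambda_j}{2} - \frac{(w_{j+1}\xi_{j+1} + \tilde{\beta}_{j+1})I'\tilde{\Omega}_j^{-1}\lambda_j}{2\,I'\tilde{\Omega}_j^{-1}I}\,\tilde{\Omega}_j^{-1}I\\
&= \tilde{a}_j B_j + \tilde{b}_j R_j + \tilde{c}_j,
\end{aligned}
$$
where
$$
\begin{aligned}
\tilde{a}_j &= \frac{\tilde{\Omega}_j^{-1}\Big[\tilde{\varphi}_{j+1}\phi_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)Q_j\Big]}{2} + \frac{2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)I'\tilde{\Omega}_j^{-1}Q_j - \tilde{\varphi}_{j+1}I'\tilde{\Omega}_j^{-1}\phi_j}{2\,I'\tilde{\Omega}_j^{-1}I}\,\tilde{\Omega}_j^{-1}I,\\
\tilde{b}_j &= \frac{\tilde{\Omega}_j^{-1}I}{I'\tilde{\Omega}_j^{-1}I},\\
\tilde{c}_j &= \frac{(w_{j+1}\xi_{j+1} + \tilde{\beta}_{j+1})\tilde{\Omega}_j^{-1}\lambda_j}{2} - \frac{(w_{j+1}\xi_{j+1} + \tilde{\beta}_{j+1})I'\tilde{\Omega}_j^{-1}\lambda_j}{2\,I'\tilde{\Omega}_j^{-1}I}\,\tilde{\Omega}_j^{-1}I.
\end{aligned}
$$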
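Since $\tilde{a}_j'\tilde{\Omega}_j\tilde{b}_j = 0$ and $\tilde{b}_j'\tilde{\Omega}_j\tilde{c}_j = 0$, we have
$$
\begin{aligned}
V_j(R_j, B_j) &= \Big[\tilde{m}_{j+1}(\sigma_j + \nu_j^2) - \Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\rho}_{j+1,m}^2\Big)\sigma_j - \tilde{a}_j'\tilde{\Omega}_j\tilde{a}_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)Q_j'\tilde{a}_j + \tilde{\varphi}_{j+1}\phi_j'\tilde{a}_j\Big]B_j^2\\
&\quad - \tilde{b}_j'\tilde{\Omega}_j\tilde{b}_j R_j^2 + \Big[-2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)Q_j'\tilde{b}_j + \tilde{\varphi}_{j+1}\phi_j'\tilde{b}_j\Big]B_j R_j\\
&\quad + \Big[\tilde{n}_{j+1}\nu_j - w_{j+1}\xi_{j+1}\theta_{j+1}\nu_j - 2\tilde{a}_j'\tilde{\Omega}_j\tilde{c}_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)Q_j'\tilde{c}_j + w_{j+1}\xi_{j+1}\lambda_j'\tilde{a}_j + \tilde{\beta}_{j+1}\lambda_j'\tilde{a}_j + \tilde{\varphi}_{j+1}\phi_j'\tilde{c}_j\Big]B_j\\
&\quad + \left[w_{j+1}\xi_{j+1}\lambda_j'\tilde{b}_j + \tilde{\beta}_{j+1}\lambda_j'\tilde{b}_j\right]R_j + \tilde{\gamma}_{j+1} - \tilde{c}_j'\tilde{\Omega}_j\tilde{c}_j + w_{j+1}\xi_{j+1}\lambda_j'\tilde{c}_j + \tilde{\beta}_{j+1}\lambda_j'\tilde{c}_j\\
&= \tilde{m}_j B_j^2 + \tilde{\alpha}_j R_j^2 + \tilde{\varphi}_j B_j R_j + \tilde{n}_j B_j + \tilde{\beta}_j R_j + \tilde{\gamma}_j,
\end{aligned}
$$
where
$$
\begin{aligned}
\tilde{m}_j &= \tilde{m}_{j+1}(\sigma_j + \nu_j^2) - \Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\rho}_{j+1,m}^2\Big)\sigma_j - \tilde{a}_j'\tilde{\Omega}_j\tilde{a}_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)Q_j'\tilde{a}_j + \tilde{\varphi}_{j+1}\phi_j'\tilde{a}_j,\\
\tilde{\alpha}_j &= -\tilde{b}_j'\tilde{\Omega}_j\tilde{b}_j,\\
\tilde{\varphi}_j &= -2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)Q_j'\tilde{b}_j + \tilde{\varphi}_{j+1}\phi_j'\tilde{b}_j,\\
\tilde{n}_j &= \tilde{n}_{j+1}\nu_j - w_{j+1}\xi_{j+1}\theta_{j+1}\nu_j - 2\tilde{a}_j'\tilde{\Omega}_j\tilde{c}_j - 2\Big(\sum_{m=j+1}^{T} w_m\eta_m\tilde{\vartheta}_{j+1,m}\tilde{\rho}_{j+1,m}\Big)Q_j'\tilde{c}_j + w_{j+1}\xi_{j+1}\lambda_j'\tilde{a}_j + \tilde{\beta}_{j+1}\lambda_j'\tilde{a}_j + \tilde{\varphi}_{j+1}\phi_j'\tilde{c}_j,\\
\tilde{\beta}_j &= w_{j+1}\xi_{j+1}\lambda_j'\tilde{b}_j + \tilde{\beta}_{j+1}\lambda_j'\tilde{b}_j,\\
\tilde{\gamma}_j &= \tilde{\gamma}_{j+1} - \tilde{c}_j'\tilde{\Omega}_j\tilde{c}_j + w_{j+1}\xi_{j+1}\lambda_j'\tilde{c}_j + \tilde{\beta}_{j+1}\lambda_j'\tilde{c}_j.
\end{aligned}
$$
Additionally, we have
$$
\begin{aligned}
f_{j,\tau}(R_j, B_j) &= E_j\left[f_{j+1,\tau}(R_{j+1}, B_{j+1})\right]\\
&= \tilde{\vartheta}_{j+1,\tau}\lambda_j'\tilde{b}_j R_j + \left[\tilde{\vartheta}_{j+1,\tau}\lambda_j'\tilde{a}_j + \tilde{\rho}_{j+1,\tau}\nu_j\right]B_j + \tilde{\vartheta}_{j+1,\tau}\lambda_j'\tilde{c}_j + \tilde{\kappa}_{j+1,\tau}\\
&= \tilde{\vartheta}_{j,\tau}R_j + \tilde{\rho}_{j,\tau}B_j + \tilde{\kappa}_{j,\tau},
\end{aligned}
$$
where
$$
\tilde{\vartheta}_{j,\tau} = \tilde{\vartheta}_{j+1,\tau}\lambda_j'\tilde{b}_j, \quad \tilde{\rho}_{j,\tau} = \tilde{\vartheta}_{j+1,\tau}\lambda_j'\tilde{a}_j + \tilde{\rho}_{j+1,\tau}\nu_j, \quad \tilde{\kappa}_{j,\tau} = \tilde{\vartheta}_{j+1,\tau}\lambda_j'\tilde{c}_j + \tilde{\kappa}_{j+1,\tau}.
$$
According to the above proof, we can conclude that Theorem 2 holds for all $t = 0, 1, \ldots, T-1$.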

References

  1. Markowitz, H. Portfolio selection. J. Financ. 1952, 7, 77–91. [Google Scholar]
  2. Li, D.; Ng, W.L. Optimal dynamic portfolio selection: Multiperiod mean-variance formulation. Math. Financ. 2000, 10, 387–406. [Google Scholar] [CrossRef]
  3. Leippold, M.; Trojani, F.; Vanini, P. A geometric approach to multiperiod mean variance optimization of assets and liabilities. J. Econ. Dyn. Control 2004, 28, 1079–1113. [Google Scholar] [CrossRef] [Green Version]
  4. Wei, S.-Z.; Ye, Z.-X. Multi-period optimization portfolio with bankruptcy control in stochastic market. Appl. Math. Comput. 2007, 186, 414–425. [Google Scholar] [CrossRef]
  5. Yao, H.; Zeng, Y.; Chen, S. Multi-period mean–variance asset–liability management with uncontrolled cash flow and uncertain time-horizon. Econ. Model. 2013, 30, 492–500. [Google Scholar] [CrossRef]
  6. Chen, Z.; Li, G.; Zhao, Y. Time-consistent investment policies inmarkovian markets: A case of mean–variance analysis. J. Econ. Dyn. Control 2014, 40, 293–316. [Google Scholar] [CrossRef]
  7. Cui, X.; Li, D.; Li, X. Mean-variance policy for discrete-time cone-constrained markets: Time consistency in efficiency and theminimum-variance signed supermartingale measure. Math. Financ. 2017, 27, 471–504. [Google Scholar] [CrossRef]
  8. Liu, J.; Chen, Z. Time consistent multi-period robust risk measures and portfolio selection models with regime-switching. Eur. J. Opt. Res. 2019, 268, 373–385. [Google Scholar] [CrossRef]
  9. Zhou, Z.; Zeng, X.; Xiao, H.; Ren, T.; Liu, W. Multiperiod portfolio optimization for asset-liability management with quadratic transaction costs. J. Ind. Manag. Optim. 2019, 15, 1493–1515. [Google Scholar] [CrossRef]
  10. Roll, R. A mean/variance analysis of tracking error. J. Portf. Manag. 1992, 18, 13–22. [Google Scholar] [CrossRef]
  11. Zhao, Y. A dynamic model of active portfolio management with benchmark orientation. J. Bank. Financ. 2007, 31, 3336–3356. [Google Scholar] [CrossRef]
  12. Espinosa, G.E.; Touzi, N. Optimal investment under relative performance concerns. Math. Financ. 2015, 25, 221–257. [Google Scholar] [CrossRef]
  13. Zhu, S.S.; Li, D.; Wang, S.Y. Risk control over bankruptcy in dynamic portfolio selection: A generalized mean-variance formulation. IEEE Trans. Autom. Control 2004, 49, 447–457. [Google Scholar] [CrossRef]
  14. Costa, O.; Nabholz, R.d.B. Multiperiod mean-variance optimization with intertemporal restrictions. J. Optim. Theory Appl. 2007, 134, 257. [Google Scholar] [CrossRef]
  15. Costa, O.L.; Araujo, M.V. A generalized multi-period mean–variance portfolio optimization with Markov switching parameters. Automatica 2008, 44, 2487–2497. [Google Scholar] [CrossRef]
  16. Costa, O.L.; de Oliveira, A. Optimal mean–variance control for discrete-time linear systems with Markovian jumps and multiplicative noises. Automatica 2012, 48, 304–315. [Google Scholar] [CrossRef]
  17. Cui, X.; Li, X.; Li, D. Unified framework of mean-field formulations for optimal multi-period mean-variance portfolio selection. IEEE Trans. Autom. Control 2014, 59, 1833–1844. [Google Scholar] [CrossRef]
  18. Zhou, Z.; Xiao, H.; Yin, J.; Zeng, X.; Lin, L. Pre-commitment vs. time-consistent strategies for the generalized multi-period portfolio optimization with stochastic cash flows. Insur. Math. Econ. 2016, 68, 187–202. [Google Scholar] [CrossRef]
  19. Celikyurt, U.; Özekici, S. Multiperiod portfolio optimization models in stochastic markets using the mean–variance approach. Eur. J. Oper. Res. 2007, 179, 186–202. [Google Scholar] [CrossRef]
  20. Yao, H.; Li, Z.; Li, D. Multi-period mean-variance portfolio selection with stochastic interest rate and uncontrollable liability. Eur. J. Oper. Res. 2016, 252, 837–851. [Google Scholar] [CrossRef]
  21. Björk, T.; Murgoci, A. A General Theory of Markovian Time Inconsistent Stochastic Control Problems. 2010. Available online: http://ssrn.com/abstract=1694759 (accessed on 5 July 2019).[Green Version]
  22. Basak, S.; Chabakauri, G. Dynamic mean-variance asset allocation. Rev. Financ. Stud. 2010, 23, 2970–3016. [Google Scholar] [CrossRef]
  23. Wu, H.; Chen, H. Nash equilibrium strategy for a multi-period mean–variance portfolio selection problem with regime switching. Econ. Model. 2015, 46, 79–90. [Google Scholar] [CrossRef]
  24. Cui, X.; Li, D.; Shi, Y. Self-coordination in time inconsistent stochastic decision problems: A planner–doer game framework. J. Econ. Dyn. Control 2017, 75, 91–113. [Google Scholar] [CrossRef]
  25. Bensoussan, A.; Wong, K.; Yam, S.C.P.; Yung, S.P. Time-consistent portfolio selection under short-selling prohibition: From discrete to continuous setting. SIAM J. Financ. Math. 2014, 5, 153–190. [Google Scholar] [CrossRef]
  26. Björk, T.; Murgoci, A. A theory of Markovian time-inconsistent stochastic control in discrete time. Financ. Stoch. 2014, 18, 545–592. [Google Scholar] [CrossRef]
  27. Zhou, Z.; Liu, X.; Xiao, H.; Ren, T.; Liu, W. Time-consistent strategies for multi-period portfolio optimization with/without the risk-free asset. Math. Probl. Eng. 2018, 2018, 1–20. [Google Scholar] [CrossRef]
  28. Wang, L.; Chen, Z. Stochastic game theoretic formulation for a multi-period DC pension plan with state-dependent risk aversion. Mathematics 2019, 7, 108. [Google Scholar] [CrossRef]
Figure 1. The time-consistent strategies with and without intertemporal restrictions.
Figure 2. The time-consistent strategies with and without benchmark orientation.
Figure 3. The time-consistent strategies with and without benchmark orientation.
Figure 4. The time-consistent strategies with and without benchmark orientation.
Figure 5. The time-consistent strategies with and without intertemporal restrictions.
Figure 6. The time-consistent strategies with and without benchmark orientation.
Figure 7. The time-consistent strategies with and without benchmark orientation.
Figure 8. The time-consistent strategies with and without benchmark orientation.

Xiao, H.; Ren, T.; Zhou, Z. Time-Consistent Strategies for the Generalized Multiperiod Mean-Variance Portfolio Optimization Considering Benchmark Orientation. Mathematics 2019, 7, 723. https://0-doi-org.brum.beds.ac.uk/10.3390/math7080723
