Next Article in Journal
Congruence Representations via Soft Ideals in Soft Topological Spaces
Previous Article in Journal
On Local Unique Solvability for a Class of Nonlinear Identification Problems
Previous Article in Special Issue
A Statistical Dependence Framework Based on a Multivariate Normal Copula Function and Stochastic Differential Equations for Multivariate Data in Forestry
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Application of Migrating Optimization Algorithms in Problems of Optimal Control of Discrete-Time Stochastic Dynamical Systems

by
Andrei Panteleev
* and
Vladislav Rakitianskii
Department of Mathematics and Cybernetics, Moscow Aviation Institute, National Research University, 4, Volokolamskoe Shosse, 125993 Moscow, Russia
*
Author to whom correspondence should be addressed.
Submission received: 7 September 2023 / Revised: 24 October 2023 / Accepted: 25 October 2023 / Published: 27 October 2023
(This article belongs to the Special Issue Stochastic Modeling and Analysis for Applications and Technologies)

Abstract

:
The problem of finding the optimal open-loop control for discrete-time stochastic dynamical systems is considered. It is assumed that the initial conditions and external influences are random. The average value of the Bolza functional defined on individual trajectories is minimized. It is proposed to solve the problem by means of classical and modified migrating optimization algorithms. The modification of the migrating algorithm consists of cloning the members of the initial population and choosing different strategies of migratory behavior for the main population and for populations formed by clones. At the final stage of the search for an extremum, an intensively clarifying migration cycle is implemented with the participation of three leaders of the populations participating in the search process. Problems of optimal control of bundles of trajectories of deterministic discrete dynamical systems, as well as individual trajectories, are considered as special cases. Seven model examples illustrating the performance of the proposed approach are solved.

1. Introduction

Discrete system control problems have found wide application in solving engineering and economic problems [1,2,3,4]. For deterministic problems, as a rule, the discrete maximum principle or more general necessary conditions are used to search for optimal open-loop control [5] and the Bellman equation for the synthesis of systems with full feedback [1,6,7,8]. When solving problems with non-separable quality criteria, methodological problems arise in their use. As the dimension of the problem being solved increases, the computational difficulties can become insurmountable.
To solve stochastic problems in which the initial conditions and external influences are random, when searching for the optimal open-loop control, the stochastic discrete maximum principle [9,10,11] is used. To find the optimal control with full feedback on the state vector, the Bellman equation for discrete stochastic systems is usually applied [2,6,7]. Compared to deterministic problems, the complexity of the numerical solution increases significantly. Therefore, the direct search for optimal control for the described class of stochastic problems using various methods of global optimization, including metaheuristic algorithms, is relevant.
A special case of stochastic problems is the problem of optimal control of bundles of trajectories, in which it is assumed that there are no external influences on the system and the initial conditions belong to a given set of possible states. As a rule, the law of distribution of the initial states of the system is assumed to be uniform, which corresponds to the case of uncertainty in setting the initial conditions. Various necessary and sufficient conditions for the optimality of the control of bundles of trajectories of discrete systems, as well as computational procedures for their implementation, are considered in [12,13,14]. The problem of finding the optimal open-loop control of a separate trajectory of a discrete deterministic system starting from a given initial state can, in turn, be considered a special case of controlling bundles of trajectories.
Examples of solving discrete deterministic optimal control problems using various metaheuristic optimization algorithms, such as evolutionary methods, the Luus-Jaakola method, iterative dynamic programming, and methods that simulate the behavior of the tomtit flock and a flock of river perch, are described in [15,16,17,18]. Migrating optimization algorithms that use the processes of solution population evolution have been used in solving various applied problems [19,20]. The development of migrating algorithms was a modified migrating algorithm, which has proven itself to solve finite-dimensional technical problems [21].
The article proposes applying a modified migrating algorithm to solve the problem of finding the optimal open-loop control of nonlinear discrete stochastic systems. As special cases, we consider the solution of control problems for bundles of trajectories in the presence of uncertainty in setting the initial conditions and individual trajectories under a fixed initial condition.
The purpose of the article is to demonstrate the possibility of using migration metaheuristic optimization algorithms, including new ones, to solve a wide class of problems of finding optimal control of discrete dynamic systems. The solution procedure does not require the application of necessary optimality conditions and numerical methods for satisfying them. It is shown that the stated problems can be solved with relatively low computational costs. The completed research opens up the possibility of using a similar approach to solve more complex problems, namely the problems of finding control laws with complete and incomplete feedback.
The article has the following structure: In Section 2, the formulations of three optimal control problems for discrete dynamical systems are given. A description of the basic self-organizing migrating optimization algorithm and the modified one developed by the authors is presented. Section 3 contains solutions to seven model examples, two for each of the problem statements given in Section 2. Finally, conclusions are summarized in Section 4.

2. Materials and Methods

2.1. Statements of Optimal Control Problems

Problem 1
(average optimal open-loop control of a stochastic system). The behavior of a nonlinear discrete stochastic model of the control object is described using the following equation:
X ( t + 1 ) = f ( t , X ( t ) , u ( t ) , W ( t ) ) , t = 0 , 1 , , N 1 ,
where t is a discrete time, t T = { 0 , 1 , , N } and the number N is given; X is a system state vector, X R n ; u is a control vector; u U R q ; U is the set of admissible control values, which is a direct product of segments [ a i , b i ] , i = 1 , , q ;   f ( t , x , u , w ) = f 1 ( t , x , u , w ) , ,   f n ( t , x , u , w ) T is a continuous vector function; and W ( t ) is an l -dimensional random vector with a known probability density p ( W , t ) .
A special case of model (1) is the equation
X ( t + 1 ) = g ( t , X ( t ) , u ( t ) ) + σ ( t , X ( t ) ) W ( t ) ) ,
in which a random external influence W ( t ) on the control system enters additively; g ( t , x , u ) is a vector function of dimensions ( n × 1 ) ; and σ ( t , x ) is a matrix function of dimensions ( n × l ) .
The initial conditions
X ( 0 ) = X 0
are given by the initial probability density p 0 ( x ) .
It is assumed that only discrete-time information is used in the control law. The set U of admissible controls form sequences u ( · ) = { u ( 0 ) , u ( 1 ) , , u ( N 1 ) } such that t { 0 , , N 1 }   u ( t ) U , and the function f t , x , u ( t ) , w is such that a solution to the system of Equation (1) with initial condition (2) exists and is unique.
Let us define the control quality functional for a separate trajectory in the Bolza form:
I X 0 , u ( · ) = t = 0 N 1 f 0 ( t , X ( t ) , u ( t ) ) + F X ( N )
where f 0 ( t , x , u ) , F ( x ) are given continuous functions.
The quality of system control is proposed to be estimated by the value of the function
J u ( · ) = M [ I X 0 , u ( · ) ] ,
where M denotes the operation of finding the mathematical expectation.
One needs to find a control such that
J u * ( · ) = min u ( · ) U J u ( · ) .
The desired control is called optimal on average since the average value of the functional (3) is minimized.
Problem 2
(average optimal control of a bundle of trajectories of a deterministic system). The behavior of the nonlinear deterministic discrete model of the control object is described by the difference equation as follows:
x ( t + 1 ) = f ( t , x ( t ) , u ( t ) ) ,   t = 0 , 1 , , N 1 ,
where t is a discrete time, x is a system state vector, x R n ; u is a system control vector, u U R q ; U is the set of admissible control values, which is a direct product of segments [ a i , b i ] , i = 1 , , q ; f ( t , x , u ) = ( f 1 ( t , x , u ) , ,   f n ( t , x , u ) ) T is a continuous vector function; and the number of discrete-time steps is given as N .
The initial state is given by a compact set of positive measures:
x ( 0 ) = x 0 Ω R n .
The set U of admissible controls form the sequences u ( · ) = { u ( 0 ) , u ( 1 ) , , u ( N 1 ) } , where u ( t ) U   t { 0 , 1 , , N 1 } .
Let us define the control quality functional for a separate trajectory in the Bolza form:
I ( x 0 , u ( · ) ) = t = 0 N 1 f 0 t , x ( t ) , u ( t ) + F x ( N )
or
I ( x 0 , u ( · ) ) = F ( x ( 0 ) , , x ( N ) ; u ( 0 ) , , u ( N 1 ) ) ,
where f 0 t , x , u ,   F x ,   F ( x ( · ) , u ( · ) ) are given continuous functions, trajectory x ( · ) = { x 0 , x ( 1 ) , , x ( N ) } , and control u ( · ) = { u ( 0 ) , u ( 1 ) , , u ( N 1 ) } . Function (9) corresponds to the case of non-separable criteria.
To every admissible control u ( · ) U and set Ω , let us set in correspondence the bundle of trajectories of the system of Equation (6):
X t , u ( · ) = x t , u ( · ) , x 0 | x 0 Ω ,
i.e., the union of solutions x t , u ( · ) , x 0 of Equation (6) over all possible initial states (7).
The quality of control of a bundle of trajectories is proposed to be estimated by the value of the functional
J [ u ( · ) ] = Ω I ( x 0 , u ( · ) ) d x 0 / mes Ω ,
where mes Ω is a set Ω measure.
It is required to find a control u * ( · ) U such that
J u * ( · ) = min   u ( · )   U J u ( · ) .
The desired control is called optimal on average, since the average value of the functional (8) or (9) is minimized on the given set of initial states.
It is further assumed that set Ω of initial states is a parallelepiped defined by the direct product of the segments α i ; β i , i = 1 , , n , i.e., Ω = α 1 ; β 1 × × α n ; β n . All segments with a step Δ x i are divided into K i segments, and parallelepiped Ω is divided into K = K 1 K n elementary subsets Ω k ,   k = 1 , , K . In each elementary subset Ω k the initial state x 0 k (center of the parallelepiped) is set. These initial states and the applied control generate K trajectories on which the value of the functional (8) or (9) is calculated. After averaging over all realizations, the value of functional (11) is found.
Problem 3
(optimal control of a separate trajectory of a dynamical deterministic system). The behavior of the nonlinear deterministic discrete model of the control object is described by the difference equation
x ( t + 1 ) = f ( t , x ( t ) , u ( t ) ) , t = { 0 ,   1 , , N 1 } ,
where t is a discrete time, x is a system state vector, x R n ; u is a system control vector, u U R q ; U is the set of admissible control values, which is a direct product of segments [ a i , b i ] , i = 1 , , q ; f ( t , x , u ) = ( f 1 ( t , x , u ) , , f n ( t , x , u ) ) T is a continuous vector function; and a number of discrete-time steps is given as N .
The initial state is given as follows:
x ( 0 ) = x 0 .
The set of admissible processes D ( 0 , x 0 ) is a set of pairs d = ( x ( · ) , u ( · ) ) , including the trajectory x ( · ) = { x 0 , x ( 1 ) , , x ( N ) } and admissible control u ( · ) = { u ( 0 ) , u ( 1 ) , , u ( N 1 ) } , where u ( t ) U , satisfying the equation of state (13) and the initial condition (14).
On the set D ( 0 , x 0 ) , let us define the control quality functional for a separate trajectory in the Bolza form as follows:
I ( d ) = k = 0 N 1 f 0 k , x ( k ) , u ( k ) + F x ( N )
or
I ( d ) = F ( x ( 0 ) , , x ( N ) ; u ( 0 ) , , u ( N 1 ) ) ,
where f 0 ( t , x , u ) ,   F ( x ) , F ( x ( · ) , u ( · ) ) are given continuous functions.
We need to find a pair d * = x * ( · ) , u * ( · ) D ( 0 , x 0 ) such that
I ( d * ) = min d D ( 0 , x 0 ) I ( d ) .
Elements of a pair d * , i.e., desired trajectory x * ( · ) = { x 0 , x * ( 1 ) , , x * ( N ) } and control u * ( · ) = { u * ( 0 ) , u * ( 1 ) , , u * ( N 1 ) } are called the optimal trajectory and optimal control, respectively.
If any admissible control u ( · ) match single pair d , then problem (17) can be rewritten in an equivalent form as the problem of finding the minimum of the functional on the set of admissible controls. Functional (16) is more general than the Bolza functional (15), and its application is typical for so-called non-separable problems.

2.2. Self-Organizing Migrating Optimization Algorithms

2.2.1. Optimization Problem Statement

Given an objective function f ( x ) = f ( x 1 , x 2 , , x n ) , defined on the set of admissible solutions D R n . It is required to find the constrained global minimum of the function f ( x ) on the set D , i.e., a point x * D , such that
f ( x * ) = min x D f ( x ) ,
where x = ( x 1 , x 2 , , x n ) T ,   D = { x | x i [ a i , b i ] , i = 1 , 2 , , n } .

2.2.2. Self-Organizing Migrating Algorithm (SOMA)

SOMA can be classified as the metaheuristic evolutionary optimization algorithm based on the self-organizing behavior of groups of individuals in a social environment. When solving the problem, finite sets I = x j = x 1 j , x 2 j , , x n j T ,   j = 1 , 2 , , N P D of possible solutions, called populations, are used, where x j is an individual with number j , N P is a population size.
The Self-Organizing Migrating Algorithm [19,20] simulates the evolution of the initial population I 0 = { x j ,   j = 1 , 2 , , N P | x j = x 1 j , x 2 j , , x n j T D } and is an iterative process that explores the set D.
Initially, a population is created from individuals with randomly generated coordinates x i from the segment a i , b i using the uniform distribution law.
Then, the migration cycles begin. Before each cycle, a leader is identified, i.e., an individual with the best value of the objective function f x . In these cycles, all individuals sequentially move towards the leader along the straight line connecting them, including moving behind him for some distance along the same straight line. In this case, the leader remains motionless. Then, the position of the individual is selected, which corresponds to the best value of the objective function obtained in the search process, and it is recorded in the next population.
The algorithm continues until the difference between the values of the objective function corresponding to the leader and the worst individual of the population becomes less than a predetermined value, or until the maximum number of migration cycles is reached.
Algorithm SOMA parameters that must be specified are as follows: control parameter NStep, determining the number of steps until the end of the movement; control parameter PRT, determining whether the individual will move along the chosen coordinate to the leader; control parameter NP, determining the size of the population of individuals; stop parameter Migration, determining the maximum number of iterations; and stop parameter MinDiv, showing the allowable difference in the value of the objective function between the leader and the worst individual.

2.2.3. Modified Self-Organizing Migrating Algorithm

When developing a modified self-organizing migrating optimization algorithm (MSOMA) [21], the basic version of the SOMA algorithm [19,20] was used.
Modifications consist of the selection of three leaders among the individuals that form the population. For each of the members of the population, two clones are generated with the same position. Thus, three populations are generated, each of which implements a migration cycle relative to its leader. For all members of the population, the best positions achieved during the cycle are found. During the search, the population is regularly updated with new individuals generated on the set of feasible solutions. They replace outgoing individuals with the worst objective function values. After the termination conditions are met, a refinement search (migration cycle) is performed, in which the three remaining leaders of the population participate. The best result is presented as a solution.
When solving the problem, finite sets I = x j = x 1 j , x 2 j , , x n j T ,   j = 1 , 2 , , N P D of possible solutions, called populations, are used, where x j is an individual with number j, N P is a population size.
The Self-Organizing Migrating Algorithm simulates the evolution of the initial population I 0 = { x j ,   j = 1 , 2 , , N P | x j = x 1 j , x 2 j , , x n j T D } and is an iterative process that explores the set D. The initial population is created from individuals with randomly generated coordinates x i from the segment a i , b i , i = 1 , , n . In this case, a uniform distribution law is used.
Further migration cycles are realized. Before each cycle, individuals are ordered, and three leaders are identified: individuals with the best values of the objective function f x . Each individual creates two individual clones with the same coordinates. Thus, three populations of total size K = 3 N P are created. Each of the populations moves relative to one of the three leaders. In these cycles, all individuals sequentially, with a certain step, move in a straight line toward their leader and along the same straight line behind him for a certain distance. In this case, the leader remains motionless. Then, the position of the individual is selected, which corresponds to the best value of the objective function obtained in the search process. This position is written to the next population of size K . After all individuals are recorded in the new population, 2 N P + 1 3 N P individuals with worse objective function values f x are removed from it. Then, 1 3 N P new individuals, according to the rules for the formation of the initial population, are generated to supplement the population to its original size NP in order to update the population and explore new areas in the set of feasible solutions.
The algorithm continues until the average deviation in the value of the function between the three leaders of the population becomes less than a predetermined value or until the maximum number of migration cycles is realized. Then, a clarifying migration cycle is implemented for the three leaders of the population relative to the absolute leader. As an approximate solution, the best position and the corresponding value of the objective function found during the search are presented.
  • Algorithm MSOMA for solving the optimization problem
Step 1. Set the method parameters:
  • Control parameter NStep, determining the number of steps until the end of the movement;
  • Control parameter PRT, determining whether the individual will move along the chosen coordinate to the leader;
  • Control parameter NP, determining the size of the population of individuals;
  • Stop parameter Migration, determining the maximum number of iterations;
  • Stop parameter MinDist, reflecting the value of the average deviation between the three leaders of the population in terms of the value of the function;
  • Iteration counter MCount, necessary to stop the algorithm when it reaches the number Migration.
Set MCount = 1.
Step 2. Create an initial population.
Step 2.1. Create a population of N P individuals with randomly generated coordinates x i from a segment a i , b i using the uniform distribution law:
x i j = a i + r a n d i 0 ,   1   b i a i , i = 1 , , n ;   j = 1 , , N P .
Step 2.2. For each individual, calculate the objective function value and choose leaders (individuals with the best values). To do this, order the population in a non-decreasing order of the objective function. Then, select the first three individuals from it with the smallest values of the objective function.
Step 3. Realize a migration cycle regarding the three leaders.
Step 3.1. For all individuals, create two “clones”, i.e., the individuals with the same coordinates:
for   all   j = 1 , , N P :   x i N p + j = x i j ,   x i 2 N p + j = x i j ,   i = 1 , , n .
In this case, the parent individual moves toward the first leader, the first clone moves towards the second leader, and the second clone moves toward the third leader.
Step 3.2. Before starting the movement of an individual to one of the leaders, generate a random number r n d i on the segment [ 0 , 1 ] for each coordinate of the individual and then compare it with PRT:
if   r n d i < P R T ,   then   P R T V e c t o r i = 1   , else   0 ,   i = 1 , , n .
Vector P R T V e c t o r thus formed defines the set of coordinate directions along which the search will be performed.
Step 3.3. Implement the movement of individuals to the appropriate leader. The movement takes place in steps until the end position is reached, according to the following formula:
  • for the first leader:
x j , m 1 = x j + m 1 x 1 x j 2 N S t e p P R T V e c t o r ,   m 1 = 0 ,   1 , , 4 N S t e p ,   j = 1 , , N P ;
  • for the second leader:
x j , m 2 = x j + m 2 x 2 x j N s t e p P R T V e c t o r , m 2 = 0 ,   1 , , 2 N S t e p ,   j = N P + 1 , , 2 N P ;
  • for the third leader:
x j , m 3 = x j + m 3 x 3 x j N S t e p 2 P R T V e c t o r ,   m 3 = 0 ,   1 , , N S t e p ,   j = 2 N P + 1 , , 4 N P ,
where is an element-wise product of vectors (according to Hadamard [22]), N S t e p 2 denotes the integer part of number N S t e p / 2 .
Step 3.4. After all the steps performed for each individual, find the best position (at which the objective function value was the smallest), and the individual to take this position, and then move to the next population:
  • for the first leader:
x j , N e w = argmin m 1 = 0 , 1 , , 4 N S t e p f x j , m 1 ,   j = 1 , , N P ;
  • for the second leader:
x j , N e w = argmin m 2 = 0 , 1 , , 2 N S t e p f x j , m 2 ,   j = N p + 1 , , 2 N P ;
  • for the third leader:
x j , N e w = argmin m 3 = 0 , 1 , , N S t e p f x j , m 3 ,   j = 2 N P + 1 , , 3 N P .
Step 3.5. Sort all individuals in non-decreasing order of the objective function value.
Step 4. Checking the conditions for completing the search.
If 1 2 j = 2 3 f x j f x 1 2 < M i n D i s t or if the migration limit (Migration) is reached, then go to step 6; otherwise, go to step 5.
Step 5. Update the current population.
Step 5.1. Sort all individuals in non-decreasing objective function value:
{ x 1 , , x 3 N P | f ( x k ) f x k + 1 } , k = 1 , , 3 N P 1 .
Step 5.2. Delete last 2 N P + 1 3 N P individuals.
Step 5.3. Create 1 3 N P new individuals:
x i j = a i + r a n d i 0 ,   1 · b i a i , i = 1 , , n ;   j = 2 3 N P + 2 , , N P .
Step 5.4. Increment iteration counter: M C o u n t = M C o u n t + 1 and go to step 3.
Step 6. Realize the intensively clarifying migration cycle.
Step 6.1. Increase parameter NStep ( N S t e p = 10   N S t e p ) and implement a migration cycle for the second and third leaders relative to the first leader. Before the individual moves towards the leader, generate a random number on the segment [ 0 , 1 ] for each coordinate of the individual, and then compare with PRT:
if   r n d i < P R T ,   then   P R T V e c t o r i = 1   , else   0 ,   i = 1 , , n .
Thus, to form P R T V e c t o r , defining the set of coordinate directions along which the search will be performed.
The movement of the second and third leaders is implemented according to the formula as follows:
x j , m = x j + m x 1 x j N S t e p 2 P R T V e c t o r ,   m = 0 , 1 , , N S t e p ;   j = 1 , 2 , 3 .
Step 6.2. Find a new leader position:
x j , N e w = argmin m = 0 , 1 , , N S t e p f x j , m ,   j = 1 , 2 , 3 .
Step 6.3. Return the best solution found during the search: x a n s = argmin   j = 1 , 2 , 3 f x j .
The complete steps of the MSOMA algorithm are shown in the Algorithm 1.
Algorithm 1 MSOMA Algorithm
1: Initialize the algorithm parameters NStep, PRT, NP, Migration, MinDist, MCount
2: Set MCount = 1
3: Create an initial population:
4: for all j = 1 , , N P do
5: x i j = a i + r a n d i 0 ,   1 b i a i , i = 1 , , n ;
6: end for
7: for all j = 1 , , N P calculate the objective function value
8: end for
9: Select the first three individuals with the smallest values of the objective function from the current population
10: Realize a migration cycles regarding the three leaders:
11: for all j = 1 , , N P create two “clones” as
12: x i N p + j = x i j ,   x i 2 N p + j = x i j ,   i = 1 , , n ;
13: end for
14: Implement the movement of individuals to the first leader:
15: for all j = 1 , , N P do
16: for all i = 1 , , n do
17: generate a random number r n d i on the segment [ 0 , 1 ] ;
18: if   r n d i < P R T ,   then
19: P R T V e c t o r i = 1
20: else
21: P R T V e c t o r i = 0
22: end if
23: end for
24: for m 1 = 0 ,   1 , , 4 N S t e p do
25: x j , m 1 = x j + m 1 x 1 x j 2 N S t e p P R T V e c t o r ;
26: end for
27: Calculate x j , N e w = argmin m 1 = 0 , 1 , , 4 N S t e p f x j , m 1
28: end for
29: Implement the movement of individuals to the second leader:
30: for all j = N P + 1 , , 2 N P do
31: for all i = 1 , , n , do
32: generate a random number r n d i on the segment [ 0 , 1 ] ;
33: if   r n d i < P R T ,   then
34: P R T V e c t o r i = 1
35: else
36: P R T V e c t o r i = 0
37: end if
38: end for
39: for m 2 = 0 ,   1 , , 2 N S t e p do
40: x j , m 2 = x j + m 2 x 2 x j N s t e p P R T V e c t o r ;
41: end for
42: Calculate x j , N e w = argmin m 2 = 0 , 1 , , 2 N S t e p f x j , m 2
43: end for
44: Implement the movement of individuals to the third leader:
45: for all j = 2 N P + 1 , , 4 N P do
46: for all i = 1 , , n , do
47: generate a random number r n d i on the segment [ 0 , 1 ] ;
48: if   r n d i < P R T ,   then
49: P R T V e c t o r i = 1
50: else
51: P R T V e c t o r i = 0
52: end if
53: end for
54: for m 3 = 0 ,   1 , , N S t e p do
55: x j , m 3 = x j + m 3 x 3 x j N S t e p 2 P R T V e c t o r ;
56: end for
57: Calculate x j , N e w = argmin m 3 = 0 , 1 , , N S t e p f x j , m 3
58: end for
59: Sort all individuals in non-decreasing order of the objective function value
60: Check the conditions for completing the search
61: if 1 2 j = 2 3 f x j f x 1 2 < M i n D i s t ˅ Mcount = Migration, then
62: go to 75
63: else
64: go to 66
65: end if
66: Update the current population:
67: Sort all individuals in non-decreasing objective function value:
{ x 1 ,   , x 3 N P | f ( x k ) f x k + 1 } , k = 1 , , 3 N P 1
68: Delete last 2 N P + 1 3 N P individuals and create 1 3 N P new individuals:
69: for j = 2 3 N P + 2 , , N P do
70: for i = 1 , , n do
71: x i j = a i + r a n d i 0 ,   1 · b i a i
72: end for
73: end for
74: Set M C o u n t = M C o u n t + 1 and go to 10
75: Realize the intensively clarifying migration cycle
76: Set N S t e p = 10 · N S t e p
77: for j = 1 , 2 , 3 do
78: for all i = 1 , , n , do
79: generate a random number r n d i on the segment [ 0 , 1 ] ;
80: if   r n d i < P R T ,   then
81: P R T V e c t o r i = 1
82: else
83: P R T V e c t o r i = 0
84: end if
85: end for
86: for m = 0 , 1 , , N S t e p do
87: x j , m = x j + m x 1 x j N S t e p 2 P R T V e c t o r ;
88: end for
89: Calculate x j , N e w = argmin m = 0 , 1 , , N S t e p f x j , m
90: end for
91: Return the best solution x a n s = argmin j = 1 , 2 , 3   f x j

3. Model Examples

The solutions of all model examples are implemented using the Jupyter Notebook software in the Phyton language. Macbook Pro 16 gb RAM, and M1 pro processor with a frequency of 3.2 GHz were used.

3.1. Optimal Open-Loop Control of a Stochastic Discrete Systems

Example 1.
The behavior of the control object model is described by the difference equation:
X ( t + 1 ) = X ( t ) + u ( t ) + σ W ( t ) ,
where X R , t = 0 , 1 , , N 1 ; the control variable is constrained by 2 × 10 4 u 0 and W ( t ) is the Gaussian random variable with zero mean and unit variance. We take the value of the parameter characterizing the intensity of external influences σ = 20 . The initial states X ( 0 ) are given by the uniform distribution law on the segment [ 200 , 200 ] . Functional (3) has the form
I ( X 0 , u ( · ) ) = 1 2 t = 0 N 1 γ t u 2 ( t ) + X ( N ) ,   γ > 1 .
It is required to find a control minimizing the value of the functional (4).
Set the number of discrete-time steps: N = 50 , number of implementations K = 5 . SOMA parameters are given in Table 1. Figure 1 shows the obtained optimal control and the family of trajectories. The resulting value of the quality functional for γ = 1.1 is min J = 428.0443 ; numerical results were obtained in 3173.59 s.
MSOMA parameters are given in Table 2. The resulting value of the quality functional is min J = 593.329 ; numerical results were obtained in 1783.98 s. Figure 2 shows the obtained optimal control and the set of trajectories.
Example 2.
The behavior of the control object model is described by a system of difference equations:
X 1 ( t + 1 ) = X 2 ( t ) + σ 1 W 1 ( t ) ,
X 2 ( t + 1 ) = 2 X 2 ( t ) X 1 ( t ) + 1 N 2 u ( t ) + σ 2 W 2 ( t ) ,
where X R 2 , u R , W 1 ( t ) , W 2 ( t ) are the independent Gaussian random variables with zero mean and unit variance. We take the values of the parameters characterizing the intensity of external influences σ 1 = 0.02 , σ 2 = 0.03 . The initial states are described by the uniform distribution law on the set [ 2 , 2 ] × [ 2 , 2 ] , the control variable is constrained by 0 u 100 .
Functional (3) has the form
I ( X 0 , u ( · ) ) = X 1 ( N ) + 1 2 N t = 0 N 1 u 2 ( t ) .
It is required to find the optimal open-loop control and the minimum value of the functional (4).
Let us set N = 10 , number of implementations K = 25. SOMA parameters are given in Table 3. Figure 3 and Figure 4 show the resulting optimal control and trajectories. The resulting value of the quality functional is min J = 0.46146 ; numerical results were obtained in 8504.02 s.
MSOMA parameters used are shown in Table 4. Figure 5 and Figure 6 show the obtained optimal control and trajectories of the discrete dynamical system. The resulting value of the quality functional is min J = 0.52729 ; numerical results were obtained in 14,270.38 s.

3.2. Optimal Open-Loop Control of Bundles of Trajectories

Example 3.
The behavior of the control object model is described by the difference equation
x ( t + 1 ) = x ( t ) + u ( t ) ,
where x R , t = 0 , 1 , , N 1 . The control variable is constrained by 2 × 10 4 u 0 . The initial states are determined from the condition x ( 0 ) Ω = [ 200 , 200 ] . Functional (8) has the form
I ( x 0 , u ( · ) ) = 1 2 t = 0 N 1 γ t u 2 ( t ) + x ( N ) ,   γ > 1 .
It is required to find a control minimizing the value of the functional (11).
Set the number of discrete-time steps N = 50 , number of implementations K = 5 .
SOMA parameters are given in Table 5. Figure 7 shows the resulting optimal control and bundle of trajectories. The resulting value of the quality functional for γ = 1.1 is min J = 581.8247 , numerical results were obtained in 520.52 s.
MSOMA parameters are given in Table 6. Figure 8 shows the resulting optimal control and bundle of trajectories. The resulting value of the quality functional for γ = 1.1 is min J = 581.9022 ; numerical results were obtained in 730.24 s.
Example 4.
The behavior of the control object model is described by a system of difference equations:
x 1 ( t + 1 ) = x 2 ( t ) ,
x 2 ( t + 1 ) = 2 x 2 ( t ) x 1 ( t ) + 1 N 2 u ( t ) ,
where x R 2 , u R . The initial states are determined from the condition x ( 0 ) Ω = [ 2 , 2 ] × [ 2 , 2 ] . The control variable is constrained by 0 u 100 . Functional (8) has the form
I ( x 0 , u ( · ) ) = x 1 ( N ) + 1 2 N t = 0 N 1 u 2 ( t ) .
It is required to find the optimal program control and the minimum value of the functional (11).
Let us set N = 10 , number of implementations K = K 1 K 2 = 25. The results of the SOMA algorithm (the parameters are given in Table 5). Figure 9 and Figure 10 show the resulting optimal control and corresponding bundle of trajectories. The resulting value of the quality functional: min J = 0.5424 ; numerical results were obtained in 225.58 s.
MSOMA parameters are given in Table 6. The resulting value of the quality functional is min J = 0.5424 , numerical results were obtained in 868.34 s. Figure 11 and Figure 12 show the resulting optimal control and corresponding bundle of trajectories.

3.3. Optimal Open-Loop Control of Individual Trajectories

Example 5.
The behavior of the control object model is described by the difference equation
x ( t + 1 ) = x ( t ) + u ( t ) ,
where x R , t = 0 , 1 , , N 1 . The control variable is constrained by 2 × 10 4 u 0 . The initial state is given by the condition x ( 0 ) = 0 .
On the set of admissible processes, the quality functional is defined as follows:
I = 1 2 t = 0 N 1 γ t u 2 ( t ) + x ( N ) ,   γ > 1 .
It is required to find the minimum value of the functional and the optimal process ( x * ( · ) , u * ( · ) ) , at which this value is reached.
The problem under consideration has an analytical solution presented as: x * ( t ) = x 0 + 1 γ t γ 1 , min I = x 0 + 1 γ N 2 ( γ 1 ) .
Set the number of discrete-time steps: N = 50 . For γ = 1.1 the exact value of the quality functional is I min = 581.95426 . SOMA parameters are given in Table 3. The resulting value of the quality functional is min I = 581.32586 , numerical results were obtained in 2.2 min. Figure 13 shows the resulting optimal control and corresponding trajectory.
MSOMA parameters are given in Table 4. The resulting value of the quality functional is min I = 581.80435 ; numerical results were obtained in 10.7 min. Figure 14 shows the resulting optimal control and corresponding trajectory.
Example 6.
The behavior of the control object model is described by a system of difference equations:
x 1 ( t + 1 ) = x 2 ( t ) ,
x 2 ( t + 1 ) = 2 x 2 ( t ) x 1 ( t ) + 1 N 2 u ( t ) ,
where x R 2 , u R . The initial state is given by the condition x ( 0 ) = ( 0 , 0 ) T . The control variable is constrained by 0 u 100 .
On the set of admissible processes, a functional of the Bolza form
I = x 1 ( N ) + 1 2 N t = 0 N 1 u 2 ( t )
is defined. It is required to find the minimum value of the functional and the optimal process on which this value is reached.
The problem has an analytical solution: u * ( t ) = N t 1 N , min I = 1 3 + 3 N 1 6 N 2 + 1 2 N 3 t = 0 N 1 t 2 . If one set N = 10 , then min I = 0.1425 .
SOMA parameters are given in Table 7. The resulting value of the quality functional is min I = 0.14256 , numerical results were obtained in 26 s. Figure 15 and Figure 16 show the resulting optimal control and trajectory.
MSOMA parameters are given in Table 8. The resulting value of the quality functional is min I = 0.14236 ; numerical results were obtained in 12.7 s. Figure 17 and Figure 18 show the resulting optimal control and trajectory.
Example 7.
The behavior of the control object model is described by a difference equation:
x ( t + 1 ) = x ( t ) + u ( t ) ,
where x R , u R . The initial state is given by the condition x ( 0 ) = 3 , the control variable is constrained by 100 u 100 .
On the set of admissible processes, a functional of the form
I = t = 0 N 1 [ u 2 ( t ) + x 2 ( t ) ]
is defined. It is required to find the minimum value of the functional and the optimal process on which this value is reached.
The problem has an analytical solution. If one set N = 10 , then min I = 14.56231 .
MSOMA parameters are given in Table 8. The resulting value of the quality functional: min I = 14.58935 ; numerical results were obtained in 5.7 s. The obtained control and state values are presented in Table 9. Figure 19 shows the resulting optimal control and trajectory.

4. Discussion

Analysis of the results of solving Examples 5 and 6 indicates the ability of migrating optimization algorithms SOMA and MSOMA to find the optimal control and the corresponding trajectory for deterministic discrete dynamical systems with sufficiently high accuracy, which follows from a comparison with the exact solutions obtained by applying the discrete maximum principle. Example 7 shows that there are problems in which the control values tend to zero over time, and the coordinate values of the state vector tend to be stationary values.
The results of solving Examples 3 and 4 demonstrate the capabilities of metaheuristic migrating algorithms to solve optimal control problems under conditions of uncertainty in specifying initial conditions. In this case, the found control laws generate a fairly regular behavior of the trajectories forming the bundle. The disadvantages of using open loop control of trajectory bundles include the inability to concentrate trajectories in the desired neighborhood, which is demonstrated in Figure 10 and Figure 12 in Example 4. The optimal control and graphs in Figure 7 and Figure 8 obtained as a result of solving Example 3 demonstrate that the uncertainty generated by specifying a set of possible initial conditions when using open loop control at least does not decrease over time.
The results of solving Examples 1 and 2, presented in Figure 1, Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6, demonstrate the influence of random external influences on the type of trajectories of a dynamical system. Comparing the graphs in Figure 1 and Figure 2 with Figure 7 and Figure 8 (Examples 1 and 3), as well as the graphs in Figure 3, Figure 4, Figure 5 and Figure 6 and Figure 9, Figure 10, Figure 11 and Figure 12 (Examples 2 and 4), one can conclude that the set of obtained trajectories of stochastic systems compared with a similar set of trajectories of deterministic systems are characterized by noticeable turbulence. At the same time, the patterns of behavior of controlled systems generally correspond to each other.

5. Conclusions and Future Work

The article demonstrates the application of metaheuristic migrating optimization algorithms to solving optimal control problems for discrete dynamical systems of different classes. Their use makes it possible to obtain a solution of acceptable quality at reasonable computational costs. It is noted that the proposed modified algorithm solves the problem of finding the optimal control of stochastic systems more successfully than the basic one. In problems of optimal control of bundles of trajectories and an individual trajectory, migrating algorithms are almost as accurate as each other.
Noting the shortcomings of the proposed approach, it should be emphasized that with a significant increase in the value of N, the complexity of the problems being solved also increases because of the number of variables on which the value of the functionals (4), (11), (15) and (16) depends increases. This fact leads to an increase in computational costs, which may be unjustified. One direction of development may be the search for the desired control law in parametric form with a relatively small number of characteristic parameters.
A further direction of research can be the search for an optimal control that depends on time and a set of coordinates of the state vector available for measurement, as well as solving problems of joint estimation and control. In this case, it is possible to use other metaheuristic optimization algorithms, namely methods of swarm intelligence, bio-inspired, multi-agent, and multi-start, as well as simulating physical and chemical processes.

Author Contributions

Conceptualization, A.P.; methodology, A.P. and V.R.; software, V.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Source code at https://github.com/Vradislav/MSOMA (accessed on 4 October 2023); numerical results at https://github.com/Vradislav/MSOMA/blob/main/MSOMA_application.ipynb (accessed on 4 October 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bertsekas, D.P. Dynamic Programming and Optimal Control; Athena Scientific: Belmont, MA, USA, 1995; Volume 1. [Google Scholar]
  2. Bertsekas, D.P.; Shreve, S.E. Stochastic Optimal Control: The Discrete Time Case; Academic: New York, NY, USA, 2004. [Google Scholar]
  3. Luus, R. Iterative Dynamic Programming, 1st ed.; Chapman & Hall/CRC: London, UK, 2000. [Google Scholar]
  4. Salinelli, E.; Tomarelli, F. Discrete Dynamical Models; Springer: Milan, Italy, 2014. [Google Scholar]
  5. Propoi, A.I. Problems of discrete control with phase constraints. USSR Comput. Math. Math. Phys. 1972, 12, 53–74. [Google Scholar] [CrossRef]
  6. Kwakernaak, H.; Sivan, R. Linear Optimal Control Systems; Wiley–Interscience: Hoboken, NJ, USA, 1972. [Google Scholar]
  7. Fleming, W.H.; Soner, H.M. Controlled Markov Processes and Viscosity Solutions; Springer: New York, NY, USA; Berlin/Heidelberg, Germany, 1993. [Google Scholar]
  8. Kushner, H.J. Approximation and Weak Convergence Methods for Random Processes with Applications to Stochastic Systems Theory; MIT Press: Cambridge, MA, USA, 1984. [Google Scholar]
  9. Kushner, H.J.; Schweppe, F.C. A maximum principle for stochastic control systems. J. Math. Anal. Appl. 1964, 8, 287–302. [Google Scholar] [CrossRef]
  10. Peng, S. A general stochastic maximum principle for optimal control problems. SIAM J. Control. Optim. 1990, 28, 966–979. [Google Scholar] [CrossRef]
  11. Lin, X.; Zhang, W. A maximum principle for optimal control of discrete-time stochastic systems with multiplicative noise. IEEE Trans. Autom. Control 2015, 60, 1121–1126. [Google Scholar] [CrossRef]
  12. Golovkina, A.; Ovsyannikov, D. Optimal control of trajectories ensemble for a class of discrete dynamic systems. IFAC-PapersOnLine 2020, 53, 6893–6898. [Google Scholar] [CrossRef]
  13. Kotina, E.D.; Ovsyannikov, D.A. Mathematical model of joint optimization of programmed and perturbed motions in discrete systems. In Prikladnaya Matematika, Informatika, Protsessy Upravleniya; Vestnik Sankt-Peterburgskogo Universiteta: St. Petersburg, FL, USA, 2021; Volume 17, pp. 213–224. [Google Scholar] [CrossRef]
  14. Kotina, E.D. Discrete optimization problem in beam dynamics. Nucl. Instrum. Methods Phys. Res. A 2006, 558, 292–294. [Google Scholar] [CrossRef]
  15. Floudas, C.; Pardalos, P. Encyclopedia of Optimization, 2nd ed.; Springer: New York, NY, USA, 2009. [Google Scholar]
  16. Gendreau, M. Handbook of Metaheuristics, 2nd ed.; Springer: New York, NY, USA, 2010. [Google Scholar]
  17. Panteleev, A.V.; Kolessa, A.A. Application of the tomtit flock metaheuristic optimization algorithm to the optimal discrete time deterministic dynamical control problem. Algorithms 2022, 15, 301. [Google Scholar] [CrossRef]
  18. Panteleev, A.V.; Kolessa, A.A. Optimal open-loop control of discrete deterministic systems by application of the perch school metaheuristic optimization algorithm. Algorithms 2022, 15, 157. [Google Scholar] [CrossRef]
  19. Davendra, D.; Zelinka, I. Self-Organizing Migrating Algorithm, 1st ed.; Methodology and Implementation; Studies in Computational Intelligence, 626; Springer: New York, NY, USA, 2016. [Google Scholar]
  20. Zelinka, I.; Lampinen, J. SOMA—Self-organizing migrating algorithm. In Proceedings of the International Conference on Soft Computing (Mendel 2000), Brno, Czech Republic, 10–12 July 2000; pp. 177–187. [Google Scholar]
  21. Panteleev, A.V.; Rakitianskii, V.M. Application of the modified self-organizing migration algorithm MSOMA in optimal open-loop control problems. J. Phys. Conf. Ser. 2021, 1925, 012017. [Google Scholar] [CrossRef]
  22. Horn, R.A.; Johnson, C.R. Matrix Analysis; Cambridge University Press: Cambridge, UK, 2012. [Google Scholar]
Figure 1. Optimal control and trajectories in Example 1 (SOMA application).
Figure 1. Optimal control and trajectories in Example 1 (SOMA application).
Axioms 12 01014 g001
Figure 2. Optimal control and trajectories in Example 1 (MSOMA application).
Figure 2. Optimal control and trajectories in Example 1 (MSOMA application).
Axioms 12 01014 g002
Figure 3. Optimal control in Example 2 (SOMA application).
Figure 3. Optimal control in Example 2 (SOMA application).
Axioms 12 01014 g003
Figure 4. Trajectories in Example 2 (SOMA application).
Figure 4. Trajectories in Example 2 (SOMA application).
Axioms 12 01014 g004
Figure 5. Optimal control in Example 2 (MSOMA application).
Figure 5. Optimal control in Example 2 (MSOMA application).
Axioms 12 01014 g005
Figure 6. Trajectories in Example 2 (MSOMA application).
Figure 6. Trajectories in Example 2 (MSOMA application).
Axioms 12 01014 g006
Figure 7. Optimal control and bundle of trajectories in Example 3 (SOMA application).
Figure 7. Optimal control and bundle of trajectories in Example 3 (SOMA application).
Axioms 12 01014 g007
Figure 8. Optimal control and bundle of trajectories in Example 3 (MSOMA application).
Figure 8. Optimal control and bundle of trajectories in Example 3 (MSOMA application).
Axioms 12 01014 g008
Figure 9. Optimal control in Example 4 (SOMA application).
Figure 9. Optimal control in Example 4 (SOMA application).
Axioms 12 01014 g009
Figure 10. Bundle of trajectories in Example 4 (SOMA application).
Figure 10. Bundle of trajectories in Example 4 (SOMA application).
Axioms 12 01014 g010
Figure 11. Optimal control in Example 4 (MSOMA application).
Figure 11. Optimal control in Example 4 (MSOMA application).
Axioms 12 01014 g011
Figure 12. Bundle of trajectories in Example 4 (MSOMA application).
Figure 12. Bundle of trajectories in Example 4 (MSOMA application).
Axioms 12 01014 g012
Figure 13. Optimal control and trajectory in Example 5 (SOMA application).
Figure 13. Optimal control and trajectory in Example 5 (SOMA application).
Axioms 12 01014 g013
Figure 14. Optimal control and trajectory in Example 5 (MSOMA application).
Figure 14. Optimal control and trajectory in Example 5 (MSOMA application).
Axioms 12 01014 g014
Figure 15. Optimal control in Example 6 (SOMA application).
Figure 15. Optimal control in Example 6 (SOMA application).
Axioms 12 01014 g015
Figure 16. Optimal trajectory in Example 6 (SOMA application).
Figure 16. Optimal trajectory in Example 6 (SOMA application).
Axioms 12 01014 g016
Figure 17. Optimal control in Example 6 (MSOMA application).
Figure 17. Optimal control in Example 6 (MSOMA application).
Axioms 12 01014 g017
Figure 18. Optimal trajectory in Example 6 (MSOMA application).
Figure 18. Optimal trajectory in Example 6 (MSOMA application).
Axioms 12 01014 g018
Figure 19. Optimal control and trajectory in Example 7 (MSOMA application).
Figure 19. Optimal control and trajectory in Example 7 (MSOMA application).
Axioms 12 01014 g019
Table 1. SOMA parameters (Example 1).
Table 1. SOMA parameters (Example 1).
Parameters
N P N s t e p P R T M i g r a t i o n s M i n D i s t
15001000.4100 10 5
Table 2. MSOMA parameters (Example 1).
Table 2. MSOMA parameters (Example 1).
Parameters
N P N s t e p P R T M i g r a t i o n s M i n D i v
600400.450 10 10
Table 3. SOMA parameters (Example 2).
Table 3. SOMA parameters (Example 2).
Parameters
N P N s t e p P R T M i g r a t i o n s M i n D i s t
900700.3100 10 5
Table 4. MSOMA parameters (Example 2).
Table 4. MSOMA parameters (Example 2).
Parameters
N P N s t e p P R T M i g r a t i o n s M i n D i v
600500.350 10 10
Table 5. SOMA parameters in Example 3.
Table 5. SOMA parameters in Example 3.
Parameters
N P N s t e p P R T M i g r a t i o n s M i n D i s t
10001000.3700.001
Table 6. MSOMA parameters in Example 3.
Table 6. MSOMA parameters in Example 3.
Parameters
N P N s t e p P R T M i g r a t i o n s M i n D i v
10001000.31500.001
Table 7. SOMA parameters in Example 6.
Table 7. SOMA parameters in Example 6.
Parameters
N P N s t e p P R T M i g r a t i o n s M i n D i s t
500500.4100 10 10
Table 8. MSOMA parameters in Example 6.
Table 8. MSOMA parameters in Example 6.
Parameters
N P N s t e p P R T M i g r a t i o n s M i n D i v
100500.31000.0001
Table 9. Optimal control and state values in Example 7.
Table 9. Optimal control and state values in Example 7.
t 0123456
u * ( t ) −1.83466−0.68896−0.33061−0.101690.01892−0.05302−0.05551
x * ( t ) 3.000001.165340.476380.145760.044070.062990.00997
t78910
u * ( t ) 0.05713−0.04223−0.04906
x * ( t ) −0.045540.01159−0.03064−0.07970
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Panteleev, A.; Rakitianskii, V. Application of Migrating Optimization Algorithms in Problems of Optimal Control of Discrete-Time Stochastic Dynamical Systems. Axioms 2023, 12, 1014. https://0-doi-org.brum.beds.ac.uk/10.3390/axioms12111014

AMA Style

Panteleev A, Rakitianskii V. Application of Migrating Optimization Algorithms in Problems of Optimal Control of Discrete-Time Stochastic Dynamical Systems. Axioms. 2023; 12(11):1014. https://0-doi-org.brum.beds.ac.uk/10.3390/axioms12111014

Chicago/Turabian Style

Panteleev, Andrei, and Vladislav Rakitianskii. 2023. "Application of Migrating Optimization Algorithms in Problems of Optimal Control of Discrete-Time Stochastic Dynamical Systems" Axioms 12, no. 11: 1014. https://0-doi-org.brum.beds.ac.uk/10.3390/axioms12111014

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop