Dynamic Event-Triggered Integral Sliding Mode Adaptive Optimal Tracking Control for Uncertain Nonlinear Systems

Tan, Wei; Yu, Wenwu; Wang, He

doi:10.3390/sym14061264

Open AccessArticle

Dynamic Event-Triggered Integral Sliding Mode Adaptive Optimal Tracking Control for Uncertain Nonlinear Systems

by

Wei Tan

,

Wenwu Yu

^*

and

He Wang

School of Mathematics, Southeast University, Nanjing 211189, China

^*

Author to whom correspondence should be addressed.

Symmetry 2022, 14(6), 1264; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14061264

Submission received: 13 May 2022 / Revised: 9 June 2022 / Accepted: 13 June 2022 / Published: 18 June 2022

(This article belongs to the Special Issue Recent Advances in Sliding Mode Control/Observer and Its Applications)

Download

Browse Figures

Versions Notes

Abstract

:

In this paper, we study the event-triggered integral sliding mode optimal tracking problem of nonlinear systems with matched and unmatched disturbances. The goal is to design an adaptive dynamic programming-based sliding-mode controller, which stabilizes the closed-loop system and guarantees the optimal performance of the sliding-mode dynamics. First, in order to remove the effects of the matched uncertainties, an event-triggered sliding mode controller is designed to force the state of the systems on the sliding mode surface without Zeno behavior. Second, another event-triggered controller is designed to suppress unmatched disturbances with a nearly optimal performance while also guaranteeing Zeno-free behavior. Finally, the benefits of the proposed algorithm are shown in comparison to several traditional triggering and learning-based mechanisms.

Keywords:

event-triggered control; integral sliding mode control; optimal control; adaptive control; adaptive dynamic programming

1. Introduction

Traditional control systems are implemented with time-triggered (TT) sampling (i.e., periodic sampling). Data are sent from the controller to the actuator (or from the sensor to the controller) using a fixed sample length. However, in modern control systems, especially networked control systems, control signals are often implemented aperiodically [1]. The advantages of the aperiodic sampling in terms of update times have been elaborated in [2] in detail. In fact, the limited communication bandwidth in networked control systems has stimulated a huge attention in event-triggered control (ETC) [3,4,5,6,7,8,9,10,11], as an alternative to TT. Despite the progress in the filed, some problems remain open, such as how to simultaneously counteract the effect of matched and unmatched disturbances while guaranteeing the optimal performance of the ETC systems, which is one of our concerns and is an issue to be studied.

Owing to the distinguished features such as fast dynamic response, robustness, order reduction, and implementation simplicity, sliding mode control (SMC) is widely studied in the fields of transportation, power grid, industrial communication networks, and uncertain serial industrial robots [12,13,14,15,16,17,18,19,20]. Especially, SMC has been recognized as one of the popular and powerful control tools in power converters based on deriving from the variable structure systems theorem. In addition, its popularity comes from the robustness feature, which eliminates the burden of the necessity of system parameters required for accurate modeling, yet it may lack robustness against unmatched disturbances. Although many achievements in codesign of sliding mode control and event-triggered have emerged [21,22,23], they all neglect how to guarantee the optimal performance of the controlled system in the presence of matched and unmatched disturbances.

Optimal control theory is now quite mature [24,25,26,27]. Optimal control for nonlinear systems requires one to solve the Hamilton–Jacobi–Bellman (HJB) or Hamilton–Jacobi–Isaac (HJI) equations. However, the nonlinearity of the above equation makes it impossible to get an analytical solution [28]. Although numerical solutions via dynamic programming [29] can be obtained, issues of offline optimization and curse of dimensionality make these numerical methods not feasible in practical applications. To address these shortcomings, adaptive dynamic programming (ADP), oriented from reinforcement learning (RL) technique, has been widely researched in recent decades [30,31,32,33,34,35].

Event-triggered ADP [36,37,38] and event-triggered ADP-based ISMC [39,40] have been investigated in the past years. However, some open issues still remain, which are explained as follows. Although event-triggered implementations can reduce the update times of standard TT implementations, it needs to be pointed out that the event-triggering techniques developed in [39,40] depend only on the current state (static event-triggered control). It is well known that static event-triggered techniques may effectively reduce the communication cost when the sampling error is large. However, as the sampling error becomes smaller, this technique will become conservative, i.e., it can cause several unnecessary triggering. So, developing more flexible event-triggering conditions to further save resources is urgently needed. Moreover, it is still an open problem to design a learning-based event-triggered strategy while at the same time relaxing the excitation condition typically required for ADP methods to further reduce triggered times. Enlightened by the aforementioned discussions, in this article, we will provide a new learning-based framework for ETC-based ISM optimal tracking problem. The novelties are presented below.

Different from the combined SMC and ADP frameworks of [39,40], this paper proposes a new dynamic event-triggered (DET) mechanism. By introducing an auxiliary variable, which is non-negative. This can increase the length of the time intervals between triggering events, further reducing the communication burden compared with [39,40].
A novel Integral Sliding Mode Control (ISMC) scheme based on DET for uncertain nonlinear systems is proposed, consisting of two control laws. A first event-triggered controller is designed to tackle the matched uncertainties and force the trajectory of the system on the sliding mode surface. A second event-triggered controller is designed to tackle the unmatched uncertainties and guarantee optimal performance.
To solve the resulting optimal control problem, a critic-only neural network (NN) based on ADP is proposed via the experience replay technique, which helps relaxing the excitation condition typically required for ADP methods to work. Stability of the closed-loop system is proven in the sense of uniformly ultimately boundedness, while guaranteeing Zeno-free behavior of the triggering mechanism.

The paper is arranged as follows. In Section 2, we introduce the model formulation and preliminaries. Section 3 covers the event-triggered-based ISMC design. Section 4 presents the framework of a dynamic triggered ADP strategy, along with stability analysis. Section 5 illustrates the effectiveness of the novel algorithm via comparative simulations. Section 6 gives the conclusions and possible future works.

Notations:

R^{+}

represents the sets of the positive real numbers.

R^{n}

and

R^{n \times m}

denote the space of all real n-vectors and the space of all

n \times m

real matrices, respectively. ≜ means “equal by definition”, and

I_{n}

is the identity matrix of dimension

n \times n

. T is the transposition symbol.

C^{1}

represents the class of functions with continuous derivative.

λ_{m i n} (X)

is defined as the minimum eigenvalue of matrix X,

∥ \cdot ∥

represents the 2-norm of a vector or matrix. For any full column rank matrix

F (\cdot)

, its left pseudoinverse is

F^{†} (\cdot) ≜ {(F^{T} (\cdot) F (\cdot))}^{- 1} F^{T} (\cdot)

.

2. System Formulation and Preliminaries

In this section, an ISMC design oriented toward optimal tracking is discussed. To target optimality, it is useful to introduce an augmented system associated with the tracking error system and the desired reference system.

Consider the following nonlinear system with matched and unmatched uncertainties [39]:

\dot{x} = p (x) + q (x) (u + d) + \bar{h} (x) w

(1)

where

x \in R^{n}

,

u \in R^{m}

,

d \in R^{m}

and

w \in R^{q}

are the system state, control input, unknown bounded matched disturbance, and unknown bounded unmatched disturbance, respectively. The system dynamics

p (x), q (x)

and

\bar{h} (x)

are known Lipschitz functions with

p (0) = 0

, and

∥ q (x) ∥

and

∥ \bar{h} (x) ∥

having a positive upper bound. Meanwhile, let the system (1) be controllable, and the matrix function

q (x)

be full column rank with

q (\cdot) \neq \bar{h} (\cdot)

.

Remark 1.

Herein, it should be noted that system (1) is favored by scholars in theoretical studies [18,33,35] as well as has been widely studied and explored in practical applications, such as a single link robot arm [39,40]; power system [34]; a spacecraft [41] et al.

The desired signal

x_{d}

is subject to

\begin{matrix} {\dot{x}}_{d} = r (x_{d}) \end{matrix}

(2)

where the bounded

x_{d} \in R^{n}

is Lipschitz continuous with

r (0) = 0

. Define the tracking error

\begin{matrix} e_{d} = x - x_{d} . \end{matrix}

(3)

Combining (1)–(3),

e_{d}

satisfies the following dynamics

\begin{matrix} {\dot{e}}_{d} = p (x) + q (x) (u + d) + \bar{h} (x) w - {\dot{x}}_{d} . \end{matrix}

(4)

Define the augmented state

ξ = {[e_{d}^{T}, x_{d}^{T}]}^{T} \in R^{2 n}

. Combining (2) with (4) generates the following augmented system

\dot{ξ} = f (ξ) + g (ξ) (u + d) + h (ξ) w

(5)

with

\begin{matrix} f (ξ) = [\begin{matrix} p (e_{d} + x_{d}) - r (x_{d}) \\ r (x_{d}) \end{matrix}] \\ g (ξ) = [\begin{matrix} q (e_{d} + x_{d}) \\ 0 \end{matrix}] \\ h (ξ) = [\begin{matrix} \bar{h} (e_{d} + x_{d}) \\ 0 \end{matrix}] \end{matrix}

where

f (ξ) \in R^{2 n}

,

g (ξ) \in R^{2 n \times m}

and

h (ξ) \in R^{2 n \times q}

.

For simplicity and convenience of analysis, we will omit some arguments of the functions (

f (ξ), g (ξ), and h (ξ)

will be written as

f, g, and h

, respectively) in the later part of the paper. To this end, the following two standard assumptions are required.

Assumption 1

([39]). System dynamics (5) with

f (0) = 0

is Lipschitz continuous, and

g and h

satisfy

∥ g ∥ \leq b_{q} (b_{q} \in R^{+})

and

∥ h ∥ \leq b_{h} (b_{h} \in R^{+})

, respectively. The disturbances d and

w

satisfy

∥ d ∥ \leq b_{d} (b_{d} \in R^{+})

and

∥ w ∥ \leq b_{w} (b_{w} \in R^{+})

, respectively.

Assumption 2

([39]). The matrix function g is full column rank and its left pseudoinverse is given by

g^{†} = {(g^{T} g)}^{- 1} g^{T}

and there exists some positive values

b_{q^{†}} (b_{q^{†}} \in R^{+})

such that

∥ g^{†} ∥ \leq b_{q^{†}}

. Then, the following equality holds

\begin{matrix} h w = g g^{†} h w + (I - g g^{†}) h w \end{matrix}

(6)

where

∥ g g^{†} h ∥ \leq b_{g^{†}} (b_{g^{†}} \in R^{+})

.

Control objective: This article aims to achieve an optimal tracking control of system state x for the desired trajectory

x_{d}

, so that the tracking error is uniformly ultimately bounded UUB. The control input should eliminate the effect of the matched disturbance and reduce the effects of the unmatched disturbance.

To achieve this goal, first, a new composite control law in the form

u = u_{0} + u_{1}

will be considered. Second, according to the robustness to nonlinearities and uncertainties of SMC technique, the Section 3 designs an integral sliding surface and event-triggered controller

u_{1}

aiming to suppress the matched effects of the systems while forcing the state of the systems on the sliding mode surface without Zeno-free behavior. Then, event-triggered optimal controller

u_{0}

is designed aiming to reduce the unmatched effect and guarantee the optimal performance of the sliding-mode dynamics in Section 4. Finally, a nonlinear single link robot arm is considered to verify the effectiveness of the proposed algorithm.

3. DET-Based ISMC Design

To tackle the uncertainty affecting the nonlinear system (5), an integral type sliding surface is designed as

\begin{matrix} S (ξ) = & M [ξ - ξ_{0} - \int_{0}^{t} (f + g u_{0}) d τ] \end{matrix}

(7)

where

M \in R^{m \times 2 n} (m < 2 n)

is a projection matrix satisfying that

M g

is invertible. Here,

u_{0}

is an optimal control input that will be designed in the next section. In this section, we aim to design the control

u_{1}

to force the augmented system (5) onto the manifold

{ξ | S (ξ, t) = 0}

in finite time, to remove the effect of matched disturbances. This can be achieved via

\begin{matrix} u_{1} (t) = - X sgn (g^{T} M^{T} S (ξ)) \end{matrix}

(8)

where

X > 0

. The sign function

sgn (\cdot)

is

sgn (α) = \{\begin{matrix} 1, α > 0, \\ 0, α = 0, \\ - 1, α < 0 . \end{matrix}

Next, we introduce the dynamics of the sliding variable and the notion of practical sliding mode, which will be required later. The dynamics of the sliding variable can be obtained by differentiating (7) with respect to time

\begin{matrix} \dot{S} (ξ) = M [\dot{ξ} - (f + g u_{0})] . \end{matrix}

(9)

According to the SMC theory, when the system trajectories reach the manifold, one has

\dot{S} = 0

. Then, by combining (5), (6) and (9), one has an equivalent control

u_{1 e q}

satisfying

\begin{matrix} u_{1 e q} = - d - g^{†} h w - {(M g)}^{- 1} M (I - g g^{†}) h w . \end{matrix}

(10)

Substituting the equivalent control (10) into the augmented system (5), one gets thew dynamics on the sliding manifold

\begin{matrix} \dot{ξ} & = f + g u_{0} + [I - g {(M g)}^{- 1} M] w_{u} \end{matrix}

(11)

where

w_{u} = [I - g g^{†}] h w

. The following result is recalled.

Lemma 1

([18]). The optimal solution to the following optimization problem

\begin{matrix} \underset{M \in R^{m \times 2 n}}{arg min} {∥ [I - g {(M g)}^{- 1} M] w_{u} ∥}_{2} \end{matrix}

is

M = g^{†}

.

Remark 2.

Since

M

is the left pseudoinverse of g, one knows

(1): $M = g^{†}$ can minimize $w_{u e q}$ , which leads to $w_{u e q} = w_{u}$ , refer to [18] for the detailed proof;
(2): The modulation gain associated with $u_{1} (t)$ is minimized, which means that the amplitude of chattering can be reduced;
(3): $M = g^{†}$ avoids amplifying the effect of the unmatched disturbance.

To extend the continuous-time control

u_{1}

in the event-triggered paradigm, we define a virtual control input

μ (t)

satisfying

μ (t) ≜ u (t_{k}), t \in [t_{k}, t_{k + 1})

, where

\begin{matrix} μ (t) = - X sgn (g^{T} M^{T} S (ξ (t_{k}))) \end{matrix}

(12)

and

{t_{k}}_{k = 0}^{\infty}

is a sampling sequence with

t_{k} < t_{k + 1}, k \in N

with

N = {0, 1, 2, \dots}

. Define the following measurement error

\begin{matrix} e_{u_{1}} & = μ (t_{k}) - μ (t) \\ = X sgn (g^{T} M^{T} S (ξ)) - X sgn (g^{T} (ξ_{k}) M^{T} (ξ_{k}) S (ξ_{k})) \end{matrix}

(13)

satisfying the following triggering condition

\begin{matrix} t_{k + 1} = inf {t > t_{k} | X - b_{d} - b_{q^{†}} - ∥ e_{u_{1}} (t) ∥ \leq 0}, t_{0} = 0 . \end{matrix}

(14)

The following theorem establishes a sufficient condition to the reachability of the specified sliding surface (7).

Theorem 1.

Suppose Assumptions 1 and 2 hold. Consider the augmented system (5) with the sliding surface (7) under event-triggered controller (12). Then, practical sliding mode is achieved if the triggering condition satisfies (14).

Proof.

Consider the following Lyapunov function

V_{1} = \frac{1}{2} S^{T} S

. For

t \in [t_{k}, t_{k + 1})

, the time derivative

V_{1}

concerning system (5) with (12) is

\begin{matrix} {\dot{V}}_{1} & = S^{T} \dot{S} \\ = S^{T} M [\dot{ξ} - f - g u_{0}] \\ = S^{T} M [f + g (u_{0} (t_{k}) + u_{1} (t_{k}) + d) + h w - (f + g u_{0} (t_{k}))] \\ = S^{T} M [g (u_{1} (t_{k}) + d + g^{†} h w) + w_{u}] \\ = S^{T} M g [(e_{u_{1}} (t) - X sgn (g^{T} M^{T} S) + d + g^{†} h w] + S^{T} M g {(M g)}^{- 1} M w_{u} \\ \leq ∥ S^{T} M g ∥ [- X + b_{d} + b_{q^{†}} + ∥ e_{u_{1}} (t) ∥] + ∥ S^{T} M g {(M g)}^{- 1} M w_{u} ∥ . \end{matrix}

(15)

Substituting

M = g^{†}

into (15), we have

{\dot{V}}_{1} \leq - ∥ S^{T} ∥ [X - b_{d} - b_{q^{†}} - ∥ e_{u_{1}} (t) ∥] .

Noticing the triggering condition (14), we know that

∥ e_{u_{1}} ∥ \leq X - b_{d} - b_{q^{†}}

holds all the time, then

{\dot{V}}_{1} < 0

for

∥ S ∥ \neq 0

, which implies that the system state

ξ

starting from the initial state

ξ (0)

slide robustly on the switching surface S from initial time

t = 0

and can gradually reach the sliding mode surface

S = 0

. The proof is completed. □

Theorem 2.

The event-triggered rule (14) avoids the Zeno behavior, since a minimal triggering interval is given by

T_{u_{1}} (k) \geq \frac{ν (X - b_{d} - b_{q^{†}})}{n ∥ X ∥ (∥ X ∥ + b_{d} + b_{q^{†}})} .

Proof.

Define

T_{u_{1}} (k) = t_{k + 1} - t_{k}

as the interexecution time for

u_{1}

. Recall that the control

u_{1}

is updated at

t = t_{k}

and

e_{u_{1}} (t_{k}) = 0

. During the event intervals

\begin{matrix} \frac{d ∥ e_{u_{1}} (t) ∥}{d t} & \leq ∥ \frac{d e_{u_{1}} (t)}{d t} ∥ = ∥ \frac{d}{d t} [μ (t) - u_{1} (t)] ∥ \\ = ∥ \frac{d}{d t} [- X sgn (g^{T} (ξ_{k}) M^{T} (ξ_{k}) S (ξ_{k})) \\ + X sgn (g^{T} (ξ) M^{T} S (ξ))] ∥ \\ \leq ∥ \frac{d}{d t} [X sgn (g^{T} (ξ) M^{T} S (ξ))] ∥ . \end{matrix}

(16)

The technique [39] is utilized to approximate the sign function

sgn (S)

. That is,

sgn (S) \approx \tanh (S / ν)

where

ν \geq 1

. Substituting

M = g^{†}

into (16), we have

\begin{matrix} \frac{d | | e_{u_{1}} (t) | |}{d t} & \leq ∥ \frac{d}{d t} [X tanh (g^{T} (ξ) M^{T} (ξ) S)] ∥ \\ = ∥ X ∥ ∥ I_{n} - {tanh}^{2} (g^{T} (ξ) M^{T} (ξ) S) ∥ ∥ \frac{1}{ν} \dot{S} ∥ . \end{matrix}

As

∥ I_{n} - {tanh}^{2} (g^{T} (ξ) M^{T} (ξ) S) ∥ \leq ∥ I_{n} ∥ = n

, using (7), we have

\begin{matrix} \frac{d ∥ e_{u_{1}} (t) ∥}{d t} & \leq (n / ν) ∥ X ∥ ∥ M ∥ ∥ \dot{ξ} - [f + g u_{0}] ∥ \\ \leq (n / ν) ∥ X ∥ ∥ M g ∥ ∥ u_{1} (ξ_{k}) + d (t) + g^{†} k w ∥ \\ \leq (n / ν) ∥ X ∥ (∥ X ∥ + b_{d} + b_{q^{†}}) . \end{matrix}

(17)

After integrating both sides with respect to (17) for

t \in [t_{k}, t_{k + 1})

, we get

\begin{matrix} ∥ e_{u_{1}} ∥ \leq (t - t_{k}) (n / ν) ∥ X ∥ (∥ X ∥ + b_{d} + b_{q^{†}}) . \end{matrix}

(18)

Using the event-triggered rule (14), when

t \in [0, τ_{\infty}),

one has

\begin{matrix} | | e_{u_{1}} | | \geq X - b_{d} - b_{q^{†}} . \end{matrix}

(19)

For

\forall t \in [t_{k}, t_{k + 1})

, there is

t - t_{k} \leq t_{k + 1} - t_{k} = T_{u_{1}} (k)

. Then, combined with (18) and (19), one has

\begin{matrix} T_{u_{1}} (k) \geq \frac{ν (X - b_{d} - b_{q^{†}})}{n ∥ X ∥ (∥ X ∥ + b_{d} + b_{q^{†}})} . \end{matrix}

The proof is completed. □

Next, a dynamically triggered controller

u_{0}

will be designed to suppress unmatched disturbances and guarantee the event-triggered stability of sliding manifold dynamics system (20) with a nearly optimal performance. We shall find the dynamic event-triggered condition and solve the optimal control problem using critic-only NN approximation strategy.

4. DET-Based Optimal Controller Design

The design

u_{0}

comprises two steps. First, find the event-triggered rule to guarantee stability and optimal performance; second, solve the resulting optimal control problem approximately by using ADP and

NN

approximation strategy. With

M = g^{†}

, the sliding mode dynamics (11) can be revised as

\dot{ξ} = f + g u_{0} + \bar{k} w

(20)

where

\bar{k} = (I - g g^{†}) h

.

The discounted cost function

V (ξ)

, which is subject to the above dynamics (20), is defined as

\begin{matrix} V (ξ) = \int_{t}^{\infty} e^{- α (τ - t)} (ξ^{T} Q ξ + u_{0}^{T} R u_{0} - γ^{2} w^{T} w) d τ \end{matrix}

where

α > 0

is a discount factor and

Q = [\begin{matrix} Q & 0_{n \times n} \\ 0_{n \times n} & 0_{n \times n} \end{matrix}]

. Moreover,

Q \in R^{n \times n}

and

R \in R^{m \times m}

are symmetric positive definite matrices to weight the system state and input, respectively. Meanwhile, if

u_{0}

is admissible and

V (V = V (ξ))

is

C^{1}

, the corresponding nonlinear Bellman equation is

ξ^{T} Q ξ + u_{0}^{T} R u_{0} - γ^{2} w^{T} w - α V + \nabla V^{T} (f + g u_{0} + \bar{k} w) = 0 .

(21)

Define the Hamiltonian

\begin{matrix} H (ξ, u_{0}, w, \nabla V) ≜ ξ^{T} Q ξ + u_{0}^{T} R u_{0} - γ^{2} w^{T} w - α V + \nabla V^{T} (f + g u_{0} + \bar{k} w) \end{matrix}

(22)

where

\nabla V = \partial V / \partial ξ

.

According to the zero-sum game [27], we get the optimal cost function

V^{*} (ξ)

by

\begin{matrix} V^{*} (ξ) = min_{u_{0}} max_{w} V (ξ, u_{0}, w) = max_{w} min_{u_{0}} V (ξ, u_{0}, w), \end{matrix}

(23)

which satisfies the

HJI

equation

\begin{matrix} min_{u_{0}} max_{w} H (ξ, u_{0}, w, \nabla V^{*}) = 0 . \end{matrix}

(24)

Consider the stationarity conditions [25]:

\begin{matrix} \{\begin{matrix} \partial H (ξ, u_{0}, w, \nabla V^{*}) / \partial u_{0} = 0, \\ \partial H (ξ, u_{0}, w, \nabla V^{*}) / \partial w = 0 . \end{matrix} \end{matrix}

(25)

Solving the stationary conditions (25), we obtain the optimal control input and worst-case disturbance

\begin{matrix} \{\begin{matrix} u_{0}^{*} = - \frac{1}{2} R^{- 1} g^{T} \nabla V^{*}, \\ w^{*} = \frac{1}{2 γ^{2}} {\bar{k}}^{T} \nabla V^{*} . \end{matrix} \end{matrix}

(26)

Substituting the control input (26) into (22), the

HJI

equation is written as

\begin{matrix} ξ^{T} Q ξ & + \nabla {V^{*}}^{T} f - α V^{*} - \frac{1}{4} \nabla {V^{*}}^{T} g R^{- 1} g^{T} \nabla V^{*} + \frac{1}{4 γ^{2}} \nabla {V^{*}}^{T} {\bar{k}}^{T} \bar{k} \nabla V^{*} = 0 . \end{matrix}

(27)

4.1. Dynamically Triggering Rule for Optimal Input

To propose a dynamically triggering rule, we define again a new sampling sequence

{t_{i}^{'}}_{i = 0}^{\infty}

with

t_{i} < t_{i + 1}, i \in N

and

N = {0, 1, 2, \dots}

. Define the error between the sampled state

ξ_{i}^{'}

and the current state

ξ

as

\begin{matrix} e_{i} = ξ_{i}^{'} - ξ (t), \forall t \in [t_{i}^{'}, t_{i + 1}^{'}), i \in N \end{matrix}

(28)

where

ξ_{i}^{'} {≜ ξ (t) |}_{t = t_{i}^{'}}

. Thus, for

t \in [t_{i}^{'}, t_{i + 1}^{'})

, the sampled-data version of the system (20) can be rewritten as

\begin{matrix} \dot{ξ} (t) = f + g u_{0} (ξ + e_{i}) & + \bar{k} w . \end{matrix}

(29)

Considering the event-based sampling rule, (26) is written as

\begin{matrix} \{\begin{matrix} u_{0}^{*} (ξ_{i}^{'}) = - \frac{1}{2} R^{- 1} g^{T} (ξ_{i}^{'}) \nabla V^{*} (ξ_{i}^{'}) \\ w^{*} (ξ) = \frac{1}{2 γ^{2}} {\bar{k}}^{T} \nabla V^{*} (ξ) \end{matrix} \end{matrix}

(30)

where

\nabla V^{*} (ξ_{i}^{'}) = \partial V^{*} (ξ) {/ \partial ξ |}_{ξ = ξ_{i}^{'}}

.

The

HJI

equation with event-triggered law can be written as

\begin{matrix} ξ^{T} Q ξ + \nabla {V^{*}}^{T} f - α V^{*} + \frac{1}{4} \nabla {V^{*}}^{T} (ξ_{i}^{'}) g (ξ_{i}^{'}) R^{- 1} g^{T} (ξ_{i}^{'}) \nabla V^{*} (ξ_{i}^{'}) \\ - \frac{1}{2} \nabla {V^{*}}^{T} g R^{- 1} g^{T} (ξ_{i}^{'}) \nabla V^{*} (ξ_{i}^{'}) - \frac{1}{4 γ^{2}} \nabla {V^{*}}^{T} {\bar{k}}^{T} \bar{k} \nabla V^{*} = 0 . \end{matrix}

(31)

Now, a necessary assumption is introduced for stability analysis.

Assumption 3

([36]). The optimal controller

u_{0}^{*}

is Lipschitz continuous on Ω,viz. there exists

L > 0

, ∀

ξ, \bar{ξ} \in Ω

that satisfy

∥ u_{0}^{*} (ξ) - u_{0}^{*} (\bar{ξ}) ∥ \leq L |\bar{ξ} - ξ| .

We define an internal dynamic variable

η

evolving according to the following differential equation:

\begin{matrix} \dot{η} = - λ η + ∥ e_{T} ∥^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2}, η (0) \geq 0 \end{matrix}

(32)

where

0 < λ < 1

, and

e_{T} = (1 - ϵ^{2}) λ_{m i n} (Q) {∥ ξ ∥}^{2} .

Here,

η

is designed as a filtered value associated with

∥ e_{T} ∥^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2}

. This new DET technique can avoid

∥ e_{T} ∥^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2}

to be always nonnegative if the following condition is used

\begin{matrix} t_{0}^{'} = 0, \\ t_{i + 1}^{'} = inf {t > t_{i}^{'} | η + θ (∥ e_{T} ∥^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2}) \leq 0} \end{matrix}

(33)

where

θ \in R^{+}

. This is stated in the following result.

Lemma 2.

The dynamic variable η defined by (32), with the event triggered rule (33), is always non-negative.

Proof.

Using the dynamic rule (33), for ∀

t \in [0, + \infty),

one has

\begin{matrix} η + θ (∥ e_{T} ∥^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2}) \geq 0 . \end{matrix}

(34)

First, if

θ = 0,

then

η \geq 0

is true.

Second, if

θ \neq 0,

by combining (32) and (34), one can deduce the following relation:

\begin{matrix} \dot{η} + λ η = ∥ e_{T} ∥^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} \geq - \frac{η}{θ}, η (0) \geq 0 . \end{matrix}

(35)

By the comparison lemma, for ∀

t \in [0, + \infty),

one gets

\begin{matrix} \dot{η} (t) \geq η (0) e^{- (λ + \frac{1}{θ}) t} . \end{matrix}

(36)

Since the initial condition

η (0) \geq 0

and

e^{- (λ + \frac{1}{θ}) t} > 0

for ∀

t \in [0, + \infty)

, so one can obtain

η \geq 0 .

The proof is completed. □

Two main stability results for

u_{0}^{*}

follow.

Theorem 3.

Considering the sliding mode dynamics (20) with the optimal cost

V^{*}

and (30). Let Assumptions 1–3 hold. The tracking error

e_{d}

and the closed-loop system (29) achieve UUB via DET (33).

Proof.

Choose

W_{1} = V^{*} + η

as the Lyapunov function for

t \in [t_{i}, t_{i + 1})

. With (30), taking the derivative of the Lyapunov function along the trajectory of (29) gives

\begin{matrix} {\dot{W}}_{1} & = {\dot{V}}^{*} + \dot{η} \\ = {(\nabla V^{*})}^{T} (f + g u_{0}^{*} (ξ_{i}^{'}) + \bar{k} w^{*}) + \dot{η} . \end{matrix}

(37)

Note that (26) implies that

\begin{matrix} \{\begin{matrix} {\nabla V^{*}}^{T} g = - 2 R {u_{0}^{*}}^{T}, \\ {\nabla V^{*}}^{T} = 2 γ^{2} \bar{k} {w^{*}}^{T} . \end{matrix} \end{matrix}

(38)

According to the time-triggered

HJI

(27), one has

\begin{matrix} \nabla {V^{*}}^{T} f & = - ξ^{T} Q ξ - {u^{*}}_{0}^{T} R u_{0}^{*} + γ^{2} w^{*} T w^{*} + α V^{*} - \nabla {V^{*}}^{T} g u_{0}^{*} - \nabla {V^{*}}^{T} \bar{k} w^{*}, \end{matrix}

(39)

Thus, based on (38) and (39), we have

\begin{matrix} {\dot{V}}^{*} & = \nabla {V^{*}}^{T} (f + g u_{0}^{*} (ξ_{i}^{'}) + \bar{k} w^{*}) \\ = - ξ^{T} Q ξ - {u_{0}^{*}}^{T} R u_{0}^{*} + γ^{2} w^{*} T w^{*} + α V^{*} - \nabla {V^{*}}^{T} g u_{0}^{*} - \nabla {V^{*}}^{T} \bar{k} w^{*} \\ + \nabla {V^{*}}^{T} g u_{0}^{*} (ξ_{i}^{'}) + \nabla {V^{*}}^{T} \bar{k} w^{*} \\ = R ∥ u_{0}^{*} (ξ (t_{i}^{'})) - u_{0}^{*} ∥^{2} + γ^{2} {w^{*}}^{T} w^{*} - {u_{0}^{*}}^{T} (ξ_{i}^{'}) R u_{0}^{*} (ξ_{i}^{'}) - ξ^{T} Q ξ + α V^{*} \\ \leq - ξ^{T} Q ξ + 2 ∥ R ∥ L^{2} ∥ e_{i} ∥^{2} + α V^{*} - ∥ R ∥ | u_{0}^{*} (ξ_{i}^{'}) ∥^{2} + γ^{2} {∥ w^{*} ∥}^{2} \\ \leq - ϵ^{2} λ_{m i n} {(Q) ∥ ξ ∥}^{2} + γ^{2} ∥ w^{*} ∥^{2} + 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} + α V^{*} - (1 - ϵ^{2}) λ_{m i n} (Q) {∥ ξ ∥}^{2} . \end{matrix}

(40)

Since

V^{*}

is continuously differentiable on

Ω

, one can conclude that both

V^{*}

and its derivative

V_{ξ}^{*}

are bounded on

Ω

. Here, we have

max {∥ V^{*} ∥, ∥ \nabla V^{*} ∥} \leq b_{v^{*}}

, where

b_{v^{*}} \in R^{+}

is a constant. Recalling the triggering condition (33), one obtains

\begin{matrix} {\dot{W}}_{1} & = {\dot{V}}^{*} + \dot{η} \\ = - ϵ^{2} λ_{m i n} {(Q) ∥ ξ ∥}^{2} + 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} - (1 - ϵ^{2}) λ_{m i n} (Q) {∥ ξ ∥}^{2} + γ^{2} {∥ w^{*} ∥}^{2} \\ - λ η + (1 - ϵ^{2}) λ_{m i n} (Q) {∥ ξ ∥}^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} + α V^{*} \\ \leq - ϵ^{2} λ_{m i n} (Q) {∥ ξ ∥}^{2} + Λ \end{matrix}

(41)

where

Λ = α b_{v^{*}} + \frac{1}{4 γ^{2}} b_{k}^{2} b_{v^{*}}^{2}

and

∥ \bar{k} ∥ \leq b_{k} (b_{k} \in R^{+})

.

Thus, we have

{\dot{W}}_{1} < 0

once

∥ ξ ∥ > \sqrt{\frac{Λ}{ϵ^{2} λ_{m i n} (Q)}}

is established.

Accordingly, the UUB of the closed-loop system is ensured. The proof is completed. □

Next, we prove that the dynamic triggering rule (33) avoids the Zeno behavior.

Assumption 4

([36]).

f + g u^{*} (ξ_{i}^{'})

is Lipschitz continuous with Lipschitz constants

L_{f}

and

L_{g}

, for all ξ and

ξ_{i}^{'}

, one has

\begin{matrix} ∥ f + g u^{*} (ξ_{i}^{'}) ∥ & = ∥ f + g u^{*} (ξ + e_{i}) ∥ \\ \leq L_{f} ∥ ξ ∥ + L_{g} ∥ e_{i} ∥ . \end{matrix}

(42)

Theorem 4.

Let Assumptions 1 and 4 hold. The dynamic triggering rule (33) avoids the Zeno behavior, and a minimal triggering interval is given by

\begin{matrix} T_{u_{0}} (i) & \geq \frac{1}{L_{g} + L_{f}} ln (\frac{\sqrt{η / θ + ∥ e_{T} ∥^{2}} (L_{g} + L_{f})}{\sqrt{2 L^{2} ∥ R ∥} (L_{f} ∥ ξ (t_{i}^{'}) ∥ + b_{k} b_{w})} + 1) . \end{matrix}

Proof.

Define

T_{u_{0}} (i) = t_{i + 1}^{'} - t_{i}^{'}

as the interexecution time for

u_{0}

. The control

u_{0}

is updated at

t = t_{i}^{'}

and

{\dot{e}}_{i} = 0

. During the event intervals

\begin{matrix} ∥ \dot{ξ} ∥ & = ∥ f + g u_{0}^{*} (ξ_{i}) + \bar{k} w^{*} ∥ \\ = ∥ f + g u_{0}^{*} (ξ_{i}^{'}) ∥ + ∥ \bar{k} w^{*} ∥ \\ \leq L_{f} ∥ ξ ∥ + L_{g} ∥ e_{i} ∥ + b_{k} b_{w} . \end{matrix}

(43)

For

t \in [t_{i}^{'}, t_{i + 1}^{'})

, we have

ξ (t) = ξ (t_{i}^{'}) - e_{i}

and

\dot{ξ} = - {\dot{e}}_{i}

. From (43), we have

\begin{matrix} ∥ {\dot{e}}_{i} ∥ \leq L_{f} ∥ ξ (t_{i}^{'}) - e_{i} ∥ + L_{g} ∥ e_{i} ∥ + b_{k} b_{w} . \end{matrix}

Using the comparison lemma, we have

\begin{matrix} ∥ e_{i} ∥ \leq \frac{L_{f} ∥ ξ (t_{i}^{'}) ∥ + b_{k} b_{w}}{L_{g} + L_{f}} (e^{(L_{g} + L_{f}) (t - t_{i})} - 1) . \end{matrix}

Recall the triggering condition in (33) associated with

\begin{matrix} η + θ (∥ e_{T} ∥^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2}) \leq 0, \end{matrix}

(44)

and resulting in

\begin{matrix} | | e_{i} {| |}^{2} \geq \frac{η / θ + ∥ e_{T} ∥^{2}}{2 L^{2} ∥ R ∥} . \end{matrix}

For

\forall t \in [t_{i}^{'}, t_{i + 1}^{'})

, one has

t - t_{i}^{'} \leq T_{u_{0}}

, so

\begin{matrix} T_{u_{0}} (i) & \geq \frac{1}{L_{g} + L_{f}} ln (\frac{\sqrt{η / θ + ∥ e_{T} ∥^{2}} (L_{g} + L_{f})}{\sqrt{2 L^{2} ∥ R ∥} (L_{f} ∥ ξ (t_{i}^{'}) ∥ + b_{k} b_{w})} + 1), \end{matrix}

where

η, θ

are positive for all

t > 0

. The proof is completed. □

4.2. Dynamically Triggered ADP with Single Critic NN

The dynamic event-triggered optimal control

u_{0}^{*}

has been analyzed before. In the sequence, to guarantee UUB of sliding-mode dynamics (20), approximate optimal solution

V^{*}

is to obtain by a critic-only NN approximation structure in the following part using reinforcement learning method.

Next, the solution

V^{*}

of HJI (31) is approximated using NN using Weierstrass high-order approximation theorem:

\begin{matrix} V^{*} = ω_{c}^{T} ϕ + ε \end{matrix}

(45)

where

ω_{c} \in R^{l}

is the unknown ideal weight vector,

ϕ \in R^{l}

is the activation function vector, l is the number of hidden neurons, and

ε \in R

is the critic NN approximation error.

The gradient of

V^{*}

is

\begin{matrix} \nabla V^{*} = \nabla ϕ^{T} ω_{c} + \nabla ε \end{matrix}

(46)

where

\nabla ϕ^{T} = \partial ϕ / \partial ξ

and

\nabla ε = \partial ε / \partial ξ

.

According to (26), (30), and (46), it is easy to obtain

\begin{matrix} \{\begin{matrix} u_{0}^{*} (ξ_{i}^{'}) & = - \frac{1}{2} R^{- 1} g^{T} (ξ_{i}^{'}) (\nabla ϕ^{T} (ξ_{i}^{'}) ω_{c} + \nabla ε (ξ_{i}^{'})), \\ w^{*} & = \frac{1}{2 γ^{2}} {\bar{k}}^{T} (\nabla ϕ^{T} ω_{c} + \nabla ε) . \end{matrix} \end{matrix}

(47)

Because the ideal weights

ω_{c}

are unknown, we first define the approximate expression of the optimal cost function

V^{*}

as

\begin{matrix} {\hat{V}}^{*} = {\hat{ω}}_{c}^{T} ϕ \end{matrix}

(48)

where

{\hat{ω}}_{c} \in R^{l}

is the estimation of

ω_{c}

.

According to (26) and (30), the approximate expression of (26) are

\begin{matrix} \{\begin{matrix} {\hat{u}}_{0} (ξ_{i}^{'}) = - \frac{1}{2} R^{- 1} g^{T} (ξ_{i}^{'}) \nabla ϕ^{T} (ξ_{i}^{'}) {\hat{ω}}_{c}, \\ \hat{w} = \frac{1}{2 γ^{2}} {\bar{k}}^{T} \nabla ϕ^{T} {\hat{ω}}_{c} . \end{matrix} \end{matrix}

(49)

Hence, the event-triggered ISMC becomes

\begin{matrix} u (t) = - \frac{1}{2} R^{- 1} g^{T} (ξ_{i}^{'}) \nabla ϕ^{T} (ξ_{i}^{'}) {\hat{ω}}_{c} - X sgn (g^{T} (ξ_{k}) M^{T} (ξ_{k}) s (t_{k})) . \end{matrix}

(50)

Then, the approximation of the Hamiltonian is

\begin{matrix} \hat{H} (ξ, {\hat{u}}_{0} (ξ_{i}^{'}), \hat{w}, \hat{V}) & ≜ ξ^{T} Q ξ + {\hat{u}}_{0}^{T} (ξ_{i}^{'}) R {\hat{u}}_{0} (ξ_{i}^{'}) - γ^{2} {\hat{w}}^{T} \hat{w} - α \hat{V} \\ + {\hat{ω}}_{c}^{T} \nabla ϕ (ξ) (f + g {\hat{u}}_{0} (ξ_{i}^{'}) + \bar{k} \hat{w}) . \end{matrix}

(51)

Owing to

H (ξ, u_{0}^{*} (ξ_{i}^{'}), w^{*}, V^{*}) = 0

, we have

\begin{matrix} \hat{H} (ξ, {\hat{u}}_{0} (ξ_{i}^{'}), \hat{w}, \hat{V}) - H (ξ, u_{0}^{*} (ξ_{i}^{'}), w^{*}, V^{*}) & = ℓ + {\hat{ω}}_{c}^{T} χ \\ ≜ e_{H} \end{matrix}

(52)

where

ℓ = ξ^{T} Q ξ + {\hat{u}}_{0}^{T} (ξ_{i}^{'}) R {\hat{u}}_{0} (ξ_{i}^{'}) - γ^{2} {\hat{w}}^{T} \hat{w} - α \hat{V} and χ = \nabla ϕ (ξ) (f + g {\hat{u}}_{0} (ξ_{i}^{'}) + \bar{k} \hat{w})

.

Subsequently, a feasible training method is proposed by minimizing the residual error

E = (1 / 2) e_{H}^{2}

, combining codesign of the gradient descent technique and the experience replay (ER) technique, that is

\begin{matrix} {\dot{\hat{ω}}}_{c} & = - ζ \frac{χ}{{(1 + χ^{T} χ)}^{2}} (χ^{T} {\hat{ω}}_{c} + ℓ) - ζ \sum_{j = 1}^{l} \frac{χ_{j}}{{(1 + χ_{j}^{T} χ_{j})}^{2}} (χ_{j}^{T} {\hat{ω}}_{c} + ℓ_{j}), \end{matrix}

(53)

where

ζ > 0

is the adaptive learning rate,

κ = {(κ_{1}, . . ., κ_{l})}^{T}

,

\begin{matrix} χ_{j} = \nabla ϕ (ξ_{j}^{'}) (f (ξ_{j}^{'}) + g (ξ_{j}^{'}) {\hat{u}}_{0} (ξ_{j}^{'}) + \bar{k} (ξ_{j}^{'}) \hat{ω} (ξ_{j}^{'})), \end{matrix}

for

j \in {1, 2, \dots, l}

. The term

\frac{1}{{(1 + χ^{T} χ)}^{2}}

is used for the purpose of normalization.

Based on the weight estimation error

{\tilde{ω}}_{c} = ω_{c} - {\hat{ω}}_{c}

, together with

{\dot{\tilde{ω}}}_{c} = - {\dot{\hat{ω}}}_{c}

, one has

\begin{matrix} {\dot{\tilde{ω}}}_{c} & = ζ \frac{χ}{{(1 + χ^{T} χ)}^{2}} ε_{H} + ζ \sum_{j = 1}^{l} \frac{ζ χ_{i}}{{(1 + χ_{i}^{T} χ_{i})}^{2}} ε_{H_{i}} \\ - (ζ \frac{χ χ^{T}}{{(1 + χ^{T} χ)}^{2}} + ζ \sum_{j = 1}^{l} \frac{χ_{j} χ_{j}^{T}}{{(1 + χ_{j}^{T} χ_{j})}^{2}}) {\tilde{ω}}_{c} \end{matrix}

(54)

where

ε_{H} = - \nabla ε^{T} (f + g {\hat{u}}_{0} (ξ_{i}^{'}) + \bar{k} \hat{ω})

, and

ε_{H_{j}} = \nabla ε (ξ_{j}^{'}) (f (ξ_{j}^{'}) + g (ξ_{j}^{'}) {\hat{u}}_{0} (ξ_{j}^{'}) + \bar{k} (ξ_{j}^{'}) \hat{ω}) .

4.3. Stability Analysis

According to the critic-only NN strategy presented above, the ISMC can be obtained by solving the HJI Equation (31) approximately by DET way. Thus, the weight estimation error and the tracking error are proven to be UUB by the Lyapunov function in this section. To this end, the assumption 5 is necessary as follows for the following analysis.

Assumption 5

([36]). The NN approximation error and activation functions corresponding to their gradients are bounded, i.e.,

∥ ε (ξ) ∥ \leq b_{ε}

and

∥ \nabla ε (ξ) ∥ \leq b_{ε_{ξ}}

and

∥ ϕ (ξ) ∥ \leq b_{ϕ}

and

∥ \nabla ϕ (ξ) ∥ \leq b_{ϕ_{ξ}}

.

In the sequence, an important theorem will be emerged to guarantee that the weight estimation error and the tracking error are UUB under dynamic event-triggered condition (33) and controller (49).

Theorem 5.

Considering the sliding-mode dynamics (20) with (49), let Assumptions 1–5 hold. The tracking error and the critic weight estimation error areUUBvia DET (33).

Proof.

We will discuss two cases, i.e., the continuous dynamics and the jump dynamics. For ∀

t \in [t_{i}, t_{i + 1})

, define the Lyapunov function

L

as

\begin{matrix} L & = L_{ξ} + L_{ξ_{i}^{'}} + L_{0} + η \\ = V^{*} (ξ) + V^{*} (ξ_{i}^{'}) + \frac{ζ^{- 1}}{2} t r \{{\tilde{w}}_{c}^{T} {\tilde{w}}_{c}\} + η . \end{matrix}

(55)

(1): Event is not triggered:

Obviously, for ∀

t \in [t_{i}^{'}, t_{i + 1}^{'})

, we have

{\dot{V}}^{*} (ξ_{i}^{'}) = 0 .

Then,

\begin{matrix} \dot{L} = {\dot{L}}_{ξ} + {\dot{L}}_{0} + \dot{η} . \end{matrix}

Taking the derivative of

L_{ξ}

with (38) and (39) along the trajectory

\dot{ξ} = f + g {\hat{u}}_{0} (ξ_{i}^{'}) + \bar{k} \hat{w}

, we have

\begin{matrix} {\dot{L}}_{ξ} & = {\dot{V}}^{*} (ξ) = \nabla {V^{*}}^{T} [f + g {\hat{u}}_{0} (ξ_{i}^{'}) + \bar{k} \hat{w}] \\ = - ξ^{T} Q ξ - {u^{*}}_{0}^{T} R u_{0}^{*} + γ^{2} w^{*} T w^{*} + α V^{*} + 2 R {u^{*}}_{0}^{T} u_{0}^{*} - 2 γ^{2} w^{*} T w^{*} \\ - 2 R {u^{*}}_{0}^{T} {\hat{u}}_{0} (ξ_{i}^{'}) + 2 γ^{2} w^{*} T \hat{w} \\ = - ξ^{T} Q ξ + {u^{*}}_{0}^{T} R u_{0}^{*} - 2 R {u^{*}}_{0}^{T} {\hat{u}}_{0}^{*} (ξ_{i}^{'}) + {\hat{u}}_{0}^{T} (ξ_{i}^{'}) R {\hat{u}}_{0} (ξ_{i}^{'}) + α V^{*} - γ^{2} w^{*} T w^{*} \\ + 2 γ^{2} w^{*} T \hat{w} - {\hat{u}}_{0}^{T} (ξ_{i}^{'}) R {\hat{u}}_{0}^{*} (ξ_{i}^{'}) \\ = - ξ^{T} Q ξ + R | | u_{0}^{*} - {\hat{u}}_{0} (ξ_{i}^{'}) {| |}^{2} + α V^{*} - γ^{2} w^{*} T w^{*} + 2 γ^{2} w^{*} T \hat{w} - {\hat{u}}_{0}^{T} (ξ_{i}^{'}) R {\hat{u}}_{0} (ξ_{i}^{'}) \\ \leq - λ_{m i n} {(Q) ∥ ξ ∥}^{2} + α V^{*} + ∥ R ∥ ∥ u_{0}^{*} - {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} - ∥ R ∥ ∥ {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} \\ + 2 γ^{2} w^{*} T (\hat{w} - w^{*}) + γ^{2} {∥ w^{*} ∥}^{2} \\ \leq - λ_{m i n} {(Q) ∥ ξ ∥}^{2} + α V^{*} + ∥ R ∥ ∥ u_{0}^{*} - {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} - ∥ R ∥ ∥ {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} \\ + 2 γ^{2} ∥ w^{*} T ∥ ∥ (\hat{w} - w^{*}) ∥ + γ^{2} {∥ w^{*} ∥}^{2} \\ \leq - λ_{m i n} {(Q) ∥ ξ ∥}^{2} + α V^{*} + ∥ R ∥ ∥ u_{0}^{*} - {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} - ∥ R ∥ ∥ {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} \\ + γ^{2} ∥ (\hat{w} - w^{*}) ∥ + 2 γ^{2} {∥ w^{*} ∥}^{2} . \end{matrix}

From (47) and (49), we have

\begin{matrix} ∥ u_{0}^{*} - {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} \\ = ∥ u_{0}^{*} - u_{0}^{*} (ξ_{i}^{'}) + u_{0}^{*} (ξ_{i}^{'}) - {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} \\ \leq 2 ∥ u_{0}^{*} - u_{0}^{*} (ξ_{i}^{'}) ∥^{2} + 2 {∥ u_{0}^{*} (ξ_{i}^{'}) - {\hat{u}}_{0} (ξ_{i}^{'}) ∥}^{2} \\ = 2 ∥ u_{0}^{*} - u_{0}^{*} (ξ_{i}^{'}) ∥^{2} + {∥ R^{- 1} g^{T} (ξ_{i}^{'}) [\nabla ϕ^{T} (ξ_{i}^{'}) ({\hat{w}}_{c} - w_{c}) + \nabla ε (ξ_{i}^{'})] ∥}^{2} \\ = 2 | | u_{0}^{*} - u_{0}^{*} (ξ_{i}^{'}) {| |}^{2} + {∥ R^{- 1} g^{T} (ξ_{i}^{'}) [\nabla ϕ^{T} (ξ_{i}^{'}) {\tilde{w}}_{c} + \nabla ε (ξ_{i}^{'})] ∥}^{2} \\ \leq 2 L^{2} ∥ e_{i} ∥^{2} + 2 {(∥ R ∥)}^{- 2} b_{q}^{2} (∥ \nabla ϕ {\tilde{w}}_{c} ∥^{2} + {∥ \nabla ε ∥}^{2}), \end{matrix}

and

\begin{matrix} ∥ (\hat{w} - w^{*}) ∥ \\ = \frac{1}{2 γ^{2}} ∥ k^{T} (ξ) ∥ ∥ [\nabla ϕ^{T} (ξ) ({\hat{ω}}_{c} - ω_{c}) - \nabla ε (ξ)] ∥ \\ = \frac{1}{2 γ^{2}} ∥ k^{T} (ξ) ∥ ∥ [\nabla ϕ^{T} (ξ) {\tilde{w}}_{c} - \nabla ε (ξ)] ∥ \\ \leq \frac{b_{k}}{2 γ^{2}} (∥ \nabla ϕ {\tilde{w}}_{c} ∥^{2} + {∥ \nabla ε ∥}^{2}) . \end{matrix}

Then, we further obtain

\begin{matrix} {\dot{L}}_{ξ} & \leq - λ_{m i n} {(Q) ∥ ξ ∥}^{2} - ∥ R ∥ ∥ {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} + 2 γ^{2} {∥ w^{*} ∥}^{2} \\ + 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} + {2 ∥ R ∥}^{- 1} b_{q}^{2} (∥ \nabla ϕ {\tilde{w}}_{c} ∥^{2} + {∥ \nabla ε ∥}^{2}) \\ + \frac{b_{k}}{2 γ^{2}} (∥ \nabla ϕ {\tilde{w}}_{c} ∥^{2} + {∥ \nabla ε ∥}^{2}) + α V^{*} \end{matrix}

(56)

and

\begin{matrix} {\dot{L}}_{0} & = - ζ {\tilde{w}}_{c}^{T} [\frac{χ χ^{T}}{{(1 + χ^{T} χ)}^{2}} + \sum_{j = 1}^{l} \frac{χ_{j} χ_{j}^{T}}{{(1 + χ^{T} χ_{j})}^{2}}] {\tilde{w}}_{c} \\ + ζ {\tilde{w}}_{c}^{T} [\frac{χ}{{(1 + χ^{T} χ)}^{2}} ε_{H} + \sum_{j = 1}^{l} \frac{χ_{j}}{{(1 + χ_{j}^{T} χ_{j})}^{2}} ε_{H_{j}}] . \end{matrix}

(57)

Because of

\begin{matrix} {\tilde{w}}_{c}^{T} \frac{χ}{{(1 + χ^{T} χ)}^{2}} ε_{H} = \frac{1}{{(1 + χ^{T} χ)}^{2}} {\tilde{w}}_{c}^{T} χ ε_{H} \\ \leq \frac{1}{2 {(1 + χ^{T} χ)}^{2}} ({\tilde{w}}_{c}^{T} χ χ^{T} {\tilde{w}}_{c} + ε_{H} ε_{H}^{T}) \\ \leq \frac{1}{2} ({\tilde{w}}_{c}^{T} χ χ^{T} {\tilde{w}}_{c} + ε_{H} ε_{H}^{T}), \end{matrix}

(58)

and

\begin{matrix} {\tilde{w}}_{c}^{T} \sum_{j = 1}^{l} \frac{1}{{(1 + χ_{j}^{T} χ_{j})}^{2}} ε_{H_{i}} = \sum_{j = 1}^{l} \frac{1}{{(1 + χ_{j}^{T} χ_{j})}^{2}} {\tilde{w}}_{c}^{T} χ_{j} ε_{H_{j}} \\ \leq \sum_{j = 1}^{l} \frac{1}{2 {(1 + χ_{j}^{T} χ_{j})}^{2}} ({\tilde{w}}_{c}^{T} χ_{j} χ_{j}^{T} {\tilde{w}}_{c} + ε_{H_{j}} ε_{H_{j}}^{T}) \\ \leq \frac{1}{2} \sum_{j = 1}^{l} {\tilde{w}}_{c}^{T} χ_{j} χ_{j}^{T} {\tilde{w}}_{c} + \frac{1}{2} \sum_{j = 1}^{l} ε_{H_{j}} ε_{H_{j}}^{T} . \end{matrix}

(59)

Using (57)–(59), it is easy to obtain

\begin{matrix} {\dot{L}}_{0} \leq - \frac{1}{2} λ_{m i n} (Ψ) {∥ {\tilde{w}}_{c} ∥}^{2} + \frac{l + 1}{2} b_{ε_{H}}^{2} \end{matrix}

(60)

where

Ψ = \frac{χ χ^{T}}{{(1 + χ^{T} χ)}^{2}} + \sum_{j = 1}^{l} \frac{χ_{j} χ_{j}^{T}}{{(1 + χ^{T} χ_{j})}^{2}}

.

With the aid of (56) and (60), we obtain

\begin{matrix} \dot{L} & \leq - ϵ^{2} λ_{m i n} {(Q) ∥ ξ ∥}^{2} - (1 - ϵ^{2}) λ_{m i n} (Q) {∥ ξ ∥}^{2} \\ - ∥ R ∥ ∥ {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} + 2 γ^{2} ∥ w^{*} ∥^{2} + 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} \\ + {2 ∥ R ∥}^{- 1} {∥ g ∥}^{2} (∥ \nabla ϕ {\tilde{w}}_{c} ∥^{2} + {∥ \nabla ε ∥}^{2}) + α V^{*} \\ + \frac{b_{k}}{2 γ^{2}} (∥ \nabla ϕ {\tilde{w}}_{c} ∥^{2} + {∥ \nabla ε ∥}^{2}) - \frac{1}{2} λ_{m i n} (Ψ) {∥ {\tilde{w}}_{c} ∥}^{2} + \frac{l + 1}{2} b_{ε_{H}}^{2} + {\dot{η}}_{2} \\ \leq - ϵ^{2} λ_{m i n} {(Q) ∥ ξ ∥}^{2} + 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} - ∥ R ∥ ∥ {\hat{u}}_{0} (ξ_{i}^{'}) ∥^{2} + 2 γ^{2} {∥ w^{*} ∥}^{2} + α V^{*} \\ - (1 - ϵ^{2}) λ_{m i n} (Q) {∥ ξ ∥}^{2} - (\frac{1}{2} λ_{m i n} (Ψ) - {2 ∥ R ∥}^{- 1} b_{q}^{2} \nabla ϕ^{2} - \frac{b_{k}}{2 γ^{2}}) ∥ {\tilde{w}}_{c} ∥^{2} \\ + {(2 ∥ R ∥}^{- 1} b_{q}^{2} + \frac{b_{k}}{2 γ^{2}}) b_{ε_{ξ}}^{2} + \frac{l + 1}{2} b_{ε_{H}}^{2} + {\dot{η}}_{2} . \end{matrix}

Recalling the triggering condition (33), one obtains

\begin{matrix} \dot{L} & \leq - ϵ^{2} λ_{m i n} {(Q) ∥ ξ ∥}^{2} + 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} \\ - (1 - ϵ^{2}) λ_{m i n} (Q) {∥ ξ ∥}^{2} + α b_{v^{*}} - λ η \\ + (1 - ϵ^{2}) λ_{m i n} (Q) {∥ ξ ∥}^{2} - 2 L^{2} ∥ R ∥ ∥ e_{i} ∥^{2} \\ - (\frac{1}{2} λ_{m i n} (Ψ) - {2 ∥ R ∥}^{- 1} b_{q}^{2} \nabla ϕ^{2} - \frac{b_{k}}{2 γ^{2}}) ∥ {\tilde{w}}_{c} ∥^{2} \\ + \frac{l + 1}{2} b_{ε_{H}}^{2} + {(2 ∥ R ∥}^{- 1} b_{q}^{2} + \frac{b_{k}}{2 γ^{2}}) b_{ε_{ξ}}^{2} + \frac{1}{2 γ^{2}} b_{k}^{2} b_{v^{*}}^{2} \\ \leq - ϵ^{2} λ_{m i n} {(Q) ∥ ξ ∥}^{2} + \frac{l + 1}{2} b_{ε_{H}}^{2} + Π + Γ {∥ {\tilde{w}}_{c} ∥}^{2} \end{matrix}

(61)

where

\begin{matrix} Π & = α b_{v^{*}} + {(2 ∥ R ∥}^{- 1} b_{q}^{2} + \frac{b_{k}}{2 γ^{2}}) b_{ε_{ξ}}^{2} + \frac{1}{2 γ^{2}} b_{k}^{2} b_{v^{*}}^{2}, \\ Γ & = - (\frac{1}{2} λ_{m i n} (Ψ) - 2 ∥ R ∥^{- 1} b_{q}^{2} \nabla ϕ^{2} - \frac{b_{k}}{2 γ^{2}}) . \end{matrix}

Hence, (61) implies

\dot{L} < 0

only if one of the following inequalities holds

\begin{matrix} ∥ ξ ∥ > \sqrt{\frac{Π}{ϵ^{2} λ_{m i n} (Q)}} and ∥ {\tilde{w}}_{c} ∥ > \sqrt{\frac{\frac{l + 1}{2} b_{ε_{H}}^{2}}{Γ}} . \end{matrix}

(62)

(2): On the triggering instant:

At the trigger moments

t = t_{i}^{'}

,

\dot{L}

is

\begin{matrix} {\dot{L}}^{+} (t_{i}^{'}) & = V^{*} (ξ^{+}) - V^{*} (ξ_{i}^{'}) + V^{*} ({\hat{ξ}}_{i + 1}^{'}) \\ - V^{*} ({\hat{ξ}}_{i}^{'}) + V^{*} ({\tilde{w}}_{c}^{+}) - V^{*} ({\tilde{w}}_{c} ({\hat{ξ}}_{i}^{'})) \end{matrix}

where

V^{*} ({\tilde{w}}_{c}^{+}) = {lim}_{h \to 0^{+}} V^{*} ({\tilde{w}}_{c}) (t_{i}^{'} + h)

,

V^{*} ({\tilde{w}}_{c}^{-}) = {lim}_{h \to 0^{-}} V^{*} ({\tilde{w}}_{c}) (t_{i}^{'} + h)

. From (56) and (57), we know

V^{*} (ξ^{+}) - V^{*} (ξ_{i}^{'}) < 0

and

V^{*} ({\tilde{w}}_{c}^{+}) - V^{*} ({\tilde{w}}_{c} ({\hat{ξ}}_{i}^{'})) < 0

. So, we can get

V^{*} ({\hat{ξ}}_{i + 1}^{'}) - V^{*} ({\hat{ξ}}_{i}^{'}) < 0

, so

\dot{L} < 0

.

Finally, UUB of the tracking error and of the weight approximation error are concluded. The proof is completed. □

4.4. Algorithm Design of the Event-Triggered ISM Optimal Tracking Control

In the framework of all assumptions holding in this article, the following Algorithm 1 is to show the procedures of the event-triggered ISM optimal tracking control.

Algorithm 1 Event-Triggered ISM Optimal Tracking Control.

Input: initial states of the sliding mode dynamics (11)

1: Select an initial admissible policy

u_{0}^{0} (X), ω^{0} (X)

and a proper small scalar

ϵ > 0 .

\begin{matrix} {\dot{\hat{ω}}}_{c} & = - ζ \frac{χ}{{(1 + χ^{T} χ)}^{2}} (χ^{T} {\hat{ω}}_{c} + ℓ) - ζ \sum_{j = 1}^{l} \frac{χ_{j}}{{(1 + χ_{j}^{T} χ_{j})}^{2}} (χ_{j}^{T} {\hat{ω}}_{c} + ℓ_{j}) . \end{matrix}

while the optimal continuous control law with input constraints is approximated as

\begin{matrix} {\hat{u}}_{0} (ξ_{i}^{'}) = - \frac{1}{2} R^{- 1} g^{T} (ξ_{i}^{'}) \nabla ϕ^{T} (ξ_{i}^{'}) {\hat{ω}}_{c} . \end{matrix}

2: To tackle the uncertainty affecting the nonlinear system (5), an integral type sliding surface is designed as

\begin{matrix} S (ξ) = M [ξ - ξ_{0} - \int_{0}^{t} (f + g u_{0}) d τ] . \end{matrix}

3: To extend the continuous-time control

u_{1}

in the event-triggered paradigm, we define a virtual control input

μ (t)

satisfying

μ (t) ≜ u (t_{k}), t \in [t_{k}, t_{k + 1})

, where

\begin{matrix} μ (t) = - X sgn (g^{T} M^{T} S (ξ (t_{k}))) \end{matrix}

4: Hence, the event-triggered ISMC becomes

\begin{matrix} u (t) = - \frac{1}{2} R^{- 1} g^{T} (ξ_{i}^{'}) \nabla ϕ^{T} (ξ_{i}^{'}) {\hat{ω}}_{c} - X sgn (g^{T} (ξ_{k}) M^{T} (ξ_{k}) s (t_{k})) . \end{matrix}

Remark 3.

An optimal tracking composite controller

u = u_{0} + u_{1}

, subject to two different dynamic event-triggered conditions is presented in this paper. Subsequent numerical experiments (cf. Table 1 and Table 2 and Figure 1 and Figure 2, respectively) show that the proposed new algorithm not only can reduce the communication burden but also improve the speed of convergence.

5. Simulation

In this section, a nonlinear single link robot arm is considered to verify the effectiveness of the proposed algorithm.

Consider the following system dynamics [39]

\begin{matrix} \ddot{θ} = - \frac{M g l}{\tilde{G}} sin (θ (t)) - \frac{D}{\tilde{G}} \dot{θ} + \frac{1}{\tilde{G}} u (t) + \bar{h} w (t) . \end{matrix}

(63)

where

θ (t)

is the angle position of robot arm, and

u (t)

is the control input. Moreover, M is the mass of the payload,

\tilde{G}

is the moment of inertia, g is the acceleration of gravity, l is the length of the arm, and D is the viscous friction, where

g, l, D

are the system parameters and

M, \tilde{G}

are the design parameters. Set the values of the system parameters as

g = 9.81, D = 1

, and

l = 1

, and the design parameters M and

\tilde{G}

are alterable. Assuming

x_{1} (t) = θ (t)

and

x_{2} (t) = \dot{θ} (t)

. Herein, consider the effect of interference on the actuator, so the dynamics by selecting the system parameters can be written as

\begin{matrix} [\begin{matrix} \dot{x_{1}} \\ \dot{x_{2}} \end{matrix}] & = [\begin{matrix} x_{2} \\ - 4 x_{1} - 0.5 x_{2} \end{matrix}] + [\begin{matrix} 0 \\ 0.04 \end{matrix}] (u (t) + d (t)) \\ + [\begin{matrix} - 0.2 sin (t) + 2 x_{2} \\ - 0.2 sin (t) + 2 x_{1}) . \end{matrix}] w (t) \end{matrix}

(64)

Let us take the initial state

x (0) = [0.02; - 0.5]

. The matched disturbance is taken as

\begin{matrix} d (t) = \{\begin{matrix} 0.5 cos (0.35 t), t < 60, \\ 0.5 sin (0.5 t), t \geq 60 . \end{matrix} \end{matrix}

(65)

The desired trajectory is

\begin{matrix} {\dot{x}}_{d} = [\begin{matrix} - x_{d 1} + sin (x_{d 2}) \\ - 4.9 sin (x_{d 1}) - 0.5 x_{d 2} \end{matrix}] \end{matrix}

(66)

with the initial state

x_{d} (0) = [0.1; 0.65]

.

Therefore, based on (64)–(66), the augmented system is reformulated as

\begin{matrix} \dot{ξ} = f (ξ) + g (ξ) (u (t) + d (t)) + h (ξ) ω (t) \end{matrix}

(67)

with

\begin{matrix} f (ξ) = [\begin{matrix} x_{2} + x_{d 1} - sin (x_{d 2}) \\ - 4 x_{1} - 0.5 x_{2} + 4.9 sin (x_{d 1}) + 0.5 x_{d 2} \\ - x_{d 1} + sin (x_{d 2}) \\ - 4.9 sin {(x_{d 1})}^{3} - 0.5 x_{d 2} \end{matrix}], \\ g (ξ) = [\begin{matrix} 0 \\ 0.04 \\ 0 \\ 0 \end{matrix}], h (ξ) = [\begin{matrix} - 0.2 (sin (t) + 2 x_{2}) \\ - 0.2 (sin (t) + 2 x_{1}) \\ 0 \\ 0 \end{matrix}] . \end{matrix}

Based on Lemma 1, one has

g^{†} = [0; 25; 0; 0]

and

k = (I - g g^{†}) h = [0.1013; 0; 0; 0] .

The integral sliding mode surface is as in (7) with the sliding mode gain

X = 1

.

According to the ISMC

u_{1}

, the sliding-mode dynamics with unmatched disturbances is obtained as

\begin{matrix} \dot{ξ} = f (ξ) + g (ξ) u_{0} (t) + \bar{k} (ξ) ω (t) \end{matrix}

(68)

where

\bar{k} (ξ) = (I - g (ξ) g^{†} (ξ)) h (ξ)

.

For simulation, the parameters of the algorithm are chosen as

Q = diag (100, 100, 0, 0)

,

γ = 5

,

ζ = 0.9

,

ϵ = 0.5

. The parameters of the triggering condition are selected as

η_{1} (0) = 0.1

,

θ_{1} = 1

,

η (0) = 0.1

,

L = 10

,

λ = 0.1

,

θ = 1

and

α = 0.5

. The initial NN weight is selected as

ω_{c} = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

and the NN activation function is designed as

ϕ (ξ) = {[ξ_{1}^{2}, ξ_{2}^{2}, ξ_{3}^{2}, ξ_{4}^{2}, ξ_{1} ξ_{2}, ξ_{1} ξ_{3}, ξ_{1} ξ_{4}, ξ_{2} ξ_{3}, ξ_{2} ξ_{4}, ξ_{3} ξ_{4}]}^{T}

.

To make a comparison with DET via PE, during the neural network implementation process, a small probing noise is injected in u for the first 90 s. Figure 1 presents the evolution of the tracking system states and the augmented system states with DET via PE technique. Figure 2 presents the evolution of the tracking system states and the augmented system states with DET-based ER technique. From Figure 1 and Figure 2, it is obvious that the convergence of states and tracking trajectories via ER method is faster than the PE method. The optimal control

u_{0}

, the ISMC law

u_{1}

, the composite law u, and the sliding mode function S are shown in Figure 3 and Figure 4, for DET via PE technique and DET via ER technique, respectively. From Figure 3 and Figure 4, we can see that the convergence via ER method is faster than the PE method. The initial weights are selected randomly in the interval

[0, 1]

. After a learning process, Figure 5 presents the convergence process of the critic NN weights

ω_{c}

. The evolution of the event error

| | e_{i} {| |}^{2}

and triggering threshold

| | e_{T} {| |}^{2}

are shown in Figure 6. The interevent time of the control

u_{0}

under under the four strategies is shown in Figure 7. Next, the characteristics of the different strategies are analyzed by comparison through Table 1, Table 2 and Table 3.

First, Table 1 shows the data comparison for the four control strategies. As is well known, the number of triggering events is one of most important factors in evaluating the triggering mechanism. Furthermore, a smaller number of triggering events can reduce the communication burden and save resources. To implement this goal, it is needed to reduce the unnecessary update of the controller guaranteeing system performance. Based on this, a novel adaptive adjustment technique consisting of DET via ER technique is used. In addition, with the help of the simulation experiment platform and MATLAB data package, using respectively the technology in [39,40] and our technique in this paper, some experimental results are shown in Table 1. From Table 1, it is obvious that the DET via ER technique is the best, and which can better reduce the communication burden and save resources, since there are only 418 samples that occurred because of the larger average triggering interval time 0.2392 s. In particular, one may notice that the ER technique is also beneficial in reducing the number of event-triggering comparing with the PE technique in the framework of the same event-triggered conditions. Similarly, the DET is also beneficial in reducing the number of event-triggering comparing with ET in the framework of the same PE technique (ER technique). In addition, the minimal interval indicates Zeno-free behaviors. The minimal interval values of the four strategies are consistent with Figure 7. Moreover, according to Figure 7, we know that DET via ER technique can generate a biggest minimal interval comparing with other technique, so the effectiveness of the current technique designed in this paper is verified.

Next, we define another important factor to evaluate the control strategy, (called a triggering rate based on TT). The triggering rate is calculated by

\frac{{DET}_{number}}{{TT}_{number}}

. When

θ = 1

and

λ = 0.01

, Table 2 and Table 3 present that the rate of DET via PE and the rate of DET via ER are

23.75 %

and

20.08 %

, respectively. Generally speaking, a small triggering rate is more favorable than a large triggering rate. To investigate the characteristics of the changes utilized to execute the triggering rule, some data of the triggering rates are shown in Table 2 and Table 3.The tables also show the effect of the parameters on the triggering rate generated by

θ

and

λ

with DET via PE and ER respectively. To summary, a bigger

θ

value means a reduction of the number of events; in contrast, a bigger

λ

value means an increase in the number of events.

Remark 4.

{ET}_{PE}

is ET via PE technique,

{ET}_{ER}

is ET via ER technique,

{DET}_{PE}

is DET via PE technique, and

{DET}_{ER}

is DET via ER technique.

6. Conclusions

In this article, a learning-based event-triggered optimal tracking control technique for nonlinear systems was developed via ADP. Matched uncertainties have been eliminated by the ISMC proposed in this paper. Unmatched uncertainties have been attenuated, utilizing projection matrix and an optimal controller. A critic NN via a novel dynamic event-triggered rule has been constructed to ensure the existence of the solution of the HJI equation and all errors have been proved UUB using the Lyapunov analysis method. Moreover, the simulation results revealed that our control algorithm is more favorable than traditional event-triggered control algorithms. Future work is to further explore this framework, e.g., in the presence of various delays or system constraints.

Remark 5.

(1) The pros: This study first provides codesign of dynamic event mechanism and experience replay technique-based weighted adaptive adjustment technique to guarantee the optimal approximation of the cost function

V^{*}

. In addition, this codesign technique not only speeds up the approximation speed of the cost function but also enlarges the average interval of internal events aiming to reduce the updating times of the controller and save the calculation burden and communication resources.

(2) The cons: In the proof Theorem 2, using the approximate the sign function

sgn (S) \approx \tanh (S / ν)

where

ν \geq 1

that makes us to only obtain an approximation condition rather than a sufficient condition, namely, which weakens the effect of sliding-mode control. This issue should be investigated further.

Author Contributions

Conceptualization, W.T. and H.W.; methodology, W.T. and H.W.; validation, W.T. and H.W.; formal analysis, W.Y.; investigation, W.T. and H.W.; writing original draft preparation, W.T.; writing review and editing, W.Y. and H.W.; visualization, W.T. and H.W.; supervision, W.Y.; project administration, W.Y.; funding acquisition, W.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Science and Technology Project of State Grid Zhejiang Electric Power CO., LTD. (5211JY19000X).

Conflicts of Interest

The authors declare no conflict of interest.

References

Isermann, R. Digital Control Systems; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
Astrom, K.J.; Bernhardsson, B.M. Comparison of Riemann and Lebesque sampling for first order stochastic systems. In Proceedings of the 41st IEEE Conference on Decision and Control, Las Vegas, NV, USA, 10–13 December 2002; pp. 2011–2016. [Google Scholar]
Tabuada, P. Event-triggered real-time scheduling of stabilizing control tasks. IEEE Trans. Autom. Control 2007, 52, 1680–1685. [Google Scholar] [CrossRef] [Green Version]
Girard, A. Dynamic Triggering Mechanisms for Event-Triggered Control. IEEE Trans. Autom. Control 2015, 60, 1992–1997. [Google Scholar] [CrossRef] [Green Version]
Ding, L.; Han, Q.L.; Ge, X.; Zhang, X.M. An overview of recent advances in event-triggered consensus of multi-agent systems. IEEE Trans. Cybern. 2018, 48, 1110–1123. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Peng, C.; Li, F. A survey on recent advances in event-triggered communication and control. Inf. Sci. 2018, 457, 113–125. [Google Scholar] [CrossRef]
Zhang, X.M.; Han, Q.L.; Zhang, B.L. An overview and deep investigation on sampled-data-based event-triggered control and filtering for networked systems. IEEE Trans. Ind. Inform. 2016, 13, 4–16. [Google Scholar] [CrossRef]
Pan, Y.; Yang, G.H. Event-triggered fault detection filter design for nonlinear networked systems. IEEE Trans. Syst. Man Cybern. Syst. 2008, 48, 1851–1862. [Google Scholar] [CrossRef]
Ye, C.; Song, Y. Event-Triggered Prescribed Performance Control for a Class of Unknown Nonlinear Systems. IEEE Trans. Syst. Man Cybern. Syst. 2021, 51, 6576–6586. [Google Scholar]
Ge, X.; Han, Q.; Zhang, X.M.; Ding, D. Dynamic event-triggered control and estimation: A survey. Int. J. Autom. Comput. 2021, 18, 857–886. [Google Scholar] [CrossRef]
Ge, X.; Han, Q.L.; Ding, L.; Wang, Y.L.; Zhang, X.M. Dynamic event-triggered distributed coordination control and its applications: A survey of trends and techniques. IEEE Trans. Syst. Man Cybern. Syst. 2020, 50, 3112–3125. [Google Scholar] [CrossRef]
Niu, Y.; Ho, D.W.C.; Lam, J. Robust integral sliding mode control for uncertain stochastic systems with time-varying delay. Automatica 2005, 41, 873–880. [Google Scholar] [CrossRef]
Roy, S.; Baldi, S.; Fridman, L.M. Adaptive sliding mode control without a priori bounded uncertainty. Automatica 2020, 111, 1–6. [Google Scholar] [CrossRef]
Roy, S.; Roy, S.B.; Lee, J.; Baldi, S. Overcoming the Underestimation and Overestimation Problems in Adaptive Sliding Mode Control. IEEE/ASME Trans. Mechatron. 2019, 24, 2031–2039. [Google Scholar] [CrossRef] [Green Version]
Yu, W.; Wang, H.; Cheng, F.; Yu, X.; Wen, G. Second-Order Consensus in Multiagent Systems via Distributed Sliding Mode Control. IEEE Trans. Cybern. 2017, 47, 1872–1881. [Google Scholar] [CrossRef] [PubMed]
Corradini, M.L.; Cristofaro, A. Nonsingular terminal sliding-mode control of nonlinear planar systems with global fixed-time stability guarantees. Automatica 2018, 95, 561–565. [Google Scholar] [CrossRef]
Li, H.; Shi, P.; Yao, D. Adaptive sliding-mode control of Markov jump nonlinear systems with actuator faults. IEEE Trans. Autom. Control 2017, 64, 1933–1939. [Google Scholar] [CrossRef]
Castan˜os, F.; Fridman, L. Analysis and Design of Integral Sliding Manifolds for Systems With Unmatched Perturbations. IEEE Trans. Autom. Control 2006, 51, 853–858. [Google Scholar] [CrossRef] [Green Version]
Truc, L.N.; Vu, L.A.; Thoan, T.V.; Thanh, B.T.; Nguyen, T.L. Adaptive Sliding Mode Control Anticipating Proportional Degradation of Actuator Torque in Uncertain Serial Industrial Robots. Symmetry 2022, 14, 957. [Google Scholar] [CrossRef]
Xu, L.; Xiong, W.; Zhou, M.; Chen, L. A Continuous Terminal Sliding-Mode Observer-Based Anomaly Detection Approach for Industrial Communication Networks. Symmetry 2022, 14, 124. [Google Scholar] [CrossRef]
Yan, H.; Zhang, H.; Zhan, X.; Wang, Y.; Chen, S.; Yang, F. Event-Triggered Sliding Mode Control of Switched Neural Networks With Mode-Dependent Average Dwell Time. IEEE Trans. Syst. Man Cybern. Syst. 2021, 51, 1233–1243. [Google Scholar] [CrossRef]
Zheng, B.C.; Yu, X.; Xue, Y. Quantized feedback sliding-mode control: An event-triggered approach. Automatica 2018, 91, 126–135. [Google Scholar] [CrossRef]
Nair, R.R.; Behera, L.; Kumar, S. Event-triggered finite-time integral sliding mode controller for consensus-based formation of multi-robot systems with disturbances. IEEE Trans. Control Syst. Technol. 2019, 27, 39–47. [Google Scholar] [CrossRef]
Zhou, K.; Doyle, J.C.; Glover, K. Robust and Optimal Control; Prentice-Hall: Englewood Cliffs, NJ, USA, 1996. [Google Scholar]
Vincent, T.L. Nonlinear and Optimal Control Systems; Wiley: New York, NY, USA, 1997. [Google Scholar]
Lewis, F.L.; Vrabie, D.; Syrmos, V.L. Optimal Control; Wiley: New York, NY, USA, 2012. [Google Scholar]
Basar, T.; Bernard, P. H_∞-Optimal Control and Related Minimax Design Problems; Birkhäuser: Boston, MA, USA, 1995. [Google Scholar]
Vrabie, D.; Lewis, F. Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw. 2009, 22, 237–246. [Google Scholar] [CrossRef] [PubMed]
Bellman, R.E. Dynamic Programming; Princeton University Press: Princeton, NJ, USA, 1957. [Google Scholar]
Wang, F.Y.; Zhang, H.; Liu, D. Adaptive dynamic programming: An introduction. IEEE Comput. Intell. Mag. 2009, 4, 39–47. [Google Scholar] [CrossRef]
Michailidis, I.; Baldi, S.; Kosmatopoulos, E.B.; Ioannou, P.A. Adaptive Optimal Control for Large-Scale Nonlinear Systems. IEEE Trans. Autom. Control 2017, 62, 5567–5577. [Google Scholar] [CrossRef] [Green Version]
Liu, D.; Xue, S.; Zhao, B.; Luo, B.; Wei, Q. Adaptive Dynamic Programming for Control: A Survey and Recent Advances. IEEE Trans. Syst. Man Cybern. Syst. 2021, 51, 142–160. [Google Scholar] [CrossRef]
Fan, Q.Y.; Yang, G.H. Adaptive actor-critic design-based integral sliding-mode control for partially unknown nonlinear systems with input disturbances. IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 165–177. [Google Scholar] [CrossRef]
Qu, Q.; Zhang, H.; Yu, R.; Liu, Y. Neural network-based H_∞ sliding mode control for nonlinear systems with actuator faults and unmatched disturbances. Neurocomputing 2018, 275, 2009–2018. [Google Scholar] [CrossRef]
Zhang, H.; Qu, Q.; Xia, G.; Cui, Y. Optimal guaranteed cost sliding mode control for constrained-input nonlinear systems with matched and unmatched disturbances. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 2112–2126. [Google Scholar] [CrossRef]
Vamvoudakis, K.G. Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems. IEEE/CAA J. Autom. Sin. 2014, 1, 282–293. [Google Scholar]
Zhang, Q.; Zhao, D.; Zhu, Y. Event-triggered H_∞ control for continuous-time nonlinear system via concurrent learning. IEEE Trans. Syst. Man Cybern. Syst. 2017, 47, 1071–1081. [Google Scholar] [CrossRef]
Xue, S.; Luo, B.; Liu, D. Event-Triggered Adaptive Dynamic Programming for Unmatched Uncertain Nonlinear Continuous-Time Systems. IEEE Trans. Neural Netw. Learn. Syst. 2020, 99, 1–13. [Google Scholar] [CrossRef] [PubMed]
Yang, D.S.; Li, T.; Xie, X.P.; Zhang, H.G.; Liu, D.R.; Li, Y.H. Event-Triggered Integral Sliding-Mode Control for Nonlinear Constrained-Input Systems With Disturbances via Adaptive Dynamic Programming. IEEE Trans. Syst. Man Cybern. Syst. 2020, 50, 2168–2216. [Google Scholar] [CrossRef]
Zhang, H.G.; Liang, Y.; Su, H.; Liu, C. Event-Driven Guaranteed Cost Control Design for Nonlinear Systems with Actuator Faults via Reinforcement Learning Algorithm. IEEE Trans. Syst. Man Cybern. Syst. 2020, 50, 4135–4150. [Google Scholar] [CrossRef]
Van, M.; Do, X.P. Optimal adaptive neural PI full-order sliding mode control for robust fault tolerant control of uncertain nonlinear system. Eur. J. Control. 2020, 54, 22–32. [Google Scholar] [CrossRef]

Figure 1. PE technique: (a,b) resulted trajectory compared with the desired trajectory with DET; (c) state trajectory.

Figure 2. ER technique: (a,b) resulted trajectory compared with the desired trajectory with DET; (c) state trajectory of augmented systems.

Figure 3. PE technique: (a) evolution of RL-based control law

u_{0}

; (b) evolution of SMC input

u_{1}

; (c) evolution of composite control law u; (d) evolution of sliding mode surface S.

Figure 3. PE technique: (a) evolution of RL-based control law

u_{0}

; (b) evolution of SMC input

u_{1}

; (c) evolution of composite control law u; (d) evolution of sliding mode surface S.

Figure 4. ER technique: (a) evolution of RL-based control law

u_{0}

; (b) evolution of SMC input

u_{1}

; (c) evolution of composite control law u; (d) evolution of sliding mode surface S.

Figure 4. ER technique: (a) evolution of RL-based control law

u_{0}

; (b) evolution of SMC input

u_{1}

; (c) evolution of composite control law u; (d) evolution of sliding mode surface S.

Figure 5. Weight curves of system states during learning stage.

Figure 6. Trigger threshold

| | e_{T} {| |}^{2}

and event error

| | e_{i} {| |}^{2}

during learning stage.

Figure 6. Trigger threshold

| | e_{T} {| |}^{2}

and event error

| | e_{i} {| |}^{2}

during learning stage.

Figure 7. Inter-event time of four strategies during learning stage.

Table 1. Comparison results of four strategies.

Strategies	Samples	Average Interval	Minimal Interval
$TT$	2000	0.05	0.05
${ET}_{PE}$	632	0.1582	0.1
${DET}_{PE}$	476	0.2101	0.1
${ET}_{ER}$	577	0.1733	0.1
${DET}_{ER}$	418	0.2392	0.15

Table 2. Trigger rates of different dets based on PE.

	0.01	0.1	0.5	1
$θ$	0.01	0.1	0.5	1
1	22.35	23.75	27.45	29.00
3	17.95	22.4	28.55	30.15
5	15.55	22.25	29.20	30.65
10	12.55	22.85	30.05	31.10

Table 3. Trigger rates of different dets based on ER.

	0.01	0.1	0.5	1
$θ$	0.01	0.1	0.5	1
0.1	24.90	25.10	25.45	25.65
0.5	22.10	22.90	25	25.80
1	17.10	20.80	25.25	26.60
3	1.75	3.6	26.30	27.50

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Tan, W.; Yu, W.; Wang, H. Dynamic Event-Triggered Integral Sliding Mode Adaptive Optimal Tracking Control for Uncertain Nonlinear Systems. Symmetry 2022, 14, 1264. https://0-doi-org.brum.beds.ac.uk/10.3390/sym14061264

AMA Style

Tan W, Yu W, Wang H. Dynamic Event-Triggered Integral Sliding Mode Adaptive Optimal Tracking Control for Uncertain Nonlinear Systems. Symmetry. 2022; 14(6):1264. https://0-doi-org.brum.beds.ac.uk/10.3390/sym14061264

Chicago/Turabian Style

Tan, Wei, Wenwu Yu, and He Wang. 2022. "Dynamic Event-Triggered Integral Sliding Mode Adaptive Optimal Tracking Control for Uncertain Nonlinear Systems" Symmetry 14, no. 6: 1264. https://0-doi-org.brum.beds.ac.uk/10.3390/sym14061264

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Dynamic Event-Triggered Integral Sliding Mode Adaptive Optimal Tracking Control for Uncertain Nonlinear Systems

Abstract

1. Introduction

2. System Formulation and Preliminaries

3. DET-Based ISMC Design

4. DET-Based Optimal Controller Design

4.1. Dynamically Triggering Rule for Optimal Input

4.2. Dynamically Triggered ADP with Single Critic NN

4.3. Stability Analysis

4.4. Algorithm Design of the Event-Triggered ISM Optimal Tracking Control

5. Simulation

6. Conclusions

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI