Article

Improvement of the Control of a Grid Connected Photovoltaic System Based on Synergetic and Sliding Mode Controllers Using a Reinforcement Learning Deep Deterministic Policy Gradient Agent

by Marcel Nicola 1, Claudiu-Ionel Nicola 1,2,* and Dan Selișteanu 2
1 Research and Development Department, National Institute for Research, Development and Testing in Electrical Engineering – ICMET Craiova, 200746 Craiova, Romania
2 Department of Automatic Control and Electronics, University of Craiova, 200585 Craiova, Romania
* Author to whom correspondence should be addressed.
Submission received: 16 February 2022 / Revised: 21 March 2022 / Accepted: 23 March 2022 / Published: 24 March 2022
(This article belongs to the Special Issue New Frontiers in Electrical Power Systems Quality)

Abstract

This article presents the control of a grid connected PV (GC-PV) array system, starting from a benchmark. The control structure used in this article was a cascade-type structure, in which PI or synergetic (SYN) controllers were used for the inner control loop of id and iq currents and PI or sliding mode control (SMC) controllers were used for the outer control loop of the udc voltage from the DC intermediate circuit. This paper presents the mathematical model of the PV array together with the main component blocks: simulated inputs for the PV array; the PV array itself; the MPPT algorithm; the DC-DC boost converter; the voltage and current measurements for the DC intermediate circuit; the load and connection to the power grid; the DC-AC converter; and the power grid. It also presents the stages of building and training the reinforcement learning (RL) agent. To improve the performance of the control system for the GC-PV array system without using controllers with a more complicated mathematical description, the advantages provided by the RL agent in process control can be used. This technique does not require exact knowledge of the mathematical model of the controlled system or the type of uncertainties. The improvement in the control system performance for the GC-PV array system, both when using simple PI-type controllers and when using complex SMC- and SYN-type controllers, was achieved using an RL agent based on the Deep Deterministic Policy Gradient (DDPG). The variant of DDPG used in this study was the Twin-Delayed DDPG (TD3). The improvements in the performance of the control system were obtained by using the correction command signals provided by the trained RL agent, which were added to the command signals ud, uq and idref. The parametric robustness of the proposed control system based on SMC and SYN controllers for the GC-PV array system was proven in the case of a variation of 30% caused by the three-phase load.
Moreover, the results of the numerical simulations are shown comparatively and the validation of the synthesis of the proposed control system was obtained. This was achieved by comparing the proposed system with a software benchmark for the control of a GC-PV array system performed in MATLAB Simulink. The numerical simulations proved the superiority of the performance of control systems that use the RL-TD3 agent.

1. Introduction

The importance of studying renewable energies, from the phenomena of their generation (from sources including solar, wind, water and geothermal) to their integration into microgrids or main grids, is undeniable [1].
In parallel with these studies, studies on the control systems used for the generation of energy from renewable sources have also been intensified. Thus, there have been studies on hybrid microgrids [2,3,4], the optimization of the process for battery charging in microgrids [5,6], the optimization of converters in microgrids [7,8], problems regarding the defects that can occur in microgrids [9], as well as elements regarding the dispatching of microgrids by economic criteria [10,11,12,13,14].
A specific problem that is addressed in this article is the control system for the connection of the PV array system to a main grid. This problem involves the study of a chain of primary elements, consisting of the following blocks: the inputs for the PV array; the PV array itself; the MPPT algorithm; the DC-DC boost converter; the voltage and current measurements for the DC parameters in the intermediate circuit; the DC-AC converter; the PLL (phase locked loop); the load and connection to the power grid; and the core element that controls these blocks, called the voltage-source converter (VSC). The main objective of the control system is to stabilize the udc voltage as precisely as possible, including under variation caused by the three-phase load [15].
To achieve this goal, a series of adaptive control-type [16], robust control-type [17,18] and predictive control-type [19] algorithms can be used. Furthermore, control systems based on fuzzy logic and neuro-fuzzy systems [20,21], genetic algorithms [22], particle swarm optimization (PSO) [23], RL [24] and passivity theory [25] form a special category of these control systems.
Given that the description equations of GC-PV array systems are nonlinear, a control system that ensures parametric robustness is provided by the SMC [26]. The SYN control systems [27], which can be considered as an extension of the SMC, also receive special emphasis.
The control systems that are based on RL for process control are organized as a series of tasks, which run on a computer for the control of an industrial process but do not require an explicit mathematical description [28,29,30].
This article starts from a benchmark presented in MATLAB Simulink [15], which is revisited in order to compare it with the best results obtained in [26,27,31,32]. After presenting the main characteristics of the benchmark system, numerical simulations are reported based on the theoretical elements presented in the preceding sections. Thus, starting from the cascade control structure in which PI-type controllers were used in the inner control loop of the id and iq currents and an SMC-type controller was used in the outer control loop of the udc voltage, and by adding the RL-TD3 agent, a superior performance of the control system for the GC-PV array was obtained.
Moreover, in the second part of the numerical simulations, starting from the peak performances presented in [32,33] for the cascade control system in which an SYN-type controller was used in the inner control loop of the id and iq currents and an SMC-type controller was used in the outer control loop of the udc voltage, and by adding the RL-TD3 agent, a superior performance of the control system for the GC-PV array was obtained, both in terms of the direct comparison of these performances and the robustness provided by the control system under parametric variations, such as the variation caused by the three-phase load.
The main contributions of this article are as follows:
  • The proposal of a cascade control system structure for the GC-PV array system, in which an SMC-type controller is used for the outer control loop of the udc voltage in the DC circuit and SYN-type controllers are used for the inner control loops of the id and iq currents;
  • Improvements in the performance of the control system for the GC-PV array system when using simple PI-type controllers or complex SMC-type or SYN-type controllers through the use of an RL agent that is based on TD3;
  • Validations of the results performed through a MATLAB Simulink environment to show the improvements in the performance of the control system for the GC-PV array system by using the RL-TD3 agent, even under parametric uncertainties; for example, a variation of 30% from the nominal value caused by the three-phase load.
The rest of the paper is organized as follows. Section 2 presents the mathematical model of the GC-PV array system. Section 3 describes the RL agent used for process control. Section 4 presents a correction of the control signals and the MATLAB Simulink implementation of the control for the GC-PV array system based on PI-type controllers using the RL-TD3 agent. Section 5 presents a correction and MATLAB Simulink implementation of the command signals for the control system for the GC-PV array system based on SMC- and SYN-type controllers using the RL-TD3 agent. The results of the numerical simulations are presented in Section 6 and Section 7 presents our conclusions.

2. Grid Connected PV Array System: The Mathematical Model

The schematic block of the main circuit for the GC-PV system is presented in Figure 1 [15,27,31]. The input quantities for the PV array model were the irradiance and the temperature. A component of utmost importance was the maximum power point tracking (MPPT) module, which acted on the DC boost converter to obtain the maximum efficiency of the energy received from the PV array. A detailed description of the MPPT is presented in [15,26,31]. A three-phase DC–AC converter, which powered a three-phase load, was added to the diagram in Figure 1. The controller proposed and described in the following sections acted on the three-phase DC–AC converter in order to stabilize the udc voltage as precisely as possible, even under a significant variation caused by the three-phase load. Using the notations presented in Figure 1, Equations (1)–(4) could be written as below:
$$C_1 \frac{du_{PV}}{dt} = i_{PV} - i_s \tag{1}$$
$$u_{PV} = R_1 i_s + L_1 \frac{di_s}{dt} + u_s \tag{2}$$
$$C_2 \frac{du_{dc}}{dt} = i_{dc1} - i_{dc2} \tag{3}$$
$$u_{abc} - e_{abc} = R_3 i_{abc} + L_3 \frac{di_{abc}}{dt} \tag{4}$$
where the output voltages of the DC–AC converter (represented by the VSC) are denoted by $u_{abc} = [u_a \; u_b \; u_c]^T$, the grid voltages are denoted by $e_{abc} = [e_a \; e_b \; e_c]^T$ and the alternating currents are denoted by $i_{abc} = [i_a \; i_b \; i_c]^T$.
In Equation (5), the well-known Park’s transformation based on P matrix is presented:
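$$P = \begin{bmatrix} \sin \omega t & \sin\left(\omega t - \frac{2\pi}{3}\right) & \sin\left(\omega t + \frac{2\pi}{3}\right) \\ \cos \omega t & \cos\left(\omega t - \frac{2\pi}{3}\right) & \cos\left(\omega t + \frac{2\pi}{3}\right) \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \end{bmatrix} \tag{5}$$
The transformation from the abc reference frame to the dq reference frame was performed using Equation (5): $u_{dq0} = P u_{abc}$, $e_{dq0} = P e_{abc}$, $i_{dq0} = P i_{abc}$. Equation (4) was then transformed as follows:
$$u_{dq0} - e_{dq0} = R_3 i_{dq0} + L_3 \frac{di_{dq0}}{dt} + L_3 \begin{bmatrix} -\omega i_q \\ \omega i_d \\ 0 \end{bmatrix} \tag{6}$$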
Equation (6) could be written by components as follows:
$$L_3 \frac{di_d}{dt} = -R_3 i_d + \omega L_3 i_q - e_d + u_d = u_{3d} + u_d \tag{7}$$
$$L_3 \frac{di_q}{dt} = -R_3 i_q - \omega L_3 i_d - e_q + u_q = u_{3q} + u_q \tag{8}$$
where $u_d$ and $u_q$ are the control variables used for the command of the DC–AC converter. In the above equations, we noted that $u_{3d} = -R_3 i_d + \omega L_3 i_q - e_d$ and $u_{3q} = -R_3 i_q - \omega L_3 i_d - e_q$.
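As an illustration of Equations (5), (7) and (8), the following minimal Python sketch builds the Park matrix as written in Equation (5) and evaluates the d and q current derivatives. The function names, the 50 Hz grid frequency and the test instant are assumptions made only for this example and are not taken from the benchmark.

```python
import numpy as np

def park_matrix(omega_t):
    """Park matrix P of Equation (5), written as printed (no scaling prefactor)."""
    a = 2.0 * np.pi / 3.0
    return np.array([
        [np.sin(omega_t), np.sin(omega_t - a), np.sin(omega_t + a)],
        [np.cos(omega_t), np.cos(omega_t - a), np.cos(omega_t + a)],
        [0.5, 0.5, 0.5],
    ])

def dq_current_derivatives(i_d, i_q, u_d, u_q, e_d, e_q, R3, L3, omega):
    """Return (di_d/dt, di_q/dt) from Equations (7) and (8)."""
    u_3d = -R3 * i_d + omega * L3 * i_q - e_d
    u_3q = -R3 * i_q - omega * L3 * i_d - e_q
    return (u_3d + u_d) / L3, (u_3q + u_q) / L3

# Hypothetical values, for illustration only.
omega = 2.0 * np.pi * 50.0      # assumed grid angular frequency [rad/s]
t = 1.234e-3                    # arbitrary time instant [s]
a = 2.0 * np.pi / 3.0
# Balanced, unit-amplitude three-phase set aligned with the first row of P.
u_abc = np.array([np.sin(omega * t),
                  np.sin(omega * t - a),
                  np.sin(omega * t + a)])
u_dq0 = park_matrix(omega * t) @ u_abc
```

With the matrix taken exactly as written, a balanced unit-amplitude set aligned with the first row maps to a constant d component of 3/2 and zero q and 0 components, which is why the dq frame is convenient for the controllers discussed in the following sections.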
Following [15,26,31], the duty cycle D used for the control of the DC boost converter was described by means of Equations (9) and (10):
$$i_{dc1} = (1 - D)\, i_s \tag{9}$$
$$u_s = (1 - D)\, u_{dc} \tag{10}$$
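The averaged boost converter relations in Equations (9) and (10) can be checked with a short Python sketch. The operating point below (duty cycle, inductor current) is hypothetical and is chosen only to show that the two relations preserve power between the input and output sides of the ideal converter.

```python
def boost_averaged(D, i_s, u_dc):
    """Averaged boost-converter relations of Equations (9) and (10)."""
    if not 0.0 <= D < 1.0:
        raise ValueError("duty cycle D must lie in [0, 1)")
    i_dc1 = (1.0 - D) * i_s   # Equation (9)
    u_s = (1.0 - D) * u_dc    # Equation (10)
    return i_dc1, u_s

# Hypothetical operating point: 500 V DC link, D = 0.45, 368 A inductor current.
i_s, u_dc = 368.0, 500.0
i_dc1, u_s = boost_averaged(D=0.45, i_s=i_s, u_dc=u_dc)
power_in = u_s * i_s        # power entering the switching stage
power_out = u_dc * i_dc1    # power delivered to the DC intermediate circuit
```

Because both quantities are scaled by the same factor (1 − D), `power_in` and `power_out` coincide for any admissible duty cycle in this lossless model.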
A general block diagram of the entire application described in this article is shown in Figure 2. The chain of primary elements consisted of the following blocks: the simulated inputs for PV array; the PV array itself; the MPPT algorithm; the DC-DC boost converter; the voltage and current measurements for the DC intermediate circuit; the load and connection to power grid; the DC-AC converter; and the power grid. It can also be noted that the main element that we focus on in this article is the control block of the three-phase DC-AC converter. The main objective of the control system is to stabilize the udc voltage as precisely as possible, including under variation caused by the three-phase load. While in the classic case, the control system is built with PI-type controllers for the two voltage and current control loops, this article presents an improvement in the performance of the control system through the use of an RL-TD3 agent. Furthermore, in the complex case of a control system in which the control of the voltage loop is performed by an SMC-type controller and the control of the current loop is performed by an SYN-type controller, there was an improvement in the performance of the GC-PV control system through the use of the RL-TD3 agent that was created and trained accordingly.

3. Reinforcement Learning for Process Control

The RL for process control is organized as a series of tasks, which run on a computer to control an industrial process but do not require an explicit mathematical description. Thus, the RL process interacts with the controlled process in the sense of transmitting decisions (commands), which must reach the maximum of a set cumulative “Reward”. Figure 3 presents the schematic block diagram for an RL of the process control system. It can be noted that “Observation” and “Reward” are input signals for the RL. Observations are signals that characterize the process and are measurable along with their rate of change or error relative to a reference. Actions are the control quantities that act on the controlled process. Over time, Actions are selected so that the cumulative Reward increases in order to reach an optimal value. The Reward is expressed in terms of the square error of process signals and the square of the past Actions. The RL contains an optimal “Policy”, which is analogous to the operating mode of a process controller. The process contains the usual elements, namely a plant, reference signals, converters, filters and sensors.
The usual stages for the design of an RL process are the following [28,29,30]:
  • The Problem statement represents the RL agent and its capability to interconnect with the components of the process;
  • The Process creation represents the dynamic model type of the GC-PV’s controlled process and its interface;
  • The Reward creation represents the mathematical relationship of the Reward in order to carry out the performance measurements for the execution of the proposed task;
  • The Agent training represents an RL agent that is trained to realize the Policy based on the Reward, RL algorithm and controlled process;
  • The Agent validation represents the stage where the performance is evaluated after training;
  • The Deploy policy represents the step that performs the implementation of the trained RL agent within the GC-PV control system.
In this article, we used an RL-TD3 agent, which is an improved variant of the RL-DDPG-type agent. This type of agent is an actor-critic agent that searches for the Policy that maximizes the long-term cumulative Reward.
The steps performed by an RL agent during the training period are as follows [28,29]:
  • For the Observation of the current state S, the action $A = \mu(S) + N$ is selected, where N is the stochastic noise obtained from the noise model;
  • Action A is executed, then Reward R and the next Observation S′ are calculated;
  • The experience $(S, A, R, S')$ is stored;
  • A random mini-batch of M experiences $(S_i, A_i, R_i, S'_i)$ is sampled from the stored experiences;
  • If $S'_i$ is a terminal state, the value function target $y_i$ is set to $R_i$.
Otherwise, it is calculated by Equation (11) [28,29]:
$$y_i = R_i + \gamma \min_k \left( Q'_k\left(S'_i, \mathrm{clip}\left(\mu'(S'_i \mid \theta_{\mu'}) + \varepsilon\right) \mid \theta_{Q'_k}\right) \right) \tag{11}$$
The value function target is equal to the sum of the experience Reward Ri and the minimum discounted value for the future Reward from the critics.
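The target computation described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration and not the MATLAB implementation used in the article: the function name, the noise parameters, the action limits and the toy actor and critics are all assumptions introduced here.

```python
import numpy as np

def td3_targets(rewards, next_states, is_terminal, target_actor, target_critics,
                gamma=0.99, noise_std=0.2, noise_clip=0.5,
                act_low=-1.0, act_high=1.0, rng=None):
    """Value function targets y_i as in Equation (11).

    target_actor:   callable mapping next states (M, n) to actions (M, m)
    target_critics: list of callables mapping (states, actions) to Q-values (M,)
    is_terminal:    array (M,) with 1.0 where S'_i is terminal, else 0.0
    """
    rng = np.random.default_rng() if rng is None else rng
    mu = target_actor(next_states)
    # Clipped smoothing noise added to the target action.
    eps = np.clip(rng.normal(0.0, noise_std, size=mu.shape), -noise_clip, noise_clip)
    next_actions = np.clip(mu + eps, act_low, act_high)
    # Minimum over the critics limits the overestimation of the value function.
    q_min = np.min([q(next_states, next_actions) for q in target_critics], axis=0)
    # For terminal states the target reduces to the Reward R_i.
    return rewards + gamma * (1.0 - is_terminal) * q_min

# Toy example with hypothetical linear actor and critics (noise disabled).
actor = lambda s: 0.1 * s
critic_1 = lambda s, a: (s + a).sum(axis=1)
critic_2 = lambda s, a: (s + a).sum(axis=1) - 1.0
y = td3_targets(rewards=np.array([-1.0, -2.0]),
                next_states=np.array([[1.0], [2.0]]),
                is_terminal=np.array([0.0, 1.0]),
                target_actor=actor, target_critics=[critic_1, critic_2],
                noise_std=0.0)
```

In the toy example the second critic always returns the smaller value, so it determines the target of the non-terminal experience, while the terminal experience keeps its Reward unchanged.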
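At every training step, the parameters of each critic are updated by minimizing the following loss:
$$L_k = \frac{1}{M} \sum_{i=1}^{M} \left( y_i - Q_k(S_i, A_i \mid \theta_{Q_k}) \right)^2 \tag{12}$$
At every step, the actor’s parameter values are updated using the following gradient, thereby maximizing the Reward:
$$\nabla_{\theta_\mu} J = \frac{1}{M} \sum_{i=1}^{M} G_{ai} G_{\mu i} \tag{13}$$
where $G_{ai}$, $G_{\mu i}$ and A are represented by the following expressions, respectively:
$$G_{ai} = \nabla_A \min_k \left( Q_k(S_i, A \mid \theta_Q) \right) \tag{14}$$
$$G_{\mu i} = \nabla_{\theta_\mu} \mu(S_i \mid \theta_\mu) \tag{15}$$
$$A = \mu(S_i \mid \theta_\mu) \tag{16}$$
Additionally, the parameters of the target critics and the target actor are updated for a selected smoothing coefficient τ, as in the following equations:
$$\theta_{Q'_k} = \tau \theta_{Q_k} + (1 - \tau)\, \theta_{Q'_k} \tag{17}$$
$$\theta_{\mu'} = \tau \theta_\mu + (1 - \tau)\, \theta_{\mu'} \tag{18}$$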
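The critic loss and the smoothed target update described in this section can be illustrated with the following Python sketch. The parameter containers (plain dictionaries of NumPy arrays), the function names and the value of τ are assumptions made for this example only.

```python
import numpy as np

def critic_loss(y, q_values):
    """Mean squared error between targets y_i and critic outputs Q_k(S_i, A_i)."""
    y = np.asarray(y, dtype=float)
    q_values = np.asarray(q_values, dtype=float)
    return float(np.mean((y - q_values) ** 2))

def soft_update(target_params, online_params, tau):
    """Smoothed update: theta_target <- tau * theta + (1 - tau) * theta_target."""
    return {name: tau * online_params[name] + (1.0 - tau) * target_params[name]
            for name in target_params}

# Hypothetical parameters: a single weight vector per network.
online = {"w": np.ones(2)}
target = {"w": np.zeros(2)}
target = soft_update(target, online, tau=0.005)
loss = critic_loss(y=[1.0, 2.0], q_values=[1.0, 4.0])
```

A small τ makes the target parameters follow the online parameters slowly, which keeps the targets of the critic loss from changing abruptly between training steps.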

4. Correction of the Control Signals Used for the Control of a Grid Connected PV Array System Based on PI Controllers Using RL-TD3 Agent

The classic control system for a GC-PV system is presented in detail in [15] and can be considered as the benchmark for the performance of the control system. The control system is also presented in [27,31], both under low voltage and normal operation conditions. Figure 4 shows the schematic diagram of the classic control system for the GC-PV array, which consists of a cascade structure in which PI-type controllers are used for the control of the id and iq currents (inner control loop) and for the control of the udc voltage (outer control loop).
Figure 5 shows the MATLAB Simulink model implementation of the control system for the GC-PV system based on PI-type controllers using an RL-TD3 agent for the correction of control signals, i.e., a customization of Figure 2 for the control system presented in this section. Thus, an RL-TD3 agent that learned the behavior of the control system for the GC-PV array was used. After the training phase, it supplied the correction command signals for the three command inputs of the cascade-type control system (idref, udref and uqref), so that the improved GC-PV control system would produce a superior performance.
The steps in Section 3 were followed to implement the RL-TD3 agent. In the first step, the deep neural network (DNN) object was created, which was characterized through two inputs (Observation and Action) and one output. An example code sequence from the software program developed in the MATLAB environment for the design of the neural network is shown in Figure 6 and its graphic representation is presented in Figure 7.
To train the RL-TD3 agent to control the GC-PV system, 200 episodes were chosen, with around 100 steps per episode and an agent sample time of $10^{-4}$ s. The RL-TD3 agent training stage could be stopped when the cumulative average Reward was greater than −190 for a period of 100 consecutive episodes or after the 200 training episodes that were initially set had finished. To improve the RL-TD3 agent’s performance during training, Gaussian noise was superimposed on the signals that were received and transmitted by the proposed agent.
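The stopping rule described above can be sketched in Python as follows. The sketch reads the criterion as a moving average of the episode Reward over the last 100 episodes, which is an interpretation and not a statement about the MATLAB training options actually used; the function name and the synthetic Reward sequences are hypothetical.

```python
from collections import deque

def episodes_until_stop(episode_rewards, threshold=-190.0, window=100, max_episodes=200):
    """Return the episode at which training would stop.

    Training stops when the average Reward over the last `window` episodes
    exceeds `threshold`, or when `max_episodes` episodes have been run.
    """
    recent = deque(maxlen=window)
    for episode, reward in enumerate(episode_rewards, start=1):
        recent.append(reward)
        window_full = len(recent) == window
        if window_full and sum(recent) / window > threshold:
            return episode
        if episode >= max_episodes:
            return episode
    return len(episode_rewards)

# Synthetic Reward sequences, for illustration only.
always_good = [-100.0] * 300                 # stops as soon as the window is full
never_good = [-500.0] * 300                  # runs until the episode limit
improving = [-500.0] * 50 + [-100.0] * 250   # stops once the average recovers
```

With these sequences, an agent that performs well from the start stops after the first full window, while one that never reaches the threshold runs for the full 200 episodes.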

4.1. Implementation of the RL-TD3 Agent for the Correction of Commands for the Outer Voltage Control Loop

The MATLAB Simulink model implementation of the control system for the GC-PV array based on PI-type controllers using the RL-TD3 agent for the command correction of the idref current, which is the command signal supplied by the outer control loop of the udc voltage, is shown in Figure 8. Figure 9 presents the subsystem diagram of the MATLAB Simulink implementation of the RL-TD3 agent. In this case, the correction signals of the RL-TD3 agent were added to the command signal idref. The Observations were represented by the following signals: udc and udc_error.
The Reward at every step in this case was calculated using the following equation:
$$r_1 = -\left( Q_1 \cdot u_{dc\_error}^2 + R \cdot \sum_j \left( u_{t-1}^j \right)^2 \right) \tag{19}$$
where Q1 is 0.5 and R is 0.1.
The training time in this case was 2 h, 37 min and 12 s. The graphical results for this training stage are presented in Figure 10.

4.2. Implementation of the RL-TD3 Agent for the Command Correction of the Inner Currents Control Loop

The model MATLAB Simulink implementation of the inner loop of the GC-PV array (which controls the id and iq currents) based on the RL-TD3 agent is shown in Figure 11. After the learning stage, the RL-TD3 agent supplied correction signals for the command signals ud and uq. Figure 12 presents the block diagram of the MATLAB Simulink subsystem implementation of the RL-TD3 agent. The Observation consisted of the following signals: id, iq, iderror and iqerror.
The Reward at every step in this case was calculated using the following equation:
$$r_1 = -\left( Q_1 \cdot i_{d\_error}^2 + Q_2 \cdot i_{q\_error}^2 + R \cdot \sum_j \left( u_{t-1}^j \right)^2 \right) \tag{20}$$
where Q1 = Q2 = 0.5, R is 0.1 and $u_{t-1}^j$ are the Actions from the previous step.
The training time in this case was 3 h, 12 min and 42 s. The graphical results for this training stage are presented in Figure 13.

4.3. Implementation of the RL-TD3 Agent for the Command Correction of the Outer Voltage Control Loop and Inner Current Control Loops

The model MATLAB Simulink implementation of the command correction for the outer voltage control loop and the inner current control loops based on the RL-TD3 agent is shown in Figure 14. Figure 15 shows the block diagram of the MATLAB Simulink subsystem implementation of the RL-TD3 agent. In this case, the correction signals of RL-TD3 agent were supplied to the command signals ud and uq and also to the idref signal. The Observations were represented by the following signals: udc, udc_error, id, iq, id_error and iq_error.
The Reward at every step in this case was calculated using the following equation:
$$r_1 = -\left( Q_1 \cdot u_{dc\_error}^2 + Q_2 \cdot i_{d\_error}^2 + Q_3 \cdot i_{q\_error}^2 + R \cdot \sum_j \left( u_{t-1}^j \right)^2 \right) \tag{21}$$
where Q1 = Q2 = Q3 = 0.5 and R is 0.1.
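The three Reward expressions in Section 4 share the same structure: a weighted sum of squared tracking errors plus a penalty on the previous Actions. A minimal Python sketch of this structure is given below. The function name is hypothetical, and the leading minus sign is an assumption, chosen so that the Reward acts as a penalty, in line with the negative stopping threshold of −190 used for training.

```python
import numpy as np

def step_reward(errors, prev_actions, q_weights, r_weight=0.1):
    """Per-step Reward: negative weighted squared errors and action effort.

    errors:       tracking errors, e.g. [udc_error, id_error, iq_error]
    prev_actions: Actions applied at the previous step
    q_weights:    one weight Q per error (0.5 in the article)
    r_weight:     weight R of the action penalty (0.1 in the article)
    """
    errors = np.asarray(errors, dtype=float)
    q_weights = np.asarray(q_weights, dtype=float)
    prev_actions = np.asarray(prev_actions, dtype=float)
    tracking_cost = np.sum(q_weights * errors ** 2)
    effort_cost = r_weight * np.sum(prev_actions ** 2)
    return float(-(tracking_cost + effort_cost))

# Outer-loop case: one error (udc_error) and one Action (correction of idref).
r_outer = step_reward(errors=[2.0], prev_actions=[1.0], q_weights=[0.5])
# Combined case: three errors and three Actions (here all zero).
r_all = step_reward(errors=[1.0, 1.0, 1.0], prev_actions=[0.0, 0.0, 0.0],
                    q_weights=[0.5, 0.5, 0.5])
```

The same function covers the outer-loop, inner-loop and combined cases by changing only the lengths of the error and Action vectors.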
The training time in this case was 1 h, 27 min and 33 s. The graphical results for this training stage are presented in Figure 16.

5. Correction of the Control Signals for the Control System of the Grid Connected PV Array Based on SMC and Synergetic Controllers Using the RL-TD3 Agent

In this section, we present the design and synthesis algorithms of the SMC and SYN controllers of the control system for the GC-PV array. Figure 17 presents the schematic diagram of the control system for the GC-PV array based on the SMC and SYN controllers. This control system consisted of a cascade in which the control loops were used for the control of the id and iq signal currents (inner control loop with the SYN controller) and the control of the udc signal voltage (outer control loop with the SMC controller).
Moreover, the RL-TD3 agent that learned the behavior of the GC-PV control system was used, which supplied the correction signals for the three control inputs of the cascade-type control system (idref, udref, uqref) after the training stage, so that the improved control system would produce a superior performance, even when the control system used SMC- and SYN-type controllers.

5.1. Sliding Mode Control

Based on the elements presented in Section 2 and denoting the switching functions of the DC-AC converter as Sa, Sb and Sc, the following equation could be written within the abc reference frame:
$$C_2 \frac{du_{dc}}{dt} = i_{dc1} - \left( i_a S_a + i_b S_b + i_c S_c \right) \tag{22}$$
By using the transformation in (5), the switching functions Sd and Sq could be obtained:
$$\begin{bmatrix} S_d & S_q & 0 \end{bmatrix}^T = P \begin{bmatrix} S_a & S_b & S_c \end{bmatrix}^T \tag{23}$$
With these, Equation (22) became:
$$C_2 \frac{du_{dc}}{dt} = i_{dc1} - \frac{3}{2} \left( i_d S_d + i_q S_q \right) \tag{24}$$
Similar to [15,31,33], the same MPPT algorithm was considered, so we then focused on obtaining the SMC and SYN command laws. Furthermore, by following [26], iqref = 0 was selected and Equation (24) became:
$$C_2 \frac{du_{dc}}{dt} = i_{dc1} - \frac{3}{2}\, i_{dref} S_d \tag{25}$$
Moreover, to obtain the reference current idref using the SMC design procedure, the state variable x1, as in Equation (26), and the switching surface S, as in Equation (27), were introduced:
$$x_1 = u_{dc} - u_{dcref} \tag{26}$$
$$S = c_1 x_1 + x_2, \qquad \dot{S} = c_1 x_2 + \dot{x}_2 \tag{27}$$
In Equation (27), the state variable x2 was defined as follows:
$$x_2 = \dot{x}_1 = \dot{u}_{dc} \tag{28}$$
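Equation (29) was then necessary to achieve convergence:
$$\dot{S} = -\varepsilon\, \mathrm{sgn}(S) - kS \tag{29}$$
where ε and k are positive constants.
By differentiating Equation (25) with respect to time (treating $S_d$ as constant), the following could be obtained:
$$\ddot{x}_1 = \dot{x}_2 = \ddot{u}_{dc} = -\frac{3}{2} \frac{S_d}{C_2}\, \dot{i}_{dref} + \frac{\dot{i}_{dc1}}{C_2} \tag{30}$$
and so, by equating the expressions of $\dot{S}$ in Equations (27) and (29), the next equation could be written:
$$-\varepsilon\, \mathrm{sgn}(S) - kS = c_1 x_2 - \frac{3}{2} \frac{1}{C_2} S_d\, \dot{i}_{dref} + \frac{\dot{i}_{dc1}}{C_2} \tag{31}$$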
Following [32,33], to improve the convergence and to smooth the high frequency oscillations, the sgn function was replaced with the following function defined by Equation (32):
$$h(x) = \frac{2}{1 + e^{-a(x - b)}} - 1 \tag{32}$$
For a = 4 and b = 0, $h \in [-1, 1]$ and a smoothed transition was achieved over this interval. Thus, the output of the designed SMC-type controller was obtained by:
$$i_{dref} = \frac{2}{3} \frac{C_2}{S_d} \int_0^t \left( c_1 x_2 + kS + \varepsilon\, h(S) + \frac{\dot{i}_{dc1}}{C_2} \right) dt \tag{33}$$
Figure 18 presents the block diagram of the MATLAB Simulink subsystem implementation of the proposed SMC controller.
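The smoothed switching function of Equation (32) and the reaching law of Equation (29), with sgn replaced by h, can be illustrated by the short Python sketch below. The gains ε and k, the initial value of S and the integration step are hypothetical values chosen only for the demonstration; the function names are likewise assumptions.

```python
import math

def h(x, a=4.0, b=0.0):
    """Smoothed sign function of Equation (32).

    2 / (1 + exp(-a (x - b))) - 1 is algebraically equal to tanh(a (x - b) / 2);
    the tanh form is used here because it does not overflow for large |x|.
    """
    return math.tanh(a * (x - b) / 2.0)

def simulate_reaching_law(s0, eps, k, dt=1e-4, steps=20000):
    """Forward-Euler integration of dS/dt = -eps * h(S) - k * S."""
    s = s0
    for _ in range(steps):
        s += dt * (-eps * h(s) - k * s)
    return s

# Hypothetical gains: the switching variable decays from S = 5 towards zero.
s_final = simulate_reaching_law(s0=5.0, eps=1.0, k=10.0)
```

Unlike sgn, the function h passes continuously through zero at S = 0, so the command does not switch abruptly between −ε and +ε near the sliding surface, which is the motivation given for the replacement.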

5.2. Synergetic Control

For a nonlinear system in the form of (34), an SYN control law could be synthesized that could be seen as a generalization of the SMC-type control law [27,32,33]:
$$\dot{x} = f(x, u, t) \tag{34}$$
where x is the state vector $x \in \mathbb{R}^n$, $f(\cdot)$ is the continuous nonlinear function and u is the input control vector $u \in \mathbb{R}^m$, $m < n$.
The macro variable $\psi(x, t)$ was chosen and it was defined for each input control according to the states of the system. The forced evolution of the states according to the following equation was imposed for the synthesis of the SYN-type control law:
$$T \dot{\psi} + \psi = 0 \tag{35}$$
where T > 0 is selected to achieve the desired convergence rate.
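The role of T can be seen from the solution of the imposed dynamics, $\psi(t) = \psi(0)\, e^{-t/T}$: the macro variable decays to zero with time constant T. The Python sketch below integrates the equation numerically and compares the result with this closed form; the values of T, the step size and the initial condition are hypothetical.

```python
import math

def macro_variable_trajectory(psi0, T, dt, steps):
    """Forward-Euler integration of T * dpsi/dt + psi = 0."""
    psi = psi0
    history = [psi]
    for _ in range(steps):
        psi += dt * (-psi / T)
        history.append(psi)
    return history

# Hypothetical values: T = 10 ms, integrated over one time constant.
T = 0.01
trajectory = macro_variable_trajectory(psi0=1.0, T=T, dt=1e-5, steps=1000)
exact_at_T = math.exp(-1.0)   # psi(T) / psi(0) from the closed-form solution
```

After one time constant the macro variable has dropped to about 37% of its initial value, so a smaller T gives faster convergence to the manifold ψ = 0.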
By differentiating the chosen macro variable, $\dot{\psi}$ was obtained by the following expression:
$$\dot{\psi} = \frac{\partial \psi}{\partial x}\, \dot{x} \tag{36}$$
After inserting Equation (36) into Equation (35), the following could be obtained:
$$T \frac{\partial \psi}{\partial x}\, \dot{x} + \psi = 0 \tag{37}$$
By inserting the explicit forms of the $\dot{x}$ states into Equation (37), we could obtain the control law given by the next equation:
$$u = u\left(x, \psi(x, t), T, t\right) \tag{38}$$
The outputs of the SYN controller were given by ud and uq.
For the d axis and kd > 0, we selected the macro variable ψd in the following form:
$$\psi_d = u_{dcref} - u_{dc} + k_d \left( i_{dref} - i_d \right) \tag{39}$$
We defined the state variables x1 and x2 as in Equation (40):
$$x_1 = u_{dcref} - u_{dc}, \qquad x_2 = i_{dref} - i_d \tag{40}$$
From Equation (40) and for the slow mode variations of the reference quantities or for a quasi-stationary regime, the next expression could be obtained:
$$\dot{x}_1 = -\dot{u}_{dc}, \qquad \dot{x}_2 = -\dot{i}_d \tag{41}$$
Based on these, the derivative of the macro variable in Equation (39) became:
$$\dot{\psi}_d = \dot{x}_1 + k_d \dot{x}_2 = -\dot{u}_{dc} - k_d\, \dot{i}_d \tag{42}$$
For T = T1, Equation (35) became:
$$T_1 \left( -\dot{u}_{dc} - k_d\, \dot{i}_d \right) + u_{dcref} - u_{dc} + k_d \left( i_{dref} - i_d \right) = 0 \tag{43}$$
Using Equation (7), Equation (43) could be written in the following form:
$$-T_1 \dot{u}_{dc} - T_1 k_d \frac{1}{L_3} \left( u_{3d} + u_d \right) + u_{dcref} - u_{dc} + k_d \left( i_{dref} - i_d \right) = 0 \tag{44}$$
After rearranging the terms in Equation (44), we could obtain the following expression:
$$T_1 k_d \frac{1}{L_3}\, u_d = -T_1 \dot{u}_{dc} - T_1 k_d \frac{1}{L_3}\, u_{3d} + u_{dcref} - u_{dc} + k_d \left( i_{dref} - i_d \right) \tag{45}$$
Thus, the control law ud was obtained:
$$u_d = \frac{L_3}{T_1 k_d} \left[ -T_1 \dot{u}_{dc} - T_1 k_d \frac{1}{L_3}\, u_{3d} + u_{dcref} - u_{dc} + k_d \left( i_{dref} - i_d \right) \right] \tag{46}$$
For the q axis and kq > 0, we selected the macro variable ψq in the following form:
$$\psi_q = i_{qref} - i_q \tag{47}$$
The state variables x1, x2 and x3 could be defined as:
$$x_1 = u_{dcref} - u_{dc}, \qquad x_2 = i_{dref} - i_d, \qquad x_3 = i_{qref} - i_q \tag{48}$$
For iqref = 0, the derivatives of the state variables in Equation (48) could be written in the following form:
$$\dot{x}_1 = -\dot{u}_{dc}, \qquad \dot{x}_2 = -\dot{i}_d, \qquad \dot{x}_3 = -\dot{i}_q \tag{49}$$
Thus, the derivative of the macro variable ψq, which was defined in Equation (47), was obtained:
$$\dot{\psi}_q = \dot{x}_3 \tag{50}$$
For T = T2, Equation (35) became:
$$-T_2\, \dot{i}_q + i_{qref} - i_q = 0 \tag{51}$$
Using Equation (8), Equation (51) could be written in the following form:
$$-T_2 \frac{1}{L_3} \left( u_{3q} + u_q \right) + i_{qref} - i_q = 0 \tag{52}$$
After rearranging the terms in Equation (52), we could obtain the following expression:
$$u_{3q} + u_q = \frac{L_3}{T_2} \left( i_{qref} - i_q \right) \tag{53}$$
Thus, the control law uq was obtained:
$$u_q = \frac{L_3}{T_2} \left( i_{qref} - i_q \right) - u_{3q} \tag{54}$$
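The two synergetic control laws derived in this section can be written compactly in Python, which also allows a numerical check that they enforce $T\dot{\psi} + \psi = 0$ on both axes. This is a minimal sketch and not the Simulink subsystem of the article: the function names and every numerical value (filter parameters, gains, operating point) are hypothetical.

```python
def u3_terms(i_d, i_q, e_d, e_q, R3, L3, omega):
    """Terms u_3d and u_3q defined after Equations (7) and (8)."""
    u_3d = -R3 * i_d + omega * L3 * i_q - e_d
    u_3q = -R3 * i_q - omega * L3 * i_d - e_q
    return u_3d, u_3q

def syn_control(u_dc, du_dc, i_d, i_q, u_dc_ref, i_d_ref, i_q_ref,
                e_d, e_q, R3, L3, omega, T1, T2, k_d):
    """Synergetic control laws u_d and u_q for the d and q axes."""
    u_3d, u_3q = u3_terms(i_d, i_q, e_d, e_q, R3, L3, omega)
    psi_d = (u_dc_ref - u_dc) + k_d * (i_d_ref - i_d)        # d-axis macro variable
    u_d = L3 / (T1 * k_d) * (-T1 * du_dc + psi_d) - u_3d     # d-axis control law
    u_q = L3 / T2 * (i_q_ref - i_q) - u_3q                   # q-axis control law
    return u_d, u_q

# Hypothetical operating point and parameters, for illustration only.
p = dict(e_d=200.0, e_q=0.0, R3=0.01, L3=1e-3, omega=314.159)
state = dict(u_dc=498.0, du_dc=-20.0, i_d=100.0, i_q=2.0,
             u_dc_ref=500.0, i_d_ref=105.0, i_q_ref=0.0)
T1, T2, k_d = 0.01, 0.001, 0.5
u_d, u_q = syn_control(**state, **p, T1=T1, T2=T2, k_d=k_d)

# Closed-loop check: insert u_d and u_q into the current dynamics.
u_3d, u_3q = u3_terms(state["i_d"], state["i_q"], **p)
di_d = (u_3d + u_d) / p["L3"]
di_q = (u_3q + u_q) / p["L3"]
psi_d = (state["u_dc_ref"] - state["u_dc"]) + k_d * (state["i_d_ref"] - state["i_d"])
residual_d = T1 * (-state["du_dc"] - k_d * di_d) + psi_d
residual_q = -T2 * di_q + (state["i_q_ref"] - state["i_q"])
```

Both residuals vanish up to rounding, which confirms that the two laws drive the macro variables ψd and ψq to zero with time constants T1 and T2, respectively.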
Figure 19 presents the MATLAB Simulink implementation subsystem of the designed SYN controller.
Figure 20 shows the model MATLAB Simulink implementation of the control system for the GC-PV array based on the SMC (MATLAB Simulink subsystem implementation shown in Figure 18) and SYN (MATLAB Simulink subsystem implementation shown in Figure 19) controllers using the RL-TD3 agent for the correction of the control signals.
Following on from the aspects presented in Section 4, this section continues to present the ways in which the performance of the GC-PV system could be improved by using the RL-TD3 agent, even when using complex SMC- and SYN-type controllers, i.e., a customization of Figure 2 for the control system presented in this section.

5.3. Implementation of the RL-TD3 Agent for the Correction of the Outer Voltage Control Loop Using SMC and Synergetic Control

The block diagram of the MATLAB Simulink subsystem implementation of the GC-PV control system using SMC and SYN controllers and the improved performance of the RL-TD3 agent being used for the outer control loop is shown in Figure 21. The correction signals of the RL-TD3 agent were added to the command signal idref, the RL-TD3 block structure was similar to that in Figure 9 and the Reward was given by Equation (19).
The training time in this case was 3 h, 12 min and 17 s. The graphical results for this training stage are presented in Figure 22.

5.4. Implementation of the RL-TD3 Agent for the Correction of the Inner Currents Control Loop Using SMC and Synergetic Control

The MATLAB Simulink subsystem block implementation of the GC-PV control system using SMC and SYN and the improved performance from using the RL-TD3 agent in the inner control loop is presented in Figure 23. The correction signals of the RL-TD3 agent were added to the command signals udref and uqref, the RL-TD3 block structure was similar to that in Figure 12 and the Reward was obtained using Equation (20).
The training time in this case was 3 h, 3 min and 36 s. The graphical results for this training stage are presented in Figure 24.

5.5. Implementation of the RL-TD3 Agent for the Correction of the Outer Voltage Control Loop and Inner Current Control Loops Using SMC and Synergetic Control

The MATLAB Simulink subsystem block implementation of the GC-PV control system using SMC and SYN and the improved performance from using the RL-TD3 agent in the outer and inner control loops is presented in Figure 25. The correction signals of the RL-TD3 agent were added to the command signals idref, udref and uqref, the RL-TD3 block structure was similar to that in Figure 15 and the Reward was obtained using Equation (21).
The training time in this case was 2 h, 15 min and 24 s. The graphical results for this training stage are presented in Figure 26.

6. Numerical Simulations

This section starts with the benchmark that was presented in MATLAB Simulink [15], which is revisited in order to compare it with the best results obtained in [26,27,31,33]. After presenting the main characteristics of the benchmark system, this section presents the numerical simulations that were based on the theoretical elements presented in the previous sections. Thus, starting with the cascade control structure in which PI-type controllers were used for the inner control loop of the id and iq current signals and the outer control loop of the udc voltage signal and using the elements regarding the RL-TD3 agent, a superior performance of the control system for the GC-PV array was obtained.
Moreover, in the second part of the numerical simulations and starting from the peak performances presented in [32,33] regarding the cascade control system in which an SYN-type controller was used for the inner control loop of the id and iq current signals, an SMC-type controller was used for the outer control loop of the udc voltage signal and the elements regarding the RL-TD3 agent were used, a superior performance of the control system for the GC-PV array was obtained, both in terms of the direct comparison of these performances and the robustness provided by the control system under parametric variations, such as the variation caused by the three-phase load.
Regarding the characteristics of the benchmark system presented in Section 2, we note that it was a 100 kW model in which the value of the udc voltage in the DC intermediate circuit was set to a value of 500 V; therefore, one of the objectives of the control system is to maintain this voltage and the voltage value supplied by the DC–AC converter was 260 V. The load was connected to the main grid via a 25 kV–260 V transformer. The nominal value of the three-phase load was 10 kvar. The MPPT algorithm used was that presented and implemented in the benchmark system [15,31] and was kept unchanged so as to be able to compare the performances of different the control systems for the GC-PV array. The PV array consisted of 330 modules that could supply 100.7 kW (305.2 W/modules) and in which the short circuit current of each module was Isc = 5.96 A and the open circuit voltage was Voc = 64.2 V. The sampling period for the PWM generator was 1 ms and the sampling period for the voltage and the current were 100 ms. Similar to the benchmark, the control system was bypassed for the first 50 ms.
The simulation of the PV array operation depended on the evolution of the irradiance and temperature input signals, which is shown in Figure 27. These irradiance and temperature profiles, referred to as the type 1 PV array signals, are those used in the numerical simulations of the GC-PV array control system using PI-type controllers.
Figure 28 shows the time evolution of the udc voltage of the GC-PV array control system using PI-type controllers for the irradiance and temperature signals of the type 1 PV array. The steady-state error of the control system based on PI controllers was 1 V, i.e., 0.2% of the 500 V reference, and the overshoot was negligible.
Next, Figure 29, Figure 30, Figure 31 and Figure 32 show the time evolution of the following quantities of interest of the control system: the id and iq currents; the power Pmean and voltage Umean of the PV; the duty cycle of the DC-DC converter; the modulation index of the DC-AC converter; the ua voltage and the ia current of the main grid; and the power flow P between the PV and the main grid. The evolution of the id and iq currents, with the reference current iqref = 0 and the id current following the idref reference current, is presented in Figure 29. The evolution of the power Pmean and voltage Umean of the PV, the duty cycle of the DC-DC converter and the modulation index of the DC-AC converter is presented in Figure 30. The time evolution of the ua voltage and the ia current of the main grid is presented in Figure 31. The evolution of the power flow P between the PV and the main grid is presented in Figure 32.
To assess the performance of the control system for the GC-PV array using PI-type controllers, a step variation of the udcref reference voltage from 500 V to 550 V was applied at 1 s. The result of the numerical simulation is presented in Figure 33.
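Indicators such as the response time, overshoot and steady-state error can be extracted from a sampled step response. The following Python sketch shows one way to do this. It assumes a settling-band definition of the response time (the text does not state which definition was used), and the function name `step_metrics`, the 2% band and the synthetic first-order signal are all illustrative assumptions, not simulation data from the benchmark.

```python
import numpy as np


def step_metrics(t, y, y_start, y_ref, band=0.02):
    """Return (response_time, overshoot_pct, steady_state_error_pct).

    Assumption: the response time is the time after which y stays within
    +/- band * |y_ref - y_start| of y_ref (settling-band criterion).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    step = y_ref - y_start
    tol = band * abs(step)

    # Index of the last sample that lies outside the tolerance band.
    outside = np.abs(y - y_ref) > tol
    if not outside.any():
        response_time = 0.0
    else:
        last_out = np.nonzero(outside)[0][-1]
        if last_out == len(t) - 1:
            response_time = float("nan")  # never settles in this window
        else:
            response_time = t[last_out + 1] - t[0]

    # Overshoot beyond the reference, relative to the step amplitude.
    peak = np.max((y - y_ref) * np.sign(step))
    overshoot_pct = max(peak, 0.0) / abs(step) * 100.0

    # Steady-state error: mean of the last 10% of samples, relative to y_ref.
    tail = y[-max(1, len(y) // 10):]
    sse_pct = abs(np.mean(tail) - y_ref) / abs(y_ref) * 100.0
    return response_time, overshoot_pct, sse_pct


# Synthetic first-order response from 500 V to 550 V (illustrative only).
t = np.linspace(0.0, 0.2, 2001)
tau = 0.01  # hypothetical time constant, s
y = 500.0 + 50.0 * (1.0 - np.exp(-t / tau))

rt, os_pct, sse_pct = step_metrics(t, y, 500.0, 550.0)
print(f"response time = {rt * 1e3:.1f} ms, overshoot = {os_pct:.2f} %, "
      f"steady-state error = {sse_pct:.4f} %")
```

With this definition, the steady-state error is expressed relative to the reference, which matches the way a 1 V error on a 500 V reference is reported as 0.2% in the text.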
Following the design and training of the RL-TD3 agents, the numerical simulations for the cases described in Section 4.1, Section 4.2, Section 4.3 and Section 3 are presented in Figure 34, Figure 35 and Figure 36.
Figure 37 presents the comparative responses of the control systems for the GC-PV array to a step variation of the reference from 500 V to 550 V, for the control system based on PI-type controllers and for three variants of this control system that use the RL-TD3 agent for the correction of the command signals (outer and inner control loops).
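The structure being compared here, in which the correction signals of the agent are added to the commands of the outer and inner loops, can be sketched as follows. This is a simplified Python illustration and not the Simulink implementation: the `PI` class, the gains, the sign convention of the voltage error, the observation passed to the agent and the `agent_correction` stub (which returns zeros in place of a trained RL-TD3 agent) are all assumptions, and the decoupling and feedforward terms of a real current regulator are omitted.

```python
from dataclasses import dataclass


@dataclass
class PI:
    """Discrete PI controller with forward-Euler integration."""
    kp: float
    ki: float
    ts: float              # sampling period, s
    integral: float = 0.0

    def step(self, error: float) -> float:
        self.integral += self.ki * error * self.ts
        return self.kp * error + self.integral


def agent_correction(observation):
    """Stand-in for a trained RL-TD3 agent (hypothetical).

    Returns the corrections (d_idref, d_ud, d_uq). Zeros are returned here,
    so the loop reduces to the plain PI cascade.
    """
    return 0.0, 0.0, 0.0


def cascade_step(outer, inner_d, inner_q, udc_ref, udc, i_d, i_q, iq_ref=0.0):
    """One sample of the cascade: outer udc loop, inner id and iq loops."""
    e_udc = udc_ref - udc  # sign convention assumed for illustration
    d_idref, d_ud, d_uq = agent_correction((e_udc, i_d, i_q))

    # Outer loop output plus the correction of the idref command.
    id_ref = outer.step(e_udc) + d_idref

    # Inner loop outputs plus the corrections of the ud and uq commands.
    u_d = inner_d.step(id_ref - i_d) + d_ud
    u_q = inner_q.step(iq_ref - i_q) + d_uq
    return id_ref, u_d, u_q


# Hypothetical gains, chosen only to make the example easy to follow.
outer = PI(kp=2.0, ki=0.0, ts=1e-4)
inner_d = PI(kp=1.0, ki=0.0, ts=1e-4)
inner_q = PI(kp=1.0, ki=0.0, ts=1e-4)
print(cascade_step(outer, inner_d, inner_q, 550.0, 500.0, 0.0, 0.0))
```

The three variants differ only in which of the corrections are active: the idref correction alone, the ud and uq corrections alone, or all three together.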
Table 1 presents the comparative performances of these control system variants for the GC-PV array, i.e., the response times and the ripple of the voltage error signal obtained using Equation (55). In all of the cases in which the RL-TD3 agent was used, the overshoot was almost zero and the steady-state error was less than 0.2%. The numerical simulations show that the use of an RL-TD3 agent improved the performance of the GC-PV array control system.
$$u_{dc\_rip} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(u_{dc}(i) - u_{dcref}(i)\right)^{2}} \qquad (55)$$
where N is the number of samples, udc is the voltage of the DC intermediate circuit and udcref is the reference voltage.
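The ripple measure can be evaluated directly from sampled signals. The Python sketch below assumes that Equation (55) is the root-mean-square of the voltage error, which is consistent with the ripple being reported in volts. The function name `udc_ripple` and the sample values are illustrative and are not taken from the simulations.

```python
import numpy as np


def udc_ripple(udc, udc_ref):
    """Root-mean-square of (udc - udcref) over N samples, in volts.

    udc_ref may be a scalar (constant reference) or an array of the same
    length as udc (time-varying reference).
    """
    udc = np.asarray(udc, dtype=float)
    udc_ref = np.broadcast_to(np.asarray(udc_ref, dtype=float), udc.shape)
    return float(np.sqrt(np.mean((udc - udc_ref) ** 2)))


# Made-up samples around a 500 V reference (illustrative only).
samples = np.array([499.0, 501.0, 500.0, 502.0])
ripple = udc_ripple(samples, 500.0)
print(f"udc ripple = {ripple:.4f} V")
```

Because the errors are squared before averaging, the measure weights large transient deviations more heavily than small steady-state ones.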
The irradiance and temperature profiles in Figure 38, referred to as the type 2 PV array signals, are those used in the numerical simulations of the GC-PV array control systems using SMC- and SYN-type controllers.
Figure 39 presents the response of the SMC-type control, which was designed for the control of the udc voltage, combined with the SYN-type control, which was designed for the control of the id and iq currents, for the DC voltage reference udcref = 500 V. In the detail in Figure 39, it can be observed that the steady-state error was 0.1 V, i.e., 0.02%.
To demonstrate the parametric robustness of the control system, Figure 40 and Figure 41 show the evolution of the udc voltage when the three-phase load was varied by 30% from its nominal value of 10 kvar, i.e., to 13 kvar and 7 kvar, respectively. The control system for the GC-PV array using SMC- and SYN-type controllers maintained its performance in each of these cases.
For the case in which the input for the GC-PV array was provided by the type 2 PV array signals, Figure 42, Figure 43, Figure 44 and Figure 45 present the time evolution of the following quantities of interest of the control system: the id and iq currents; the Pmean power and Umean voltage of the PV; the duty cycle of the DC-DC converter; the modulation index of the DC-AC converter; the voltage ua and current ia of the main grid; and the power flow P between the PV and the main grid. The time evolution of the id and iq currents, with the reference current iqref = 0 and the id current following the idref reference current, is presented in Figure 42. The time evolution of the power Pmean and voltage Umean of the PV, the duty cycle of the DC-DC converter and the modulation index of the DC-AC converter is presented in Figure 43. The evolution of the ua voltage and ia current of the main grid is presented in Figure 44. The time evolution of the power flow P between the PV and the main grid is presented in Figure 45.
To assess the performance of the control system for the GC-PV array using SMC- and SYN-type controllers, a step variation of the udcref reference voltage from 500 V to 550 V was applied at 1 s. The result of the numerical simulation is presented in Figure 46.
Following the design and training of the proposed RL-TD3 agents, the numerical simulations for the cases described in Section 5.3, Section 5.4 and Section 5.5 are presented in Figure 47, Figure 48 and Figure 49.
Figure 50 presents the comparative responses of the control systems for the GC-PV array to a step variation from 500 V to 550 V, for the control system based on SMC and SYN controllers and for three variants of this control system that use the RL-TD3 agent for the correction of the command signals (outer and inner control loops).
Table 2 presents the comparative performances of these control system variants for the GC-PV array, i.e., the response times and the ripple of the voltage error signal obtained using Equation (55). In all of the cases in which the RL-TD3 agent was used, the overshoot was almost zero and the steady-state error was less than 0.02%. The numerical simulations show that the use of an RL-TD3 agent improved the performance of the GC-PV array control system.
It can be observed that the response time was improved by approximately 7 ms between the PI-type control system and the PI control system using the RL-TD3 agent, and by approximately 1 ms between the SMC and SYN control system and the SMC and SYN control system using the RL-TD3 agent, i.e., a decrease in the response time of approximately 18% in the first case and approximately 7% in the second case. Between the simplest and the most complex cases, the decrease in the response time was about 27 ms, i.e., about 70%. The remaining performance indicators can be evaluated in the same way, in relative or absolute units.

7. Conclusions

This paper described the control system for a GC-PV array, starting from a benchmark system. The control structure was a cascade-type structure in which PI or SYN controllers were used for the inner control loops of the id and iq currents and PI or SMC controllers were used for the outer control loop of the udc voltage in the DC intermediate circuit. The paper presented the model of the PV array together with the main component blocks: the simulated inputs for the PV array; the PV array itself; the MPPT algorithm; the DC-DC boost converter; the voltage and current measurements for the DC intermediate circuit; the DC-AC converter; the load and connection to the power grid; and the power grid. It also presented the stages of building and training the RL-TD3 agent. Comparative results were shown for the cases in which the trained RL-TD3 agent provided correction signals that were added to the command signals ud, uq and idref. The parametric robustness of the proposed control system for the GC-PV array based on SMC and SYN controllers was demonstrated for a 30% variation of the three-phase load. The results of the numerical simulations were presented comparatively, and the synthesis of the proposed control system for the GC-PV array was validated against the software benchmark of a control system for a GC-PV array implemented in MATLAB Simulink. The numerical simulations showed the superiority of the control system that used the RL-TD3 agent.

Author Contributions

Conceptualization, M.N. and C.-I.N.; data curation, M.N., C.-I.N. and D.S.; formal analysis, M.N., C.-I.N. and D.S.; funding acquisition, M.N. and D.S.; investigation, M.N., C.-I.N. and D.S.; methodology, M.N., C.-I.N. and D.S.; project administration, M.N. and D.S.; resources, M.N. and D.S.; software, M.N. and C.-I.N.; supervision, M.N. and D.S.; validation, M.N. and D.S.; visualization, M.N., C.-I.N. and D.S.; writing—original draft, M.N., C.-I.N. and D.S.; writing—review and editing, M.N., C.-I.N. and D.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the European Regional Development Fund Competitiveness Operational Program, project TISIPRO, ID: P_40_416/105736, 2016–2021, and with funds from the Ministry of Research and Innovation in Romania as part of the NUCLEU program, PN 19 38 01 03.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tricarico, T.; Gontijo, G.; Neves, M.; Soares, M.; Aredes, M.; Guerrero, J.M. Control Design, Stability Analysis and Experimental Validation of New Application of an Interleaved Converter Operating as a Power Interface in Hybrid Microgrids. Energies 2019, 12, 437.
  2. Petersen, L.; Iov, F.; Tarnowski, G.C. A Model-Based Design Approach for Stability Assessment, Control Tuning and Verification in Off-Grid Hybrid Power Plants. Energies 2020, 13, 49.
  3. Veerashekar, K.; Askan, H.; Luther, M. Qualitative and Quantitative Transient Stability Assessment of Stand-Alone Hybrid Microgrids in a Cluster Environment. Energies 2020, 13, 1286.
  4. Zhao, F.; Yuan, J.; Wang, N.; Zhang, Z.; Wen, H. Secure Load Frequency Control of Smart Grids under Deception Attack: A Piecewise Delay Approach. Energies 2019, 12, 2266.
  5. Montoya, O.D.; Gil-González, W.; Rivas-Trujillo, E. Optimal Location-Reallocation of Battery Energy Storage Systems in DC Microgrids. Energies 2020, 13, 2289.
  6. Alshehri, J.; Khalid, M.; Alzahrani, A. An Intelligent Battery Energy Storage-Based Controller for Power Quality Improvement in Microgrids. Energies 2019, 12, 2112.
  7. Estévez-Bén, A.A.; Alvarez-Diazcomas, A.; Rodríguez-Reséndiz, J. Transformerless Multilevel Voltage-Source Inverter Topology Comparative Study for PV Systems. Energies 2020, 13, 3261.
  8. Yan, X.; Cui, Y.; Cui, S. Control Method of Parallel Inverters with Self-Synchronizing Characteristics in Distributed Microgrid. Energies 2019, 12, 3871.
  9. Coppola, M.; Guerriero, P.; Dannier, A.; Daliento, S.; Lauria, D.; Del Pizzo, A. Control of a Fault-Tolerant Photovoltaic Energy Converter in Island Operation. Energies 2020, 13, 3201.
  10. Khan, K.; Kamal, A.; Basit, A.; Ahmad, T.; Ali, H.; Ali, A. Economic Load Dispatch of a Grid-Tied DC Microgrid Using the Interior Search Algorithm. Energies 2019, 12, 634.
  11. Cook, M.D.; Trinklein, E.H.; Parker, G.G.; Robinett, R.D., III; Weaver, W.W. Optimal and Decentralized Control Strategies for Inverter-Based AC Microgrids. Energies 2019, 12, 3529.
  12. Oviedo Cepeda, J.C.; Osma-Pinto, G.; Roche, R.; Duarte, C.; Solano, J.; Hissel, D. Design of a Methodology to Evaluate the Impact of Demand-Side Management in the Planning of Isolated/Islanded Microgrids. Energies 2020, 13, 3459.
  13. Stadler, M.; Pecenak, Z.; Mathiesen, P.; Fahy, K.; Kleissl, J. Performance Comparison between Two Established Microgrid Planning MILP Methodologies Tested On 13 Microgrid Projects. Energies 2020, 13, 4460.
  14. Artale, G.; Caravello, G.; Cataliotti, A.; Cosentino, V.; Di Cara, D.; Guaiana, S.; Nguyen Quang, N.; Palmeri, M.; Panzavecchia, N.; Tinè, G. A Virtual Tool for Load Flow Analysis in a Micro-Grid. Energies 2020, 13, 3173.
  15. MathWorks—Detailed Model of a 100-kW Grid-Connected PV Array. Available online: https://nl.mathworks.com/help/physmod/sps/ug/detailed-model-of-a-100-kw-grid-connected-pv-array.html;jsessionid=29903e2e045151ffb3e27a4920e1 (accessed on 4 November 2020).
  16. Hong, W.; Tao, G. An Adaptive Control Scheme for Three-phase Grid-Connected Inverters in Photovoltaic Power Generation Systems. In Proceedings of the Annual American Control Conference (ACC), Milwaukee, WI, USA, 27–29 June 2018; pp. 899–904.
  17. Naderi, M.; Khayat, Y.; Bevrani, H. Robust Multivariable Microgrid Control Synthesis and Analysis. Energy Procedia 2016, 100, 375–387.
  18. Hua, H.; Qin, Y.; Xu, H.; Hao, C.; Cao, J. Robust Control Method for DC Microgrids and Energy Routers to Improve Voltage Stability in Energy Internet. Energies 2019, 12, 1622.
  19. Villalón, A.; Rivera, M.; Salgueiro, Y.; Muñoz, J.; Dragičević, T.; Blaabjerg, F. Predictive Control for Microgrid Applications: A Review Study. Energies 2020, 13, 2454.
  20. Zeb, K.; Islam, S.U.; Din, W.U.; Khan, I.; Ishfaq, M.; Busarello, T.D.C.; Ahmad, I.; Kim, H.J. Design of Fuzzy-PI and Fuzzy-Sliding Mode Controllers for Single-Phase Two-Stages Grid-Connected Transformerless Photovoltaic Inverter. Electronics 2019, 8, 520.
  21. Kamal, T.; Karabacak, M.; Perić, V.S.; Hassan, S.Z.; Fernández-Ramírez, L.M. Novel Improved Adaptive Neuro-Fuzzy Control of Inverter and Supervisory Energy Management System of a Microgrid. Energies 2020, 13, 4721.
  22. Song, L.; Huang, L.; Long, B.; Li, F. A Genetic-Algorithm-Based DC Current Minimization Scheme for Transformless Grid-Connected Photovoltaic Inverters. Energies 2020, 13, 746.
  23. Yoshida, Y.; Farzaneh, H. Optimal Design of a Stand-Alone Residential Hybrid Microgrid System for Enhancing Renewable Energy Deployment in Japan. Energies 2020, 13, 1737.
  24. Younesi, A.; Shayeghi, H.; Siano, P. Assessing the Use of Reinforcement Learning for Integrated Voltage/Frequency Control in AC Microgrids. Energies 2020, 13, 1250.
  25. Serra, F.M.; Fernández, L.M.; Montoya, O.D.; Gil-González, W.; Hernández, J.C. Nonlinear Voltage Control for Three-Phase DC-AC Converters in Hybrid Systems: An Application of the PI-PBC Method. Electronics 2020, 9, 847.
  26. Wu, B.; Zhou, X.; Ma, Y. Bus Voltage Control of DC Distribution Network Based on Sliding Mode Active Disturbance Rejection Control Strategy. Energies 2020, 13, 1358.
  27. Qian, J.; Li, K.; Wu, H.; Yang, J.; Li, X. Synergetic Control of Grid-Connected Photovoltaic Systems. Int. J. Photoenergy 2017, 2107, 1–11.
  28. Brandimarte, P. Approximate Dynamic Programming and Reinforcement Learning for Continuous States. In From Shortest Paths to Reinforcement Learning: A MATLAB-Based Tutorial on Dynamic Programming; Springer Nature: Cham, Switzerland, 2021; pp. 185–204.
  29. Beale, M.; Hagan, M.; Demuth, H. Deep Learning Toolbox™ Getting Started Guide, 14th ed.; MathWorks, Inc.: Natick, MA, USA, 2020.
  30. MathWorks—Reinforcement Learning Toolbox™ User’s Guide. Available online: https://www.mathworks.com/help/reinforcement-learning/getting-started-with-reinforcement-learning-toolbox.html?s_tid=CRUX_lftnav (accessed on 4 November 2020).
  31. de Brito, M.A.G.; Sampaio, L.P.; Luigi, G.; e Melo, G.A.; Canesin, C.A. Comparative analysis of MPPT techniques for PV applications. In Proceedings of the International Conference on Clean Electrical Power (ICCEP), Ischia, Italy, 14–16 June 2011; pp. 99–104.
  32. Nicola, M.; Nicola, C.-I. Sensorless Fractional Order Control of PMSM Based on Synergetic and Sliding Mode Controllers. Electronics 2020, 9, 1494.
  33. Nicola, M.; Nicola, C.-I. Fractional-Order Control of Grid-Connected Photovoltaic System Based on Synergetic and Sliding Mode Controllers. Energies 2021, 14, 510.
Figure 1. The schematic block of the main circuit for the GC-PV system.
Figure 2. The schematic diagram of the cascade control system for the GC-PV system.
Figure 3. The schematic diagram for an RL of process control.
Figure 4. The schematic diagram of the control system for the GC-PV system based on PI-type controllers.
Figure 5. The model MATLAB Simulink implementation of the control system for the GC-PV array based on PI-type controllers using an RL-TD3 agent for the correction of control signals.
Figure 6. An example of the MATLAB syntax program code for the DNN creation.
Figure 7. The graphic representation of the created DNN.
Figure 8. The model MATLAB Simulink implementation of the control system for the GC-PV array based on PI controllers using the RL-TD3 agent for the correction of the idref command.
Figure 9. The MATLAB Simulink subsystem of the RL-TD3 agent for the correction of the idref command.
Figure 10. The training stage of the RL-TD3 agent for the correction of the idref command.
Figure 11. The model MATLAB Simulink implementation of the control system for the GC-PV array based on PI controllers using the RL-TD3 agent for the correction of the udref and uqref commands.
Figure 12. The MATLAB Simulink subsystem of the RL-TD3 agent for the correction of the udref and uqref commands.
Figure 13. The training stage of the RL-TD3 agent for the correction of the udref and uqref commands.
Figure 14. The model MATLAB Simulink implementation of the control system for the GC-PV array based on PI controllers using the RL-TD3 agent for the correction of the udref, uqref and idref commands.
Figure 15. The MATLAB Simulink subsystem of the RL-TD3 agent for the correction of the udref, uqref and idref commands.
Figure 16. The training stage of the RL-TD3 agent for the correction of the udref, uqref and idref commands.
Figure 17. The schematic diagram of the control system for the GC-PV array based on the SMC and SYN controllers.
Figure 18. The MATLAB Simulink subsystem implementation of the SMC controller.
Figure 19. The MATLAB Simulink implementation subsystem for the SYN controller.
Figure 20. The model MATLAB Simulink implementation of the control system for the GC-PV array based on SMC and SYN controllers using the RL-TD3 agent for the correction of the control signals.
Figure 21. The block diagram of the MATLAB Simulink subsystem implementation of the control system for the GC-PV array based on SMC and SYN using the RL-TD3 agent for the correction of the idref command.
Figure 22. The training stage of the RL-TD3 agent for the correction of the idref command.
Figure 23. The MATLAB Simulink subsystem block implementation of the control system for the GC-PV array based on SMC and SYN and using the RL-TD3 agent for the correction of the udref and uqref command signals.
Figure 24. The training stage of the RL-TD3 agent for the correction of the udref and uqref commands.
Figure 25. The MATLAB Simulink subsystem block implementation of the control system for the GC-PV array based on SMC and SYN and using the RL-TD3 agent for the correction of the udref, uqref and idref commands.
Figure 26. The training stage of the RL-TD3 agent for the correction of the udref, uqref and idref commands.
Figure 27. The time evolution of irradiance and temperature (signal evolution of the type 1 PV array).
Figure 28. The time evolution of the udc voltage for the irradiance and temperature using PI-type controllers (signal evolution of the type 1 PV array).
Figure 29. The time evolution of the id and iq currents (signal evolution of the type 1 PV array).
Figure 30. The time evolutions of Pmean and Umean, the duty cycle of the DC-DC converter and the modulation index of the DC–AC converter (signal evolution of the type 1 PV array).
Figure 31. The time evolution of the ua voltage and ia current of the main grid (signal evolution of the type 1 PV array).
Figure 32. The time evolution of the power flow P between the PV and the main grid (signal evolution of the type 1 PV array).
Figure 33. The time evolution of the udc voltage for a step variation of udcref from 500 V to 550 V using PI controllers (signal evolution of the type 1 PV array).
Figure 34. The time evolution of the udc voltage for a step variation of the udcref reference voltage from 500 V to 550 V using PI controllers and the RL-TD3 agent for the correction of the idref command (signal evolution of the type 1 PV array).
Figure 35. The time evolution of the udc voltage for a step variation of the udcref reference voltage from 500 V to 550 V using PI controllers and the RL-TD3 agent for the correction of the udref and uqref commands (signal evolution of the type 1 PV array).
Figure 36. The time evolution of the udc voltage for a step variation of the udcref reference voltage from 500 V to 550 V using PI controllers and the RL-TD3 agent for the correction of the idref, udref and uqref commands (signal evolution of the type 1 PV array).
Figure 37. The comparison of voltage udc for a step variation of the udcref reference voltage from 500 V to 550 V using PI controllers and the RL-TD3 agent for the correction of the outer and inner loop commands (signal evolution of the type 1 PV array).
Figure 38. The time evolution of irradiance and temperature (signal evolution of the type 2 PV array).
Figure 39. The time evolution of the udc voltage for irradiance and temperature using SMC- and SYN-type controllers for a 10 kvar load (signal evolution of the type 2 PV array).
Figure 40. The time evolution of the udc voltage for irradiance and temperature using SMC- and SYN-type controllers for a 13 kvar load (signal evolution of the type 2 PV array).
Figure 41. The time evolution of the udc voltage for irradiance and temperature using SMC- and SYN-type controllers for a 7 kvar load (signal evolution of the type 2 PV array).
Figure 42. The time evolution of the id and iq currents (signal evolution of the type 2 PV array).
Figure 43. The time evolutions of the Pmean and Umean of the PV, the duty cycle of the DC–DC converter and the modulation index of the DC–AC converter (signal evolution of the type 2 PV array).
Figure 44. The time evolution of the ua voltage and ia current of the main grid (signal evolution of the type 2 PV array).
Figure 45. The time evolution of the power flow P between the PV and the main grid (signal evolution of the type 2 PV array).
Figure 46. The time evolution of voltage udc for a step variation of the udcref reference voltage from 500 V to 550 V using SMC- and SYN-type controllers (signal evolution of the type 2 PV array).
Figure 47. The time evolution of voltage udc for a step variation of the udcref reference voltage from 500 V to 550 V using SMC- and SYN-type controllers and the RL-TD3 agent for the correction of the idref command (signal evolution of the type 2 PV array).
Figure 48. The time evolution of voltage udc for a step variation of the udcref reference voltage from 500 V to 550 V using SMC- and SYN-type controllers and the RL-TD3 agent for the correction of the udref and uqref commands (signal evolution of the type 2 PV array).
Figure 49. The time evolution of voltage udc for a step variation of the udcref reference voltage from 500 V to 550 V using SMC- and SYN-type controllers and the RL-TD3 agent for the correction of the idref, udref and uqref commands (signal evolution of the type 2 PV array).
Figure 50. The comparison of voltage udc for a step variation of the udcref reference voltage from 500 V to 550 V using SMC- and SYN-type controllers and the RL-TD3 agent for the outer and inner loop corrections (signal evolution of the type 2 PV array).
Table 1. The performances of the GC-PV array control system based on PI-type controllers using the RL-TD3 agent.
| Controllers for the GC-PV array | Response time (ms) | Voltage ripple (V) | Overshooting (%) | Steady-state error (%) |
|---|---|---|---|---|
| PI | 40.45 | 7.87 | <0.5 | 0.2 |
| PI using the RL-TD3 agent for the correction of the idref command | 37.15 | 7.23 | <0.5 | 0.2 |
| PI using the RL-TD3 agent for the correction of the udref and uqref commands | 35.95 | 6.67 | <0.5 | 0.2 |
| PI using the RL-TD3 agent for the correction of the udref, uqref and idref commands | 33.85 | 6.22 | <0.5 | 0.2 |
Table 2. The performances of the GC-PV control systems based on SMC and SYN controllers using the RL-TD3 agent.
| Controllers for the GC-PV array | Response time (ms) | Voltage ripple (V) | Overshooting (%) | Steady-state error (%) |
|---|---|---|---|---|
| SMC and SYN | 14.15 | 5.63 | <0.2 | 0.02 |
| SMC and SYN using the RL-TD3 agent for the correction of the idref command | 13.75 | 5.12 | <0.2 | 0.02 |
| SMC and SYN using the RL-TD3 agent for the correction of the udref and uqref commands | 13.55 | 4.58 | <0.2 | 0.02 |
| SMC and SYN using the RL-TD3 agent for the correction of the udref, uqref and idref commands | 13.25 | 4.03 | <0.2 | 0.02 |
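
Tables 1 and 2 compare the controllers through four indices of the udc step response: response time, voltage ripple, overshoot and steady-state error. This section does not state how those indices were extracted from the simulated signals, so the following is only a minimal Python sketch of one common way to compute them from a sampled trace. The function name `step_metrics`, the 2% settling band, the choice of the last 10% of samples as the steady-state window, and the synthetic first-order test signal are all assumptions made for illustration; none of them are taken from the article, and the signal is not the article's data.

```python
import math


def step_metrics(t, y, y0, y_ref, band=0.02, tail=0.1):
    """Illustrative step-response indices (all definitions are assumptions).

    t, y   : sampled time (ms) and signal (V), equal-length lists
    y0     : value before the step, y_ref: reference after the step
    band   : settling band as a fraction of the step size (assumed 2%)
    tail   : fraction of final samples treated as steady state (assumed 10%)
    """
    step = y_ref - y0
    tol = band * abs(step)

    # Response time: earliest sample after which y never leaves the band.
    t_resp = None
    for ti, yi in zip(reversed(t), reversed(y)):
        if abs(yi - y_ref) > tol:
            break
        t_resp = ti

    # Overshoot: how far the normalized response exceeds 1, in percent.
    peak = max((yi - y0) / step for yi in y)
    overshoot = max(peak - 1.0, 0.0) * 100.0

    # Steady-state window: the last `tail` fraction of the samples.
    n_tail = max(1, int(len(y) * tail))
    window = y[-n_tail:]
    mean = sum(window) / len(window)
    sse = abs(mean - y_ref) / abs(y_ref) * 100.0   # percent of the reference
    ripple = max(window) - min(window)             # peak-to-peak, in volts

    return {"response_time": t_resp, "overshoot": overshoot,
            "steady_state_error": sse, "ripple": ripple}


# Synthetic first-order response to a 500 V -> 550 V step (not article data).
tau = 5.0                                   # assumed time constant, ms
t = [i * 0.05 for i in range(2001)]         # 0 to 100 ms in 0.05 ms steps
y = [500.0 + 50.0 * (1.0 - math.exp(-ti / tau)) for ti in t]

m = step_metrics(t, y, y0=500.0, y_ref=550.0)
print(m)
```

With a different settling band or steady-state window the numbers change, so indices obtained this way are comparable only when every controller is evaluated with the same definitions.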

Nicola, M.; Nicola, C.-I.; Selișteanu, D. Improvement of the Control of a Grid Connected Photovoltaic System Based on Synergetic and Sliding Mode Controllers Using a Reinforcement Learning Deep Deterministic Policy Gradient Agent. Energies 2022, 15, 2392. https://0-doi-org.brum.beds.ac.uk/10.3390/en15072392