Article

Optimal Transmission Switching for Short-Circuit Current Limitation Based on Deep Reinforcement Learning

Sirui Tang, Ting Li, Youbo Liu, Yunche Su, Yunling Wang, Fang Liu and Shuyu Gao

1 State Grid Sichuan Economic Research Institute, Chengdu 610041, China
2 College of Electrical Engineering, Sichuan University, Chengdu 610065, China
* Author to whom correspondence should be addressed.
Submission received: 22 September 2022 / Revised: 26 November 2022 / Accepted: 28 November 2022 / Published: 5 December 2022
(This article belongs to the Special Issue Energy Management of Smart Grids with Renewable Energy Resource)

Abstract

The gradual expansion of power transmission networks leads to an increase in short-circuit current (SCC), which affects the secure operation of transmission networks when the SCC exceeds the interrupting capacity of the circuit breakers. In this regard, optimal transmission switching (OTS) is proposed to reduce the short-circuit current while maximizing the loadability with respect to voltage stability. However, the OTS model is a complex combinatorial optimization problem with binary decision variables. To address this problem, this paper employs the deep Q-network (DQN)-based reinforcement learning (RL) algorithm to solve the OTS problem. Case studies on the IEEE 30-bus and IEEE 118-bus systems are presented to demonstrate the effectiveness of the proposed method. The numerical results show that the DQN-based agent can select the effective branches at each step and reduce the SCC after implementing the OTS strategies.

1. Introduction

1.1. Motivations

With the gradual expansion of power transmission networks, the electrical distance between substations has become shorter, which has, in turn, led to an increase in short-circuit current (SCC). When the SCC magnitude exceeds the interrupting capacity of the circuit breakers (CBs), the breakers may fail to interrupt the electric arc. In this case, the branch cannot be opened and the short-circuit fault is not isolated, which damages the CBs and, more importantly, endangers the security of the power system. To address this problem, the replacement of the existing CBs with ones of higher interrupting capacity and the installation of fault current limiters [1,2,3] have been proposed. However, these countermeasures require investment in equipment. In contrast, network reconfiguration can reduce the SCC in an economical way, as it does not require such investment.
However, transmission network reconfiguration is a complex combinatorial optimization problem that is difficult to solve using conventional mathematical programming algorithms. Inspired by the success of reinforcement learning (RL) in solving combinatorial optimization problems, this paper employs the deep Q-network (DQN)-based RL algorithm to solve the optimal transmission switching (OTS) problem, with the purpose of reducing the short-circuit current while maintaining the maximum loadability of the transmission network.

1.2. Related Works

1.2.1. Optimal Transmission Switching

Transmission network reconfiguration is also called optimal transmission switching (OTS) [4]. It has been reported that OTS can be used to reduce transmission losses [5,6], relieve overloads and voltage violations [7,8], and reduce operating costs [4,9]. With these benefits, OTS has been incorporated with unit commitment (UC) [10,11] and transmission expansion planning (TEP) [12] in order to enhance the flexibility of transmission system operation and planning. In [13,14], OTS and UC are coordinated to reduce the short-circuit current. In the literature, OTS is usually modeled as a mixed-integer programming (MIP) problem with a large number of binary variables, one for each branch in the power network. Therefore, OTS is a complex combinatorial optimization problem. To enhance computational efficiency, some efforts have focused on computational strategies, including solution space reduction [15] and sensitivity analysis [16].

1.2.2. Application of Reinforcement Learning in Power Engineering

In recent years, reinforcement learning has gained increasing attention as an alternative method for solving combinatorial optimization problems [17]. In the field of power engineering, RL-based methods have been proposed for operation planning [18], voltage control [19], wide-area damping control [20], and so on. In [21], a proximal policy optimization (PPO) algorithm is proposed to learn the control strategy for power system dynamic security. In [22], the multi-agent deep deterministic policy gradient (MADDPG) is proposed to regulate static var compensators (SVCs) in order to enhance the voltage stability of urban power grids.

1.3. Organization of This Paper

The rest of this paper is organized as follows. The problem description and formulation are discussed in Section 2. The deep reinforcement learning-based optimal transmission-switching strategy is proposed in Section 3. In Section 4, case studies on two benchmark systems are presented. Finally, the study's conclusions are presented in Section 5.

2. Problem Description and Formulation

2.1. Computation of Short-Circuit Current

In high-voltage power networks, the short-circuit current of a three-phase short-circuit fault is usually higher than that of other fault types. Therefore, the three-phase short-circuit current is computed to determine whether the maximum short-circuit current exceeds the interrupting capacity of the circuit breakers.
In addition, as the resistance is significantly smaller than the reactance for high-voltage transmission lines and transformers, the resistances of all the devices are neglected in practical applications [23]. Under this assumption, the nodal admittance matrix $Y_{scc}$ for short-circuit current computation differs from the one used for power flow computation. The elements of $Y_{scc}$ can be computed by (1) and (2):
$$Y_{ii} = \sum_{k=(i,j)|(j,i)\in L} \frac{\pi_k}{x_k} + \sum_{g \in G} \frac{1}{x''_{dg}} - b_{Ci} \qquad (1)$$
$$Y_{ij} = -\sum_{k=(i,j)|(j,i)\in L} \frac{\pi_k}{x_k} \qquad (2)$$
where $Y_{ii}$ and $Y_{ij}$ are the diagonal and off-diagonal elements of $Y_{scc}$, respectively. $L$ is the set of branches, including the transmission lines and the transformers. $x_k$ is the reactance of the $k$th branch, and $\pi_k$ denotes its operating status: $\pi_k = 1$ indicates that the $k$th branch is closed, whereas $\pi_k = 0$ indicates that it is open. $G$ is the set of generators, and $x''_{dg}$ is the d-axis sub-transient reactance of the $g$th generator. $b_{Ci}$ is the susceptance of the shunt capacitor at node $i$.
After forming the nodal admittance matrix $Y_{scc}$, the nodal impedance matrix $Z_{scc}$ can be computed by inverting $Y_{scc}$:
$$Z_{scc} = Y_{scc}^{-1} \qquad (3)$$
As we are focusing on the three-phase short-circuit current, the SCC of node i can be computed by (4):
$$I^{*}_{scc,i} = \frac{V_i^0}{Z_{ii}} \approx \frac{1}{Z_{ii}} \qquad (4)$$
where $I^{*}_{scc,i}$ is the per-unit value of the SCC, $V_i^0$ is the voltage magnitude under the normal operating condition, which can be approximated by 1.0 p.u., and $Z_{ii}$ is the $i$th diagonal element of the nodal impedance matrix $Z_{scc}$.
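For illustration, the following is a minimal numerical sketch of (1)-(4) in Python; the function and variable names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def short_circuit_currents(n_bus, branches, gens, b_shunt, pi):
    """Sketch of Eqs. (1)-(4): build the reactance-only admittance matrix Y_scc,
    invert it, and return the per-unit three-phase SCC at every bus.
    branches: list of (i, j, x_k) with 0-based bus indices; gens: list of (i, x''_d);
    b_shunt: shunt capacitor susceptance at every bus; pi: 0/1 branch statuses."""
    Y = np.zeros((n_bus, n_bus))
    for k, (i, j, x) in enumerate(branches):
        Y[i, i] += pi[k] / x              # Eq. (1): branch terms on the diagonal
        Y[j, j] += pi[k] / x
        Y[i, j] -= pi[k] / x              # Eq. (2): off-diagonal terms
        Y[j, i] -= pi[k] / x
    for i, x_d2 in gens:
        Y[i, i] += 1.0 / x_d2             # generator sub-transient reactances
    Y[np.diag_indices(n_bus)] -= b_shunt  # shunt capacitors at each node
    Z = np.linalg.inv(Y)                  # Eq. (3)
    return 1.0 / np.diag(Z)               # Eq. (4) with V_i^0 approximated by 1.0 p.u.
```

The per-unit values returned by such a routine would then be converted to kA using the base current of the corresponding voltage level before being compared with the breaker ratings.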

2.2. Formulation of Optimal Transmission Switching for Short-Circuit Current Limitation

In this paper, the optimal transmission switching strategy is studied from the perspective of transmission network development. During the long-term development of transmission networks, there may be a period in which the network is confronted with a short-circuit current problem. Instead of minimizing the operating cost by combining OTS with unit commitment, the proposed OTS model attempts to reduce the short-circuit current while maximizing the loadability of the transmission network. The objective of the proposed OTS model is three-fold, as given in (5)–(8):
$$\min \; \{ f_1, f_2, f_3 \} \qquad (5)$$
Here,
$$f_1 = \sum_{i \in B} \frac{I_{scc,i} - I^{limit}_{scc,i}}{I^{limit}_{scc,i}} \qquad (6)$$
$$f_2 = \frac{\lambda_0 - \lambda_{OTS}}{\lambda_0} \qquad (7)$$
$$f_3 = \frac{1}{N_L} \sum_{k \in L} (1 - \pi_k) \qquad (8)$$
where $B$ is the set of buses, $I^{limit}_{scc,i}$ is the maximum limit of the short-circuit current at node $i$, and $I_{scc,i}$ is its actual value. $\lambda_0$ and $\lambda_{OTS}$ are the maximum loadability coefficients before and after transmission switching, computed by the continuation power flow (CPF) [24]. $N_L$ is the number of branches in the power network.
It is clear that the objective $f_1$ minimizes the SCC over-current, while the objective $f_2$ attempts to maintain the loadability of the power network after transmission switching. Furthermore, the objective $f_3$ is set to reduce the number of branches that need to be switched off. The constraints are listed as follows:
(1) The network connectivity constraint. In other words, the transmission-switching strategy should not cause network splitting.
(2) The power flow constraint:
$$P_{G,i} - P_{D,i} = V_i \sum_{j=1}^{n} V_j \left( G_{ij}\cos\delta_{ij} + B_{ij}\sin\delta_{ij} \right) \qquad (9)$$
$$Q_{G,i} - Q_{D,i} = V_i \sum_{j=1}^{n} V_j \left( G_{ij}\sin\delta_{ij} - B_{ij}\cos\delta_{ij} \right) \qquad (10)$$
(3) The branch power flow security constraint:
$$|S_{ij}| \le S_{ij}^{max} \qquad (11)$$
(4) The bus voltage magnitude security constraint:
$$V_i^{min} \le V_i \le V_i^{max} \qquad (12)$$
where $P_{G,i}$ and $P_{D,i}$ are the active power generation and the active power load at node $i$, while $Q_{G,i}$ and $Q_{D,i}$ are the reactive power generation and the reactive power load. $V_i$ and $V_j$ are the bus voltage magnitudes. $G_{ij}$ and $B_{ij}$ are the real and imaginary parts of the corresponding element in the nodal admittance matrix for power flow computation. $\delta_{ij}$ denotes the phase angle difference between node $i$ and node $j$. $S_{ij}$ is the power flow from node $i$ to node $j$, and $S_{ij}^{max}$ is its maximum limit. $V_i^{min}$ and $V_i^{max}$ are the security limits of the bus voltage magnitude.
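As a small illustration, the three objective terms of (6)-(8) could be evaluated for a candidate switching state as sketched below; the variable names are illustrative, and the loadability coefficients are assumed to come from a separate CPF routine:

```python
import numpy as np

def ots_objectives(i_scc, i_limit, lam_ots, lam_0, pi):
    """Evaluate f1-f3 of Eqs. (6)-(8) for one switching state.
    i_scc, i_limit: SCC and its limit at every bus (same units);
    lam_ots, lam_0: maximum loadability after/before switching (from CPF);
    pi: 0/1 vector of branch operating statuses."""
    f1 = float(np.sum((i_scc - i_limit) / i_limit))  # SCC over-current term
    f2 = (lam_0 - lam_ots) / lam_0                   # loss of loadability
    f3 = float(np.mean(1 - pi))                      # share of switched-off branches
    return f1, f2, f3
```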

3. Optimal Transmission Switching Based on Deep Reinforcement Learning

3.1. Brief Introduction to Deep Q-Learning

In the general framework of reinforcement learning, an agent interacts with the environment and, more importantly, learns to select actions $a$ based on the rewards $r$ provided by the environment. Intuitively, the environment represents the problem to be solved. At each step $t$, the agent generates an action $a_t$ according to the partial or complete observation of the current state $s_t$ of the environment, based on its policy $\pi(a_t|s_t)$. After the action $a_t$ is implemented, the environment returns a reward $r_{t+1}$ and the new state $s_{t+1}$ to the agent. During the RL procedure, the agent learns to improve the policy $\pi(a_t|s_t)$ in order to maximize the accumulated rewards.
The conventional algorithm for RL is the Q-learning algorithm. The optimal Q-function $Q^*(s,a)$ is defined as the maximum return that can be obtained by starting from the current observation $s$, taking the action $a$, and following the optimal policy thereafter. The optimal Q-function obeys the Bellman optimality equation, as shown in (13):
$$Q^*(s,a) = \mathbb{E}\left[ r + \gamma \max_{a'} Q^*(s',a') \right] \qquad (13)$$
where $\mathbb{E}[\cdot]$ denotes the expectation over the immediate reward $r$ and the maximum future rewards, $\gamma$ is the discount factor, and $s'$ and $a'$ are the possible next states and the corresponding actions.
The basic idea behind many reinforcement learning algorithms is to estimate the Q-function by using the Bellman equation as an iterative update, as shown in (14):
$$Q_{i+1}(s,a) = \mathbb{E}\left[ r + \gamma \max_{a'} Q_i(s',a') \,\middle|\, s,a \right] \qquad (14)$$
When the state and action spaces grow, it is impractical to use a Q-table to represent the optimal policy. To address this problem, the deep Q-network (DQN)-based RL algorithm [25] was proposed by Google DeepMind. In DQN, a neural network is used to approximate the Q-function, as shown in (15):
$$Q(s,a;\theta) \approx Q^*(s,a) \qquad (15)$$
Then, the Q-network can be trained by minimizing a sequence of loss functions:
$$L_i(\theta_i) = \mathbb{E}_{s,a\sim\rho(\cdot)}\left[ \left( y_i - Q(s,a;\theta_i) \right)^2 \right] \qquad (16)$$
Here,
$$y_i = \mathbb{E}_{s'}\left[ r + \gamma \max_{a'} Q(s',a';\theta_{i-1}) \,\middle|\, s,a \right] \qquad (17)$$
where $y_i$ is the target for iteration $i$, and $\rho(\cdot)$ is a probability distribution over sequences and actions that is referred to as the behavior distribution. The parameters $\theta_{i-1}$ from the previous iteration remain fixed when optimizing the loss function $L_i(\theta_i)$.
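To make (15)-(17) concrete, a minimal PyTorch-style sketch of a Q-network and the corresponding loss is given below; the layer sizes and function names are assumptions for illustration, as the paper does not specify the network architecture:

```python
import torch
import torch.nn as nn

def build_q_network(n_state, n_action, hidden=128):
    """Q-network of Eq. (15): maps a state vector to one Q-value per candidate action."""
    return nn.Sequential(
        nn.Linear(n_state, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, n_action),
    )

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    """One-step temporal-difference loss of Eq. (16) with the target of Eq. (17).
    target_net holds the fixed parameters of the previous iteration."""
    s, a, r, s_next, done = batch                    # tensors sampled from the replay memory
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        y = r + gamma * target_net(s_next).max(dim=1).values * (1.0 - done)
    return nn.functional.mse_loss(q_sa, y)
```

One common way to realize the fixed parameters $\theta_{i-1}$ in practice is to keep a separate target network that is refreshed periodically with the current Q-network weights.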

3.2. The Proposed Methodology

We consider the procedure of optimal transmission switching for short-circuit current limitation as a Markov decision process (MDP). The settings of the MDP for optimal transmission switching are as follows.
(1) The environment. The targeted transmission network is considered as the interactive environment for the DRL agent. The computation of the power flow, short-circuit current, and maximum loadability is used to compute the rewards.
(2) The state. The state of the environment is set as the network structure, which is represented by the operating state of the branches. In this regard, the state $s$ can be formulated as (18):
$$s = [\pi_1, \pi_2, \ldots, \pi_{N_L}] \qquad (18)$$
(3) The action. The DRL agent chooses a branch to be switched off at each step.
(4) The reward. The reward is an important component of reinforcement learning, as the agent tunes the parameters of the Q-network according to the reward. Based on the OTS model described in Section 2, the reward function can be defined by (19):
$$R = \begin{cases} R_p, & \text{if the power flow converges and the SCC at every node is within its limit}\\ -R_p, & \text{if the power flow diverges or islanding occurs}\\ c_{sc} + c_{VSA} + c_{TS}, & \text{otherwise} \end{cases} \qquad (19)$$
where
$$c_{sc} = \sum_{i \in B} \frac{I^{limit}_{scc,i} - I_{scc,i}}{I^{limit}_{scc,i}} \qquad (20)$$
$$c_{VSA} = \frac{\lambda_{OTS} - \lambda_0}{\lambda_0} \qquad (21)$$
$$c_{TS} = -\frac{1}{N_L} \sum_{k \in L} (1 - \pi_k) \qquad (22)$$
(5) The training procedure. During the training procedure, the DRL agent interacts with the environment and thus learns to maximize the reward by selecting the most promising action. As the action is generated by the Q-network, a deep neural network that takes the state as the input and outputs the Q-values of all the potential actions, the training procedure can be viewed as the process that fine-tunes the Q-network. Firstly, the Q-network is initialized with random weights. At the start of each episode, the state of the environment is reset, which means all the branches are closed and the initial network structure is restored. Then, a random number $\varepsilon$ is generated. If $\varepsilon$ is lower than the exploration threshold (usually 0.1), a random branch is selected; otherwise, the state is fed into the Q-network and the branch with the highest Q-value is selected. The selected branch is switched off, and the state and the corresponding network structure are updated. The reward in (19) is computed by performing the power flow, short-circuit current, and continuation power flow computations. The record $(s_t, a_t, r_t, s_{t+1})$ is stored in the replay memory $D$. If the SCC at all the nodes is lower than the limit, the episode is terminated. The episodes repeat until the maximum number of episodes is reached. In addition, during the interactive training procedure, when the size of the replay memory $D$ exceeds the pre-set capacity $N_D$, the recorded instances in the replay memory are used to learn the weights of the Q-network via backpropagation with gradient-based optimizers such as Adam. The pseudo-code of the training procedure is presented in Algorithm 1.
Algorithm 1 Training Procedure of the OTS Agent
(1) Input: the network structure of the power system
(2) Output: the well-trained Q-network
(3) Initialize the Q-network and the replay memory $D$ with capacity $N_D$
(4) for episode = 1 to M, do:
(5)  Reset the state
(6)  for t = 1 to T, do:
(7)   With probability $\varepsilon$, select a random action; otherwise, generate the action via the Q-network
(8)   Update the state
(9)   Perform the power flow, short-circuit, and continuation power flow computations according to the changed network structure under the current state
(10)   Compute the reward by (19)
(11)   Store $(s_t, a_t, r_t, s_{t+1})$ in the replay memory $D$
(12)   if the SCC at all the nodes is lower than the limit, do:
(13)    End the loop of t
(14)   end if
(15)  end for
(16)  if the size of $D$ is larger than $N_D$, do:
(17)   Sample a minibatch of S samples from $D$
(18)   Update the parameters of the Q-network by a gradient descent step on (16)
(19)  end if
(20) end for
(6) Decision making for OTS-based short-circuit current limitation. With the well-trained Q-network, the Markov decision process for optimal transmission switching starts with the initial network structure. At each step, the Q-network generates an action, i.e., the switching of the branch that is expected to obtain the highest reward. The action is implemented, and the short-circuit current is then computed under the changed network structure. If there is any node at which the short-circuit current exceeds the interrupting capacity of the circuit breakers, branch switching continues. Otherwise, if no node suffers from a short-circuit current problem, the MDP for OTS ends, and the final network structure is taken as the optimal solution.
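The following sketch ties items (1)-(6) together: a single environment step that evaluates the reward of (19)-(22) for a candidate branch switching, and the greedy decision loop used after training. The power flow, SCC, CPF, and connectivity computations of Section 2 are passed in as caller-supplied functions; all names and the value of R_p are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def environment_step(pi, k_off, scc_fn, cpf_fn, pf_converges_fn, connected_fn,
                     i_limit, lam_0, R_p=10.0):
    """Switch off branch k_off and return (new_state, reward) following Eqs. (19)-(22).
    scc_fn/cpf_fn/pf_converges_fn/connected_fn are hypothetical interfaces to the
    short-circuit, continuation power flow, power flow, and connectivity routines."""
    pi = pi.copy()
    pi[k_off] = 0
    if not connected_fn(pi) or not pf_converges_fn(pi):
        return pi, -R_p                                  # islanding/divergence penalty
    i_scc = scc_fn(pi)                                   # Section 2.1
    if np.all(i_scc <= i_limit):
        return pi, R_p                                   # all SCC limits satisfied
    c_sc = float(np.sum((i_limit - i_scc) / i_limit))    # Eq. (20)
    c_vsa = (cpf_fn(pi) - lam_0) / lam_0                 # Eq. (21)
    c_ts = -float(np.mean(1 - pi))                       # Eq. (22)
    return pi, c_sc + c_vsa + c_ts

def greedy_ots(q_values_fn, scc_fn, step_fn, n_branch, i_limit, max_steps=10):
    """Decision process of item (6): starting from the full topology, greedily switch
    off the branch with the highest Q-value until every bus satisfies its SCC limit."""
    pi = np.ones(n_branch)
    for _ in range(max_steps):
        if np.all(scc_fn(pi) <= i_limit):
            break
        action = int(np.argmax(q_values_fn(pi)))         # branch related to the highest Q-value
        pi, _ = step_fn(pi, action)
    return pi
```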

4. Results

Case studies on the IEEE 30-bus system and the 118-bus system are presented herein to demonstrate the effectiveness of the proposed deep reinforcement learning-based optimal transmission-switching method. The data of these testing systems can be found in [26]. The sub-transient reactance of each generator in both cases is set uniformly as 0.1 p.u.

4.1. Illustrative Case Study on the Modified IEEE 30-Bus System

The network structure of the IEEE 30-bus system is shown in Figure 1. One transmission line from Bus-11 to Bus-21 is added as in [13] for the case study on the IEEE 30-bus system. Under this network structure, the short-circuit current magnitudes of all the buses are computed and are shown in Figure 2.
The maximum limit of the short-circuit current is set to 12 kA, and the objective is to reduce the SCC of the non-generator buses below this limit. According to the discussion in Section 3.2, the environment for OTS is set up, and the DQN-based agent is trained according to Algorithm 1. Except for the branches whose removal would cause islanding, all branches are included in the action space of the agent. With the well-trained Q-network, the transmission-switching strategy for short-circuit current limitation is generated. During this decision process, Branch 4–12 is switched off at the first step and Branch 6–9 at the second step. After these two steps, there is no bus at which the short-circuit current exceeds the maximum limit. The resulting OTS strategy is thus obtained, and the short-circuit current magnitudes after transmission switching are shown in Figure 3. It can be seen from Figure 3 that the SCCs are reduced below the limit.

4.2. Comparative Case Study with Conventional Genetic Algorithm

The proposed OTS model is a typical combinatorial optimization model with binary variables. Conventionally, this kind of model is solved by evolutionary algorithms such as the genetic algorithm (GA) [27,28]. To further demonstrate the effectiveness of the proposed method, a comparative case study is carried out. The individuals of the population are represented by (18), i.e., the state of the power network environment. The population size is 100, and the individuals are initialized by independent random sampling. The maximum number of iterations is 100. The mutation rate is 0.2, and the crossover rate is 0.9. The numerical results are shown in Table 1. The OTS solutions of both methods are feasible, as the SCC of the non-generator buses is reduced below 12 kA, and the minimum SCC margins are comparable. However, the maximum loadability of the proposed method is 4.0817 times the base condition, which is higher than that of the genetic algorithm (2.7047).

4.3. Scalability Case Study on the IEEE 118-Bus System

A case study on the IEEE 118-bus system is presented herein to validate the scalability of the proposed method. The maximum SCC limit in this case is 25 kA. The short-circuit current magnitudes of all the buses before transmission switching are shown in Figure 4. It can be seen that the SCC of Bus-66 is the highest among all the buses in the testing system.
In this testing system, 103 branches are not allowed to be switched off due to islanding and N-1 reliability considerations. The remaining 83 branches form the action space. After the training of the DQN-based agent, the transmission-switching strategy for short-circuit current limitation is generated. The branches that are switched off during the decision process are Branch 65–68, Branch 60–61, and Branch 65–66. The short-circuit current magnitudes after transmission switching are shown in Figure 5. It can be seen that after these three branches are switched off, the short-circuit current is reduced below the limit, which further demonstrates the effectiveness of the proposed DRL-based OTS method.

5. Conclusions

To prevent the short-circuit current from exceeding the interrupting capacity of the circuit breakers, an optimal transmission-switching model has been proposed in this paper to reduce the short-circuit current while maximizing the loadability of the transmission network. Considering that this optimal transmission-switching model is a complex combinatorial optimization problem with binary decision variables, a deep Q-network-based reinforcement learning algorithm was employed to search for the optimal solution. Case studies on two benchmark testing systems were presented.
The numerical results show that (1) the proposed method can select the effective branches at each step and reduce the short-circuit current after implementing the transmission-switching strategies, (2) the proposed method outperforms the conventional genetic algorithm in terms of solution quality, and (3) the case study on the IEEE 118-bus system demonstrates that the proposed method can be applied to transmission networks of different scales.

Author Contributions

Conceptualization, S.T. and T.L.; methodology, Y.L.; software, Y.S.; validation, Y.W. and F.L.; investigation, S.G.; writing—original draft preparation, S.T.; writing—review and editing, T.L.; supervision, Y.L.; funding acquisition, T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by State Grid Sichuan Electric Power Company (Grant No.: SGSCJY00GHJS2200041).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Schmitt, H. Fault current limiters report on the activities of CIGRE WG A3.16. In Proceedings of the 2006 IEEE Power Engineering Society General Meeting, Montreal, QC, Canada, 18–22 June 2006; pp. 1–5.
2. Yang, D.; Zhao, K.; Liu, Y. Coordinated optimization for controlling short circuit current and multi-infeed DC interaction. J. Mod. Power Syst. Clean Energy 2014, 2, 274–284.
3. Hu, Y.; Chiang, H.-D. Optimal Placement and Sizing for Fault Current Limiters: Multi-Objective Optimization Approach. In Proceedings of the 2018 IEEE Power & Energy Society General Meeting (PESGM), Portland, OR, USA, 5–9 August 2018; pp. 1–5.
4. Fisher, E.B.; O’Neill, R.P.; Ferris, M.C. Optimal Transmission Switching. IEEE Trans. Power Syst. 2008, 23, 1346–1355.
5. Bacher, R.; Glavitsch, H. Loss reduction by network switching. IEEE Trans. Power Syst. 1988, 3, 447–454.
6. Fliscounakis, S.; Zaoui, F.; Simeant, G.; Gonzalez, R. Topology Influence on Loss Reduction as a Mixed Integer Linear Programming Problem. In Proceedings of the 2007 IEEE Lausanne Power Tech, Lausanne, Switzerland, 1–5 July 2007; pp. 1987–1990.
7. Shao, W.; Vittal, V. Corrective switching algorithm for relieving overloads and voltage violations. IEEE Trans. Power Syst. 2005, 20, 1877–1885.
8. Shao, W.; Vittal, V. BIP-Based OPF for Line and Bus-bar Switching to Relieve Overloads and Voltage Violations. In Proceedings of the 2006 IEEE PES Power Systems Conference and Exposition, Atlanta, GA, USA, 29 October–1 November 2006; pp. 2090–2095.
9. Hedman, K.W.; O’Neill, R.P.; Fisher, E.B.; Oren, S.S. Optimal Transmission Switching With Contingency Analysis. IEEE Trans. Power Syst. 2009, 24, 1577–1586.
10. Hedman, K.W.; Ferris, M.C.; O’Neill, R.P.; Fisher, E.B.; Oren, S.S. Co-Optimization of Generation Unit Commitment and Transmission Switching With N-1 Reliability. IEEE Trans. Power Syst. 2010, 25, 1052–1063.
11. Khodaei, A.; Shahidehpour, M. Transmission Switching in Security-Constrained Unit Commitment. IEEE Trans. Power Syst. 2010, 25, 1937–1945.
12. Khodaei, A.; Shahidehpour, M.; Kamalinia, S. Transmission Switching in Expansion Planning. IEEE Trans. Power Syst. 2010, 25, 1722–1733.
13. Yang, Z.; Zhong, H.; Xia, Q.; Kang, C. Optimal Transmission Switching with Short-Circuit Current Limitation Constraints. IEEE Trans. Power Syst. 2016, 31, 1278–1288.
14. Tian, S.; Wang, X.; Zhang, Q.; Qi, S.; Dou, X. Transmission switching considering short-circuit current limitations. In Proceedings of the 2017 IEEE Conference on Energy Internet and Energy System Integration (EI2), Beijing, China, 26–28 November 2017; pp. 1–6.
15. Barrows, C.; Blumsack, S.; Bent, R. Computationally efficient optimal Transmission Switching: Solution space reduction. In Proceedings of the 2012 IEEE Power and Energy Society General Meeting, San Diego, CA, USA, 22–26 July 2012; pp. 1–8.
16. Ruiz, P.A.; Foster, J.M.; Rudkevich, A.; Caramanis, M.C. Tractable Transmission Topology Control Using Sensitivity Analysis. IEEE Trans. Power Syst. 2012, 27, 1550–1559.
17. Bello, I.; Pham, H.; Le, Q.V.; Norouzi, M.; Bengio, S. Neural Combinatorial Optimization with Reinforcement Learning. arXiv 2016, arXiv:1611.09940.
18. Liu, J.; Liu, Y.; Qiu, G.; Gu, Y.; Li, H.; Liu, J. Deep-Q-Network-Based Intelligent Reschedule for Power System Operational Planning. In Proceedings of the 2020 12th IEEE PES Asia-Pacific Power and Energy Engineering Conference (APPEEC), Nanjing, China, 20–23 September 2020; pp. 1–6.
19. Huang, R.; Chen, Y.; Yin, T.; Li, X.; Li, A.; Tan, J.; Huang, Q. Accelerated Derivative-Free Deep Reinforcement Learning for Large-Scale Grid Emergency Voltage Control. IEEE Trans. Power Syst. 2022, 37, 14–25.
20. Gupta, P.; Pal, A.; Vittal, V. Coordinated Wide-Area Damping Control Using Deep Neural Networks and Reinforcement Learning. IEEE Trans. Power Syst. 2022, 37, 365–376.
21. Gao, Q.; Liu, Y.; Zhao, J.; Liu, J.; Chung, C.Y. Hybrid Deep Learning for Dynamic Total Transfer Capability Control. IEEE Trans. Power Syst. 2021, 36, 2733–2736.
22. Zhang, X.; Liu, Y.; Duan, J.; Qiu, G.; Liu, T.; Liu, J. DDPG-Based Multi-Agent Framework for SVC Tuning in Urban Power Grid with Renewable Energy Resources. IEEE Trans. Power Syst. 2021, 36, 5465–5475.
23. IEC 60909-0:2001; International Standard: Short-Circuit Currents in Three-Phase A.C. Systems, Part 0: Calculation of Currents; IEC: São Paulo, Brazil, 2001.
24. Ajjarapu, V.; Christy, C. The continuation power flow: A tool for steady state voltage stability analysis. IEEE Trans. Power Syst. 1992, 7, 416–423.
25. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing Atari with Deep Reinforcement Learning. arXiv 2013, arXiv:1312.5602.
26. Power Systems Test Case Archive. Available online: http://labs.ece.uw.edu/pstca/ (accessed on 30 October 2022).
27. Schmitt, L.M. Theory of genetic algorithms. Theor. Comput. Sci. 2001, 259, 1–61.
28. Romero, R.; Rider, M.J.; Silva, I.d.J. A Metaheuristic to Solve the Transmission Expansion Planning. IEEE Trans. Power Syst. 2007, 22, 2289–2291.
Figure 1. The network structure of the IEEE 30-bus system.
Figure 2. The short-circuit current magnitudes of all the buses in IEEE 30-bus system before transmission switching.
Figure 3. The short-circuit current magnitudes of all the buses in IEEE 30-bus system after transmission switching.
Figure 4. The short-circuit current magnitudes of all the buses in IEEE 118-bus system before transmission switching.
Figure 5. The short-circuit current magnitudes of all the buses in IEEE 118-bus after transmission switching.
Table 1. Comparative results between the proposed method and the genetic algorithm.

Methods           | Branches to Be Switched Off | Minimum Margin of SCC | Maximum Loadability
Proposed method   | 4–12, 6–9                   | 0.71 kA               | 4.0817
Genetic algorithm | 4–6, 6–9, 6–10              | 0.92 kA               | 2.7047
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
