1. Introduction
With the continuous development of modern technology, ship electrical power systems that can coordinate the energy of the entire ship are expected to become the development trend for future ships [1,2]. Ship electrical power systems differ significantly from land power systems [3]. In particular, they are strongly independent. Because ship electrical power systems have a smaller capacity than onshore power systems, bus voltage fluctuations may occur when a large load is applied or removed, which can easily cause serious faults. Any equipment fault in the system can affect the entire power grid, and potential safety hazards arising during operation threaten the safety of the entire ship. As the core of the ship, the electrical power system therefore has high requirements for safe operation, and it needs faster and more accurate fault diagnosis than land power systems when system faults occur [4]. Therefore, it is necessary to study fault diagnosis technologies for ship electrical power systems [5].
Fault diagnosis of a ship electrical power system entails modeling and simulation of the system. Early modeling methods mainly involved physical modeling, i.e., a physical model was established using the similarity principle. At present, mathematical modeling is mainly used. This method abstracts the internal characteristics of the system into mathematical formulas, from which the behavior of the actual system can be deduced and faults can be diagnosed through changes in the relationship between the independent and dependent variables. Research on modeling and simulation of ship electrical power systems is based on the power system model and involves a series of studies on how to maintain system stability. The ship electrical power system model is assembled from modules for each basic unit. First, a mathematical model is established for each basic unit of the power system according to the structure and basic principles of the ship electrical power system. Then, the unit models are combined into a simulation model to form a complete ship electrical power system model [
6]. The research process draws on modeling and simulation technologies, automatic control, and other theoretical methods. In particular, the generator and its excitation system are closely related to the voltage stability of the power system. Excitation control began with linear single-variable methods and subsequently evolved into nonlinear multivariable methods. Its development has undergone several stages [7], from the earliest classical proportional-integral-derivative (PID) control, through multivariable control based on modern control theory, to the intelligent control methods applied at present. In [
8], the authors proposed a T-S fuzzy-weighting-based excitation switching control method for a tidal generator set, which can overcome the dynamic and static performance defects in the excitation control of the tidal generator set and improve its performance. In [
9], feedback control was adopted for the field currents of the two-phase brushless exciter, and speed reference control was adopted for the excitation frequency and phase sequence; this method achieves a constant field current for the main generator. In [
10], the authors presented a nonlinear coordinated excitation and static VAR compensator (SVC) control for regulating the output voltage and improving the transient stability of a synchronous generator infinite bus (SGIB) power system. In terms of the mathematical development of fault diagnosis methods for ship electrical power systems, the author in [
11] developed a higher-order mathematical model of the generator to describe the generator state in greater detail. In [
12], the authors proposed three different mathematical models for the mathematical modeling of a synchronous generator, used the models under different working conditions, and conducted a detailed comparative analysis of the models to improve the simulation accuracy.
In general, fault diagnosis methods are currently categorized into three main types [
13], namely fault diagnosis based on analytical modeling, fault diagnosis based on signal processing, and fault diagnosis based on artificial intelligence. Analytical modeling includes state estimation, parameter estimation, and consistency testing. It supports real-time diagnosis and reaches deep into the essence of the system. However, it also has some drawbacks, such as large modeling errors and significant noise interference. Signal processing includes spectrum analysis and the wavelet transform. It is simple to apply and offers good real-time performance; however, it cannot deal with potential faults. Artificial intelligence [
14] includes neural networks, fuzzy theory, genetic algorithms, rough sets, artificial immune systems, fuzzy cluster analysis algorithms, fault trees, and support vector machines, which have strong learning and reasoning abilities. To address key faults such as a broken rotor bar or an electrical phase fault, a fault diagnosis method for the electric drive of an electric ship has been proposed [
15]; however, the number of fault diagnosis objects is insufficient. In [
4], a load monitoring and fault detection method was proposed that uses a data-clustering-based approach to extract unique feature vectors from the short-time Fourier transform analysis of any pulsed load; however, this method does not provide an integrated solution for general load curves. In [
16], the location and severity of a stator winding fault of a permanent magnet synchronous motor were modeled and detected, and a mathematical model that can describe both the health state and the fault state was established; however, the mathematical model is not suitable for other ship electrical power system equipment. In [
17], a remote system was introduced for online condition monitoring and fault diagnosis of a gas turbine on an offshore oil well drilling platform on the basis of a kernelized information entropy model. In [
18], a multi-class multi-kernel relevance vector machine fault diagnosis method based on manifold learning and swarm intelligence optimization was proposed to improve the predictive maintenance activities of diesel engines.
This paper proposes an improved network fault diagnosis model based on a convolutional neural network (CNN). The method takes the original image directly as input, without feature decomposition and extraction. It has significant advantages, such as simple application, high operation speed, automatic parameter updates, and stable, convergent, and accurate results. These advantages enable the method to overcome existing drawbacks in the fault diagnosis of ship electrical power systems. First, the ship electrical power system simulation model is established on the MATLAB/Simulink (The MathWorks Inc., Natick, MA, USA) simulation platform to capture the normal working state and fault states of the generator and load; the fault response curves are then generated to obtain the picture dataset for the network model. Second, the CNN fault diagnosis model is designed using TensorFlow, an open source tool for deep learning. Finally, the network model is trained and its structural parameters are optimized to obtain the optimal diagnosis results.
The remainder of this paper is organized as follows. Section 2 describes the model and simulation of the ship electrical power system. Section 3 discusses the development of the improved CNN. Section 4 presents and analyzes the experimental results. Finally, Section 5 concludes the paper.
3. Construction of Improved CNN
3.1. CNN
CNN has emerged as a research hotspot in many scientific fields, especially pattern classification [
28]. A CNN is composed of a series of layers and the data flows between them. The basic structure is as follows: input layer, convolution layer, activation function, pooling layer, and fully connected layer, i.e., INPUT-CONV-RELU-POOL-FC.
The convolution layer is a feature extraction layer. The input of each neuron is connected to the local receptive field of the previous layer, from which it extracts local features. The convolution layer mainly convolves the image with the convolution kernel and reduces noise [
29]. It also involves the principle of “weight sharing”. The calculation formula of the convolution layer is as follows:
$$x_j^l = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\right)$$
where $l$ denotes the layer number, $M_j$ represents the set of feature maps of the previous layer associated with the $j$th feature map of the current layer, $x_j^l$ is the $j$th feature map output by the $l$th layer, $x_i^{l-1}$ is the $i$th feature map output by the $(l-1)$th layer, $k_{ij}^l$ is the convolution kernel between the $j$th feature map of the $l$th layer and the $i$th feature map of the previous layer, $b_j^l$ is the offset of the $j$th feature map of the $l$th layer, and $f(\cdot)$ is the activation function.
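The convolution layer computation described above can be illustrated with a minimal pure-Python sketch. It makes several assumptions that are not stated in the text: every input feature map is connected to every output feature map, no padding is used, the stride is 1, the kernel is applied without flipping (as is usual in deep learning frameworks), and ReLU is the activation. The function name `conv_layer` and the toy values are hypothetical.

```python
def conv_layer(inputs, kernels, biases, activation=lambda v: max(0.0, v)):
    """Sketch of one convolution layer.

    inputs:  list of 2D feature maps from the previous layer (same size).
    kernels: kernels[j][i] is the 2D kernel linking input map i to output map j.
    biases:  biases[j] is the offset added to output map j.
    Assumes every output map is connected to all input maps.
    """
    height, width = len(inputs[0]), len(inputs[0][0])
    outputs = []
    for j, bias in enumerate(biases):
        k_h, k_w = len(kernels[j][0]), len(kernels[j][0][0])
        out_h, out_w = height - k_h + 1, width - k_w + 1
        out_map = [[0.0] * out_w for _ in range(out_h)]
        for r in range(out_h):
            for c in range(out_w):
                total = bias
                # Sum the kernel responses over all associated input maps.
                for i, x in enumerate(inputs):
                    k = kernels[j][i]
                    for dr in range(k_h):
                        for dc in range(k_w):
                            total += x[r + dr][c + dc] * k[dr][dc]
                out_map[r][c] = activation(total)
        outputs.append(out_map)
    return outputs


# Toy example: one 3x3 input map, one 2x2 kernel, one output map.
image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kernel = [[-1.0, 0.0], [0.0, 1.0]]
feature_maps = conv_layer([image], [[kernel]], [0.5])
```

Because the same kernel is reused at every position of the input map, the sketch also shows what weight sharing means in practice: the number of trainable weights depends on the kernel size, not on the image size.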
The activation function introduces nonlinearity. Convolution is a linear operation that merely assigns a weight to each pixel, and the expressive power of a purely linear model is insufficient; hence, an activation function is required. Common activation functions include the sigmoid function, tanh function, ReLU function, and leaky ReLU function.
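For reference, the four activation functions named above can be written as the following scalar sketch. The negative slope of 0.01 for the leaky ReLU is a commonly used default and an assumption here; the text does not specify it.

```python
import math


def sigmoid(x):
    """Squashes x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes x into (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Passes positive values and zeroes out negative ones."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Like ReLU, but keeps a small slope for negative inputs (assumed 0.01)."""
    return x if x > 0 else negative_slope * x
```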
The pooling layer is a feature mapping layer. After adding bias, a new feature map is obtained in the pooling layer through a nonlinear function [
30]. Pooling serves two functions: (i) it reduces the size of the feature map, which lowers the computational complexity of the network; and (ii) it compresses the features to extract the main ones. The operation formula of the pooling layer is as follows:
$$x_j^l = f\left(\beta_j^l \, \mathrm{down}\left(x_j^{l-1}\right) + b_j^l\right)$$
where $\mathrm{down}(\cdot)$ represents the pooled down-sampling function, $\beta_j^l$ is the multiplicative bias, and $b_j^l$ is the additive bias.
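The pooling operation can be sketched as follows. The text does not state which down-sampling function is used, so this example assumes non-overlapping 2 × 2 max pooling; the multiplicative bias defaults to 1, the additive bias to 0, and the nonlinear function to the identity, all of which are illustrative choices. The function name `pool_layer` is hypothetical.

```python
def pool_layer(feature_map, beta=1.0, bias=0.0, size=2, activation=lambda v: v):
    """Down-sample one 2D feature map with non-overlapping max pooling,
    then apply the multiplicative bias, additive bias, and activation."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            ]
            row.append(activation(beta * max(window) + bias))
        pooled.append(row)
    return pooled


# Toy example: a 4x4 map is reduced to 2x2.
feature_map = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 1.0],
    [0.0, 1.0, 5.0, 6.0],
    [2.0, 2.0, 7.0, 8.0],
]
pooled = pool_layer(feature_map)
```

Each output value summarizes a 2 × 2 window, so the map shrinks by a factor of four while the strongest response in each window is retained.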
The fully connected layer is used to connect all the features and send the output value to a classifier (such as a softmax classifier) for classification.
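A minimal sketch of the fully connected layer followed by a softmax classifier is given below. The weights, biases, and three-class setup are made-up toy values, and the function names are hypothetical.

```python
import math


def fully_connected(features, weights, biases):
    """Each output node is a weighted sum of all input features plus a bias."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Convert raw scores into class probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: two features, three fault classes.
features = [1.0, 2.0]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.0]
probs = softmax(fully_connected(features, weights, biases))
predicted = probs.index(max(probs))
```

The class with the largest probability is taken as the predicted category.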
Finally, the test accuracy and error loss function value of the model are output. The structure and parameters of CNN are shown in
Figure 7.
3.2. Improved CNN
The traditional model has a complex structure, a large number of parameters, and a low running speed. Moreover, the convergence speed of the classification results is affected by the parameter initialization method and the way the network weights are updated, and the accuracy and loss curves oscillate. To address these issues, this study makes the following improvements and proposes a CNN model with better performance.
- (1)
All local response normalization (LRN) layers are removed and the initialization procedure is changed: batch normalization (BN) is applied after a simple parameter initialization. In practice, BN proved conducive to the convergence of the samples and the stability of the network.
- (2)
The number of nodes in the fully connected layer is adjusted; with fewer parameters and weights to update, the running speed is improved and the calculation time is shortened.
- (3)
Nesterov-accelerated adaptive moment estimation (NAdam) is used as the optimizer, so that the weights of the neural network are updated iteratively on the basis of the training data and the output results.
- (4)
Kaiming initialization is used for the normal_initializer. In testing, this improved the results.
CNN model parameters and improved CNN model parameters are listed in
Table 4 and
Table 5, respectively.
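The three numerical ingredients of the improvements listed above (Kaiming initialization, BN, and the NAdam update) can be illustrated with the following sketch. It is not the implementation used in this study. It assumes the normal variant of Kaiming initialization (standard deviation sqrt(2 / fan_in)), BN applied to a single feature over one mini-batch, and a simplified NAdam update without the momentum schedule that some library implementations add. All function names and the default values other than the learning rate of 0.0001 are assumptions.

```python
import math
import random


def kaiming_normal(fan_in, fan_out, rng=None):
    """Draw a fan_out x fan_in weight matrix from N(0, 2 / fan_in)."""
    rng = rng or random.Random(0)  # fixed seed only for reproducibility
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


def nadam_step(w, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One simplified NAdam update for a scalar weight (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias corrections
    v_hat = v / (1.0 - beta2 ** t)
    # Nesterov look-ahead: blend corrected momentum with the current gradient.
    nesterov = beta1 * m_hat + (1.0 - beta1) * grad / (1.0 - beta1 ** t)
    w = w - lr * nesterov / (math.sqrt(v_hat) + eps)
    return w, m, v


weights = kaiming_normal(fan_in=3, fan_out=4)
normalized = batch_norm([1.0, 2.0, 3.0])
w_new, m_new, v_new = nadam_step(w=1.0, grad=2.0, m=0.0, v=0.0, t=1)
```

In a TensorFlow model these pieces would normally come from the framework's built-in initializer, normalization layer, and optimizer rather than being written by hand; the sketch only shows the arithmetic behind them.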
3.3. Flow of CNN Algorithm
The algorithm flow of the proposed CNN fault diagnosis model is shown in
Figure 8. It is mainly divided into four stages:
- (1)
Sample image preprocessing: First, the sample dataset is constructed. Second, the size and color of the image are processed to facilitate network learning.
- (2)
Network fault diagnosis model design: The network programs are written and built in the Python (Python Software Foundation, Wilmington, DE, USA) environment with the TensorFlow (Google Brain, San Francisco, USA) learning framework.
- (3)
Network model training and optimization: The weights and thresholds are adjusted repeatedly using the back propagation (BP) algorithm in order to minimize the error signal.
- (4)
Model testing: The optimized model is tested on the sample image dataset to output the diagnosis results.
3.4. Data Preprocessing
This study employs MATLAB/Simulink to build the ship electrical power system model and takes the waveforms of the fault response curves as the input of the network fault model. Data preprocessing is divided into the following parts: data acquisition, image culling, and data normalization.
- (1)
Data acquisition: In the simulation model, different faults are set in the “three-phase fault” module for different generators and loads to output the fault response curves. The curves are saved as a set of JPG-format pictures that the CNN can read.
- (2)
Image culling: After the data are converted into JPG-format pictures, problems such as image overlap or feature blur may occur. These interfering images must be identified and removed to ensure the accuracy of the network training and test results.
- (3)
Data normalization: Differences in the orders of magnitude of the image data affect the results of the data analysis. To eliminate the influence of these differences in scale, data normalization is required. The min–max standardization method used in this study is a commonly used linear transformation of the data that maps the result to the interval [0, 1]. The transformation function is as follows:
$$x^* = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ is the original value, $x^*$ is the normalized value, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the sample data.
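The min–max transformation can be sketched in a few lines. The pixel values below are made-up examples, the function name is hypothetical, and returning zeros for constant data is an assumption added only to avoid division by zero.

```python
def min_max_normalize(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant data: assumed convention, return all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


pixels = [0, 64, 128, 255]
scaled = min_max_normalize(pixels)
```

The smallest and largest sample values map exactly to 0 and 1, and every other value falls between them.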
After preprocessing of the above-mentioned data, the image set with significant characteristics is used as the input of the CNN. The CNN model performs convolution, pooling, and other operations on the picture set to generate the output of the network model.
4. Experimental Results and Analysis
This study employed a Windows 10 system (Microsoft, Redmond, WA, USA) with the Python 3.7.11 (Python Software Foundation, Wilmington, DE, USA) environment and the TensorFlow learning framework to write the network programs. Because the image dataset used in this study was simple and regular, and the amount of data was small, the conventional method was used to adjust the parameters in order to optimize the CNN model.
4.1. Learning Rate
In training the network model, the learning rate is an important hyperparameter that controls the speed of network weight adjustment. In general, the higher the learning rate, the faster the network learns. However, if it is too high, the accuracy is reduced, and the loss stops falling and oscillates around a certain value. Conversely, the lower the learning rate, the more slowly the loss decreases and the longer the convergence time. Therefore, it is crucial to choose an appropriate learning rate. Based on numerous experiments and the literature, the learning rate of the network model was set to 0.0001.
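The effect of the learning rate can be illustrated on a toy problem that is unrelated to the actual network: plain gradient descent on the one-dimensional loss w². The three learning rates below are chosen only to make the three regimes visible on this toy loss and are not the values used in this study.

```python
def gradient_descent(lr, steps=20, w0=1.0):
    """Minimize the toy loss w**2 and record the loss after every step."""
    w = w0
    losses = []
    for _ in range(steps):
        w -= lr * 2.0 * w  # the gradient of w**2 is 2*w
        losses.append(w * w)
    return losses


slow = gradient_descent(lr=0.001)  # too low: the loss barely decreases
good = gradient_descent(lr=0.1)    # suitable: the loss falls quickly
stuck = gradient_descent(lr=1.0)   # too high: w flips between +1 and -1
```

With the highest rate the weight jumps back and forth across the minimum and the loss never falls, which mirrors the oscillation described above; with the lowest rate the loss decreases only slightly within the same number of steps.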
4.2. Experimental Results
After several experiments, the optimal network fault diagnosis model was obtained. With the improved model, the average accuracy of fault identification and classification for the ship electrical power system reaches 99%, which is higher than that of the original CNN model. This method makes the fault diagnosis of the ship electrical power system more convenient and reliable. The accuracy and loss curves of the improved CNN were generated using PyCharm and the TensorFlow framework, as shown in
Figure 9a,b, respectively. The accuracy and loss variation diagrams of the original CNN are shown in
Figure 9c,d, respectively. The accuracy of each fault category is listed in
Table 6.
As can be seen from Figure 9a,b, the overall identification accuracy of the improved model for ship electrical power system faults increases and the loss decreases as the number of training epochs increases. After one training epoch, the average accuracy of fault diagnosis reaches 97% and the loss value is less than 0.1. After four training epochs, the average accuracy is 99% and the loss value is less than 0.05. A comparison of Figure 9a–d shows that the recognition accuracy of the improved CNN after the first training epoch is higher than that of the original network after four training epochs. In addition, the loss curve of the improved model converges faster than that of the original model, fluctuates within a smaller range, and is more stable after convergence. As can be seen from Table 6, the accuracy of the improved CNN for different faults at different locations is higher than that of the original network, indicating that the improved CNN classifies the faults of ship electrical power systems well and thus has considerable potential for their fault diagnosis.