Article

A Diagnosability-Integrated Design Approach Based on Graph Theory

College of Coastal Defense, Naval Aviation University, Yantai 264001, China
*
Author to whom correspondence should be addressed.
Submission received: 9 July 2023 / Revised: 30 August 2023 / Accepted: 3 September 2023 / Published: 7 September 2023

Abstract

To account for the influence of both the system structure and the diagnosis algorithm in diagnosability design, this paper proposes a diagnosability-integrated design method based on graph theory. First, based on the diagnosability evaluation results, the difficulty of fault diagnosis is analyzed qualitatively using the K-means method, and the diagnosis plot of each measurement point is drawn from the analysis results. Second, the Bron–Kerbosch algorithm is used to extract the maximal cliques from the diagnosis plots, and the set of maximal cliques that can diagnose the faults of the system is determined based on the hypergraph edge-covering theorem. Finally, a cascade classifier is set on the maximal clique set to classify and identify the system faults, and the performance of the diagnosis scheme is evaluated using the posterior probabilities output by the classifiers combined with the Shannon entropy. The method also incorporates a measurement point update mechanism, which decides whether to add measurement points according to the Shannon entropy evaluation in order to ensure a better diagnosis effect. Simulation results show that, because the system structure and the diagnosis method are considered simultaneously, the fault diagnosis scheme designed by the proposed method improves the diagnosis accuracy by 3.25 percentage points compared with other diagnosis schemes, and that its results are relatively stable over repeated experiments, which demonstrates the practicality and effectiveness of the method.

1. Introduction

With the development of science and technology, the systems used in production and daily life are becoming increasingly complex, large-scale, and modular. This trend makes it harder to detect and isolate faults quickly and accurately; improving the fault diagnosis capability of a system, and thereby its reliability and utilization, has therefore become a research hot spot.
The concept of diagnosability was introduced to improve the fault diagnosis capability of a system. Diagnosability is a property that indicates the extent to which system faults can be accurately and efficiently identified [1]. The concept was originally established to uncover, from a system perspective, deeper information about fault diagnosis, such as how difficult or costly it is to detect and isolate a fault. Current research concludes that designing better fault diagnosis methods improves the diagnosis capability of a system only when its faults can be diagnosed in the first place. If a fault is inherently difficult to diagnose, or even undiagnosable, improving the performance of the diagnosis algorithm for that fault is meaningless [1]. Therefore, bringing the difficulty of fault diagnosis into the scope of fault diagnosis, that is, considering the diagnosability of the system, can fundamentally improve the diagnosis capability of the system.
Research on diagnosability can be divided into two aspects: diagnosability evaluation [2,3,4] and diagnosability design [5,6]. Diagnosability evaluation is the process of measuring the degree to which system faults can be identified deterministically and efficiently; it mainly reveals how difficult a fault is to diagnose. High diagnosability means that the system is more capable of fault diagnosis, while low diagnosability means that it is less capable. Diagnosability design is a design technique that takes diagnosability as an optimization index in the design stage of the system and the diagnosis algorithm [1]. Diagnosability design is therefore a key step in improving the ability of a system to diagnose faults. Depending on its object, diagnosability design can be further divided into the design of the system structure and the design of the diagnosis method.
(1) Design of the system structure. Research in this direction focuses on optimizing the structure of the system to improve its ability to respond to faults, and thus its diagnosis capability, on the premise that the system is diagnosable. Since the location and number of sensors affect the effectiveness of fault identification and isolation [7], current research mainly changes the structure of the system by optimizing the sensor configuration. The optimal configuration of sensors is essentially a multi-objective decision-making (MODM) problem, a class of multi-criteria decision-making (MCDM) problems [8]. Objectives such as system cost [9], reliability, number of sensors, redundancy between sensor information [10], system diagnosability [5,11,12,13], and the degree of hazard of missed faults are weighed across candidate sensor configurations, and a configuration suitable for the system is selected by balancing the conflicts between the objectives and reducing information redundancy [14], which makes the system inherently more capable of acquiring fault information.
(2) Design of the diagnosis method of the system. Ding [6] evaluated the false alarm rate, fault detection rate, and fault detection time based on a stochastic algorithm and, on this basis, proposed a design method for observer-based fault detection systems. Ref. [15] proposed an adaptive application strategy that combines system diagnosability analysis with actual fault diagnosis: for different degrees of diagnosability, different diagnosis methods are used to detect and isolate faults, so as to balance time efficiency and diagnosis accuracy. Refs. [16,17] defined the K-gap and used it to give performance indexes of fault detection and isolation for analyzing fault diagnosis capability. Ref. [18] proposed a strategy for verifying the diagnosability of discrete-time systems using conjunctive regular expressions and illustrated the construction of the diagnoser and the verification of diagnosability in a real-world case. The diagnosability design of the diagnosis method mainly takes the diagnosability of the system into account and combines it with fault pattern recognition methods [19,20], relying on the model or signals of the system so that system faults are exposed as much as possible, in order to improve the diagnosability of the system.
The above analysis shows that diagnosability design improves the fault diagnosis capability of a system from two directions: system structure design and diagnosis method design. System structure design mainly concerns the design of the system's sensors and solves the problem of extracting information from the system; only high-quality information can reflect the changes caused by a fault and allow the fault to be diagnosed quickly. Diagnosis method design mainly solves the problem of how to use that information: by transforming the sensor information, extracting features, and performing other operations, the fault information is exposed and the fault is diagnosed.
Thus, both aspects affect the diagnosability of a system, but they are currently considered separately in diagnosability design. The current design process is shown in Figure 1a. In this process, the design of the system structure and the design of the diagnosis method are two successive steps that do not affect each other. However, the above analysis makes clear that the design of the system structure determines which diagnosis algorithm can be used, and the choice of diagnosis algorithm in turn affects the design of the system structure. If both factors are taken into account in diagnosability design, the degree of fit between the fault diagnosis method and the system structure can be improved considerably, and the diagnosis capability of the system can thus be enhanced.
Based on these ideas, this paper adopts an integrated design process for diagnosability design, shown in Figure 1b. The process considers the mutual influence between the diagnosis method and the system structure and improves the degree of fit between the two by considering the specificity of each measurement point for fault diagnosis, which greatly improves the fault diagnosis performance of the system [21]. Relying on this process, a diagnosability-integrated design method based on graph theory is proposed. Approaches that enhance the diagnostic capability of a system by considering system architecture design and diagnostic algorithms at the same time are rare in the current literature. The method is divided into three main steps: 1. Measurement point determination. The diagnosis plot of each measurement point is drawn based on the diagnosability evaluation results, the maximal cliques in the plot are extracted, and a maximal clique set that covers the fault modes of the system is obtained according to the edge-covering theorem. 2. Diagnosis method determination. A cascade classifier is set on the maximal clique set, and the posterior probability output of the classifiers is derived. The classifier outputs and the Shannon entropy are used to evaluate the fault identification results of the diagnosis method. 3. Measurement point update mechanism. The measurement points can be replaced according to the evaluation results of the diagnosis method in order to obtain better diagnosis results.
The rest of this paper is organized as follows: Section 2 describes the framework of the proposed algorithm; Section 3 describes its details; Section 4 gives the flow block diagram of the whole algorithm; and Section 5 illustrates the practicality and effectiveness of the method, taking a filter amplifier circuit as an example.

2. Diagnosability-Integrated Design Method Framework

In practice, a complete fault diagnosis scheme needs to settle two basic questions: where to extract signals, i.e., the selection of measurement points; and how to process the extracted signals so that the faults can be diagnosed, i.e., the determination of the diagnosis method. Both the selection of measurement points and the diagnosis method affect the diagnosis result. To take both into account and improve the fit between them, the flow chart of the diagnosability-integrated design method proposed in this paper is shown in Figure 2.
As shown in Figure 2, the proposed method first evaluates the diagnosability of the system. This evaluation gives the difficulty with which each test point diagnoses each fault; based on these results, the test points are determined first and the diagnosis method second. The diagnosis effect of the resulting measurement points and diagnosis method is then compared with that of existing fault diagnosis methods. If it is weaker, the measurement points are re-selected in the measurement point update step until a better diagnosis effect is obtained. Through this feedback, the two factors, test points and diagnosis method, are adjusted until they are coordinated and balanced, so as to maximize the final diagnosis capability.

3. Diagnosability Integrated Design Method Details

This section develops the details of the design framework proposed in Section 2.
To facilitate the description of the method, the notation used in this paper is explained first.
Let the set of fault modes of the system be $F = \{f_0, f_1, \ldots, f_m\}$, where $m$ denotes the number of fault modes and $f_0$ denotes the normal state. The set of tests is $T = \{t_1, t_2, \ldots, t_n\}$, where $n$ denotes the number of tests. The results of the diagnosability evaluation for this system are
$$\mathit{FD} = \left[ \mathit{FD}_{ji} \right]_{n \times m} \tag{1}$$
$$\mathit{FI} = \begin{bmatrix} \mathit{FI}_{1,(f_1,f_2)} & \mathit{FI}_{1,(f_1,f_3)} & \cdots & \mathit{FI}_{1,(f_{m-1},f_m)} \\ \mathit{FI}_{2,(f_1,f_2)} & \mathit{FI}_{2,(f_1,f_3)} & \cdots & \mathit{FI}_{2,(f_{m-1},f_m)} \\ \vdots & \vdots & \ddots & \vdots \\ \mathit{FI}_{n,(f_1,f_2)} & \mathit{FI}_{n,(f_1,f_3)} & \cdots & \mathit{FI}_{n,(f_{m-1},f_m)} \end{bmatrix} \tag{2}$$
where $\mathit{FD}$ is the fault detectability evaluation matrix; $\mathit{FD}_{ji}$ indicates the detection difficulty when using test $t_j$ to detect the fault $f_i$, and a higher value means less difficulty (cost) in detecting the fault. $\mathit{FI}$ is the fault isolability evaluation matrix; $\mathit{FI}_{j,(f_i,f_k)}$ indicates the isolation difficulty when using test $t_j$ to isolate the fault pair $(f_i, f_k)$, and a larger value means less difficulty (cost) in isolating the fault pair.

3.1. Measurement Point Determination

3.1.1. Diagnosis Plot of Measurement Point Determination

The column vectors of Equations (1) and (2) indicate how difficult it is for each test in the system to detect a fault or to isolate a fault pair. The tests were grouped by applying the K-means clustering algorithm to these column vectors. Let $\mathrm{test}_{f_i}$ denote the set of tests that can detect fault $f_i$ at a smaller cost, and let $\mathrm{test}_{(f_i, f_j)}$ denote the set of tests that can distinguish the fault pair $(f_i, f_j)$ at a smaller cost.
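The grouping step can be illustrated with a minimal sketch. It assumes that each column of the evaluation matrix is split into exactly two clusters and that the tests in the higher-valued cluster form the low-cost set; the paper does not state the number of clusters, and the function names `two_means_1d` and `cheap_tests` are hypothetical.

```python
def two_means_1d(values, iters=100):
    """1-D K-means with K = 2 (assumed); returns True for values in the high cluster."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate column: all tests look the same, so no test is singled out.
        return [False] * len(values)
    labels = []
    for _ in range(iters):
        # Assign each value to the nearer of the two centroids.
        labels = [abs(v - hi) < abs(v - lo) for v in values]
        high = [v for v, is_high in zip(values, labels) if is_high]
        low = [v for v, is_high in zip(values, labels) if not is_high]
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels


def cheap_tests(column, test_names):
    """Tests whose evaluation value for one fault (or fault pair) falls in the high cluster."""
    return {t for t, is_high in zip(test_names, two_means_1d(column)) if is_high}


# Hypothetical detectability values of four tests for one fault.
print(cheap_tests([0.90, 0.85, 0.10, 0.05], ["t1", "t2", "t3", "t4"]))
```

A higher value means a lower diagnosis cost, so the high cluster is the one kept.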
According to $\mathrm{test}_{f_i}$ and $\mathrm{test}_{(f_i, f_j)}$, the detection and isolation capability of the measurement point $t_i$ for different faults and fault pairs can be further revealed. To represent the diagnosis capability of measurement points for different faults more graphically, the diagnosis plot of measurement point is introduced in this paper.
Definition 1. 
Diagnosis plot of measurement point. The diagnosis plot of measurement point $G_{t_i}$ is used to qualitatively represent the diagnosis capability of the signal from measurement point $t_i$ for different faults, denoted as $G_{t_i} = (V_{t_i}, E_{t_i})$, where $V_{t_i} \subseteq F$ denotes the set of vertices of the graph, indicating that the faults corresponding to the vertices can be detected by test $t_i$. The set of edges of the graph is denoted as $E_{t_i} = \{(f_i, f_j) \mid f_i, f_j \in V_{t_i}\}$, where $(f_i, f_j)$ indicates that the fault pair $(f_i, f_j)$ can be isolated by test $t_i$.
The following algorithm was used to draw the diagnosis plot of measurement point $G_{t_i}$.
Algorithm 1 Determination of the diagnosis plot of measurement point $G_{t_i}$
Input: $\mathrm{test}_{f_i}$, $i = 1, 2, \ldots, m$
    $\mathrm{test}_{(f_i, f_j)}$, $i, j = 1, 2, \ldots, m$, $i \neq j$
Output: $G_{t_i} = (V_{t_i}, E_{t_i})$
initialize: $V_{t_i} \leftarrow \{f_0\}$, $E_{t_i} \leftarrow \emptyset$
for $k = 1$ to $m$ do
  if $t_i \in \mathrm{test}_{f_k}$ then
    $V_{t_i} \leftarrow V_{t_i} \cup \{f_k\}$
    $E_{t_i} \leftarrow E_{t_i} \cup \{(f_0, f_k)\}$
    for $f_j \in V_{t_i} \setminus \{f_0, f_k\}$ do
      if $t_i \in \mathrm{test}_{(f_k, f_j)}$ then
        $E_{t_i} \leftarrow E_{t_i} \cup \{(f_k, f_j)\}$
      end if
    end for
  end if
end for
return $G_{t_i} = (V_{t_i}, E_{t_i})$
Based on Algorithm 1 and Definition 1, it can be inferred that
Corollary 1. 
The diagnosis plot of measurement point $G_{t_i} = (V_{t_i}, E_{t_i})$ of any measurement point $t_i$ satisfies
(1) $f_0 \in V_{t_i}$;
(2) for every $f_i \in V_{t_i}$ with $f_i \neq f_0$, $(f_i, f_0) \in E_{t_i}$.
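The construction in Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: faults are represented by their integer indices (0 for the normal state), the dictionaries `detect` and `isolate` are hypothetical containers for the test groups, and fault-pair edges are only added between faults that the test can detect, as Definition 1 requires.

```python
def diagnosis_plot(t, detect, isolate, m):
    """Build (vertices, edges) of the diagnosis plot of test t.

    detect[k]       -- set of tests that detect fault f_k at low cost
    isolate[(j, k)] -- set of tests that isolate the pair (f_j, f_k), with j < k
    m               -- number of fault modes; index 0 is the normal state f_0
    """
    vertices = {0}   # f_0 is always a vertex (Corollary 1)
    edges = set()    # undirected edges stored as frozensets of two indices
    for k in range(1, m + 1):
        if t not in detect.get(k, set()):
            continue
        # Connect f_k to every already accepted fault it can be isolated from.
        for j in sorted(vertices - {0}):
            pair = (min(j, k), max(j, k))
            if t in isolate.get(pair, set()):
                edges.add(frozenset((j, k)))
        vertices.add(k)
        edges.add(frozenset((0, k)))  # a detected fault is separable from f_0
    return vertices, edges


# A test that detects f_1 and f_2 but cannot tell them apart.
detect = {1: {"t1"}, 2: {"t1"}}
print(diagnosis_plot("t1", detect, {(1, 2): set()}, m=2))
```

With the inputs above the plot contains the edges from $f_0$ to each fault but no edge between the two faults, which corresponds to a test that detects both faults without isolating them.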

3.1.2. Maximal Clique Extraction

According to Corollary 1, for a system with three states ($f_0$, $f_1$, $f_2$), all possible forms of the diagnosis plot of measurement point on this system are shown in Figure 3.
Among the four cases shown in Figure 3, (a) indicates that the measurement point is insensitive to both faults and cannot detect the occurrence of faults; (b) indicates that the measurement point can only detect one of the faults but not the other; (c) indicates that the measurement point can detect both faults but cannot distinguish between them; (d) indicates that the measurement point can detect and isolate both faults. In the diagnosability design of this test point, the information reflected by the test point in case (a) was too little, so it was not considered in this paper. The remaining three cases were discussed in this paper, and the distribution of samples in the feature space corresponding to these three cases is shown in Figure 4.
As can be seen in Figure 4, the different forms of the diagnosis plot of measurement point correspond to different abilities to distinguish the three system states. In the third case, the three states of the system are completely distinguishable, whereas in the first and second cases the presence of fault $f_2$ makes some states indistinguishable. In the second case, the fault states $f_1$ and $f_2$ are mixed and difficult to tell apart; in other words, the fault can be detected but cannot be isolated and identified. In the first case, the fault state $f_2$ and the normal state $f_0$ are mixed. This may be because fault $f_2$ is inherently difficult to detect, or because the structure of the system prevents the relevant information from reaching the measurement point when the fault occurs, so that the signal at the measurement point in the fault state does not differ from that in the normal state and fault $f_2$ is undetectable.
According to the above analysis, when the diagnosis plot of measurement point takes the form of the third case in Figure 4, the states of the system are discriminated best and do not mix in the feature space. In practice, however, this cannot always be achieved; therefore, the diagnosis plot of measurement point $G_{t_i} = (V_{t_i}, E_{t_i})$ should contain as many states as possible, and these states should satisfy the structure shown in Figure 3d as far as possible, i.e., the diagnosis plot should be as close as possible to the structure of a maximal clique.
To extract the maximal cliques from the diagnosis plot of measurement point $G_{t_i} = (V_{t_i}, E_{t_i})$, the Bron–Kerbosch algorithm from the literature can be used; its details are omitted here for brevity.
It should be noted that the maximal cliques contained in a diagnosis plot of measurement point may not be unique (e.g., in (c) of Figure 3, both $\{f_0, f_1\}$ and $\{f_0, f_2\}$ are maximal cliques). The $p$-th maximal clique of the diagnosis plot of measurement point $G_{t_i}$ is denoted as $C_{t_i}^{p}$.
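For reference, the basic (non-pivoting) form of the Bron–Kerbosch algorithm can be sketched as below. The paper does not say which variant it uses, so this version is an assumption, and the function names are illustrative.

```python
def bron_kerbosch(r, p, x, adj, out):
    """Basic Bron-Kerbosch recursion.

    r -- current clique, p -- candidate vertices, x -- already processed vertices.
    """
    if not p and not x:
        out.append(frozenset(r))  # r cannot be extended: it is a maximal clique
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}
        x = x | {v}


def maximal_cliques(vertices, edges):
    """Enumerate all maximal cliques of an undirected graph given as vertex and edge lists."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    out = []
    bron_kerbosch(set(), set(vertices), set(), adj, out)
    return out


# Case (c) of Figure 3: both faults are detectable but not isolable from each other.
cliques = maximal_cliques(["f0", "f1", "f2"], [("f0", "f1"), ("f0", "f2")])
print(sorted(sorted(c) for c in cliques))
```

For this graph the enumeration yields the two cliques $\{f_0, f_1\}$ and $\{f_0, f_2\}$ named in the text; adding the edge $(f_1, f_2)$ gives the single clique of case (d).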

3.1.3. Maximal Clique Set Selection Algorithm Based on Hypergraph

The above analysis shows that a maximal clique found in the diagnosis plot of a measurement point represents the system states that are most easily distinguished using the information of that measurement point; the maximal clique structure is therefore the basis for designing the fault diagnosis classifier. However, a maximal clique of a measurement point does not necessarily contain all the states of the system, so a single maximal clique cannot always identify the state of the system correctly, and multiple maximal cliques are needed to identify the fault state jointly. The main problem addressed in this section is therefore how to select, from the many maximal cliques, a suitable set that facilitates the detection and isolation of system faults.
1. Problem description
The problem in this section can be described formally as follows:
For the aforementioned system, suppose the set of all maximal cliques of the system is $C = \{C_{t_i}^{p}\}$, $i = 1, 2, \ldots, n$. It is necessary to choose from $C$ an optimal subset of maximal cliques $C^{*} \subseteq C$ such that $\bigcup_{C_{t_i}^{p} \in C^{*}} C_{t_i}^{p} = F$ while minimizing $|C^{*}|$.
2. Problem solving
In this section, the above problem is solved based on the concept of the hypergraph (Definitions 2 and 3).
Definition 2 (Hypergraph). 
Given the undirected hypergraph $H = (V, \varepsilon)$, the vertex set is defined as $V = \{v_1, v_2, \ldots, v_m\}$, which contains $m$ vertices. The hyperedge set is defined as $\varepsilon = \{\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n\}$, which contains $n$ hyperedges. Each hyperedge $\varepsilon_i \in \varepsilon$ is a subset of $V$; $\varepsilon_i = \{v_{i_1}, v_{i_2}, \ldots, v_{i_p}\}$ means that the hyperedge $\varepsilon_i$ contains $p$ vertices.
Definition 3 (Edge Coverage). 
Given the hypergraph $H = (V, \varepsilon)$ and $C^{*} \subseteq \varepsilon$, $C^{*}$ is said to be an edge covering of $H$ if each vertex of $V$ is contained in at least one edge in $C^{*}$.
The problem in this section can therefore be converted into an edge-covering problem on a hypergraph, in which the states of the system are the vertices and the maximal cliques are the hyperedges. A classical approach to the hypergraph edge-covering problem is a greedy algorithm, which gives an approximate rather than a guaranteed minimum cover; its steps are shown in Algorithm 2.
Algorithm 2 Determination of the set of maximal cliques $C^{*}$
Input: $C = \{C_{t_i}^{p}\}$, $i = 1, 2, \ldots, n$
    $F = \{f_0, f_1, \ldots, f_m\}$
Output: $C^{*}$
initialize: $X \leftarrow F$, $C^{*} \leftarrow \emptyset$
while $X \neq \emptyset$ do
  select $C_{t_i}^{p} \in C$ that covers the most faults in $X$
  $C^{*} \leftarrow C^{*} \cup \{C_{t_i}^{p}\}$
  $X \leftarrow X \setminus C_{t_i}^{p}$
end while
return $C^{*}$
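A minimal sketch of the greedy selection in Algorithm 2 is given below. The clique names and the error raised when a state cannot be covered are assumptions added for illustration; ties are broken by the insertion order of the dictionary.

```python
def greedy_cover(cliques, states):
    """Greedily pick cliques until every state in `states` is covered.

    cliques -- dict mapping a clique name to the set of states it contains
    """
    uncovered = set(states)
    chosen = []
    while uncovered:
        # Pick the clique that covers the most still-uncovered states.
        best = max(cliques, key=lambda c: len(uncovered & cliques[c]))
        if not uncovered & cliques[best]:
            raise ValueError("some states are not contained in any clique")
        chosen.append(best)
        uncovered -= cliques[best]
    return chosen


# Hypothetical maximal cliques over four states.
cliques = {
    "C1": {"f0", "f1", "f2"},
    "C2": {"f0", "f3"},
    "C3": {"f0", "f1"},
}
print(greedy_cover(cliques, ["f0", "f1", "f2", "f3"]))  # ['C1', 'C2']
```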

3.2. Diagnosis Method Determination

3.2.1. Design of Maximal Clique Classifier

The set of maximal cliques determined in Section 3.1.3 is denoted as
$$C^{*} = \{C_1, C_2, \ldots, C_{n_i}\} \tag{3}$$
where $C_j$ represents the $j$-th maximal clique in the set.
According to the previous section, the system faults contained in the maximal clique $C_j$ are easily distinguishable using the signal of the corresponding measurement point. A classifier can therefore be set for each maximal clique $C_j$ in the set $C^{*}$ to identify the system fault states from the signal of that measurement point.
The difficulty of a classification problem is closely related to the number of classes. In general, the more classes there are, the more detailed the required features and the more resources are consumed.
To reduce this difficulty, researchers have used a cascade approach [22]: the samples are first classified coarsely using some features, and each coarse class is then classified finely, so that the target samples are classified correctly. Studies have shown that the cascade classification approach gives good classification results [23]. In this section, a cascade classifier is designed over the maximal cliques in the set $C^{*}$ to keep the number of classes per classifier as small as possible, so that fewer features need to be selected during feature extraction to identify the state of the system.
Based on the structure shown in Equation (3), construct the matrix
$$W = \left[ w_{kj} \right]_{(m+1) \times n_i}, \quad k = 0, 1, 2, \ldots, m, \quad j = 1, 2, \ldots, n_i \tag{4}$$
where $w_{kj} \in \{0, 1\}$; $w_{kj} = 1$ denotes that the $j$-th maximal clique $C_j$ in the maximal clique set $C^{*}$ can recognize the fault state $f_k$, and $w_{kj} = 0$ otherwise.
In this paper, the classifiers for the maximal cliques are set up according to Algorithm 3.
Algorithm 3 Design of the cascade classifier on the maximal clique set
Input: $C^{*} = \{C_1, C_2, \ldots, C_{n_i}\}$
    $F = \{f_0, f_1, \ldots, f_m\}$
Output: classifier design results
determine $W$ from $C^{*}$ according to Formula (4)
while $F \neq \emptyset$ do
  sum each row of $W$ to obtain the vector $w = [w_0, w_1, \ldots, w_m]^{T}$
  if $w_0 = w_1 = \cdots = w_m$ then
    set a classifier on $C_j \in C^{*}$; the number of categories is $\|w\|_0$
    $F \leftarrow \emptyset$
  else
    $n = \min\{w_0, w_1, \ldots, w_m\}$
    $i = \arg\min_j w_j$
    if $n = 1$ and $|i| = 1$ then
      choose $C_j \in C^{*}$ such that $w_{ij} = 1$
      set a binary classifier on $C_j$: the positive class is $f_i$, the negative class is all the remaining faults
      $F \leftarrow F \setminus \{f_i\}$
      $C^{*} \leftarrow C^{*} \setminus \{C_j\}$
    end if
    if $n = 1$ and $|i| > 1$ then
      select $C_j \in C^{*}$ that covers the most faults in $f_i$
      $F \leftarrow F \setminus f_i$
      $C^{*} \leftarrow C^{*} \setminus \{C_j\}$
    end if
    if $n > 1$ then
      set a binary classifier on $C_j$: the positive class is $f_i$, the negative class is all the remaining faults
      $F \leftarrow F \setminus \{f_i\}$
      $C^{*} \leftarrow C^{*} \setminus \{C_j\}$
    end if
  end if
end while
The following concrete example illustrates the calculation process of Algorithm 3.
Suppose the fault modes of a system are $F = \{f_0, f_1, f_2, f_3, f_4\}$, the set of maximal cliques is $C^{*} = \{C_{t_1}, C_{t_2}, C_{t_3}\}$, and the matrix determined according to Equation (4) is
$$W = \begin{array}{c|ccc} & C_{t_1} & C_{t_2} & C_{t_3} \\ \hline f_0 & 1 & 1 & 1 \\ f_1 & 1 & 1 & 1 \\ f_2 & 1 & 1 & 1 \\ f_3 & 0 & 1 & 1 \\ f_4 & 0 & 0 & 1 \end{array}$$
Following Algorithm 3, the rows of $W$ are summed to obtain $w = [3, 3, 3, 2, 1]^{T}$. The maximal clique $C_{t_3}$ is then chosen and a binary classifier is set on it, where the positive class label is $f_4$ and the negative class label is $\{f_0, f_1, f_2, f_3\}$, and $W$ is updated as
$$W = \begin{array}{c|cc} & C_{t_1} & C_{t_2} \\ \hline f_0 & 1 & 1 \\ f_1 & 1 & 1 \\ f_2 & 1 & 1 \\ f_3 & 0 & 1 \end{array}$$
Following Algorithm 3 again, $w = [2, 2, 2, 1]^{T}$ is obtained. The maximal clique $C_{t_2}$ is then chosen and a binary classifier is set on it, where the positive class label is $f_3$ and the negative class label is $\{f_0, f_1, f_2\}$, and $W$ is updated as
$$W = \begin{array}{c|c} & C_{t_1} \\ \hline f_0 & 1 \\ f_1 & 1 \\ f_2 & 1 \end{array}$$
Now $w = [1, 1, 1]^{T}$ is obtained, so the remaining maximal clique $C_{t_1}$ is chosen and a three-class classifier is set on it to distinguish the three system states $f_0$, $f_1$, and $f_2$.
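The worked example can be reproduced with the simplified sketch below. It is an assumption-laden reading of Algorithm 3 rather than a full implementation: the three branches for the minimum row sum are collapsed into a single rule (peel off the least-covered state with a binary classifier on the first clique that recognises it), ties are broken by list order, and the function name `design_cascade` is hypothetical.

```python
def design_cascade(W, states, cliques):
    """Return a list of (clique, class_groups) stages for the cascade.

    W[k][j] is 1 if clique j recognises state k, else 0.
    """
    states, cliques = list(states), list(cliques)
    covers = {s: dict(zip(cliques, row)) for s, row in zip(states, W)}
    plan = []
    while states:
        # Row sums of W restricted to the remaining states and cliques.
        w = {s: sum(covers[s][c] for c in cliques) for s in states}
        if len(set(w.values())) == 1:
            # All remaining states are covered equally often: one multiclass stage.
            clique = next((c for c in cliques if all(covers[s][c] for s in states)), None)
            if clique is None:
                raise ValueError("no single clique recognises all remaining states")
            plan.append((clique, [(s,) for s in states]))
            break
        target = min(states, key=w.get)  # least-covered state
        clique = next((c for c in cliques if covers[target][c]), None)
        if clique is None:
            raise ValueError(f"state {target} is not recognised by any remaining clique")
        rest = tuple(s for s in states if s != target)
        plan.append((clique, [(target,), rest]))  # binary stage: target vs. the rest
        states.remove(target)
        cliques.remove(clique)
    return plan


W = [[1, 1, 1], [1, 1, 1], [1, 1, 1], [0, 1, 1], [0, 0, 1]]
for stage in design_cascade(W, ["f0", "f1", "f2", "f3", "f4"], ["Ct1", "Ct2", "Ct3"]):
    print(stage)
```

For the matrix of the example, the stages come out as a binary classifier on $C_{t_3}$ for $f_4$, a binary classifier on $C_{t_2}$ for $f_3$, and a three-class classifier on $C_{t_1}$.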

3.2.2. Feature Calculation and Feature Selection

Section 3.2.1 sets the cascade classifier on the maximal cliques, but the features of the signal need to be extracted and selected before classification. Assume that the set of signal features is $J = \{J_1, J_2, \ldots, J_d\}$, where $d$ denotes the number of features. The features can be time-domain, frequency-domain, or time-frequency-domain characteristics of the signal.
In the feature selection stage, the signal features that are beneficial for fault diagnosis are selected based on the Relief_F algorithm. The specific algorithm for feature selection is shown in Algorithm 4.
Algorithm 4 Signal feature screening algorithm based on Relief_F
Input: feature set $J = \{J_1, J_2, \ldots, J_d\}$
    samples $(x_p, y_p)$, $p = 1, \ldots, T$, $y_p \in F$
Output: selected feature set $J^{*}$
initialize: $W \leftarrow 0$
for $i = 1$ to $T$ do
  select sample $x_i$
  for $j = 1$ to $d$ do
    find the $k$ nearest neighbors of $x_i$ in the same class ($\mathit{nearhits}$)
    for every class $c \neq \operatorname{class}(x_i)$ do
      find the $k$ nearest neighbors of $x_i$ in class $c$ ($\mathit{nearmisses}$)
    end for
    $W(j) = W(j) - \sum_{n=1}^{k} \dfrac{\operatorname{diff}(x_i, \mathit{nearh}_n)}{T \times k} + \sum_{c \neq \operatorname{class}(x_i)} \dfrac{p(c)}{1 - p(\operatorname{class}(x_i))} \sum_{n=1}^{k} \dfrac{\operatorname{diff}(x_i, \mathit{nearm}_n^{c})}{T \times k}$
  end for
end for
$\mathit{sum} \leftarrow 0$
$J^{*} \leftarrow \emptyset$
while $\mathit{sum} < 0.5$ do
  $i = \arg\max W$
  $\mathit{sum} \leftarrow \mathit{sum} + \max W$
  $J^{*} \leftarrow J^{*} \cup \{J_i\}$
  $W \leftarrow W \setminus \{W_i\}$
end while
return $J^{*}$
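The final selection loop of Algorithm 4 (accumulating the largest weights until the running sum reaches 0.5) can be sketched as follows. The sketch assumes that negative weights are clipped to zero and that the weights are normalized to sum to one before the threshold is applied; the paper does not state either point, and `select_features` is a hypothetical name.

```python
def select_features(weights, threshold=0.5):
    """Pick the highest-weighted features until their normalized weights reach `threshold`.

    weights -- dict mapping a feature name to its Relief_F weight
    """
    clipped = {name: max(w, 0.0) for name, w in weights.items()}  # assumption: ignore negative weights
    total = sum(clipped.values())
    if total == 0:
        return []
    selected, acc = [], 0.0
    for name in sorted(clipped, key=clipped.get, reverse=True):
        if acc >= threshold:
            break
        selected.append(name)
        acc += clipped[name] / total  # assumption: weights normalized to sum to 1
    return selected


# Hypothetical Relief_F weights of four features.
print(select_features({"J1": 0.4, "J2": 0.3, "J3": 0.2, "J4": 0.1}))  # ['J1', 'J2']
```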

3.2.3. Posterior Probability Output of Classifier

According to the approach in Section 3.2.1, a classifier can be determined for each maximal clique. When designing a classifier for discrimination, it is preferable to construct one that outputs posterior probabilities. Posterior probabilities from several classifiers can be combined in the final decision, and choosing the class with the maximum posterior probability is the Bayes-optimal decision when all misclassification losses are equal. The posterior probabilities not only determine the class assigned by the classifier but also allow its classification performance to be measured.
In this paper, we took the support vector machine (SVM) as an example to illustrate the output method of its posterior probability.
Assume that the training data set is $\{(x_i, y_i)\}$. For a sample $x$, the output of the support vector machine takes the form
$$\hat{y}(x) = \operatorname{sign}(f(x)) = \operatorname{sign}(h(x) + b)$$
where $\operatorname{sign}$ denotes the sign function acting as the decision function, $f(x) = h(x) + b$ is the decision value, and
$$h(x) = \sum_{i} y_i \alpha_i k(x_i, x)$$
The quantity $h(x) + b$ is proportional to the distance from the sample to the separating hyperplane. It is generally assumed that the closer a point is to the hyperplane, the less likely it is to be classified correctly, and the farther it is from the hyperplane, the more likely it is to be classified correctly.
For a binary classification problem, Bayes' formula gives
$$P(y = 1 \mid f(x)) = \frac{P(f(x) \mid y = 1)\, P(y = 1)}{\sum_{i = \pm 1} P(f(x) \mid y = i)\, P(y = i)} \tag{9}$$
By fitting Equation (9), we obtain
$$P(y = 1 \mid f(x)) = \frac{1}{1 + \exp(A f(x) + B)} \tag{10}$$
where $A$ and $B$ are the parameters to be estimated.
The parameters of Equation (10) are estimated from the data set $\{(f(x_i), y_i)\}$ using maximum likelihood estimation.
Construct the likelihood function according to Equation (10):
$$L = \prod_{y_i = 1} P(y = 1 \mid f(x_i)) \times \prod_{y_i = -1} P(y = -1 \mid f(x_i)) \tag{11}$$
Turning Equation (11) into a log-likelihood function gives
$$\log L = \log \left( \prod_{y_i = 1} P(y = 1 \mid f(x_i)) \times \prod_{y_i = -1} P(y = -1 \mid f(x_i)) \right) = \log \prod_{y_i = 1} P(y = 1 \mid f(x_i)) + \log \prod_{y_i = -1} P(y = -1 \mid f(x_i)) = \sum_{y_i = 1} \log P(y = 1 \mid f(x_i)) + \sum_{y_i = -1} \log P(y = -1 \mid f(x_i))$$
So
$$(A, B) = \arg\max_{A, B} \log L = \arg\max_{A, B} \left[ \sum_{y_i = 1} \log P(y = 1 \mid f(x_i)) + \sum_{y_i = -1} \log P(y = -1 \mid f(x_i)) \right]$$
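The maximum likelihood fit of $A$ and $B$ can be sketched with plain gradient ascent on the log-likelihood. This is only an illustration: the paper does not specify the optimizer, the learning rate and iteration count are arbitrary choices, and the scores and labels in the usage example are made up. With $z = A f(x) + B$ and $p = 1 / (1 + e^{z})$, the derivative of each sample's log-likelihood with respect to $z$ is $p - t$, where $t = 1$ for $y = 1$ and $t = 0$ for $y = -1$.

```python
import math


def fit_sigmoid(scores, labels, lr=0.01, epochs=5000):
    """Fit A, B of P(y=1|f) = 1 / (1 + exp(A*f + B)) by gradient ascent on log L."""
    a, b = 0.0, 0.0
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for f, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * f + b))
            t = 1.0 if y == 1 else 0.0
            grad_a += (p - t) * f  # d log L / dA for this sample
            grad_b += (p - t)      # d log L / dB for this sample
        a += lr * grad_a
        b += lr * grad_b
    return a, b


def posterior(f, a, b):
    """Posterior probability of the positive class for decision value f."""
    return 1.0 / (1.0 + math.exp(a * f + b))


# Made-up SVM decision values with overlapping classes (labels in {-1, +1}).
scores = [-2.0, -1.0, -0.5, 0.3, 0.5, 1.0, 2.0, -0.2]
labels = [-1, -1, -1, -1, 1, 1, 1, 1]
A, B = fit_sigmoid(scores, labels)
print(A < 0, posterior(2.0, A, B) > posterior(-2.0, A, B))
```

A negative $A$ is expected whenever larger decision values indicate the positive class, so that the posterior increases with $f(x)$.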

3.3. Performance Evaluation Method of Diagnosis Scheme Based on Information Entropy

A diagnosis scheme for the system can be designed according to the methods shown in Section 3.1 and Section 3.2. In order to study the pros and cons among the schemes, this section examines the evaluation of the diagnosis scheme.
It is assumed that the signal set of each measurement point obtained by a certain operation of the system is s = s 1 , s 2 , , the diagnosis results for this sample can be expressed in terms of the posterior probability vector of the output of the diagnosis scheme according to the previous section
L s = P f 0 | s , P f 1 | s , , P f m | s m P y m | s = 1
Suppose the posterior probabilities of the diagnosis scheme for the current state of the system are $P(f_1 \mid s) = 0.9$ and $P(f_0 \mid s) = 0.1$; the current state of the system can then be taken to be $f_1$. When the posterior probabilities are instead $P(f_1 \mid s) = 0.51$ and $P(f_0 \mid s) = 0.49$, the system can still be judged to be in the fault state $f_1$, but the reliability of this identification result is clearly low. The distribution of the posterior probabilities in Equation (13) therefore indicates the confidence of the diagnosis scheme in its result: the more concentrated $L_s$ is, the more confident the scheme is and the better its diagnosis performance, whereas the more dispersed $L_s$ is, the lower its confidence. The confidence $E_S(s)$ of the classifier on a sample is therefore quantified by the entropy of Equation (13): the smaller the entropy, the lower the uncertainty of the classification and the higher the confidence of the classifier output, i.e., confidence decreases as entropy increases. Among the available entropy measures, the commonly used Shannon entropy is chosen in this paper to measure the diagnosis effect on a single sample. If the current sample is diagnosed correctly, the Shannon entropy of Equation (13) is taken as the loss of the diagnosis scheme on that sample; if it is diagnosed incorrectly, the maximum Shannon entropy for this classification problem is used as the loss, that is,

$$E_S(s) = \begin{cases} -\sum_{m} P(y_m \mid s) \log P(y_m \mid s), & s \text{ is correctly diagnosed} \\ -\log \dfrac{1}{m+1}, & s \text{ is misdiagnosed} \end{cases} \tag{14}$$
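The per-sample loss of Equation (14) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `entropy_loss` is hypothetical, and the natural logarithm is an assumption, since the text does not state the base of the logarithm.

```python
import math
from typing import Sequence


def entropy_loss(posterior: Sequence[float], correct: bool) -> float:
    """Per-sample loss E_S(s) in the spirit of Equation (14).

    posterior: P(f_0|s), ..., P(f_m|s), assumed to sum to 1.
    correct:   whether the diagnosed state matches the true state.
    The natural logarithm is an assumption; the text does not fix the base.
    """
    n_states = len(posterior)  # m + 1 states in total
    if not correct:
        # Misdiagnosed sample: maximum entropy, -log(1/(m+1)) = log(m+1).
        return math.log(n_states)
    # Correctly diagnosed sample: Shannon entropy, taking 0*log(0) as 0.
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


# The two posterior vectors discussed in the text, plus a misdiagnosed case.
confident = entropy_loss([0.1, 0.9], correct=True)
uncertain = entropy_loss([0.49, 0.51], correct=True)
wrong = entropy_loss([0.49, 0.51], correct=False)
```

With these inputs the concentrated posterior (0.9, 0.1) yields a smaller loss than the dispersed one (0.51, 0.49), and a misdiagnosed sample receives the largest loss available for a two-state problem.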

3.4. Measurement Point Update Mechanism

The diagnosis result of the current fault diagnosis scheme can be evaluated by Equation (14). A poor evaluation result means that the current set of maximal cliques $C$, i.e., the selected measurement points, does not fully reflect the fault information of the system, so additional information must be obtained by adding new measurement points and used to diagnose the system.
For the existing set $C$ of maximal cliques, new maximal cliques must be found among the extracted maximal cliques that are not yet in $C$ to supplement the fault information of the system. The following principles should be followed when selecting a new maximal clique:
(1) The new maximal clique should cover as many faults as possible in order to provide more information for the fault diagnosis of the system;
(2) When several of the remaining maximal cliques cover the same number of faults, the most economical one is selected.
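The two selection principles above can be sketched as a single sort key. The function `pick_new_clique`, the fault coverage sets, and the cost values below are all hypothetical and serve only to illustrate the rule; the text does not specify how coverage or economy is computed.

```python
from typing import Dict, FrozenSet, Optional, Set


def pick_new_clique(
    candidates: Dict[str, FrozenSet[str]],
    cost: Dict[str, float],
    selected: Set[str],
) -> Optional[str]:
    """Return the name of the unselected clique to add, or None if none is left.

    candidates maps a clique name to the faults it covers (hypothetical data);
    cost maps a clique name to an assumed measurement cost.
    """
    remaining = [name for name in candidates if name not in selected]
    if not remaining:
        return None
    # Principle (1): most faults covered; principle (2): lowest cost on ties.
    return min(remaining, key=lambda name: (-len(candidates[name]), cost[name]))


# Hypothetical coverage and cost figures, for illustration only.
candidates = {
    "C_t2": frozenset({"f1", "f2"}),
    "C_t3": frozenset({"f1", "f3", "f4"}),
    "C_t5": frozenset({"f2", "f3", "f4"}),
    "C_t6": frozenset({"f1", "f2", "f3", "f4"}),
}
cost = {"C_t2": 1.0, "C_t3": 3.0, "C_t5": 2.0, "C_t6": 1.0}
new_clique = pick_new_clique(candidates, cost, selected={"C_t6"})
```

In this made-up example two remaining cliques cover three faults each, so the cheaper of the two is returned.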

4. Algorithm Flow Design and Implementation

The schematic diagram of the diagnosability-integrated design approach based on graph theory proposed in this paper is shown in Figure 5.

5. Instance Verification

5.1. Experiment Preparation

In this paper, a filter amplifier circuit was used to verify the proposed method. The experimental circuit is shown in Figure 6, and 20 dB of noise was added to the measured signals. The simulated faults of the experimental circuit are listed in Table 1. The circuit was simulated 50 times in each state: 30 sets of data per state were used for classifier training, 10 sets for evaluating the diagnosis method, and 10 sets for final testing. The set of all classifier training data is denoted as $D_1$, the set of all diagnosis method evaluation data as $D_2$, and the set of test data as $D_{test}$. The signal features selected in this paper are listed in Table 2.

5.2. Experimental Results

Based on the results of the diagnosability evaluation of the system shown in Figure 6, the diagnosis plot of measurement point is drawn in conjunction with Algorithm 1, as shown in Figure 7.
The Bron–Kerbosch algorithm was used to analyze the diagnosis plot of measurement point shown in Figure 7, from which a total of four maximal cliques were extracted, namely $C_{t2}$, $C_{t3}$, $C_{t5}$, and $C_{t6}$. Applying Algorithm 2 to this set yields $C = \{C_{t6}\}$. According to Algorithm 3, the design result of the cascade classifier based on the maximal clique is then obtained: the five states of the system are identified using the data of measurement point 6. Algorithm 4 was used to calculate the contribution of each feature to the diagnosis, as shown in Figure 8. Finally, features 12 and 13 were selected to distinguish the system states, which gives a complete fault diagnosis scheme.
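As an illustration of the maximal clique extraction step, the following is a minimal sketch of the basic Bron–Kerbosch recursion (without pivoting) in plain Python. The example graph is hypothetical and is not the diagnosis plot of Figure 7: nodes stand for system states and an edge means that two states are distinguishable.

```python
from typing import Dict, Iterable, List, Set, Tuple


def bron_kerbosch(
    r: Set[str], p: Set[str], x: Set[str],
    adj: Dict[str, Set[str]], out: List[Set[str]],
) -> None:
    """Basic Bron-Kerbosch recursion; appends every maximal clique to out."""
    if not p and not x:
        out.append(set(r))  # r cannot be extended, so it is maximal
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p.remove(v)  # v has been fully explored
        x.add(v)     # exclude v from later cliques at this level


def maximal_cliques(edges: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Build an adjacency map from an edge list and list its maximal cliques."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    out: List[Set[str]] = []
    bron_kerbosch(set(), set(adj), set(), adj, out)
    return out


# Hypothetical distinguishability graph over five states f0..f4.
edges = [("f0", "f1"), ("f0", "f2"), ("f1", "f2"), ("f2", "f3"), ("f3", "f4")]
cliques = maximal_cliques(edges)
```

For this small graph the maximal cliques are {f0, f1, f2}, {f2, f3}, and {f3, f4}; states without any edge would need to be added to the adjacency map separately.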
Data set $D_2$ was used to evaluate the diagnosis results of this scheme, and the average Shannon entropy of the samples was 0.4815.
The measurement point information was then updated to $C = \{C_{t5}, C_{t6}\}$. After repeating the above steps, the average Shannon entropy on the evaluation samples was 0.5369. This is higher, and therefore worse, than the previous result, indicating that the diagnosis method obtained using $C = \{C_{t6}\}$ is better.
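The comparison above amounts to keeping the candidate clique set with the lowest average Shannon entropy on the evaluation data. A minimal sketch of that decision rule follows; the function name `best_scheme` and the string labels are hypothetical, while the two entropy values are the ones reported in the text.

```python
from typing import Dict


def best_scheme(avg_entropy: Dict[str, float]) -> str:
    """Return the candidate whose average Shannon entropy is lowest."""
    return min(avg_entropy, key=lambda name: avg_entropy[name])


# Average entropies reported in the text for the two candidate clique sets.
reported = {"{C_t6}": 0.4815, "{C_t5, C_t6}": 0.5369}
chosen = best_scheme(reported)
```

Here the rule keeps the scheme built on measurement point 6 alone, in line with the conclusion drawn in the text.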
Therefore, the diagnosis scheme obtained by the graph-theory-based diagnosability-integrated design method in this paper is as follows: the signal of measurement point 6 is collected, and two features, gravity frequency and mean square frequency, are used to identify the faults of the system.
Using this diagnosis scheme, the states of the samples in $D_{test}$ were identified, and the results are shown in Figure 9.

5.3. Diagnosis Results of Other Methods in the Literature

The method in this paper was compared with four common fault recognition methods from the literature, namely, K-nearest neighbor (KNN) classifiers [24], random forests (RF) [25], discriminative classifiers, and support vector machines (SVM) [26]. The methods were compared by their error rates in recognizing the fault modes of the system.
First, the data of measurement points 5 and 6 were extracted according to the signal features in Table 2, and the feature vectors of the measurement points were then spliced together to form a new vector or matrix for diagnosis [27]. A schematic diagram of the splicing method is shown in Figure 10.
In this paper, the feature vectors were recombined by horizontally splicing the data from multiple measurement points. For the diagnosis, data set $D_1$ (a total of 5 × 30 groups of samples) was used for training, and data sets $D_2$ and $D_{test}$ (a total of 5 × 20 groups of samples) were used for testing. The fault diagnosis results are shown in Table 3, which lists the number of samples that were incorrectly diagnosed.
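Horizontal splicing simply places the feature vectors of the measurement points end to end. The sketch below assumes 14 features per measurement point, as listed in Table 2; the function name `splice_features` and the numeric values are placeholders, not measured data.

```python
from typing import List, Sequence


def splice_features(per_point: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate the feature vectors of several measurement points end to end."""
    spliced: List[float] = []
    for features in per_point:
        spliced.extend(features)
    return spliced


# Placeholder values: 14 features per measurement point, as in Table 2.
point_5 = [float(i) for i in range(1, 15)]
point_6 = [float(i) / 10 for i in range(1, 15)]
sample = splice_features([point_5, point_6])  # one 28-dimensional sample
```

Every feature of every measurement point enters the classifier with equal standing, whether or not it is sensitive to the faults.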
The four methods from the literature and the method in this paper were each subjected to 50 simulation experiments, and the number of misdiagnosed samples in each experiment was recorded. Statistical indexes such as the mean and quartiles of the results of each method were computed and plotted as a box plot, as shown in Figure 11.
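The statistics behind such a box plot can be computed with the Python standard library. The counts below are hypothetical and are not the experimental data of this paper; the function name `summarize` and the choice of the "inclusive" quantile method are likewise assumptions.

```python
import statistics
from typing import Dict, List


def summarize(errors: List[int]) -> Dict[str, float]:
    """Five-number summary plus mean of misdiagnosed-sample counts."""
    # The inclusive method treats the data as the whole population of runs.
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "min": min(errors),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(errors),
        "mean": statistics.mean(errors),
    }


# Hypothetical misdiagnosis counts from ten repeated runs (illustration only).
hypothetical_errors = [0, 0, 0, 1, 0, 2, 0, 0, 1, 0]
stats = summarize(hypothetical_errors)
```

A median of 0 in such a summary means that at least half of the runs diagnosed every sample correctly.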

5.4. Experimental Analysis

Table 3 and Figure 9 show the diagnosis results for the different fault modes in one experiment. The fault diagnosis scheme designed by the method of this paper identified the faults of the system well, whereas the methods from the literature identified only some of the fault modes reliably and had difficulty with others. For example, the KNN and random forest methods performed worse on fault mode 4 than on the other fault modes. In terms of diagnosis correctness, the correct rate of the method of this paper was 3.25 percentage points higher than the average correct rate of the four methods from the literature.
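As a check on the 3.25-percentage-point figure, the totals in Table 3 give the average error rate of the four literature methods in this experiment:

$$\bar{e} = \frac{5\% + 6\% + 1\% + 1\%}{4} = 3.25\%$$

The stated improvement equals this average, which is what would be expected if the method of this paper misdiagnosed none of the test samples in that experiment; this reading assumes that both results refer to the same set of samples.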
Figure 11 shows the statistical results of the diagnosis in 50 repeated experiments. The median number of misdiagnosed samples for the method of this paper was 0 and the maximum was 2; that is, in at least half of the experiments the method diagnosed all samples correctly, and in the worst case it misdiagnosed 2 samples. This is already better than KNN and random forest, whose minimum numbers of misdiagnosed samples were 5 and 2, respectively. The discriminative classifier and SVM both performed stably over the 50 repetitions, each misdiagnosing 1 sample.
In the comparison methods, the fault diagnosis method can only be designed around measurement points that have already been selected, which does not guarantee that the information provided by those measurement points is well suited to the diagnosis method. In the comparison experiments, the information from two measurement points, 5 and 6, was used to diagnose the faults of the system. To handle multiple measurement points, the approach adopted in the literature was to splice the signal features of the measurement points horizontally into a new feature vector. This appears fast and simple, but it implicitly assumes that every signal distinguishes the faults equally well and that the same feature is equally sensitive to the faults on different signals, a premise that is clearly over-idealized. An actual classifier assigns classes according to the relative positions of the sample features in the feature space; when features that are insensitive to the faults are added, the originally clear relative positions become blurred and misclassification occurs. This explains the outcome of the experiment: although the comparison experiments used the information of two measurement points and thus had more diagnosis information available, the diagnosis effect was not satisfactory. Using more measurement point information therefore does not necessarily improve the diagnosis effect; the best effect is obtained only when the measurement point information is matched to the specific fault diagnosis method.
In the method of this paper, the measurement point update step introduces feedback into the diagnosability design. With this feedback, the selection of measurement points and the choice of fault pattern recognition method are no longer two separate steps performed one after the other; the two can be balanced against each other to achieve the best fit. Designers can not only choose the fault diagnosis method based on the selected test points but also use the feedback to check whether the current measurement data suit the fault diagnosis method, and improve the data through the measurement point update mechanism. Through this cycle, the selected test points and the diagnosis method are adapted to each other, which yields better diagnosis results and improves the diagnostic capability of the system.

6. Conclusions

To address the problem of diagnosability design while taking into account both the structure of the system and the diagnostic scheme, and to improve the fit between the diagnostic method and the system structure, a diagnosability-integrated design method based on graph theory was proposed in this paper. The method can generate specific fault diagnosis methods for specific systems and can be applied to fault diagnosis in electronic, mechanical, and other systems. It draws a diagnosis plot of measurement point from the diagnosability evaluation results, extracts the maximal cliques from this plot, and sets cascade classifiers on the maximal cliques to design the diagnosis method. The Shannon entropy computed from the posterior probabilities of the classifier output serves as the measure of the diagnosis capability of each diagnosis scheme. A diagnosis scheme with low diagnosis capability is reconfigured through the measurement point update mechanism in order to optimize the diagnostic scheme. The results of simulation experiments showed that, compared with the existing methods in the literature, the method in this paper improved the average diagnosis correct rate by 3.25 percentage points, and in most of the repeated simulation experiments its error rate was 0, which indicates that the method has a high degree of accuracy and stability.
The ultimate aim of the method in this paper is to improve the fault diagnosis capability of the system by designing its diagnostic scheme through diagnosability design, and the method relies on rich operating data of the system. Future work can consider how to improve the fault diagnosis capability through diagnosability design when the signal data of the system are difficult to obtain or are contaminated.

Author Contributions

Conceptualization, J.L. and X.S.; methodology, J.L.; writing—original draft preparation, J.L.; writing—review and editing, X.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available on request due to the restriction of privacy.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wang, D.; Fu, F.; Liu, C.; Li, W. Connotation and Research Status of Diagnosability of Control Systems: A Review. Acta Autom. Sin. 2018, 44, 1537–1553.
  2. Liu, Q.; Wang, Z.; Zhang, J.; Zhou, D. Necessary and sufficient conditions for fault diagnosability of linear open- and closed-loop stochastic systems under sensor and actuator faults. IEEE Trans. Autom. Control 2022, 67, 4178–4185.
  3. Zhao, D.; Ahn, C.K.; Paszke, W.; Fu, F.; Li, Y. Fault diagnosability analysis of two-dimensional linear discrete systems. IEEE Trans. Autom. Control 2021, 66, 826–832.
  4. Wang, Z.; Wang, Z.; Shen, Y. Fault isolability evaluation based on zonotope. Acta Autom. Sin. 2022, 48, 1921–1930.
  5. Ou Yang, D.; Sun, R.; Tian, X.; Gao, B. Set blocking-based approach to sensor selection in uncertain systems. J. Jilin Univ. Eng. Technol. Ed. 2023, 53, 547–554.
  6. Ding, S.; Li, L.; Krüger, M. Application of randomized algorithms to assessment and design of observer-based fault detection systems. Automatica 2019, 107, 175–182.
  7. Wang, G.; Zhao, H.; Guo, S. Structural analysis based sensor placement of a wind turbine gearbox. J. Vib. Shock 2018, 37, 181–188.
  8. Zhou, G.; Yi, T.; Xie, M.; Li, H.; Xu, J. Optimal wireless sensor placement in structural health monitoring emphasizing information effectiveness and network performance. J. Aerosp. Eng. 2021, 34, 04050112.
  9. He, K.; Li, X.B.; Sun, F.; Yang, Q.; Wu, B.; Meng, C.; Cai, W. Sensor optimization for variation diagnosis in multistation assembly processes. Math. Probl. Eng. 2022, 2022, 7904677.
  10. Jiang, D.; Li, W. Optimal Sensor Development for Power Supply Vehicles under Hybrid Information-Entropy Constrains. Acta Armamentarii 2022, 43, 1763–1771.
  11. Peng, Z.; Liu, Z. Optimal sensor placement of a gear box based on fault diagnosability. J. Vib. Shock 2021, 40, 155–163.
  12. Shi, S. Sensing Optimization Design of UAV Electric Actuator Operation State. Master’s Thesis, Harbin Institute of Technology, Harbin, China, 2020.
  13. Jiang, D.; Li, W.; Wang, J.; Sun, X. Research on Sensor Optimal Placement Method Using Quantitative Evaluation of Fault Diagnosability. Acta Autom. Sin. 2018, 44, 1128–1137.
  14. Zhang, Z.; Li, Q.; Liu, H.; Ding, R. Optimal placement of strain sensors for urban rail vehicles based on information entropy. J. Northeast. Univ. Nat. Sci. 2020, 41, 367–373+412.
  15. Cui, Y.; Shi, J.; Wang, Z. System-level operational diagnosability analysis in quasi real-time fault diagnosis: The probabilistic approach. J. Process Control 2014, 24, 1444–1453.
  16. Ding, S. Application of factorization and gap metric techniques to fault detection and isolation part ii: Gap metric technique aided fdi performance analysis. IFAC-PapersOnLine 2015, 48, 119–124.
  17. Ding, S. Application of factorization and gap metric technique to fault detection and isolation part i: A factorization technique based fdi framework. IFAC-PapersOnLine 2015, 48, 113–118.
  18. Seth, A.D.; Biswas, S.; Dhar, A.K. Diagnoser design strategy for discrete system: Case study of neutralization system. Adv. Control Appl. 2022, 4, e114.
  19. Zhou, H.; Zhang, H.; Zhong, F. Fault diagnosis of rolling bearings based on locally joint sparse marginal embedding. J. Vib. Shock 2023, 42, 124–130.
  20. Guo, R.; Zhang, H.; Niu, W.; Luo, X.; Cai, W.; Wang, J.; Wang, Y.; Zhao, J. Study on health status classification of variable gear pump based on GLCT and CPA-SVM. J. Mech. Eng. 2023, 59, 310–319.
  21. Wang, D.; Fu, F.; Li, W.; Tu, Y.; Liu, C.; Liu, W. A review of diagnosability of control systems with applications to spacecraft. Annu. Rev. Control 2020, 49, 212–229.
  22. Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, 8–14 December 2001.
  23. Wu, F.; Wang, X.; Ding, J.; Du, P.; Tan, K. Improved cascade forest deep learning model for hyperspectral imagery classification. Natl. Remote Sens. Bull. 2020, 24, 439–453.
  24. Ding, T.; Yan, Y.; Xie, Y.; Li, X. Fault diagnosis of scintillator detector based on improved KNN algorithm. At. Energy Sci. Technol. 2022, 56, 1431–1439.
  25. Ye, L.; Zheng, D.; Liu, Y.; Niu, S. Unbalanced data classification based on whale swarm optimization random forest algorithm. J. Nanjing Univ. Nat. 2022, 42, 99–105.
  26. Zhou, X.; Feng, Y.; Chen, L.; Luo, W.; Liu, S. Transformer fault diagnosis based on SVM optimized by the improved bald eagle search algorithm. Power Syst. Prot. Control 2023, 51, 118–126.
  27. Zhang, Y. Research on Test Point Selection and Fault Diagnosis Method of Analog Circuit. Master’s Thesis, Beijing University of Chemical Technology, Beijing, China, 2022.
Figure 1. Comparison of two diagnosability design processes. (a) General flow of diagnosability design; (b) the flow of integrated diagnosability design.
Figure 2. Flowchart of diagnosability integrated design method framework.
Figure 3. Forms of the diagnosis plot of measurement point for a three-state system. The dots in the figure represent the states of the system, and a connecting line between two dots indicates that the two states are distinguishable. Subfigures (a–d) show the four cases of whether the states of the three-state system are distinguishable.
Figure 4. Schematic diagram of sample distribution for different diagnosis plot of measurement point.
Figure 5. Schematic diagram of diagnosability-integrated design approach based on graph theory.
Figure 6. Circuit diagram.
Figure 7. The diagnosis plot of measurement point.
Figure 8. Contribution of different features to fault diagnosis of measurement point 6.
Figure 9. Fault diagnosis results of the method in this paper.
Figure 10. Schematic of the splicing of signal features from multiple measurement points.
Figure 11. Distribution of multiple simulation experimental results of different diagnostic methods.
Table 1. Circuit fault setting.
| State | Component | Fault |
| --- | --- | --- |
| $f_0$ | none | none |
| $f_1$ | $C_1$ | 10 nF |
| $f_2$ | $R_8$ | 50 kΩ |
| $f_3$ | $R_{18}$ | short circuit |
| $f_4$ | $R_{18}$ | 20 kΩ |
Table 2. Signal feature.
| No. | Feature | No. | Feature |
| --- | --- | --- | --- |
| 1 | maximum value | 8 | skewness |
| 2 | maximum value | 9 | waveform factor |
| 3 | mean value | 10 | impulse factor |
| 4 | root-mean-square value | 11 | margin factor |
| 5 | peak-to-peak value | 12 | gravity frequency |
| 6 | peak factor | 13 | mean square frequency |
| 7 | kurtosis | 14 | root-mean-square frequency |
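Table 3. Other diagnosis methods results.
| No. | Method | $f_0$ | $f_1$ | $f_2$ | $f_3$ | $f_4$ | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | KNN | 0 | 2 | 0 | 0 | 3 | 5 (5%) |
| 2 | RF | 0 | 0 | 0 | 0 | 6 | 6 (6%) |
| 3 | discriminative classifier | 0 | 0 | 0 | 0 | 1 | 1 (1%) |
| 4 | SVM | 0 | 0 | 0 | 0 | 1 | 1 (1%) |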
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lv, J.; Shi, X. A Diagnosability-Integrated Design Approach Based on Graph Theory. Appl. Sci. 2023, 13, 10080. https://0-doi-org.brum.beds.ac.uk/10.3390/app131810080
