Article

Unsupervised Machine Learning for Missing Clamp Detection from an In-Service Train Using Differential Eddy Current Sensor

1
Division of Operation and Maintenance Engineering, Lulea University of Technology, 97187 Lulea, Sweden
2
Alstom Transportation, 11743 Stockholm, Sweden
*
Author to whom correspondence should be addressed.
Sustainability 2022, 14(2), 1035; https://0-doi-org.brum.beds.ac.uk/10.3390/su14021035
Submission received: 20 November 2021 / Revised: 30 December 2021 / Accepted: 14 January 2022 / Published: 17 January 2022

Abstract

The rail fastening system plays a crucial role in railway tracks as it ensures operational safety by fixing the rail on to the sleeper. Early detection of rail fastener system defects is crucial to ensure track safety and to enable maintenance optimization. Fastener inspections are normally conducted either manually by trained maintenance personnel or by using automated 2-D visual inspection methods. Such methods have drawbacks when visibility is limited, and they are also found to be expensive in terms of system maintenance cost and track possession time. In a previous study, the authors proposed a train-based differential eddy current sensor system based on the principle of electromagnetic induction for fastener inspection that could overcome the challenges mentioned above. The detection in the previous study was carried out with the aid of a supervised machine learning algorithm. This study reports the finding of a case study, along a heavy haul line in the north of Sweden, using the same eddy current sensor system mounted on an in-service freight train. In this study, unsupervised machine learning models for detecting and analyzing missing clamps in a fastener system were developed. The differential eddy current measurement system was set to use a driving field frequency of 27 kHz. An anomaly detection model combining isolation forest (IF) and connectivity-based outlier factor (COF) was implemented to detect anomalies from fastener inspection measurements. To group the anomalies into meaningful clusters and to detect missing clamps within the fastening system, an unsupervised clustering based on the DBSCAN algorithm was also implemented. The models were verified by measuring a section of the track for which the track conditions were known. The proposed anomaly detection model had a detection accuracy of 96.79% and also exhibited a high score of sensitivity and specificity. 
The DBSCAN model was successful in clustering missing clamps, both one and two missing clamps, from a fastening system separately.

1. Introduction

Railway transportation is a significant mode of transportation for reasons of environmental friendliness, safety, cost, and lower energy consumption. It is a sustainable mode of transportation that supports the economic and industrial expansion of society through the mobilization of freight and passengers [1]. The growing demand to move large volumes of passenger and freight traffic and the current state of the existing railway infrastructure are issues that require substantial attention in the field of transportation [2]. Capital expansion of the railway infrastructure is a cost-intensive and time-consuming approach. Thus, the maintenance and renewal (M&R) process needs to be subjected to continuous improvement for the existing infrastructure to meet the capacity demand without compromising the quality of the provided service [3,4].
The operational capacity of a given railway infrastructure is significantly influenced by its utilization methods and its technical state or quality [5]. One of the key aspects of railway infrastructure maintenance is the inter-dependability of its operational capacity and the infrastructure condition. A high operational capacity with good service quality is achieved when the infrastructure is in a good state with high quality. With the increase in capacity utilization, the infrastructure is subjected to a higher traffic load leading to rapid deterioration of the infrastructure and degradation to its components. This consequently leads to higher M&R needs which require track possession, and thus eventually resulting in the reduction in operational capacity. The downtime arising from these M&R of networks is responsible for nearly half of all the delays to passengers. In Sweden, an average of 572.7 h per year of delay is incurred due to the failure of components in the railway track [6]. To mitigate such delays and to ensure safe, sustainable, and reliable operation, the track and its components need to be inspected frequently.
Rail fastening systems are a vital component of the railway track system as they clamp the rail to the sleepers, preventing the longitudinal, transverse, and lateral deviation of the rail from the sleepers, and thus help to preserve the designed track geometry. Failures of fasteners can also widen the gauge, increase wheel flange wear, reduce the safety of train operations, and may lead to catastrophic accidents [7]. At present, the inspection of track fasteners is carried out manually or using track inspection cars. Manual inspections are slow, labor-intensive, pose safety issues for maintenance staff, and are prone to human error. They are furthermore expensive for railroad companies, especially for long-term and large-scale development projects. Fastener inspection based on inspection cars is limited in inspection frequency and requires track possession during the inspection process, thus reducing the operational capacity [8]. With the continuous expansion of the railway network and recent technological developments, automated rail inspection systems based on machine vision have gained significant importance. To inspect track defects and detect the presence/absence of the fastening bolts, References [9,10] introduced the vision system for real-time infrastructure inspection (VISyR), a fully automatic and configurable field-programmable gate array (FPGA)-based vision system. The system used a Dalsa Piranha 2 line camera (Matrox) with a resolution of 1024 pixels to acquire images of the rail. An image-based detection device comprising two industrial laser range scanners, one for each rail, was used by Babenko [11] to inspect fasteners. On the acquired images, a convolutional filter bank was applied in this study. 
Each fastener type had a filter associated with it and the coefficients for each were estimated using an illumination normalized version of the Optimal Trade-off Maximum Average Correlation Height (OT-MACH) filter. A track cart was used by Resendiz et al. [12] to capture the video of the railroad track with off-the-shelf cameras. A texture classification with a bank of Gabor filters followed by support vector machine (SVM) was used to determine the location of the rail components from the captured video. Structured light sensors were employed by Mao et al. [13] to inspect fastener conditions and they used a decision tree classifier to classify the defects in the fastening system.
The application of automated machine vision for fastener inspection has predominated over the past two decades, but the methods used to detect fasteners from these images have varied over time. Image processing and deep learning-based methods are the two commonly employed approaches for detecting fasteners from the images [8]. Locating and segmenting the fastener region, extracting fastener features, and using a classification algorithm for fastener defect identification are the three main aspects of the image processing-based method. For detecting missing hexagonal-headed bolts, Marino et al. [9] made use of a multilayer perceptron neural classifier. Stella et al. [14] employed principal component analysis and wavelet transform for pre-processing the fastener images and used a neural network classifier to detect missing hook-shaped fasteners. To match the fastener images, Yang et al. [15] used the direction field as a template and used linear discriminant analysis (LDA) for matching and to obtain the weight coefficient matrix. Ruvo et al. [10] adopted an error backpropagation algorithm on the rail images to model two types of fasteners and introduced an FPGA-based architecture [16] using the same algorithm. AdaBoost [17,18], structure topic model (STM) [19], line local binary pattern (LLBP) [20], support vector machines (SVM) [21], and edge detection methods [22] are other frequently used techniques to detect fasteners from rail images. These traditional methods aid in fastener inspection with minimal manpower and reduced equipment resources; however, the detection accuracy can easily stagnate, as it is difficult to manually design robust and accurate features for rail components due to the diversity of shapes and complex backgrounds [23]. With the increase in computing power and the development of the graphical processing unit (GPU), deep learning methods [24,25,26,27,28] for detecting fasteners from rail images have gained substantial importance.
Over the past few years, significant advancements have been made in detecting fasteners and identifying the defects from railway track images; however, there are some underlying drawbacks associated with this method. The robustness and position accuracy are two major concerns associated with this mode of fastener detection [26]. It is a relatively expensive task to mount and maintain a reliable and high-quality automated visual inspection system as they are integrated with the operation and are subjected to vibrations, brightness fluctuations, and motion blurring during high-speed travel. This can deteriorate the accuracy of fastener condition detection and can raise safety concerns. Furthermore, the detection task becomes complicated when rail and its components are concealed due to the presence of rust and dust. Another significant drawback of this method is its inability to detect the rail and its components when they are obscured due to the presence of snow, sand, stones, and other debris. This calls for a removal process or additional rail surface treatments that add to the expense of the railroad companies. In Sweden, around 298,080 EUR were spent in 2014 to inspect two lines with a track length of ca. 300 km, of which more than 75% was utilized to inspect track components that exhibit magnetic characteristics (rail fastening, weld joints, insulation joints, etc.) [29]. With the extension of high-speed railway networks, the maintenance managers are striving to reduce these operation and maintenance costs through effective condition-based maintenance (CBM), while augmenting the quality and capacity of the rail services.
Non-destructive testing (NDT) plays a significant part in the condition-based maintenance of railway infrastructure. Eddy current (EC) testing is one of several NDT methods used for examining metallic components, and it works on the principle of electromagnetic induction. In earlier research, the authors proposed a train-based differential eddy current sensor [6,29] for fastener inspection that can overcome the major challenges associated with visual inspection. EC sensors are not affected by the presence of nonconductive materials in the sensor-to-target gap and can therefore be used in dirty environments (water, oil, etc.) where other inspection systems fail. The proposed inspection technique using differential eddy current sensors was able to detect fastener signatures when mounted on a trolley system at a distance of 65 mm above the railhead. The results presented in the previous literature were based on controlled measurements carried out on multiple short track sections, where the likelihood of disturbances was minimal. The missing clamp detection model in the previous study made use of supervised machine learning models, which were trained on data points with predefined labels to optimize their label recognition capacity in a given data set. The present study reports a case study where the measurement system was mounted on an in-service freight train and where the fastener condition and the likelihood of other disturbances were unknown. This case study aims to apply unsupervised anomaly detection to the measurements obtained from a train-based differential eddy current sensor for monitoring railway fastening systems. The purpose of this study was to facilitate the development of a train-based automated measurement system for inspecting railway fastening systems and to detect and analyze anomalies in the fastener inspection measurements.
The remainder of this paper is structured as follows. Section 2 elaborates the research methodology followed for this case study. The results and analysis from the study are explained in Section 3 and the conclusions and future work are discussed in Section 4.

2. Methodology

The proposed unsupervised fastener detection models are based on applying algorithms capable of recognizing patterns and relationships in the data, without any prior knowledge. The devised detection model will make use of two unsupervised machine learning models. The first model is implemented for detecting anomalous behavior from the given data points to separate the healthy class from the anomalous class. The next model will make use of a clustering technique on the data points to segregate them into different groups to extract meaningful information regarding the anomalous points.

2.1. Data Collection

2.1.1. Differential Eddy Current Sensor—Lindometer

For decades, the eddy current method has been well known for the non-destructive testing of electrically conductive objects [30]. EC testing is based on the phenomenon of electromagnetic induction, where an alternating current passing through a conducting coil creates an oscillating magnetic field. Every coil is characterized by an impedance ($Z_i$), which is a complex-valued generalization of resistance, for a single-frequency sinusoidal excitation $f$. The impedance of the coil (refer to Equation (1)) [31] can be expressed as:
$$Z_i = \frac{V_i}{I_i} = R_i + jX_i, \quad \text{where } X_i = 2\pi f L_i \tag{1}$$
where $V_i$ and $I_i$ are the voltage and current across the coil, $R_i$ is the resistance, and $X_i$ is the inductive reactance of the coil with an inductance of $L_i$. The impedance $Z_i$ has a magnitude $|Z|$ and a phase $\varphi$ [31] (refer to Equations (2) and (3)).
$$|Z| = \sqrt{R_i^2 + X_i^2} \tag{2}$$
$$\varphi = \tan^{-1}\frac{X_i}{R_i} \tag{3}$$
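As a small numerical illustration of Equations (1) to (3), the sketch below evaluates the impedance, magnitude, and phase of a coil. The resistance and inductance values are placeholders chosen for illustration and are not Lindometer parameters; only the 27 kHz excitation frequency is taken from the text.

```python
import cmath
import math

def coil_impedance(resistance_ohm, inductance_h, frequency_hz):
    """Return Z = R + jX with X = 2*pi*f*L, as in Equation (1)."""
    reactance = 2.0 * math.pi * frequency_hz * inductance_h
    return complex(resistance_ohm, reactance)

# Hypothetical coil values, for illustration only.
z = coil_impedance(resistance_ohm=2.0, inductance_h=50e-6, frequency_hz=27e3)
magnitude = abs(z)           # Equation (2): sqrt(R^2 + X^2)
phase_rad = cmath.phase(z)   # Equation (3): atan(X / R) for R > 0
```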
EC inspections are based on Faraday’s law of electromagnetic induction, according to which an alternating magnetic flux induces a circulating current in an electrical conductor. In turn, the induced circulating current, known as the eddy current, creates a secondary magnetic field that tends to weaken the effect of the primary magnetic field. As the EC intensity increases in the test piece, the imaginary part of the coil impedance decreases. The real part of the coil impedance, in contrast, increases, as the ECs add to the power dissipated in the system. The new coil impedance ($Z_f$) (refer to Equation (4)) [31] can be expressed as:
$$Z_f = \frac{V_f}{I_f} = R_f + jX_f \tag{4}$$
EC inspection generally measures this change in coil impedance from $Z_i$ to $Z_f$ in the form of either current or voltage signals to extract information about the test piece. The EC density is greatest at the surface and is not uniformly distributed throughout the volume of the test piece. The current flow decreases exponentially as the distance from the surface of the test piece increases. The skin depth $\delta$ (refer to Equation (5)) [31] is the distance from the surface at which the eddy current density decreases to $1/e$ of its surface value and is expressed as:
$$\delta = \sqrt{\frac{2}{\mu \omega \sigma}} \tag{5}$$
where σ is the conductivity, given by the reciprocal of the resistivity ρ (σ = 1/ρ), µ is the magnetic permeability, and ω is the angular frequency of the current, given by ω = 2πf.
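To give a feel for the magnitudes involved, the following sketch evaluates Equation (5) at the 27 kHz driving frequency. The material constants are assumed, order-of-magnitude textbook values (copper, and a generic ferromagnetic steel with a relative permeability of 100); they are not taken from the study.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in H/m

def skin_depth(frequency_hz, conductivity_s_per_m, relative_permeability=1.0):
    """Skin depth in metres: sqrt(2 / (mu * omega * sigma)), Equation (5)."""
    omega = 2.0 * math.pi * frequency_hz
    mu = MU_0 * relative_permeability
    return math.sqrt(2.0 / (mu * omega * conductivity_s_per_m))

# Assumed material values, for illustration only.
copper = skin_depth(27e3, 5.8e7)        # non-magnetic conductor
steel = skin_depth(27e3, 5.0e6, 100.0)  # ferromagnetic steel, mu_r assumed
```

With these assumed values the skin depth is roughly 0.4 mm for copper and 0.14 mm for the steel, which illustrates that at this frequency the eddy currents are confined to a thin surface layer.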
In principle, eddy current sensors are sensitive to local fluctuations of the magnetic permeability (µ), conductivity (σ), and the geometric form of the material, and hence can be used to detect inhomogeneities along the rail track [30]. For train-based applications, differential EC sensors are preferred. The differential EC sensor used for this case study was developed by Alstom Transport (Stockholm, Sweden) and is named the ‘Lindometer’. Figure 1 depicts the proposed sensor, consisting of one driving coil ‘D’ and two pick-up coils ‘P1’ and ‘P2’. The driving coil ‘D’ is driven by a sinusoidal primary current i(t) that generates an alternating primary magnetic field. Eddy currents are thus induced within the rail and other electrically conductive components located in the proximity of the sensor. As a result of these ECs, a secondary magnetic field is generated, which has an opposite direction to that of the primary field, complying with Lenz’s law.
The information along the rail is represented as variations in amplitude or phase or a combination of both, which are extracted and analyzed using demodulation techniques. The size of the driving coil is approximately 18 (z), 70 (x), and 155 (y) mm. The two pick-up coils are encased by the driving coil, which acts as an outer winding. The winding of the driving coil is applied in one layer with 22 turns using a copper wire of 0.7 mm diameter. The pick-up coils have a size of 18 (z), 30 (x), and 150 (y) mm, with each coil having a winding applied in one layer with 94 turns of a copper wire of 0.16 mm diameter. The two pick-up coils are placed side by side along the x-direction with a gap of 4 mm.
The two pick-up coils are enclosed by the driving coil and differentially coupled as depicted in the circuit diagram given in Figure 2. The differentially coupled pick-up coils cancel out the cross talk between the pick-up and driving coils, though not completely. The resulting voltage u(t) is the linear superposition of the voltage induced by the ECs along the rail and the cross talk residue. The quality of the cross talk cancellation is determined by the geometrical symmetry between the three coils, and hence the windings are placed in an even layer with no crossovers. ECs are generated in the rail and its vicinity along the x-y plane by the driving coil, and the pick-up coils are sensitive only to the z-component of the generated flux due to their geometrical orientation, as depicted in Figure 1. The differentially coupled pick-up coils (P1–P2) are sensitive only to changes in the ECs along the rail and its vicinity. For an even surface (such as an ideal rail with no other electrical components or defects) with no change in the geometric form of the material, the conductivity (σ), or the magnetic permeability (µ), the resulting voltage will be zero because similar ECs are induced under both pick-up coils. When there is a change in the geometry, conductivity, or magnetic permeability at a single point along the rail or its vicinity, a change in the ECs takes place. Owing to the symmetry of the differentially coupled pick-up coils, only the point with the EC change creates a signal, given by the difference between the voltages of the two pick-up coils (refer to Equation (6)).
$$u(t) = P_{1v}(t) - P_{2v}(t) \tag{6}$$
The Lindometer encloses two such independent differential EC sensors placed 20 cm apart. The Lindometer uses two driving fields at frequencies of 18 and 27 kHz, respectively. Two channels were installed within the Lindometer to facilitate future speed measurements using cross-correlation techniques. The above-mentioned frequencies fall under the rail norms and can be used for inspecting the track and its components. For this case study, only one channel, with a driving field frequency of 27 kHz, was used for the measurement along the track. To stabilize the sensor against both temperature drift and vibration, the entire unit is vacuum potted with epoxy resin.

2.1.2. Case Study: Train Measurement along the Iron Ore Line, Sweden

The Iron Ore Line (IOL), electrified in 1915, was the first longer railway in Sweden to be electrified. It is a 398 km line that runs between Riksgränsen and Boden, Sweden. The line is designed for single-track operation and has a track gauge of 1435 mm. On average, 29 million tons (MGT) of iron ore are transported annually to the ports of Narvik and Luleå via this line. The maximum allowed speed is 70 km/h for an unloaded freight train and 60 km/h for a loaded one. For passenger trains, the speed can vary from 120 to 135 km/h. Figure 3 depicts the geographical location of the Iron Ore Line.
For the present study, the Lindometer was mounted on an unloaded freight train (refer to Figure 4) and measurements were carried out from Kiruna (depicted in Figure 3 as a red marker with a black arrow indicating the direction of measurement). The speed of the train was 70 km/h, and the measurement was carried out over a length of approximately 2.5 km. The measurement was recorded using a standard laptop (Dell Ultrabook). The track section considered for this case study had concrete sleepers with Pandrol fast clip fasteners. The measured track section included one Switch & Crossing (S&C) and one bridge, as well as other standard track parts such as insulation joints and welds.
The measured track in this study is depicted in Figure 5. A total of 3718 sleepers were recorded where the ground truth was inspected for the final 187 sleepers of the section. The ground truth included the position of insulation joints, welds, and missing clamps, etc. In this part, clamps were also manually removed to induce a predefined pattern of fastener anomalies. The section included 172 sleepers with no missing clamps (called healthy fastening system), 6 instances with one missing clamp, and 2 instances where both clamps were missing. The ground truth section also included 4 instances of weld joints and 3 instances of insulation joints.

2.2. Signal Processing and Feature Extraction

Several signal processing steps had to be applied before sufficient information corresponding to the individual fastening systems could be extracted from the raw signal. The EC signal had to be demodulated, resampled, filtered, and rotated to extract relevant features pertaining to the fastening system [29]. The signal processing techniques employed in this study are depicted in Figure 6, and a detailed explanation of them can be found in the previous studies [6,29].
The coil impedance changes when the coil comes into the vicinity of a fastener. The return field from the rail surface modulates the tone from the oscillator. A quadrature amplitude demodulator was used to extract the signal caused by the impedance variation: the raw sensor signal was multiplied by its carrier (27 kHz) and low-pass filtered (2 kHz) to extract the baseband. The output from the demodulator consists of X-axis and Y-axis signals, which represent the real and imaginary parts of the impedance, respectively. The signal was then resampled from 215.52 to 35.92 kHz.
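The demodulation step can be sketched as follows. This is a minimal illustration, not the processing chain of the study: it mixes the signal with cosine and sine references at the carrier frequency and uses plain block averaging as a stand-in for the 2 kHz low-pass filter. The sample rate of 216 kHz is an assumption chosen so that one carrier period spans exactly eight samples (the study reports 215.52 kHz), and the function name and test tone are hypothetical.

```python
import math

def quadrature_demodulate(samples, sample_rate_hz, carrier_hz, block_len):
    """Mix with cos/sin references and average over blocks.

    The block average acts as a crude low-pass filter. One complex baseband
    value (in-phase + j * quadrature) is returned per block.
    """
    baseband = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        i_sum = 0.0
        q_sum = 0.0
        for n in range(start, start + block_len):
            phase = 2.0 * math.pi * carrier_hz * n / sample_rate_hz
            i_sum += samples[n] * math.cos(phase)
            q_sum -= samples[n] * math.sin(phase)
        baseband.append(complex(2.0 * i_sum / block_len, 2.0 * q_sum / block_len))
    return baseband

# Synthetic test tone with amplitude 0.5 and phase 0.3 rad (illustrative values).
FS = 216_000.0   # assumed sample rate: exactly 8 samples per carrier period
FC = 27_000.0    # driving-field frequency from the text
tone = [0.5 * math.cos(2.0 * math.pi * FC * n / FS + 0.3) for n in range(800)]
iq = quadrature_demodulate(tone, FS, FC, block_len=80)
```

Each element of `iq` recovers the amplitude and phase of the tone as a complex number, which is the representation on which the later rotation and feature extraction operate.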
After demodulating and resampling the sensor signal, a bandpass filter of lower bound and upper bound of 29 and 34 Hz, respectively, was applied, as the periodicity of the fastener was found to be in this range. The filtering was carried out to retrieve maximum information pertaining to the fastener system and attenuate other frequency components corresponding to noise and other ferromagnetic components.
After demodulation, resampling, and bandpass filtering, the fastener signatures in the signal were found to be shifted from the in-phase direction. The complex EC signal was rotated such that the fastener signatures were projected along the in-phase direction to extract maximum information pertaining to the fastener and have better visualization. The EC signal was rotated by degree θ or Φ radian, such that the peak amplitude of the fastener signatures was maximized. The signal was rotated by an optimal angle (found from the method employed in a previous study [29]) of 255° to align the fastener signature along the in-phase direction.
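Rotating a complex signal amounts to multiplying every sample by a unit phasor. The sketch below shows this together with a simple grid search for the angle that maximizes the peak-to-peak in-phase amplitude. The grid search is an assumed stand-in for the method of [29], which is not described here, and the counter-clockwise sign convention and synthetic signature are likewise assumptions.

```python
import cmath
import math

def rotate(signal, angle_deg):
    """Rotate every complex sample counter-clockwise by angle_deg."""
    factor = cmath.exp(1j * math.radians(angle_deg))
    return [z * factor for z in signal]

def best_rotation_angle(signal, step_deg=1):
    """Grid-search the angle that maximizes the peak-to-peak in-phase amplitude."""
    def peak_to_peak(angle):
        real = [z.real for z in rotate(signal, angle)]
        return max(real) - min(real)
    return max(range(0, 360, step_deg), key=peak_to_peak)

# Synthetic signature lying along the 105-degree direction of the IQ plane.
direction = cmath.exp(1j * math.radians(105.0))
signature = [math.sin(2.0 * math.pi * k / 50.0) * direction for k in range(50)]
aligned = rotate(signature, 255.0)   # 105 + 255 = 360 degrees
```

Because a bipolar signature is symmetric about the origin, the peak-to-peak criterion alone only determines the angle up to 180°; an additional convention (for example, the sign of the leading lobe) would be needed to pick between the two candidates.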
The bandpass filter and the rotation of the EC signals suppress, to an extent, the disturbances arising from the presence of conductive and magnetic material in the sensor-to-target gap. The bandpass filter was set to extract the fastener signatures and attenuate frequency components outside that range, which could alter the energy content associated with the fastener signatures. The frequency band for the bandpass filter depends on the speed of the train and must be adjusted accordingly. Different components within the railway system have different geometrical shapes and different values of electrical conductivity and magnetic permeability. Hence, their signatures appear at angles relative to the in-phase direction that differ from one another and from that of the fasteners. Since the study aims to detect fastener signatures, the rotation angle was set to align the fastener signatures along the in-phase direction, thus suppressing, to an extent, information from other disturbances.
Three features are extracted for each individual fastener, namely the arc length of the complex signal, the peak-to-peak value, and the RMS value. The peak-to-peak and RMS features are obtained from the real part of the EC signal, whereas the arc length feature comprises information from both the real and the imaginary parts of the EC signal. A total of 3718 fastener signatures were recorded from the train measurement, and thus the feature matrix has a size of 3718 × 3.
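The dependence of the pass band on train speed follows from the fact that fasteners pass the sensor at a rate equal to the train speed divided by the sleeper spacing. The sketch below works through this arithmetic. The sleeper spacing of 0.60 m is an assumption (the text does not state it), and the linear scaling of the band with speed is derived from that relation rather than taken from the study.

```python
def fastener_frequency_hz(speed_kmh, sleeper_spacing_m):
    """Passing frequency of fasteners: train speed divided by sleeper spacing."""
    return (speed_kmh / 3.6) / sleeper_spacing_m

def scaled_band_hz(band_hz, reference_speed_kmh, new_speed_kmh):
    """Scale a pass band linearly with train speed."""
    ratio = new_speed_kmh / reference_speed_kmh
    return (band_hz[0] * ratio, band_hz[1] * ratio)

f_70 = fastener_frequency_hz(70.0, 0.60)            # assumed 0.60 m spacing
band_60 = scaled_band_hz((29.0, 34.0), 70.0, 60.0)  # band for a 60 km/h run
```

Under the assumed spacing, the fastener frequency at 70 km/h is about 32.4 Hz, which lies inside the 29 to 34 Hz band used in the study; at 60 km/h the same band would shift to roughly 24.9 to 29.1 Hz.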
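The three features can be computed as in the sketch below. The exact definitions used in the study are given in the earlier work; here the standard definitions are assumed, namely the arc length as the summed distance between consecutive complex samples, and the peak-to-peak and RMS values taken from the real part.

```python
import math

def fastener_features(signature):
    """Return (arc_length, peak_to_peak, rms) for one complex fastener signature."""
    real = [z.real for z in signature]
    # Arc length uses both the real and imaginary parts of the signal.
    arc_length = sum(abs(b - a) for a, b in zip(signature, signature[1:]))
    peak_to_peak = max(real) - min(real)
    rms = math.sqrt(sum(x * x for x in real) / len(real))
    return arc_length, peak_to_peak, rms

# Toy signature: a 3-4-5 step in the IQ plane, then a step of 3 back along the real axis.
toy = [0 + 0j, 3 + 4j, 0 + 4j]
features = fastener_features(toy)
```

Stacking one such triple per fastener yields the 3718 × 3 feature matrix described above.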

2.3. Anomaly Detection

Anomaly detection refers to the problem of identifying data points, events, and/or observations that deviate from the expected behavior [32]. These non-conforming points, events, or observations are referred to as anomalies or outliers. The general goal of an anomaly detection approach is to define a region representing the normal behavior and to identify observations in the data set that do not belong to this normal region as anomalies. One of the main challenges for this approach is the availability of labeled data for the training/validation of models. Based on the availability of labels, anomaly detection is classified into three categories: supervised, semi-supervised, and unsupervised anomaly detection. Models using the supervised anomaly detection technique assume the availability of training data points with labels for both the normal and anomalous classes. The major challenge for this method is to obtain labeled data that are accurate and representative of all types of behavior. Labeling is usually performed manually by human experts and thus requires substantial effort and is cost intensive. Semi-supervised techniques assume that labels are available only for the normal class during training. The typical approach employed in semi-supervised anomaly detection is to build a model for the class corresponding to normal behavior and use the model to identify anomalies during the test stage. Unsupervised anomaly detection techniques, on the other hand, do not require labeled training data and work on the assumption that normal instances are more frequent than anomalies in the data set. Unsupervised learning does not require human expertise to label the entire data set and is hence more cost-effective.
Two forms of unsupervised machine learning techniques were implemented to analyze the fastener measurements. The first model was implemented to identify and segregate the anomalous data points from the healthy or normal ones. The normal behavior of a railway fastening system is when the fastening system has both rail clamps intact. To segregate the data points into normal and anomalous points, a combination of isolation forest (IF) and connectivity-based outlier factor (COF) was used (briefly discussed below). The two methods were combined to detect both global and local anomalies. The ground truth points were used to evaluate the anomaly detection model based on accuracy, sensitivity, and specificity.
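For reference, the three evaluation measures can be computed from ground-truth and predicted labels as in the sketch below. The labels shown are hypothetical and are not results from the study; the encoding of 1 for anomalous and 0 for healthy is an assumed convention.

```python
def detection_metrics(true_labels, predicted_labels):
    """Accuracy, sensitivity and specificity with 1 = anomaly and 0 = healthy."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # share of true anomalies that were flagged
        "specificity": tn / (tn + fp),   # share of healthy samples left unflagged
    }

# Hypothetical labels, not results from the study.
truth = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
predicted = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
metrics = detection_metrics(truth, predicted)
```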
The second method in the anomaly detection model aims to group the anomalous points into meaningful clusters. The clustering was carried out using the DBSCAN algorithm (briefly discussed below). To overcome the problem of identifying what each cluster represents, the authors used a smaller set of data from the measurement where the labels for the data points were known. The smaller set of data was used to identify some or all of the found clusters in order to know what the clusters represented.

2.3.1. Isolation Forest (IF)

The isolation forest is an ensemble-based unsupervised anomaly detection method that is an extension of decision trees. It works on the basis of isolation, where iterative partitioning of the input space is carried out to separate an observation from the rest of the data. The isolation step creates a tree where an observation is present at each leaf and each internal node is associated with a split on one variable. The isolation step is repeated t times, creating different trees. An anomaly score is then generated for each data point by traversing each tree in the forest; points that are isolated after fewer splits on average receive a higher score. A comprehensive description of the isolation forest is given by Liu et al. [33].
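The isolation principle can be illustrated with a compact, didactic re-implementation. The sketch below follows the general scheme of Liu et al. [33] (random splits, average path length, score 2^(-E[h]/c(n))); it is not the implementation used in the study, the hyperparameters are assumed, and it expects at least two data points.

```python
import math
import random

def _c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def _build(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            _build(left, depth + 1, max_depth, rng),
            _build(right, depth + 1, max_depth, rng))

def _path_length(tree, point, depth=0):
    """Depth at which the point ends up, corrected for unsplit leaves."""
    if tree[0] == "leaf":
        return depth + _c(tree[1])
    _, dim, split, left, right = tree
    return _path_length(left if point[dim] < split else right, point, depth + 1)

def isolation_scores(points, n_trees=100, sample_size=64, seed=0):
    """Anomaly score in (0, 1); values near 1 indicate easily isolated points."""
    rng = random.Random(seed)
    size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(size))
    trees = [_build(rng.sample(points, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    scores = []
    for p in points:
        mean_path = sum(_path_length(t, p) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / _c(size)))
    return scores

# Synthetic 3-feature data: 63 Gaussian points plus one obvious global outlier.
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(63)]
data.append((8.0, 8.0, 8.0))
scores = isolation_scores(data)
```

In this toy data set the appended point is separated from the rest after very few random splits, so it receives the highest score.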

2.3.2. Connectivity-Based Outlier Factor (COF)

The connectivity-based outlier factor is an improved version of the local outlier factor (LOF) [34], where a degree of outlierness is assigned to each data point. This degree is called the connectivity-based outlier factor. The main difference between LOF and COF lies in how the neighborhood of size k is evaluated. LOF computes the neighborhood using the Euclidean distance, whereas COF uses a shortest-path measure called the chaining distance to characterize the nearest neighbors. Once the neighborhood is computed, an anomaly score is generated for each data point. A comprehensive description of COF is given by Tang et al. [35].
Isolation forests are sensitive to global anomalies and may often find it difficult to detect local anomalies. COF is sensitive to local anomalies and may have difficulties in detecting global anomalies. An effective anomaly detection algorithm should be able to detect both global and local anomalies. In most cases, an algorithm that is effective in finding global anomalies fails to find local anomalies, and vice versa [36]. To overcome this limitation, an integrated approach using both isolation forest and connectivity-based outlier factor is used in this study. A point is considered an anomaly only when it is detected as an anomaly by both algorithms.
As per the guidelines laid down by the Swedish transport administration (Trafikverket), no more than 4 clamps can be missing within a distance of 20 sleepers (20 sleepers × 4 clamps/sleeper = 80 clamps). This corresponds to a tolerance of 5% (4 of 80) in missing clamps over 20 sleepers. This information is used to set the threshold value for the anomaly scoring of both the above algorithms.
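The chaining-distance idea can be made concrete with a small sketch. It follows a reading of the definitions in Tang et al. [35], in which the average chaining distance weights the edges of the set-based nearest path so that early edges count most; the value of k and the toy data are assumptions, and this is not the implementation used in the study.

```python
import math

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def _knn(points, i, k):
    """Indices of the k nearest neighbours of point i (Euclidean distance)."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: _dist(points[i], points[j]))[:k]

def _avg_chaining_distance(points, i, neighbours):
    """Weighted mean of the edge lengths on the set-based nearest path from i."""
    connected = [i]
    remaining = list(neighbours)
    n = len(remaining) + 1
    total = 0.0
    for step in range(1, n):
        # Next edge: shortest link between the connected set and the remaining points.
        length, nxt = min((_dist(points[c], points[r]), r)
                          for c in connected for r in remaining)
        total += 2.0 * (n - step) / (n * (n - 1)) * length
        connected.append(nxt)
        remaining.remove(nxt)
    return total

def cof_scores(points, k=4):
    """Connectivity-based outlier factor; values well above 1 suggest outliers."""
    count = len(points)
    neighbours = [_knn(points, i, k) for i in range(count)]
    ac = [_avg_chaining_distance(points, i, neighbours[i]) for i in range(count)]
    return [k * ac[i] / sum(ac[j] for j in neighbours[i]) for i in range(count)]

# Ten points on a line with unit spacing, plus one point off the line.
line = [(float(x), 0.0) for x in range(10)]
cof = cof_scores(line + [(4.5, 3.0)], k=4)
```

The points on the line all obtain a factor of about 1, while the point that breaks the line pattern obtains a clearly larger value, which is the kind of local deviation COF is designed to expose.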
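One way to combine the two detectors is sketched below: each detector flags the highest-scoring 5% of the samples, and a sample is reported only when both agree. Interpreting the 5% tolerance as the fraction of samples to flag is an assumption, since the text does not spell out how the threshold is derived from it; the function names and demo scores are hypothetical.

```python
def flag_top_fraction(scores, fraction):
    """Flag the highest-scoring samples so that about `fraction` of them are marked."""
    n_flagged = max(1, round(fraction * len(scores)))
    threshold = sorted(scores, reverse=True)[n_flagged - 1]
    return [s >= threshold for s in scores]

def combined_anomalies(if_scores, cof_values, fraction=0.05):
    """A sample is anomalous only if both detectors flag it (logical AND)."""
    if_flags = flag_top_fraction(if_scores, fraction)
    cof_flags = flag_top_fraction(cof_values, fraction)
    return [a and b for a, b in zip(if_flags, cof_flags)]

# Hypothetical scores for 20 samples; 5% of 20 means one flag per detector.
if_demo = [0.40] * 19 + [0.90]
cof_demo = [1.00] * 19 + [2.50]
both = combined_anomalies(if_demo, cof_demo)
```

Requiring agreement makes the combined detector more conservative than either detector alone, since a sample flagged by only one of them is not reported.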

2.3.3. DBSCAN

Density-based spatial clustering of applications with noise (DBSCAN) is an unsupervised density-based clustering algorithm. DBSCAN works well with clusters of arbitrary shape and size and does not require the number of clusters to be specified in advance. DBSCAN requires two main parameters, epsilon (ε) and the minimum number of points, to form clusters from dense regions. Epsilon represents the maximum radius of the neighborhood, and the minimum number of points specifies how many data points must lie within that radius for the region to count as dense. Closely packed points group together and form a cluster, and points that lie in low-density regions are marked as noise. A comprehensive description of the DBSCAN algorithm is given by Ester et al. [37].
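The roles of the two parameters can be seen in a minimal re-implementation of the algorithm. The sketch below is for illustration only and is not the implementation used in the study; the parameter values and the toy points are assumptions, and a point counts as part of its own neighborhood.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks low-density points."""
    def region(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)   # j is a core point: keep expanding
    return labels

# Two tight groups and one isolated point (illustrative data).
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
       (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
       (10.0, 0.0)]
labels = dbscan(pts, eps=0.5, min_pts=3)
```

The two groups receive separate cluster labels without the number of clusters being given in advance, and the isolated point is labeled as noise.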

3. Results and Discussion

The results for this case study were processed in three steps. First, the signal processing and feature extraction techniques were carried out on the raw signal obtained from the train measurement. Second, the extracted features were used as an input to detect anomalies from the measurement sequence. Third, clustering was carried out to identify different groups within the measurement points. A set of data points (ground truth) whose labels were known was used to identify what each cluster represented. The ground truth was also used to evaluate the anomaly detection method.

3.1. Signal Processing and Feature Extraction

Figure 7 depicts the time signal plot after demodulation, resampling, filtering, and rotating the raw signal obtained from the measurement carried out on the track section having 3718 sleepers. A small window of the time signal is expanded in Figure 7 to depict fastener signatures. The IQ plot (depicted in Figure 8) shows that the majority of the fastener signatures (representing healthy behavior) are aligned in parallel with the in-phase direction. Several signatures are shifted at various angles with respect to the in-phase direction, or real axis. This is due to the presence of other magnetic components, such as weld joints and insulation joints, that have different magnetic permeability, conductivity, or geometric form. The presence of such components near a fastening system can affect the induced voltage in the eddy current sensor, thus causing a deviation in the corresponding signature; such deviations are considered anomalous behavior in this study.
The zero-crossing in the signal (refer to Figure 7) from positive to negative induction represents the center position of the fastening system. The zero-crossings were used to segregate individual fastening systems, and features were extracted for each of them. The standardized feature values of individual fastener signatures for the 3718 measured sleepers are depicted in Figure 9. The feature matrix used in this case study for anomaly detection is built from these three features (peak-to-peak, RMS, and arc length of the complex signal) and has a dimension of 3718 × 3 (3718 samples and 3 features).
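As a rough illustration of this step, the sketch below locates positive-to-negative zero crossings and computes the three features named in Figure 9 for one complex (in-phase plus quadrature) signature. The exact feature definitions are not given in this section, so the choices here (peak-to-peak on the in-phase component, RMS on the magnitude, arc length as the summed distance between consecutive complex samples) are assumptions, as are all function names and the toy segment.

```python
import math


def falling_zero_crossings(in_phase):
    """Indices i where the signal is positive at i and non-positive at i + 1."""
    return [i for i in range(len(in_phase) - 1)
            if in_phase[i] > 0 and in_phase[i + 1] <= 0]


def signature_features(segment):
    """Peak-to-peak, RMS and arc length of one complex fastener signature."""
    real = [z.real for z in segment]
    peak_to_peak = max(real) - min(real)
    rms = math.sqrt(sum(abs(z) ** 2 for z in segment) / len(segment))
    arc_length = sum(abs(b - a) for a, b in zip(segment, segment[1:]))
    return peak_to_peak, rms, arc_length


def standardize(column):
    """Zero-mean, unit-variance scaling of one feature column (needs a non-constant column)."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std for x in column]


# Hypothetical five-sample signature lying along the in-phase axis.
segment = [0 + 0j, 1 + 0j, 0 + 0j, -1 + 0j, 0 + 0j]
crossings = falling_zero_crossings([z.real for z in segment])
p2p, rms, arc = signature_features(segment)
scaled = standardize([1.0, 2.0, 3.0])
```

Stacking one such feature triple per fastening system, and standardizing each column, would give a matrix of the 3718 × 3 shape mentioned above.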

3.2. Anomaly Detection

The feature matrix obtained above was used as an input to both anomaly detection algorithms. For isolation forest (IF), the number of trees was set to 1000 and a subsampling size of 128 was used. The contamination parameter for IF was not specified, as this study makes use of unsupervised anomaly detection and no information regarding the percentage of outlier points was known for the entire data set. Figure 10 depicts the anomaly scores obtained for the measurement points (3718 samples) for both algorithms. The threshold value for both algorithms was calculated using the 95% quantile of the distribution of anomaly scores. A measurement point with a score lower than the threshold value was considered normal or healthy. A measurement point with an anomaly score greater than the threshold value was considered an anomaly. The red markers in Figure 10 correspond to the identified anomalies and the green markers represent measurement points that were normal or healthy. Out of 3718 instances, IF detected 186 anomalies and COF detected 136 anomalies.
The scatter plots for both algorithms are depicted in Figure 11. The scatter plot (2-D) aids in understanding the location of anomalies with respect to the healthy or normal cluster. For both algorithms, most of the anomalous points detected were far from the cluster of normal points. IF detected around 76 anomalous points that were close to the border of the normal cluster. COF, on the other hand, had around 25 points detected as anomalies that were within or very close to the normal class.
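The thresholding rule can be sketched as follows. The quantile uses linear interpolation between order statistics, which is an assumption (the study does not state the estimator), and the scores are hypothetical placeholders in which a higher value means a more anomalous point, as in the text.

```python
def quantile(values, q):
    """Linearly interpolated q-quantile (0 <= q <= 1) of a list of numbers."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def flag_anomalies(scores, q=0.95):
    """Return (threshold, flags); a point is flagged when its score exceeds the threshold."""
    threshold = quantile(scores, q)
    return threshold, [s > threshold for s in scores]


# 3718 hypothetical, distinct anomaly scores (one per measured sleeper).
scores = [float(i) for i in range(3718)]
threshold, flags = flag_anomalies(scores)
n_flagged = sum(flags)
```

With 3718 distinct scores this rule flags 186 points, which agrees with the number of anomalies reported for IF.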
Figure 12 depicts the scatter plot when both IF and COF algorithms were combined. When both the algorithms were combined, a total of 121 measurement points were detected as anomalies. Out of the 121 anomalous points, only 10 points were close to or within the normal cluster, thus indicating that combining the two algorithms was much more efficient in separating the anomalous points from the normal cluster.
The ground truth points were utilized for evaluating the performance of the anomaly detection algorithms. The ground truth had 187 measurement points whose labels were available. A total of 172 of the 187 points were healthy or exhibited normal behavior and 15 instances were anomalous. The anomalous instances included missing clamps (both one and two missing clamps within a fastening system), weld joints, and insulation joints, as depicted in Figure 5. Figure 13 depicts the confusion matrices of the three methods for the ground truth points. Both IF and COF were able to detect all the anomalous points correctly. However, the number of false negatives (where the actual label is positive, i.e., normal, but the point is incorrectly predicted as negative, i.e., anomalous) was high for both IF and COF. When the two algorithms were combined, the number of false negatives dropped significantly.
Table 1 lists the evaluation parameters calculated from the confusion matrices. Accuracy, sensitivity, and specificity are used to evaluate the performance of the three methods. Sensitivity, in this study, indicates the proportion of normal instances that were predicted correctly. Specificity, on the other hand, is the proportion of anomalous cases that were predicted correctly. The specificity of all three methods was found to be 100%, as each was able to predict all anomalous points correctly. The accuracy and sensitivity were higher when the IF and COF algorithms were combined than when they were used individually.
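The three measures follow directly from the confusion matrix counts. In the sketch below, the positive class is the normal class, as in the text. The counts are not read from Figure 13; they are inferred here from Table 1 together with the 172 normal and 15 anomalous ground truth points, so they should be treated as an assumption.

```python
def evaluate(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity in percent.

    Positive class = normal/healthy, negative class = anomalous.
    """
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return accuracy, sensitivity, specificity


# Counts inferred from Table 1 (assumption), not taken from Figure 13.
inferred_counts = {
    "IF": dict(tp=151, fn=21, tn=15, fp=0),
    "COF": dict(tp=155, fn=17, tn=15, fp=0),
    "IF and COF": dict(tp=166, fn=6, tn=15, fp=0),
}
results = {name: evaluate(**counts) for name, counts in inferred_counts.items()}
```

Under these inferred counts, combining the detectors reduces the normal points wrongly flagged as anomalous from 21 (IF) and 17 (COF) to 6.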

3.3. Clustering Using DBSCAN Algorithm

The two main parameters required by the DBSCAN clustering algorithm to form clusters of the dense regions are epsilon (ε) and the minimum number of points. The basic criterion for choosing the minimum number of points is to use a value greater than or equal to the dimension of the data set plus one. The minimum number of points chosen for this study was four (the lowest value that is accepted for a data set of dimension three). The choice was based on the fact that a given section will not usually have many fastening systems with both clamps missing. The epsilon value is computed from the input data using a ‘k’ nearest neighbor (k-NN) search with the given minimum number of points (refer to Figure 14). ‘k’ is the number of neighbors of a point, which is one less than the minimum number of points in the neighborhood. The epsilon value obtained was 0.106.
The clusters obtained using the DBSCAN algorithm are depicted in Figure 15a. For an epsilon value of 0.106 and a minimum of 4 points, the proposed algorithm detected 4 clusters of dense regions with distinct boundaries, and the other points were recorded as noise. The major difficulty with an unsupervised clustering method lies in making informed interpretations of the clusters obtained. From the previous results obtained during anomaly detection and from system knowledge, the cluster with the maximum number of points can be inferred to be the healthy class. However, the remaining clusters, representing the anomalous points, were difficult to interpret.
The silhouette score was utilized to estimate the quality of the clusters formed by the DBSCAN algorithm. The silhouette score determines how well each sample lies within its respective cluster. The value of the silhouette coefficient lies in the range [−1, 1]. A score of one indicates that the clusters are very dense and well separated. A score of zero indicates that the clusters are overlapping. A score of less than zero means that the sample may have been assigned to the wrong cluster. The proposed clustering model had an average silhouette score of 0.8993, representing a good quality of clustering by the algorithm on the entire data set. Of the samples, 98.78% had an individual score above zero, indicating that they belonged well to their respective clusters. Only 45 samples (1.21%) had a score below zero. All 45 samples that exhibited a score below zero were found to be marked as noise by the algorithm.
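A k-distance curve of the kind used for this estimate can be computed as sketched below, with k set to one less than the minimum number of points. How the epsilon value was read from the curve in Figure 14 is not stated, so the knee rule used here (the point on the sorted curve lying farthest below the straight line joining its two ends) is an assumption, and the toy points are hypothetical.

```python
import math


def k_distance_curve(points, k):
    """Sorted distance from every point to its k-th nearest neighbour."""
    curve = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        curve.append(dists[k - 1])
    return sorted(curve)


def knee_value(curve):
    """Curve value at the point farthest below the chord joining both ends.

    Simple heuristic; expects at least two points on the curve.
    """
    n = len(curve)
    first, last = curve[0], curve[-1]

    def gap(i):
        chord = first + (last - first) * i / (n - 1)
        return chord - curve[i]

    return curve[max(range(n), key=gap)]


# Two tight groups of four points and one isolated point (hypothetical data).
points = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1),
          (5, 5), (5, 5.1), (5.1, 5), (5.1, 5.1),
          (10, 0)]
min_pts = 4
curve = k_distance_curve(points, k=min_pts - 1)
eps = knee_value(curve)
```

In this toy case the curve is flat for the eight grouped points and jumps for the isolated one, so the knee falls at the within-group distance.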
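For reference, the silhouette coefficient of a sample is s = (b − a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. The sketch below is a brute-force illustration with Euclidean distance. It treats every label, including a noise label, as an ordinary cluster, which is an assumption about how the scores in the study were obtained, and it needs at least two distinct labels.

```python
import math


def silhouette_samples(points, labels):
    """Silhouette coefficient s = (b - a) / max(a, b) for every sample."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: score defined as 0
            continue
        a = mean_dist(i, own)
        b = min(mean_dist(i, members)
                for other, members in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b))
    return scores


# Two well-separated pairs of points (hypothetical data).
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
scores = silhouette_samples(points, labels)
average = sum(scores) / len(scores)
```

The average over all samples gives the overall score, and the share of samples with a negative value can be counted from the same list.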
The ground truth points were utilized to understand and interpret what each cluster represented. The ground truth measurement points contained 2 instances of a fastening system with both clamps missing, 6 instances of a fastening system with one clamp missing, 5 instances of weld joints, 3 instances with the presence of an insulation joint (refer to Figure 5), and the remaining 172 points represented healthy fastening systems with both clamps intact. The positions of the ground truth points were plotted on the scatter plot of the clusters formed, as depicted in Figure 15b. All the healthy points from the ground truth (marked with green circles in Figure 15b) were found within the region of cluster 1, which had the maximum number of observations (3605 samples). All fastening systems with one clamp missing from the ground truth measurement (marked with blue triangles in Figure 15b) were found to be within cluster 2. Similarly, all fastening systems with both clamps missing (marked with magenta diamonds) and all weld joints (marked with black squares) were found within cluster 4 and cluster 3, respectively. However, the insulation joints (marked with red squares) did not fall into any cluster and were located among the noise points.
Figure 16 depicts the final model obtained by combining the clusters obtained using DBSCAN with the information obtained from the ground truth points. DBSCAN was able to detect and separately cluster healthy fastening systems, fastening systems with one clamp missing, fastening systems with both clamps missing, and weld joints. Out of the total 3718 samples, the healthy cluster contained 3605 samples and 31 samples belonged to weld joints. A total of 14 fastening systems had one clamp missing and 4 fastening systems had both clamps missing. A total of 64 samples were marked as noise by the algorithm, and the insulation joints from the ground truth points were found among the noise. The noise in this study includes various other rail components that have different magnetic permeability or electrical conductivity or have different geometrical shapes (such as switches and crossings, insulation joints, bridges, etc.), which will be analyzed further in future studies by incorporating features specific to such components.
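One simple way to attach names to clusters from a handful of labeled points is a majority vote per cluster, sketched below. The voting rule, the function name, and the toy data are hypothetical illustrations and are not described in the study.

```python
from collections import Counter


def name_clusters(cluster_labels, ground_truth):
    """Map each cluster id to the most common ground-truth class found in it.

    ground_truth maps a sample index to its known class; unlabeled samples are absent.
    """
    votes = {}
    for idx, known_class in ground_truth.items():
        votes.setdefault(cluster_labels[idx], Counter())[known_class] += 1
    return {cluster: counts.most_common(1)[0][0]
            for cluster, counts in votes.items()}


# Hypothetical toy data; -1 is the DBSCAN noise label.
cluster_labels = [0, 0, 0, 1, 1, 2, -1, 0]
ground_truth = {0: "healthy", 1: "healthy", 3: "one clamp missing",
                5: "weld joint", 6: "insulation joint"}
names = name_clusters(cluster_labels, ground_truth)
```

Clusters that contain no labeled point receive no name under this rule and would need system knowledge or further inspection to interpret.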

4. Conclusions and Future Work

In previous studies [6,29], the authors proposed an alternative approach using a train-based differential eddy current sensor for fastener inspection that can overcome major challenges associated with automated visual inspection systems. This paper presents an automated train-based measurement system that uses unsupervised machine learning approaches to facilitate reliable and effective monitoring of railway fasteners by reducing human bias and error in detecting the state of railway fastening systems. The data set used for this study was obtained from an actual train measurement along a heavy haul line in the north of Sweden, where the measurement system was installed on a freight train and where the fastener condition and the likelihood of other disturbances were unknown. Unsupervised machine learning models were adopted in this study for detecting and analyzing the underlying patterns and relationships in the data collected.
An anomaly detection model combining isolation forest and connectivity-based outlier factor was proposed to detect anomalies in the data set and to segregate them from the normal or healthy class. The performance of the proposed detection algorithm was evaluated on the ground truth data points, whose labels pertaining to their specific condition were known. The proposed method had higher accuracy and sensitivity and produced significantly fewer false negatives than when IF and COF were utilized individually. The proposed combined IF and COF method was also able to detect all the anomalous points correctly. To segregate the anomalies, a clustering method using the DBSCAN algorithm was also implemented on the data set. DBSCAN yielded four clusters, including the healthy or normal cluster. The normal cluster was identified by combining system knowledge and the information obtained from the preceding anomaly detection model. To interpret the remaining clusters, the ground truth points were utilized, and the results show that the model was able to segregate healthy fastening systems from other anomalies. Furthermore, the model was able to efficiently detect missing clamps (both one and two missing clamps within a fastening system) and weld joints and segregate them with distinct boundaries. Around 1.72% (64 samples) of the total samples were marked as noise during clustering. The noise in the measurement can correspond to various other track components (such as insulation joints, switches and crossings, etc.) that exhibit different magnetic properties compared to those of fasteners. Future research will make use of features relevant to such components and, by considering the angle of rotation pertaining to such components, further segregate this noise into meaningful clusters.
The current study incorporates only one type of fastener, namely the Pandrol fast-clip. Future studies will also focus on detecting different types of fasteners that can be categorized by the rotation angle, as different fasteners have different geometrical shapes. Future research will also include high-speed measurements, detection of other magnetic track components, detection and quantification of rail defects, and development of efficient condition monitoring techniques with the use of artificial intelligence to detect and predict degradation and faults from big data. The features utilized for this study are subject to change when the distance between the sensor and the target object varies (i.e., the liftoff effect). Liftoff for this application can occur due to wheel wear of the train. However, this is a slowly occurring process that can be handled by continuous automatic calibration of the system, where the clusters formed by the healthy signatures are used as a reference. This will be studied in future work.

Author Contributions

Conceptualization, P.C., F.T., J.O., H.L., and M.R.; methodology, P.C., F.T., and M.R.; software, P.C., F.T., and J.O.; validation, P.C., F.T., and M.R.; formal analysis, P.C., F.T., H.L., and M.R.; investigation, P.C., F.T., and M.R.; resources, P.C., J.O., H.L., and M.R.; data curation, P.C., J.O., and F.T.; writing—original draft preparation, P.C.; writing—review and editing, P.C., F.T., H.L., J.O., and M.R.; visualization, P.C. and F.T.; supervision, H.L. and M.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported and funded by the Luleå Railway Research Centre (JVTC) and Trafikverket (Swedish Transport Administration). The work has been carried out within the strategic innovation program InfraSweden2030, supported by Vinnova, Formas and Energimyndigheten, and the Shift2Rail project IN2SMART.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to express their gratitude to LKAB for the measurement support. The authors would also like to thank Ulf Ranggard of ElOptic i Norden AB, Anders Thornemo, and David Lindow of Alstom, Sweden, and Olavi Kumpulainen of Consisthentic AB, for their guidance in this study. The authors would also like to thank Bony Thomas from the Department of Engineering Science and Mathematics at Luleå University of Technology for his support.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Nyström, B. Aspects of Improving Punctuality: From Data to Decision in Railway Maintenance. Ph.D. Thesis, Luleå Tekniska Universitet, Luleå, Sweden, 2008.
  2. European Commission. Roadmap to a Single European Transport Area—“Towards a Competitive and Resource Efficient Transport System”. 2011. Available online: https://ec.europa.eu/transport/themes/european-strategies/white-paper-2011_en (accessed on 14 November 2021).
  3. Famurewa, S.M.; Asplund, M.; Rantatalo, M.; Parida, A.; Kumar, U. Maintenance analysis for continuous improvement of railway infrastructure performance. Struct. Infrastruct. Eng. 2015, 11, 957–969.
  4. Di Mascio, P.; Loprencipe, G.; Moretti, L. Competition in rail transport: Methodology to evaluate economic impact of new trains on track. In Sustainability, Eco-Efficiency and Conservation in Transportation Infrastructure Asset Management, Proceedings of the 3rd International Conference on Transportation Infrastructure, Split, Croatia, 28–30 April 2014; CRC Press: Boca Raton, FL, USA; pp. 669–675.
  5. Patra, A.P.; Kumar, U.; Kråik, P.O.L. Availability target of the railway infrastructure: An analysis. In Proceedings of the 2010 Proceedings-Annual Reliability and Maintainability Symposium (RAMS), San Jose, CA, USA, 25–28 January 2010; pp. 1–6.
  6. Chandran, P.; Thiery, F.; Odelius, J.; Famurewa, S.M.; Lind, H.; Rantatalo, M. Supervised Machine Learning Approach for Detecting Missing Clamps in Rail Fastening System from Differential Eddy Current Measurements. Appl. Sci. 2021, 11, 4018.
  7. Wang, S.; Dai, P.; Du, X.; Gu, Z.; Ma, Y. Rail fastener automatic recognition method in complex background. In Proceedings of the Tenth International Conference on Digital Image Processing (ICDIP 2018), Shanghai, China, 11–14 May 2018; Volume 10806, p. 1080625.
  8. Wei, X.; Wei, D.; Suo, D.; Jia, L.; Li, Y. Multi-target defect identification for railway track line based on image processing and improved YOLOv3 model. IEEE Access 2020, 61973–61988.
  9. Marino, F.; Distante, A.; Mazzeo, P.L.; Stella, E. A real-time visual inspection system for railway maintenance: Automatic hexagonal-headed bolts detection. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2007, 37, 418–428.
  10. De Ruvo, P.; Distante, A.; Stella, E.; Marino, F. A GPU-based vision system for real time detection of fastening elements in railway inspection. In Proceedings of the 16th IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, 7–10 November 2009; pp. 2333–2336.
  11. Babenko, P. Visual Inspection of Railroad Tracks. Ph.D. Thesis, University of Central Florida, Orlando, FL, USA, 2009.
  12. Resendiz, E.; Hart, J.M.; Ahuja, N. Automated visual inspection of railroad tracks. IEEE Trans. Intell. Transp. Syst. 2013, 14, 751–760.
  13. Mao, Q.; Cui, H.; Hu, Q.; Ren, X. A rigorous fastener inspection approach for high-speed railway from structured light sensors. ISPRS J. Photogramm. Remote Sens. 2018, 143, 249–267.
  14. Stella, E.; Mazzeo, P.; Nitti, M.; Cicirelli, G.; Distante, A.; D’Orazio, T. Visual recognition of missing fastening elements for railroad maintenance. In Proceedings of the IEEE 5th International Conference on Intelligent Transportation Systems, Singapore, 6 September 2002; pp. 94–99.
  15. Yang, J.; Tao, W.; Liu, M.; Zhang, Y.; Zhang, H.; Zhao, H. An efficient direction field-based method for the detection of fasteners on high-speed railways. Sensors 2011, 11, 7364–7381.
  16. De Ruvo, G.; De Ruvo, P.; Marino, F.; Mastronardi, G.; Mazzeo, P.L.; Stella, E. A FPGA-based architecture for automatic hexagonal bolts detection in railway maintenance. In Proceedings of the Seventh International Workshop on Computer Architecture for Machine Perception (CAMP’05), Palermo, Italy, 4–6 July 2005; pp. 219–224.
  17. Xia, Y.; Xie, F.; Jiang, Z. Broken railway fastener detection based on AdaBoost algorithm. In Proceedings of the 2010 International Conference on Optoelectronics and Image Processing, Haikou, China, 11–12 November 2010; Volume 1, pp. 313–316.
  18. Rubinsztejn, Y. Automatic Detection of Objects of Interest from Rail Track Images. Master’s Dissertation, Faculty of Engineering and Physical Science, University of Manchester, Manchester, UK, 2011. Available online: http://studentnet.cs.manchester.ac.uk/resources/library/thesis_abstracts/BkgdReportsMSc11/Rubinsztejn-Yohann-bkgd-rept.pdf (accessed on 12 November 2021).
  19. Feng, H.; Jiang, Z.; Xie, F.; Yang, P.; Shi, J.; Chen, L. Automatic fastener classification and defect detection in vision-based railway inspection systems. IEEE Trans. Instrum. Meas. 2013, 63, 877–888.
  20. Fan, H.; Cosman, P.C.; Hou, Y.; Li, B. High-speed railway fastener detection based on a line local binary pattern. IEEE Signal Process. Lett. 2018, 25, 788–792.
  21. Mazzeo, P.L.; Ancona, N.; Stella, E.; Distante, A. Visual recognition of hexagonal headed bolts by comparing ICA to wavelets. In Proceedings of the 2003 IEEE International Symposium on Intelligent Control, Houston, TX, USA, 8 October 2003; pp. 636–641.
  22. Singh, M.; Singh, S.; Jaiswal, J.; Hempshall, J. Autonomous rail track inspection using vision based system. In Proceedings of the IEEE International Conference on Computational Intelligence for Homeland Security and Personal Safety, Alexandria, VA, USA, 16–17 October 2006; pp. 56–59.
  23. Li, Y.; Trinh, H.; Haas, N.; Otto, C.; Pankanti, S. Rail component detection, optimization, and assessment for automatic rail track inspection. IEEE Trans. Intell. Transp. Syst. 2013, 15, 760–770.
  24. Song, Q.; Guo, Y.; Yang, L.; Jiang, J.; Liu, C.; Hu, M. High-speed railway fastener detection and localization system. arXiv 2019, arXiv:1907.01141.
  25. Wang, T.; Yang, F.; Tsui, K. Real-Time Detection of Railway Track Component via One-Stage Deep Learning Networks. Sensors 2020, 20, 4325.
  26. Wei, X.; Yang, Z.; Liu, Y.; Wei, D.; Jia, L.; Li, Y. Railway track fastener defect detection based on image processing and deep learning techniques: A comparative study. Eng. Appl. Artif. Intell. 2019, 80, 66–81.
  27. Liu, J.; Huang, Y.; Zou, Q.; Tian, M.; Wang, S.; Zhao, X.; Dai, P.; Ren, S. Learning visual similarity for inspecting defective railway fasteners. IEEE Sens. J. 2019, 19, 6844–6857.
  28. Chandran, P.; Asber, J.; Thiery, F.; Odelius, J.; Rantatalo, M. An Investigation of Railway Fastener Detection Using Image Processing and Augmented Deep Learning. Sustainability 2021, 13, 12051.
  29. Chandran, P.; Rantatalo, M.; Odelius, J.; Lind, H.; Famurewa, S.M. Train based differential eddy current sensor system for rail fastener detection. Meas. Sci. Technol. 2019, 30, 125105.
  30. Engelberg, T.; Mesch, F. Eddy current sensor system for non-contact speed and distance measurement of rail vehicles. WIT Trans. Built Environ. 2000, 50.
  31. García-Martín, J.; Gómez-Gil, J.; Vázquez-Sánchez, E. Non-destructive techniques based on eddy current testing. Sensors 2011, 11, 2525.
  32. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. CSUR 2009, 41, 1–58.
  33. Liu, F.T.; Ting, K.M.; Zhou, Z. Isolation forest. In Proceedings of the Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422.
  34. Breunig, M.M.; Kriegel, H.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 16–18 May 2000; pp. 93–104.
  35. Tang, J.; Chen, Z.; Fu, A.W.; Cheung, D.W. Enhancing effectiveness of outlier detections for low density patterns. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Taipei, Taiwan, 6–8 May 2002; pp. 535–548.
  36. Jabbar, A.M. Local and Global Outlier Detection Algorithms in Unsupervised Approach: A Review. Iraqi J. Electr. Electron. Eng. 2021, 17, 1–12. Available online: https://www.iasj.net/iasj/download/06ff35b010bf02b8 (accessed on 18 November 2021).
  37. Ester, M.; Kriegel, H.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD’96), Portland, OR, USA, 2–4 August 1996; pp. 226–231. Available online: https://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf?source=post_page (accessed on 11 November 2021).
Figure 1. Arrangement of the differential eddy current sensor (Lindometer) measuring over a railhead.
Figure 2. Circuit diagram of the differential EC sensor.
Figure 3. The geographical location of the Iron Ore Line, Malmbanan.
Figure 4. Lindometer mounted on a freight train.
Figure 5. Measurement pattern.
Figure 6. Signal processing techniques for extracting fastener signatures from the raw signal.
Figure 7. Time signal after demodulation, filtering, and rotating the raw signal.
Figure 8. IQ plot after demodulating the signal.
Figure 9. Standardized features for individual fastener signature with respect to the sleeper number: (a) peak-to-peak, (b) RMS, (c) arc length of the complex signal.
Figure 10. Anomaly scores for all measurement points: (a) isolation forest scores, (b) connectivity-based outlier factor.
Figure 11. Scatter plot depicting normal and anomalous points with respect to two features: (a) isolation forest, (b) connectivity-based outlier factor. The normal instances are marked with green markers and anomalous points detected are marked with red markers.
Figure 12. Scatter plot depicting normal and anomalous points with respect to two features for IF and COF combined.
Figure 13. Confusion matrix for the ground truth points: (a) isolation forest, (b) connectivity-based outlier factor, (c) IF and COF combined.
Figure 14. Epsilon value estimation for DBSCAN.
Figure 15. Scatter plot for clusters formed with DBSCAN: (a) clusters without ground truth, (b) clusters with ground truth labels.
Figure 16. Final model where clusters are merged based on the collected ground truth points.
Table 1. Performance of the algorithms on the ground truth points (all scores are in percent).
Algorithm     Accuracy   Sensitivity   Specificity
IF            88.77      87.79         100
COF           90.90      90.11         100
IF and COF    96.79      96.51         100
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
