Vehicular-Network-Intrusion Detection Based on a Mosaic-Coded Convolutional Neural Network

Hu, Rong; Wu, Zhongying; Xu, Yong; Lai, Taotao

doi:10.3390/math10122030

¹ Fujian Provincial Key Laboratory of Big Data Mining and Application, Fujian University of Technology, Fuzhou 350118, China

² Fujian Key Laboratory of Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, China

³ Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang University, Fuzhou 350108, China

* Author to whom correspondence should be addressed.

Mathematics 2022, 10(12), 2030; https://0-doi-org.brum.beds.ac.uk/10.3390/math10122030

Submission received: 3 May 2022 / Revised: 1 June 2022 / Accepted: 2 June 2022 / Published: 11 June 2022

(This article belongs to the Special Issue Evolutionary Computation for Deep Learning and Machine Learning)


Abstract

With the development of Internet of Vehicles (IoV) technology, the car is no longer a closed system. It exchanges information with external networks through the vehicle-mounted network (VMN), which inevitably gives rise to security problems. Attackers can intrude on the VMN through a wireless network or vehicle-mounted interface devices. To prevent such attacks, various intrusion-detection methods have been proposed, including ones based on convolutional neural networks (CNNs). However, the existing CNN method neither makes the best use of a CNN’s ability to extract features from two-dimensional grid-like data nor reflects the temporal connections among sequential data. Therefore, this paper proposes a novel CNN model, based on two-dimensional Mosaic pattern coding, for anomaly detection. It makes full use of the ability of a CNN to extract features from grid data while preserving the temporal order of the data. Simulations showed that this method could effectively distinguish attacks from normal traffic on the vehicular network, improve the reliability of the system’s discrimination, and meet the real-time requirement of detection.

Keywords: Internet of Vehicles; Control Area Network Bus; intrusion detection; intelligent connected vehicle; convolutional neural network; deep learning

MSC: 68T20

1. Introduction

With the development of the Internet of Things (IoT), big data, and other technologies, the connection of vehicles to the Internet is becoming increasingly common, and the automobile industry is evolving towards an intelligent Internet of Vehicles (IoV) [1]. IoV refers to the network system that connects the internal devices of vehicles, as well as cars with people, other cars, roads, and cloud platforms, through various mobile communication technologies [2]. As an important part of the smart city network, the IoV plays an evident role in intelligent transportation and autonomous driving [3]. Governments and enterprises in many countries are working together towards an intelligent IoV [4,5].

As the early automobile was a relatively closed and independent system, automobile network security did not attract much attention. Nowadays, with the continuous development of IoV technology, network attacks against cars are increasingly frequent. Attackers can attack a vehicle through either physical or remote access and take control of it, which seriously threatens the normal running of the car and the life of the driver [6]. Frequent information exchange between vehicles and the outside world leads to more and more types of external vehicle interfaces, which in turn leads to an increasing number of attack paths against vehicle-mounted networks (VMNs) [7].

Several well-known experimental attacks have been conducted in the past few years, including an attack on the control systems of the Ford Escape and Toyota Prius in 2013 [8] and a remote attack on more than 20 models, conducted to estimate the difficulty of remote exploitation of these vehicles [9]. Later, remote control of the steering and braking of Fiat-Chrysler vehicles was demonstrated, forcing an emergency recall of 1.4 million vehicles [10]. In 2016, a car’s powertrain and steering wheel were successfully interfered with by injecting an attack message through the Jeep’s onboard diagnostic (OBD) system interface [11]. A Tesla was also attacked through a security vulnerability that allowed the attacker to acquire the vehicle’s location information and control it remotely [12,13]. In 2018, physical-contact and remote attacks on a number of BMW models were realized to control the vehicles [14]. In 2020, an attacker successfully developed a new key-cloning technique, called the Relay Attack, for Tesla cars and demonstrated it on a Tesla Model X electric car [15]. From 2016 to 2019, the number of security incidents involving intelligently connected vehicles increased sevenfold, and the number of incidents in 2019 was 99% higher than in 2018. In 2019, 82% of security incidents were caused by remote attacks.

In addition to the VMN, the intelligently connected vehicle platform, and network-level platforms and terminals, the IoV system also includes various ECUs (Electronic Control Units). Common ECUs include body control, engine control, and airbag units [16], which are connected to the internal communication buses and constitute an on-board network system. Common on-board network protocols include the CAN (Controller Area Network) bus, the FlexRay bus, and the MOST (Media Oriented Systems Transport) bus [17], among which CAN is currently the most widely used and has become the de facto standard [18].

Attack messages sent by attackers are transmitted through the VMN, so preventing intrusion via the VMN is the most important task in vehicle information security [19,20]. Since data for different functions are transmitted periodically over the CAN bus, an intruder can attack a function and control the car through replay, without needing to understand the CAN protocol [21]. Currently, attacks on the vehicle-mounted CAN network include discarding, tampering, reading, spoofing, flooding, and replaying. Discarding means that the attacker deletes key message data on the CAN bus, interfering with the normal operation of the vehicle. Tampering means that the attacker modifies the data content of a CAN message, causing the car to follow wrong instructions. Reading means that the attacker obtains real data on the CAN bus through a compromised node. Spoofing means that the attacker uses the compromised node to send diagnostic and attack information that occupies ECU resources. Flooding means that high-priority messages are sent to the CAN bus at high frequency, occupying the bus and preventing other nodes from sending messages normally, which causes the bus network to collapse. Replaying means that an attacker takes control of an ECU and re-sends its message data onto the CAN bus.

The methods and technologies of vehicle-network information security can be divided into encryption and authentication technology, security architecture standards, and network-intrusion-detection technology. This paper adopts network intrusion detection to detect attacks on the vehicle network.

Intrusion-detection systems can be divided into anomaly-based and misuse-based ones, according to the technology adopted. A misuse-detection system extracts the characteristics and rules of attack behavior to establish a feature rule. During intrusion detection, if the observed behavior of the system matches the feature rule, it is considered an attack; otherwise, it is normal. An anomaly-detection system first establishes a characteristic rule of normal behavior and sets a threshold value. The observed behavior is then compared with the normal one. If the deviation is greater than the threshold, it is an attack; otherwise, it is normal. A misuse-based intrusion-detection method must update its library of attack-behavior characteristics in real time; otherwise, it may not be able to detect attacks that are not present in the library [22,23]. In contrast, anomaly detection does not require periodic updates to the system and can detect attacks as long as the normal behavior of the VMN can be successfully defined. Therefore, it is more suitable for a vehicle-mounted CAN network. When the difference between the received message and the predicted result is greater than a threshold value, the message is recognized as an anomaly.

Using detection sensors, a structured anomaly-detection method was proposed to monitor the CAN identifier and message frequency [24]. In another study, attackers were prevented from analyzing and tampering with ECU code by using a Markov-chain decision model on the encrypted storage system of the ECU [25]. An anomaly-detection method based on information entropy [26] and a frequency-based anomaly-detection method [27] were also proposed. To detect malicious attack messages in time, a lightweight intrusion-detection algorithm was proposed, based on analyzing the time interval between messages, with a response time of less than 1 ms [28]. Larson et al. proposed a CAN bus intrusion-detection method, based on protocol-level security rules between ECUs, to detect ECU exceptions [29]. Murvay et al. proposed a method to identify the sending ECU of a message by analyzing the characteristics of the CAN bus signals [30].

In addition to traditional methods, neural networks have also been used to detect anomalous intrusions. Taylor et al. proposed an anomaly-detection method, based on the LSTM (Long Short-Term Memory) neural network, in which the network was trained on message content to predict the content of the next message [31]. One-dimensional (1D) CNN models can also be used to process 1D data, and good results have been obtained [32,33,34,35]. However, a 1D convolution operation is not able to identify the temporal connections between the data. Therefore, in intrusion detection, only 2D CNNs have been used to process 1D sequential data. For example, an intrusion-detection model based on a deep convolutional neural network was proposed by Song et al. [36]. However, problems remain when using a CNN for this task, such as how to convert the 1D sequential data into 2D grid data that a CNN can recognize more easily [37].

In view of this, this paper focuses on the security of the vehicle-mounted CAN network from the perspective of intrusion detection, combining a deep learning model with a novel Mosaic pattern-based coding method. As a CNN is good at dealing with grid data, such as images, whereas CAN traffic is typically sequential data, a novel pattern-based coding method is proposed in this paper to let the CNN extract the data characteristics more effectively. The main contributions of this paper include:

A 2D Mosaic-coding method is proposed for converting the 1D attack data into 2D grid data, to make full use of the ability of a CNN to extract features from grid data while maintaining the temporal order of the data.
Different thresholds higher than 0.5 were set, to effectively test the reliability of the model.
Extensive experiments were carried out, showing that our method achieves better classification performance and is more reliable and stable in identifying intruders’ attacks than the previous method.

The rest of this paper is organized as follows. Section 2 briefly describes the background of CNNs, the CAN data, and the previous CNN model for this problem. Section 3 describes our proposed method. In Section 4, the effectiveness of the proposed method is evaluated, and the performances of the proposed and existing methods are compared and discussed. Section 5 concludes the paper.

2. Background Knowledge

This section describes the vehicle-mounted CAN bus network, CNNs and the previous CNN model for intrusion attack detection, and the dataset used in this paper.

2.1. Vehicle-Mounted CAN Bus Network

CAN was developed by Bosch, a German company known for the research, development, and production of vehicular electronic products, and eventually became an international standard (ISO 11898). It is one of the most widely used field buses in the world [38] and has the following characteristics: (1) long data-transmission distance (up to 10 km); (2) fast data-transmission rate (up to 1 Mbit/s); (3) reliable error-handling and error-detection mechanisms; (4) automatic retransmission (a damaged message is automatically resent); (5) excellent arbitration mechanism; and (6) multi-master operation. Data on the CAN bus are transmitted in the form of messages.

The CAN bus uses a twisted pair of wires, the high-level CAN_H and the low-level CAN_L, to connect all nodes in the car. The CAN bus protocol encodes the signal in the voltage difference between CAN_H and CAN_L, with 0 representing the dominant bit and 1 representing the recessive bit. Figure 1 shows the signal logic of the CAN bus. When an ECU sends a recessive bit, the voltages of CAN_H and CAN_L are both 2.5 V, and the difference between them is 0 V. When it sends a dominant bit, the voltage of CAN_H is 3.5 V and the voltage of CAN_L is 1.5 V, so the difference between them is 2.0 V.

The CAN bus adopts a broadcast protocol, in which each message carries a CAN ID. Whether an ECU accepts a message on the bus is determined by the message’s CAN ID. When several ECU nodes send messages to the CAN bus simultaneously, bit-by-bit arbitration on the CAN ID determines the priority of the messages, based on the principle that a message with a smaller CAN ID has a higher priority. To maintain the system’s consistency, an ECU broadcasts its messages periodically, even if their content has not changed, so ECU messages are periodic. At present, the CAN data frame has a standard and an extended format, which differ in the number of identifier (ID) bits: the standard frame supports an 11-bit ID, while the extended frame supports a 29-bit ID. Depending on the number of CAN ID bits, the data are processed with 11 bits or 29 bits, respectively. The structure of the CAN message is shown in Figure 2; it mainly includes the frame-start, arbitration, control, data, validation (CRC), and frame-end fields.
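
The arbitration rule described above can be illustrated with a minimal sketch. The function name `arbitrate` and the sample IDs are hypothetical, and the bus is modeled simply as the minimum of the bits sent by all contending nodes (a dominant 0 overrides a recessive 1), which is a simplification of real CAN hardware.

```python
def arbitrate(can_ids, id_bits=11):
    """Return the CAN ID that wins bitwise arbitration (simplified model)."""
    contenders = set(can_ids)
    # Compare bits from the most significant to the least significant.
    for bit in range(id_bits - 1, -1, -1):
        sent = {cid: (cid >> bit) & 1 for cid in contenders}
        # A dominant bit (0) overrides a recessive bit (1) on the bus.
        bus_level = min(sent.values())
        # Nodes that sent recessive but read back dominant stop sending.
        contenders = {cid for cid in contenders if sent[cid] == bus_level}
    # Distinct IDs differ in at least one bit, so one contender remains.
    return contenders.pop()


# Hypothetical IDs: the numerically smallest one should win.
winner = arbitrate([0x316, 0x2B0, 0x43F])
```

Under this model the winner is always the numerically smallest ID, which is why an injected ID of all zeros can monopolize the bus.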

2.2. Convolutional Neural Network for Intrusion Attack Detection

A Convolutional Neural Network (CNN), as an important branch of artificial neural networks, is a kind of feedforward neural network with a deep structure that includes convolution and pooling operations. Compared with other deep learning neural networks, a CNN has unique advantages in processing grid data, such as images, and has achieved great success in target detection and image processing [39]. A CNN typically consists of input, convolution, downsampling (pooling), fully connected, and output layers.

The convolutional layer can automatically extract the features of the input patterns, effectively reduce the number of network parameters, and alleviate overfitting of the model. Convolution operations have three important characteristics, namely sparse connection, parameter sharing, and equivariant representation, which are used to improve the system’s performance [40]. In a convolution operation, if the input is a 2D grid I(m, n) and the convolution kernel K is also two-dimensional, the feature map S(i, j) obtained after the convolution operation is a 2D grid as well, which can be expressed as:

S (i, j) = (K * I) (i, j) = \sum_{m} \sum_{n} I (i + m, j + n) K (m, n),

(1)
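
Equation (1) can be evaluated directly with nested sums. The following is a minimal pure-Python sketch without padding; the function name `conv2d_valid` and the small example arrays are illustrative only.

```python
def conv2d_valid(image, kernel):
    """Compute S(i, j) = sum_m sum_n I(i + m, j + n) K(m, n) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(kh) for n in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [[1, 0, 1],
         [0, 1, 0],
         [1, 0, 1]]
kernel = [[1, 0],
          [0, 1]]
feature_map = conv2d_valid(image, kernel)  # 2 x 2 feature map
```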

Since the convolution operation is linear, an activation function is applied to provide nonlinearity for the model, which takes the following form:

h (x) = g (f (x))

(2)

where f(x) represents the output of the neuron, g is the activation function, and h(x) is the final output of the neuron. The hyperbolic tangent function tanh is commonly used as the activation function, which is expressed as:

\tanh (x) = \frac{e^{x} - e^{- x}}{e^{x} + e^{- x}}

(3)

A softmax function also often serves as the activation function in the final output layer of the network, where it models the probabilities of multiple classes. For a K-dimensional vector z, with z_i denoting its ith real value, the softmax is expressed as:

σ (z_{i}) = \frac{\exp (z_{i})}{\sum_{j = 1}^{K} \exp (z_{j})} .

(4)
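
As an illustration of Equation (4), the sketch below computes softmax probabilities for a two-class output. Subtracting the maximum score before exponentiation is a common numerical safeguard and is not part of the original formula; it leaves the result unchanged.

```python
import math


def softmax(z):
    """Return softmax probabilities for a list of real scores."""
    shift = max(z)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the two classes.
probs = softmax([2.0, 0.0])
```

The two probabilities sum to 1, so at most one of them can exceed 0.5.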

The pooling layer performs dimensionality reduction. In theory, the features extracted by the convolution operation could be used directly as the input of the classifier, but the number of neurons is usually quite large, so direct classification would consume a large amount of computational resources. Therefore, the pooling layer compresses the features obtained from the convolution operation by replacing the multiple neurons within a subsampling window with a single output neuron, which reduces the dimensionality and the amount of computation. Commonly used pooling operations are average pooling and maximum pooling [41].
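
A minimal sketch of maximum pooling with a non-overlapping 2 × 2 window follows; the function name and the sample feature map are illustrative.

```python
def max_pool_2x2(feature_map):
    """Apply 2 x 2 max pooling with stride 2 (non-overlapping windows)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


pooled = max_pool_2x2([[1, 3, 2, 0],
                       [4, 2, 1, 1],
                       [0, 0, 5, 6],
                       [1, 2, 7, 8]])
```

Each side of the feature map is halved, so a 4 × 4 map becomes a 2 × 2 map.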

Finally, the Fully Connected (FC) layers play the role of the “classifier” at the output of a CNN. In the CNN structure, one or more FC layers follow the convolutional and pooling layers. Each neuron in an FC layer is connected to all neurons in the previous layer, so the FC layer can integrate the class-discriminative local information from the convolutional or pooling layers. To improve the performance of a CNN, an activation function is applied to each neuron in the FC layer. The output values of the final layer can be categorized using softmax regression; this layer is also known as the softmax layer. For a specific classification task, it is essential to use an appropriate loss function. There are several commonly used loss functions for CNNs, each with different characteristics. In general, the FC layers of a CNN have the same structure as an MLP (Multi-Layer Perceptron), and the training of a CNN mostly adopts the BP (Back Propagation) algorithm [42,43].

Since CAN IDs form a time series, and a CNN is good at processing image-like grid data, researchers have directly stacked 29 consecutive 29-bit CAN IDs, in sequential order, to form a 29 × 29 grid [36], as shown in Figure 3.

In the training process, 29 consecutive CAN IDs were first randomly selected from the dataset and then stacked to form a 29 × 29 grid, which served as the input to the CNN. As the convolution operation is still two-dimensional (2D), this sequential arrangement of CAN IDs may not be able to exploit the CNN’s ability to extract features from grid data while, at the same time, maintaining the temporal connection of the original data.
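
The sequential arrangement can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of [36]: each ID is assumed to occupy one row with the most significant bit first, and the function names and sample IDs are hypothetical.

```python
def id_to_bits(can_id, width=29):
    """Return the ID as a list of bits, most significant bit first."""
    return [(can_id >> shift) & 1 for shift in range(width - 1, -1, -1)]


def sequential_grid(can_ids, width=29):
    """Stack `width` consecutive IDs, one ID per row, into a square grid."""
    if len(can_ids) != width:
        raise ValueError("need exactly `width` consecutive CAN IDs")
    return [id_to_bits(cid, width) for cid in can_ids]


# Hypothetical IDs used only for illustration.
grid = sequential_grid([0x316] * 28 + [0x000])
```

In this layout, vertical neighbors are the same bit position of consecutive messages, while horizontal neighbors are adjacent bits of one ID.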

2.3. Dataset

The vehicular-network-attack dataset, collected by the Hacking and Countermeasure Research Lab in South Korea, was used in this paper to train and test our model. It contains four attack types: Fuzzy, DoS, Spoofing Gear, and Spoofing RPM [44]. The dataset was the record of the CAN traffic via the OBD-II port of a real vehicle, when the attack message was injected. In the data-collection process, the engine of the test vehicle was running. There were 26 distinct normal CAN IDs on the CAN bus. Each dataset had a total of 30 min to 40 min of CAN traffic data and contained some injected attack messages. The specific characteristics of each dataset are as follows:

(1): DoS Attack: The attack data were injected every 0.3 ms with an ID of “0000”. Because a smaller ID has a higher priority, data with the “0000” ID have the highest priority; hence, the attack occupies the bus resources.
(2): Fuzzy Attack: The attack data were injected with randomly selected packet data, every 0.5 ms.
(3): Spoofing Attack (RPM/GEAR): Message data containing speed or gear information were injected, every 1 ms.

All CAN data extracted from the real vehicle mainly include the following fields (the labels in the dataset were added to the real traffic data by the researchers):

(1): TIMESTAMP: The timestamp of the data record.
(2): CAN ID: The identifier (as a hexadecimal) of each packet of data.
(3): DLC: The number of bytes in the data field, of each CAN packet data (0–8).
(4): Data [0–7]: The specific value of each byte in the data field.
(5): Flag: The label for each piece of data, where T stands for attack data, and R stands for normal data.
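
A record with these fields can be parsed as sketched below. The comma-separated layout, the field order, and the sample line are assumptions made for illustration; the actual file format of the dataset should be checked before use.

```python
from dataclasses import dataclass


@dataclass
class CanRecord:
    timestamp: float
    can_id: int
    dlc: int
    data: list
    is_attack: bool


def parse_record(line):
    """Parse one comma-separated record (assumed field order, see above)."""
    fields = line.strip().split(",")
    dlc = int(fields[2])
    # The data field holds `dlc` bytes, each written in hexadecimal.
    data = [int(byte, 16) for byte in fields[3:3 + dlc]]
    flag = fields[3 + dlc]
    return CanRecord(float(fields[0]), int(fields[1], 16), dlc, data,
                     flag == "T")


# Hypothetical record, not taken from the dataset.
record = parse_record("1478198376.389427,0316,8,05,21,68,09,21,21,00,6f,R")
```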

3. Method

This paper proposes a novel CNN model, based on 2D Mosaic pattern coding, to detect intrusions. The method consists of two parts: 2D Mosaic pattern-based coding and the CNN model. This section presents each part in detail.

3.1. Two-Dimensional Mosaic Pattern-Based Coding

During the coding process, the CAN ID was first extracted from the dataset and then converted to a binary number of either 11 bits or 29 bits. To better fit a CNN’s ability to extract features from 2D grid data, a novel 2D Mosaic pattern-based coding is proposed to represent the data. In this method, each 1D ID was first converted to a 2D data grid, and then the data grids were spliced together to form a Mosaic pattern. As the dataset used in this paper only contains 11-bit CAN IDs, each ID can be represented as either an 11-bit or a 29-bit binary number (in the latter case, the remaining 18 bits are all zero); both representations were used in our experiments.

To encode an 11-bit CAN ID, its bits were put into a 4 × 4 grid matrix in the following order: M[0][0]~M[0][3], M[1][3]~M[3][3], M[3][2]~M[3][0], and M[2][0], while the remaining 5 elements were set to zero. The data are filled in this way because it places the bits along the border of the grid, so that the CNN can recognize the patterns more easily. The 2D data grids were then spliced together into a larger 2D Mosaic structure in the format of 4 × 4, 6 × 6, or 8 × 8 data grids. As consecutive data grids from the data series were arranged one after another to form the Mosaic, the temporal relationship among the sequential data was well maintained. For example, 16 data grids of size 4 × 4 are spliced into a 4 × 4 Mosaic pattern with 16 × 16 elements, with the 16 grids arranged one after another to maintain the temporal feature of the original data. Figure 4 shows the result, with the digits in the grid representing the bits of the binary ID.
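
The filling order and the splicing step can be sketched as follows. The fill order is taken from the text; the bit order (most significant bit first), the row-by-row placement of the data grids inside the Mosaic, and all names are assumptions made for illustration.

```python
# Fill order from the text: top row, right column, bottom row (right to
# left), then M[2][0]; the five remaining cells stay zero.
FILL_ORDER_4X4 = [(0, 0), (0, 1), (0, 2), (0, 3),
                  (1, 3), (2, 3), (3, 3),
                  (3, 2), (3, 1), (3, 0),
                  (2, 0)]


def id_to_grid_4x4(can_id):
    """Encode an 11-bit CAN ID as a 4 x 4 grid (MSB first is an assumption)."""
    bits = [(can_id >> shift) & 1 for shift in range(10, -1, -1)]
    grid = [[0] * 4 for _ in range(4)]
    for bit, (r, c) in zip(bits, FILL_ORDER_4X4):
        grid[r][c] = bit
    return grid


def splice_mosaic(grids, tiles_per_side):
    """Arrange square grids row by row into one larger square Mosaic."""
    size = len(grids[0])
    side = size * tiles_per_side
    mosaic = [[0] * side for _ in range(side)]
    for k, grid in enumerate(grids):
        top = (k // tiles_per_side) * size
        left = (k % tiles_per_side) * size
        for r in range(size):
            for c in range(size):
                mosaic[top + r][left + c] = grid[r][c]
    return mosaic


# Sixteen consecutive (hypothetical) IDs tile a 16 x 16 array.
ids = [0x7FF] * 15 + [0x000]
mosaic = splice_mosaic([id_to_grid_4x4(i) for i in ids], tiles_per_side=4)
```

Because each ID keeps its own 4 × 4 tile, a convolution kernel sliding over the Mosaic sees whole IDs and their temporal neighbors rather than isolated bit columns.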

If 29-bit CAN ID data were used, each ID was converted to a 6 × 6 data grid. The 29 bits of the binary ID were filled into a 6 × 6 matrix in the following order: M[1][1]~M[1][4], M[2][1]~M[2][4], M[3][1]~M[3][4], M[4][1]~M[4][4], M[0][0]~M[0][5], M[1][0], M[1][5], M[2][0], M[2][5], M[3][0], M[3][5], and M[4][0], while the remaining 7 elements in the matrix were filled with zeros. The data are filled in this way because, in the current data, only the first 11 bits can be nonzero; placing them in the center of the grid makes it easier for the CNN to recognize the patterns. Similarly, after the data were gridded, consecutive grids were spliced into a Mosaic structure in the format of 3 × 3, 4 × 4, 6 × 6, or 8 × 8, to maintain the temporal characteristics of the data. Taking the 3 × 3 Mosaic as an example, 9 data grids were put together to form an 18 × 18 Mosaic structure, which was used as the input to the network. Figure 5 shows the result, with the digits in the grid representing the bits of the ID.
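
The 6 × 6 filling order can be written down in the same way. In this sketch, the 11-bit ID is assumed to occupy the first 11 of the 29 positions (most significant bit first), followed by 18 zero bits; the names are illustrative.

```python
# Fill order from the text: the inner 4 x 4 block row by row, then the
# top row, then the left/right border cells of rows 1-3, then M[4][0].
FILL_ORDER_6X6 = (
    [(r, c) for r in range(1, 5) for c in range(1, 5)]
    + [(0, c) for c in range(6)]
    + [(1, 0), (1, 5), (2, 0), (2, 5), (3, 0), (3, 5), (4, 0)]
)


def id11_to_grid_6x6(can_id):
    """Encode an 11-bit ID, zero-extended to 29 bits, as a 6 x 6 grid."""
    # Assumption: the ID bits come first and the 18 padding bits follow.
    bits = [(can_id >> shift) & 1 for shift in range(10, -1, -1)] + [0] * 18
    grid = [[0] * 6 for _ in range(6)]
    for bit, (r, c) in zip(bits, FILL_ORDER_6X6):
        grid[r][c] = bit
    return grid


grid = id11_to_grid_6x6(0x7FF)
```

With this order, the 11 ID bits land in the first three rows of the inner 4 × 4 block, and the border cells hold only padding zeros for this dataset.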

As an example, the CAN ID data grids containing the Fuzzy attack type are shown in Figure 6. The figure shows that our Mosaic-based coding method produces clear patterns for different attack data, which a CNN can recognize more easily.

3.2. Convolutional Neural Network Model

The CNN model employed in this paper is composed of an input layer, a convolution layer, a pooling layer, a fully connected layer, and an output layer [45]. The input layer takes the grid data encoded by the 2D Mosaic pattern. If each CAN ID is converted to a 6 × 6 grid and the grids are arranged in a 6 × 6 Mosaic, the dimension of the network input X is 36 × 36.

The convolution layer contains 20 convolution kernels. The size of its output is given by Equation (5), where O is the output size, W is the input size, K is the kernel size, P is the amount of padding, and S is the stride. We kept the output dimension of the convolution layer the same as its input dimension by choosing appropriate padding. The pooling window F was 2 × 2, and maximum pooling was applied to each non-overlapping 2 × 2 region, which halves each side of the feature map, so the number of output elements after pooling was reduced to one quarter of that of the input X.

O = \frac{W - K + 2 P}{S} + 1

(5)
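
Equation (5) can be checked numerically. In the sketch below, W denotes the input size, and the 3 × 3 kernel with stride 1 is a hypothetical choice, since the kernel size is not stated in this section.

```python
def conv_output_size(w, k, p, s):
    """Evaluate O = (W - K + 2P) / S + 1 with integer (floor) division."""
    return (w - k + 2 * p) // s + 1


# Hypothetical 3 x 3 kernel with stride 1 on a 36 x 36 input:
same = conv_output_size(36, 3, 1, 1)    # padding of 1 keeps the size
valid = conv_output_size(36, 3, 0, 1)   # no padding shrinks the output
# A 2 x 2 window with stride 2 and no padding halves each side.
pooled = conv_output_size(36, 2, 0, 2)
```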

The input of a convolution layer depends on the output of the previous layer, and its output determines the input of the next layer. When several convolution layers are used, the output of each convolution layer is given by Equation (6), where a^{i} represents the output of the convolution layer, a^{i - 1} is the output of the previous layer, σ is the activation function, W^{i} is the connection weight, and b^{i} is the bias of the layer.

a^{i} = σ (z) = σ (a^{i - 1} \times W^{i} + b^{i})

(6)

Next comes the fully connected layer. It was connected to the flattened output of the pooling layer and contained 128 neurons. The bias b of each fully connected neuron was initialized to 0.1, and the activation function was tanh. The layer was followed by a dropout operation, which disables the activations of some neurons with a certain probability, to reduce the possibility of overfitting [46].

Finally, there is the output layer. As the output has only two classes, normal and attack, the output layer has only two neurons, which were fully connected to the 128 neurons of the previous layer. The softmax function was used at this layer to classify the two categories. The target output was one-hot encoded, so that exactly one output neuron is true (1) and the other is false (0). Because the two softmax outputs sum to 1, whenever one of them is greater than 0.5, the other must be smaller than 0.5, and the input is assigned to the class with the larger output. This is equivalent to a threshold value of 0.5. The structure of the CNN is shown in Figure 7.
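
To make the layer dimensions concrete, the following sketch traces the shapes through the described layers for a 36 × 36 input. It assumes a single-channel input, one convolution layer with size-preserving padding, and one 2 × 2 pooling layer; the function name and the dictionary keys are illustrative, and no training is performed.

```python
def layer_shapes(input_side, n_kernels=20, pool=2, fc_units=128, n_classes=2):
    """Trace the tensor shapes through the layers described in the text."""
    # Size-preserving padding: each kernel yields one full-size feature map.
    conv = (input_side, input_side, n_kernels)
    # Non-overlapping pooling divides each side by the window size.
    pooled = (input_side // pool, input_side // pool, n_kernels)
    flattened = pooled[0] * pooled[1] * pooled[2]
    return {"input": (input_side, input_side, 1), "conv": conv,
            "pool": pooled, "flatten": flattened,
            "fc": fc_units, "output": n_classes}


shapes = layer_shapes(36)
```

Under these assumptions, the fully connected layer receives 18 × 18 × 20 = 6480 inputs.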

In data processing, data grids of sizes 4 × 4 and 6 × 6 and Mosaic patterns of sizes 4 × 4, 6 × 6, and 8 × 8 were used. A 6 × 6 Mosaic pattern, taken as an input sample for the training and test sets, contains 36 CAN IDs. If at least one of these 36 CAN IDs has the label T (meaning that the input includes at least one attack message), the pattern is labeled as an attack (T). Otherwise, it is labeled as normal (R). Similar rules apply to Mosaic patterns of size 4 × 4 or 8 × 8.
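
The labeling rule can be sketched as follows. Splitting the stream into non-overlapping windows and dropping an incomplete final window are assumptions, although they agree with the DoS counts in Table 1 (3,665,771 messages and 101,826 patterns of 36 IDs). The function names are illustrative.

```python
def label_pattern(flags):
    """Return 'T' if any message in the pattern is an attack, else 'R'."""
    return "T" if any(flag == "T" for flag in flags) else "R"


def label_stream(flags, ids_per_pattern=36):
    """Split a flag sequence into consecutive patterns and label each one."""
    usable = len(flags) - len(flags) % ids_per_pattern
    return [label_pattern(flags[i:i + ids_per_pattern])
            for i in range(0, usable, ids_per_pattern)]


# Two full patterns followed by 10 leftover messages that are dropped.
labels = label_stream(["R"] * 36 + ["R"] * 35 + ["T"] + ["R"] * 10)
```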

Finally, the cross-entropy loss function was used, and the adaptive moment estimation (Adam) optimization method was applied to minimize it. After tuning, the learning rate of the optimizer was set to 1 × 10⁻⁴. After the Mosaic patterns were constructed, they were shuffled; 75% of the data was used as the training set to train the model, while the remaining 25% was used as the test set. The overall scheme of this paper is shown in Figure 8.
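
The shuffle-and-split step can be sketched with the standard library. The fixed seed and the function name are hypothetical and serve only to make the example reproducible.

```python
import random


def split_patterns(patterns, train_fraction=0.75, seed=0):
    """Shuffle the patterns and split them into training and test sets."""
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integers stand in for Mosaic patterns here.
train_set, test_set = split_patterns(range(100))
```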

3.3. Model Evaluation Method

The detection of intrusion attacks is a binary classification task, so the binary confusion matrix and the associated evaluation indexes are employed to evaluate the performance of the model. There are only two target classes, namely, positive (attack message) and negative (normal message). The meanings of TP, FP, FN, and TN in the confusion matrix are described below.

(1): True positive (TP): The number of positive samples correctly classified as positive by the classifier, i.e., grid data that contains an attack message is correctly judged as an attack.
(2): False positive (FP): The number of negative samples incorrectly classified as positive by the classifier, i.e., grid data that does not contain any attack message is misjudged as an attack.
(3): False negative (FN): The number of positive samples wrongly classified as negative by the classifier, i.e., grid data that contains an attack message is misjudged as normal.
(4): True negative (TN): The number of negative samples correctly classified as negative by the classifier, i.e., grid data that does not contain any attack message is correctly judged as normal.

The total number of Mosaic patterns processed in a specific coding case is denoted as Sum and shown in Table 1. The final output of the model is a pair of class probabilities, from which a sample is assigned to normal data (R), attack data (T), or unrecognizable patterns, according to the threshold. Generally, the input data are classified as either normal (R) or attack (T) by comparing the outputs with a cut value (threshold), which is normally set to 0.5. However, this results in low classification reliability, because a sample is assigned to a category even when the output is only slightly greater than 0.5 (say, 0.501). Hence, higher thresholds, i.e., 0.6, 0.7, 0.8, and 0.9, were employed in this paper to improve the classification reliability of the model. When one of the two output neurons is greater than the threshold, the category corresponding to that output is taken as the discrimination result. Otherwise, the sample is classified as unrecognized. For example, when the threshold is 0.7, a sample is assigned to a category only if one of the two outputs is greater than 0.7; if neither output is greater than 0.7, it is classified as an unrecognizable sample. The evaluation indexes adopted in this paper, based on the confusion matrix, are described as follows:

(1): Unrecognizable rate (UR): the number of unrecognized samples divided by the total number of test samples. When a threshold higher than 0.5 is set, an unrecognized sample could be either a normal input or an intrusion. This index can effectively test the reliability of the model and is calculated as follows (where L is the number of unrecognized samples):

UR = \frac{L}{Sum}

(7)

(2): Accuracy: the proportion of the number of samples correctly classified over the total test samples. Generally speaking, the higher the accuracy, the better the classifier. It is defined as:

Accuracy = \frac{TP + TN}{Sum}

(8)

(3): Recall: a measure of coverage, indicating how many positive samples are correctly classified as positive. It is defined as:

Recall = \frac{TP}{TP + FN}

(9)

(4): False Negative Rate (FNR): the percentage of positive samples misclassified as negative, which is defined as:

FNR = \frac{FN}{TP + FN}

(10)

(5): Precision: the percentage of positive samples correctly classified over all classified positive samples. It is defined as:

Precision = \frac{TP}{TP + FP}

(11)

(6): Comprehensive classification rate (F1_score): the harmonic mean of the model’s precision and recall, indicating the overall accuracy of the model. It is defined as:

F1\_score = \frac{2 \times Precision \times Recall}{Precision + Recall}

(12)
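
As a minimal sketch of Equations (7)–(12), the function below computes the six indexes from confusion-matrix counts. It assumes that Sum is the total number of test samples, i.e., TP + TN + FP + FN plus the L unrecognized samples; the function and argument names are illustrative and are not taken from the authors' code.

```python
def evaluation_indexes(tp, tn, fp, fn, unrecognized=0):
    """Compute the indexes of Equations (7)-(12) from confusion-matrix counts.

    Assumption: Sum = TP + TN + FP + FN + L, where L (`unrecognized`) is the
    number of samples for which neither output reached the threshold.
    Denominators are assumed to be non-zero.
    """
    total = tp + tn + fp + fn + unrecognized
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "UR": unrecognized / total,                                   # Eq. (7)
        "Accuracy": (tp + tn) / total,                                # Eq. (8)
        "Recall": recall,                                             # Eq. (9)
        "FNR": fn / (tp + fn),                                        # Eq. (10)
        "Precision": precision,                                       # Eq. (11)
        "F1_score": 2 * precision * recall / (precision + recall),    # Eq. (12)
    }


# DoS counts quoted in Section 4.1; L = 0 is assumed for the 0.5 threshold.
indexes = evaluation_indexes(tp=16853, tn=40373, fp=0, fn=52)
for name, value in indexes.items():
    print(f"{name}: {value:.5f}")
```

With these counts, the sketch gives Recall ≈ 0.99692 and Accuracy ≈ 0.99909, in line with the DoS row for the 4 × 4 Mosaic 6 × 6 data grid in Table 2.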

Table 1. The numbers of normal (R) and attack (T) samples, for each coding method.

Dataset	Original Data	29 × 29 Sequence	4 × 4 Mosaic	6 × 6 Mosaic	8 × 8 Mosaic
DoS	R: 3,078,250	R: 88,954	R: 161,491	R: 71,598	R: 40,149
DoS	T: 587,521	T: 37,451	T: 67,619	T: 30,228	T: 17,128
Fuzzy	R: 3,347,013	R: 87,888	R: 159,796	R: 70,741	R: 39,665
Fuzzy	T: 491,847	T: 44,486	T: 80,132	T: 35,894	T: 20,317
Gear	R: 3,845,890	R: 87,928	R: 159,673	R: 70,787	R: 39,673
Gear	T: 597,252	T: 65,283	T: 118,023	T: 52,633	T: 29,751
RPM	R: 3,966,805	R: 87,997	R: 159,752	R: 70,831	R: 39,713
RPM	T: 654,897	T: 71,372	T: 129,104	T: 57,549	T: 32,501
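
The per-coding totals in Table 1 are consistent with cutting each message stream into non-overlapping windows, one window per sample, and discarding the incomplete tail. The sketch below checks this arithmetic for the DoS rows. It is an inference from the published counts rather than a description of the authors' preprocessing, and the helper name is hypothetical. How a window is labelled R or T is not covered here.

```python
def window_count(num_messages, ids_per_sample):
    """Number of complete, non-overlapping windows in a message stream."""
    return num_messages // ids_per_sample


dos_messages = 3_078_250 + 587_521  # R + T in the original DoS data (Table 1)

# IDs per sample: 29 rows for the 29 x 29 sequence, n * n IDs for an n x n Mosaic.
checks = [
    ("29 x 29 sequence", 29, 88_954 + 37_451),
    ("4 x 4 Mosaic", 16, 161_491 + 67_619),
    ("6 x 6 Mosaic", 36, 71_598 + 30_228),
    ("8 x 8 Mosaic", 64, 40_149 + 17_128),
]
for name, ids_per_sample, table_total in checks:
    print(name, window_count(dos_messages, ids_per_sample), table_total)
```

For each coding method, the computed window count equals the sum of the R and T entries in the corresponding DoS cells of Table 1.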

4. Experimental Result and Discussions

Our CNN model, based on the Mosaic-pattern-coding strategy, was tested in a series of experiments. We evaluated the effectiveness of the proposed method, compared it with other methods for the same problem, and discussed the accuracy and running-time performance of the existing method and ours.

4.1. Experimental Result

The data were tested at all thresholds from 0.5 to 0.9, and the numbers of TP, TN, FP, and FN in the confusion matrix were then worked out. After that, each indicator in Equations (7)–(12) was calculated. For example, when the 4 × 4 Mosaic 6 × 6 data-grid-coding method was used to classify the DoS attack dataset, the results were TP = 16853, TN = 40373, FP = 0, and FN = 52. From these values, we can further evaluate the quality of the model. Table 2 and Table 3 show only the results for the threshold values of 0.5 and 0.9. The coding methods used in the tests can be divided into the following four categories: (1) sequential coding of size 16 × 16; (2) sequential coding of size 29 × 29; (3) Mosaic coding with 4 × 4, 6 × 6, and 8 × 8 Mosaic patterns, with each ID converted to a 4 × 4 data grid; and (4) Mosaic coding with 4 × 4, 6 × 6, and 8 × 8 Mosaic patterns, with each ID converted to a 6 × 6 data grid. Coding methods (1) and (3) correspond to an 11-bit ID, whereas methods (2) and (4) correspond to a 29-bit ID. In the sequential-coding method, taking the 16 × 16 sequence as an example, each 11-bit CAN ID was first extended to 16 bits by appending 5 zeros, and then 16 such CAN IDs were stacked sequentially to form a 16 × 16 sequence data grid. In the Mosaic-coding method, taking the 4 × 4 Mosaic 4 × 4 data grid as an example, each 11-bit CAN ID was first extended to 16 bits by appending 5 zeros and then coded as a 4 × 4 data grid; finally, 16 such data grids were arranged in sequence to splice a 4 × 4 Mosaic pattern.
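
The two coding schemes described above can be sketched in plain Python as follows. Several details are assumptions made only for illustration, since the text does not fix them: bits are taken most-significant first, each padded ID fills its data grid row by row, and the data grids are placed into the Mosaic row by row in arrival order. The function names are hypothetical.

```python
def id_to_bits(can_id, id_bits=11, padded_bits=16):
    """Bits of a CAN ID (MSB first, an assumption), zero-padded at the end."""
    bits = [(can_id >> (id_bits - 1 - i)) & 1 for i in range(id_bits)]
    return bits + [0] * (padded_bits - id_bits)


def sequential_grid(can_ids, id_bits=11, padded_bits=16):
    """Sequential coding: one padded ID per row (16 IDs -> 16 x 16)."""
    return [id_to_bits(c, id_bits, padded_bits) for c in can_ids]


def mosaic_grid(can_ids, block=4, tiles=4, id_bits=11):
    """Mosaic coding: each ID becomes a block x block data grid, and
    tiles * tiles such grids are laid out row by row (an assumption)."""
    if len(can_ids) != tiles * tiles:
        raise ValueError("expected tiles * tiles CAN IDs")
    size = block * tiles
    grid = [[0] * size for _ in range(size)]
    for k, can_id in enumerate(can_ids):
        bits = id_to_bits(can_id, id_bits, block * block)
        top, left = (k // tiles) * block, (k % tiles) * block
        for j, bit in enumerate(bits):
            grid[top + j // block][left + j % block] = bit
    return grid


# 16 illustrative 11-bit IDs: the first is all ones, the rest are zero.
ids = [0x7FF] + [0x000] * 15
grid = mosaic_grid(ids)  # 4 x 4 Mosaic of 4 x 4 data grids -> 16 x 16
for row in grid[:4]:
    print(row[:4])
```

For a 29-bit ID, the same sketch would be called with `block=6` and `id_bits=29`, so that each ID is padded with 7 zeros to fill a 6 × 6 data grid.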

As can be seen from the experimental results in Table 2 and Table 3, although different Mosaic-coding methods led to slightly different results, our method generally achieved better results than the direct 16 × 16 or 29 × 29 sequential-coding methods, with the best results always given by our method. Regardless of whether the 11-bit or 29-bit CAN ID was used, the 4 × 4 Mosaic coding showed the best results for the DoS attack, the 8 × 8 Mosaic coding gave the best results for the Fuzzy attack, and the 6 × 6 Mosaic coding was the best for the Gear and RPM attacks. This is because the Fuzzy attack is the most difficult to detect, followed by the Gear and RPM attacks, while the DoS attack is the easiest to detect. Therefore, a more difficult attack may need a more complicated network to detect it.

To quantify the improvement, Table 4 gives the percentage increase of our method over the sequential-coding method for each performance index, with different ID and threshold values. We can see from this table that our method achieved better results over nearly all the performance indexes. This is more prominent for the indexes of UR and FNR, as the other indexes obtained by the sequential-coding method were already quite good, and our method could not improve them much further. Therefore, the coding method proposed in this paper is superior to the direct sequential-coding method in overall performance. We also found that, when Mosaic coding is adopted, the 29-bit coding method is slightly better than the 11-bit one, which may be because the 29-bit CAN ID is processed with a 6 × 6 grid size, resulting in larger model input data and more model parameters.
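
The entries of Table 4 appear to be relative changes with respect to the sequential-coding baseline. The sketch below states that formula; it is inferred from the table values rather than given explicitly in the text, and the function name is hypothetical.

```python
def percentage_change(ours, baseline):
    """Relative change of `ours` with respect to `baseline`, in percent."""
    return (ours - baseline) / baseline * 100.0


# DoS, threshold 0.5, 11-bit ID (Table 2): FNR of 4 x 4 Mosaic vs. 16 x 16 sequence.
print(round(percentage_change(0.301, 0.325), 2))
# DoS, threshold 0.9, 11-bit ID (Table 3): UR of 4 x 4 Mosaic vs. 16 x 16 sequence.
print(round(percentage_change(0.101, 0.15), 2))
```

These two calls reproduce the −7.38% (FNR) and −32.67% (UR) entries listed for DoS with the 11-bit ID in Table 4. A negative value for UR or FNR therefore indicates an improvement.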

When the threshold was increased from 0.5 to 0.6, 0.7, 0.8, or 0.9, outputs that did not meet the judgment condition could not be classified and were discarded as unrecognized samples. The higher the threshold, the more samples are discarded, resulting in a larger UR value. According to our results, as the threshold increased, the values of Precision, Recall, and F1_score all increased, while the values of Accuracy and FNR decreased. The reason for the decrease in Accuracy is that it is computed as the sum of TP and TN divided by the total number of patterns processed in that test. When the threshold increases, the total number of TP and TN decreases, leading to a reduction in Accuracy. According to the experimental results, our coding method led to a smaller UR value, indicating that our model had a better discrimination rate.
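
The thresholded decision rule and its effect on UR can be illustrated with the following sketch. It assumes that a sample is accepted when its larger output is greater than or equal to the threshold, and that ties between the two outputs are resolved in favour of R; the output pairs are hypothetical values, not taken from the experiments.

```python
def classify(p_normal, p_attack, threshold=0.5):
    """Return "R" (normal), "T" (attack), or "U" (unrecognized)."""
    # Pick the larger of the two outputs; a tie goes to "R" (an assumption).
    label, score = ("R", p_normal) if p_normal >= p_attack else ("T", p_attack)
    return label if score >= threshold else "U"


def unrecognizable_rate(outputs, threshold):
    """UR = L / Sum for a list of (p_normal, p_attack) pairs."""
    labels = [classify(pn, pa, threshold) for pn, pa in outputs]
    return labels.count("U") / len(labels)


# Hypothetical model outputs for four samples.
outputs = [(0.98, 0.02), (0.65, 0.35), (0.15, 0.85), (0.45, 0.55)]
for t in (0.5, 0.7, 0.9):
    print(t, unrecognizable_rate(outputs, t))
```

In this toy example, raising the threshold moves borderline samples into the unrecognized category, so UR can only stay the same or grow as the threshold increases.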

To give an overall view of how the performance of the models changes with the threshold, the changes in each performance index on the Fuzzy dataset under different thresholds are shown in Figure 9.

We can see from Figure 9 that the performance indexes change less across thresholds for the Mosaic pattern coding than for the sequential coding. Precision was nearly unchanged with the threshold when the Mosaic-pattern-coding method was used, but it changed for the sequential-coding method, especially for the 11-bit ID dataset. The other performance indexes changed only slightly when the Mosaic-pattern-coding method was used, whereas the change was much bigger when the sequential-coding method was used. In all cases, the performance of the Mosaic-pattern-coding method was better than that of the sequential-coding method, and the performance of the sequential-coding method on the 11-bit ID dataset changed the most with the threshold. These results show that the proposed Mosaic-pattern-coding method not only had better performance and a higher discrimination rate but also depended less on the threshold value.

4.2. Discussion

A convolutional neural network is good at processing 2D grid data, such as images. To make the CAN ID data meet the 2D-grid requirement for the input of a CNN model, some researchers directly packed the data in sequence. This kind of data processing was unable to make full use of the advantages of the CNN in pattern extraction. Therefore, this paper proposed a novel data-coding method that converts each 1D ID to a 2D data block and joins these data blocks in time order to splice a Mosaic pattern that reflects the time association. This makes it easier for the CNN to extract the data patterns effectively, while also maintaining the time characteristics among the data.

To evaluate the reliability of different models, our model was compared with the sequential-coding model. As the UR value indicates how well a model distinguishes patterns under a stricter standard, Figure 10 gives the change of UR with the threshold for both the Mosaic-coding and the direct-sequential-coding methods. It can be seen from this figure that the UR value increases with the threshold in both methods: the larger the threshold, the higher the UR value, representing more data that were unrecognized under the stricter standard. However, the increase in UR with the Mosaic-coding method is much lower than that with the sequential-coding method. For example, even when the threshold equals 0.9, the UR value with the Mosaic-coding method is still lower than 0.2%, meaning that the classification capability of the proposed model is much higher than that of the previous one. This indicates that the Mosaic-coding method is more reliable and stable in identifying intruders' attacks.

In recent years, with the development of machine learning, more and more researchers have used machine-learning methods for intrusion-detection research [47]. To further verify the feasibility of the proposed method, the Mosaic-coded-CNN method was also compared with some other classical machine-learning algorithms, using the 29-bit CAN ID. The results obtained from an Artificial Neural Network (ANN) with 2 hidden layers, an LSTM with 256 hidden units, a Support Vector Machine (SVM), a K-Nearest Neighbor (KNN) classifier with K = 5, a Naive Bayes (NB) classifier, and a Decision Tree (DT) [36] were used for comparison in this paper. The Markov-Transition-Field (MTF) method [48] was also used to convert each binary CAN ID sequence into a 2 × 2 grid image, for comparison. The experimental results are shown in Figure 11a–d.

The experimental results showed that, generally speaking, our method performed slightly better than or as well as the other machine-learning algorithms on the evaluation indexes of Precision, Recall, F1_score, and Accuracy. For FNR, our method always performed much better than KNN, NB, and DT. However, it was sometimes slightly worse than ANN, LSTM, or SVM on some datasets. This is because we only used the simplest CNN model in this paper, which has certain limitations and could be further optimized.

Finally, to compare the running time of the Mosaic-pattern-coding method with that of the sequential-coding method, Table 5 gives the time (in seconds) taken to run the program. The program was run on a laptop with an Intel(R) Core(TM) i7-4510U CPU @ 2.00 GHz; the machine has two cores and four threads, and the program runs on its CPU.

It can be seen from Table 5 that the running time differs with the Mosaic size and the coding method. Compared with the sequential-coding method, the Mosaic-coding method generally takes a little longer to run. This is because, whether the 4 × 4 data grid (for an 11-bit CAN ID) or the 6 × 6 data grid (for a 29-bit CAN ID) is used in our method, there are always some redundant bits in the grid, which consume some of the running time. We also find from Table 5 that the 6 × 6 Mosaic pattern took slightly longer to run than the 4 × 4 or 8 × 8 Mosaic patterns, for both the 11-bit and 29-bit CAN IDs. This is mainly because the number of running cycles, the number of samples, and the number of convolutions differ between coding methods, and their product is the largest for the 6 × 6 Mosaic patterns. However, this reflects only the training time. Once the networks have been trained, the test time is much shorter. The test time for one sample, along with the number of parameters in the model for each method, is shown in Figure 12, which shows that all test times were below one millisecond. On a higher-performance computer, they would be reduced further. Therefore, our model can meet the requirement of detecting intrusion attacks in real time. Figure 12 also shows that the test time of the model is mainly determined by the size of the input data and the complexity of the model itself. When the structure of the model changes, the test time inevitably changes as well. For example, when the number of convolution layers or fully connected layers increases, the test time also increases.

4.3. Some Limitations of the Method

The first limitation is related to the training dataset. As the data used in this paper are composed of four independent datasets, each containing only one attack type, we conducted model training and testing on each dataset separately. As a result, a model trained on a specific attack dataset is only guaranteed to be effective against that type of attack. To test the performance of a model trained on one dataset and tested on another, we list in Table 6 the accuracy of the model trained using the 8 × 8 Mosaic 6 × 6 data-grid-coding method on one dataset and tested on all four datasets. We can see from this table that a model trained on only one type of attack data detects other types of attacks with decreased accuracy. Therefore, care is needed when planning to use a model trained on one type of attack data to detect other types of attacks. We shall improve the design in our future work to include all types of attacks.

The other limitation is the possible adversarial attack, which is artificially designed “noise” added to the samples in the training process of the model. Although the modified sample is difficult for human eyes to distinguish directly, it is an attack method that may degrade the model's classification ability. In fact, a number of studies have shown that adversarial attacks can affect performance in image recognition, natural language processing, malware detection, and other fields [49]. The CNN model would also be vulnerable to adversarial attacks, in which carefully designed input perturbations, at either the training or the test stage, could divert its predictions. Adversarial training is an effective method to defend against adversarial-sample attacks. It generates an adversarial sample set by adding various disturbance information to the original samples and trains the deep-neural-network model with the adversarial and original sample sets [50]. As most of the vehicle's CAN IDs are periodic, the messages on the CAN bus differ when the vehicle is in different driving states. Therefore, we can train the detection model by collecting CAN ID data in different driving states of the vehicle (such as static, braking, accelerating, etc.), so that the model can learn richer and more complex information and become more robust. As the detection models can be trained off-line, in which adversarial attacks will be classified as attack messages, we did not consider this issue in our paper. However, this would be an important issue for us to explore further in the future.

5. Conclusions

In this paper, a Mosaic-pattern-based data-coding method was proposed to make the CNN model more conducive to the extraction of 2D data features and more explanatory in the extraction of time features of the attack data. In terms of coding methods, 4 × 4, 6 × 6, and 8 × 8 Mosaic patterns, with 11-bit and 29-bit CAN IDs, were adopted. The experimental results showed that the Mosaic-coding method was better than the sequential-coding method at identifying intruders' attacks and was more stable as the threshold changed. By setting different thresholds, it was shown that the method proposed in this paper was more reliable, with a higher classification capability to distinguish intrusions from normal messages. Our model also generally performs better than other classical machine-learning algorithms. In terms of running time, although the proposed Mosaic-coding methods took slightly longer to train, they were very fast at identifying intrusions when testing a single sample at run time, meaning that the proposed model is able to meet the real-time requirement of the system.

Author Contributions

R.H. organized and supervised the research, discussed the concepts, reviewed the manuscript, and supported the research by providing funds. Z.W. conducted the simulations, discussed the results, and wrote the first draft of this paper. Y.X. conceived the idea, supervised the research, reviewed the manuscript, and supported the research by providing funds. T.L. supervised the research. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Natural Science Foundation of Fujian [Grant numbers 2019J05113 and 2021J011069]; the Fujian Provincial Key Laboratory of Information Processing and Intelligent Control (Minjiang University) [Grant number MJUKF-JK202002]; and the scientific research start-up fund of Fujian University of Technology [Grant number GY-Z21213].

Data Availability Statement

This paper uses a public dataset, published by HCRL Laboratory in South Korea.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Tuohy, S.; Glavin, M.; Hughes, C. Intra-vehicle networks: A review. IEEE Trans. Intell. Transp. Syst. 2015, 16, 534–545.
2. Koscher, K.; Czeskis, A.; Roesner, F.; Patel, S.; Kohno, T.; Checkoway, S.; McCoy, D.; Kantor, B.; Anderson, D.; Shacham, H.; et al. Experimental security analysis of a modern automobile. In Proceedings of the 2010 IEEE Symposium on Security and Privacy, Oakland, CA, USA, 16–19 May 2010; pp. 447–462.
3. Li, X.; Hu, Z.; Xu, M. Transfer learning based intrusion detection scheme for Internet of vehicles. Inf. Sci. 2021, 547, 119–135.
4. Duan, W.; Gu, J.; Wen, M.; Zhang, G.; Ji, Y.; Mumtaz, S. Emerging technologies for 5G-IoV networks: Applications, trends and opportunities. IEEE Netw. 2020, 34, 283–289.
5. Ahmed, E.S.A.; Mohammed, Z.T.; Hassan, M.B.; Saeed, R.A. Algorithms Optimization for Intelligent IoV Applications. In Handbook of Research on Innovations and Applications of AI, IoT, and Cognitive Technologies; IGI Global: Hershey, PA, USA, 2021; pp. 1–25.
6. Marchetti, M.; Stabili, D. READ: Reverse engineering of automotive data frames. IEEE Trans. Inf. Forensics Sec. 2018, 14, 1083–1097.
7. Liu, J.; Zhang, S.; Sun, W. In-vehicle network attacks and countermeasures: Challenges and future directions. IEEE Netw. 2017, 31, 50–58.
8. Miller, C.; Valasek, C. Adventures in automotive networks and control units. DefCon 2013, 21, 15–31.
9. Miller, C.; Valasek, C. A Survey of Remote Automotive Attack Surfaces; Tech. Rep.; Black Hat USA: Las Vegas, NV, USA, 2014; p. 94.
10. Miller, C.; Valasek, C. Remote Exploitation of an Unaltered Passenger Vehicle; Tech. Rep.; Black Hat USA: Las Vegas, NV, USA, 2015; Available online: https://illmatics.com/Remote%20Car%20Hacking.pdf (accessed on 2 May 2022).
11. The Jeep Attackers Are Back to Prove Car Hacking Can Get Much Worse. Available online: https://www.wired.com/2016/08/jeep-hackers-return-high-speed-steering-acceleration-hacks/ (accessed on 30 December 2020).
12. Internet of Vehicles Network Security White Paper. 2017. Available online: http://www.askci.com/news/chanye/20170922/093549108274_3.shtml (accessed on 30 December 2020).
13. The Latest Research Results of Tencent Cohen Lab: 2017 Once Again Realized the Remote Attack without Physical Contact on Tesla. Available online: https://keenlab.tencent.com/zh/2017/07/27/New-Car-Hacking-Research-2017-Remote-Attack-Tesla-Motors-Again/ (accessed on 30 December 2020).
14. The Latest Automotive Safety Research Results of Tencent Cohen Lab: A Review of the Safety Research of Many BMW Model. Available online: https://keenlab.tencent.com/zh/2018/05/22/New-CarHacking-Research-by-KeenLab-Experimental-Security-Assessment-of-BMW-Cars/ (accessed on 30 December 2020).
15. Attackers Relay Tesla Model X to Drive Away in 3 Minutes. 2020. Available online: https://finance.sina.comcn/stock/usstock/c/2020–11-24/doc-iiznctke2927970.shtml (accessed on 30 December 2020).
16. Choi, W.; Jo, H.J.; Woo, S. Identifying ECUs using inimitable characteristics of signals in controller area networks. IEEE Trans. Veh. Technol. 2018, 67, 4757–4770.
17. Sagstetter, F.; Lukasiewycz, M.; Chakraborty, S. Generalized asynchronous time-triggered scheduling for FlexRay. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2016, 36, 214–226.
18. Lin, C.; Sangiovanni-Vincentelli, A. Cyber-security for the controller area network (CAN) communication protocol. In Proceedings of the 2012 International Conference on Cyber Security, Alexandria, VA, USA, 14–16 December 2012; pp. 1–7.
19. Wu, W.; Kurachi, R.; Zeng, G. IDH-CAN: A hardware-based ID hopping CAN mechanism with enhanced security for automotive real-time applications. IEEE Access 2018, 6, 54607–54623.
20. Mehedi, S.; Anwar, A.; Rahman, Z.; Ahmed, K. Deep Transfer Learning Based Intrusion Detection System for Electric Vehicular Networks. Sensors 2021, 21, 4736.
21. Wu, W.; Li, R.; Xie, G.; An, J.; Bai, Y.; Zhou, J.; Li, K. A survey of intrusion detection for in-vehicle networks. IEEE Trans. Intell. Transp. Syst. 2020, 21, 919–933.
22. Pierazzi, F.; Apruzzese, G.; Colajanni, M.; Guido, A.; Marchetti, M. Scalable architecture for online prioritisation of cyber threats. In Proceedings of the 2017 9th International Conference on Cyber Conflict (CyCon), Tallinn, Estonia, 30 May–2 June 2017; pp. 1–18.
23. Khan, M.A. HCRNNIDS: Hybrid convolutional recurrent neural network-based network intrusion detection system. Processes 2021, 9, 834.
24. Müter, M.; Groll, A.; Freiling, F.C. A structured approach to anomaly detection for in-vehicle networks. In Proceedings of the 2010 Sixth International Conference on Information Assurance and Security, Atlanta, GA, USA, 23–25 August 2010; pp. 92–98.
25. Yin, J.; Ren, J.G.; Lu, H. Quantum teleportation and entanglement distribution over 100-kilometre free-space channels. Nature 2012, 488, 185–188.
26. Müter, M.; Asaj, N. Entropy-based anomaly detection for in-vehicle networks. In Proceedings of the 2011 IEEE Intelligent Vehicles Symposium (IV), Baden-Baden, Germany, 5–9 June 2011; pp. 1110–1115.
27. Taylor, A.; Japkowicz, N.; Leblanc, S. Frequency-based anomaly detection for the automotive CAN bus. In Proceedings of the 2015 World Congress on Industrial Control Systems Security (WCICSS), London, UK, 14–16 December 2015; pp. 45–49.
28. Song, H.M.; Kim, H.R.; Kim, H.K. Intrusion detection system based on the analysis of time intervals of can messages for in-vehicle network. In Proceedings of the 2016 International Conference on Information Networking (ICOIN), Kota Kinabalu, Malaysia, 13–15 January 2016; pp. 63–68.
29. Larson, U.E.; Nilsson, D.K.; Jonsson, E. An approach to specification-based attack detection for in-vehicle networks. In Proceedings of the 2008 IEEE Intelligent Vehicles Symposium, Eindhoven, The Netherlands, 4–6 June 2008; pp. 220–225.
30. Murvay, P.S.; Groza, B. Source identification using signal characteristics in controller area networks. IEEE Signal. Proc. Lett. 2014, 21, 395–399.
31. Taylor, A.; Leblanc, S.; Japkowicz, N. Anomaly Detection in Automobile Control Network Data with Long Short-Term Memory Networks. In Proceedings of the 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Montreal, QC, Canada, 17–19 October 2016; pp. 130–139.
32. Su, Y.; Zhao, Y.; Sun, M.; Zhang, S.; Wen, X.; Zhang, Y.; Liu, X.; Liu, X.; Tang, J.; Wu, W.; et al. Detecting Outlier Machine Instances Through Gaussian Mixture Variational Autoencoder with One Dimensional CNN. IEEE Trans. Comput. 2021, 71, 892–905.
33. Mozaffari, M.H.; Tay, L.L. Anomaly detection using 1D convolutional neural networks for surface enhanced raman scattering. In SPIE Future Sensing Technologies; International Society for Optics and Photonics: Washington, DC, USA, 2020; Volume 11525, p. 115250S.
34. Yu, Q.; Kavitha, M.; Kurita, T. Detection of one dimensional anomalies using a vector-based convolutional autoencoder. In Asian Conference on Pattern Recognition; Springer: Cham, Switzerland, 2019; pp. 516–529.
35. Hsieh, C.-H.; Li, Y.-S.; Hwang, B.-J.; Hsiao, C.-H. Detection of Atrial Fibrillation Using 1D Convolutional Neural Network. Sensors 2020, 20, 2136.
36. Song, H.M.; Woo, J.; Kim, H.K. In-vehicle network intrusion detection using deep convolutional neural network. Veh. Commun. 2019, 21, 100198.
37. Hu, R.; Wu, Z.; Xu, Y.; Lai, T.; Xia, C. A multi-attack intrusion detection model based on Mosaic coded convolutional neural network and centralized encoding. PLoS ONE 2022, 17, e0267910.
38. Davis, R.I.; Burns, A.; Bril, R.J. Controller area network (CAN) schedulability analysis: Refuted, revisited and revised. Real-Time Syst. 2007, 35, 239–272.
39. Mishkin, D.; Sergievskiy, N.; Matas, J. Systematic evaluation of convolution neural network advances on the Imagenet. Comput. Vis. Image. Underst. 2017, 161, 11–19.
40. Dimauro, G.; Deperte, F.; Maglietta, R.; Bove, M.; La Gioia, F.; Renò, V.; Simone, L.; Gelardi, M. A Novel Approach for Biofilm Detection Based on a Convolutional Neural Network. Electronics 2020, 9, 881.
41. Xue, Y.; Wang, Y.; Liang, J.; Slowik, A. A Self-Adaptive Mutation Neural Architecture Search Algorithm Based on Blocks. IEEE Comput. Intell. Mag. 2021, 16, 67–78.
42. Xue, Y.; Jiang, P.; Neri, F.; Liang, J. A Multi-Objective Evolutionary Approach Based on Graph-in-Graph for Neural Architecture Search of Convolutional Neural Networks. Int. J. Neural Syst. 2021, 31, 2150035.
43. Kim, J.; Kim, J.; Kim, H.; Shim, M.; Choi, E. CNN-based network intrusion detection against denial-of-service attacks. Electronics 2020, 9, 916.
44. Song, H.M.; Woo, J.; Kim, H.K. Can Network Intrusion Datasets. Available online: http://ocslab.hksecurity.net/Datasets/car-hacking-dataset (accessed on 30 December 2019).
45. Albawi, S.; Mohammed, T.A.; Alzawi, S. Understanding of a Convolutional Neural Network. In Proceedings of the 2017 International Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017; pp. 1–6.
46. Alzubaidi, L.; Zhang, J.; Humaidi, A.J. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74.
47. Corsini, A.; Yang, S.J.; Apruzzese, G. On the Evaluation of Sequential Machine Learning for Network Intrusion Detection. In Proceedings of the 16th International Conference on Availability, Reliability and Security, New York, NY, USA, 17–20 August 2021; pp. 1–10.
48. Wang, Z.; Oates, T. Encoding time series as images for visual inspection and classification using tiled convolutional neural networks. In Proceedings of the Workshops at 29th AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–26 January 2015.
49. Apruzzese, G.; Andreolini, M.; Ferretti, L.; Marchetti, M.; Colajanni, M. Modeling Realistic Adversarial Attacks against Network Intrusion Detection Systems. Digit. Threat. Res. Pract. 2021.
50. Biggio, B.; Roli, F. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognit. 2018, 84, 317–331.

Figure 1. Signal logic of CAN bus.

Figure 2. CAN message structure.

Figure 3. Sequential arrangement of the CAN data.

Figure 4. The 16 4 × 4 data grids, to splice a Mosaic data structure, in which 16 such data grids were, sequentially, arranged one after another in the data pattern, to keep the temporal characteristics of the original data.

Figure 5. The data arrangement of 3 × 3 Mosaic pattern, with 29-bit ID.

Figure 6. Grid data image, containing Fuzzy attack.

Figure 7. The CNN model structure.

Figure 8. The flowchart of the overall scheme.

Figure 9. The change of performance for each model, under different thresholds for Fuzzy dataset.

Figure 10. Comparison of the UR values, for different models.

Figure 11. Comparison of the experimental results of our method, with some other classical machine-learning algorithms.

Figure 12. The test time for one sample and the number of parameters in the model, for each method. The vertical axis on the left represents the number of parameters (connections) in the model, while on the right it represents the test time in microseconds. The horizontal axis represents the coding method.

Table 2. Simulation results with the output threshold of 0.5 (the best results are shown in bold).

Dataset	Coding Method	Precision	Recall	F1_Score	Accuracy	FNR
DoS	16 × 16 sequence	1.0	0.99674	0.99837	0.99903	0.325%
	4 × 4 Mosaic 4 × 4 data grid	0.99988	0.99698	0.99843	0.99907	0.301%
	6 × 6 Mosaic 4 × 4 data grid	1.0	0.99192	0.99594	0.99760	0.807%
	8 × 8 Mosaic 4 × 4 data grid	1.0	0.98879	0.99436	0.99664	1.12%
	29 × 29 sequence	1.0	0.99519	0.99759	0.99857	0.48%
	4 × 4 Mosaic 6 × 6 data grid	1.0	0.99692	0.99845	0.99909	0.307%
	6 × 6 Mosaic 6 × 6 data grid	1.0	0.99192	0.99594	0.99760	0.807%
	8 × 8 Mosaic 6 × 6 data grid	1.0	0.99042	0.99518	0.99713	0.957%
Fuzzy	16 × 16 sequence	0.98572	0.93390	0.95911	0.97340	6.609%
	4 × 4 Mosaic 4 × 4 data grid	0.98383	0.91753	0.94952	0.96742	8.246%
	6 × 6 Mosaic 4 × 4 data grid	0.99354	0.97804	0.98573	0.99047	2.195%
	8 × 8 Mosaic 4 × 4 data grid	0.99939	0.97794	0.98855	0.99233	2.205%
	29 × 29 sequence	0.99732	0.97086	0.98391	0.98933	2.913%
	4 × 4 Mosaic 6 × 6 data grid	0.99756	0.91913	0.95674	0.97224	8.086%
	6 × 6 Mosaic 6 × 6 data grid	0.99639	0.98607	0.99120	0.99411	1.392%
	8 × 8 Mosaic 6 × 6 data grid	0.99980	0.98661	0.99316	0.99539	1.338%
Gear	16 × 16 sequence	0.99877	0.99206	0.99540	0.99611	0.793%
	4 × 4 Mosaic 4 × 4 data grid	0.99911	0.99051	0.99479	0.99559	0.948%
	6 × 6 Mosaic 4 × 4 data grid	1.0	0.99635	0.99817	0.99844	0.364%
	8 × 8 Mosaic 4 × 4 data grid	1.0	0.99475	0.99737	0.99775	0.524%
	29 × 29 sequence	0.99993	0.99663	0.99828	0.99853	0.336%
	4 × 4 Mosaic 6 × 6 data grid	0.99863	0.99356	0.99609	0.99668	0.643%
	6 × 6 Mosaic 6 × 6 data grid	1.0	0.99749	0.99874	0.99893	0.25%
	8 × 8 Mosaic 6 × 6 data grid	1.0	0.99394	0.99696	0.99740	0.605%
RPM	16 × 16 sequence	0.99918	0.99355	0.99636	0.99675	0.644%
	4 × 4 Mosaic 4 × 4 data grid	0.99965	0.99402	0.99683	0.99717	0.597%
	6 × 6 Mosaic 4 × 4 data grid	0.99993	0.99645	0.99818	0.99837	0.354%
	8 × 8 Mosaic 4 × 4 data grid	1.0	0.99630	0.99815	0.99833	0.369%
	29 × 29 sequence	0.99994	0.99630	0.99811	0.99831	0.369%
	4 × 4 Mosaic 6 × 6 data grid	0.99944	0.99634	0.99788	0.99811	0.365%
	6 × 6 Mosaic 6 × 6 data grid	0.99993	0.99652	0.99822	0.99841	0.347%
	8 × 8 Mosaic 6 × 6 data grid	1.0	0.99581	0.99790	0.99811	0.418%

Table 3. Simulation results with the output threshold of 0.9 (the best results are shown in bold).

Dataset	Coding Method	UR	Precision	Recall	F1_Score	Accuracy	FNR
DoS	16 × 16 sequence	0.15%	1.0	0.99762	0.99880	0.99780	0.237%
	4 × 4 Mosaic 4 × 4 data grid	0.101%	1.0	0.99774	0.99887	0.99832	0.225%
	6 × 6 Mosaic 4 × 4 data grid	0.184%	1.0	0.99480	0.99739	0.99662	0.519%
	8 × 8 Mosaic 4 × 4 data grid	0.335%	1.0	0.99409	0.99703	0.99490	0.590%
	29 × 29 sequence	0.139%	1.0	0.99774	0.99887	0.99794	0.225%
	4 × 4 Mosaic 6 × 6 data grid	0.108%	1.0	0.99804	0.99901	0.99834	0.195%
	6 × 6 Mosaic 6 × 6 data grid	0.192%	1.0	0.99560	0.99779	0.99677	0.439%
	8 × 8 Mosaic 6 × 6 data grid	0.258%	1.0	0.99528	0.99763	0.99601	0.471%
Fuzzy	16 × 16 sequence	11.14%	0.99910	0.97755	0.98821	0.88184	2.244%
	4 × 4 Mosaic 4 × 4 data grid	15.78%	0.99874	0.97208	0.98523	0.83420	2.791%
	6 × 6 Mosaic 4 × 4 data grid	3.308%	0.99847	0.99124	0.99484	0.96361	0.875%
	8 × 8 Mosaic 4 × 4 data grid	2%	1.0	0.98960	0.99477	0.97666	1.039%
	29 × 29 sequence	2.94%	0.99961	0.98984	0.99470	0.96727	1.015%
	4 × 4 Mosaic 6 × 6 data grid	6.992%	0.99981	0.96807	0.98369	0.92092	3.192%
	6 × 6 Mosaic 6 × 6 data grid	1.695%	0.99907	0.99529	0.99718	0.98120	0.47%
	8 × 8 Mosaic 6 × 6 data grid	1.193%	1.0	0.99349	0.99673	0.98592	0.65%
Gear	16 × 16 sequence	1.281%	0.99982	0.99659	0.99821	0.98568	0.340%
	4 × 4 Mosaic 4 × 4 data grid	1.441%	0.99986	0.99655	0.99820	0.98408	0.344%
	6 × 6 Mosaic 4 × 4 data grid	0.223%	1.0	0.99809	0.99904	0.99695	0.190%
	8 × 8 Mosaic 4 × 4 data grid	0.167%	1.0	0.99635	0.99817	0.99677	0.364%
	29 × 29 sequence	0.177%	0.99993	0.99815	0.99904	0.99741	0.184%
	4 × 4 Mosaic 6 × 6 data grid	1.408%	0.99993	0.99749	0.99871	0.98483	0.250%
	6 × 6 Mosaic 6 × 6 data grid	0.136%	1.0	0.99878	0.99938	0.99812	0.121%
	8 × 8 Mosaic 6 × 6 data grid	0.178%	1.0	0.99608	0.99803	0.99654	0.391%
RPM	16 × 16 sequence	1.013%	0.99993	0.99776	0.99884	0.98885	0.223%
	4 × 4 Mosaic 4 × 4 data grid	1.042%	0.99996	0.99813	0.99905	0.98874	0.186%
	6 × 6 Mosaic 4 × 4 data grid	0.208%	1.0	0.99804	0.99902	0.99704	0.195%
	8 × 8 Mosaic 4 × 4 data grid	0.321%	1.0	0.99789	0.99894	0.99584	0.21%
	29 × 29 sequence	0.190%	1.0	0.99763	0.99881	0.99703	0.236%
	4 × 4 Mosaic 6 × 6 data grid	0.811%	0.99996	0.99874	0.99935	0.99131	0.125%
	6 × 6 Mosaic 6 × 6 data grid	0.158%	1.0	0.99790	0.99895	0.99747	0.209%
	8 × 8 Mosaic 6 × 6 data grid	0.204%	1.0	0.99789	0.99894	0.99700	0.21%
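
The Recall and FNR columns of Table 3 can be cross-checked against each other. The sketch below assumes the standard definitions Recall = TP/(TP + FN) and FNR = FN/(TP + FN), so that FNR = 1 − Recall. The helper names `fnr_from_recall` and `max_gap` are illustrative and not part of the paper. The four rows are copied from Table 3.

```python
# (recall, FNR in percent) pairs copied from Table 3 (threshold 0.9).
rows = {
    "DoS, 16 x 16 sequence": (0.99762, 0.237),
    "Fuzzy, 16 x 16 sequence": (0.97755, 2.244),
    "Gear, 6 x 6 Mosaic 6 x 6 data grid": (0.99878, 0.121),
    "RPM, 29 x 29 sequence": (0.99763, 0.236),
}


def fnr_from_recall(recall: float) -> float:
    """Return the false-negative rate in percent, assuming FNR = 1 - recall."""
    return (1.0 - recall) * 100.0


def max_gap(table: dict) -> float:
    """Largest absolute gap (in percentage points) between listed and derived FNR."""
    return max(abs(fnr_from_recall(recall) - fnr) for recall, fnr in table.values())


if __name__ == "__main__":
    for name, (recall, fnr) in rows.items():
        derived = fnr_from_recall(recall)
        print(f"{name}: listed {fnr:.3f}%, derived {derived:.3f}%")
    print(f"largest gap: {max_gap(rows):.4f} percentage points")
```

For these rows the listed and derived values differ by about 0.001 percentage points, which is what one would expect if the two columns were rounded or truncated independently.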

Table 4. Percentage change of our method relative to the sequential-coding method, for each performance index.

Dataset	Threshold	ID	UR	Precision	Recall	F1_Score	Accuracy	FNR
DoS	0.5	11 bits		−0.01%	0.02%	0.01%	0.00%	−7.38%
	0.5	29 bits		0.00%	0.17%	0.09%	0.05%	−36.04%
	0.9	11 bits	−32.67%	0.00%	0.01%	0.01%	0.05%	−5.06%
	0.9	29 bits	−22.30%	0.00%	0.03%	0.01%	0.04%	−13.33%
Fuzzy	0.5	11 bits		1.39%	4.72%	3.07%	1.94%	−66.64%
	0.5	29 bits		0.25%	1.62%	0.94%	0.61%	−54.07%
	0.9	11 bits	−82.05%	0.09%	1.23%	0.66%	10.75%	−53.70%
	0.9	29 bits	−59.42%	0.04%	0.37%	0.20%	1.93%	−35.96%
Gear	0.5	11 bits		0.12%	0.43%	0.28%	0.23%	−54.10%
	0.5	29 bits		0.01%	0.09%	0.05%	0.04%	−25.60%
	0.9	11 bits	−82.59%	0.02%	0.15%	0.08%	1.14%	−44.12%
	0.9	29 bits	−23.16%	0.01%	0.06%	0.03%	0.07%	−34.24%
RPM	0.5	11 bits		0.08%	0.29%	0.18%	0.16%	−45.03%
	0.5	29 bits		0.00%	0.02%	0.01%	0.01%	−5.96%
	0.9	11 bits	−79.47%	0.01%	0.03%	0.02%	0.83%	−12.56%
	0.9	29 bits	−16.84%	0.00%	0.03%	0.01%	0.04%	−11.44%
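
The entries of Table 4 appear to be relative changes with respect to the sequential-coding baseline. The sketch below assumes the formula (ours − baseline) / baseline × 100 and takes the 4 × 4 Mosaic 4 × 4 data grid row of Table 3 as "our method" for the DoS dataset with 11-bit IDs at threshold 0.9. Both the formula and the choice of Mosaic row are assumptions, checked only against this one row of Table 4. The function name `relative_change` is illustrative.

```python
def relative_change(baseline: float, ours: float) -> float:
    """Percentage change of `ours` relative to `baseline` (assumed formula)."""
    return (ours - baseline) / baseline * 100.0


# Rows copied from Table 3 (DoS, threshold 0.9); UR and FNR are in percent.
sequence_16 = {
    "UR": 0.15, "Precision": 1.0, "Recall": 0.99762,
    "F1_Score": 0.99880, "Accuracy": 0.99780, "FNR": 0.237,
}
mosaic_4x4 = {
    "UR": 0.101, "Precision": 1.0, "Recall": 0.99774,
    "F1_Score": 0.99887, "Accuracy": 0.99832, "FNR": 0.225,
}

# Relative change per metric, rounded to two decimals as in Table 4.
changes = {
    metric: round(relative_change(sequence_16[metric], mosaic_4x4[metric]), 2)
    for metric in sequence_16
}

if __name__ == "__main__":
    for metric, value in changes.items():
        print(f"{metric}: {value:+.2f}%")
```

Under these assumptions the computed values (UR −32.67%, Precision 0.00%, Recall 0.01%, F1_Score 0.01%, Accuracy 0.05%, FNR −5.06%) agree with the DoS, 0.9, 11 bits row of Table 4. Negative values for UR and FNR are therefore improvements, since lower is better for both.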

Table 5. Program running time.

Time (s)	DoS	Fuzzy	Gear	RPM
16 × 16 sequence	146.46	177.81	168.22	164.01
4 × 4 Mosaic 4 × 4 data grid	189.47	199.76	222.33	220.94
6 × 6 Mosaic 4 × 4 data grid	231.28	240.61	252.27	261.70
8 × 8 Mosaic 4 × 4 data grid	218.53	225.06	240.57	243.01
29 × 29 sequence	295.78	289.87	313.68	310.51
4 × 4 Mosaic 6 × 6 data grid	321.08	343.74	350.75	362.31
6 × 6 Mosaic 6 × 6 data grid	416.07	420.12	454.92	462.13
8 × 8 Mosaic 6 × 6 data grid	384.99	392.84	420.24	422.08

Table 6. Cross-validation accuracy of the model.

Test	DoS	Fuzzy	Gear	RPM
DoS	0.99560	0.69329	0.70132	0.69560
Fuzzy	0.65404	0.99439	0.84149	0.66090
Gear	0.56274	0.99775	0.99781	0.57369
RPM	0.54946	0.55860	0.54647	0.99711

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Hu, R.; Wu, Z.; Xu, Y.; Lai, T. Vehicular-Network-Intrusion Detection Based on a Mosaic-Coded Convolutional Neural Network. Mathematics 2022, 10, 2030. https://0-doi-org.brum.beds.ac.uk/10.3390/math10122030

