Article
Peer-Review Record

Using Autoencoders for Anomaly Detection and Transfer Learning in IoT

by Chin-Wei Tien 1, Tse-Yung Huang 1, Ping-Chun Chen 1 and Jenq-Haur Wang 2,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 16 June 2021 / Revised: 11 July 2021 / Accepted: 14 July 2021 / Published: 15 July 2021

Round 1

Reviewer 1 Report

The manuscript entitled “Using Autoencoders for Anomaly Detection and Transfer Learning in IoT” focuses on an ML-based approach aimed at identifying IoT devices for anomaly detection purposes.

The idea is quite good, with the added value of using real data collected in an existing target factory.

This notwithstanding, the manuscript needs some improvement along the following lines:

1. The authors limit their analysis to accuracy measures in order to evaluate the performance of the adopted methods. In contrast, they neglect a temporal analysis (or time complexity analysis), which could be critical in revealing attacks (such as DDoS) that are usually effective within a few minutes. Moreover, since IoT devices are involved in the experiment, it would be useful to better elaborate on the potential impacts on the energy/battery of such devices.

2. A deeper comparative analysis on the feature selection methods (not only through the XGBoost algorithm) should be performed.

Along this line, some recent works offer interesting hints:

- “Network Intrusion Detection System using Feature Extraction based on Deep Sparse Autoencoder” (IEEE ICTC conference, 2020);

- “Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection” (IEEE Trans. on Netw. and Serv. Management, 2020);

- “Supervised Feature Selection Techniques in Network Intrusion Detection: a Critical Review” (Elsevier, Engineering Applications of Artificial Intelligence, 2020);

- “Network Intrusion Detection System Using Neural Network and Condensed Nearest Neighbors with Selection of NSL-KDD Influencing Features” (IEEE IoTaIS conference, 2021);

3. The reference section is too sparse and should be improved (possibly with the aforementioned works and others selected by the authors).

Author Response

Q: The manuscript entitled “Using Autoencoders for Anomaly Detection and Transfer Learning in IoT” focuses on an ML-based approach aimed at identifying IoT devices for anomaly detection purposes. The idea is quite good, with the added value of using real data collected in an existing target factory.

 

A: Thanks for your valuable comments and encouragement.

 

Q: This notwithstanding, the manuscript needs some improvement along the following lines:

  1. The authors limit their analysis to accuracy measures in order to evaluate the performance of the adopted methods. In contrast, they neglect a temporal analysis (or time complexity analysis), which could be critical in revealing attacks (such as DDoS) that are usually effective within a few minutes. Moreover, since IoT devices are involved in the experiment, it would be useful to better elaborate on the potential impacts on the energy/battery of such devices.

 

A: Thanks for your valuable comments. We have added a time efficiency analysis for the adopted methods in our experiments (Sec. 4.3).

Also, the proposed method could have a potential impact on the energy consumption of IoT devices. In supervised learning algorithms, energy consumption is largely determined by the training phase, which usually takes longer depending on the amount of training data. Since our model is based on incremental learning, we do not need to re-train the model from scratch; re-training is only needed when we add new device types with different behaviors. This helps reduce the power consumption. The testing phase is a simple classification, which accounts for only a negligible portion of the energy consumption.

A discussion of the potential impacts on the energy consumption of IoT devices has been added in Sec. 5.
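The incremental-learning argument above can be illustrated with a minimal sketch (not the paper's actual model): per-device baseline statistics are updated in place with Welford's algorithm, so each new observation costs O(1) instead of a full re-training pass over all collected data.

```python
# Minimal sketch of the incremental-update idea: per-device running
# statistics are updated in place, so new observations do not require
# re-training from scratch. Illustration only, not the paper's model.

class RunningBaseline:
    """Running mean/variance of a traffic feature (Welford's algorithm)."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        # One cheap O(1) update per observation -- no full re-training pass.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

baseline = RunningBaseline()
for packet_size in [60, 64, 62, 61, 63]:
    baseline.update(packet_size)
print(baseline.mean)  # 62.0
```

The same pattern generalizes to any model whose sufficient statistics admit constant-time updates, which is where the energy saving comes from.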

 

 

Q: 2. A deeper comparative analysis on the feature selection methods (not only through the XGBoost algorithm) should be performed.

Along this line, some recent works offer interesting hints:

- “Network Intrusion Detection System using Feature Extraction based on Deep Sparse Autoencoder” (IEEE ICTC conference, 2020);

- “Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection” (IEEE Trans. on Netw. and Serv. Management, 2020);

- “Supervised Feature Selection Techniques in Network Intrusion Detection: a Critical Review” (Elsevier, Engineering Applications of Artificial Intelligence, 2020);

- “Network Intrusion Detection System Using Neural Network and Condensed Nearest Neighbors with Selection of NSL-KDD Influencing Features” (IEEE IoTaIS conference, 2021);

  3. The reference section is too sparse and should be improved (possibly with the aforementioned works and others selected by the authors).

 

A: Thanks for your helpful suggestions.

Various feature selection methods have different effects on different datasets. Network intrusion packets might demonstrate different characteristics from anomalous packets in an IoT environment. From our previous experience, gradient-boosted decision trees (GBDT, or XGBoost in particular) perform very well on network intrusion detection. Thus, XGBoost was used for device identification in our experiments, since it is an efficient open-source implementation that is suitable for network intrusion detection. Then, in anomaly detection, since the number of features in Modbus/TCP packets is small, we simply compare the performance with different sets of feature combinations.

We have added references on feature selection issues in network intrusion detection and anomaly detection.
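As a cross-check on an embedded method such as XGBoost's importances, a simple filter-style ranking can be sketched. The example below scores each feature by the absolute Pearson correlation between its column and the binary label; the feature names and toy data are purely illustrative, not from the paper.

```python
# Sketch of filter-style feature ranking as an alternative criterion to
# XGBoost's embedded importances. All names and data are illustrative.
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def rank_features(columns, labels):
    """Return feature names sorted by |correlation with the label|, best first."""
    scores = {name: abs(pearson(col, labels)) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy traffic features: packet length tracks the label, TTL is constant.
features = {
    "pkt_len": [60, 60, 61, 1400, 1450, 1500],
    "ttl":     [64, 64, 64, 64, 64, 64],
}
labels = [0, 0, 0, 1, 1, 1]
print(rank_features(features, labels))  # ['pkt_len', 'ttl']
```

Comparing several such criteria against the embedded ranking is one inexpensive way to address the reviewer's request for a deeper feature-selection analysis.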

Reviewer 2 Report

The paper is written in clear and understandable language. The problem of Using Autoencoders for Anomaly Detection and Transfer Learning in IoT is very topical. The problem of system security is taken more and more seriously as the number and types of attacks are constantly increasing. The article is interesting, describing in detail the analysis of a supervised learning approach to device type identification and the use of autoencoders for anomaly detection and transfer learning in IoT. A very interesting problem and an interesting presentation.

Further investigation is needed to evaluate the proposed method with different types of devices and network attacks in different IoT scenarios. It's a good idea to add data flow diagrams. New IoT applications will create new traffic patterns, and we live in a world where new applications and devices come to the market rapidly. New valid and legitimate applications may create sudden changes, which is a huge challenge for anomaly detection.

 

A few issues need to be clarified:

The device identification module then utilizes supervised learning methods to train models from network traffic to distinguish among devices. Which training/test proportions are best? The authors point out that one-class SVM, Isolation Forest, and autoencoders were selected based on the literature. Have the authors verified these parameters? Are they really optimal?

The best accuracy of 99.4% can be obtained for decision trees, random forests, and artificial neural networks (ANN). What other types of devices do the authors consider for the future besides the ones mentioned (computer, controller, robotic arm)? What will be the impact of changing the nature of the traffic?

What range can a different type of traffic have? Is encrypted traffic taken into account? Please explain. Would other proportions influence the obtained results? However, to show this and convince the reader, two things must be verified: (1) if there is an anomaly, are there really sudden deviations? (2) if there is no anomaly, are you sure there are no deviations?

 

With 10% for validation, the dataset is divided into 18,000 events as the training set, 2,000 as the validation set, and 1,447 as the test set. To better learn the classifiers, we used 110-dimensional features. The parameters of the neural network include: ELU (Exponential Linear Unit) as the activation function, the Adam optimizer, 150 epochs, and a batch size of 32. In the experiment with three dense layers, the final test accuracy of the neural network is 92.8%. How were the optimal parameters chosen?

 

The definition of Sites A and B is unclear. Is it possible to accurately describe and visualize these collections?

Did the collected data and tests take into account attacks from within (e.g., human factor, operator error, sabotage, etc.)? What types of attacks were covered? Or only DoS, reconnaissance, worms, and backdoors? What about the rest? Do they not matter? What is their percentage share of the total?

 

The Modbus over TCP/IP protocol and Modbus/TCP are included. How will the results be influenced by adding protocols such as EtherCAT, Profibus, Profinet, or EtherNet/IP? Is it possible to fully implement the network in a real system based only on the protocols selected in the article? Will the traffic signatures change?

 

 

The authors show a good knowledge of the state of the art in the subject area. They describe some of the current and future challenges for research. The paper is correct, quite detailed, and presented in a proper way. The paper does not carry signs of plagiarism or regurgitate old and well-proven thoughts.

The paper will be very interesting to the readers of the journal after minor correction. The research topic is current and should be further developed.

Author Response

Q: The paper is written in clear and understandable language. The problem of Using Autoencoders for Anomaly Detection and Transfer Learning in IoT is very topical. The problem of system security is taken more and more seriously as the number and types of attacks are constantly increasing. The article is interesting, describing in detail the analysis of a supervised learning approach to device type identification and the use of autoencoders for anomaly detection and transfer learning in IoT. A very interesting problem and an interesting presentation.

 

A: Thanks for your valuable comments and encouragement. We are trying to make our paper interesting and helpful to the research community.

 

 

Q: Further investigation is needed to evaluate the proposed method with different types of devices and network attacks in different IoT scenarios. It's a good idea to add data flow diagrams. New IoT applications will create new traffic patterns, and we live in a world where new applications and devices come to the market rapidly. New valid and legitimate applications may create sudden changes, which is a huge challenge for anomaly detection.

 

A: Thanks for your valuable comments and encouragement. We have added a data flow diagram for the proposed method in Sec. 3. Due to the current limitation of our experimental sites in a smart factory, the types of devices are limited. Since there could be fast changes in new devices and applications in IoT, in our future work we plan to expand our experiments to more device types and network protocols for IoT applications.

 

 

Q: A few issues need to be clarified:

The device identification module then utilizes supervised learning methods to train models from network traffic to distinguish among devices. Which training/test proportions are best? The authors point out that one-class SVM, Isolation Forest, and autoencoders were selected based on the literature. Have the authors verified these parameters? Are they really optimal?

 

A: Thanks very much for your comments. Our proposed method consists of two parts: supervised learning for device identification, and unsupervised learning for anomaly detection. For supervised learning, we adopted the standard convention of dividing data instances into training, validation, and test sets. In the training phase, we need an appropriate number of training instances to learn the model.

In our experiments, since the real-world data collected from IoT devices are limited, we empirically determined the ratio for training and test.

For unsupervised learning with the three different methods (one-class SVM, Isolation Forest, and autoencoders), the parameters were determined from our observations in preliminary experiments. Although the parameters might not be optimal, we have tested and verified their effects on the performance of anomaly detection.
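For the autoencoder variant, one key parameter is the decision threshold on reconstruction error, typically learned from normal validation traffic. The sketch below uses the conventional mean + 3·std rule with made-up error values; it illustrates the mechanism, not the paper's exact parameters.

```python
# Sketch of the common autoencoder-style decision rule: fit a threshold
# on reconstruction errors from normal validation traffic, then flag
# test samples whose error exceeds it. Error values are illustrative.
import statistics

def fit_threshold(val_errors, k=3.0):
    """Threshold at mean + k * std of normal-traffic reconstruction errors."""
    mu = statistics.mean(val_errors)
    sigma = statistics.stdev(val_errors)
    return mu + k * sigma

def is_anomaly(error, threshold):
    return error > threshold

# Reconstruction errors observed on (assumed-normal) validation traffic.
val_errors = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.11, 0.12]
threshold = fit_threshold(val_errors)
print(is_anomaly(0.95, threshold))  # True: far above normal error
print(is_anomaly(0.11, threshold))  # False: within the normal range
```

The multiplier k trades false alarms against missed detections, which is one of the parameters that preliminary experiments would tune.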

 

 

Q: The best accuracy of 99.4% can be obtained for decision trees, random forests, and artificial neural networks (ANN). What other types of devices do the authors consider for the future besides the ones mentioned (computer, controller, robotic arm)? What will be the impact of changing the nature of the traffic?

 

A: Thanks for the comments. The accuracy was reported in reference [4]. It is based on the DS2OS dataset available on Kaggle, which consists of synthetic data for microservices communicating via the MQTT protocol in a virtual IoT environment. In our experiments, we used real-world data collected from real IoT devices in a smart factory. In the future, we plan to include more diverse device types in addition to computers, controllers, and robotic arms. For example, in different IoT scenarios such as smart homes or smart healthcare, we might consider environmental sensors for air quality, or tele-homecare using cameras or medical monitors for vital signs, to name a few. Since the behaviors of these devices could be very different, the traffic generated is also very different. Also, various network protocols in different formats could carry different traffic. These factors affect the features for supervised and unsupervised learning for anomaly detection.

 

 

Q: What range can a different type of traffic have? Is encrypted traffic taken into account? Please explain. Would other proportions influence the obtained results? However, to show this and convince the reader, two things must be verified: (1) if there is an anomaly, are there really sudden deviations? (2) if there is no anomaly, are you sure there are no deviations?

 

A: Thanks for your comments. The value range of traffic depends on the types of protocols and devices in IoT. Since data encryption might scramble the original content, we do not take encrypted traffic into account. In fact, any feature extracted from the data fields could influence the models learned by machine learning. In a single packet, a sudden deviation could reflect a possible anomaly or a rare event that is normal. Also, anomalous packets could masquerade as normal ones by imitating normal behaviors. To effectively distinguish between anomalous and normal events, we choose to learn the aggregate behaviors from multiple packets. In practice, this covers most ordinary cases, although misclassification is still possible. Since our model is based on incremental learning, a single sudden deviation will not greatly affect the detection.
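The aggregate-behavior idea can be sketched as a sliding-window summary over per-packet features, so one outlier packet is diluted by its neighbors rather than triggering a decision on its own. Field names and the window size below are illustrative assumptions, not the paper's actual configuration.

```python
# Sketch of aggregating per-packet features over a window, so a single
# outlier packet does not dominate the decision. Illustrative only.

def window_features(packets, window=4):
    """Aggregate the last `window` packets into summary statistics."""
    recent = packets[-window:]
    sizes = [p["size"] for p in recent]
    # Inter-arrival times between consecutive packets in the window.
    intervals = [b["t"] - a["t"] for a, b in zip(recent, recent[1:])]
    return {
        "count": len(recent),
        "mean_size": sum(sizes) / len(sizes),
        "mean_interval": sum(intervals) / len(intervals) if intervals else 0.0,
    }

packets = [
    {"t": 0.0, "size": 60},
    {"t": 0.5, "size": 64},
    {"t": 1.0, "size": 60},
    {"t": 1.5, "size": 1500},  # one unusually large packet
]
agg = window_features(packets)
print(agg["mean_size"])  # 421.0 -- the outlier is averaged with normal packets
```

A downstream detector would then classify the window-level vector instead of individual packets, which is what makes single spurious deviations less influential.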

 

 

Q: With 10% for validation, the dataset is divided into 18,000 events as the training set, 2,000 as the validation set, and 1,447 as the test set. To better learn the classifiers, we used 110-dimensional features. The parameters of the neural network include: ELU (Exponential Linear Unit) as the activation function, the Adam optimizer, 150 epochs, and a batch size of 32. In the experiment with three dense layers, the final test accuracy of the neural network is 92.8%. How were the optimal parameters chosen?

 

A: For machine learning methods, the parameters can vary among different datasets. In this paper, the parameters were determined from our observations in preliminary experiments, where we chose the ones with better performance. Although the parameters might not be optimal, we have tested and verified their effects on the performance of anomaly detection.
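For concreteness, here is a pure-Python sketch of the ELU activation mentioned above and a single dense unit's forward pass. The weights and inputs are toy values; the actual model is trained with Adam for 150 epochs at batch size 32, which this sketch does not reproduce.

```python
# Sketch of the ELU activation and one dense unit's forward pass.
# Toy values only; not the trained 110-input, three-dense-layer model.
import math

def elu(x, alpha=1.0):
    """Exponential Linear Unit: identity for x > 0, smooth saturation below."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def dense_forward(inputs, weights, bias):
    """One dense unit: weighted sum plus bias, passed through ELU."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return elu(z)

print(elu(2.0))                                      # 2.0 (passes through)
print(round(elu(-1.0), 4))                           # -0.6321, i.e. exp(-1) - 1
print(dense_forward([1.0, -2.0], [0.5, 0.25], 0.1))  # elu(0.0 + 0.1) = 0.1
```

Unlike ReLU, ELU has nonzero gradient for negative inputs, which is one common reason to prefer it in preliminary tuning.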

 

 

Q: The definition of Sites A and B is unclear. Is it possible to accurately describe and visualize these collections?

 

A: Thanks for your comments. We have elaborated on the characteristics of the two sites.

 

 

Q: Did the collected data and tests take into account attacks from within (e.g., human factor, operator error, sabotage, etc.)? What types of attacks were covered? Or only DoS, reconnaissance, worms, and backdoors? What about the rest? Do they not matter? What is their percentage share of the total?

 

A: Thanks for your valuable comments. There are many possible types of attacks considered in current network intrusion detection research. However, some attacks might only be applicable to ordinary computers with specific operating systems installed. Since there are many more devices in IoT, more potential attacks are possible in such environments. However, most of the currently known attacks in IoT involve ransomware, disrupting normal operations, and unauthorized access to critical data. It is still unclear whether the other types of attacks are feasible in IoT, since there is no public attack dataset available. In our experiments, we only focus on simulated attack traffic in Modbus over TCP, including deliberate errors in function codes and values in normal commands. Insider attacks and operator errors remain issues to be resolved in the future.

 

 

Q: The Modbus over TCP/IP protocol and Modbus/TCP are included. How will the results be influenced by adding protocols such as EtherCAT, Profibus, Profinet, or EtherNet/IP? Is it possible to fully implement the network in a real system based only on the protocols selected in the article? Will the traffic signatures change?

 

A: Thanks for your valuable comments. Different protocols transmit packets in different formats with different expressiveness, so the traffic signatures will certainly change. By adding more protocols to the system, we would need to learn different features that are more complex and very different from those of the Modbus protocols. However, in a typical IoT environment, it is not practical to implement all possible protocols, since only a limited number of working protocols are supported by these devices and applications. Other protocols can be ignored since they are usually not used in these IoT applications; if devices give no responses on a protocol, it offers no attack surface from the outside.

 

 

Q: The authors show a good knowledge of the state of the art in the subject area. They describe some of the current and future challenges for research. The paper is correct, quite detailed, and presented in a proper way. The paper does not carry signs of plagiarism or regurgitate old and well-proven thoughts.

The paper will be very interesting to the readers of the journal after minor correction. The research topic is current and should be further developed.

 

A: Thanks for your valuable comments and encouragement. We plan to further explore related issues in the future.

Reviewer 3 Report

There are two questions and suggestions about the paper:

  1. It is unclear how Modbus packets were mapped to the input of the classifier.
  2. It has not been confirmed that the simulated attacks are based on real-world attack models (lines 293-299).

Author Response

Q: There are two questions and suggestions about the paper:

  1. It is unclear how Modbus packets were mapped to the input of the classifier.

 

A: Thanks for your valuable comments. Given a Modbus packet, we directly extract its sequence of bytes as the feature and input it to the classifier. We have added a detailed description of the mapping to the revised manuscript.
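The byte-sequence mapping can be sketched as follows. The example frame is a hand-built Modbus/TCP read-holding-registers request, and the 16-byte padding width is an illustrative assumption, not the paper's actual input dimension.

```python
# Sketch of turning a raw Modbus/TCP frame into a fixed-length feature
# vector of byte values, zero-padded to a constant width. The frame and
# padding width are illustrative, not the paper's actual pipeline.
import struct

def packet_to_features(frame: bytes, width: int = 16):
    """Byte values of the frame, zero-padded/truncated to `width`."""
    vec = list(frame[:width])
    vec += [0] * (width - len(vec))
    return vec

# MBAP header: transaction id=1, protocol id=0, length=6, unit id=1,
# then PDU: function code 0x03 (read holding registers), start=0, count=2.
frame = struct.pack(">HHHB", 1, 0, 6, 1) + struct.pack(">BHH", 3, 0, 2)
features = packet_to_features(frame)
print(len(features))  # 16
print(features[7])    # 3 -- the Modbus function code byte
```

Fixing the vector width this way gives every packet the same input dimension, which is what a standard classifier expects.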

 

 

Q: 2. It has not been confirmed that the simulated attacks are based on real-world attack models (lines 293-299).

A: Thanks for your valuable comments. Most of the currently known attacks in IoT involve ransomware, disrupting normal operations, and unauthorized access to critical data. However, it is still unclear what the real-world attack models in IoT are, since there is no public attack dataset available. In our experiments, we only focus on simulated attack traffic in Modbus over TCP, including deliberate errors in function codes and values in normal commands.

 

Round 2

Reviewer 1 Report

The Authors have satisfied all the raised concerns, and in particular:

 

- They have added some information about time analysis (for example, with training times in Tables 7, 8, 9, and 10)

- Improved the discussion part on IoT devices

- Improved the reference section

 

In my opinion, the paper is ready to be published.
