Data-Driven Cybersecurity and Privacy Analysis

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (30 May 2023) | Viewed by 13748

Special Issue Editors


Guest Editor
School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China
Interests: data-driven security; network security; threat intelligence

Guest Editor
Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Interests: software security; network security

Guest Editor
School of Cyber Science and Engineering, Southeast University, Nanjing 211187, China
Interests: cyberthreat intelligence; information extraction

Special Issue Information

Dear Colleagues,

The rapid growth of cyber attacks and information leaks poses a huge challenge for cyber security. Many researchers have aimed to use the latest techniques for automatically analyzing big data to enable effective defense against emerging threats. Numerous advanced methods, such as natural language processing, information theory, and artificial intelligence techniques, have been widely used in specific areas. However, in the face of rapidly evolving techniques and new situations, many issues remain unsolved, such as network attack detection, threat intelligence extraction, malware classification, and attack attribution.

Therefore, this Special Issue aims to explore new approaches and perspectives on data-driven cybersecurity and privacy analysis topics. This Special Issue will focus on (but is not limited to) the following topics:

  • Network attack intelligence detection;
  • Cyberthreat intelligence extraction;
  • System or mobile malware analysis and classification;
  • Network malicious traffic detection and analysis;
  • System or network attack attribution;
  • Vulnerability mining and analysis;
  • Privacy data leak detection and classification;
  • Social network intelligence analysis.

Dr. Cheng Huang
Dr. Weina Niu
Dr. Wang Yang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data-driven security
  • network security
  • system security
  • privacy security
  • social network intelligence

Published Papers (7 papers)

Research

20 pages, 3142 KiB  
Article
Enhanced Intrusion Detection with LSTM-Based Model, Feature Selection, and SMOTE for Imbalanced Data
by Hussein Ridha Sayegh, Wang Dong and Ali Mansour Al-madani
Appl. Sci. 2024, 14(2), 479; https://0-doi-org.brum.beds.ac.uk/10.3390/app14020479 - 05 Jan 2024
Cited by 2 | Viewed by 1192
Abstract
This study introduces a sophisticated intrusion detection system (IDS) that has been specifically developed for internet of things (IoT) networks. By utilizing the capabilities of long short-term memory (LSTM), a deep learning model renowned for its proficiency in modeling sequential data, our intrusion detection system (IDS) effectively discerns between regular network traffic and potential malicious attacks. In order to tackle the issue of imbalanced data, which is a prevalent concern in the development of intrusion detection systems (IDSs), we have integrated the synthetic minority over-sampling technique (SMOTE) into our approach. This incorporation allows our model to accurately identify infrequent incursion patterns. The rebalancing of the dataset is accomplished by SMOTE through the generation of synthetic samples belonging to the minority class. Various strategies, such as the utilization of generative adversarial networks (GANs), have been put forth in order to tackle the issue of data imbalance. However, SMOTE (synthetic minority over-sampling technique) presents some distinct advantages when applied to intrusion detection. The SMOTE is characterized by its simplicity and proven efficacy across diverse areas, including in intrusion detection. The implementation of this approach is straightforward and does not necessitate intricate adversarial training techniques such as generative adversarial networks (GANs). The interpretability of SMOTE lies in its ability to generate synthetic samples that are aligned with the properties of the original data, rendering it well suited for security applications that prioritize transparency. The utilization of SMOTE has been widely embraced in the field of intrusion detection research, demonstrating its effectiveness in augmenting the detection capacities of intrusion detection systems (IDSs) in internet of things (IoT) networks and reducing the consequences of class imbalance. 
This study conducted a thorough assessment of three commonly utilized public datasets, namely, CICIDS2017, NSL-KDD, and UNSW-NB15. The findings indicate that our LSTM-based intrusion detection system (IDS), in conjunction with the implementation of SMOTE to address data imbalance, outperforms existing methodologies in accurately detecting network intrusions. The findings of this study provide significant contributions to the domain of internet of things (IoT) security, presenting a proactive and adaptable approach to safeguarding against advanced cyberattacks. Through the utilization of LSTM-based deep learning techniques and the mitigation of data imbalance using SMOTE, our AI-driven intrusion detection system (IDS) enhances the security of internet of things (IoT) networks, hence facilitating the wider implementation of IoT technologies across many industries.
(This article belongs to the Special Issue Data-Driven Cybersecurity and Privacy Analysis)
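The SMOTE rebalancing step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration, and only NumPy is used. Each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import numpy as np


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by neighbour interpolation.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # a sample cannot be its own neighbour

    # Pairwise Euclidean distances between minority samples.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # exclude self-matches
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample and one of its k nearest neighbours at random.
    base = rng.integers(0, n, size=n_synthetic)
    picked = neighbours[base, rng.integers(0, k, size=n_synthetic)]

    # Interpolate: base + gap * (neighbour - base), with gap in [0, 1).
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[picked] - minority[base])


# Toy minority class: the four corners of the unit square.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_sketch(corners, n_synthetic=10)
```

Because every synthetic point is an interpolation between two real minority samples, the generated data stays inside the region the minority class already occupies, which is the property the abstract refers to as alignment with the original data.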

15 pages, 2399 KiB  
Article
Machine-Learning-Based Password-Strength-Estimation Approach for Passwords of Lithuanian Context
by Ema Darbutaitė, Pavel Stefanovič and Simona Ramanauskaitė
Appl. Sci. 2023, 13(13), 7811; https://0-doi-org.brum.beds.ac.uk/10.3390/app13137811 - 03 Jul 2023
Cited by 1 | Viewed by 2747
Abstract
In an information-security-assurance system, humans are usually the weakest link. This is partly related to insufficient cybersecurity knowledge and the ignorance of standard security recommendations. Consequently, password-strength requirements in information systems are the minimum of what can be done to ensure system security. Therefore, it is important to use up-to-date and context-sensitive password-strength-estimation systems. However, minor languages are ignored, and password strength is usually estimated using English-only dictionaries. To change the situation, a machine learning approach was proposed in this article to support a more realistic model to estimate the strength of Lithuanian user passwords. A newly compiled dataset of password strength was produced. It integrated both international- and Lithuanian-language-specific passwords, including 6 commonly used password features and 36 similarity metrics for each item (4 similarity metrics for 9 different dictionaries). The proposed solution predicts the password strength of five classes with 77% accuracy. Taking into account the complexity of the Lithuanian language, the achieved result is adequate, as intelligent Lithuanian-language-specific password-cracking tools are not yet widely available.
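As a rough illustration of the kind of input such a model might consume, the sketch below derives a few character-count features and one dictionary-similarity score for a password. The abstract does not list the actual 6 features or the 4 similarity metrics, so everything here is an assumption: the function name `password_features`, the chosen features, the tiny word list, and the use of `difflib.SequenceMatcher` as a stand-in similarity metric.

```python
from difflib import SequenceMatcher


def password_features(password, dictionary):
    """Return illustrative features for one password.

    Hypothetical feature set; the paper's own features are not specified
    in the abstract.
    """
    features = {
        "length": len(password),
        "lowercase": sum(c.islower() for c in password),
        "uppercase": sum(c.isupper() for c in password),
        "digits": sum(c.isdigit() for c in password),
        "symbols": sum(not c.isalnum() for c in password),
    }
    # Stand-in similarity metric: best match ratio against any dictionary word.
    lowered = password.lower()
    features["max_dictionary_similarity"] = max(
        (SequenceMatcher(None, lowered, word.lower()).ratio() for word in dictionary),
        default=0.0,
    )
    return features


# Hypothetical language-specific word list.
lithuanian_words = ["vilnius", "kaunas"]
example = password_features("Vilnius2023!", lithuanian_words)
```

A password built around a word from a language-specific dictionary scores a high similarity here even though an English-only dictionary would miss it, which is the gap the article sets out to address.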

32 pages, 1465 KiB  
Article
Unknown Traffic Recognition Based on Multi-Feature Fusion and Incremental Learning
by Junyi Liu, Jiarong Wang, Tian Yan, Fazhi Qi and Gang Chen
Appl. Sci. 2023, 13(13), 7649; https://0-doi-org.brum.beds.ac.uk/10.3390/app13137649 - 28 Jun 2023
Viewed by 1089
Abstract
Accurate classification and identification of Internet traffic are crucial for maintaining network security. However, unknown network traffic in the real world can affect the accuracy of current machine learning models, reducing the efficiency of traffic classification. Existing unknown traffic classification algorithms are unable to optimize traffic features and require the entire system to be retrained each time new traffic data are collected. This results in low recognition efficiency, making the algorithms unsuitable for real-time application detection. To solve the above issues, we suggest a multi-feature fusion-based incremental technique for detecting unknown traffic in this paper. The approach employs a multiple-channel parallel architecture to extract temporal and spatial traffic features. It then uses the mRMR algorithm to rank and fuse the features extracted from each channel to overcome the issue of redundant encrypted traffic features. In addition, we combine the density-ratio-based clustering algorithm to identify the unknown traffic features and update the model via incremental learning. The classifier enables real-time classification of known and unknown traffic by learning newly acquired class knowledge. Our model can identify encrypted unknown Internet traffic with at least 86% accuracy in various scenarios, using the public ISCX-VPN-Tor datasets. Furthermore, it achieves 90% accuracy on the intrusion detection dataset NSL-KDD. In our self-collected dataset from a real-world environment, the accuracy of our model exceeds 96%. This work offers a novel method for identifying unknown network traffic, contributing to the security preservation of network environments.
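The mRMR (minimum redundancy, maximum relevance) ranking mentioned in the abstract can be sketched as a greedy selection loop. This is a simplified illustration rather than the authors' pipeline: mRMR is usually defined with mutual information, whereas the sketch below substitutes absolute Pearson correlation to stay dependency-light, and the function name `mrmr_rank` is an assumption.

```python
import numpy as np


def mrmr_rank(X, y, n_select):
    """Greedy mRMR-style ranking using |Pearson correlation| as a proxy.

    Hypothetical helper; correlation replaces mutual information here.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]

    # Relevance: how strongly each feature tracks the target.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    # Redundancy source: feature-to-feature correlation matrix.
    feature_corr = np.abs(np.corrcoef(X, rowvar=False))

    selected = [int(np.argmax(relevance))]
    while len(selected) < min(n_select, n_features):
        remaining = [j for j in range(n_features) if j not in selected]
        # Score = relevance minus mean redundancy with already chosen features.
        scores = [relevance[j] - feature_corr[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected
```

Given two near-duplicate features and one independent feature that are all equally relevant, the loop picks one of the duplicates and then prefers the independent feature, since the second duplicate is penalized for redundancy.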

16 pages, 2719 KiB  
Article
Dynamic Malware Analysis Based on API Sequence Semantic Fusion
by Sanfeng Zhang, Jiahao Wu, Mengzhe Zhang and Wang Yang
Appl. Sci. 2023, 13(11), 6526; https://0-doi-org.brum.beds.ac.uk/10.3390/app13116526 - 26 May 2023
Cited by 2 | Viewed by 1958
Abstract
The existing dynamic malware detection methods based on API call sequences ignore the semantic information of functions. Simply mapping APIs to numerical values does not reflect whether a function has performed a query or modification operation, or whether it is related to network communication, the file system, or other factors. Additionally, the detection performance is limited when the size of the API call sequence is too large. To address this issue, we propose Mal-ASSF, a novel malware detection model that fuses the semantic and sequence features of the API calls. The API2Vec embedding method is used to obtain the dimensionality reduction representation of the API function. To capture the behavioral features of sequential segments, Balts is used to extract the features. To leverage the implicit semantic information of the API functions, the operation and the type of resource operated by the API functions are extracted. These semantic and sequential features are then fused and processed by the attention-related modules. In comparison with the existing methods, Mal-ASSF boasts superior capabilities in terms of semantic representation and recognition of critical sequences within API call sequences. According to the evaluation with a dataset of malware families, the experimental results show that Mal-ASSF outperforms existing solutions by 3% to 5% in detection accuracy.

13 pages, 3406 KiB  
Article
A Robust Adversarial Example Attack Based on Video Augmentation
by Mingyong Yin, Yixiao Xu, Teng Hu and Xiaolei Liu
Appl. Sci. 2023, 13(3), 1914; https://0-doi-org.brum.beds.ac.uk/10.3390/app13031914 - 01 Feb 2023
Viewed by 1590
Abstract
Despite the success of learning-based systems, recent studies have highlighted video adversarial examples as a ubiquitous threat to state-of-the-art video classification systems. Video adversarial attacks add subtle noise to the original example, resulting in a false classification result. Thorough studies on how to generate video adversarial examples are essential to prevent potential attacks. Despite much research on this topic, work on the robustness of video adversarial examples is still limited. To generate highly robust video adversarial examples, we propose a video-augmentation-based adversarial attack (v3a), focusing on the video transformations to reinforce the attack. Further, we investigate different transformations as parts of the loss function to make the video adversarial examples more robust. The experiment results show that our proposed method outperforms other adversarial attacks in terms of robustness. We hope that our study encourages a deeper understanding of adversarial robustness in video classification systems with video augmentation.

18 pages, 1037 KiB  
Article
A Cybersecurity Knowledge Graph Completion Method Based on Ensemble Learning and Adversarial Training
by Peng Wang, Jingju Liu, Dongdong Hou and Shicheng Zhou
Appl. Sci. 2022, 12(24), 12947; https://0-doi-org.brum.beds.ac.uk/10.3390/app122412947 - 16 Dec 2022
Cited by 5 | Viewed by 2082
Abstract
The application of cybersecurity knowledge graphs is attracting increasing attention. However, many cybersecurity knowledge graphs are incomplete due to the sparsity of cybersecurity knowledge. Existing knowledge graph completion methods do not perform well on domain knowledge, and they are not robust enough to noisy data. To address these challenges, in this paper we develop a new knowledge graph completion method called CSEA based on ensemble learning and adversarial training. Specifically, we integrate a variety of projection and rotation operations to model the relationships between entities, and use angular information to distinguish entities. A cooperative adversarial training method is designed to enhance the generalization and robustness of the model. We combine the method of generating perturbations for the embedding layers with the self-adversarial training method. The UCB (upper confidence bound) multi-armed bandit method is used to select the perturbations of the embedding layer. This achieves a balance between perturbation diversity and maximum loss. To this end, we build a cybersecurity knowledge graph based on the CVE, CWE, and CAPEC cybersecurity databases. Our experimental results demonstrate the superiority of our proposed model for completing cybersecurity knowledge graphs.
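The UCB selection idea mentioned in the abstract can be illustrated with the classic UCB1 rule, in which each arm stands for one candidate choice (in the paper's setting, a perturbation of the embedding layer). This is a generic sketch and not the CSEA implementation: the function names, the fixed toy rewards, and the choice of the UCB1 variant are assumptions.

```python
import math


def ucb1_select(counts, totals, t):
    """Pick the arm with the highest upper confidence bound at round t."""
    # Try every arm once before applying the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        totals[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(reward_fn, n_arms, rounds):
    """Run UCB1 for a number of rounds and return pull counts per arm."""
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, t)
        counts[arm] += 1
        totals[arm] += reward_fn(arm)
    return counts


# Hypothetical fixed rewards per arm, e.g. the loss induced by a perturbation.
toy_rewards = [0.1, 0.9, 0.5]
pulls = run_bandit(lambda arm: toy_rewards[arm], n_arms=3, rounds=300)
```

The square-root bonus shrinks as an arm is pulled more often, so rarely tried arms keep being revisited (diversity) while the arm with the highest average reward receives most pulls (maximum loss, in the abstract's terms).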

16 pages, 518 KiB  
Article
SATFuzz: A Stateful Network Protocol Fuzzing Framework from a Novel Perspective
by Zulie Pan, Liqun Zhang, Zhihao Hu, Yang Li and Yuanchao Chen
Appl. Sci. 2022, 12(15), 7459; https://0-doi-org.brum.beds.ac.uk/10.3390/app12157459 - 25 Jul 2022
Viewed by 1927
Abstract
Stateful network protocol fuzzing is one of the essential means for ensuring network communication security. However, the existing methods have problems, including frequent auxiliary message interaction, no in-depth state-space exploration, and high shares of invalid interaction time. To this end, we propose SATFuzz, a stateful network protocol fuzzing framework. SATFuzz first prioritizes the states identified by the status codes in response messages, then randomly selects a state to test among the high-priority states, and determines its corresponding optimal test sequence, which is composed of the minimum pre-lead sequence, the test case, and the fittest post-end sequence. Finally, SATFuzz uses a quasi-recurrent neural network (QRNN) to filter the test cases before performing interaction, and only the optimal test sequence, including the valid test case, can be fed to the protocol entity. To verify the proposed framework, we conduct extensive experiments with the state-of-the-art fuzzer on two popular protocols. The results show that the vulnerability discovery efficiency of the proposed approach increases by at least 1.48 times (at most by 3.06 times), making it superior to the rival methods. This not only confirms the effectiveness of SATFuzz in terms of improving the vulnerability discovery efficiency but also shows that SATFuzz has significant advantages.