Artificial Intelligence and Machine Learning in Software Engineering

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 April 2022) | Viewed by 39543

Special Issue Editor


Guest Editor
Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, Morro do Lena-Alto do Vieiro, Apartado 4163, 2411-901 Leiria, Portugal
Interests: mobile computing; search-based software engineering; genetic programming; context-aware systems

Special Issue Information

Dear Colleagues,

The next decade will bring technological advances that present numerous challenges and opportunities for the software engineering (SE) discipline. The application of artificial intelligence and machine learning (AI/ML) techniques to SE has been extensively studied, but many issues and areas of application remain open for investigation. The growing ability of machines to learn and act intelligently will continue to drive the transformation of the world in which we live. Forbes identifies AI/ML as the leading strategic technology trend for the next decade, and IDC indicates that, even with the effects of the pandemic being felt globally, growth in this area is expected to continue accelerating over the next five years. There is an urgent need to explore the opportunities created by AI/ML to leverage research in SE, as software and application vendors should continue incorporating AI/ML as a differentiating aspect to increase adoption and deliver benefits to their customers and users, as well as to improve return on investment and achieve cost savings.

Search-based software engineering (SBSE) seeks to reformulate SE challenges as search-based optimization problems, typically through the application of evolutionary algorithms or ML. It has already been applied to a wide variety of areas in SE, including requirements engineering, project planning and cost estimation, automated maintenance, and quality assessment. Most of the literature on SBSE is, however, dedicated to applications related to software testing, with the generation of test data being the most studied topic. Leveraging AI to automate the processes of creating, executing, and verifying software tests remains critical for improving the quality of the complex software systems that have become the norm in modern society.
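To make the search-based test data generation idea concrete, the following minimal sketch (an illustration only, not drawn from any paper in this issue) uses a simple genetic algorithm to evolve an integer input that covers a hard-to-reach branch. The function under test, the branch condition, and all parameter values are hypothetical; fitness is the classic branch distance, which shrinks as an input gets closer to satisfying the target condition:

```python
import random

def branch_distance(x):
    # Hypothetical target branch: "if x == 4242" in the function under test.
    # The branch distance is smaller the closer the input is to taking it.
    return abs(x - 4242)

def evolve(pop_size=50, generations=200, lo=0, hi=10000, seed=1):
    """Evolve an input that minimizes the branch distance."""
    rng = random.Random(seed)
    population = [rng.randint(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=branch_distance)
        if branch_distance(population[0]) == 0:
            break  # branch covered: the search goal is reached
        survivors = population[: pop_size // 2]  # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = (a + b) // 2                 # crossover: average of parents
            if rng.random() < 0.3:               # mutation: small random step
                child += rng.randint(-10, 10)
            children.append(min(max(child, lo), hi))
        population = survivors + children
    return min(population, key=branch_distance)

best = evolve()
```

Real search-based test generators replace the toy fitness above with instrumented branch distances and approach levels measured on the program under test, but the select-recombine-mutate loop is the same.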

The application of ML techniques also has enormous potential in the area of process mining and for improving the quality of processes in an enterprise environment, particularly in the context of DevOps practices. The intricate software development ecosystems of modern enterprises, with emphasis on continuous integration/continuous delivery (CI/CD) pipelines, include several sources that can help in understanding the quality of processes and products—such as source code repositories and version control systems, static analysis tools, continuous integration servers, testing tools, bug tracking systems, and incident management tools. However, to achieve this goal, it is necessary to systematically collect, store, and analyze information; raw data can then, through automated ML and data mining mechanisms, yield valuable information capable of providing insight, supporting strategies for identifying and even predicting defects and failures, and allowing teams to prioritize efforts through alert and recommendation mechanisms.
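As a minimal sketch of the collect-then-prioritize pipeline described above (the module names, event fields, and weights are all illustrative assumptions, not part of any tool in this issue), the following code aggregates raw events from version control, CI, and bug tracking into per-module features and ranks modules by a hand-set risk score; in practice, a trained ML model would replace the fixed weights:

```python
from collections import defaultdict

# Raw events as they might be collected from a version control system,
# a CI server, and a bug tracker (all names and values are illustrative).
events = [
    {"module": "auth",    "source": "vcs",  "churn": 120},
    {"module": "auth",    "source": "ci",   "failed": True},
    {"module": "auth",    "source": "bugs", "severity": 3},
    {"module": "billing", "source": "vcs",  "churn": 15},
    {"module": "billing", "source": "ci",   "failed": False},
    {"module": "search",  "source": "vcs",  "churn": 300},
    {"module": "search",  "source": "ci",   "failed": True},
    {"module": "search",  "source": "bugs", "severity": 1},
]

def aggregate(events):
    """Systematically collect per-module features from heterogeneous sources."""
    feats = defaultdict(lambda: {"churn": 0, "ci_failures": 0, "bug_severity": 0})
    for e in events:
        f = feats[e["module"]]
        if e["source"] == "vcs":
            f["churn"] += e["churn"]
        elif e["source"] == "ci":
            f["ci_failures"] += int(e["failed"])
        elif e["source"] == "bugs":
            f["bug_severity"] += e["severity"]
    return dict(feats)

def prioritize(feats, w_churn=0.01, w_fail=2.0, w_sev=1.0):
    """Rank modules by a weighted risk score so teams can focus effort first."""
    def score(f):
        return w_churn * f["churn"] + w_fail * f["ci_failures"] + w_sev * f["bug_severity"]
    return sorted(feats, key=lambda m: score(feats[m]), reverse=True)

ranking = prioritize(aggregate(events))
```

The point of the sketch is the shape of the pipeline: disparate raw logs are first normalized into a per-module feature table, and only then does a scoring or prediction step produce alerts and a prioritized work list.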

Prof. José Carlos Bregieiro Ribeiro
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • software engineering
  • artificial intelligence and machine learning
  • search-based software engineering
  • search-based test data generation
  • process mining
  • data mining
  • evolutionary testing

Published Papers (12 papers)


Research

19 pages, 604 KiB  
Article
Test Suite Prioritization Based on Optimization Approach Using Reinforcement Learning
by Muhammad Waqar, Imran, Muhammad Atif Zaman, Muhammad Muzammal and Jungsuk Kim
Appl. Sci. 2022, 12(13), 6772; https://doi.org/10.3390/app12136772 - 04 Jul 2022
Cited by 8 | Viewed by 2272
Abstract
Regression testing ensures that modifications to software code have not adversely affected existing code modules. The test suite size increases as the software is modified to meet end-user requirements, and regression testing executes the complete test suite after each update. Re-executing new test cases along with existing test cases is costly. The scientific community has proposed test suite prioritization and minimization techniques to reduce the cost of regression testing. The goal of test suite prioritization is to maximize fault detection with a minimum of test cases, while test suite minimization reduces the test suite size by deleting less critical test cases. In this study, we present a four-fold test suite prioritization methodology based on reinforcement learning. First, the testers’ and users’ log datasets are prepared using the proposed interaction recording systems for Android applications. Second, the proposed reinforcement learning model is used to predict the sequence list with the highest future reward from the data collected in the first step. Third, the proposed prioritization algorithm produces the prioritized test suite. Lastly, a fault seeding approach is used to validate the results with software engineering experts. The proposed reinforcement learning-based test suite optimization model is evaluated on five case study applications. The performance evaluation results show that the proposed mechanism performs better than baselines based on random and t-SANT approaches, proving its importance for regression testing. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

20 pages, 4004 KiB  
Article
Software Defect Prediction Using Stacking Generalization of Optimized Tree-Based Ensembles
by Amal Alazba and Hamoud Aljamaan
Appl. Sci. 2022, 12(9), 4577; https://doi.org/10.3390/app12094577 - 30 Apr 2022
Cited by 10 | Viewed by 2675
Abstract
Software defect prediction refers to the automatic identification of defective parts of software through machine learning techniques. Ensemble learning has exhibited excellent prediction outcomes in comparison with individual classifiers. However, most of the previous work utilized ensemble models in the context of software defect prediction with the default hyperparameter values, which are considered suboptimal. In this paper, we investigate the applicability of a stacking ensemble built with fine-tuned tree-based ensembles for defect prediction. We used grid search to optimize the hyperparameters of seven tree-based ensembles: random forest, extra trees, AdaBoost, gradient boosting, histogram-based gradient boosting, XGBoost and CatBoost. Then, a stacking ensemble was built utilizing the fine-tuned tree-based ensembles. The ensembles were evaluated using 21 publicly available defect datasets. Empirical results showed large impacts of hyperparameter optimization on extra trees and random forest ensembles. Moreover, our results demonstrated the superiority of the stacking ensemble over all fine-tuned tree-based ensembles. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

13 pages, 369 KiB  
Article
Causally Remove Negative Confound Effects of Size Metric for Software Defect Prediction
by Chenlong Li, Yuyu Yuan and Jincui Yang
Appl. Sci. 2022, 12(3), 1387; https://doi.org/10.3390/app12031387 - 27 Jan 2022
Viewed by 1745
Abstract
Software defect prediction technology can effectively detect potential defects in a software system. The most common method is to establish machine learning models based on software metrics for prediction. However, most prediction models are proposed without considering the confounding effects of the size metric, which has unexpected correlations with other software metrics and introduces biases into prediction results. Suitably removing these confounding effects to improve the prediction model’s performance is an issue that is still largely unexplored. This paper proposes a method that can causally remove the negative confounding effects of the size metric. First, we quantify the confounding effects based on a causal graph. Then, we analyze each confounding effect to determine whether it is positive or negative, and only the negative confounding effects are removed. Extensive experimental results on eight datasets demonstrate the effectiveness of our proposed method. The prediction model’s performance can, in general, be improved after removing the negative confounding effects of the size metric. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

26 pages, 1480 KiB  
Article
Towards Design and Feasibility Analysis of DePaaS: AI Based Global Unified Software Defect Prediction Framework
by Mahesha Pandit, Deepali Gupta, Divya Anand, Nitin Goyal, Hani Moaiteq Aljahdali, Arturo Ortega Mansilla, Seifedine Kadry and Arun Kumar
Appl. Sci. 2022, 12(1), 493; https://doi.org/10.3390/app12010493 - 04 Jan 2022
Cited by 11 | Viewed by 2539
Abstract
Using artificial intelligence (AI) based software defect prediction (SDP) techniques in the software development process helps isolate defective software modules, count the number of software defects, and identify risky code changes. However, software development teams are often unaware of SDP and do not have easy access to relevant models and techniques. The major reason for this problem appears to be the fragmentation of SDP research and SDP practice. To unify SDP research and practice, this article introduces a cloud-based, global, unified AI framework for SDP called DePaaS—Defects Prediction as a Service. The article describes the usage context, use cases, and detailed architecture of DePaaS and presents the first response of industry practitioners to DePaaS. In a first-of-its-kind survey, the article captures practitioners’ belief in SDP and in the ability of DePaaS to solve some of the known challenges of the field of software defect prediction. This article also provides a novel process for SDP, a detailed description of the structure and behaviour of the DePaaS architecture components, the six best SDP models offered by DePaaS, a description of the algorithms that recommend SDP models, feature sets and tunable parameters, and a rich set of challenges in building, using, and sustaining DePaaS. With the contributions of this article, SDP research and practice could be unified, enabling the building and use of more pragmatic defect prediction models and leading to an increase in the efficiency of software testing. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

20 pages, 4877 KiB  
Article
A Bayesian Network-Based Information Fusion Combined with DNNs for Robust Video Fire Detection
by Byoungjun Kim and Joonwhoan Lee
Appl. Sci. 2021, 11(16), 7624; https://doi.org/10.3390/app11167624 - 19 Aug 2021
Cited by 6 | Viewed by 2109
Abstract
Fire is an abnormal event that can cause significant damage to lives and property. Deep learning approaches have made great progress in vision-based fire detection. However, false detections still occur for objects with fire-like visual properties, such as similar colors or textures. In a previous video-based approach, a Faster Region-based Convolutional Neural Network (R-CNN) detects suspected regions of fire (SRoFs), and a long short-term memory (LSTM) network accumulates the local features within the bounding boxes to decide whether there is a fire over a short-term period. Majority voting over the short-term decisions then makes the decision reliable over a long-term period. To make the final fire decision more robust, however, this paper proposes using a Bayesian network to fuse various types of information. Because the appropriate Bayesian network varies with the situation or domain in which fire detection is needed, we construct a simple Bayesian network as an example, combining environmental information (e.g., humidity) with visual information, including the results of location recognition and smoke detection, and long-term video-based majority voting. Our experiments show that the Bayesian network successfully improves fire detection accuracy compared with the previous video-based method, and state-of-the-art performance is achieved on a public dataset. The proposed method also reduces the latency for perfect fire decisions compared with the previous video-based method. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

24 pages, 957 KiB  
Article
A Paired Learner-Based Approach for Concept Drift Detection and Adaptation in Software Defect Prediction
by Arvind Kumar Gangwar, Sandeep Kumar and Alok Mishra
Appl. Sci. 2021, 11(14), 6663; https://doi.org/10.3390/app11146663 - 20 Jul 2021
Cited by 3 | Viewed by 2480
Abstract
The early and accurate prediction of defects helps in testing software and therefore leads to an overall higher-quality product. Due to drift in software defect data, prediction model performance may degrade over time. Very few earlier works have investigated the significance of concept drift (CD) in software defect prediction (SDP). Their results have shown that CD is present in software defect data and that it has a significant impact on the performance of defect prediction. Motivated by this observation, this paper presents a paired learner-based drift detection and adaptation approach for SDP that dynamically adapts to varying concepts by updating one of the learners in the pair. For a given defect dataset, a subset of data modules is analyzed at a time by both learners based on their learning experience from the past. The difference in the accuracies of the two is used to detect drift in the data. We evaluate the presented approach using defect datasets collected from the SEACraft and PROMISE data repositories. The experimental results show that the presented approach successfully detects concept drift points and performs better than existing methods, as is evident from a comparative analysis using various performance parameters, such as the number of drift points, ROC-AUC score, and accuracy, and statistical analysis using the Wilcoxon signed-rank test. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

14 pages, 1240 KiB  
Article
Automatic Classification of UML Class Diagrams Using Deep Learning Technique: Convolutional Neural Network
by Bethany Gosala, Sripriya Roy Chowdhuri, Jyoti Singh, Manjari Gupta and Alok Mishra
Appl. Sci. 2021, 11(9), 4267; https://doi.org/10.3390/app11094267 - 08 May 2021
Cited by 12 | Viewed by 9695
Abstract
Unified Modeling Language (UML) includes various types of diagrams that help to study, analyze, document, design, or develop software efficiently; UML diagrams are therefore of great advantage to researchers, software developers, and academicians. Class diagrams are the most widely used UML diagrams for this purpose. Despite UML’s recognition as a standard modeling language for object-oriented software, it is difficult to learn. Although repositories exist that aid users with collections of UML diagrams, there is still much more to explore and develop in this domain. The objective of our research was to develop a tool that can automatically classify images as UML class diagrams or non-UML class diagrams. Earlier research used machine learning techniques for classifying class diagrams and was therefore required to identify image features and investigate the impact of these features on the classification problem. We developed a new approach for automatically classifying class diagrams using a Convolutional Neural Network, a deep learning technique, applied both with and without regularization. Our tool receives JPEG/PNG/GIF/TIFF images as input and predicts whether each is a UML class diagram image or not. There is no need to tag images of class diagrams as UML class diagrams in our dataset. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

19 pages, 5011 KiB  
Article
POP-ON: Prediction of Process Using One-Way Language Model Based on NLP Approach
by Junhyung Moon, Gyuyoung Park and Jongpil Jeong
Appl. Sci. 2021, 11(2), 864; https://doi.org/10.3390/app11020864 - 18 Jan 2021
Cited by 14 | Viewed by 3928
Abstract
In business process management, the monitoring service is an important element that can prevent various problems in companies and industries before they occur. Execution logs are created in information systems that are aware of the enterprise process, which helps predict the process. The ultimate goal of the proposed method is to predict the process following the running process instance and to predict events based on previously completed event log data, so that companies can flexibly respond to unwanted deviations in their workflow. To solve the next-event prediction problem, we use a fully attention-based transformer, which has performed well in recent natural language processing approaches. After recognizing the name attribute of the event in natural language and predicting the next event, several necessary elements were applied. The model is trained using the proposed deep learning architecture according to specific pre-processing steps. Experiments using various business process log datasets demonstrate the superior performance of the proposed method. The name of the process prediction model we propose is “POP-ON”. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

11 pages, 226 KiB  
Article
Research on Personalized Recommendation Methods for Online Video Learning Resources
by Xiaojuan Chen and Huiwen Deng
Appl. Sci. 2021, 11(2), 804; https://doi.org/10.3390/app11020804 - 15 Jan 2021
Cited by 13 | Viewed by 2185
Abstract
It is not easy to quickly find learning materials of interest in the vast amount of online learning material. The purpose of this study is to identify students’ interests according to their learning behaviors on the network and to recommend related video learning materials. For students who have not left evaluation records on the learning platform, the association rule algorithm from data mining is used to find and recommend videos that students are interested in. For students who have evaluation records on the platform, we use item-based collaborative filtering from machine learning, with the Pearson correlation coefficient method to find highly similar video materials, and then recommend the learning materials they are interested in. The two methods are used in different situations, so all students on the learning platform can receive recommendations. In application, our methods can reduce data search time, improve the stickiness of the platform, alleviate information overload, and meet the personalized needs of learners. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)
16 pages, 20515 KiB  
Article
Automatic Colorization of Anime Style Illustrations Using a Two-Stage Generator
by Yeongseop Lee and Seongjin Lee
Appl. Sci. 2020, 10(23), 8699; https://doi.org/10.3390/app10238699 - 04 Dec 2020
Cited by 2 | Viewed by 4165
Abstract
Line-arts are used in many ways in the media industry; however, line-art colorization is tedious, labor-intensive, and time-consuming. For these reasons, Generative Adversarial Network (GAN)-based image-to-image colorization methods have received much attention because of their promising results. In this paper, we propose a color point hinting method with two GAN-based generators used to enhance image quality. To improve coloring performance on drawings with various line styles, the generator takes into account the loss of the line-art. We propose a Line Detection Model (LDM), a method for extracting lines from a color image, which is used to measure line loss. We also propose applying histogram equalization to the input line-art to generalize the distribution of line styles without increasing the complexity of the inference stage. In addition, we propose seven segment hint pointing constraints to evaluate the colorization performance of the model with the Fréchet Inception Distance (FID) score. We present visual and qualitative evaluations of the proposed methods. The results show that using histogram equalization together with LDM-enabled line loss gives the best result. The base model with XDoG (eXtended Difference-of-Gaussians)-generated line-art, with and without color hints, exhibits FID scores for colorized images of 35.83 and 44.70, respectively, whereas the proposed model in the same scenario exhibits 32.16 and 39.77, respectively. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

24 pages, 1274 KiB  
Article
LIMCR: Less-Informative Majorities Cleaning Rule Based on Naïve Bayes for Imbalance Learning in Software Defect Prediction
by Yumei Wu, Jingxiu Yao, Shuo Chang and Bin Liu
Appl. Sci. 2020, 10(23), 8324; https://doi.org/10.3390/app10238324 - 24 Nov 2020
Cited by 4 | Viewed by 1446
Abstract
Software defect prediction (SDP) is an effective technique for lowering software module testing costs. However, imbalanced distributions exist in almost all SDP datasets and restrict the accuracy of defect prediction. In order to balance the data distribution reasonably, we propose LIMCR, a novel resampling method based on Naïve Bayes, to optimize and improve SDP performance. The main idea of LIMCR is to remove less-informative majority samples to rebalance the data distribution, after evaluating how informative every sample from the majority class is. We employ 29 SDP datasets from the PROMISE and NASA repositories and divide them into two parts: small datasets (fewer than 1100 samples) and large datasets (1100 or more). We then conduct experiments comparing combinations of classifiers and imbalance learning methods on small and large datasets, respectively. The results show the effectiveness of LIMCR: LIMCR+GNB performs better than other methods on small datasets, though not on large datasets. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)

29 pages, 436 KiB  
Article
Credibility Based Imbalance Boosting Method for Software Defect Proneness Prediction
by Haonan Tong, Shihai Wang and Guangling Li
Appl. Sci. 2020, 10(22), 8059; https://doi.org/10.3390/app10228059 - 13 Nov 2020
Cited by 10 | Viewed by 1514
Abstract
Imbalanced data are a major factor degrading the performance of software defect models. Software defect datasets are imbalanced in nature, i.e., the number of non-defect-prone modules far exceeds that of defect-prone ones, which biases classifiers toward the majority class samples. In this paper, we propose a novel credibility-based imbalance boosting (CIB) method to address the class-imbalance problem in software defect proneness prediction. The method measures the credibility of synthetic samples based on their distribution by introducing a credit factor for every synthetic sample, and proposes a weight updating scheme that makes the base classifiers focus on synthetic samples with high credibility and on real samples. Experiments are performed on 11 NASA datasets and nine PROMISE datasets by comparing CIB with MAHAKIL, AdaC2, AdaBoost, SMOTE, RUS, and no sampling in terms of four performance measures: area under the curve (AUC), F1, AGF, and Matthews correlation coefficient (MCC). The Wilcoxon signed-rank test and Cliff’s δ are used to perform statistical tests and calculate effect sizes, respectively. The experimental results show that CIB is a more promising alternative for addressing the class-imbalance problem in software defect proneness prediction than previous methods. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Software Engineering)
