Recent Progress in Big Data and Artificial Intelligence: Modern Methods and Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 16865

Special Issue Editor


Guest Editor
Faculty of Administration and Business, University of Bucharest, 030018 Bucharest, Romania
Interests: artificial intelligence (AI); neural and evolutionary computing; machine learning; linear algebra; parallel computing; computational economics; statistics

Special Issue Information

Dear Colleagues,

Being at the intersection of computer science, information engineering, and mathematics, artificial intelligence techniques have become an essential part of our lives. Nowadays, we find practical applications of artificial intelligence in almost every field of activity, from self-driving cars to speech recognition or fraud detection systems.

Big data is tightly coupled with artificial intelligence. Not only do new big data technologies greatly benefit from artificial intelligence methods, but the development of new artificial intelligence techniques also relies on big data technology and uses big data sources.

This Special Issue focuses on advances in artificial intelligence methods and their applications using big data sources and technologies. Topics of interest include, but are not limited to, soft computing methods such as artificial neural networks, fuzzy systems, and evolutionary computation; intelligent agents and multi-agent systems and their applications; and machine learning techniques.

Prof. Dr. Bogdan Oancea
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Big data
  • Artificial intelligence
  • Neural networks
  • Machine learning
  • Intelligent systems

Published Papers (7 papers)


Research

19 pages, 3008 KiB  
Article
Applying Neural Networks to Recover Values of Monitoring Parameters for COVID-19 Patients in the ICU
by Sergio Celada-Bernal, Guillermo Pérez-Acosta, Carlos M. Travieso-González, José Blanco-López and Luciano Santana-Cabrera
Mathematics 2023, 11(15), 3332; https://0-doi-org.brum.beds.ac.uk/10.3390/math11153332 - 29 Jul 2023
Viewed by 833
Abstract
From the moment a patient is admitted to the hospital, monitoring begins, and specific information is collected. The continuous flow of parameters, including clinical and analytical data, serves as a significant source of information. However, there are situations in which not all values from medical tests can be obtained. This paper aims to predict the medical test values of COVID-19 patients in the intensive care unit (ICU). By retrieving the missing medical test values, the model provides healthcare professionals with an additional tool and more information with which to combat COVID-19. The proposed approach utilizes a customizable deep learning model. Three types of neural networks, namely Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU), are employed. The parameters of these neural networks are configured to determine the model that delivers the optimal performance. Evaluation of the model’s performance is conducted using metrics such as Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Mean Absolute Error (MAE). The application of the proposed model achieves predictions of the retrieved medical test values, resulting in RMSE = 7.237, MAPE = 5.572, and MAE = 4.791. Moreover, the article explores various scenarios in which the model exhibits higher accuracy. This model can be adapted and utilized in the diagnosis of future infectious diseases that share characteristics with Coronavirus Disease 2019 (COVID-19).

15 pages, 10643 KiB  
Article
Traffic Accident Detection Using Background Subtraction and CNN Encoder–Transformer Decoder in Video Frames
by Yihang Zhang and Yunsick Sung
Mathematics 2023, 11(13), 2884; https://0-doi-org.brum.beds.ac.uk/10.3390/math11132884 - 27 Jun 2023
Cited by 3 | Viewed by 2486
Abstract
Artificial intelligence plays a significant role in traffic-accident detection. Traffic accidents involve a cascade of inadvertent events, making traditional detection approaches challenging. For instance, Convolutional Neural Network (CNN)-based approaches cannot analyze temporal relationships among objects, and Recurrent Neural Network (RNN)-based approaches suffer from low processing speeds and cannot detect traffic accidents simultaneously across multiple frames. Furthermore, these networks dismiss background interference in input video frames. This paper proposes a framework that begins by subtracting the background based on You Only Look Once (YOLOv5), which adaptively reduces background interference when detecting objects. Subsequently, the CNN encoder and Transformer decoder are combined into an end-to-end model to extract the spatial and temporal features between different time points, allowing for a parallel analysis between input video frames. The proposed framework was evaluated on the Car Crash Dataset through a series of comparison and ablation experiments. Our framework was benchmarked against three accident-detection models to evaluate its effectiveness, and the proposed framework demonstrated a superior accuracy of approximately 96%. The results of the ablation experiments indicate that when background subtraction was not incorporated into the proposed framework, the values of all evaluation indicators decreased by approximately 3%.

22 pages, 852 KiB  
Article
Enhancing Precision in Large-Scale Data Analysis: An Innovative Robust Imputation Algorithm for Managing Outliers and Missing Values
by Matthias Templ
Mathematics 2023, 11(12), 2729; https://0-doi-org.brum.beds.ac.uk/10.3390/math11122729 - 16 Jun 2023
Cited by 3 | Viewed by 1423
Abstract
Navigating the intricate world of data analytics, one method has emerged as a key tool in confronting missing data: multiple imputation. Its strength is further fortified by its powerful variant, robust imputation, which enhances the precision and reliability of its results. In the challenging landscape of data analysis, non-robust methods can be swayed by a few extreme outliers, leading to skewed imputations and biased estimates. This can apply to both representative outliers (true yet unusual values of your population) and non-representative outliers, which are mere measurement errors. Detecting these outliers in large or high-dimensional data sets often becomes as complex as unraveling a Gordian knot. The solution? Turn to robust imputation methods. Robust (imputation) methods effectively manage outliers and exhibit remarkable resistance to their influence, providing a more reliable approach to dealing with missing data. Moreover, these robust methods offer flexibility, remaining usable even if the imputation model used is not a perfect fit. They are akin to a well-designed buffer system, absorbing slight deviations without compromising overall stability. In the latest advancement of statistical methodology, a new robust imputation algorithm has been introduced. This innovative solution addresses three significant challenges with robustness. It utilizes robust bootstrapping to manage model uncertainty during the imputation of a random sample; it incorporates robust fitting to reinforce accuracy; and it takes into account imputation uncertainty in a resilient manner. Furthermore, any complex regression or classification model for any variable with missing data can be run through the algorithm. With this new algorithm, we move one step closer to optimizing the accuracy and reliability of handling missing data.
Using a realistic data set and a simulation study including a sensitivity analysis, the new algorithm imputeRobust shows excellent performance compared with other common methods. Effectiveness was demonstrated by measures of precision for the prediction error, the coverage rates, and the mean square errors of the estimators, as well as by visual comparisons.

32 pages, 1880 KiB  
Article
Automatic Product Classification Using Supervised Machine Learning Algorithms in Price Statistics
by Bogdan Oancea
Mathematics 2023, 11(7), 1588; https://0-doi-org.brum.beds.ac.uk/10.3390/math11071588 - 24 Mar 2023
Cited by 1 | Viewed by 2688
Abstract
Modern approaches to computing consumer price indices include the use of various data sources, such as web-scraped data or scanner data, which are very large in volume and need special processing techniques. In this paper, we address one of the main problems in the consumer price index calculation, namely the product classification, which cannot be performed manually when using large data sources. Therefore, we conducted an experiment on automatic product classification according to an international classification scheme. We combined 9 different word-embedding techniques with 13 classification methods with the aim of identifying the best combination in terms of the quality of the resultant classification. Because the dataset used in this experiment was significantly imbalanced, we compared these methods not only using the accuracy, F1-score, and AUC, but also using a weighted F1-score that better reflected the overall classification quality. Our experiment showed that logistic regression, support vector machines, and random forests, combined with the FastText skip-gram embedding technique, provided the best classification results, with superior values in performance metrics, as compared to other similar studies. An execution time analysis showed that, among the three mentioned methods, logistic regression was the fastest while the random forest recorded a longer execution time. We also provided per-class performance metrics and formulated an error analysis that enabled us to identify methods that could be excluded from the range of choices because they provided less reliable classifications for our purposes.

28 pages, 6866 KiB  
Article
GRAN3SAT: Creating Flexible Higher-Order Logic Satisfiability in the Discrete Hopfield Neural Network
by Yuan Gao, Yueling Guo, Nurul Atiqah Romli, Mohd Shareduwan Mohd Kasihmuddin, Weixiang Chen, Mohd. Asyraf Mansor and Ju Chen
Mathematics 2022, 10(11), 1899; https://0-doi-org.brum.beds.ac.uk/10.3390/math10111899 - 01 Jun 2022
Cited by 14 | Viewed by 1557
Abstract
One of the main problems in representing information in the form of nonsystematic logic is the lack of flexibility, which leads to potential overfitting. Although nonsystematic logic improves the representation of the conventional k Satisfiability, the formulations of the first, second, and third-order logical structures are very predictable. This paper proposed a novel higher-order logical structure, named G-Type Random k Satisfiability, by capitalizing on the new random feature of the first, second, and third-order clauses. The proposed logic was implemented into the Discrete Hopfield Neural Network as a symbolic logical rule. The proposed logic in Discrete Hopfield Neural Networks was evaluated using different parameter settings, such as different orders of clauses, different proportions between positive and negative literals, relaxation, and differing numbers of learning trials. Each evaluation utilized various performance metrics, such as learning error, testing error, weight error, energy analysis, and similarity analysis. In addition, the flexibility of the proposed logic was compared with current state-of-the-art logic rules. Based on the simulation, the proposed logic was reported to be more flexible, and produced higher solution diversity.

14 pages, 3231 KiB  
Article
An Intelligent Metaheuristic Binary Pigeon Optimization-Based Feature Selection and Big Data Classification in a MapReduce Environment
by Felwa Abukhodair, Wafaa Alsaggaf, Amani Tariq Jamal, Sayed Abdel-Khalek and Romany F. Mansour
Mathematics 2021, 9(20), 2627; https://0-doi-org.brum.beds.ac.uk/10.3390/math9202627 - 18 Oct 2021
Cited by 30 | Viewed by 1817
Abstract
Big Data techniques are highly effective for systematically extracting and analyzing massive data, and they can manage data more proficiently than conventional data handling approaches. Recently, several schemes have been developed for handling big datasets with several features. At the same time, feature selection (FS) methodologies intend to eliminate repetitive, noisy, and unwanted features that degrade the classifier results. Since conventional methods have failed to attain scalability under massive data, the design of new Big Data classification models is essential. In this aspect, this study focuses on the design of metaheuristic optimization-based big data classification in a MapReduce (MOBDC-MR) environment. The MOBDC-MR technique aims to choose optimal features and effectively classify big data. In addition, the MOBDC-MR technique involves the design of a binary pigeon optimization algorithm (BPOA)-based FS technique to reduce the complexity and increase the accuracy. A beetle antenna search (BAS) with a long short-term memory (LSTM) model is employed for big data classification. The presented MOBDC-MR technique has been realized on Hadoop with the MapReduce programming model. The effective performance of the MOBDC-MR technique was validated using a benchmark dataset and the results were investigated under several measures. The MOBDC-MR technique demonstrated promising performance over the other existing techniques under different dimensions.

21 pages, 2448 KiB  
Article
Stock Price Movement Prediction Based on a Deep Factorization Machine and the Attention Mechanism
by Xiaodong Zhang, Suhui Liu and Xin Zheng
Mathematics 2021, 9(8), 800; https://0-doi-org.brum.beds.ac.uk/10.3390/math9080800 - 07 Apr 2021
Cited by 10 | Viewed by 4190
Abstract
The prediction of stock price movement is a popular area of research in academic and industrial fields due to the dynamic, highly sensitive, nonlinear and chaotic nature of stock prices. In this paper, we constructed a convolutional neural network model based on a deep factorization machine and attention mechanism (FA-CNN) to improve the prediction accuracy of stock price movement via enhanced feature learning. Unlike most previous studies, which focus only on the temporal features of financial time series data, our model also extracts intraday interactions among input features. Further, in data representation, we used the sub-industry index as supplementary information for the current state of the stock, since there exists stock price co-movement between individual stocks and their industry index. The experiments were carried out on the individual stocks in three industries. The results showed that the additional inputs of (a) the intraday interactions among input features and (b) the sub-industry index information effectively improved the prediction accuracy. The highest prediction accuracy of the proposed FA-CNN model is 64.81%. It is 7.38% higher than that of traditional LSTM, and 3.71% higher than that of the model without sub-industry index as additional input features.