Analytics and Big Data

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (8 February 2023) | Viewed by 8837

Special Issue Editors


Guest Editor
Mathematics Department, CEOS.PP, ISCAP / Polytechnic Institute of Porto, 4249-015 Porto, Portugal
Interests: applied mathematics; analytics; technology enhanced learning; e-assessment; mathematics education; e-learning; data mining and quantitative finance

Guest Editor
CEOS.PP, ISCAP, Polytechnic Institute of Porto, 4200-465 Porto, Portugal
Interests: business intelligence; analytics; decision support systems; data mining; e-business and digital transformation

Guest Editor
Ulster University, Belfast, BT37 0QB, Northern Ireland, UK
Interests: entrepreneurship engineering; mathematical and computational physics; AI in education; mixed augmented and virtual reality; technology-enhanced learning; e-learning; e-business; computer vision and imaging; data visualization

Special Issue Information

Dear Colleagues,

There is no doubt that Big Data are a worthwhile topic of research today. There are arguably two main issues related to Big Data: how to “store” them and how to “analyze” them. Traditional techniques and methods cannot be directly used in the context of Big Data. The aim of this Special Issue is therefore to publish original research articles covering advances and providing insights concerning analytics and Big Data. Potential authors are encouraged to submit quality research contributions that describe original results from empirical, experimental, or theoretical studies, as well as literature reviews, in this area. We also welcome real-world applications.

Topics of interest for submission include but are not limited to big data analytics, mathematics of big data, visualization, algorithms, and data structures for big data.

Prof. Dr. Jose Manuel Azevedo
Prof. Dr. Ana Azevedo
Prof. Dr. James Uhomoibhi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • descriptive, predictive, prescriptive, and diagnostic big data analytics
  • data quality for big data
  • new theoretical methods for big data
  • advanced analytic methods
  • deep learning
  • social networks analysis
  • big data search
  • analytics tools

Published Papers (4 papers)


Research

27 pages, 2833 KiB  
Article
On Subsampling Procedures for Support Vector Machines
by Roberto Bárcenas, Maria Gonzalez-Lima, Joaquin Ortega and Adolfo Quiroz
Mathematics 2022, 10(20), 3776; https://0-doi-org.brum.beds.ac.uk/10.3390/math10203776 - 13 Oct 2022
Cited by 1 | Viewed by 1203
Abstract
Herein, theoretical results are presented to provide insights into the effectiveness of subsampling methods in reducing the number of instances required in the training stage when applying support vector machines (SVMs) for classification in big data scenarios. Our main theorem states that under some conditions, there exists, with high probability, a feasible solution to the SVM problem for a randomly chosen training subsample, with the corresponding classifier as close as desired (in terms of classification error) to the classifier obtained from training with the complete dataset. The main theorem also reflects the curse of dimensionality in that the assumptions made for the results are much more restrictive in large dimensions; thus, subsampling methods will perform better in lower dimensions. Additionally, we propose an importance sampling and bagging subsampling method that expands the nearest-neighbors ideas presented in previous work. Using different benchmark examples, the method proposed herein presents a faster solution to the SVM problem (without significant loss in accuracy) compared with the available state-of-the-art techniques.
(This article belongs to the Special Issue Analytics and Big Data)

16 pages, 2906 KiB  
Article
IT-PMF: A Novel Community E-Commerce Recommendation Method Based on Implicit Trust
by Jun Wu, Xinyu Song, Xiaxia Niu, Li Shi, Lu Gao, Liping Geng, Dan Wang and Dongkui Zhang
Mathematics 2022, 10(14), 2406; https://0-doi-org.brum.beds.ac.uk/10.3390/math10142406 - 09 Jul 2022
Viewed by 1225
Abstract
It is well known that data sparsity and cold start are two of the open problems in recommendation system research. Numerous studies have been dedicated to dealing with those two problems. Among these, a method of introducing user context information could effectively solve the problem of data sparsity and improve the accuracy of recommendation algorithms. This study proposed a novel approach called IT-PMF (Implicit Trust-Probabilistic Matrix Factorization) based on implicit trust, which consists of local implicit trust relationships and in-group membership. The study started by generating the user commodity rating matrix based on the cumulative purchases for items according to their historical purchase records to find the similarity of purchase behaviors and the number of successful interactions between users, which represent the local implicit trust relationship between users. The user group attribute value was calculated through a fuzzy c-means clustering algorithm to obtain the user’s in-group membership. The local implicit trust relationship and the user’s in-group membership were adjusted by the adaptive weight to determine the degree of each part’s influence. Then, the authors integrated the user’s score of items and the user’s implicit trust relationship into the probabilistic matrix factorization algorithm to form a trusted recommendation model based on implicit trust relationships and in-group membership. Extensive experiments were conducted using a real dataset collected from a community E-commerce platform, and the IT-PMF method had a better performance in both MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) indices compared with well-known existing algorithms, such as PMF (Probabilistic Matrix Factorization) and SVD (Singular Value Decomposition). The results of the experiments indicated that the introduction of implicit trust into PMF could improve the quality of recommendations.
(This article belongs to the Special Issue Analytics and Big Data)
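
As a minimal illustration of the two error indices named in the abstract above, the following Python sketch computes MAE and RMSE for a handful of made-up ratings. The function names and the sample values are hypothetical and are not taken from the article; the sketch only shows how the two measures are usually defined.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all rating pairs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference between rating pairs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical ratings, for illustration only (not data from the article).
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
print(f"MAE = {mae:.4f}, RMSE = {rmse:.4f}")
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is one reason the two are often reported together.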

21 pages, 1368 KiB  
Article
IoT Analytics and Agile Optimization for Solving Dynamic Team Orienteering Problems with Mandatory Visits
by Yuda Li, Mohammad Peyman, Javier Panadero, Angel A. Juan and Fatos Xhafa
Mathematics 2022, 10(6), 982; https://0-doi-org.brum.beds.ac.uk/10.3390/math10060982 - 18 Mar 2022
Cited by 3 | Viewed by 2110
Abstract
Transport activities and citizen mobility have a deep impact on enlarged smart cities. By analyzing Big Data streams generated through Internet of Things (IoT) devices, this paper aims to show the efficiency of using IoT analytics as an agile optimization input for solving real-time problems in smart cities. IoT analytics has become the main core of large-scale Internet applications; however, its utilization in optimization approaches for real-time configuration and dynamic conditions of a smart city has been less discussed. The challenging research topic is how to reach real-time IoT analytics for use in optimization approaches. In this paper, we consider integrating IoT analytics into agile optimization problems. A realistic waste collection problem is modeled as a dynamic team orienteering problem with mandatory visits. Open data repositories from smart cities are used for extracting the IoT analytics to achieve maximum advantage under the city environment condition. Our developed methodology allows us to process real-time information gathered from IoT systems in order to optimize the vehicle routing decision under dynamic changes of the traffic environments. A series of computational experiments is provided in order to illustrate our approach and discuss its effectiveness. In these experiments, a traditional static approach is compared against a dynamic one. In the former, the solution is calculated only once at the beginning, while in the latter, the solution is re-calculated periodically as new data are obtained. The results of the experiments clearly show that our proposed dynamic approach outperforms the static one in terms of rewards.
(This article belongs to the Special Issue Analytics and Big Data)

27 pages, 6336 KiB  
Article
Can Fake News Detection Models Maintain the Performance through Time? A Longitudinal Evaluation of Twitter Publications
by Nuno Guimarães, Álvaro Figueira and Luís Torgo
Mathematics 2021, 9(22), 2988; https://0-doi-org.brum.beds.ac.uk/10.3390/math9222988 - 22 Nov 2021
Cited by 5 | Viewed by 3332
Abstract
The negative impact of false information on social networks is rapidly growing. Current research on the topic has focused on the detection of fake news in a particular context or event (such as elections) or using data from a short period of time. Therefore, an evaluation of the current proposals in a long-term scenario where the topics discussed may change is lacking. In this work, we deviate from current approaches to the problem and instead focus on a longitudinal evaluation using social network publications spanning an 18-month period. We evaluate different combinations of features and supervised models in a long-term scenario where the training and testing data are ordered chronologically, and thus the robustness and stability of the models can be evaluated through time. We experimented with three different scenarios where the models are trained with 15-, 30-, and 60-day data periods. The results show that detection models trained with word-embedding features are the ones that perform better and are less likely to be affected by the change of topics (for example, the rise of COVID-19 conspiracy theories). Furthermore, the additional days of training data also increase the performance of the best feature/model combinations, although not very significantly (around 2%). The results presented in this paper build the foundations towards a more pragmatic approach to the evaluation of fake news detection models in social networks.
(This article belongs to the Special Issue Analytics and Big Data)
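
The chronological ordering of training and testing data described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper `chronological_split`, the rule that the first N days form the training window and everything later forms the test set, and the synthetic one-post-per-day data are all hypothetical and may differ from the protocol used in the article.

```python
from datetime import date, timedelta


def chronological_split(records, train_days):
    """Split (date, item) records into an early training window and a later test set.

    Assumption: the first `train_days` days, counted from the earliest record,
    form the training set; every later record goes to the test set.
    """
    if train_days <= 0:
        raise ValueError("train_days must be positive")
    ordered = sorted(records, key=lambda record: record[0])
    if not ordered:
        return [], []
    cutoff = ordered[0][0] + timedelta(days=train_days)
    train = [record for record in ordered if record[0] < cutoff]
    test = [record for record in ordered if record[0] >= cutoff]
    return train, test


# Hypothetical data: one post per day for 90 days (not data from the article).
start = date(2020, 1, 1)
posts = [(start + timedelta(days=i), f"post-{i}") for i in range(90)]

# The three training-period lengths mentioned in the abstract.
for days in (15, 30, 60):
    train, test = chronological_split(posts, days)
    print(f"{days}-day training window: {len(train)} train, {len(test)} test")
```

Splitting by date rather than at random keeps every test record later than every training record, so the evaluation cannot benefit from information about topics that only appear in the future.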