Applied Sciences

Special Issue Editor

Dr. Ayoub Bagheri

Guest Editor

Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, 3584 CH Utrecht, Netherlands
Interests: applied data science; social/human data science; computational social science; data mining; text mining; natural language processing; statistical learning; machine learning; deep learning; big data analysis

Special Issue Information

Dear Colleagues,

In today's data-driven world, the field of data and text mining has emerged as a central domain, addressing innovative techniques for analysing and systematically extracting valuable insights, as well as managing large and complex datasets that exceed the capabilities of traditional data processing techniques. The advent of big data has ushered in a transformative era, and its profound impact can be seen across multiple sectors, including social sciences, healthcare, international development, education, and beyond. Furthermore, as we move further into the realm of text mining, we are witnessing remarkable advances in natural language processing (NLP). These advances enable us to unravel the intricate tapestry of human language, opening the door to a wealth of unexplored knowledge and opportunities for discovery. In this Special Issue, we therefore explore this dynamic convergence of data and text mining.

We look forward to your contributions to this Special Issue.

Dr. Ayoub Bagheri
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

data mining
text mining
big data
natural language processing (NLP)
computational social sciences
knowledge discovery
statistical learning
machine learning
data analysis
information retrieval

Published Papers (1 paper)

Review

21 pages, 585 KiB

Open Access Review

Reproducibility and Data Storage for Active Learning-Aided Systematic Reviews

by Peter Lombaers, Jonathan de Bruin and Rens van de Schoot

Appl. Sci. 2024, 14(9), 3842; https://0-doi-org.brum.beds.ac.uk/10.3390/app14093842 - 30 Apr 2024

Abstract

In the screening phase of a systematic review, screening prioritization via active learning effectively reduces the workload. However, the PRISMA guidelines are not sufficient for reporting the screening phase in a reproducible manner. Text screening with active learning is an iterative process, but the labeling decisions and the training of the active learning model can happen independently of each other in time. Therefore, it is not trivial to store the data from both events so that one can still know which iteration of the model was used for each labeling decision. Moreover, many iterations of the active learning model will be trained throughout the screening process, producing an enormous amount of data (think of many gigabytes or even terabytes of data), and machine learning models are continually becoming larger. This article clarifies the steps in an active learning-aided screening process and what data is produced at every step. We consider what reproducibility means in this context and we show that there is tension between the desire to be reproducible and the amount of data that is stored. Finally, we present the RDAL Checklist (Reproducibility and Data storage for Active Learning-Aided Systematic Reviews Checklist), which helps users and creators of active learning software make their screening process reproducible.

(This article belongs to the Special Issue Data and Text Mining: New Approaches, Achievements and Applications)
