Special Issue Editor

Prof. Dr. Hong Bong Hee

Guest Editor

Division of Computer Science and Engineering, Pusan National University, Busan 609-735, Korea
Interests: database; big data; test data set generator

Special Issue Information

Dear Colleagues,

Big data analytics (BDA) is a new scientific field that brings together analytic approaches for processing huge amounts of data and extracting hidden insights that traditional approaches cannot attain.

Many prediction problems remain difficult to solve with current big data analysis, and new approaches to them are needed. Developing powerful learning models for big data analytics prediction is an important challenge across many applications. As the quality and amount of data increase, we hope to encourage new research methods that enhance analytical predictive power.

Therefore, this Special Issue, “New Challenges in Big Data Analytics and Applications”, will publish original full papers covering the analytics, theory, practice and applications of big data. Papers that have been presented at conferences are also welcome.

Prof. Dr. Hong Bong Hee
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

big data
learning model
test data set
data mining
privacy

Published Papers (1 paper)

Research

20 pages, 1424 KiB

Open Access Article

Comparative Analysis of Skew-Join Strategies for Large-Scale Datasets with MapReduce and Spark

by Anh-Cang Phan, Thuong-Cang Phan, Hung-Phi Cao and Thanh-Ngoan Trieu

Appl. Sci. 2022, 12(13), 6554; https://doi.org/10.3390/app12136554 - 28 Jun 2022

Cited by 2 | Viewed by 1565

Abstract

In the era of data deluge, Big Data gradually offers numerous opportunities, but also poses significant challenges to conventional data processing and analysis methods. MapReduce has become a prominent parallel and distributed programming model for efficiently handling such massive datasets. One of the most elementary and extensive operations in MapReduce is the join operation. These joins have become ever more complex and expensive in the context of skewed data, in which some common join keys appear with a greater frequency than others. Some of the reduction tasks processing these join keys will finish later than others; thus, the benefits of parallel computation become meaningless. Some studies on the problem of skew joins have been conducted, but an adequate and systematic comparison in the Spark environment has not been presented. They have only provided experimental tests, so there is still a shortage of representations of mathematical models on which skew-join algorithms can be compared. This study is, therefore, designed to provide the theoretical and practical basics for evaluating skew-join strategies for large-scale datasets with MapReduce and Spark, both analytically with cost models and practically with experiments. The objectives of the study are, first, to present the implementation of prominent skew-join algorithms in Spark, second, to evaluate the algorithms by using cost models and experiments, and third, to show the advantages and disadvantages of each one and to recommend strategies for the better use of skew joins in Spark.

(This article belongs to the Special Issue New Challenges in Big Data Analytics and Applications)
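
The skew problem described in the abstract above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's implementation: it is plain Python rather than MapReduce or Spark, the partitioner is a toy modulo function, and all names (`partition`, `reducer_loads`, `with_salt`) and the 90% hot-key input are assumptions made for illustration. It counts how many records each reducer receives when one join key dominates, then repeats the count after attaching a rotating salt to the hot key, which is one common way to spread a skewed key across reducers.

```python
from collections import Counter


def partition(key, salt, num_reducers):
    """Toy deterministic partitioner (assumption); real engines hash the key."""
    return (key * 31 + salt) % num_reducers


def reducer_loads(records, num_reducers):
    """Return the number of (key, salt) records routed to each reducer."""
    loads = Counter(partition(k, s, num_reducers) for k, s in records)
    return [loads[r] for r in range(num_reducers)]


def with_salt(keys, hot_keys, num_salts):
    """Pair each key with a salt: rotating for hot keys, 0 for all others."""
    return [(k, i % num_salts if k in hot_keys else 0)
            for i, k in enumerate(keys)]


NUM_REDUCERS = 4

# Hypothetical skewed input: key 0 accounts for 900 of 1000 records.
keys = [0] * 900 + list(range(1, 101))

# Without salting, every record of the hot key lands on the same reducer.
plain = reducer_loads([(k, 0) for k in keys], NUM_REDUCERS)

# With salting, the hot key's records are spread over NUM_REDUCERS salts.
salted = reducer_loads(with_salt(keys, {0}, NUM_REDUCERS), NUM_REDUCERS)

print("loads without salting:", plain)
print("loads with salting:   ", salted)
```

In this toy setup one reducer receives 925 of the 1000 records without salting, while salting gives each of the four reducers 250. In a real join, the other input would also need its matching rows replicated once per salt value, a cost this sketch does not model.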
