Advances in Parallel Computing and Their Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (31 March 2023) | Viewed by 8568

Special Issue Editors


Guest Editor
Computer Science Department, Sapienza University of Rome, 00185 Roma, Italy
Interests: parallel architectures; parallel arithmetic units; interconnection networks; system level verification; mobile sensor networks; GPU-based simulations

Guest Editor
Department of Computer Science, University of Pisa, 56126 Pisa, Italy
Interests: parallel programming; parallel design patterns; algorithmic skeletons; multicores

Special Issue Information

Dear Colleagues, 

Over the past few years, we have witnessed the introduction of parallelism at many levels of computer architecture, which has produced very different parallel computing systems. Beyond the ever-increasing number of processors, parallelism appears in many other aspects of the architecture. At the low level, examples include pipelining of both instruction execution and arithmetic operations, as well as vectorized execution of arithmetic operations, including through unconventional number representations. At the high level, examples include rich and sophisticated memory hierarchies and the various interconnection network topologies that carry communication in large-scale multiprocessor systems.

True parallel computing consists of a set of tasks that require a non-negligible amount of communication and cooperate in executing a single application. The main target of parallel computing is scientific applications, and many large-scale scientific applications address problems that are modeled as linear systems, often based on sparse matrices, and solved with linear algebra methods.

This Special Issue of Mathematics is devoted to topics in parallel computing, including theory and applications, as well as applied mathematics. The focus will be on techniques to introduce parallelism at different levels, as well as on applications involving parallel solvers in linear algebra, with emphasis on sparse matrices.

Prof. Dr. Annalisa Massini
Prof. Dr. Marco Danelutto
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Parallel computing
  • High-performance computing
  • Parallel solvers
  • Sparse matrices
  • Parallel arithmetic
  • Interconnection networks

Published Papers (3 papers)

Research

17 pages, 3793 KiB  
Article
Parallel Implementation of a Sensitivity Operator-Based Source Identification Algorithm for Distributed Memory Computers
by Alexey Penenko and Evgeny Rusin
Mathematics 2022, 10(23), 4522; https://0-doi-org.brum.beds.ac.uk/10.3390/math10234522 - 30 Nov 2022
Viewed by 1008
Abstract
Large-scale inverse problems that require high-performance computing arise in various fields, including regional air quality studies. The paper focuses on parallel solutions of an emission source identification problem for a 2D advection–diffusion–reaction model where the sources are identified by heterogeneous measurement data. In the inverse modeling approach we use, a source identification problem is transformed to a quasi-linear operator equation with a sensitivity operator, which allows working in a unified way with heterogeneous measurement data and provides natural parallelization of numeric algorithms by concurrent calculation of the rows of a sensitivity operator matrix. The parallel version of the algorithm implemented with a message passing interface (MPI) has shown a 40× speedup on four Intel Xeon Gold 6248R nodes in an inverse modeling scenario for the Lake Baikal region.
(This article belongs to the Special Issue Advances in Parallel Computing and Their Applications)

25 pages, 1349 KiB  
Article
Parallel Implementations of ARX-Based Block Ciphers on Graphic Processing Units
by SangWoo An, YoungBeom Kim, Hyeokdong Kwon, Hwajeong Seo and Seog Chung Seo
Mathematics 2020, 8(11), 1894; https://0-doi-org.brum.beds.ac.uk/10.3390/math8111894 - 31 Oct 2020
Cited by 5 | Viewed by 1901
Abstract
With the development of information and communication technology, various types of Internet of Things (IoT) devices have widely been used for convenient services. Many users with their IoT devices request various services to servers. Thus, the amount of users’ personal information that servers need to protect has dramatically increased. To quickly and safely protect users’ personal information, it is necessary to optimize the speed of the encryption process. Since it is difficult to provide the basic services of the server while encrypting a large amount of data in the existing CPU, several parallel optimization methods using Graphics Processing Units (GPUs) have been considered. In this paper, we propose several optimization techniques using GPU for efficient implementation of lightweight block cipher algorithms on the server-side. As the target algorithm, we select high security and light weight (HIGHT), Lightweight Encryption Algorithm (LEA), and revised CHAM, which are Add-Rotate-Xor (ARX)-based block ciphers, because they are used widely on IoT devices. We utilize the features of the counter (CTR) operation mode to reduce unnecessary memory copying and operations in the GPU environment. Besides, we optimize the memory usage by making full use of GPU’s on-chip memory such as registers and shared memory and implement the core function of each target algorithm with inline PTX assembly codes for maximizing the performance. With the application of our optimization methods and handcrafted PTX codes, we achieve excellent encryption throughput of 468, 2593, and 3063 Gbps for HIGHT, LEA, and revised CHAM on RTX 2070 NVIDIA GPU, respectively. In addition, we present optimized implementations of Counter Mode Based Deterministic Random Bit Generator (CTR_DRBG), which is one of the widely used deterministic random bit generators to provide a large amount of random data to the connected IoT devices. 
We apply several optimization techniques for maximizing the performance of CTR_DRBG, and we achieve 52.2, 24.8, and 34.2 times of performance improvement compared with CTR_DRBG implementation on CPU-side when HIGHT-64/128, LEA-128/128, and CHAM-128/128 are used as underlying block cipher algorithm of CTR_DRBG, respectively.
(This article belongs to the Special Issue Advances in Parallel Computing and Their Applications)

21 pages, 722 KiB  
Article
Efficient Parallel Implementations of LWE-Based Post-Quantum Cryptosystems on Graphics Processing Units
by SangWoo An and Seog Chung Seo
Mathematics 2020, 8(10), 1781; https://0-doi-org.brum.beds.ac.uk/10.3390/math8101781 - 14 Oct 2020
Cited by 11 | Viewed by 3477
Abstract
With the development of the Internet of Things (IoT) and cloud computing technology, various cryptographic systems have been proposed to protect increasing personal information. Recently, Post-Quantum Cryptography (PQC) algorithms have been proposed to counter quantum algorithms that threaten public key cryptography. To efficiently use PQC in a server environment dealing with large amounts of data, optimization studies are required. In this paper, we present optimization methods for FrodoKEM and NewHope, which are the NIST PQC standardization round 2 competition algorithms in the Graphics Processing Unit (GPU) platform. For each algorithm, we present a part that can perform parallel processing of major operations with a large computational load using the characteristics of the GPU. In the case of FrodoKEM, we introduce parallel optimization techniques for matrix generation operations and matrix arithmetic operations such as addition and multiplication. In the case of NewHope, we present a parallel processing technique for polynomial-based operations. In the encryption process of FrodoKEM, the performance improvements have been confirmed up to 5.2, 5.75, and 6.47 times faster than the CPU implementation in FrodoKEM-640, FrodoKEM-976, and FrodoKEM-1344, respectively. In the encryption process of NewHope, the performance improvements have been shown up to 3.33 and 4.04 times faster than the CPU implementation in NewHope-512 and NewHope-1024, respectively. The results of this study can be used in the IoT devices server or cloud computing service server. In addition, the results of this study can be utilized in image processing technologies such as facial recognition technology.
(This article belongs to the Special Issue Advances in Parallel Computing and Their Applications)
