Advanced Embedded HW/SW Development

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (15 February 2021) | Viewed by 30625

Special Issue Editor


Prof. Dr. Mauro Olivieri
Guest Editor
DIET, Sapienza University of Rome, 00184 Roma, Italy
Interests: microprocessor architectures; digital VLSI design

Special Issue Information

Dear Colleagues,

The rapidly growing Internet-of-Things and automotive markets are just the dawn of an era in which embedded (i.e., built-in and user-transparent) electronic systems interact with everyday life, both at the individual level, as in entertainment, vehicular, domotic, and medical applications, and at the societal level, as in smart city applications.

The advent of artificial intelligence algorithms, including deep learning as well as other techniques still to be explored, has paved the way to an increasingly tight interaction between high-performance computing techniques and embedded systems, moving compute-intensive loads to edge devices that interact directly with sensor data. Last but not least, the growing ecosystem of open-source HW and SW development is giving particular momentum to new developments in embedded systems.

In this application scenario, the requirements almost universally recognized for advanced embedded hardware and software are energy efficiency and security, the latter with respect to both cyber-attacks and accidental or intentional faults, which broadens the scope to safety and resilience.

This Special Issue on Advanced Embedded HW/SW Development aims to collect contributions from various sources that together compose a representative picture of recent advances in the area defined above. Accounts of practical development of innovative systems or components are particularly welcome.

Topics of interest of this Special Issue include but are not limited to:

  • Advanced embedded hardware and software for edge computing and edge/fog/cloud connectivity;
  • Hardware acceleration and microprocessor core development for advanced embedded systems;
  • Bio-inspired embedded hardware and software development;
  • Approximate computing in advanced embedded hardware and software;
  • Specialized circuits and technologies for advanced embedded systems;
  • Machine learning techniques applied to embedded hardware and software development;
  • Use cases and analysis of advanced embedded hardware and software.

Prof. Dr. Mauro Olivieri
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (9 papers)

Research

9 pages, 5927 KiB  
Article
Embedded GPU Implementation for High-Performance Ultrasound Imaging
by Stefano Rossi and Enrico Boni
Electronics 2021, 10(8), 884; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10080884 - 08 Apr 2021
Cited by 1 | Viewed by 2634
Abstract
Methods of increasing complexity are currently being proposed for ultrasound (US) echographic signal processing. Graphics Processing Unit (GPU) resources allowing massive exploitation of parallel computing are ideal candidates for these tasks. Many high-performance US instruments, including open scanners like ULA-OP 256, have an architecture based only on Field-Programmable Gate Arrays (FPGAs) and/or Digital Signal Processors (DSPs). This paper proposes the implementation of the embedded NVIDIA Jetson Xavier AGX module on board ULA-OP 256. The system architecture was revised to allow the introduction of a new Peripheral Component Interconnect Express (PCIe) communication channel, while maintaining backward compatibility with all other embedded computing resources already on board. Moreover, the Input/Output (I/O) peripherals of the module make the ultrasound system independent, freeing the user from the need to use an external controlling PC.
(This article belongs to the Special Issue Advanced Embedded HW/SW Development)

21 pages, 4612 KiB  
Article
Customizable Vector Acceleration in Extreme-Edge Computing: A RISC-V Software/Hardware Architecture Study on VGG-16 Implementation
by Stefano Sordillo, Abdallah Cheikh, Antonio Mastrandrea, Francesco Menichelli and Mauro Olivieri
Electronics 2021, 10(4), 518; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10040518 - 23 Feb 2021
Cited by 6 | Viewed by 3060
Abstract
Computing in the cloud-edge continuum, as opposed to cloud computing, relies on high performance processing on the extreme edge of the Internet of Things (IoT) hierarchy. Hardware acceleration is a mandatory solution to achieve the performance requirements, yet it can be tightly tied to particular computation kernels, even within the same application. Vector-oriented hardware acceleration has gained renewed interest to support artificial intelligence (AI) applications like convolutional networks or classification algorithms. We present a comprehensive investigation of the performance and power efficiency achievable by configurable vector acceleration subsystems, obtaining evidence of both the high potential of the proposed microarchitecture and the advantage of hardware customization in total transparency to the software program.

12 pages, 598 KiB  
Article
Algorithmic-Level Approximate Tensorial SVM Using High-Level Synthesis on FPGA
by Hamoud Younes, Ali Ibrahim, Mostafa Rizk and Maurizio Valle
Electronics 2021, 10(2), 205; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10020205 - 17 Jan 2021
Cited by 13 | Viewed by 3120
Abstract
Approximate Computing Techniques (ACT) are promising solutions towards the achievement of reduced energy, time latency and hardware size for embedded implementations of machine learning algorithms. In this paper, we present the first FPGA implementation of an approximate tensorial Support Vector Machine (SVM) classifier with algorithmic level ACTs using High-Level Synthesis (HLS). A touch modality classification framework was adopted to validate the effectiveness of the proposed implementation. When compared to the exact implementation presented in the state of the art, the proposed implementation achieves a reduction in power consumption by up to 49% with a speedup of 3.2×. Moreover, the hardware resources are reduced by 40% while consuming 82% less energy in classifying an input touch with an accuracy loss of less than 5%.

29 pages, 474 KiB  
Article
Singular Value Decomposition in Embedded Systems Based on ARM Cortex-M Architecture
by Michele Alessandrini, Giorgio Biagetti, Paolo Crippa, Laura Falaschetti, Lorenzo Manoni and Claudio Turchetti
Electronics 2021, 10(1), 34; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10010034 - 28 Dec 2020
Cited by 8 | Viewed by 4092
Abstract
Singular value decomposition (SVD) is a central mathematical tool for several emerging applications in embedded systems, such as multiple-input multiple-output (MIMO) systems, data analytics, and sparse representation of signals. Since SVD algorithms reduce to solving an eigenvalue problem, which is computationally expensive, both specific hardware solutions and parallel implementations have been proposed to overcome this bottleneck. However, as those solutions require additional hardware resources that are not in general available in embedded systems, optimized algorithms are demanded in this context. The aim of this paper is to present an efficient implementation of the SVD algorithm on ARM Cortex-M. To this end, we proceed to (i) present a comprehensive treatment of the most common algorithms for SVD, providing a fairly complete and deep overview of these algorithms, with a common notation, (ii) implement them on an ARM Cortex-M4F microcontroller, in order to develop a library suitable for embedded systems without an operating system, (iii) find, through a comparative study of the proposed SVD algorithms, the best implementation suitable for a low-resource bare-metal embedded system, (iv) show a practical application to Kalman filtering of an inertial measurement unit (IMU), as an example of how SVD can improve the accuracy of existing algorithms and of its usefulness on such a low-resource system. All these contributions can be used as guidelines for embedded system designers. Regarding the second point, the chosen algorithms have been implemented on ARM Cortex-M4F microcontrollers with very limited hardware resources with respect to more advanced CPUs. Several experiments have been conducted to select which algorithms guarantee the best performance in terms of speed, accuracy and energy consumption.

25 pages, 2751 KiB  
Article
Automatic Method for Distinguishing Hardware and Software Faults Based on Software Execution Data and Hardware Performance Counters
by Jihyun Park and Byoungju Choi
Electronics 2020, 9(11), 1815; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics9111815 - 02 Nov 2020
Cited by 1 | Viewed by 2330
Abstract
Debugging in an embedded system where hardware and software are tightly coupled and have restricted resources is far from trivial. When hardware defects appear as if they were software defects, determining the real source becomes challenging. In this study, we propose an automated method of distinguishing whether a defect originates from the hardware or software at the stage of integration testing of hardware and software. Our method overcomes the limitations of the embedded environment, minimizes the effects on runtime, and identifies defects by obtaining and analyzing software execution data and hardware performance counters. We analyze the effects of the proposed method through an empirical study. The experimental results reveal that our method can effectively distinguish defects.

21 pages, 878 KiB  
Article
Sw/Hw Partitioning and Scheduling on Region-Based Dynamic Partial Reconfigurable System-on-Chip
by Qi Tang, Biao Guo and Zhe Wang
Electronics 2020, 9(9), 1362; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics9091362 - 21 Aug 2020
Cited by 2 | Viewed by 2213
Abstract
A heterogeneous system-on-chip (SoC) integrates multiple types of processors on the same chip. It has great advantages in many aspects, such as processing capacity, size, weight, cost, power, and energy consumption, which result in it being widely adopted in many fields. The SoC based on region-based dynamic partial reconfigurable (DPR) FPGA plays an important role in the SoC field. However, delivering its powerful capacity to the consumer depends on the efficient Sw/Hw partitioning and scheduling technology that determines the resource volume of the DPR region, the mapping of the application to the DPR region and other processors, and the schedule of the task and its reconfiguration. This paper first proposes an exact approach based on mixed integer linear programming (MILP) for the Sw/Hw partitioning and scheduling problem. The proposed MILP is able to solve the problem optimally; however, its scalability is poor, even though we carefully designed its formulation and tried to make it as concise as possible. Therefore, a multi-step hybrid method that combines graph partitioning and MILP is proposed, which is able to reduce the time complexity significantly while degrading the solution quality only marginally. A set of experiments is carried out using a set of real-life applications, and the result demonstrates the effectiveness of the proposed methods.

17 pages, 2780 KiB  
Article
The L3Pilot Data Management Toolchain for a Level 3 Vehicle Automation Pilot
by Johannes Hiller, Sami Koskinen, Riccardo Berta, Nisrine Osman, Ben Nagy, Francesco Bellotti, Ashfaqur Rahman, Erik Svanberg, Hendrik Weber, Eduardo H. Arnold, Mehrdad Dianati and Alessandro De Gloria
Electronics 2020, 9(5), 809; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics9050809 - 15 May 2020
Cited by 9 | Viewed by 3963
Abstract
As industrial research in automated driving is rapidly advancing, it is of paramount importance to analyze field data from extensive road tests. This paper investigates the design and development of a toolchain to process and manage experimental data to answer a set of research questions about the evaluation of automated driving functions at various levels, from technical system functioning to overall impact assessment. We have faced this challenge in L3Pilot, the first comprehensive test of automated driving functions (ADFs) on public roads in Europe. L3Pilot is testing ADFs in vehicles made by 13 companies. The tested functions are mainly of Society of Automotive Engineers (SAE) automation level 3, some of them of level 4. In this context, the presented toolchain supports various confidentiality levels, and allows seamless data management across vehicle owners, with the efficient storage of data and their iterative processing with a variety of analysis and evaluation tools. Most of the toolchain modules have been developed to a prototype version in a desktop/cloud environment, exploiting state-of-the-art technology. This has allowed us to efficiently set up what could become a comprehensive edge-to-cloud reference architecture for managing data in automated vehicle tests. The project has released as open source the data format into which all vehicular signals, recorded in proprietary formats, were converted in order to support efficient processing through multiple tools, scalability, and data quality checking. We expect that this format should enhance research on automated driving testing, as it provides a shared framework for dealing with data from collection to analysis. We are confident that this format, and the information provided in this article, can represent a reference for the design of future architectures to implement in vehicles.

21 pages, 6127 KiB  
Article
Sequence-To-Sequence Neural Networks Inference on Embedded Processors Using Dynamic Beam Search
by Daniele Jahier Pagliari, Francesco Daghero and Massimo Poncino
Electronics 2020, 9(2), 337; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics9020337 - 15 Feb 2020
Cited by 4 | Viewed by 3340
Abstract
Sequence-to-sequence deep neural networks have become the state of the art for a variety of machine learning applications, ranging from neural machine translation (NMT) to speech recognition. Many mobile and Internet of Things (IoT) applications would benefit from the ability to perform sequence-to-sequence inference directly in embedded devices, thereby reducing the amount of raw data transmitted to the cloud, and obtaining benefits in terms of response latency, energy consumption and security. However, due to the high computational complexity of these models, specific optimization techniques are needed to achieve acceptable performance and energy consumption on single-core embedded processors. In this paper, we present a new optimization technique called dynamic beam search, in which the inference complexity is tuned to the difficulty of the processed input sequence at runtime. Results based on measurements on a real embedded device, and on three state-of-the-art deep learning models, show that our method is able to reduce the inference time and energy by up to 25% without loss of accuracy.

20 pages, 20221 KiB  
Article
Open Vision System for Low-Cost Robotics Education
by Julio Vega and José M. Cañas
Electronics 2019, 8(11), 1295; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics8111295 - 06 Nov 2019
Cited by 13 | Viewed by 4602
Abstract
Vision devices are currently one of the most widely used sensory elements in robots: commercial autonomous cars and vacuum cleaners, for example, have cameras. These vision devices can provide a great amount of information about robot surroundings. However, platforms for robotics education usually lack such devices, mainly because of the computing limitations of low-cost processors. New educational platforms using Raspberry Pi are able to overcome this limitation while keeping costs low, but extracting information from the raw images is complex for children. This paper presents an open source vision system that simplifies the use of cameras in robotics education. It includes functions for the visual detection of complex objects and a visual memory that computes obstacle distances beyond the small field of view of regular cameras. The system was experimentally validated using the PiCam camera mounted on a pan unit on a Raspberry Pi-based robot. The performance and accuracy of the proposed vision system were studied and then used to solve two visual educational exercises: safe visual navigation with obstacle avoidance and person-following behavior.