Journal of Low Power Electronics and Applications

14 pages, 4140 KiB

Open Access Article

A 1.02 μW Autarkic Threshold-Based Sensing and Energy Harvesting Interface Using a Single Piezoelectric Element

by Zoi Agorastou, Vasileios Kalenteridis and Stylianos Siskos

J. Low Power Electron. Appl. 2021, 11(2), 27; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020027 - 04 Jun 2021

Cited by 3 | Viewed by 3665

Abstract

A self-powered piezoelectric sensor interface employing part of the signal that is not intended for measurement to sustain its autonomous operation was designed using XH018 (180 nm) technology. The aim of the proposed circuit, besides the energy self-sufficiency of the sensor, is to provide an interface that eliminates the effect of the harvesting process on the piezoelectric output signal which contains context data. This is achieved by isolating part of the signal that is desirable for sensing from the harvesting process so that the former is not affected or distorted by the latter. Moreover, the circuit manages to self-start its operation, so no additional battery or pre-charged capacitor is needed. The circuit achieves a very low power consumption of 1.02 μW. As a proof of concept, the proposed interfacing circuit is implemented in order to be potentially used for weigh-in-motion applications.

13 pages, 4828 KiB

Open Access Article

Design of Low-Voltage FO-[PD] Controller for Motion Systems

by Rafailia Malatesta, Stavroula Kapoulea, Costas Psychalinos and Ahmed S. Elwakil

J. Low Power Electron. Appl. 2021, 11(2), 26; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020026 - 31 May 2021

Cited by 5 | Viewed by 2981

Abstract

Fractional-order controllers have gained significant research interest in various practical applications due to the additional degrees of freedom offered in their tuning process. The main contribution of this work is the analog implementation, for the first time in the literature, of a fractional-order controller with a transfer function that is not directly constructed from terms of the fractional-order Laplacian operator. This is achieved using Padé approximation, and the resulting integer-order transfer function is implemented using operational transconductance amplifiers as active elements. Post-layout simulation results verify the validity of the introduced procedure.

14 pages, 620 KiB

Open Access Article

PageRank Implemented with the MPI Paradigm Running on a Many-Core Neuromorphic Platform

by Evelina Forno, Alessandro Salvato, Enrico Macii and Gianvito Urgese

J. Low Power Electron. Appl. 2021, 11(2), 25; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020025 - 28 May 2021

Cited by 4 | Viewed by 3212

Abstract

SpiNNaker is a neuromorphic hardware platform, especially designed for the simulation of Spiking Neural Networks (SNNs). To this end, the platform features massively parallel computation and an efficient communication infrastructure based on the transmission of small packets. The effectiveness of SpiNNaker in the parallel execution of the PageRank (PR) algorithm has been tested by the realization of a custom SNN implementation. In this work, we propose a PageRank implementation fully realized with the MPI programming paradigm ported to the SpiNNaker platform. We compare the scalability of the proposed program with the equivalent SNN implementation, and we leverage the characteristics of the PageRank algorithm to benchmark our implementation of MPI on SpiNNaker when faced with massive communication requirements. Experimental results show that the algorithm exhibits favorable scaling for a mid-sized execution context, while highlighting that the performance of MPI-PageRank on SpiNNaker is bounded by memory size and speed limitations on the current version of the hardware.

(This article belongs to the Special Issue Advances in Programming Parallel and Heterogeneous Computing for Cyber-Physical Systems)
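
For readers unfamiliar with the algorithm being benchmarked, the following is a minimal sequential sketch of PageRank by power iteration. It is not the authors' MPI or SNN implementation: the function name `pagerank`, the damping factor of 0.85, the convergence tolerance, and the four-node example graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {node: [outgoing neighbours]}.

    Hypothetical helper for illustration; parameter values are assumptions.
    """
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        # Each node shares its current rank equally among its out-links.
        for u in nodes:
            outs = links.get(u, [])
            for v in outs:
                new[v] += damping * rank[u] / len(outs)
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy graph (assumed): "d" has no incoming links, "c" collects the most.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
```

In a distributed setting, the inner loop over out-links is the part that turns into message exchange between workers, which is presumably why the algorithm is a useful stress test for communication-heavy workloads.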

19 pages, 3055 KiB

Open Access Article

Efficient ROS-Compliant CPU-iGPU Communication on Embedded Platforms

by Mirco De Marchi, Francesco Lumpp, Enrico Martini, Michele Boldo, Stefano Aldegheri and Nicola Bombieri

J. Low Power Electron. Appl. 2021, 11(2), 24; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020024 - 26 May 2021

Cited by 2 | Viewed by 4892

Abstract

Many modern programmable embedded devices contain CPUs and a GPU that share the same system memory on a single die. Such a unified memory architecture (UMA) allows programmers to implement different communication models between the CPU and the integrated GPU (iGPU). Although the simpler model guarantees implicit synchronization at the cost of performance, the more advanced model allows, through the zero-copy paradigm, the explicit data copying between CPU and iGPU to be eliminated, with the benefit of significantly improving performance and energy savings. On the other hand, the robot operating system (ROS) has become a de facto reference standard for developing robotic applications. It allows for application re-use and the easy integration of software blocks in complex cyber-physical systems. Although ROS compliance is strongly required for SW portability and reuse, it can lead to performance loss and negate the benefits of zero-copy communication. In this article we present efficient techniques to implement CPU–iGPU communication while guaranteeing compliance with the ROS standard. We show how key features of each communication model are maintained and the corresponding overhead introduced by ROS compliance.

(This article belongs to the Special Issue Advances in Programming Parallel and Heterogeneous Computing for Cyber-Physical Systems)

16 pages, 438 KiB

Open Access Review

A Review of Algorithms and Hardware Implementations for Spiking Neural Networks

by Duy-Anh Nguyen, Xuan-Tu Tran and Francesca Iacopi

J. Low Power Electron. Appl. 2021, 11(2), 23; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020023 - 24 May 2021

Cited by 30 | Viewed by 7167

Abstract

Deep Learning (DL) has contributed to the success of many applications in recent years. The applications range from simple ones such as recognizing tiny images or simple speech patterns to ones with a high level of complexity such as playing the game of Go. However, this superior performance comes at a high computational cost, which makes porting DL applications to conventional hardware platforms a challenging task. Many approaches have been investigated, and Spiking Neural Networks (SNNs) are among the promising candidates. SNNs are the third generation of Artificial Neural Networks (ANNs), where each neuron in the network uses discrete spikes to communicate in an event-based manner. SNNs have the potential advantage of achieving better energy efficiency than their ANN counterparts. While SNN models generally incur some loss of accuracy, new algorithms have helped to close the accuracy gap. For hardware implementations, SNNs have attracted much attention in the neuromorphic hardware research community. In this work, we review the basic background of SNNs, the current state and challenges of the training algorithms for SNNs and the current implementations of SNNs on various hardware platforms.

(This article belongs to the Special Issue Artificial Intelligence of Things (AIoT))
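
To make "discrete spikes in an event-based manner" concrete, here is a minimal sketch of a discrete-time leaky integrate-and-fire (LIF) neuron, a common textbook spiking model. The abstract does not name a specific neuron model, so the choice of LIF, the function name `simulate_lif`, and the threshold, reset, and leak values are all assumptions for illustration.

```python
def simulate_lif(input_current, v_thresh=1.0, v_reset=0.0, leak=0.9):
    """Discrete-time leaky integrate-and-fire neuron (illustrative sketch).

    Returns a list of 0/1 spike flags, one per input sample.
    All parameter values are assumed, not taken from the review.
    """
    v = 0.0
    spikes = []
    for i in input_current:
        # Membrane potential decays by `leak`, then integrates the input.
        v = leak * v + i
        if v >= v_thresh:
            spikes.append(1)  # emit a spike event
            v = v_reset       # reset the membrane after firing
        else:
            spikes.append(0)
    return spikes


# Constant drive of 0.3 per step: the neuron fires every fourth step.
spikes = simulate_lif([0.3] * 10)
print(spikes)  # [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
```

Downstream neurons only need to do work on the steps where a 1 appears, which is the intuition behind the potential energy advantage mentioned in the abstract.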

11 pages, 2483 KiB

Open Access Article

An Automatic Offset Calibration Method for Differential Charge-Based Capacitance Measurement

by Umberto Ferlito, Alfio Dario Grasso, Michele Vaiana and Giuseppe Bruno

J. Low Power Electron. Appl. 2021, 11(2), 22; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020022 - 20 May 2021

Cited by 2 | Viewed by 2470

Abstract

The Charge-Based Capacitance Measurement (CBCM) technique is a simple but effective technique for measuring capacitance values down to the attofarad level. However, when adopted for fully on-chip implementation, this technique suffers from output offset caused by mismatches and process variations. This paper introduces a novel method that compensates the offset of a fully integrated differential CBCM electronic front-end. After a detailed theoretical analysis of the differential CBCM topology, we present and discuss a modified architecture that compensates mismatches and increases robustness against process variations. The proposed circuit has been simulated using a standard 130-nm technology and shows a sensitivity of 1.3 mV/aF and a 20× reduction of the standard deviation of the differential output voltage as compared to the traditional solution.

17 pages, 3771 KiB

Open Access Article

A g_m/I_D-Based Design Strategy for IoT and Ultra-Low-Power OTAs with Fast-Settling and Large Capacitive Loads

by Gianluca Giustolisi and Gaetano Palumbo

J. Low Power Electron. Appl. 2021, 11(2), 21; https://doi.org/10.3390/jlpea11020021 - 12 May 2021

Cited by 7 | Viewed by 5014

Abstract

In this paper, a new strategy for the design of ultra-low-power CMOS operational transconductance amplifiers (OTAs), using the g_m/I_D approach, is proposed for the Internet-of-things (IoT) scenario. The strategy optimizes the speed/dissipation of the OTA in terms of settling time, including slew-rate effects. It was designed for large capacitive loads and for transistors biased in the sub-threshold region, but it is also suitable for low-capacitive loads or for transistors biased in the saturation region. To validate the proposed strategy, a well-known three-stage OTA was designed starting from capacitive load and settling time requirements. Simulations confirmed that the OTA satisfies the specifications (even under Monte Carlo analysis), thus proving the correctness of the proposed approach.

12 pages, 3442 KiB

Open Access Article

Accelerating Population Count with a Hardware Co-Processor for MicroBlaze

by Iouliia Skliarova

J. Low Power Electron. Appl. 2021, 11(2), 20; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020020 - 24 Apr 2021

Cited by 6 | Viewed by 3421

Abstract

This paper proposes a Field-Programmable Gate Array (FPGA)-based hardware accelerator for assisting the embedded MicroBlaze soft-core processor in calculating population count. The population count is frequently required to be executed in cyber-physical systems and can be applied to large data sets, such as in the case of molecular similarity search in cheminformatics, or assisting with computations performed by binarized neural networks. The MicroBlaze instruction set architecture (ISA) does not support this operation natively, so the count has to be realized as either a sequence of native instructions (in software) or in parallel in a dedicated hardware accelerator. Different hardware accelerator architectures are analyzed and compared to one another and to implementing the population count operation in MicroBlaze. The achieved experimental results with large vector lengths (up to 2¹⁷) demonstrate that the best hardware accelerator with DMA (Direct Memory Access) is ~31 times faster than the best software version running on MicroBlaze. The proposed architectures are scalable and can easily be adjusted to both smaller and bigger input vector lengths. The entire system was implemented and tested on a Nexys-4 prototyping board containing a low-cost/low-power Artix-7 FPGA.

(This article belongs to the Special Issue Advances in Programming Parallel and Heterogeneous Computing for Cyber-Physical Systems)
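
As an illustration of what "a sequence of native instructions (in software)" can look like, the sketch below shows two generic software population counts: a bit-by-bit loop and the well-known SWAR (SIMD-within-a-register) method for 32-bit words, which uses only shifts, masks, adds, and one multiply. These are standard techniques, not necessarily the software versions evaluated in the paper; the function names and the helper `popcount_vector` are hypothetical.

```python
def popcount_naive(x: int) -> int:
    """Count set bits one at a time (one loop iteration per bit position)."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


def popcount_swar32(x: int) -> int:
    """SWAR population count for a 32-bit word."""
    x &= 0xFFFFFFFF
    x = x - ((x >> 1) & 0x55555555)                  # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)   # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    # Multiply accumulates the four byte sums into the top byte.
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24


def popcount_vector(words) -> int:
    """Population count of a bit vector stored as 32-bit words (assumed layout)."""
    return sum(popcount_swar32(w) for w in words)


total = popcount_vector([0xFFFFFFFF, 0x0000F0F0, 0x00000001])
```

The SWAR variant runs in a fixed number of operations per word regardless of how many bits are set, which makes it a reasonable baseline to compare against a parallel hardware counter. In Python 3.10 and later, the built-in `int.bit_count()` gives the same result directly.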

15 pages, 563 KiB

Open Access Article

A 0.3 V Rail-to-Rail Ultra-Low-Power OTA with Improved Bandwidth and Slew Rate

by Francesco Centurelli, Riccardo Della Sala, Pietro Monsurrò, Giuseppe Scotti and Alessandro Trifiletti

J. Low Power Electron. Appl. 2021, 11(2), 19; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020019 - 21 Apr 2021

Cited by 22 | Viewed by 4375

Abstract

In this paper, we present a novel operational transconductance amplifier (OTA) topology based on a dual-path body-driven input stage that exploits a body-driven current mirror-active load and targets ultra-low-power (ULP) and ultra-low-voltage (ULV) applications, such as IoT or biomedical devices. The proposed OTA exhibits only one high-impedance node, and can therefore be compensated at the output stage, thus not requiring Miller compensation. The input stage ensures rail-to-rail input common-mode range, whereas the gate-driven output stage ensures both a high open-loop gain and an enhanced slew rate. The proposed amplifier was designed in an STMicroelectronics 130 nm CMOS process with a nominal supply voltage of only 0.3 V, and it achieved very good values for both the small-signal and large-signal Figures of Merit. Extensive PVT (process, supply voltage, and temperature) and mismatch simulations are reported to prove the robustness of the proposed amplifier.

24 pages, 2009 KiB

Open Access Article

Low-Power Audio Keyword Spotting Using Tsetlin Machines

by Jie Lei, Tousif Rahman, Rishad Shafik, Adrian Wheeldon, Alex Yakovlev, Ole-Christoffer Granmo, Fahim Kawsar and Akhil Mathur

J. Low Power Electron. Appl. 2021, 11(2), 18; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020018 - 09 Apr 2021

Cited by 29 | Viewed by 5691

Abstract

The emergence of artificial intelligence (AI)-driven keyword spotting (KWS) technologies has revolutionized human-to-machine interaction. Yet, the challenge of end-to-end energy efficiency, memory footprint and system complexity of current neural network (NN)-powered AI-KWS pipelines has remained ever present. This paper evaluates KWS utilizing a learning-automata-powered machine learning algorithm called the Tsetlin Machine (TM). Through significant reduction in parameter requirements and choosing logic over arithmetic-based processing, the TM offers new opportunities for low-power KWS while maintaining high learning efficacy. In this paper, we explore a TM-based KWS pipeline to demonstrate low complexity with a faster rate of convergence compared to NNs. Further, we investigate the scalability with increasing keywords and explore the potential for enabling low-power on-chip KWS.

(This article belongs to the Special Issue Artificial Intelligence of Things (AIoT))

18 pages, 3098 KiB

Open Access Article

Highly Adaptive Linear Actor-Critic for Lightweight Energy-Harvesting IoT Applications

by Sota Sawaguchi, Jean-Frédéric Christmann and Suzanne Lesecq

J. Low Power Electron. Appl. 2021, 11(2), 17; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020017 - 08 Apr 2021

Cited by 1 | Viewed by 2548

Abstract

Reinforcement learning (RL) has received much attention in recent years due to its adaptability to unpredictable events such as harvested energy and workload, especially in the context of edge computing for Internet-of-Things (IoT) nodes. Due to limited resources in IoT nodes, it is difficult to achieve self-adaptability. This paper studies online reactivity issues of a fixed learning rate in the linear actor-critic (LAC) algorithm for transmission duty-cycle control. We propose the LAC-AB algorithm, which introduces into the LAC algorithm an adaptive learning rate called Adam for the actor update to achieve better adaptability. We introduce a definition of “convergence” for cases where quantitative analysis of convergence is performed. Simulation results using real-life one-year solar irradiance data indicate that, unlike the conventional settings of Adam's two decay rates β_1 and β_2, smaller β_1 values of 0.2–0.4 are suitable for power-failure-sensitive applications and values of 0.5–0.7 for latency-sensitive applications, with β_2 ∈ [0.1, 0.3]. LAC-AB improves the time of reactivity by 68.5–88.1% in our application; it also fine-tunes the initial learning rate for the initial state and improves the time of fine-tuning by 78.2–84.3%, compared to the LAC. Besides, the number of power failures is drastically reduced to zero or a few occurrences over 300 simulations.

(This article belongs to the Special Issue Artificial Intelligence of Things (AIoT))
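
To show where the two decay rates enter, the sketch below implements one step of the standard Adam update with bias correction for a scalar parameter. It is not the LAC-AB actor update from the paper: the function name `adam_step`, the learning rate, and the quadratic toy objective are assumptions, and the defaults β_1 = 0.3 and β_2 = 0.2 are simply picked from the ranges quoted in the abstract.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.01, beta1=0.3, beta2=0.2, eps=1e-8):
    """One standard Adam update for a scalar parameter (illustrative sketch).

    beta1 controls the decay of the gradient average (first moment);
    beta2 controls the decay of the squared-gradient average (second moment).
    `t` is the 1-based step counter used for bias correction.
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)   # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy objective (assumed): minimise f(theta) = (theta - 3)^2 from theta = 0.
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Smaller decay rates shorten the memory of both moving averages, so the step reacts more quickly to a change in the gradient, which is consistent with the abstract's emphasis on reactivity. The commonly used defaults of β_1 = 0.9 and β_2 = 0.999 average over far longer histories.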

20 pages, 1242 KiB

Open Access Review

Internet of Things: A Review on Theory Based Impedance Matching Techniques for Energy Efficient RF Systems

by Benoit Couraud, Remy Vauche, Spyridon Nektarios Daskalakis, David Flynn, Thibaut Deleruyelle, Edith Kussener and Stylianos Assimonis

J. Low Power Electron. Appl. 2021, 11(2), 16; https://doi.org/10.3390/jlpea11020016 - 31 Mar 2021

Cited by 7 | Viewed by 3548

Abstract

Within an increasingly connected world, the exponential growth in the deployment of Internet of Things (IoT) applications presents a significant challenge in power and data transfer optimisation. Currently, the maximization of Radio Frequency (RF) system power gain depends on the design of efficient, commercial chips, and on the integration of these chips by using complex RF simulations to verify bespoke configurations. However, even if a standard 50 Ω transmitter’s chip has an efficiency of 90%, the overall power efficiency of the RF system can be reduced by 10% if coupled with a standard antenna of 72 Ω. Hence, it is necessary for scalable IoT networks to have optimal RF system design for every transceiver: for example, impedance mismatching between a transmitter’s antenna and chip leads to a significant reduction of the corresponding RF system’s overall power efficiency. This work presents a versatile design framework, based on well-known theoretical methods (i.e., transducer gain, power wave approach, transmission line theory), for the optimal design in terms of power delivered to a load of a typical RF system, which consists of an antenna, a matching network, a load (e.g., integrated circuit) and transmission lines which connect all these parts. The aim of this design framework is not only to reduce the computational effort needed for the design and prototyping of power efficient RF systems, but also to increase the accuracy of the analysis, based on the explanatory analysis within our design framework. Simulated and measured results verify the accuracy of this proposed design framework over a 0–4 GHz spectrum. Finally, a case study based on the design of an RF system for Bluetooth applications demonstrates the benefits of this RF design framework.

(This article belongs to the Special Issue Artificial Intelligence of Things (AIoT))

15 pages, 11085 KiB

Open Access Article

A 28 nm CMOS 100 MHz 67 dB-Dynamic-Range 968 µW Flipped-Source-Follower Analog Filter

by Marcello De Matteis, Federico Fary, Elia A. Vallicelli and Andrea Baschirotto

J. Low Power Electron. Appl. 2021, 11(2), 15; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020015 - 30 Mar 2021

Cited by 2 | Viewed by 2487

Abstract

This paper presents a fourth-order continuous-time analog filter based on the cascade of two flipped-source-follower (FSF) biquadratic (biquad) cells. The FSF biquad adopts two interacting loops (the first due to the classic source-follower, and the second to the additional gain path) which lower the impedances of all circuit nodes with relevant benefits in terms of noise power reduction and linearity enhancement. The presented device was integrated in 28 nm CMOS and featured 100 MHz −3 dB bandwidth with 67 dB Dynamic-Range. Input IP3 was 12 dBm at 10 and 11 MHz input tone frequencies. Total power consumption was 0.968 mW (0.484 mW per cell). Hence, the filter achieved one of the highest figures-of-merit (160.7 dBJ⁻¹) compared with analog state-of-the-art filters.

21 pages, 552 KiB

Open Access Article

Physical Computing: Unifying Real Number Computation to Enable Energy Efficient Computing

by Jennifer Hasler and Eric Black

J. Low Power Electron. Appl. 2021, 11(2), 14; https://0-doi-org.brum.beds.ac.uk/10.3390/jlpea11020014 - 26 Mar 2021

Cited by 7 | Viewed by 3438

Abstract

Physical computing unifies real-valued computing, including analog, neuromorphic, optical, and quantum computing. Many real-valued techniques show improvements in energy efficiency, enable smaller area per computation, and potentially improve algorithm scaling. These physical computing techniques suffer from not having a strong computational theory to guide application development, in contrast to digital computation’s deep theoretical grounding in application development. We consider the possibility of a real-valued Turing machine model, the potential computational and algorithmic opportunities of these techniques, the implications for implementation applications, and the computational complexity space arising from this model. These techniques have shown promise in increasing energy efficiency, enabling smaller area per computation, and potentially improving algorithm scaling.

J. Low Power Electron. Appl., Volume 11, Issue 2 (June 2021) – 14 articles
