AI-Enabled Advanced Sensing for Human Action and Activity Recognition

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (30 January 2022) | Viewed by 41438

Special Issue Editors

Intelligent Media Laboratory, Digital Contents Research Institute, Sejong University, Seoul, Korea
Interests: data mining; visual mining; computer vision; intelligent robots; mixed reality; cultural property restoration; tourism content; digital content authoring
Intelligent Media Lab, Department of Software, Sejong University, Seoul, Korea
Interests: action recognition; activity recognition; anomaly recognition; computer vision; video analytics; deep learning; video summarization
Institute for Infocomm Research (I2R), A*STAR, 1 Fusionopolis Way, Singapore 138632, Singapore
Interests: data analytics; deep learning; domain adaptation; self-supervised learning and related applications
College of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China
Interests: human activity recognition; Internet of Things; machine learning; deep learning; sensor-enabled IoT; smart homes
Digital Research Center of Sfax (CRNS), Head of the DeepVision Research Team
Interests: deep learning; pattern recognition; document processing; computer vision; data fusion

Special Issue Information

Dear Colleagues,

Recent emerging technologies for recognizing human actions and activities serve public security, asset protection, and analytics-based applications for healthcare and entertainment. Surveillance systems are installed at every corner of public zones such as parks, airports, and subways to record ongoing events. The proliferation of these surveillance systems has greatly increased the need to protect people and assets and to reduce the risk of anomalies through automatic analysis of human activities. Therefore, automatic real-time human action and activity recognition algorithms are required that use data from different sensors and their fusion (e.g., vision sensors, depth sensors, and skeleton sensors). Furthermore, advanced smart sensor technologies in the IoT and their improvement in embedded devices have brought a paradigm shift in secure surveillance for recognizing human activities and handling real-world anomalies. The main goal of these technologies is computationally intelligent surveillance that integrates several methodologies, such as cloud computing, the IoT, distributed computing, and embedded vision systems, for the efficient and instant analysis of human activities.

This Special Issue, entitled “AI-Enabled Advanced Sensing for Human Action and Activity Recognition”, calls for original works revealing the latest research advances in conventional machine learning and deep learning methods that deeply analyze the structure of human actions and activity patterns from distinct kinds of sensor data or their fusion. These recognition methods should be grounded in smart and innovative machine intelligence.

The topics include but are not limited to the following:

• Deep learning for action/activity/anomaly recognition.
• Spatiotemporal feature extraction for human sequential pattern analysis.
• Lightweight 2D and 3D convolutional neural networks for human action/activity recognition.
• RGB/depth/skeleton sensor-based action recognition.
• Data prioritization prior to human activity pattern analysis.
• Violence recognition.
• Embedded vision for action/activity recognition.
• Sensor/multi-sensor integration for activity recognition.
• Activity localization, detection, and context analysis.
• IoT-assisted computationally intelligent methods for activity recognition.
• Cloud/fog computing for action and activity recognition.
• Benchmark datasets for action/activity/anomaly recognition.

Prof. Dr. Sung Wook Baik
Dr. Khan Muhammad
Dr. Zhenghua Chen
Dr. Hao Jiang
Dr. Yousri Kessentini
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (9 papers)


Research

15 pages, 2669 KiB  
Article
Efficient Violence Detection in Surveillance
by Romas Vijeikis, Vidas Raudonis and Gintaras Dervinis
Sensors 2022, 22(6), 2216; https://0-doi-org.brum.beds.ac.uk/10.3390/s22062216 - 13 Mar 2022
Cited by 23 | Viewed by 9907
Abstract
Intelligent video surveillance systems are rapidly being introduced to public places. The adoption of computer vision and machine learning techniques enables various applications for collected video features; one of the major ones is safety monitoring. The efficacy of a violence detection system is measured by its efficiency and accuracy. In this paper, we present a novel architecture for violence detection from video surveillance cameras. Our proposed model consists of a U-Net-like network for spatial feature extraction, with MobileNet V2 as the encoder, followed by an LSTM for temporal feature extraction and classification. The proposed model is computationally light and still achieves good results: experiments showed an average accuracy of 0.82 ± 2% and an average precision of 0.81 ± 3% on a complex real-world security camera footage dataset based on RWF-2000.
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
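As a rough orientation for readers, the following is a minimal PyTorch sketch of the general pattern the abstract describes — per-frame MobileNet V2 spatial features followed by an LSTM for temporal classification. The U-Net-style decoder path is omitted, and the hidden size, clip length, and class count are illustrative assumptions rather than the authors' settings.

```python
# Hedged sketch: MobileNet V2 spatial encoder per frame, then an LSTM
# over the frame features; not the authors' exact architecture.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class ViolenceDetector(nn.Module):
    def __init__(self, hidden=128, num_classes=2):
        super().__init__()
        self.encoder = mobilenet_v2(weights=None).features   # spatial encoder
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.lstm = nn.LSTM(1280, hidden, batch_first=True)  # temporal model
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip):                       # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.encoder(clip.flatten(0, 1))   # (B*T, 1280, h, w)
        feats = self.pool(feats).flatten(1).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])               # classify from last state

logits = ViolenceDetector()(torch.randn(2, 16, 3, 224, 224))  # (2, 2)
```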

24 pages, 14754 KiB  
Article
HARNAS: Human Activity Recognition Based on Automatic Neural Architecture Search Using Evolutionary Algorithms
by Xiaojuan Wang, Xinlei Wang, Tianqi Lv, Lei Jin and Mingshu He
Sensors 2021, 21(20), 6927; https://0-doi-org.brum.beds.ac.uk/10.3390/s21206927 - 19 Oct 2021
Cited by 9 | Viewed by 1980
Abstract
Human activity recognition (HAR) based on wearable sensors is a promising research direction. The resources of handheld terminals and wearable devices limit recognition performance and call for lightweight architectures. With the development of deep learning, neural architecture search (NAS) has emerged in an attempt to minimize human intervention. We propose an approach for using NAS to search for models suitable for HAR tasks, namely HARNAS. The multi-objective search algorithm NSGA-II is used as the search strategy of HARNAS. To make a trade-off between the performance and computation speed of a model, the F1 score and the number of floating-point operations (FLOPs) are selected, resulting in a bi-objective problem. However, the computation speed of a model depends not only on its complexity but also on the memory access cost (MAC). Therefore, we expand the bi-objective search to a tri-objective strategy. We use the Opportunity dataset as the basis for most experiments and also evaluate the portability of the model on the UniMiB-SHAR dataset. The experimental results show that HARNAS, designed without manual adjustments, can achieve better performance than the best model tweaked by humans. HARNAS obtained an F1 score of 92.16% with a parameter size of 0.32 MB on the Opportunity dataset.
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
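To make the tri-objective selection concrete, here is a small, hedged Python sketch of the Pareto-dominance test that NSGA-II relies on, over the three objectives the abstract names (maximize F1, minimize FLOPs, minimize MAC); the candidate values are invented for illustration.

```python
# Hedged sketch of NSGA-II-style Pareto dominance over the three
# objectives named in the abstract: maximize F1, minimize FLOPs,
# minimize memory access cost (MAC). Values below are invented.
from dataclasses import dataclass

@dataclass
class Candidate:
    f1: float     # validation F1 score (higher is better)
    flops: float  # floating-point operations (lower is better)
    mac: float    # memory access cost (lower is better)

def dominates(a: Candidate, b: Candidate) -> bool:
    no_worse = a.f1 >= b.f1 and a.flops <= b.flops and a.mac <= b.mac
    strictly = a.f1 > b.f1 or a.flops < b.flops or a.mac < b.mac
    return no_worse and strictly

def pareto_front(pop):
    # keep candidates that no other candidate dominates
    return [c for c in pop if not any(dominates(o, c) for o in pop)]

front = pareto_front([Candidate(0.92, 3e8, 1e6),   # accurate but heavy
                      Candidate(0.90, 1e8, 5e5),   # lighter trade-off
                      Candidate(0.88, 2e8, 8e5)])  # dominated by the second
```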

18 pages, 942 KiB  
Article
Exploring 3D Human Action Recognition Using STACOG on Multi-View Depth Motion Maps Sequences
by Mohammad Farhad Bulbul, Sadiya Tabussum, Hazrat Ali, Wenli Zheng, Mi Young Lee and Amin Ullah
Sensors 2021, 21(11), 3642; https://0-doi-org.brum.beds.ac.uk/10.3390/s21113642 - 24 May 2021
Cited by 6 | Viewed by 2715
Abstract
This paper proposes an action recognition framework for depth map sequences using the 3D Space-Time Auto-Correlation of Gradients (STACOG) algorithm. First, each depth map sequence is split into two sets of sub-sequences with two different frame lengths. Second, a number of Depth Motion Map (DMM) sequences are generated from each set and fed into STACOG to obtain an auto-correlation feature vector. For the two distinct sets of sub-sequences, the two auto-correlation feature vectors are applied in turn to an L2-regularized Collaborative Representation Classifier (L2-CRC) to compute two sets of residual values. Next, the Logarithmic Opinion Pool (LOGP) rule is used to combine the two L2-CRC outcomes and assign an action label to the depth map sequence. Finally, our proposed framework is evaluated on three benchmark datasets: MSR-Action3D, DHA, and UTD-MHAD. We compare the experimental results of our framework with state-of-the-art approaches to demonstrate its effectiveness. The computational efficiency of the framework is also analyzed on all the datasets to check whether it is suitable for real-time operation.
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
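For readers unfamiliar with L2-CRC, the classifier admits a closed-form sketch: code the test feature over the whole training dictionary with ridge-regularized least squares, then pick the class with the smallest reconstruction residual. This is a generic NumPy illustration under assumed shapes, not the paper's implementation; the LOGP fusion of the two residual sets is omitted.

```python
# Hedged NumPy sketch of L2-CRC: ridge-regularized coding over the
# training dictionary, then nearest class by reconstruction residual.
import numpy as np

def l2_crc(A, labels, y, lam=0.01):
    """A: (d, n) training features as columns; labels: (n,); y: (d,)."""
    alpha = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c                      # columns of class c
        residuals[c] = np.linalg.norm(y - A[:, mask] @ alpha[mask])
    return min(residuals, key=residuals.get), residuals

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 6))                    # 6 training samples, 3 classes
labels = np.array([0, 0, 1, 1, 2, 2])
pred, _ = l2_crc(A, labels, A[:, 0])            # reclassify a training column
```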

17 pages, 23052 KiB  
Article
Device-Free Human Activity Recognition with Low-Resolution Infrared Array Sensor Using Long Short-Term Memory Neural Network
by Cunyi Yin, Jing Chen, Xiren Miao, Hao Jiang and Deying Chen
Sensors 2021, 21(10), 3551; https://0-doi-org.brum.beds.ac.uk/10.3390/s21103551 - 20 May 2021
Cited by 21 | Viewed by 3812
Abstract
Sensor-based human activity recognition (HAR) has attracted enormous interest due to its wide applications in the Internet of Things (IoT), smart homes, and healthcare. In this paper, a low-resolution infrared-array-sensor-based HAR approach is proposed using a deep learning framework. The device-free sensing system leverages an 8×8-pixel infrared array sensor to collect infrared signals, which preserves users’ privacy and effectively reduces the deployment cost of the network. To reduce the influence of temperature variations, a combination of the J-filter noise reduction method and a Butterworth filter is used to preprocess the infrared signals. Long short-term memory (LSTM), a representative recurrent neural network, is utilized to automatically extract characteristics from the infrared signals and build the recognition model. In addition, a real-time HAR interface is designed by embedding the LSTM model. Experimental results show that typical daily activities can be classified with a recognition accuracy of 98.287%. The proposed approach yields better results than existing machine learning methods and provides a low-cost yet promising solution for privacy-preserving scenarios.
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
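The core recognizer lends itself to a compact sketch: each 8×8 thermal frame is flattened to a 64-dimensional vector and the sequence is fed to an LSTM. This PyTorch snippet is a hedged approximation; the hidden size, sequence length, and class count are assumptions, and the J-filter/Butterworth preprocessing is omitted.

```python
# Hedged sketch: LSTM over flattened 8x8 infrared frames; sizes are
# illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class IRActivityLSTM(nn.Module):
    def __init__(self, hidden=64, num_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(64, hidden, batch_first=True)  # 8x8 -> 64 inputs
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):            # x: (B, T, 8, 8) infrared frames
        out, _ = self.lstm(x.flatten(2))
        return self.head(out[:, -1])

logits = IRActivityLSTM()(torch.randn(4, 30, 8, 8))  # (4, 6)
```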

16 pages, 4264 KiB  
Article
Activity Detection from Electricity Consumption and Communication Usage Data for Monitoring Lonely Deaths
by Gyubaek Kim and Sanghyun Park
Sensors 2021, 21(9), 3016; https://0-doi-org.brum.beds.ac.uk/10.3390/s21093016 - 25 Apr 2021
Cited by 6 | Viewed by 2516
Abstract
As the number of single-person households grows worldwide, the need to monitor their safety is gradually increasing. Among the approaches developed previously, analyzing daily lifelog data generated unwittingly, such as electricity consumption or communication usage, has been discussed. However, data analysis methods in this domain are currently based on anomaly detection, which presents accuracy issues and makes it difficult to guarantee service reliability. We propose a new analysis method that finds activities, such as operation or movement, in electricity consumption and communication usage data; such activity serves as evidence of the resident's safety. As a result, we demonstrate better performance through comparative verification. Ultimately, this study aims to contribute to a more reliable implementation of a service for monitoring lonely deaths.
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
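The abstract does not detail the detection algorithm, so the following is only a heavily hedged illustration of the underlying idea — treating salient changes in hourly electricity consumption as evidence of activity; the threshold and the hourly window are arbitrary assumptions.

```python
# Heavily hedged illustration (not the paper's algorithm): flag hours
# whose consumption jump exceeds a threshold as evidence of activity.
import numpy as np

def activity_hours(kwh: np.ndarray, threshold: float = 0.05) -> np.ndarray:
    """kwh: hourly electricity readings; returns hour indices with activity."""
    deltas = np.abs(np.diff(kwh))               # hour-to-hour change
    return np.flatnonzero(deltas > threshold) + 1

hours = activity_hours(np.array([0.10, 0.11, 0.45, 0.44, 0.12]))  # -> [2, 4]
```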

17 pages, 9841 KiB  
Article
An Efficient Anomaly Recognition Framework Using an Attention Residual LSTM in Surveillance Videos
by Waseem Ullah, Amin Ullah, Tanveer Hussain, Zulfiqar Ahmad Khan and Sung Wook Baik
Sensors 2021, 21(8), 2811; https://0-doi-org.brum.beds.ac.uk/10.3390/s21082811 - 16 Apr 2021
Cited by 69 | Viewed by 4626
Abstract
Video anomaly recognition in smart cities is an important computer vision task that plays a vital role in smart surveillance and public safety, but it is challenging due to the diverse, complex, and infrequent occurrence of anomalies in real-time surveillance environments. Various deep learning models require significant amounts of training data yet lack generalization ability and incur huge time complexity. To overcome these problems, in the current work, we present an efficient lightweight convolutional neural network (CNN)-based anomaly recognition framework that is functional in a surveillance environment with reduced time complexity. We extract spatial CNN features from a series of video frames and feed them to the proposed residual attention-based long short-term memory (LSTM) network, which can precisely recognize anomalous activity in surveillance videos. The representative CNN features with the residual-block concept in the LSTM for sequence learning prove to be effective for anomaly detection and recognition, validating our model's usefulness for smart-city video surveillance. Extensive experiments on the real-world benchmark UCF-Crime dataset validate the effectiveness of the proposed model within complex surveillance environments and demonstrate that it outperforms state-of-the-art models with increases in accuracy of 1.77%, 0.76%, and 8.62% on the UCF-Crime, UMN, and Avenue datasets, respectively.
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
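A hedged PyTorch sketch of the sequence-learning stage the abstract describes — an LSTM over precomputed per-frame CNN features with a residual skip and temporal attention — may help orient readers; the feature dimension and the class count of 14 are assumptions, not the authors' exact settings.

```python
# Hedged sketch of a residual attention LSTM over per-frame CNN
# features; dimensions are illustrative, not the paper's.
import torch
import torch.nn as nn

class ResidualAttentionLSTM(nn.Module):
    def __init__(self, feat_dim=2048, num_classes=14):
        super().__init__()
        # hidden size equals feat_dim so the residual skip type-checks
        self.lstm = nn.LSTM(feat_dim, feat_dim, batch_first=True)
        self.attn = nn.Linear(feat_dim, 1)
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, feats):                     # feats: (B, T, feat_dim)
        out, _ = self.lstm(feats)
        out = out + feats                         # residual connection
        w = torch.softmax(self.attn(out), dim=1)  # temporal attention weights
        ctx = (w * out).sum(dim=1)                # weighted context vector
        return self.head(ctx)

scores = ResidualAttentionLSTM()(torch.randn(2, 30, 2048))  # (2, 14)
```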

19 pages, 10826 KiB  
Article
Low-Cost and Device-Free Human Activity Recognition Based on Hierarchical Learning Model
by Jing Chen, Xinyu Huang, Hao Jiang and Xiren Miao
Sensors 2021, 21(7), 2359; https://0-doi-org.brum.beds.ac.uk/10.3390/s21072359 - 28 Mar 2021
Cited by 10 | Viewed by 3370
Abstract
Human activity recognition (HAR) has become a vital human–computer interaction service in smart homes. It remains a challenging task due to the diversity and similarity of human actions. In this paper, a novel hierarchical deep-learning-based methodology equipped with low-cost sensors is proposed for high-accuracy, device-free human activity recognition. ESP8266, as the sensing hardware, was utilized to deploy the WiFi sensor network and collect multi-dimensional received signal strength indicator (RSSI) records. The proposed learning model presents a coarse-to-fine hierarchical classification framework with two-level perception modules. In the coarse-level stage, twelve statistical features from the time and frequency domains are extracted from the RSSI measurements filtered by a Butterworth low-pass filter, and a support vector machine (SVM) model is employed to quickly recognize basic human activities by classifying these statistical features. In the fine-level stage, a gated recurrent unit (GRU), a representative type of recurrent neural network (RNN), is applied to resolve the confusion between similar activities. The GRU model realizes automatic multi-level feature extraction from the RSSI measurements and accurately discriminates between similar activities. The experimental results show that the proposed approach achieved recognition accuracies of 96.45% and 94.59% for six types of activities in two different environments and performed better than traditional pattern-based methods. The proposed hierarchical learning method provides a low-cost, sensor-based HAR framework that enhances recognition accuracy and modeling efficiency.
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
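The coarse-to-fine dispatch can be sketched as follows: an SVM over statistical features handles the easy classes, and a GRU over the raw RSSI series refines the confusable ones. The class grouping, feature dimension, and link count below are illustrative assumptions, not the paper's configuration.

```python
# Hedged sketch of the coarse-to-fine dispatch: an SVM routes easy
# classes from statistical features; a GRU refines confusable ones.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

class FineGRU(nn.Module):
    def __init__(self, links=4, hidden=32, num_classes=3):
        super().__init__()
        self.gru = nn.GRU(links, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):                 # x: (B, T, links) RSSI series
        out, _ = self.gru(x)
        return self.head(out[:, -1])

rng = np.random.default_rng(0)
coarse = SVC().fit(rng.normal(size=(20, 12)),   # 12 statistical features
                   rng.integers(0, 3, 20))      # dummy training labels
fine = FineGRU()
CONFUSABLE = {2}                                # coarse labels sent to the GRU

def classify(stat_feats, rssi_seq):
    label = int(coarse.predict(stat_feats.reshape(1, -1))[0])
    if label in CONFUSABLE:                     # refine similar activities
        label = int(fine(rssi_seq).argmax(dim=1))
    return label

print(classify(rng.normal(size=12), torch.randn(1, 50, 4)))
```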

20 pages, 1140 KiB  
Article
Sensor-Based Human Activity Recognition with Spatio-Temporal Deep Learning
by Ohoud Nafea, Wadood Abdul, Ghulam Muhammad and Mansour Alsulaiman
Sensors 2021, 21(6), 2141; https://0-doi-org.brum.beds.ac.uk/10.3390/s21062141 - 18 Mar 2021
Cited by 82 | Viewed by 7449
Abstract
Human activity recognition (HAR) remains a challenging yet crucial problem in computer vision. HAR is primarily intended to be used with other technologies, such as the Internet of Things, to assist in healthcare and eldercare. With the development of deep learning, automatic high-level feature extraction has become possible and has been used to optimize HAR performance. Furthermore, deep learning techniques have been applied in various fields of sensor-based HAR. This study introduces a new methodology that uses convolutional neural networks (CNNs) with varying kernel dimensions, along with a bi-directional long short-term memory (BiLSTM) network, to capture features at various resolutions. The novelty of this research lies in the effective selection of the optimal video representation and in the effective extraction of spatial and temporal features from sensor data using a traditional CNN and BiLSTM. The wireless sensor data mining (WISDM) and UCI datasets, in which data were collected from diverse sensors including accelerometers and gyroscopes, are used for the proposed methodology. The results indicate that the proposed scheme is efficient in improving HAR: unlike other available methods, it improved accuracy, attaining a higher score on the WISDM dataset than on the UCI dataset (98.53% vs. 97.05%).
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
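As a hedged illustration of the multi-resolution idea, this PyTorch sketch runs parallel 1-D convolutions with different kernel sizes over the inertial channels and feeds the concatenated features to a BiLSTM; all sizes are assumptions rather than the authors' hyperparameters.

```python
# Hedged sketch: multi-kernel 1-D convolutions over accelerometer and
# gyroscope axes, then a BiLSTM; sizes are illustrative assumptions.
import torch
import torch.nn as nn

class MultiKernelCNNBiLSTM(nn.Module):
    def __init__(self, channels=3, num_classes=6):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, 32, k, padding=k // 2) for k in (3, 5, 7)
        )
        self.bilstm = nn.LSTM(96, 64, batch_first=True, bidirectional=True)
        self.head = nn.Linear(128, num_classes)

    def forward(self, x):                    # x: (B, C, T) sensor axes
        feats = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)
        out, _ = self.bilstm(feats.transpose(1, 2))   # (B, T, 96) -> (B, T, 128)
        return self.head(out[:, -1])

logits = MultiKernelCNNBiLSTM()(torch.randn(8, 3, 128))  # (8, 6)
```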

14 pages, 518 KiB  
Article
Shallow Graph Convolutional Network for Skeleton-Based Action Recognition
by Wenjie Yang, Jianlin Zhang, Jingju Cai and Zhiyong Xu
Sensors 2021, 21(2), 452; https://0-doi-org.brum.beds.ac.uk/10.3390/s21020452 - 11 Jan 2021
Cited by 15 | Viewed by 3006
Abstract
Graph convolutional networks (GCNs) have brought considerable improvement to the skeleton-based action recognition task. Existing GCN-based methods usually use a fixed spatial graph size across all layers, which severely limits the model's ability to exploit global and semantic discriminative information because of restricted receptive fields. Furthermore, the fixed graph size causes many redundancies in the representation of actions, which is inefficient and can hinder the model from focusing on beneficial features. To address these issues, we propose a plug-and-play channel adaptive merging module (CAMM) specific to the human skeleton graph, which can merge vertices from the same part of the skeleton graph adaptively and efficiently. The merge weights differ across channels, so each channel has the flexibility to integrate the joints in its own way. We then build a novel shallow graph convolutional network (SGCN) based on this module, which achieves state-of-the-art performance at a lower computational cost. Experimental results on NTU-RGB+D and Kinetics-Skeleton illustrate the superiority of our method.
(This article belongs to the Special Issue AI-Enabled Advanced Sensing for Human Action and Activity Recognition)
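The merging idea can be sketched compactly: each channel learns its own soft assignment of the V skeleton joints to P part-level vertices. The snippet below is a hedged approximation of a CAMM-style layer under assumed NTU-style tensor shapes, not the authors' exact module.

```python
# Hedged sketch of a channel adaptive merging (CAMM-style) layer: each
# channel learns its own joints -> parts merge weights.
import torch
import torch.nn as nn

class ChannelAdaptiveMerge(nn.Module):
    def __init__(self, channels=64, joints=25, parts=5):
        super().__init__()
        # one joints -> parts merge matrix per channel, learned end to end
        self.merge = nn.Parameter(torch.randn(channels, joints, parts))

    def forward(self, x):                            # x: (B, C, T, V)
        weights = torch.softmax(self.merge, dim=1)   # normalize over joints
        return torch.einsum('bctv,cvp->bctp', x, weights)

pooled = ChannelAdaptiveMerge()(torch.randn(2, 64, 30, 25))  # (2, 64, 30, 5)
```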
