Article

Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling

SITE, VIT University, Vellore, Tamil Nadu 632014, India
* Authors to whom correspondence should be addressed.
Received: 16 October 2020 / Revised: 7 November 2020 / Accepted: 8 November 2020 / Published: 11 November 2020
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)
The two-stream convolutional neural network (CNN) has proven highly successful for action recognition in videos. The main idea is to train two CNNs to learn spatial and temporal features separately, and then to combine the two sets of scores into a final prediction. In the literature, we observed that most methods use similar CNNs for the two streams. In this paper, we design a two-stream CNN architecture with distinct CNNs for the two streams to learn spatial and temporal features. Temporal Segment Networks (TSN) are applied to retrieve long-range temporal features and to differentiate similar types of sub-actions in videos. Data augmentation techniques are employed to prevent over-fitting. Advanced cross-modal pre-training is discussed and introduced into the proposed architecture to enhance the accuracy of action recognition. The proposed two-stream model is evaluated on two challenging action recognition datasets: HMDB-51 and UCF-101. The results show a significant performance increase, and the proposed architecture outperforms existing methods.
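The two core ideas in the abstract, TSN-style segment-based sampling and late fusion of the two streams' scores, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the uniform per-segment sampling, and the fusion weights are all assumptions for the sake of the example.

```python
import random

def sample_segment_indices(num_frames, num_segments, seed=0):
    """TSN-style sparse sampling: split the clip into equal-length
    segments and draw one frame index from each segment."""
    rng = random.Random(seed)
    seg_len = num_frames // num_segments
    return [s * seg_len + rng.randrange(seg_len) for s in range(num_segments)]

def segmental_consensus(segment_scores):
    """Average the per-segment class scores (TSN's consensus function)."""
    n = len(segment_scores)
    return [sum(col) / n for col in zip(*segment_scores)]

def fuse_streams(spatial, temporal, w_s=1.0, w_t=1.5):
    """Weighted late fusion of the spatial (RGB) and temporal
    (optical-flow) streams' class scores; weights are illustrative."""
    return [(w_s * a + w_t * b) / (w_s + w_t) for a, b in zip(spatial, temporal)]

# Example: 3 segments from a 300-frame clip, then fuse two streams.
idx = sample_segment_indices(300, 3)
spatial  = [0.7, 0.2, 0.1]   # hypothetical softmax scores, RGB stream
temporal = [0.5, 0.4, 0.1]   # hypothetical scores, optical-flow stream
fused = fuse_streams(spatial, temporal)
```

Each sampled snippet would be scored by its stream's CNN, the per-segment scores pooled by the consensus function, and the two streams' pooled scores combined by the weighted fusion above.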
Keywords: segment-based temporal modeling; two-stream network; action recognition
MDPI and ACS Style

Sarabu, A.; Santra, A.K. Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling. Data 2020, 5, 104. https://0-doi-org.brum.beds.ac.uk/10.3390/data5040104

AMA Style

Sarabu A, Santra AK. Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling. Data. 2020; 5(4):104. https://0-doi-org.brum.beds.ac.uk/10.3390/data5040104

Chicago/Turabian Style

Sarabu, Ashok, and Ajit K. Santra. 2020. "Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling." Data 5, no. 4: 104. https://0-doi-org.brum.beds.ac.uk/10.3390/data5040104

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
