Article

DriverMVT: In-Cabin Dataset for Driver Monitoring including Video and Vehicle Telemetry Information

by Walaa Othman, Alexey Kashevnik, Ammar Ali and Nikolay Shilov
1 Information Technology and Programming Faculty, ITMO University, 197101 St. Petersburg, Russia
2 St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 199178 St. Petersburg, Russia
* Author to whom correspondence should be addressed.
Submission received: 6 April 2022 / Revised: 3 May 2022 / Accepted: 5 May 2022 / Published: 11 May 2022
(This article belongs to the Section Information Systems and Data Management)

Abstract
Developing a driver monitoring system that can assess the driver’s state is a prerequisite and a key to improving road safety. With the success of deep learning, such systems can achieve high accuracy if corresponding high-quality datasets are available. In this paper, we introduce DriverMVT (Driver Monitoring dataset with Videos and Telemetry). The dataset contains information about the driver’s head pose, heart rate, and behaviour inside the cabin, such as drowsiness and an unfastened seat belt. This dataset can be used to train and evaluate deep learning models that estimate the driver’s health state, mental state, concentration level, and his/her activity in the cabin. Systems that can alert the driver in case of drowsiness or distraction can reduce the number of accidents and increase safety on the road. The dataset contains 1506 videos of 9 different drivers (7 males and 2 females), with a total of 5119k frames and over 36 h of recordings. In addition, we evaluated the dataset with the multi-task temporal shift convolutional attention network (MTTS-CAN) algorithm. The algorithm’s mean absolute error on our dataset is 16.375 heartbeats per minute.

1. Introduction

Road accidents cause the death of hundreds of thousands of people every year. According to the World Health Organization, they are among the top ten causes of death in low- and middle-income countries [1], because they affect not only drivers and passengers but also pedestrians. Human error is the main cause of most of these accidents. To eliminate the human factor, huge attention has been drawn to developing automated vehicles that are fully operated by Artificial Intelligence (AI).
As automated vehicles spread around the world, driving will become a shared activity between the human and the machine, which generates demand for systems that can evaluate the driver’s state and his/her ability to take control of the vehicle at any moment.
Developing a driver monitoring system that can estimate the driver’s state has drawn the researchers’ attention lately. Such systems aim to increase the safety level on the roads by alerting the driver. They include:
  • The detection of the driver’s vital signs like heart rate, blood pressure, oxygen saturation, and respiratory rate.
  • The detection of the driver’s mental state like fatigue.
  • Measurement of the driver’s attention and concentration levels.
  • Detection of the driver’s activity inside the cabin.
Over the last decades, researchers have investigated drivers’ behaviours to estimate the crash risk using naturalistic driving data like speed, acceleration, and braking. The data were collected using Global Positioning System (GPS) receivers and On-Board Diagnostics (OBD) [2], accelerometers [3], and smartphones [4] to identify risky and abnormal driving events and evaluate the crash risk. Researchers [5] developed a driver assessment and recommendation system to evaluate individual driving performance and improve traffic safety. They used features like the trip distance and duration, the average and maximum speed, and the number of hard brakes and speed-ups to adapt a Gaussian mixture model-universal background model and the maximum likelihood method to capture the driver’s signature. The researchers in [6] developed a driving behaviour-based relative risk evaluation model using a non-parametric optimization method, taking into consideration the frequency and the severity level of the different risky driving behaviours.
Researchers [7,8,9] have studied driver behaviour factors such as road traffic violations, lapses, failure to maintain a safe gap, errors related to visual perception failure, and others. Different methods were used to evaluate and prioritize the significant driver behaviour factors related to road safety. Paper [7] designed an analytic hierarchy process with best-worst method (AHP-BWM) model to evaluate driver behaviour factors within a three-level hierarchical structure. Paper [8] proposed combining the best-worst method with triangular fuzzy sets as a supporting tool for ranking and prioritizing the critical driver behaviour criteria, while paper [9] applied a Pythagorean Fuzzy Analytic Hierarchy Process to assess and prioritize the critical driver behaviour criteria arranged into a hierarchical model based on data gathered from observed driver groups in Budapest. This evaluation is valuable for making drivers aware of individual traffic risks, and it may assist in the implementation of effective local road safety policies.
Researchers have developed different methods to detect driver fatigue. Some of these methods depend on detecting biological signals like the heart rate [10,11], while others depend on physical features like the face and eyes [12,13].
In this paper, we present DriverMVT (Driver Monitoring dataset with Videos and Telemetry), an annotated dataset for monitoring the driver inside the vehicle cabin. This dataset can be used to train and evaluate deep learning models that estimate the driver’s state, such as fatigue, distraction, or a poor health condition. Developing models to detect such critical behaviour and alert the driver can prevent many accidents and increase safety on the road.
The rest of the paper is organized as follows: A review of the methods and datasets used for driver monitoring is presented in Section 2. Section 3 contains detailed information about our proposed dataset and how to use it. Section 4 shows the experiments for data evaluation. Finally, the conclusion is presented in Section 5.

2. Related Work

In this section, we present a brief overview of the methods and datasets used for driver monitoring. The authors of paper [14] introduced a diverse benchmark with 2000 video sequences and over 650,000 frames that contain normal, critical, and accidental situations together in each video sequence. The dataset covers the scenes outside the vehicle. The researchers address the following question: can we predict a driving accident if we know the driver’s attention level?
The authors of paper [15] proposed a dataset called DrivFace that contains image sequences of subjects while driving in real scenarios. The dataset consists of 606 samples with a resolution of 640 × 480 pixels, acquired from 4 drivers (2 women and 2 men) with different facial features like glasses and beards. This dataset is annotated with head pose angles and the view direction. The authors also proposed a method to estimate the attention level from the head pose angles.
The authors of paper [16] introduced the MPIIGaze dataset, which contains 213,659 images collected from 15 participants during natural everyday computer use over more than three months, with corresponding ground-truth gaze positions. The dataset has large variability in appearance and illumination, but it was not recorded in real driving scenarios. The main purpose of the dataset is to estimate the gaze angle from a monocular camera in order to determine the attention level.
The authors of paper [17] introduced the DriveAHead dataset, which contains more than 10 h of infrared (IR) and depth images of drivers’ head poses taken in real driving situations. The dataset provides frame-by-frame head pose labels obtained from a motion-capture system, as well as annotations about occlusions of the driver’s face. The dataset was collected from 20 persons (4 females and 16 males) using a Kinect v2.
The authors of paper [18] introduced a dataset collected from 14 young people (11 females, 3 males) who performed three successive experiments (each lasting 10 min) under conditions of increasing sleep deprivation induced by acute, prolonged waking. The dataset contains different types of data (images, signals, etc.) and aims to help researchers in the field of drowsiness monitoring, but it was not recorded inside a car cabin.
The authors of paper [19] introduced a dataset that consists of videos of drivers performing actions related to different driving scenarios. The dataset was acquired from 35 participants (10 females, 25 males) in different lighting conditions, depending on the time the session was recorded (morning or afternoon), at different speeds, both in simulations and in real scenarios. The dataset was recorded using three Intel RealSense D400-series depth cameras placed in different locations to capture the face, the body, and the hands of the driver.
The authors of paper [20] published the Drive&Act dataset, consisting of videos of drivers performing distraction actions in an automated driving scenario. The dataset contains over 9.6 million frames of people recorded using 5 near-infrared cameras from different perspectives, plus 3 channels from a side camera (RGB, depth, IR).
The authors of paper [21] proposed the Multimodal Spontaneous Expression Heart Rate (MMSE-HR) dataset, which is composed of videos and associated information about the heart rate and the blood pressure. The dataset was collected from 140 participants (58 males and 82 females) of different ages and ethnicities. The data were acquired from different face sensors (high-resolution 3D dynamic imaging, high-resolution 2D video, and thermal sensing) and contact sensors (electrical skin conductivity, respiration, blood pressure, and heart rate).
In contrast to our DriverMVT dataset, most of the datasets found in the literature concentrate on particular tasks like head pose, gaze angles, action classification, or drowsiness. Our dataset provides detailed and diverse information that makes it useful for a wider range of driver-related tasks. It provides frame-by-frame annotation of driver health indices like the heart rate, mental states like fatigue, and the head pose, along with driver activities. In addition, our dataset was recorded in a real environment while the subjects were driving home or to work, and it is diverse in terms of lighting conditions and speed. Table 1 shows a comparison between the available datasets and our dataset.

3. Dataset

In this section, an overview of the dataset is presented. Section 3.1 addresses the methodology used for collecting the proposed dataset, Section 3.2 provides the description of the dataset, and finally, Section 3.3 presents an exploratory analysis of the dataset.

3.1. Collection Methodology

In this section, we introduce the collection methodology. In Section 3.1.1, we describe the devices used for data collection, while in Section 3.1.2, we describe the acquiring process.

3.1.1. Collection Devices

The dataset was collected using different camera types: a USB camera produced by ELP (see Figure 1), a Samsung Galaxy S10 camera, and a Samsung Galaxy S20 camera. The USB camera’s sensor is the OV7725, a single-chip VGA camera with an image processor. The lens size is 1/4 inch with a view angle of 30–150 degrees, and the sensor incorporates a 640 × 480 image array operating at a frame rate of 30 fps. The USB camera also has a high-speed USB 2.0 interface module. For the smartphones, videos were recorded with a resolution of 1080 × 1920 and a frame rate of 60 fps.
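These recording parameters can be verified programmatically. The following sketch, assuming OpenCV is installed and using an illustrative (not prescribed) file path, reads one dataset video and prints its resolution, frame rate, and frame count:

```python
import cv2

# Open a dataset video; the path below is illustrative, not a fixed dataset layout.
cap = cv2.VideoCapture("driver_mvt/videos/example_video.mp4")

width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))    # e.g., 640 or 1080
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # e.g., 480 or 1920
fps = cap.get(cv2.CAP_PROP_FPS)                   # e.g., 30 or 60
frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

print(f"{width}x{height} @ {fps:.1f} fps, {frame_count} frames")
cap.release()
```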
For the heart rate recording, we used a Xiaomi Mi Band 3. This is not a medical device, but it estimates the heart rate precisely enough for the tasks mentioned in this paper.

3.1.2. Data Collection

The dataset was acquired from 9 drivers of different ages and genders (2 females and 7 males), with a total of 5119k frames and over 36 h of recordings, under different car speed and lighting conditions. We included drivers with different facial features (with/without beard, with/without mustache, long/short hair, etc.). Table 2 presents the demographic data of the participants.
The drivers are all from St. Petersburg, Russia. We chose the participants to be diverse and balanced with regard to facial features and age.
The videos were recorded and saved with the exact date and time, while the metadata was saved to the database with additional information like the user id, the measurement time, and the time when the ride started. This additional information is used later for synchronization, as shown in Section 3.3.2. Figure 2 shows the scheme of acquiring the information.

3.2. Data Description

The dataset consists of 1506 videos of drivers inside the vehicle cabin and is divided into three sub-categories (see Figure 3):
  • Imprecise synchronization: this category contains videos with a mean length of 1 min and metadata for each video; each video is annotated frame by frame, but the synchronization between the video and the metadata is not precise, with a maximum delay of 1 s.
  • Precise synchronization and heart rate information: this category contains videos with a mean length of 30 min and metadata for each video; each video is annotated frame by frame with perfect synchronization, and the metadata also contains information about the driver’s heart rate.
  • Precise synchronization and no heart rate information: this category contains videos with a mean length of 30 min and frame-by-frame annotation for each video; the synchronization between the video and the information is precise.
For each video, the metadata is given in a CSV file. The file contains general information about the video (see Table 3), like the geographic coordinates (latitude, longitude, and altitude), the driving trip starting time as a Unix timestamp in milliseconds, the datetime (Unix timestamp in milliseconds) describing when the video was recorded, the car speed, the light level and illuminance, the head pose angles (roll, pitch, and yaw) calculated using the method in paper [22], the inertial sensor data (accelerometer, gyroscope, and magnetometer readings), the mouth openness ratio, the seat belt state indicating whether the belt is fastened or not [23], and the heart rate measured with the Xiaomi Mi Band 3 smart band.
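As a minimal usage sketch (the file path is illustrative; the column names follow Table 3), the per-video CSV can be loaded with pandas and queried by frame number:

```python
import pandas as pd

# Load the metadata CSV for one video (illustrative path).
meta = pd.read_csv("driver_mvt/metadata/example_video.csv")

# Look up the record describing frame 100; some columns (head_pose,
# face_mouth, heart_rate) may be empty for individual frames.
record = meta.loc[meta["framenumber"] == 100]
print(record[["speed", "lightlevel", "head_pose", "heart_rate", "dangerousstate"]])
```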

3.3. Data Distribution

In this section, we present an exploratory analysis of the proposed dataset. Section 3.3.1 shows information about the metadata, like the data type and the number of missing values in each column. In addition, a visualization of the distribution of data like heart rate and speed is presented. In Section 3.3.2, the synchronization method between the videos and the metadata is explained.

3.3.1. Data Exploration

In this section we provide a basic understanding of the dataset by showing the statistics and the distribution of the data.
Table 4 shows information about the metadata of the driver videos and the heart rate information. The table shows that there are some missing values in the face_mouth, head_pose, and heart_rate columns. The face_mouth column is calculated based on the FaceBoxes framework. The head_pose column is calculated based on the image processing approach discussed in paper [22]. Some frames do not have suitable exposure, and in some cases the driver’s head cannot be determined; in such cases, values in these columns can be missing. The heart_rate column contains the data from the Xiaomi Mi Band 3. Since not all the drivers used the device, some of its values are also missing. The dangerousstate column has a value only when there is a critical event like fatigue; otherwise, the state is considered normal.
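The statistics in Table 4 can be reproduced with a short pandas sketch (the directory layout below is assumed for illustration only):

```python
import glob
import pandas as pd

# Concatenate the metadata CSV files of all videos (illustrative path pattern).
parts = [pd.read_csv(path) for path in glob.glob("driver_mvt/metadata/*.csv")]
meta = pd.concat(parts, ignore_index=True)

# Data type and number of missing values per column, as reported in Table 4.
summary = pd.DataFrame({
    "Data Type": meta.dtypes.astype(str),
    "Number of Missing Values": meta.isna().sum(),
})
print(summary)
```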
Figure 4 shows examples of different critical events from the dataset.
Figure 5 shows the distribution of the data according to speed. Around 29% of our dataset was recorded while the car was not moving (for example, when the driver stopped at a traffic light).
Figure 6 shows the distribution of the critical events in the dataset in log scale.
Figure 7 shows the distribution of the heart rate in the dataset. Most of the samples fall in the range [75, 95] heartbeats per minute, which is the normal resting heart rate for adults.

3.3.2. Data Synchronization

As mentioned earlier, the names of the video files represent the recording start time of the video, either as the exact Unix timestamp in milliseconds or as the date and time in seconds. The metadata was saved in the database using the Unix timestamp. To synchronize the metadata with the exact video frame, we used Equation (1):
frame = ((datetime − video_recording_time) / 1000) × framerate        (1)
where frame is the frame number described by the metadata, datetime is the Unix timestamp in ms of the metadata record, video_recording_time is the Unix timestamp in ms of the recorded video, and framerate is the video frame rate. This way, the videos saved with the Unix timestamp are perfectly synchronized, while the videos that were saved with the date and time are shifted; the maximum difference is 1 s, or 10–60 frames. For efficient usage of the data, we performed the synchronization for the whole dataset. Each video is annotated frame by frame, and the metadata is saved in a CSV file that contains the frame number along with the additional information.
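A minimal sketch of Equation (1) in Python (the function name is ours; the timestamps are Unix times in milliseconds, as described above):

```python
def metadata_frame_index(datetime_ms: int, video_recording_time_ms: int,
                         frame_rate: float) -> int:
    """Map a metadata record to its video frame number using Equation (1)."""
    return int((datetime_ms - video_recording_time_ms) / 1000 * frame_rate)

# Example: a record taken 2.5 s after the recording started, 30 fps video -> frame 75.
print(metadata_frame_index(1_651_000_002_500, 1_651_000_000_000, 30))
```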

4. Data Evaluation

To validate our dataset, we carried out experiments with the multi-task temporal shift convolutional attention network (MTTS-CAN) [24], one of the state-of-the-art algorithms in heart rate estimation. The architecture is presented in Figure 8.
We tested the algorithm on the subset of our dataset that contains heart rate information, which consists of 12 videos. MTTS-CAN showed a mean absolute error of 16.375 heartbeats per minute and a root mean square error of 19.495, which is considered a high error.
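For reference, the two error measures we report can be computed as in the sketch below (the numbers are illustrative, not the actual predictions):

```python
import numpy as np

def mae_rmse(predicted_hr, reference_hr):
    """Mean absolute error and root mean square error in heartbeats per minute."""
    errors = np.asarray(predicted_hr) - np.asarray(reference_hr)
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    return mae, rmse

# Illustrative values only; the real evaluation compares MTTS-CAN predictions
# against the Mi Band readings stored in the heart_rate column.
print(mae_rmse([82.0, 95.0, 70.0], [75.0, 88.0, 90.0]))
```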
In addition, we carried out a separate experiment to evaluate the respiratory rate. We used the algorithm we proposed in paper [25] to detect the respiratory rate when the car speed is zero or close to zero. We ran the proposed method on the presented dataset and concluded that we can measure the respiratory rate when the vehicle speed is less than 3 km/h. The algorithm can be summarized in the following steps (a sketch of the last step is shown after this list):
  • Estimate the position of the chest keypoint using the OpenPose human pose estimation model.
  • Calculate the keypoint displacement using an optical flow-based neural network (SelFlow).
  • Clean the displacement signal using filtering and detrending, then count the number of peaks/troughs in a time window of one minute.
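The sketch below illustrates only the last step (filtering, detrending, and peak counting); it assumes the chest-keypoint displacement signal has already been extracted with OpenPose and SelFlow, and the filter parameters are illustrative rather than those used in paper [25].

```python
import numpy as np
from scipy.signal import butter, detrend, filtfilt, find_peaks

def respiratory_rate_bpm(displacement: np.ndarray, fps: float) -> float:
    """Estimate breaths per minute from a chest-keypoint displacement signal."""
    # Remove slow drift, then band-pass around typical breathing frequencies.
    signal = detrend(displacement)
    b, a = butter(2, [0.1, 0.7], btype="bandpass", fs=fps)
    signal = filtfilt(b, a, signal)

    # Count peaks (at most one per second) and scale the count to one minute.
    peaks, _ = find_peaks(signal, distance=fps)
    duration_min = len(displacement) / fps / 60.0
    return len(peaks) / duration_min
```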
Figure 9 shows the algorithm scheme.
Figure 10 shows the heart rate signal produced by MTTS-CAN and the respiratory rate signal.
In our experiment, we divided the videos into three classes depending on the heart rate and then used our algorithm to calculate the respiratory rate of the driver.
Table 5 shows the mean respiratory rate for each class.
As we can see from the table, there is a direct relationship between the heart rate and the respiratory rate: as the heart rate increases, the respiratory rate increases as well.
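A sketch of the grouping used to build Table 5 (the column names and example values are illustrative; breaths_per_minute stands for the respiratory rate estimated by the algorithm above):

```python
import pandas as pd

# Illustrative per-video values; the real table is built from the dataset.
df = pd.DataFrame({
    "heart_rate": [60, 80, 100, 110],
    "breaths_per_minute": [14, 19, 26, 28],
})

# Assign each record to one of the three heart rate classes from Table 5.
df["hr_class"] = pd.cut(df["heart_rate"], bins=[51, 71, 91, 114],
                        labels=["51-71", "71-91", "92-114"], include_lowest=True)
print(df.groupby("hr_class", observed=True)["breaths_per_minute"].mean())
```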

5. Conclusions

In this paper, we introduced a new extensive, diverse dataset called DriverMVT, designed to allow researchers to develop contactless real-time driver monitoring systems. The dataset contains 1506 videos collected with a monocular camera from 9 subjects in real driving scenarios, with a total of 5119k frames and over 36 h of recordings. For each video, the dataset contains the following time-synchronized information: geographic coordinates, speed, acceleration, light conditions, magnetic orientation, angular velocity, driver head pose, driver mouth openness ratio, driver heart rate, and driver actions. The dataset can be used to train and evaluate models for detecting drowsiness/fatigue and distraction based on the head pose information, and for predicting the driver’s heart rate to assess the driver’s health state. Such models can reduce accidents and increase safety on the road. In addition, we evaluated the dataset with the MTTS-CAN algorithm; its mean absolute error on our dataset is 16.375 heartbeats per minute. We hope that other researchers will use our dataset in other innovative ways. Of course, the main goal is that the research and models based on the DriverMVT dataset will one day help save lives on the roads.

Author Contributions

W.O. is responsible for dataset preparation, formal analysis, and paper writing. A.K. is responsible for conceptualization, paper writing, and funding acquisition. A.A. is responsible for modules development. N.S. is responsible for paper writing and conceptualization. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Russian Science Foundation (project 18-71-10065). The related research described in the paper (Section 2) was supported by Russian State Research # FFZF-2022-0005.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki and was considered and approved by the Reviewers’ Board (Scientific Council) of the St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS) (as codified in Protocol of 24 March 2022, No. 3).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Our dataset is available on the following web page: https://1drv.ms/u/s!Ar_DU2ygGWIUhPAz7dO4BUHEwshxKA?e=o9bsm4 (accessed on 5 April 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. World Health Organization. The Top 10 Causes of Death. Available online: https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death (accessed on 6 April 2022).
  2. Jun, J.; Ogle, J.; Guensler, R. Relationships between Crash Involvement and Temporal-Spatial Driving Behavior Activity Patterns: Use of Data for Vehicles with Global Positioning Systems. Transp. Res. Rec. 2007, 2019, 246–255. [Google Scholar] [CrossRef]
  3. Şimşek, B.; Pakdil, F.; Dengiz, B.; Testik, M.C. Driver performance appraisal using GPS terminal measurements: A conceptual framework. Transp. Res. Part Emerg. Technol. 2013, 26, 49–60. [Google Scholar] [CrossRef]
  4. Castignani, G.; Derrmann, T.; Frank, R.; Engel, T. Driver Behavior Profiling Using Smartphones: A Low-Cost Platform for Driver Monitoring. IEEE Intell. Transp. Syst. Mag. 2015, 7, 91–102. [Google Scholar] [CrossRef]
  5. Hong, Z.; Chen, Y.; Wu, Y. A driver behavior assessment and recommendation system for connected vehicles to produce safer driving environments through a “follow the leader” approach. Accid. Anal. Prev. 2020, 139, 105460. [Google Scholar] [CrossRef]
  6. Bao, Q.; Tang, H.; Shen, Y. Driving Behavior Based Relative Risk Evaluation Using a Nonparametric Optimization Method. Int. J. Environ. Res. Public Health 2021, 18, 12452. [Google Scholar] [CrossRef]
  7. Moslem, S.; Farooq, D.; Ghorbanzadeh, O.; Blaschke, T. Application of the AHP-BWM Model for Evaluating Driver Behavior Factors Related to Road Safety: A Case Study for Budapest. Symmetry 2020, 12, 243. [Google Scholar] [CrossRef] [Green Version]
  8. Moslem, S.; Gul, M.; Farooq, D.; Celik, E.; Ghorbanzadeh, O.; Blaschke, T. An Integrated Approach of Best-Worst Method (BWM) and Triangular Fuzzy Sets for Evaluating Driver Behavior Factors Related to Road Safety. Mathematics 2020, 8, 414. [Google Scholar] [CrossRef] [Green Version]
  9. Farooq, D.; Moslem, S. Estimating Driver Behavior Measures Related to Traffic Safety by Investigating 2-Dimensional Uncertain Linguistic Data—A Pythagorean Fuzzy Analytic Hierarchy Process Approach. Sustainability 2022, 14, 1881. [Google Scholar] [CrossRef]
  10. Li, G.; Chung, W.Y. Detection of Driver Drowsiness Using Wavelet Analysis of Heart Rate Variability and a Support Vector Machine Classifier. Sensors 2013, 13, 16494–16511. [Google Scholar] [CrossRef] [Green Version]
  11. Rundo, F.; Spampinato, C.; Conoci, S. Ad-Hoc Shallow Neural Network to Learn Hyper Filtered PhotoPlethysmoGraphic (PPG) Signal for Efficient Car-Driver Drowsiness Monitoring. Electronics 2019, 8, 890. [Google Scholar] [CrossRef] [Green Version]
  12. Wang, Y.; Huang, R.; Guo, L. Eye gaze pattern analysis for fatigue detection based on GP-BCNN with ESM. Pattern Recognit. Lett. 2019, 123, 61–74. [Google Scholar] [CrossRef]
  13. Reddy, B.; Kim, Y.H.; Yun, S.; Seo, C.; Jang, J. Real-Time Driver Drowsiness Detection for Embedded System Using Model Compression of Deep Neural Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 438–445. [Google Scholar] [CrossRef]
  14. Fang, J.; Yan, D.; Qiao, J.; Xue, J. DADA: A Large-scale Benchmark and Model for Driver Attention Prediction in Accidental Scenarios. arXiv 2019, arXiv:1912.12148. [Google Scholar]
  15. Diaz-Chito, K.; Hernández-Sabaté, A.; López, A.M. A reduced feature set for driver head pose estimation. Appl. Soft Comput. 2016, 45, 98–107. [Google Scholar] [CrossRef]
  16. Zhang, X.; Sugano, Y.; Fritz, M.; Bulling, A. MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation. arXiv 2017, arXiv:1711.09017. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  17. Schwarz, A.; Haurilet, M.; Martinez, M.; Stiefelhagen, R. DriveAHead—A Large-Scale Driver Head Pose Dataset. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1165–1174. [Google Scholar] [CrossRef]
  18. Massoz, Q.; Langohr, T.; François, C.; Verly, J.G. The ULg multimodality drowsiness database (called DROZY) and examples of use. In Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, USA, 7–10 March 2016; pp. 1–7. [Google Scholar] [CrossRef] [Green Version]
  19. Ortega, J.D.; Kose, N.; Cañas, P.; Chao, M.A.; Unnervik, A.; Nieto, M.; Otaegui, O.; Salgado, L. DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops (Accepted), Glasgow, UK, 23–28 August 2020. [Google Scholar]
  20. Martin, M.; Roitberg, A.; Haurilet, M.; Horne, M.; Reiß, S.; Voit, M.; Stiefelhagen, R. Drive&Act: A Multi-Modal Dataset for Fine-Grained Driver Behavior Recognition in Autonomous Vehicles. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 2801–2810. [Google Scholar] [CrossRef]
  21. Zhang, Z.; Girard, J.M.; Wu, Y.; Zhang, X.; Liu, P.; Ciftci, U.; Canavan, S.; Reale, M.; Horowitz, A.; Yang, H.; et al. Multimodal Spontaneous Emotion Corpus for Human Behavior Analysis. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3438–3446. [Google Scholar] [CrossRef]
  22. Kashevnik, A.; Ali, A.; Lashkov, I.; Zubok, D. Human Head Angle Detection Based on Image Analysis. In Proceedings of the Future Technologies Conference (FTC) 2020; Springer International Publishing: Cham, Switzerland, 2021; Volume 1, pp. 233–242. [Google Scholar]
  23. Kashevnik, A.; Ali, A.; Lashkov, I.; Shilov, N. Seat Belt Fastness Detection Based on Image Analysis from Vehicle In-Cabin Camera. In Proceedings of the 2020 26th Conference of Open Innovations Association (FRUCT), Yaroslavl, Russia, 20–24 April 2020; pp. 143–150. [Google Scholar] [CrossRef]
  24. Liu, X.; Fromm, J.; Patel, S.; McDuff, D. Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 6–12 December 2020; pp. 19400–19411. [Google Scholar]
  25. Othman, W.; Kashevnik, A.; Ryabchikov, I.; Shilov, N. Contactless Camera-Based Approach for Driver Respiratory Rate Estimation in Vehicle Cabin. In Proceedings of the Intelligent Systems Conference (IntelliSys) 2022, Amsterdam, The Netherlands, 1–2 September 2022. [Google Scholar]
Figure 1. USB camera used for collecting the dataset.
Figure 2. The scheme of data collection.
Figure 3. The hierarchy of the dataset.
Figure 4. Examples of critical events from the dataset.
Figure 5. Video statistics according to speed.
Figure 6. Distribution of the critical events.
Figure 7. Distribution of the heart rate.
Figure 8. MTTS-CAN architecture [24].
Figure 9. Respiratory rate algorithm scheme [25].
Figure 10. The predicted heart rate and respiratory rate signals for a video of a driver inside the car with a vehicle speed of zero and a heart rate of 112.
Table 1. Comparison between driver monitoring public datasets.

Dataset | Size | Drivers | Usage | Environment
DrivFace [15] | 606 images | 4 | Head pose | Car
MPIIGaze [16] | ∼214K images | 15 | Gaze position | Simulator
DriveAHead [17] | 10 h | 20 | Head pose | Car
Dataset [18] | 500k frames | 14 | Drowsiness | Simulator
DMD [19] | 41 h | 35 | Head pose/drowsiness/hands/action classification | Car/simulator
MMSE-HR [21] | 10 TB | 140 | Vital signs | Laboratory
DriverMVT (ours) | 1506 videos | 9 | Head pose/distraction/fatigue/action classification/heart rate | Car
Table 2. Demographic data of the participants.

Demographic Data | Number of Participants | Percentage
Gender
male | 7 | 77.78
female | 2 | 22.22
Ages
<25 | 2 | 22.22
25–34 | 3 | 33.33
35–45 | 3 | 33.33
>45 | 1 | 11.11
Special Features
with glasses | 5 | 55.56
with beard | 4 | 44.44
with mustache | 3 | 33.33
Table 3. Metadata information describing the recorded drivers’ videos.

Column Name | Description | Notation/Unit | Possible Values
filename | the video name | - | -
framenumber | the video frame number being described | - | -
latitude | north-south position | decimal degrees | [0, 60.120796]
longitude | east-west position | decimal degrees | [0, 37.6103415]
altitude | the distance above sea level | meters | [−565.5, 251.3]
datetime | the time the video was recorded | Unix timestamp (ms) | -
datetimestart | the driving trip starting time | Unix timestamp (ms) | -
speed | the vehicle speed | km/h | [0, 161.208]
lightlevel | the light level | lux | [0, 7760]
illuminance | the illuminance | - | Bright/Dark
head_pose | the Euler angles (pitch, yaw, roll) | degrees | -
accelerometer_data | changes in velocity | m/s² | -
gyroscope_data | angular velocity | °/s | -
magnetometer_data | the magnetic field intensity | tesla | -
face_mouth | mouth openness ratio | - | [0, 1]
heart_rate | driver’s heart rate | heartbeats per minute | [51, 114]
dangerousstate | the critical event | - | {cellphone_use, distraction_no_attention, camera_sabotage, belt_not_fasten, drowsiness, eating, distraction_no_face}
Table 4. Number of missing values in the dataset.

Column | Data Type | Number of Missing Values
filename | object | 0
framenumber | int64 | 0
latitude | float64 | 0
longitude | float64 | 0
altitude | float64 | 0
datetime | int64 | 0
datetimestart | int64 | 0
speed | float64 | 0
acceleration | float64 | 0
lightlevel | int64 | 0
illuminance | object | 0
head_pose | object | 59,320
accelerometer_data | object | 0
gyroscope_data | object | 0
magnetometer_data | object | 0
face_mouth | object | 199,089
heart_rate | float64 | 466,228
dangerousstate | object | 0
Table 5. Respiratory rate calculation for each category.

Heart Rate Class (Heartbeats per Minute) | Respiratory Rate Mean Value (Breaths per Minute)
51–71 | 14.5
71–91 | 19
92–114 | 27
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
