Data Descriptor

Hand-Washing Video Dataset Annotated According to the World Health Organization’s Hand-Washing Guidelines

1 Medical Education Technology Centre, Riga Stradins University, Dzirciema iela 16, LV-1007 Riga, Latvia
2 Department of Infectious Diseases and Hospital Epidemiology, Pauls Stradins Clinical University Hospital, Pilsonu Street 13, LV-1002 Riga, Latvia
3 Institute of Electronics and Computer Science (EDI), Dzerbenes 14, LV-1006 Riga, Latvia
* Author to whom correspondence should be addressed.
Academic Editor: Joaquín Torres-Sospedra
Received: 26 February 2021 / Revised: 1 April 2021 / Accepted: 5 April 2021 / Published: 7 April 2021

Abstract

Washing hands is one of the most important ways to prevent infectious diseases, including COVID-19. The World Health Organization (WHO) has published hand-washing guidelines. This paper presents a large real-world dataset of videos recording medical staff washing their hands as part of their normal job duties in the Pauls Stradins Clinical University Hospital. There are 3185 hand-washing episodes in total, each annotated by up to seven different persons. The annotations classify the washing movements according to the WHO guidelines by marking each frame in each video with a movement code. The intention of this “in-the-wild” dataset is two-fold: to serve as a basis for training machine-learning classifiers for automated hand-washing movement recognition and quality control, and to allow investigation of the real-world quality of washing performed by working medical staff. We demonstrate how the data can be used to train a machine-learning classifier that achieves a classification accuracy of 0.7511 on a test dataset.
Keywords: hand-washing; hand movements; video dataset

1. Summary

In 2019, the European Centre for Disease Prevention and Control (ECDC) and the World Health Organisation (WHO) declared curbing antimicrobial resistance one of the global health priorities. The death toll from infections caused by multidrug-resistant bacteria has reached 34,000 per year in Europe and 700,000 worldwide [1,2,3]. In 2020, the world faced an even greater global health crisis: the COVID-19 pandemic caused by SARS-CoV-2. Along with other infection prevention and control (IPC) measures, hand hygiene has been a critical and low-cost safety measure for preventing the spread and cross-transmission of both multidrug-resistant bacteria and SARS-CoV-2 [4]. It is crucial to perform hand hygiene correctly, following the WHO guidelines on the six key hand-washing movements and the appropriate duration [5]. However, the general public and even medical professionals worldwide often neglect these guidelines [6,7,8,9] despite perpetual educational campaigns [10]. The COVID-19 pandemic has exacerbated this issue even further. To develop institution- and group-specific recommendations for improving compliance with the WHO guidelines, it is necessary to automate the quality control of hand hygiene.
To develop machine-learning classifiers for accurate hand-washing movement recognition, large-scale real-world datasets are necessary. Few examples of such datasets are openly available. The Kinetics Human Action Video Dataset [11] by Google contains 916 videos of washing hands. The Kaggle data science site provides a Hand Wash Dataset [12], a publicly available sample with 292 videos labeled according to the WHO guidelines. The STAIR Actions dataset [13] consists of more than 100,000 videos, of which around 1000 are related to washing hands. However, these datasets are limited in multiple ways: none of them has more than 1000 hand-washing videos, they do not focus on medical professionals, and only the Kaggle dataset provides labels according to the WHO guidelines.
To address these limitations, this paper presents a large-scale real-world dataset collected in summer 2020 in one of the largest hospitals in Latvia, the Pauls Stradins Clinical University Hospital.

2. Dataset Description

The dataset consists of video files along with their annotations in CSV and JSON formats. Table 1 presents an overview of the dataset.

2.1. Folder Structure

The files in the dataset are structured as follows:
DataSets
\- Dataset1
  \- Videos
    \- 2020-06-27_11-57-25_camera104.mp4
    \- 2020-06-28_18-28-10_camera102.mp4
    \- ...
  \- Annotations
    \- Annotator1
    \- 2020-06-27_11-57-25_camera104.csv
    \- 2020-06-27_11-57-25_camera104.json
    \- ...
    \- Annotator2
    \- 2020-06-27_11-57-25_camera104.csv
    \- 2020-06-27_11-57-25_camera104.json
    \- ...
\- Dataset2
  \- Videos
    \- ...
  \- Annotations
    \- ...
\- Dataset3
  \- Videos
    \- ...
  \- Annotations
    \- ...
...
summary.csv
statistics.csv
Each video file has annotations from one or more annotators. For convenience, two annotation formats are included, although the information in the CSV and JSON files largely overlaps. A video file name.mp4 has annotations in both name.csv and name.json. Additionally, several files describing aggregate information are present: the file summary.csv contains a summary of the dataset, and the file statistics.csv contains the main metrics for each hand-washing episode in the dataset.
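As one illustration of this naming convention, the annotation files for a given video can be located programmatically. A minimal sketch in Python (the function name is ours, and the example paths mirror those shown above):

```python
from pathlib import Path

def annotation_paths(video_path, annotator_dir):
    """Given a video file path and an annotator folder, return the
    expected CSV and JSON annotation paths (same stem, different suffix)."""
    stem = Path(video_path).stem
    base = Path(annotator_dir)
    return base / f"{stem}.csv", base / f"{stem}.json"

csv_p, json_p = annotation_paths(
    "Dataset1/Videos/2020-06-27_11-57-25_camera104.mp4",
    "Dataset1/Annotations/Annotator1")
```

Since annotators filtered videos independently (see Section 3), the returned paths may not exist for every annotator, so callers should check for file existence.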

2.2. Annotations

Each frame in each video is annotated with the following information: (1) whether hand-washing is visible in the frame, and (2) which of the WHO movements, if any, the hand-washing corresponds to.
Each CSV file contains three columns. The first column, frame_time, gives the time of the frame in the video in seconds; the second and third columns, is_washing and movement_code, contain the movement annotations. Each JSON file contains a dictionary with several keys. The data under the “labels” key contains movement annotations for each frame. The other keys, “is_ring_present”, “is_armband_present”, and “is_long_nails_present”, contain supplementary information about the quality of the hand-washing performed in the video: whether the person washing their hands has a ring, an armband or watch, and long (artificial) nails, respectively.
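A minimal sketch of reading these formats in Python, under the assumption that each CSV file starts with a header row naming the three columns (the sample data below is fabricated for illustration):

```python
import csv
import io
import json

def read_csv_annotation(text):
    """Parse a per-frame annotation CSV with the columns
    frame_time, is_washing, movement_code (header row assumed)."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(r["frame_time"]), int(r["is_washing"]), int(r["movement_code"]))
            for r in reader]

# Fabricated two-frame example in the described CSV layout.
sample_csv = "frame_time,is_washing,movement_code\n0.0,0,0\n0.033,1,2\n"
frames = read_csv_annotation(sample_csv)

# Fabricated example of the JSON layout with the documented keys.
sample_json = ('{"labels": [0, 2], "is_ring_present": false, '
               '"is_armband_present": false, "is_long_nails_present": false}')
meta = json.loads(sample_json)
```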
The movement codes in the annotations correspond to specific hand-washing movements as defined by the WHO guidelines [5], described in Table 2.
According to our annotation guidelines presented to the annotators:
  • Codes 1 to 6 are used to denote a correctly performed hand-washing movement that corresponds to one of the WHO movements.
  • Code 0 is used to denote both WHO washing movements that are not performed correctly and washing movements that are not defined by the WHO.
  • Code 7 is used to denote the process of correctly terminating the hand-washing episode. Specifically, for code 7 to be used, we require that the person washing their hands takes a paper towel, dries their hands with it, and then closes the faucet with the towel.
  • Frames that do not depict hand-washing are labeled as such (is_washing set to zero). The movement code should be ignored for such frames.
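Following these rules, per-movement washing time can be computed from the frame annotations. A sketch assuming the 30 FPS frame rate from Table 1 (the function is our own illustration, not part of the dataset tooling):

```python
from collections import Counter

FPS = 30  # frame rate of the dataset videos (Table 1)

def movement_durations(frames):
    """Seconds spent per movement code; frames is a list of
    (is_washing, movement_code) tuples, one per video frame.
    Per the annotation guidelines, the movement code is ignored
    whenever is_washing is zero."""
    counts = Counter(code for washing, code in frames if washing)
    return {code: n / FPS for code, n in counts.items()}

# 60 frames of movement 1, 30 non-washing frames, 30 frames of code 7.
d = movement_durations([(1, 1)] * 60 + [(0, 3)] * 30 + [(1, 7)] * 30)
```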

2.3. Quality Issues

To increase the reliability of the annotations, the majority of files in the dataset are labeled by more than one annotator (Figure 1). Overall, there is a reasonably good match between the annotators. For example, frames that are annotated by two annotators have 91.23% agreement on the is_washing field. Frames for which both annotators set is_washing to one further show 90.06% agreement on the movement code. We have identified several reasons for mismatches:
  • Short-term disagreement between the labels typically occurs at the points where the washing movement changes.
  • Movements 1 and 3 look quite similar and can be hard to distinguish when filmed at an angle.
  • The interpretation of what constitutes movement 7 differed between reviewers.
  • Some of the videos have low light levels.
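Agreement figures of this kind can be reproduced from any pair of annotation files. A sketch of one way to formulate the two metrics (our own formulation, shown on fabricated data):

```python
def agreement(a, b):
    """Frame-level agreement between two annotators.
    a, b: equal-length lists of (is_washing, movement_code) per frame.
    Returns (share of frames agreeing on is_washing, share of frames
    agreeing on the movement code among frames where both annotators
    set is_washing to one, or None if there are no such frames)."""
    washing_match = sum(x[0] == y[0] for x, y in zip(a, b)) / len(a)
    both = [(x[1], y[1]) for x, y in zip(a, b) if x[0] == 1 and y[0] == 1]
    move_match = (sum(m == n for m, n in both) / len(both)) if both else None
    return washing_match, move_match

w, m = agreement([(1, 2), (1, 2), (0, 0), (1, 5)],
                 [(1, 2), (1, 3), (1, 0), (1, 5)])
```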

3. Methods

The videos were recorded with either AirLive or Axis IP cameras. They were saved by Raspberry Pi 4 devices, to which the cameras were attached over Ethernet cables. The cameras were deployed in nine different locations simultaneously, with one location corresponding to one sink. In total there were 12 cameras, and some of the Raspberry Pi devices had more than one camera attached. This enabled us to record hand-washing at a single sink simultaneously from different angles.
The locations where the cameras were deployed included a neurology unit, a surgery unit, an intensive care unit, and other units of the Pauls Stradins Clinical University Hospital. In the dataset, the locations are anonymized. If location information is required for your research, contact us for the list of locations and their correspondence to the directories in the dataset.
The cameras recorded all continuous movements within their field of view. To filter out short-term movements (e.g., a person passing by), recording was only started once motion had been detected for 3 s continuously. As a result, up to the first 3 s of each hand-washing episode may be missing from the videos. Videos shorter than 20 s were not saved by the recording system, to minimize the number of false-positive motion detections.
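These recording rules can be summarized in a simplified model. A sketch under the assumption that the 20 s minimum applies to the saved recording; the exact camera firmware logic is not part of the dataset:

```python
MOTION_TRIGGER_S = 3  # motion must persist this long before recording starts
MIN_VIDEO_S = 20      # shorter recordings are discarded as false positives

def recorded_duration(motion_duration):
    """Return the saved video length (seconds) for a continuous motion
    event of the given length, or None if nothing is kept. Recording
    begins only after 3 s of continuous motion, so up to the first 3 s
    of an episode are missing, and videos under 20 s are discarded."""
    if motion_duration <= MOTION_TRIGGER_S:
        return None
    recorded = motion_duration - MOTION_TRIGGER_S
    return recorded if recorded >= MIN_VIDEO_S else None
```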
The recorded data was manually collected on a central server at Riga Stradins University by bringing in the SD cards from the Raspberry Pi devices and uploading their data. Subsequently, annotators were given access to the video files on the server and asked to label the files using a Python OpenCV application that we had developed for the task. The annotators pre-filtered the files to remove videos that did not include an actual hand-washing episode. Each annotator did this independently, based on our guidelines. As a result, some files were filtered out by one annotator but not by others. In the final dataset, there are instances when an annotation for a video is present in, e.g., Annotator1, but not in Annotator2. The annotators are anonymized in the final dataset, and the folder Annotator1 in one part of the dataset is not necessarily annotated by the same person as the folder Annotator1 in a different part of the dataset.

4. Application Example

The intention of this dataset is two-fold: first, to serve as a basis for training machine-learning classifiers for automated hand-washing movement recognition and quality control; second, to allow investigation of the real-world quality of washing performed by working medical staff.
To demonstrate the first application, we trained MobileNetV2 [14], a neural network classifier available in Keras [15], a high-level deep learning API for the deep learning tensor library TensorFlow [16], on the video data and annotations. We aimed to recognize movements 1 to 6 as defined by the WHO and to distinguish them from movement 0. Movement 7, while important for clinical outcomes, is not relevant to our machine-learning goals here, so it was treated as movement 0 for classification purposes.
We started by partitioning the dataset into two portions, test (10%) and train-plus-validation (90%), ensuring that frames from test videos were kept separate from frames in the training and validation datasets. We subsequently extracted frames from the videos and saved them as JPEG files. As class 0 is over-represented in the data, we only saved 20% of the JPEG files corresponding to this class. Further data processing consisted of resizing the image files to 224 × 224 pixels, the standard input size for the MobileNetV2 implementation in Keras, and applying random flipping and rotation by 20 degrees to augment the data.
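The class-0 subsampling step can be made deterministic, for example by keeping every fifth class-0 frame; this is one possible scheme (the paper does not specify how the 20% of class-0 frames were selected):

```python
def keep_frame(index, movement_code, keep_every=5):
    """Decide whether an extracted frame is saved as a JPEG. Frames of
    movements 1-6 are always kept; for the over-represented class 0,
    only one frame in keep_every is kept (roughly 20%). Deterministic
    subsampling shown here is an assumption, not the paper's method."""
    return movement_code != 0 or index % keep_every == 0
```

A deterministic rule like this makes the preprocessing reproducible across runs; random sampling with a fixed seed would serve the same purpose.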
To obtain the first results quickly, we used a MobileNetV2 model pre-trained on the ImageNet dataset [17] and trained it for 10 epochs using the Adam optimizer with a learning rate of 0.008 and a categorical cross-entropy loss function (Figure 2); no more refined hyperparameter tuning was done. As a result, we achieved a classification accuracy of 0.7511 on the test dataset.
Python scripts for preprocessing the dataset and training the MobileNetV2 model are available at https://github.com/edi-riga/handwash, accessed on 22 March 2021. A more extensive evaluation of machine-learning classifiers on this data is available in [18].

Author Contributions

Conceptualization, A.E., R.K., A.S. (Andreta Slavinska), M.L., A.R.; methodology, A.E., M.I., R.K., A.S. (Andreta Slavinska), M.L., A.R., A.V., A.G.; software, M.I., A.S. (Ansis Skadins), A.E.; validation, M.I., A.S. (Ansis Skadins), A.E.; data curation, M.L., A.E., A.R.; writing—original draft preparation, M.L., A.R., A.E., M.I.; visualization, A.E., M.I.; supervision, A.E., R.K.; project administration, A.S. (Andreta Slavinska); funding acquisition, A.S. (Andreta Slavinska), A.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Education and Science, Republic of Latvia, project “Integration of reliable technologies for protection against COVID-19 in healthcare and high-risk areas”, project No. VPP-COVID-2020/1-0004.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of Riga Stradins University (protocol code Nr. 6-1/08/10, date of approval 23 July 2021).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data is available at https://doi.org/10.5281/zenodo.4537209, accessed on 22 March 2021.

Acknowledgments

We thank the Pauls Stradins Hospital for allowing us to collect data in their premises. We also extensively thank all who participated in the labor-intensive task of annotating the data.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. European Centre for Disease Prevention and Control. Single Programming Document 2019–2021. 2019. Available online: https://www.ecdc.europa.eu/en/publications-data/single-programming-document-2019-2021 (accessed on 22 March 2021).
  2. World Health Organization. Challenges to Tackling Antimicrobial Resistance: Economic and Policy Responses; OECD Publishing: Paris, France, 2020.
  3. World Health Organization. No Time to Wait: Securing the Future from Drug-Resistant Infections. 2019. Available online: https://www.who.int/antimicrobial-resistance/interagency-coordination-group/final-report/en/ (accessed on 22 March 2021).
  4. CDC. Interim Infection Prevention and Control Recommendations for Healthcare Personnel During the Coronavirus Disease 2019 (COVID-19) Pandemic. Available online: https://www.cdc.gov/coronavirus/2019-ncov/hcp/infection-control-recommendations.html (accessed on 22 March 2021).
  5. World Health Organization. WHO Guidelines on Hand Hygiene in Health Care: First Global Patient Safety Challenge Clean Care Is Safer Care; World Health Organization: Geneva, Switzerland, 2009.
  6. Widmer, A.F.; Dangel, M. Alcohol-based handrub: Evaluation of technique and microbiological efficacy with international infection control professionals. Infect. Control Hosp. Epidemiol. 2004, 25, 207–209.
  7. Widmer, A.F.; Conzelmann, M.; Tomic, M.; Frei, R.; Stranden, A.M. Introducing alcohol-based hand rub for hand hygiene: The critical need for training. Infect. Control Hosp. Epidemiol. 2007, 28, 50–54.
  8. Sutter, S.T.; Frei, R.; Dangel, M.; Widmer, A. Effect of teaching recommended World Health Organization technique on the use of alcohol-based hand rub by medical students. Infect. Control Hosp. Epidemiol. 2010, 31, 1194–1195.
  9. Szilágyi, L.; Haidegger, T.; Lehotsky, Á.; Nagy, M.; Csonka, E.A.; Sun, X.; Ooi, K.L.; Fisher, D. A large-scale assessment of hand hygiene quality and the effectiveness of the “WHO 6-steps”. BMC Infect. Dis. 2013, 13, 1–10.
  10. Luangasanatip, N.; Hongsuwan, M.; Limmathurotsakul, D.; Lubell, Y.; Lee, A.S.; Harbarth, S.; Day, N.P.; Graves, N.; Cooper, B.S. Comparative efficacy of interventions to promote hand hygiene in hospital: Systematic review and network meta-analysis. BMJ 2015, 351, h3728.
  11. Kay, W.; Carreira, J.; Simonyan, K.; Zhang, B.; Hillier, C.; Vijayanarasimhan, S.; Viola, F.; Green, T.; Back, T.; Natsev, P.; et al. The kinetics human action video dataset. arXiv 2017, arXiv:1705.06950.
  12. Kaggle. Sample: Hand Wash Dataset. 2020. Available online: https://www.kaggle.com/realtimear/hand-wash-dataset (accessed on 22 March 2021).
  13. Yoshikawa, Y.; Lin, J.; Takeuchi, A. Stair actions: A video dataset of everyday home actions. arXiv 2018, arXiv:1804.04326.
  14. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
  15. Chollet, F. Keras: The Python Deep Learning Library; Astrophysics Source Code Library: Houghton, MI, USA, 2018; p. ascl-1806.
  16. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283.
  17. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
  18. Ivanovs, M.; Kadikis, R.; Lulla, M.; Rutkovskis, A.; Elsts, A. Automated Quality Assessment of Hand Washing Using Deep Learning. arXiv 2020, arXiv:2011.11383.
Figure 1. Number of annotators per video.
Figure 2. Training and validation accuracy of MobileNetV2 model.
Table 1. Dataset overview.
Property                  Value
Frame rate                30 FPS
Resolution                320 × 240 and 640 × 480
Number of videos          3185
Number of annotations     6690
Total washing duration    83,804 s
Movement 1–7 duration     27,517 s
Table 2. Movement codes.
Code  Movement
1     Hand-washing movement: Palm to palm
2     Hand-washing movement: Palm over dorsum, fingers interlaced
3     Hand-washing movement: Palm to palm, fingers interlaced
4     Hand-washing movement: Backs of fingers to opposing palm, fingers interl.
5     Hand-washing movement: Rotational rubbing of the thumb
6     Hand-washing movement: Fingertips to palm
7     Turning off the faucet with a paper towel
0     Other hand-washing movement
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.