Mobile Camera-Based Image and Video Processing

A special issue of Journal of Imaging (ISSN 2313-433X). This special issue belongs to the section "Image and Video Processing".

Deadline for manuscript submissions: closed (15 December 2021) | Viewed by 15116

Special Issue Editors

Dr. Petra Gomez-Krämer
Laboratory of Informatics, Image and Interaction (L3i), La Rochelle University, La Rochelle, France
Interests: image processing and analysis; video processing and analysis; pattern recognition; document forensics and security; mobile camera-based processing and analysis

Dr. Joseph Chazalon
EPITA Research and Development Laboratory (LRDE), Le Kremlin-Bicêtre, France
Interests: computer vision; machine learning; image processing; document analysis and recognition; evaluation protocols; mobile document imaging

Dr. Muhammad Muzzamil Luqman
L3i Laboratory, La Rochelle University, 17000 La Rochelle, France
Interests: document image analysis; pattern recognition; computer vision

Special Issue Information

Dear Colleagues,

Today, mobile devices such as smartphones are widely used to capture images of objects because they are readily available and simple to use. They allow us to easily capture several image samples or a video of an object in photo or video mode. However, mobile camera-captured images and videos pose a particular challenge for the analysis, recognition and understanding of their content, as they may contain several degradations such as lighting variations, shadows, geometric distortions, blur, noise and low resolution. Another challenge is processing this content on the mobile device itself, as methods must account for limited computational resources and avoid draining the battery. The versatility of mobile devices enables new applications in various domains such as augmented reality, mobile document scanning and integrity checks, user authentication and presentation attack detection, automated photo enhancement and 3D acquisition.

We invite contributions from an academic or industrial perspective in the field of mobile camera-based image and video processing. We hope that the submitted articles will raise and respond to new issues in mobile camera-based processing and its applications, or present ongoing research to improve image and video analysis, recognition and understanding methods. The topics of interest include, but are not limited to, the keywords below.

Dr. Petra Gomez-Krämer
Dr. Joseph Chazalon
Dr. Muhammad Muzzamil Luqman
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Journal of Imaging is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Mobile camera-based acquisition: acquisition protocols, arrays/groups of cameras, the fusion of multiple sensor data
  • Camera calibration and pose estimation
  • The preprocessing of mobile camera-captured images: deblurring, dewarping, denoising and super-resolution
  • Quality assessment: blur assessment, noise assessment, compression quality and readability
  • The analysis, recognition and understanding of mobile camera-captured images and videos
  • Embedded processing: real-time processing and lightweight approaches
  • Interactive approaches: live feedback
  • Applications: augmented reality, virtual reality, mobile camera-based forensics and security, mobile document acquisition, and new usages of mobile camera-based images and video
  • Datasets and evaluation: challenging datasets, performance evaluation and prototyping tools

Published Papers (4 papers)

Research

12 pages, 7206 KiB  
Article
Document Liveness Challenge Dataset (DLC-2021)
by Dmitry V. Polevoy, Irina V. Sigareva, Daria M. Ershova, Vladimir V. Arlazarov, Dmitry P. Nikolaev, Zuheng Ming, Muhammad Muzzamil Luqman and Jean-Christophe Burie
J. Imaging 2022, 8(7), 181; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8070181 - 28 Jun 2022
Cited by 3 | Viewed by 3717
Abstract
Various government and commercial services, including, but not limited to, e-government, fintech, banking, and sharing economy services, widely use smartphones to simplify service access and user authorization. Many organizations involved in these areas use identity document analysis systems in order to improve user personal-data-input processes. The tasks of such systems are not only ID document data recognition and extraction but also fraud prevention by detecting document forgery or by checking whether the document is genuine. Modern systems of this kind are often expected to operate in unconstrained environments. A significant amount of research has been published on the topic of mobile ID document analysis, but the main difficulty for such research is the lack of public datasets due to the fact that the subject is protected by security requirements. In this paper, we present the DLC-2021 dataset, which consists of 1424 video clips captured in a wide range of real-world conditions, focused on tasks relating to ID document forensics. The novelty of the dataset is that it contains shots from video with color laminated mock ID documents, color unlaminated copies, grayscale unlaminated copies, and screen recaptures of the documents. The proposed dataset complies with the GDPR because it contains images of synthetic IDs with generated owner photos and artificial personal information. For the presented dataset, benchmark baselines are provided for tasks such as screen recapture detection and glare detection. The data presented are openly available in Zenodo.
(This article belongs to the Special Issue Mobile Camera-Based Image and Video Processing)

20 pages, 2664 KiB  
Article
Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity
by Sanjana Gunna, Rohit Saluja and Cheerakkuzhi Veluthemana Jawahar
J. Imaging 2022, 8(4), 86; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8040086 - 23 Mar 2022
Cited by 1 | Viewed by 2574
Abstract
Reading Indian scene texts is complex due to the use of regional vocabulary, multiple fonts/scripts, and text size. This work investigates the significant differences in Indian and Latin Scene Text Recognition (STR) systems. Recent STR works rely on synthetic generators that involve diverse fonts to ensure robust reading solutions. We present utilizing additional non-Unicode fonts with generally employed Unicode fonts to cover font diversity in such synthesizers for Indian languages. We also perform experiments on transfer learning among six different Indian languages. Our transfer learning experiments on synthetic images with common backgrounds provide an exciting insight that Indian scripts can benefit from each other than from the extensive English datasets. Our evaluations for the real settings help us achieve significant improvements over previous methods on four Indian languages from standard datasets like IIIT-ILST, MLT-17, and the new dataset (we release) containing 440 scene images with 500 Gujarati and 2535 Tamil words. Further enriching the synthetic dataset with non-Unicode fonts and multiple augmentations helps us achieve a remarkable Word Recognition Rate gain of over 33% on the IIIT-ILST Hindi dataset. We also present the results of lexicon-based transcription approaches for all six languages.
(This article belongs to the Special Issue Mobile Camera-Based Image and Video Processing)

20 pages, 21653 KiB  
Article
Digitization of Handwritten Chess Scoresheets with a BiLSTM Network
by Nishatul Majid and Owen Eicher
J. Imaging 2022, 8(2), 31; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8020031 - 30 Jan 2022
Cited by 2 | Viewed by 3880
Abstract
During an Over-the-Board (OTB) chess event, all players are required to record their moves strictly by hand, and later the event organizers are required to digitize these sheets for official records. This is a very time-consuming process, and in this paper we present an alternate workflow of digitizing scoresheets using a BiLSTM network. Starting with a pretrained network for standard Latin handwriting recognition, we imposed chess-specific restrictions and trained with our Handwritten Chess Scoresheet (HCS) dataset. We developed two post-processing strategies utilizing the facts that we have two copies of each scoresheet (both players are required to write the entire game), and we can easily check if a move is valid. The autonomous post-processing requires no human interaction and achieves a Move Recognition Accuracy (MRA) around 95%. The semi-autonomous approach, which requires requesting user input on unsettling cases, increases the MRA to around 99% while interrupting only on 4% moves. This is a major extension of the very first handwritten chess move recognition work reported by us in September 2021, and we believe this has the potential to revolutionize the scoresheet digitization process for the thousands of chess events that happen every day.
(This article belongs to the Special Issue Mobile Camera-Based Image and Video Processing)

13 pages, 2420 KiB  
Article
A Temporal Boosted YOLO-Based Model for Birds Detection around Wind Farms
by Hiba Alqaysi, Igor Fedorov, Faisal Z. Qureshi and Mattias O’Nils
J. Imaging 2021, 7(11), 227; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging7110227 - 27 Oct 2021
Cited by 10 | Viewed by 4049
Abstract
Object detection for sky surveillance is a challenging problem due to having small objects in a large volume and a constantly changing background which requires high resolution frames. For example, detecting flying birds in wind farms to prevent their collision with the wind turbines. This paper proposes a YOLOv4-based ensemble model for bird detection in grayscale videos captured around wind turbines in wind farms. In order to tackle this problem, we introduce two datasets—(1) Klim and (2) Skagen—collected at two locations in Denmark. We use Klim training set to train three increasingly capable YOLOv4 based models. Model 1 uses YOLOv4 trained on the Klim dataset, Model 2 introduces tiling to improve small bird detection, and the last model uses tiling and temporal stacking and achieves the best mAP values on both Klim and Skagen datasets. We used this model to set up an ensemble detector, which further improves mAP values on both datasets. The three models achieve testing mAP values of 82%, 88%, and 90% on the Klim dataset. mAP values for Model 1 and Model 3 on the Skagen dataset are 60% and 92%. Improving object detection accuracy could mitigate birds’ mortality rate by choosing the locations for such establishment and the turbines location. It can also be used to improve the collision avoidance systems used in wind energy facilities.
(This article belongs to the Special Issue Mobile Camera-Based Image and Video Processing)