Signal, Image and Video Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (31 August 2021) | Viewed by 10639

Special Issue Editor


Guest Editor
Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Korea
Interests: image processing; video compression; multimedia signal processing; image enhancement; deep learning

Special Issue Information

Dear Colleagues,

The objective of this Special Issue is to present innovative developments across various aspects of signal, image, and video processing. It offers broad coverage of recent advances in these areas, in both theory and applications. Tremendous advances have emerged from convolutional neural networks for image and video processing based on deep learning, inspired by the convolution operation used in signal processing. Numerous image and video applications using artificial intelligence and deep learning are emerging today.

This Special Issue covers theory, algorithms, methods, and architectures for the processing, formation, communication, and analysis of signals, images, and video in a variety of applications.

Topics include but are not limited to:

  • Signal/image/video representation;
  • Image/video understanding;
  • Image/video compression;
  • Intelligent signal processing;
  • Image enhancement and denoising based on deep learning;
  • Depth estimation and 3D reconstruction;
  • Image and video quality metrics;
  • Computer vision using convolutional neural networks.

Prof. Changhoon Yim
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • signal processing
  • image processing
  • video processing
  • computer vision
  • convolutional neural network (CNN)
  • deep learning
  • image enhancement
  • quality metrics

Published Papers (5 papers)


Research

13 pages, 11828 KiB  
Article
Interactive Part Segmentation Using Edge Images
by Ju-Young Oh and Jung-Min Park
Appl. Sci. 2021, 11(21), 10106; https://0-doi-org.brum.beds.ac.uk/10.3390/app112110106 - 28 Oct 2021
Viewed by 1298
Abstract
As more and more fields utilize deep learning, there is an increasing demand for suitable training data in each field. Existing interactive object segmentation models can easily produce mask label data because they can accurately segment the area of the target object through user interaction. However, it is difficult to accurately segment a target part within an object using the existing models. We propose a method to increase the accuracy of part segmentation by using an interactive object segmentation model trained only with edge images instead of color images. The results evaluated on the PASCAL VOC Part dataset show that the proposed method can segment the target part more accurately than the existing interactive object segmentation model and the semantic part-segmentation model.
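The abstract's core idea is to feed the segmentation model edge images rather than color images. As a minimal sketch of that preprocessing step, the following computes a Sobel gradient-magnitude edge image; this is only a stand-in for whatever edge detector the authors actually used, and the function name is our own:

```python
import numpy as np

def sobel_edges(gray: np.ndarray) -> np.ndarray:
    """Return a normalized gradient-magnitude edge image for a 2-D grayscale array."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    padded = np.pad(gray.astype(float), 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)  # horizontal gradient
            gy[i, j] = np.sum(patch * ky)  # vertical gradient
    mag = np.hypot(gx, gy)
    return mag / mag.max() if mag.max() > 0 else mag

# A vertical step edge: the strongest response lies along the step.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img)
```

An edge image of this kind discards color and texture, which is plausibly why a model trained on it focuses on part boundaries rather than whole-object appearance.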
(This article belongs to the Special Issue Signal, Image and Video Processing)

20 pages, 10260 KiB  
Article
Keyword Detection Based on RetinaNet and Transfer Learning for Personal Information Protection in Document Images
by Guo-Shiang Lin, Jia-Cheng Tu and Jen-Yung Lin
Appl. Sci. 2021, 11(20), 9528; https://0-doi-org.brum.beds.ac.uk/10.3390/app11209528 - 13 Oct 2021
Cited by 7 | Viewed by 1909
Abstract
In this paper, a keyword detection scheme is proposed based on deep convolutional neural networks for personal information protection in document images. The proposed scheme is composed of key character detection and lexicon analysis. The first part is the key character detection developed based on RetinaNet and transfer learning. To find the key characters, RetinaNet, which is composed of convolutional layers featuring a pyramid network and two subnets, is exploited to detect key characters within the region of interest in a document image. After the key character detection, the second part is a lexicon analysis, which analyzes and combines several key characters to find the keywords. To train the model of RetinaNet, synthetic image generation and data augmentation are exploited to yield a large image dataset. To evaluate the proposed scheme, many document images are selected for testing, and two performance measurements, IoU (Intersection over Union) and mAP (Mean Average Precision), are used in this paper. Experimental results show that the mAP rates of the proposed scheme are 85.1% and 85.84% for key character detection and keyword detection, respectively. Furthermore, the proposed scheme is superior to Tesseract OCR (Optical Character Recognition) software for detecting the key characters in document images. The experimental results demonstrate that the proposed method can effectively localize and recognize these keywords within noisy document images with Mandarin Chinese words.
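The IoU measure cited in the abstract is the standard overlap ratio between a predicted and a ground-truth box. A minimal implementation for axis-aligned boxes (coordinates as (x1, y1, x2, y2); the helper name is ours):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # zero if boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 10x10 boxes offset by 5 pixels overlap in a 5x5 region:
# inter = 25, union = 100 + 100 - 25 = 175, so IoU = 25/175.
score = iou((0, 0, 10, 10), (5, 5, 15, 15))
```

In detection benchmarks a prediction typically counts as correct when its IoU with a ground-truth box exceeds a threshold (often 0.5), and mAP averages precision over recall levels and classes under that criterion.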
(This article belongs to the Special Issue Signal, Image and Video Processing)

15 pages, 32642 KiB  
Article
Novel Multiple-Image Encryption Scheme Based on Coherent Beam Combining and Equal Modulus Decomposition
by Wei Li, Aimin Yan and Hongbo Zhang
Appl. Sci. 2021, 11(19), 9310; https://0-doi-org.brum.beds.ac.uk/10.3390/app11199310 - 07 Oct 2021
Cited by 1 | Viewed by 1457
Abstract
In our research, we propose a novel asymmetric multiple-image encryption method using a conjugate Dammann grating (CDG), based on the coherent beam combining (CBC) principle. The phase generated by the Dammann grating (DG) beam-splitting system is processed and added to the image to be encrypted; the ciphertexts and keys are then generated by equal modulus decomposition (EMD). Decryption combines the beams through the CDG and collects the combined images in the far field. The proposed encryption scheme is flexible and thus extendable. CDG structure parameters, such as the period length of the CDG, can be used as encryption keys to increase complexity. The Fresnel diffraction distance can also be used as an encryption key. The power of the combined beam is stronger than that of a single-beam system, which is convenient for long-distance transmission and easy to detect. Simulation results show that the proposed method is effective and efficient for asymmetric multiple-image encryption. A sensitivity analysis of CDG alignment has also been performed, showing the robustness of the system. The influence of occlusion and noise attacks on decryption is also discussed, demonstrating the stability of the system.
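Equal modulus decomposition splits a complex field C into two components of identical modulus whose sum reproduces C, with a random phase mask acting as the private key. A minimal numpy sketch of that decomposition step (illustrative only; the paper's actual EMD operates on the optically encoded field, and the names here are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def emd(field: np.ndarray, delta: np.ndarray):
    """Split complex `field` into two equal-modulus parts P1, P2 with P1 + P2 = field.

    `delta` is a random angular-offset mask in (-pi/2, pi/2), playing the
    role of the private key: P1 = r*exp(i*(phi+delta)), P2 = r*exp(i*(phi-delta)),
    where r = |field| / (2*cos(delta)), so P1 + P2 = |field|*exp(i*phi).
    """
    amp = np.abs(field)
    phi = np.angle(field)
    r = amp / (2.0 * np.cos(delta))  # common modulus of both parts
    p1 = r * np.exp(1j * (phi + delta))
    p2 = r * np.exp(1j * (phi - delta))
    return p1, p2

field = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
delta = rng.uniform(-1.2, 1.2, size=(4, 4))  # keep |delta| < pi/2
p1, p2 = emd(field, delta)
```

Because p1 and p2 have identical modulus, neither component alone reveals the amplitude structure of the original field; both (one serving as ciphertext, the other as key) are needed to reconstruct it.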
(This article belongs to the Special Issue Signal, Image and Video Processing)

13 pages, 10602 KiB  
Article
DSCope: Development of Automatic Program for Detecting Fractures and Measuring Dip Angles
by Dongseob Lee, Sangyoon Sung, Junghae Choi and You-Hong Kihm
Appl. Sci. 2021, 11(14), 6423; https://0-doi-org.brum.beds.ac.uk/10.3390/app11146423 - 12 Jul 2021
Viewed by 1482
Abstract
Changes in underground environments have been predicted by investigating underground bedrock conditions and analyzing the shapes of discontinuities in the rocks. The most commonly used method is to drill a borehole, insert a camera inside and capture the wall of the borehole in a photograph to investigate the discontinuities. However, if the images of the borehole cannot be captured, the characteristics of the discontinuities in the bedrock are analyzed by capturing the drilling cores in photographs. In this case, considerable time is required to analyze the drilling cores with the naked eye and measure the attitudes of the discontinuities developed in the cores in detail. Moreover, the results may vary depending on the researcher’s perspective. To overcome these limitations, this study develops a program for analyzing photographs of drilling cores. The program can automatically identify discontinuities in drilling cores and measure the attitudes through linear fitting using only drilling core photographs. In addition, we apply the program to practical field data to verify its applicability. We found that the program could provide more accurate and objective information on drilling cores than the currently used method and could more effectively organize the characteristics of fractures in the study area.
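The "linear fitting" step can be illustrated by the standard borehole-imaging geometry: a planar fracture crossing a cylindrical core of radius r traces, on the unrolled core surface, a sinusoid whose amplitude is r·tan(dip). A least-squares sketch under that assumption (this is a generic illustration, not the authors' DSCope code):

```python
import numpy as np

def dip_from_trace(phi: np.ndarray, z: np.ndarray, radius: float) -> float:
    """Fit z(phi) = z0 + a*cos(phi) + b*sin(phi) by least squares and return
    the dip angle in degrees. A plane dipping at angle theta through a
    cylinder of radius `radius` traces z with amplitude radius*tan(theta)."""
    design = np.column_stack([np.ones_like(phi), np.cos(phi), np.sin(phi)])
    (z0, a, b), *_ = np.linalg.lstsq(design, z, rcond=None)
    amplitude = np.hypot(a, b)  # sinusoid amplitude = radius * tan(dip)
    return np.degrees(np.arctan2(amplitude, radius))

# Synthetic trace: a 30-degree dip on a core of radius 30 mm.
phi = np.linspace(0, 2 * np.pi, 100)
radius = 30.0
z = 500.0 + radius * np.tan(np.radians(30.0)) * np.cos(phi - 0.7)
dip = dip_from_trace(phi, z, radius)
```

Fitting in the cos/sin basis keeps the problem linear even though the fracture's azimuthal orientation (the phase of the sinusoid) is unknown.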
(This article belongs to the Special Issue Signal, Image and Video Processing)

11 pages, 2736 KiB  
Article
A Deep Multi-Frame Super-Resolution Network for Dynamic Scenes
by Ze Pan, Zheng Tan and Qunbo Lv
Appl. Sci. 2021, 11(7), 3285; https://0-doi-org.brum.beds.ac.uk/10.3390/app11073285 - 06 Apr 2021
Cited by 2 | Viewed by 2372
Abstract
Multi-frame super-resolution techniques have flourished over the past two decades. However, little attention has been paid to the combination of deep learning and multi-frame super-resolution. One reason is that most deep learning-based super-resolution methods cannot handle varying numbers of input frames. Another is that it is hard to capture accurate temporal and spatial information because of the misalignment of input images. To solve these problems, we propose an optical-flow-based multi-frame super-resolution framework that can deal with various numbers of input frames. This framework makes full use of the input frames, yielding better performance. In addition, we use a spatial subpixel alignment module for more accurate subpixel-wise spatial alignment and introduce a dual weighting module to generate weights for temporal fusion. Both modules lead to more effective and accurate temporal fusion. We compare our method with other state-of-the-art methods and conduct ablation studies. The results of qualitative and quantitative analyses show that our method achieves state-of-the-art performance, demonstrating the advantage of the designed framework and the necessity of the proposed modules.
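The idea of weighted temporal fusion over a variable number of aligned frames can be sketched as a per-pixel softmax over similarity to the reference frame, so that misaligned frames contribute little. This is only a simple stand-in for the paper's learned dual weighting module, with names and the similarity measure chosen by us:

```python
import numpy as np

def fuse_frames(aligned: np.ndarray, reference: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Fuse N aligned frames of shape (N, H, W) into one (H, W) frame using
    per-pixel softmax weights based on agreement with the reference frame.
    Pixels that disagree with the reference (e.g. alignment errors) get low weight."""
    err = (aligned - reference[None]) ** 2       # per-pixel squared error
    logits = -err / temperature
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=0, keepdims=True)            # weights sum to 1 per pixel
    return (w * aligned).sum(axis=0)

# Three "aligned" frames; one carries a misaligned outlier pixel.
ref = np.full((2, 2), 0.5)
frames = np.stack([ref, ref, ref.copy()])
frames[2, 0, 0] = 5.0                            # outlier from bad alignment
fused = fuse_frames(frames, ref, temperature=0.1)
```

Because the fusion is a weighted sum over whatever frames are supplied, the same operation works for any number of input frames, which is the flexibility the abstract highlights.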
(This article belongs to the Special Issue Signal, Image and Video Processing)
