Scale Space and Variational Methods in Computer Vision

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 February 2023) | Viewed by 7106

Special Issue Editor

Dr. Deepak Ghimire
College of Information Technology, School of AI Convergence, Soongsil University, Dongjak-Gu 06978, Seoul, Korea
Interests: image processing; computer vision; machine learning; deep learning; efficient convolutional neural networks; facial emotion analysis

Special Issue Information

Dear Colleagues,

Computer vision is a field of artificial intelligence that enables computers to extract meaningful information from visual inputs such as images or videos and to act on that information. Scale-space and variational methods form one family of techniques for achieving this goal, centered on multiscale analysis of image content, partial differential equations, geometric methods, variational formulations, and optimization. In imaging, distant objects project to a smaller size and nearby objects to a larger size, so images and videos should be processed at all scales simultaneously if meaningful information is to be extracted. Scale-space theory provides a formal framework for representing image data and computing features at all scales. Variational methods, in turn, are a class of optimization techniques that minimize cost functionals, often in high-dimensional spaces: instead of specifying a heuristic sequence of processing steps, one defines an energy whose minimization yields the algorithm. Many computer vision problems can be formulated variationally, including denoising, segmentation, super-resolution, tracking, object detection, optical flow estimation, depth estimation, and 3D reconstruction.
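As a small, self-contained illustration of the two themes above, the Python sketch below builds a Gaussian scale-space stack and minimizes a total-variation (ROF-style) denoising energy by gradient descent. It is a generic textbook example, not code from any paper in this issue, and the scale set, regularization weight, and step size are arbitrary choices.

```python
# Illustrative only: Gaussian scale space plus a crude gradient-descent solver for
# the ROF/total-variation energy  E(u) = 0.5 * ||u - f||^2 + lam * TV(u).
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_scale_space(image, sigmas=(1.0, 2.0, 4.0, 8.0)):
    """Return the image smoothed at a range of scales (larger sigma = coarser scale)."""
    return [gaussian_filter(image.astype(float), sigma=s) for s in sigmas]

def tv_denoise(f, lam=0.1, step=0.1, iters=200, eps=1e-3):
    """Explicit gradient descent on a smoothed TV-regularized denoising energy."""
    u = f.astype(float).copy()
    for _ in range(iters):
        ux = np.gradient(u, axis=1)
        uy = np.gradient(u, axis=0)
        mag = np.sqrt(ux**2 + uy**2 + eps**2)        # smoothed |grad u|
        # divergence of (grad u / |grad u|) approximates the curvature term
        div = np.gradient(ux / mag, axis=1) + np.gradient(uy / mag, axis=0)
        u -= step * ((u - f) - lam * div)            # descend the energy gradient
    return u
```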

Dr. Deepak Ghimire
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • denoising and filtering
  • image segmentation
  • super-resolution
  • 3D vision
  • medical imaging
  • feature extraction and analysis
  • scale–space methods in computer vision
  • motion estimation
  • stereo vision
  • 3D scene reconstruction
  • optical flow estimation
  • object detection and tracking
  • variational methods in computer vision
  • optimization

Published Papers (4 papers)

Research

16 pages, 1256 KiB  
Article
Magnitude and Similarity Based Variable Rate Filter Pruning for Efficient Convolution Neural Networks
by Deepak Ghimire and Seong-Heum Kim
Appl. Sci. 2023, 13(1), 316; https://doi.org/10.3390/app13010316 - 27 Dec 2022
Cited by 3 | Viewed by 1465
Abstract
The superior performance of recent deep learning models comes at the cost of a significant increase in computational complexity, memory use, and power consumption. Filter pruning is an effective neural network compression technique suitable for deploying models on modern low-power edge devices. In this paper, we propose a loss-aware Magnitude and Similarity based Variable rate Filter Pruning (MSVFP) technique. We studied several filter selection criteria based on filter magnitude and on similarity among filters within a convolution layer. Based on the assumption that each layer's sensitivity differs throughout the network, and unlike conventional fixed-rate pruning methods, our algorithm uses loss-aware filter selection criteria to automatically find a suitable pruning rate for each layer. In addition, the proposed algorithm adopts two different filter selection criteria, removing weak filters based on filter magnitude and redundant filters based on a filter similarity score. Finally, an iterative filter pruning and retraining approach is used to maintain the accuracy of the network while pruning it to the target reduction rate in floating-point operations (FLOPs). In the proposed algorithm, a small number of retraining steps during iterative pruning is sufficient to prevent an abrupt drop in network accuracy. Experiments with the commonly used VGGNet and ResNet models on the CIFAR-10 and ImageNet benchmarks show the superiority of the proposed method over existing methods in the literature. Notably, the VGG-16, ResNet-56, and ResNet-110 models on CIFAR-10 even improved upon their original accuracy with more than a 50% reduction in network FLOPs, and the ResNet-50 model on ImageNet reduced FLOPs by more than 42% with a negligible drop in accuracy.
(This article belongs to the Special Issue Scale Space and Variational Methods in Computer Vision)
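To make the scoring idea in the abstract concrete, here is a minimal PyTorch sketch that ranks the filters of a convolution layer by L1 magnitude and by their highest pairwise cosine similarity. The function names, the 30% pruning ratio, and the 50/50 split between weak and redundant filters are illustrative assumptions, not the authors' MSVFP implementation (which additionally selects per-layer variable rates in a loss-aware way).

```python
# Sketch of magnitude- and similarity-based filter scoring; not the authors' code.
import torch
import torch.nn.functional as F

def filter_scores(conv_weight):
    """conv_weight: (out_channels, in_channels, k, k) tensor of a conv layer."""
    flat = conv_weight.flatten(start_dim=1)                 # one row per filter
    magnitude = flat.abs().sum(dim=1)                       # L1 norm per filter
    normed = F.normalize(flat, dim=1)
    cosine = normed @ normed.t()                            # pairwise cosine similarity
    cosine.fill_diagonal_(0.0)
    redundancy = cosine.max(dim=1).values                   # most similar neighbor
    return magnitude, redundancy

def select_filters_to_prune(conv_weight, ratio=0.3, weak_share=0.5):
    """Mark `ratio` of filters: weakest by magnitude plus most redundant by similarity."""
    magnitude, redundancy = filter_scores(conv_weight)
    n_prune = int(ratio * conv_weight.shape[0])
    n_weak = int(weak_share * n_prune)
    weak = torch.argsort(magnitude)[:n_weak]                # smallest magnitude first
    ranked = torch.argsort(redundancy, descending=True)     # most redundant first
    chosen = set(weak.tolist())
    redundant = [i for i in ranked.tolist() if i not in chosen][: n_prune - n_weak]
    return torch.cat([weak, torch.tensor(redundant, dtype=torch.long)])
```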

16 pages, 2267 KiB  
Article
DRFENet: An Improved Deep Learning Neural Network via Dilated Skip Convolution for Image Denoising Application
by Ruizhe Zhong and Qingchuan Zhang
Appl. Sci. 2023, 13(1), 28; https://doi.org/10.3390/app13010028 - 20 Dec 2022
Viewed by 1762
Abstract
Deep learning technology dominates current research in image denoising. However, denoising performance is limited by the loss of target noise features during information propagation as the network deepens. This paper proposes a Dense Residual Feature Extraction Network (DRFENet) that combines a Dense Enhancement Block (DEB), a Residual Dilated Block (RDB), a Feature Enhancement Block (FEB), and a Simultaneous Iterative Reconstruction Block (SIRB). The DEB uses our proposed interval transmission strategy to enhance the extraction of noise features in the initial stage of the network. The RDB combines concatenated dilated convolutions with a skip connection, amplifying local features across different receptive fields. The FEB further enhances local feature information. The SIRB uses an attention block to learn the noise distribution while using residual learning (RL) to reconstruct the denoised image. The combination strategy in DRFENet deepens the network so that finer-grained image information can be captured. We evaluated DRFENet on grayscale image denoising using the BSD68 and Set12 datasets and on color image denoising using the McMaster, Kodak24, and CBSD68 datasets. The experimental results showed that the denoising accuracy of DRFENet is better than that of most existing image-denoising methods under the PSNR and SSIM evaluation metrics.
(This article belongs to the Special Issue Scale Space and Variational Methods in Computer Vision)
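The sketch below shows a generic residual block built from stacked dilated 3x3 convolutions with a skip connection, in the spirit of the RDB described above. The channel width, number of layers, and dilation rates are assumptions and do not reproduce the DRFENet architecture.

```python
# Generic dilated residual block; illustrative, not the DRFENet RDB.
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    def __init__(self, channels=64, dilations=(1, 2, 4)):
        super().__init__()
        layers = []
        for d in dilations:
            # padding = dilation keeps the spatial size for 3x3 kernels
            layers += [nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                       nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)   # skip connection around the dilated stack

# Usage: y = DilatedResidualBlock()(torch.randn(1, 64, 40, 40))
```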

22 pages, 19394 KiB  
Article
OTSU Multi-Threshold Image Segmentation Based on Improved Particle Swarm Algorithm
by Jianfeng Zheng, Yinchong Gao, Han Zhang, Yu Lei and Ji Zhang
Appl. Sci. 2022, 12(22), 11514; https://doi.org/10.3390/app122211514 - 13 Nov 2022
Cited by 17 | Viewed by 1882
Abstract
Traditional particle swarm optimization algorithms converge slowly and easily fall into local optima. To address this, this paper proposes Otsu multi-threshold image segmentation based on an improved particle swarm optimization (IPSO) algorithm. After the particle swarm completes its iterative velocity and position updates, a particle contribution degree is calculated to obtain an approximate position and direction, which narrows the particle search range. At the same time, an asynchronously monotonically increasing social learning factor and an asynchronously monotonically decreasing individual learning factor are used to balance global and local search. Finally, chaos optimization is introduced to increase population diversity. Twelve benchmark functions were selected to test the performance of the algorithm against traditional meta-heuristic algorithms; the results show the robustness and superiority of the proposed algorithm. Standard dataset images were used for multi-threshold image segmentation experiments, and several traditional meta-heuristic algorithms were selected for comparison in terms of computational efficiency, peak signal-to-noise ratio (PSNR), structural similarity (SSIM), feature similarity (FSIM), and fitness value. The results show that the proposed method generally runs about 30% faster than the other algorithms and is also more accurate. The experiments show that the proposed algorithm achieves higher segmentation accuracy and efficiency.
(This article belongs to the Special Issue Scale Space and Variational Methods in Computer Vision)
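For readers unfamiliar with the underlying objective, the sketch below evaluates the Otsu between-class variance for a set of thresholds and maximizes it with a plain particle swarm optimizer. It omits the paper's improvements (contribution degree, asynchronous learning factors, chaos optimization), and the swarm size, inertia, and learning factors are generic defaults, not the paper's settings.

```python
# Plain PSO driving the multi-threshold Otsu objective; illustrative baseline only.
import numpy as np

def between_class_variance(hist, thresholds):
    """Otsu objective for a 256-bin histogram (e.g. np.bincount(img.ravel(), minlength=256))."""
    p = hist / hist.sum()
    levels = np.arange(len(hist))
    edges = [0] + sorted(int(t) for t in thresholds) + [len(hist)]
    total_mean = (p * levels).sum()
    score = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()
        if w > 0:
            mu = (p[lo:hi] * levels[lo:hi]).sum() / w
            score += w * (mu - total_mean) ** 2
    return score

def pso_otsu(hist, n_thresholds=3, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    pos = rng.uniform(1, 254, size=(n_particles, n_thresholds))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([between_class_variance(hist, x) for x in pos])
    gbest = pbest[pbest_val.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 1, 254)
        vals = np.array([between_class_variance(hist, x) for x in pos])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmax()].copy()
    return sorted(int(t) for t in gbest)
```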

11 pages, 2875 KiB  
Article
Tracking the Rhythm: Pansori Rhythm Segmentation and Classification Methods and Datasets
by Yagya Raj Pandeya, Bhuwan Bhattarai and Joonwhoan Lee
Appl. Sci. 2022, 12(19), 9571; https://doi.org/10.3390/app12199571 - 23 Sep 2022
Cited by 3 | Viewed by 1349
Abstract
This paper presents two methods for understanding the rhythmic patterns of the voice in the Korean traditional music called Pansori. We used semantic segmentation and classification-based structural analysis to segment the seven rhythmic categories of Pansori. We propose two datasets, one for rhythm classification and one for segmentation. Two classification networks and two segmentation networks are trained and tested in an end-to-end manner. The standard HR network and the DeepLabV3+ network are used for rhythm segmentation, while a modified HR network and a novel GlocalMuseNet are used for rhythm classification; GlocalMuseNet outperforms the HR network for Pansori rhythm classification. A novel segmentation model (a modified HR network) is also proposed for Pansori rhythm segmentation, and the results show that the DeepLabV3+ network is superior to the HR network. The classifier networks perform time-varying rhythm classification, which behaves like segmentation when applied with overlapping window frames over a spectral representation of the audio. Semantic segmentation using DeepLabV3+ and the HR network shows better results than the classification-based structural analysis used in this work; however, its annotation process is relatively time-consuming and costly.
(This article belongs to the Special Issue Scale Space and Variational Methods in Computer Vision)
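The "classification behaves as segmentation" idea can be sketched as follows: slide an overlapping window across a spectrogram, classify each window, and assign the predicted rhythm label to the frames it covers. The window and hop sizes and the placeholder classifier below are assumptions, not the networks or settings used in the paper.

```python
# Sliding-window classification as a stand-in for segmentation; illustrative only.
import numpy as np

def segment_by_classification(spectrogram, classify, win=128, hop=32):
    """spectrogram: (freq_bins, time_frames); classify: window -> integer label."""
    n_frames = spectrogram.shape[1]
    labels = np.full(n_frames, -1, dtype=int)   # -1 marks frames never covered
    for start in range(0, max(n_frames - win, 0) + 1, hop):
        window = spectrogram[:, start:start + win]
        label = classify(window)                # e.g., one of the 7 rhythm classes
        center = start + win // 2
        labels[center: min(center + hop, n_frames)] = label
    return labels

# Usage with a dummy classifier:
# labels = segment_by_classification(np.random.rand(80, 1000), lambda w: int(w.mean() > 0.5))
```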
