Machine Learning in Image Analysis and Pattern Recognition

A special issue of Data (ISSN 2306-5729). This special issue belongs to the section "Information Systems and Data Management".

Deadline for manuscript submissions: closed (30 June 2021) | Viewed by 35647

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editors


Guest Editor
Department of Computational Sciences, Maharaja Ranjit Singh Punjab Technical University, Punjab 151001, India
Interests: feature extraction; image classification; artificial intelligence

Guest Editor
Thapar Institute of Engineering and Technology, Patiala 147001, India

Guest Editor
School of Engineering and Computer Science, Oakland University, Rochester, MI 48309, USA

Special Issue Information

Dear Colleagues,

Images are ubiquitous. Tens of millions of images are captured every day for a variety of applications in almost all domains of human endeavor. We have come to rely on machine learning, including deep learning, to analyze captured images and extract actionable information from them by recognizing the presence of patterns of interest. In recent years, these methods, particularly those using deep learning, have exhibited performance levels surpassing human performance in certain applications.

The purpose of this Special Issue is to chart the progress in applying machine learning, including deep learning, to a broad range of image analysis and pattern recognition problems and applications. To this end, we invite original research articles making unique contributions to the theory, methodology, and applications of machine learning in image analysis and pattern recognition. We also welcome comprehensive survey articles dealing with any particular aspect of machine learning vis-à-vis image analysis and pattern recognition.

Dr. Munish Kumar
Dr. R. K. Sharma
Prof. Dr. Ishwar Sethi

Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Data is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Computer vision
  • Document image analysis
  • Face, gesture, and body pose recognition
  • Image classification and clustering
  • Image segmentation and shape analysis
  • Image matching and retrieval
  • Image synthesis
  • Medical, biological, and cell microscopy
  • Multimedia processing
  • Object detection, recognition, and tracking
  • Representation learning
  • Semantic segmentation
  • Texture analysis

Published Papers (7 papers)


Research

Jump to: Review, Other

12 pages, 7448 KiB  
Article
Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling
by Ashok Sarabu and Ajit Kumar Santra
Data 2020, 5(4), 104; https://0-doi-org.brum.beds.ac.uk/10.3390/data5040104 - 11 Nov 2020
Cited by 8 | Viewed by 3477
Abstract
The two-stream convolutional neural network (CNN) has proven highly successful for action recognition in videos. The main idea is to train two CNNs to learn spatial and temporal features separately, and to combine the two scores to obtain the final scores. In the literature, we observed that most methods use similar CNNs for the two streams. In this paper, we design a two-stream CNN architecture with different CNNs for the two streams to learn spatial and temporal features. Temporal Segment Networks (TSN) are applied to retrieve long-range temporal features and to differentiate similar types of sub-actions in videos. Data augmentation techniques are employed to prevent over-fitting. Advanced cross-modal pre-training is discussed and introduced into the proposed architecture to enhance the accuracy of action recognition. The proposed two-stream model is evaluated on two challenging action recognition datasets: HMDB-51 and UCF-101. The findings show a significant performance increase, and the proposed architecture outperforms the existing methods.
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)
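The score-combination step described in the abstract can be sketched as a simple late fusion of the two streams' per-class scores. This is an illustrative sketch, not the authors' implementation; the stream weights and logits below are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over class scores.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_two_stream(spatial_scores, temporal_scores, w_spatial=1.0, w_temporal=1.5):
    # Late fusion: weighted average of the two streams' class probabilities.
    fused = w_spatial * softmax(spatial_scores) + w_temporal * softmax(temporal_scores)
    return fused / (w_spatial + w_temporal)

spatial = np.array([2.0, 0.5, 0.1])   # e.g. RGB-frame stream logits (hypothetical)
temporal = np.array([1.8, 0.2, 0.3])  # e.g. optical-flow stream logits (hypothetical)
probs = fuse_two_stream(spatial, temporal)
predicted_class = int(np.argmax(probs))
```

In practice the per-stream weights are tuned on a validation set; the temporal stream is often weighted higher than the spatial one.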

26 pages, 5969 KiB  
Article
An Optimum Tea Fermentation Detection Model Based on Deep Convolutional Neural Networks
by Gibson Kimutai, Alexander Ngenzi, Rutabayiro Ngoga Said, Ambrose Kiprop and Anna Förster
Data 2020, 5(2), 44; https://0-doi-org.brum.beds.ac.uk/10.3390/data5020044 - 30 Apr 2020
Cited by 14 | Viewed by 5762
Abstract
Tea is one of the most popular beverages in the world, and its processing involves a number of steps, including fermentation. Tea fermentation is the most important step in determining the quality of tea. Currently, optimum fermentation of tea is detected by tasters using either of the following methods: monitoring the change in color of the tea as fermentation progresses, or tasting and smelling the tea as fermentation progresses. These manual methods are not accurate. Consequently, they lead to a compromise in the quality of tea. This study proposes a deep learning model dubbed TeaNet based on convolutional neural networks (CNN). The input data to TeaNet are images from the tea fermentation and Labelme datasets. We compared the performance of TeaNet with other standard machine learning techniques: Random Forest (RF), K-Nearest Neighbor (KNN), Decision Tree (DT), Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), and Naive Bayes (NB). TeaNet was superior to the other machine learning techniques in the classification tasks. However, we will confirm the stability of TeaNet in the classification tasks in our future studies when we deploy it in a tea factory in Kenya. The research also released a tea fermentation dataset that is available for use by the community.
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)
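As a rough illustration of the color-monitoring idea the abstract describes (tea turning from green to coppery brown), the sketch below classifies an image patch by its mean RGB color against stage centroids. The actual TeaNet model is a CNN; the stage names and reference colors here are invented for illustration.

```python
import numpy as np

# Hypothetical mean-RGB reference colors for fermentation stages.
STAGE_CENTROIDS = {
    "underfermented": np.array([60.0, 120.0, 50.0]),   # greenish
    "fermented":      np.array([140.0, 80.0, 40.0]),   # coppery brown
    "overfermented":  np.array([90.0, 60.0, 35.0]),    # dark brown
}

def classify_by_mean_color(image):
    # image: H x W x 3 uint8 array; pick the stage whose centroid is
    # nearest (Euclidean distance) to the image's mean color.
    mean_rgb = image.reshape(-1, 3).mean(axis=0)
    return min(STAGE_CENTROIDS, key=lambda s: np.linalg.norm(mean_rgb - STAGE_CENTROIDS[s]))

# A synthetic coppery-brown patch lands in the "fermented" class.
patch = np.full((8, 8, 3), (138, 82, 41), dtype=np.uint8)
stage = classify_by_mean_color(patch)
```

A nearest-centroid rule on raw color is far weaker than a trained CNN under varying lighting, which is one reason the paper moves to deep learning.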

Review

Jump to: Research, Other

19 pages, 1955 KiB  
Review
An Interdisciplinary Review of Camera Image Collection and Analysis Techniques, with Considerations for Environmental Conservation Social Science
by Coleman L. Little, Elizabeth E. Perry, Jessica P. Fefer, Matthew T. J. Brownlee and Ryan L. Sharp
Data 2020, 5(2), 51; https://0-doi-org.brum.beds.ac.uk/10.3390/data5020051 - 6 Jun 2020
Cited by 8 | Viewed by 3119
Abstract
Camera-based data collection and image analysis are integral methods in many research disciplines. However, few studies are specifically dedicated to trends in these methods or opportunities for interdisciplinary learning. In this systematic literature review, we analyze published sources (n = 391) to synthesize camera use patterns and image collection and analysis techniques across research disciplines. We frame this inquiry with interdisciplinary learning theory to identify cross-disciplinary approaches and guiding principles. Within this, we explicitly focus on trends within and applicability to environmental conservation social science (ECSS). We suggest six guiding principles for standardized, collaborative approaches to camera usage and image analysis in research. Our analysis suggests that ECSS may offer inspiration for novel combinations of data collection, standardization tactics, and detailed presentations of findings and limitations. ECSS can correspondingly incorporate more image analysis tactics from other disciplines, especially in regard to automated image coding of pertinent attributes.
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)

Other

Jump to: Research, Review

15 pages, 23389 KiB  
Data Descriptor
King Abdulaziz University Breast Cancer Mammogram Dataset (KAU-BCMD)
by Asmaa S. Alsolami, Wafaa Shalash, Wafaa Alsaggaf, Sawsan Ashoor, Haneen Refaat and Mohammed Elmogy
Data 2021, 6(11), 111; https://0-doi-org.brum.beds.ac.uk/10.3390/data6110111 - 25 Oct 2021
Cited by 10 | Viewed by 10810
Abstract
The current era is characterized by the rapidly increasing use of computer-aided diagnosis (CAD) systems in the medical field. These systems need a variety of datasets to help develop, evaluate, and compare their performances fairly. Physicians have indicated that breast anatomy, especially dense breasts, and the probability of breast cancer and tumor development vary highly depending on race. Researchers have reported that breast cancer risk factors are related to culture and society. Thus, there is a massive need for a local dataset representing breast cancer in our region to help develop and evaluate automatic breast cancer CAD systems. This paper presents a public mammogram dataset called the King Abdulaziz University Breast Cancer Mammogram Dataset (KAU-BCMD), version 1. To our knowledge, KAU-BCMD is the first dataset in Saudi Arabia that deals with a large number of mammogram scans. The dataset was collected from the Sheikh Mohammed Hussein Al-Amoudi Center of Excellence in Breast Cancer at King Abdulaziz University. It contains 1416 cases. Each case has two views for both the right and left breasts, resulting in 5662 images annotated based on the Breast Imaging Reporting and Data System (BI-RADS). It also contains 205 ultrasound cases corresponding to a part of the mammogram cases, with 405 images in total. The dataset was annotated and reviewed by three different radiologists. Our dataset is a promising dataset that contains different imaging modalities for breast cancer with different cancer grades for Saudi women.
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)

7 pages, 4938 KiB  
Data Descriptor
LeLePhid: An Image Dataset for Aphid Detection and Infestation Severity on Lemon Leaves
by Jorge Parraga-Alava, Roberth Alcivar-Cevallos, Jéssica Morales Carrillo, Magdalena Castro, Shabely Avellán, Aaron Loor and Fernando Mendoza
Data 2021, 6(5), 51; https://0-doi-org.brum.beds.ac.uk/10.3390/data6050051 - 17 May 2021
Cited by 12 | Viewed by 3685
Abstract
Aphids are small insects that feed on plant sap, and they belong to the superfamily Aphidoidea. They are among the major pests causing damage to citrus crops in most parts of the world. Precise and automatic identification of aphids is needed to understand citrus pest dynamics and management. This article presents a dataset that contains 665 healthy and unhealthy lemon leaf images. The latter are leaves with the presence of aphids, characterized by visible white spots. Moreover, each image includes a set of annotations that identify the leaf, its health state, and the infestation severity according to the percentage of the affected area on it. Images were collected manually in real-world conditions in a lemon plant field in Junín, Manabí, Ecuador, during the winter, using a smartphone camera. The dataset is called LeLePhid: lemon (Le) leaf (Le) image dataset for aphid (Phid) detection and infestation severity. The data can facilitate evaluating models for image segmentation, detection, and classification problems related to plant disease recognition.
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)
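The infestation-severity annotation described above, the percentage of the leaf area that is affected, can be illustrated with a small mask-based computation. This is a sketch of the general idea, not the dataset's actual annotation pipeline; the masks below are synthetic.

```python
import numpy as np

def infestation_severity(leaf_mask, aphid_mask):
    # Severity = affected leaf area / total leaf area, in percent.
    # Both masks are boolean arrays of the same shape.
    leaf_pixels = leaf_mask.sum()
    if leaf_pixels == 0:
        return 0.0
    affected = np.logical_and(leaf_mask, aphid_mask).sum()
    return 100.0 * affected / leaf_pixels

leaf = np.zeros((10, 10), dtype=bool)
leaf[2:8, 2:8] = True        # 36-pixel leaf region
spots = np.zeros((10, 10), dtype=bool)
spots[3:6, 3:6] = True       # 9 affected pixels inside the leaf
severity = infestation_severity(leaf, spots)  # → 25.0
```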

8 pages, 2359 KiB  
Data Descriptor
A Data Descriptor for Black Tea Fermentation Dataset
by Gibson Kimutai, Alexander Ngenzi, Rutabayiro Ngoga Said, Rose C. Ramkat and Anna Förster
Data 2021, 6(3), 34; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030034 - 19 Mar 2021
Cited by 2 | Viewed by 3839
Abstract
Tea is currently the most popular beverage after water. Tea contributes to the livelihood of more than 10 million people globally. There are several categories of tea, but black tea is the most popular, accounting for about 78% of total tea consumption. Processing of black tea involves the following steps: plucking, withering, crushing, tearing and curling, fermentation, drying, sorting, and packaging. Fermentation is the most important step in determining the final quality of the processed tea. Fermentation is a time-bound process and it must take place under certain temperature and humidity conditions. During fermentation, tea color changes from green to coppery brown to signify the attainment of optimum fermentation levels. These parameters are currently manually monitored. At present, there is only one existing dataset on tea fermentation images. This study makes a tea fermentation dataset available, composed of tea fermentation conditions and tea fermentation images.
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)

25 pages, 15956 KiB  
Data Descriptor
A Probabilistic Bag-to-Class Approach to Multiple-Instance Learning
by Kajsa Møllersen, Jon Yngve Hardeberg and Fred Godtliebsen
Data 2020, 5(2), 56; https://0-doi-org.brum.beds.ac.uk/10.3390/data5020056 - 26 Jun 2020
Cited by 2 | Viewed by 3258
Abstract
Multi-instance (MI) learning is a branch of machine learning, where each object (bag) consists of multiple feature vectors (instances)—for example, an image consisting of multiple patches and their corresponding feature vectors. In MI classification, each bag in the training set has a class label, but the instances are unlabeled. The instances are most commonly regarded as a set of points in a multi-dimensional space. Alternatively, instances are viewed as realizations of random vectors with corresponding probability distribution, where the bag is the distribution, not the realizations. By introducing the probability distribution space to bag-level classification problems, dissimilarities between probability distributions (divergences) can be applied. The bag-to-bag Kullback–Leibler information is asymptotically the best classifier, but the typical sparseness of MI training sets is an obstacle. We introduce bag-to-class divergence to MI learning, emphasizing the hierarchical nature of the random vectors that makes bags from the same class different. We propose two properties for bag-to-class divergences, and an additional property for sparse training sets, and propose a dissimilarity measure that fulfils them. Its performance is demonstrated on synthetic and real data. The probability distribution space is valid for MI learning, both for the theoretical analysis and applications.
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)
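The bag-to-class setup in the abstract, fitting a distribution to a bag's instances and comparing it to a pooled class-level distribution via a divergence, can be sketched with univariate Gaussians and the Kullback–Leibler divergence. The paper proposes a different, sparsity-aware dissimilarity measure; this minimal KL-based sketch with synthetic data only illustrates the general idea.

```python
import numpy as np

def gaussian_kl(mu1, var1, mu2, var2):
    # KL(N(mu1, var1) || N(mu2, var2)) for univariate Gaussians, in nats.
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def bag_to_class_kl(bag_instances, class_instances):
    # Fit a Gaussian to the bag's instances and to the pooled class
    # instances, then measure bag-to-class dissimilarity as KL divergence.
    mu_b, var_b = bag_instances.mean(), bag_instances.var()
    mu_c, var_c = class_instances.mean(), class_instances.var()
    return gaussian_kl(mu_b, var_b, mu_c, var_c)

rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, 500)   # pooled instances of class A
class_b = rng.normal(3.0, 1.0, 500)   # pooled instances of class B
bag = rng.normal(0.1, 1.0, 40)        # an unlabeled bag, drawn near class A
closer_to_a = bag_to_class_kl(bag, class_a) < bag_to_class_kl(bag, class_b)
```

Classifying the bag by the smaller bag-to-class divergence sidesteps the bag-to-bag comparisons that sparse MI training sets make unreliable.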
