Article

Deep Learning-Based Plant Classification Using Nonaligned Thermal and Visible Light Images

Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro, 1-gil, Jung-gu, Seoul 04620, Korea
*
Author to whom correspondence should be addressed.
Submission received: 29 September 2022 / Revised: 25 October 2022 / Accepted: 27 October 2022 / Published: 1 November 2022
(This article belongs to the Special Issue Computer Vision and Pattern Recognition with Applications)

Abstract

There have been various studies conducted on plant images. Machine learning algorithms are usually used in visible light image-based studies, whereas, in thermal image-based studies, the acquired thermal images tend to be analyzed by visual inspection with the naked eye. However, visible light cameras are sensitive to light and cannot be used in environments with low illumination. Although thermal cameras are not susceptible to these drawbacks, they are sensitive to atmospheric temperature and humidity. Moreover, in previous thermal camera-based studies, time-consuming manual analyses were performed. Therefore, in this study, we simultaneously used thermal images and corresponding visible light images of plants to solve these problems. The proposed network extracted features from each thermal image and the corresponding visible light image of plants through residual block-based branch networks and combined the features to increase the accuracy of the multiclass classification. Additionally, a new database was built in this study by acquiring thermal images and corresponding visible light images of various plants.

1. Introduction

Many studies have investigated plant image classification methods and databases [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]. However, all these studies utilized visible light cameras, which are sensitive to light and have the drawback of obtaining low-quality images owing to shadows, illumination changes, and ambient light and its reflections. In addition, visible light cameras cannot be used in the absence of light.
To avoid such problems, a thermal camera was used in previous research [22]. However, there are still very few studies on plant image classification using thermal cameras, and the acquired thermal images of plants have mostly been analyzed by visual inspection with the naked eye [23,24]. In addition, thermal cameras are sensitive to atmospheric temperature and humidity; therefore, they have the disadvantage of capturing low-quality images owing to rainwater and ambient heat from surrounding objects. To solve these problems, Raza et al. [25] used thermal and visible light cameras simultaneously. However, they conducted experiments based on manual feature extraction, with which it is difficult to extract the most suitable features from a database. Furthermore, only binary classification, which divides plant images into two classes (diseased and healthy), was performed in that study. Therefore, in this study, we conducted novel multiclass classification research based on automatic feature extraction using a deep learning method. In the proposed method, we developed a plant classification residual (PlantCR) network by using the concepts of VGG-Net [26] and ResNet [27]. Furthermore, we used this structure to extract features from thermal and visible light images of plants and to combine them. The plants were then classified into 28 classes. Our method is described in detail in Section 3. Moreover, we collected and constructed a novel nonaligned thermal and visible light plant image database (TherVisDb) [28], which we used to perform various experiments in the ablation study. The novelty of our study can be summarized in the following four ways:
-
Previous studies dealt only with the binary classification of plants based on thermal and corresponding visible light images. To the best of our knowledge, this is the first study on the multiclass classification of plants using thermal and corresponding visible light images. In addition, previous studies used manual feature extraction, whereas automatic feature extraction was used in this study to obtain more suitable features.
-
We proposed a novel PlantCR network in this study. To design the structure of PlantCR, we used the same grouped submodules for visible and thermal images, and the fusion of two feature sets extracted from these two images. By using the same grouped submodule, we could simplify the system architecture. Each grouped submodule included a residual block, and the accuracy of the multiclass classification increased through the fusion of two feature sets.
-
We built a new TherVisDb database containing 9440 images of various flowers and flower leaves for the experiment. This self-constructed database contained both thermal images and the corresponding visible light images of plants. However, the two cameras differed significantly in their field of view (FOV) and angle of view (AOV); thus, the images were not aligned with each other, which made the plant image classification challenging. In addition, this database had a larger number of classes and images than existing thermal plant image databases.
-
The model and database constructed in this study have been made publicly available on GitHub [28] so that other researchers can use them.
Existing related works are explained in Section 2, in which previous plant image-based studies were divided into visible light image-based, thermal image-based, and thermal and visible light image-based methods. Section 2.1, Section 2.2 and Section 2.3 provide the detailed descriptions of these methods. A detailed explanation of the proposed method is provided in Section 3, which is followed by the experimental results and analysis in Section 4. Lastly, the discussion and conclusion are presented in Section 5 and Section 6, respectively.

2. Related Works

2.1. Visible Light Image-Based Methods

In the crop disease classification [1] method, experiments were previously performed using the PlantDoc database and the proposed AAR network. In the fruit recognition method [2], classification experiments were conducted using AlexNet and a CNN, and the performance of the methods was compared on the Fruit-360 database. In the fruit variety classification [3] method, a CNN and the Fruit-360 database were used; in addition, regions of interest were created from apple images by using the YOLO network [29]. In the crop disease classification [4] method, experiments were conducted using the PlantDoc database and the proposed DenseNet-121 model. In the fruit image classification [5] method, classification experiments were conducted using Inception V3 [30] and VGG16 [26], and the performance of the methods was compared on the Fruit-360 database. In the grape variety recognition [6] method, the ExtResnet model was proposed, but the experiment was performed using the wine grape instance segmentation database [31]. In the fruit classification [7] method, a DCNN and the Fruit-360 database were used. In the fruit recognition method [8], various experiments were conducted using the Fruits-360 image database; various feature extraction and machine-learning-based methods were used to perform the experiments, after which the results were compared. The fruit classification [9] method used a CNN and the Fruit-360 database. Another fruit classification method [10] proposed a multiclass CNN model, and the experiment was performed on the FIDS30 [32] and Fruits-360 databases. In the fruit and vegetable classification [11] method, FruitVegCNN was proposed for a mobile MPSoC, and the Fruit-360 database was used for the experiment. The fruit classification [12] method proposed multiple deep learning models, and the experiment was performed on a supermarket produce database and a self-collected database. Siddiqi [13] proposed FruitNet as a fruit image classification method and compared the performance of fourteen deep learning methods on the Fruit-360 database. In the fruit recognition method [14], experiments were conducted using EfficientNet-B0 and Fruit-360. In the crop disease classification [15] method, experiments were performed using the PlantDoc database and the proposed OMNCNN. In the fruit classification [16] method, classification experiments were conducted using a histogram of oriented gradients. In the fruit image classification [17] method, the Fruit-360 database and ShuffleNet were used. Another fruit image-based classification method [18] proposed a deep learning method and used a model trained on ImageNet. In the crop and crop disease classification [19] method, experiments were performed using the PlantDoc database and five deep learning methods (MobileNetV1, MobileNetV2, NASNetMobile, DenseNet121, and Xception). Another crop and crop disease classification [20] method proposed a trilinear convolutional neural network model, and various experiments were conducted using the PlantVillage [33] and PlantDoc [34] databases and a model pretrained with ImageNet. In the apple classification [21] method, a CNN and the Fruit-360 database were used.
However, all these studies used visible light cameras. As explained in Section 1, visible light cameras are sensitive to light. Hence, they have the drawback of capturing low-quality images owing to shadows, illumination changes, and ambient light and its reflections. Moreover, visible light cameras cannot be used in the absence of light. To solve these problems, studies were conducted using thermal plant image-based methods, which are explained in Section 2.2.

2.2. Thermal Image-Based Method

Zhu et al. [23] studied plant image diagnosis and conducted experiments using a self-collected database. They proposed a method for diagnosing diseased plants. Lydia et al. [24] studied plant image identification and conducted experiments using a self-collected database. They proposed a data-gathering approach for the identification of diseased plants. In the plant image-based detection [22] method, experiments were conducted using thermal images for the detection of diseased areas in plant images based on the if–then rule. However, previous thermal image-based studies typically analyzed thermal images by visual inspection with the naked eye, which is a time-consuming task. Moreover, very few studies on plant thermal images have been conducted, and methods based on computer devices and algorithms have not been sufficiently developed. In addition, thermal cameras are sensitive to atmospheric temperature and humidity; therefore, they have the drawback of capturing low-quality images owing to rainwater and heat from surrounding objects. However, the studies described in Section 2.3 considered the drawbacks of both thermal and visible light cameras and used them simultaneously.

2.3. Thermal and Visible Light Image-Based Method

A previous plant image classification [25] study considered the problems of thermal and visible light cameras and proposed a method based on computer devices and algorithms. Their method used thermal images and corresponding visible light images to perform a binary classification. Furthermore, in that study, thermal images and the corresponding multiple visible light images were obtained through three camera sensors: a thermal camera and two visible light cameras (left and right). By combining these three types of images, the accuracy of the classification for dividing images into healthy and diseased plant images could be improved. Moreover, the binary classification (healthy or diseased) was performed by using analysis of variance [35] and a support vector machine [36] on features extracted through manual feature extraction methods.
However, this method increased the computation time and complexity of the system because of the simultaneous use of the three cameras. In addition, this method used manual feature extraction methods, which make it difficult to extract the most suitable features. Moreover, since this method only performed a binary classification, it could not recognize various plant images.
Therefore, to solve these problems, a multiclass classification was performed in this study using plant images acquired with two camera sensors: a thermal camera and a visible light camera. In addition, a deep learning-based automatic feature extraction method was used to extract more suitable features from the images.
In addition, Table 1 summarizes the existing methods explained in Section 2 for ease of understanding. In Table 1, the previous methods and the proposed method are compared based on the images, feature extraction approaches, and methods, along with the advantages and disadvantages of each. As shown in Table 1, the proposed method is the first study on plant multiclass classification using thermal and visible light images.

3. Proposed Method

3.1. Overall Explanation of the Proposed Method

This section thoroughly describes PlantCR. A simplified flowchart of PlantCR constructed in this study is presented in Figure 1. The proposed PlantCR network was constructed using the concepts of VGG-Net and ResNet. As presented in Figure 1, the proposed method used thermal images and corresponding visible light images of plants as its inputs and combined the extracted features to classify the images into 28 classes. The tables and figures in Section 3.2 describe the details of the structure, including the input images, output sizes, and parameters used in the structure.

3.2. Detailed Structure of the PlantCR Network

As presented in Figure 1, a visible light plant image (200 × 200 × 3 pixels) and a thermal plant image (200 × 200 × 1 pixels) were input to the PlantCR network proposed in this study. The details of the PlantCR structure are listed in Table 2, Table 3 and Table 4 and presented in Figure 2. The main structure of the PlantCR network is presented in Table 2; it consisted of input layers (input layers one and two), group layers (group_1–3), a concatenation layer (concat), a fully connected layer (FC), and a global average pooling layer (GAP), and the output (class#) of the FC layer was 28. The "Times" columns in Table 2 and Table 3 represent the number of repetitions of each layer. The parameter column reports the number of parameters of that layer only. Table 3 summarizes a group layer, consisting of an input layer, convolution layers (conv2d), a max pooling layer (max_pool), and residual blocks (res_block). Table 4 summarizes the residual block, consisting of an input layer, convolution layers, a parametric rectified linear unit (prelu), and an addition layer (add). In Table 2, Table 3 and Table 4, the filter size and stride were (3 × 3) and (1 × 1), respectively; the padding of the conv2d layers of the residual block was (1 × 1), and that of the remaining conv2d layers was (0 × 0). "#" indicates "number of" throughout.
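To make the layer organization easier to follow, the following is a minimal Keras sketch of the PlantCR structure as we read it from Table 2, Table 3 and Table 4 and Figure 2. The repetition counts, filter sizes (3 × 3), strides (1 × 1), and padding settings follow the tables; the assignment of the two filter counts (64/128) to the two repetitions of each group, the mapping of the two images to input layers one and two, and the PReLU axis sharing are assumptions, so the parameter totals may differ slightly from Table 2.

import tensorflow as tf
from tensorflow.keras import layers, Model

def residual_block(x, filters):
    # Table 4: conv2d_1 -> prelu -> conv2d_2, added to the block input.
    # Residual conv2d layers use 3 x 3 filters with (1 x 1) padding ("same").
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.PReLU(shared_axes=[1, 2])(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.Add()([x, y])

def group_layer(x, filters):
    # Table 3: two conv2d layers, one max pooling layer, four residual blocks.
    # Non-residual conv2d layers use (0 x 0) padding ("valid").
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="valid")(x)
    x = layers.MaxPooling2D()(x)
    for _ in range(4):
        x = residual_block(x, filters)
    return x

def build_plantcr(num_classes=28):
    vis_in = layers.Input((200, 200, 3))    # visible light plant image
    ther_in = layers.Input((200, 200, 1))   # thermal plant image
    x1, x2 = vis_in, ther_in
    for filters in (64, 128):               # group_1 and group_2, each repeated twice (Table 2)
        x1 = group_layer(x1, filters)
        x2 = group_layer(x2, filters)
    x = layers.Concatenate()([x1, x2])      # feature-level fusion (concat)
    for _ in range(2):                      # group_3, repeated twice with 128 filters
        x = group_layer(x, 128)
    x = layers.GlobalAveragePooling2D()(x)  # GAP
    return Model([vis_in, ther_in], layers.Dense(num_classes, activation="softmax")(x))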

3.3. Details of Database and Experimental Setup

Experiments were conducted using TherVisDb [28], consisting of various rose and rose leaf images. A FLIR Tau® 2 thermal camera and a Logitech C270 HD camera [37] were used to capture the thermal images and corresponding visible light images. During the capturing of the images, the atmospheric humidity, temperature, wind speed, fine dust, ultrafine dust, and UV index were 91%, 30 °C, 3 m/s, 24 μg/m3, 22 μg/m3, and 8, respectively; here, one UV index unit is equal to 25 mW/m2. The database was constructed in July 2022. The images captured using the thermal camera were 640 × 512 × 1 pixels with a depth of 14 bits, and those acquired using the visible light camera were 640 × 512 × 3 pixels with a depth of 24 bits. In this study, the images in the database, prepared using a cropping operation, were 300 × 300 × 1 pixels with a depth of 8 bits for the thermal images and 300 × 300 × 3 pixels with a depth of 24 bits for the visible light images. In detail, a single pixel per channel represents up to 256 intensity levels, which can be encoded with 8 bits; because the visible light images are RGB (three-channel) color images, their depth is 24 bits (8 bits × 3 channels). Moreover, all the images had '.png' as the image file extension. In addition, all images of 300 × 300 pixels were downsized to 200 × 200 pixels when input to the proposed model. The total numbers of thermal and visible light plant images were 4720 and 4720, respectively. In addition, the training, test, and validation sets had 3314, 954, and 452 images, respectively. The training set was augmented from 3314 images to 26,512 images through the use of augmentation methods (rotating three times by 90 degrees and flipping horizontally). The total number of classes in this database was 28, and Figure 3 shows example images. Table 5 lists the number of images and the plant name for each class in the TherVisDb database. As presented in Figure 3, the thermal images and corresponding visible light images were not aligned, owing to the different FOV and AOV of the cameras.
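As a concrete illustration of the augmentation described above, the following is a minimal OpenCV sketch (not the original preprocessing code): each training image is rotated three times by 90 degrees, every resulting image, including the original, is also flipped horizontally, giving eight variants per input (3314 × 8 = 26,512), and the images are downsized to the 200 × 200 model input size. Whether resizing happens before or after augmentation is an assumption here.

import cv2

def augment(image):
    # Produce the eight variants of one image: rotations of 0, 90, 180, and
    # 270 degrees, each kept as-is and also flipped horizontally.
    variants = []
    rotated = image
    for _ in range(4):
        variants.append(rotated)
        variants.append(cv2.flip(rotated, 1))              # horizontal flip
        rotated = cv2.rotate(rotated, cv2.ROTATE_90_CLOCKWISE)
    # Downsize to the 200 x 200 input size of the proposed model.
    return [cv2.resize(v, (200, 200)) for v in variants]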
The image classification algorithm was processed using a computer with an NVIDIA GeForce GTX TITAN X GPU [38], an Intel Core i7-6700 CPU @ 3.40 GHz, and 32 GB of RAM. The model and source code were implemented in Python (version 3.5.4) [40] using OpenCV (version 4.3.0) [39] and the Keras API (version 2.1.6-tf) [41].

4. Experimental Results

4.1. Training Setup and Hyperparameters

The training setup of the proposed PlantCR was as follows: adaptive moment estimation (Adam) [42] was used as the optimizer, and categorical cross-entropy [43] was used as the loss function. The training loss curves and validation accuracy curves of PlantCR are presented in Figure 4. Figure 4a,b present the loss curves and accuracy curves of PlantCR according to the epoch number, respectively. As shown in Figure 4, the network constructed in this study was sufficiently trained without being overfitted to the training data. In detail, in machine learning, overfitting can be checked by validating the trained model on a validation dataset at each epoch of the training phase; overfitting is unlikely if the validation accuracy increases and stabilizes while the training loss decreases and stabilizes. As shown in Figure 4, the validation accuracy increased and stabilized (Figure 4b) while the training loss decreased and stabilized (Figure 4a); therefore, we could confirm that the model was sufficiently trained without being overfitted. In addition, Table 6 lists the hyperparameters and presents the values selected from the search spaces for the network.
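For reference, the following is a minimal Keras sketch of this training setup, assuming the values selected in Table 6 (Adam, learning rate 0.00001, batch size 8, 260 epochs); the build_plantcr function refers to the architecture sketch in Section 3.2, and the placeholder arrays merely stand in for the TherVisDb training and validation images.

import numpy as np
import tensorflow as tf

model = build_plantcr(num_classes=28)       # architecture sketch from Section 3.2
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),   # Table 6
    loss="categorical_crossentropy",                          # [43]
    metrics=["accuracy"],
)
# Placeholder tensors standing in for the augmented training images and labels.
x_vis = np.zeros((8, 200, 200, 3), dtype=np.float32)
x_ther = np.zeros((8, 200, 200, 1), dtype=np.float32)
y = tf.keras.utils.to_categorical(np.zeros(8), num_classes=28)
model.fit([x_vis, x_ther], y, validation_split=0.25, batch_size=8, epochs=260)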

4.2. Ablation Study

This section explains and compares the testing accuracies. The performance of the plant classification methods was calculated using the metrics given by Equations (1)–(4): the positive predictive value (PPV, also called precision), the true positive rate (TPR, also called sensitivity or recall), the F1-score (the harmonic mean of precision and recall) [44], and the accuracy (ACC) [45], respectively. In the equations, the numbers of true positives (#TP), false positives (#FP), false negatives (#FN), and true negatives (#TN) are used to calculate these metrics.
PPV = (#TP)/(#TP + #FP)  (1)
TPR = (#TP)/(#TP + #FN)  (2)
F1-score = 2(PPV·TPR)/(PPV + TPR)  (3)
ACC = (#TP + #TN)/(#TP + #TN + #FP + #FN)  (4)
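As a concrete illustration (not the original evaluation code), the following sketch computes the per-class (one-vs-rest) counts and the metrics of Equations (1)–(4) from integer-encoded labels and predictions, and macro-averages them over the 28 classes as in the "Average" row of Table 8.

import numpy as np

def per_class_metrics(y_true, y_pred, num_classes=28):
    rows = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        tn = np.sum((y_pred != c) & (y_true != c))
        ppv = tp / (tp + fp) if tp + fp else 0.0                 # Equation (1)
        tpr = tp / (tp + fn) if tp + fn else 0.0                 # Equation (2)
        f1 = 2 * ppv * tpr / (ppv + tpr) if ppv + tpr else 0.0   # Equation (3)
        acc = (tp + tn) / (tp + tn + fp + fn)                    # Equation (4)
        rows.append((ppv, tpr, f1, acc))
    return np.mean(rows, axis=0)   # macro average over all classes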
As shown in Table 7 and Figure 5, the experiment was conducted using seven methods. Methods (Figure 5) using only the thermal images, only the visible light images, and a combination of the two types of images in various ways were compared.
As shown in Figure 5a–g, variants of the classification network were presented. Method 1 (Figure 5a) used a single thermal image as the input, whereas Method 2 (Figure 5b) used a single visible light image. Method 3 (Figure 5c) combined the thermal and visible light images by using a concatenate layer (Layer# 4 in Table 2) and obtained a single image with four channels, which was used as the input image. Methods 4–7 (Figure 5d–g) extracted features from each image using the branch networks and combined the features to classify the plant images. As shown in Figure 5d–g, features were extracted by using different numbers of groups (as in Layer# 2 and 3 in Table 2 and Table 3) and combined at different points. The accuracies obtained using these methods were compared in Table 7. As shown in Table 7, Method 5 exhibited the highest accuracy. Therefore, we proposed Method 5 in this study.
Figure 6, Figure 7 and Figure 8 and Table 8 show the accuracies obtained for each class using Methods 1, 2, and 5 (proposed PlantCR). In Figure 6, Figure 7 and Figure 8, the darker colors represent the higher values.
In this study, the training and validation accuracies in Figure 4b were different metrics from the ACC in Table 7. For the training and validation in the training phase, we used "accuracy" as the metric, which calculates how often the predictions equal the labels. This metric uses two quantities: the total number of training or validation images, N, and the count of correct predictions, TP; in short, it simply divides TP by N (accuracy = TP/N). In other words, the training and validation accuracies are the ratio of the number of correct predictions to the total number of images in the training or validation set. In the case of the ACC in Table 7, however, we used Equation (4), in which the sum of TP and TN is divided by the total count (#TP + #TN + #FP + #FN). The ACC results were mostly higher than the other metrics, such as the TPR and PPV, because the number of TNs was much greater than the numbers of TPs, FPs, and FNs, and the number of TNs increases further as the number of classes increases. In the case of Method 5 in Table 7, the total number of TNs was 25,662, whereas the total numbers of FPs, FNs, and TPs were 96, 96, and 858, respectively. Because we had 28 classes, the number of TNs was much greater than the others (TP, FP, and FN).
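As a quick arithmetic check (an illustration, not part of the original experiments), substituting the Method 5 totals quoted above into Equation (4) reproduces the ACC reported in Table 7.

tp, fp, fn, tn = 858, 96, 96, 25662
acc = (tp + tn) / (tp + tn + fp + fn)   # Equation (4)
print(round(100 * acc, 2))              # 99.28, as reported for Method 5 in Table 7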

4.3. Comparisons with Existing Methods

This section describes the comparative experiments conducted using the latest existing methods. Table 9, Table 10 and Table 11 compare the previously proposed methods [6,7,21] with the proposed PlantCR method, listing the experimental results obtained using the thermal images, the visible light images, and both the thermal and visible light images, respectively. As can be observed, the proposed method obtained the highest accuracies in all cases.

4.4. Processing Time

The processing times of Methods 1 and 2 and PlantCR in the testing phase are compared in Table 12; they were measured based on the computer specifications described in the last paragraph of Section 3.3. As presented in Table 12, the frame rate of the method using the thermal images was 19.28 frames per second (fps) (1000/51.85), that of the method using the visible light images was 19.24 fps (1000/51.97), and that of the method using both thermal and visible light images (PlantCR) was 18.09 fps (1000/55.25).
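For reference, the following is a minimal sketch of how such per-image processing times (and the fps figures above) can be measured; the dummy arrays and the model object from the earlier sketches are placeholders, not the original benchmarking code.

import time
import numpy as np

x_vis = np.zeros((954, 200, 200, 3), dtype=np.float32)    # 954 test images (Section 3.3)
x_ther = np.zeros((954, 200, 200, 1), dtype=np.float32)
start = time.perf_counter()
model.predict([x_vis, x_ther], batch_size=1)               # one image at a time
elapsed_ms = (time.perf_counter() - start) * 1000 / len(x_vis)
print(f"{elapsed_ms:.2f} ms per image, {1000 / elapsed_ms:.2f} fps")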

5. Discussion

In this study, we investigated various classifications using thermal images and corresponding visible light images of plants. By combining the thermal image and corresponding visible light image, the proposed method (PlantCR) performed better than the methods using either thermal or visible light images on their own, as summarized in Table 7 and Table 8. In addition, the proposed method was verified to perform better than the methods proposed in previous studies, as per the data listed in Table 9, Table 10 and Table 11.
In addition, there have been very few existing open databases of plant thermal images. Moreover, open databases of plant thermal and visible light images have never been available to the public. Therefore, in this study, we constructed a thermal and visible light plant image open database. This database had more images and a greater number of classes than existing thermal plant image open databases. The constructed database was called TherVisDb, and is publicly available for other researchers to use [28].
Figure 9 shows the error cases of PlantCR. Classification errors occurred owing to the plant labels visible in the images, as presented in Figure 9.
As shown in Table 5, each class had a different number of images. Moreover, some classes had low-quality images, as shown in Figure 10. Therefore, the differing numbers of images per class and the low-quality images account for the differences in model performance across the classes. In detail, as shown in Figure 10, the image of duftrausch became dark due to rainwater on the plant and on the ground, the image of rose gaujard became blurry due to high humidity, and the image of rosenau became bright due to the hot temperature of the environment.
In addition, in the field of agriculture, machine-learning-based solutions are very helpful for human work. For example, such techniques increase the speed of human work in analysis, disease detection, and crop classification on automated conveyor systems.
Similarly, machine-learning-based plant classification systems are very helpful for classifying plants (medicinal plants, medicinal herbs, and foods) in gardens and factories. For example, there are many medicinal plant factories where processing is performed by humans by hand [46], whereas some factories use automated systems [47] based on machine learning algorithms such as the proposed method. Moreover, there are many wild plants that can be used as a food source [48]. The proposed method could also be used as a smartphone application, so that a person can check whether a wild plant in front of them can be used as a food source or is a medicinal plant [49].
In our research, the thermal images were not aligned with the corresponding visible light images. For alignment, an additional calibration process for the thermal and visible light cameras would be required, and even with calibration, the thermal and visible light images could not be fully aligned owing to the varying distances between the cameras and the objects. Therefore, we proposed a plant classification method that can operate without an alignment step.

6. Conclusions

In this study, we developed a plant classification method based on thermal and visible light images and conducted various experiments using the TherVisDb database, which contains various rose and rose leaf images. The experimental results obtained using the TherVisDb database showed that the proposed method achieved higher accuracy (F1-score = 90.05%, ACC = 99.28%) than the latest methods.
Through this study, we confirmed that the classification accuracy could be increased by simultaneously using thermal images and corresponding visible light images of plants. Moreover, although the thermal images and corresponding visible light plant images used in this study exhibited large differences in the FOV and AOV, an increase in accuracy was verified when the two types of images were used simultaneously, compared to using only one image type. In addition, as shown in Figure 5 and Table 7, we confirmed that extracting features using different numbers of groups (Table 2 and Table 3) resulted in different accuracies, and that using two groups for each image showed a higher accuracy than using more or fewer groups.
There have been many machine-learning-based studies conducted in the field of agriculture. However, as explained in the related works, most studies used crop disease datasets and fruit datasets for various tasks, namely, detection, classification, and segmentation. Furthermore, there have been very few plant image dataset-based studies. Therefore, we referenced previous agriculture and machine-learning-based studies in Section 2.
To reduce the occurrence of the classification errors shown in Figure 9 in future work, we plan to conduct a classification study to increase the accuracy of PlantCR by considering various deep learning methods. Additionally, we plan to perform plant segmentation using the self-constructed thermal and visible light image database and to increase the corresponding classification performance. Moreover, we plan to count plants by using drones and machine learning algorithms [50] in our future works.

Author Contributions

Methodology, G.B.; validation, S.H.N.; supervision, K.R.P.; writing—original draft, G.B.; writing—review and editing, K.R.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (MSIT) through the Basic Science Research Program (NRF-2022R1F1A1064291), in part by the NRF funded by the MSIT through the Basic Science Research Program (NRF-2021R1F1A1045587), and in part by the NRF funded by the MSIT through the Basic Science Research Program (NRF-2020R1A2C1006179).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Abawatew, G.Y.; Belay, S.; Gedamu, K.; Assefa, M.; Ayalew, M.; Oluwasanmi, A.; Qin, Z. Attention augmented residual network for tomato disease detection and classification. Turk. J. Electr. Eng. Comput. Sci. 2021, 29, 2869–2885. [Google Scholar]
  2. Hamid, N.N.A.A.; Razali, R.A.; Ibrahim, Z. Comparing bags of features, conventional convolutional neural network and AlexNet for fruit recognition. Indones. J. Electr. Eng. Comput. Sci. 2019, 14, 333–339. [Google Scholar] [CrossRef]
  3. Katarzyna, R.; Paweł, M.A. Vision-based method utilizing deep convolutional neural networks for fruit variety classification in uncertainty conditions of retail sales. Appl. Sci. 2019, 9, 3971. [Google Scholar] [CrossRef] [Green Version]
  4. Chakraborty, A.; Kumer, D.; Deeba, K. Plant leaf disease recognition using fastai image classification. In Proceedings of the 5th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 8–10 April 2021; pp. 1624–1630. [Google Scholar] [CrossRef]
  5. Siddiqi, R. Effectiveness of transfer learning and fine tuning in automated fruit image classification. In Proceedings of the 3rd International Conference on Deep Learning Technologies, Xiamen, China, 5–7 July 2019; pp. 91–100. [Google Scholar] [CrossRef]
  6. Franczyk, B.; Hernes, M.; Kozierkiewicz, A.; Kozina, A.; Pietranik, M.; Roemer, I.; Schieck, M. Deep learning for grape variety recognition. Procedia Comput. Sci. 2020, 176, 1211–1220. [Google Scholar] [CrossRef]
  7. Hussain, I.; Tan, S.; Hussain, W.; Ali, A. CNN Transfer learning for automatic fruit recognition for future class of fruit. IJC. 2020, 39, 88–96. [Google Scholar]
  8. Kader, A.; Sharif, S.; Bhowmick, P.; Mim, F.H.; Srizon, A.Y. Effective workflow for high-performance recognition of fruits using machine learning approaches. Int. Res. J. Eng. Technol. 2020, 7, 1–5. [Google Scholar]
  9. Dandekar, M.; Punn, N.S.; Sonbhadra, S.K.; Agarwal, S.; Kiran, R.U. Fruit classification using deep feature maps in the presence of deceptive similar classes. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–6. [Google Scholar]
  10. Biswas, B.; Ghosh, S.K.; Ghosh, A. A robust multi-label fruit classification based on deep convolution neural network. In Computational Intelligence in Pattern Recognition. Advances in Intelligent Systems and Computing; Das, A., Nayak, J., Naik, B., Pati, S., Pelusi, D., Eds.; Springer: Singapore, 2020; Volume 999. [Google Scholar] [CrossRef]
  11. Dey, S.; Saha, S.; Singh, A.; McDonald-Maier, K. FruitVegCNN: Power- and memory-efficient classification of fruits & vegetables using cnn in mobile MPSoC. In Proceedings of the IEEE 17th India Council International Conference (INDICON), New Delhi, India, 11–13 December 2020; pp. 1–7. [Google Scholar] [CrossRef]
  12. Hossain, M.S.; Al-Hammadi, M.; Muhammad, G. Automatic fruit classification using deep learning for industrial applications. IEEE Trans. Ind. Inform. 2019, 15, 1027–1034. [Google Scholar] [CrossRef]
  13. Siddiqi, R. Comparative performance of various deep learning-based models in fruit image classification. In Proceedings of the 11th International Conference on Advances in Information Technology, Bangkok, Thailand, 1–3 July 2020; Volume 14, pp. 1–9. [Google Scholar] [CrossRef]
  14. Srivastava, S.; Singh, T.; Sharma, S.; Verma, A. A fruit recognition system based on modern deep learning technique. Int. J. Eng. Res. Technol. 2020, 9, 896–898. [Google Scholar] [CrossRef]
  15. Ashwinkumar, S.; Rajagopal, S.; Manimaran, V.; Jegajothi, B. Automated plant leaf disease detection and classification using optimal MobileNet based convolutional neural networks. Mater. Today Proc. 2022, 51, 480–487. [Google Scholar] [CrossRef]
  16. Muhathir, M.; Santoso, M.H.; Muliono, R. Analysis naïve bayes in classifying fruit by utilizing HOG feature extraction. J. Inform. Telecommun. Eng. 2020, 4, 1–10. [Google Scholar] [CrossRef]
  17. Ghosh, S.; Mondal, M.J.; Sen, S.; Chatterjee, S.; Kar Roy, N.; Patnaik, S. A novel approach to detect and classify fruits using ShuffleNet V2. In Proceedings of the IEEE Applied Signal Processing Conference, Kolkata, India, 7–9 October 2020; pp. 163–167. [Google Scholar] [CrossRef]
  18. Shahi, T.B.; Sitaula, C.; Neupane, A.; Guo, W. Fruit classification using attention-based MobileNetV2 for industrial applications. PLoS ONE 2022, 17, 1–21. [Google Scholar] [CrossRef] [PubMed]
  19. Chompookham, T.; Surinta, O. Ensemble methods with deep convolutional neural networks for plant leaf recognition. ICIC Express Lett. 2021, 15, 553–565. [Google Scholar]
  20. Wang, D.; Wang, J.; Li, W.; Guan, P. T-CNN: Trilinear convolutional neural networks model for visual detection of plant diseases. Comput. Electron. Agric. 2021, 190, 106468. [Google Scholar] [CrossRef]
  21. Rhamadiyanti, D.T.; Suyanto, S. Robustness of convolutional neural network in classifying apple images. In Proceedings of the International Seminar on Intelligent Technology and Its Applications (ISITIA), Virtual Conference, 21–22 July 2021; pp. 226–231. [Google Scholar] [CrossRef]
  22. Anasta, N.; Setyawan, F.X.A.; Fitriawan, H. Disease detection in banana trees using an image processing-based thermal camera. IOP Conf. Ser. Earth Environ. Sci. 2021, 739, 012088. [Google Scholar] [CrossRef]
  23. Zhu, W.; Chen, H.; Ciechanowska, I.; Spaner, D. Application of infrared thermal imaging for the rapid diagnosis of crop disease. IFAC-PapersOnLine 2018, 51, 424–430. [Google Scholar] [CrossRef]
  24. Lydia, M.S.; Aulia, I.; Jaya, I.; Hanafiah, D.S.; Lubis, R.H. Preliminary study for identifying rice plant disease based on thermal images. Journal of Physics: Conference Series, Volume 1566. In Proceedings of the 4th International Conference on Computing and Applied Informatics 2019 (ICCAI 2019), Medan, Indonesia, 26–27 November 2019. [Google Scholar]
  25. Raza, S.-E.-A.; Prince, G.; Clarkson, J.P.; Rajpoot, N.M. Automatic Detection of Diseased Tomato Plants Using Thermal and Stereo Visible Light Images. PLoS ONE 2015, 10, e0123262. [Google Scholar] [CrossRef]
  26. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2015, arXiv:1409.1556v6. [Google Scholar]
  27. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  28. PlantCR & TherVisDb. Available online: https://github.com/ganav/PlantCR-TherVisDb/tree/main (accessed on 28 September 2022).
  29. Redmon, J.; Farhadi, A. Yolo V3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  30. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. arXiv 2015, arXiv:1512.00567. [Google Scholar]
  31. Santos, T.; Leonardo, D.S.; Andreza, D.S.; Sandra, A. Embrapa wine grape instance segmentation dataset—Embrapa WGISD (1.0.0) [Data set]. Zenodo 2019. Available online: https://zenodo.org/record/3361736#.Ywgs0nZByUk (accessed on 16 September 2022). [CrossRef]
  32. FIDS30 Database. Available online: https://www.kaggle.com/datasets/arnavmehta710a/fids30 (accessed on 16 September 2022).
  33. PlantVillage Dataset. Available online: https://www.kaggle.com/datasets/emmarex/plantdisease (accessed on 16 September 2022).
  34. Singh, D.; Jain, N.; Jain, P.; Kayal, P.; Kumawat, S.; Batra, N. PlantDoc: A dataset for visual plant disease detection. In Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, Hyderabad, India, 5–7 January 2020; pp. 249–253. [Google Scholar] [CrossRef] [Green Version]
  35. Analysis of Variance. Available online: https://en.wikipedia.org/wiki/Analysis_of_variance (accessed on 16 September 2022).
  36. Tong, S.; Koller, D. Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2002, 2, 45–66. [Google Scholar] [CrossRef]
  37. Logitech C270 HD Web-Camera. Available online: https://www.logitech.com/en-us/products/webcams/c270-hd-webcam.960-000694.html (accessed on 6 September 2022).
  38. NVIDIA GeForce GTX TITAN X. Available online: https://www.nvidia.com/en-us/geforce/products/10series/titan-x-pascal/ (accessed on 16 September 2022).
  39. OpenCV, Intel, California, U.S. Available online: http://opencv.org/ (accessed on 16 September 2022).
  40. Python, Python Software Foundation, Delaware, U.S. Available online: https://www.python.org/ (accessed on 16 September 2022).
  41. Keras, Chollet, F., California, U.S. Available online: https://keras.io/ (accessed on 16 September 2022).
  42. Kingma, D.P.; Ba, J.B. ADAM: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 1–15. [Google Scholar]
  43. Categorical Cross-Entropy Loss. Available online: https://peltarion.com/knowledge-center/documentation/modeling-view/build-an-ai-model/loss-functions/categorical-crossentropy (accessed on 16 September 2022).
  44. Derczynski, L. Complementarity, F-score, and NLP evaluation. In Proceedings of the Tenth International Conference on Language Resources and Evaluation 2016, Portorož, Slovenia, 23–28 May 2016; pp. 261–266. Available online: https://aclanthology.org/L16-1040 (accessed on 28 September 2022).
  45. Powers, D.M.W. Evaluation: From precision, recall and f-measure to roc, informedness, markedness & correlation. Mach. Learn. Technol. 2011, 2, 37–63. [Google Scholar]
  46. BOLIVIA Plant for Processing Medicinal and Aromatic Herbs, Chizchipani, Caranavi. Available online: https://www.alamy.com/stock-photo-bolivia-plant-for-processing-medicinal-and-aromatic-herbs-chizchipani-31914323.html (accessed on 25 October 2022).
  47. Processing Lines for Medicinal Herbs and Plants. Available online: https://www.godioliebellanti.com/medical-herbs-and-plants/?lang=en (accessed on 25 October 2022).
  48. Wild Plants for a Free Meal. Available online: https://morningchores.com/edible-wild-plants/ (accessed on 25 October 2022).
  49. Medicinal Plants. Available online: https://www.plantscience4u.com/2018/08/10-medicinal-plants-and-their-uses-with.html (accessed on 25 October 2022).
  50. Castellano, G.; Castiello, C.; Cianciotta, M.; Mencar, C.; Vessio, G. Multi-View Convolutional Network for Crowd Counting in Drone-Captured Images. In Computer Vision—ECCV 2020 Workshops. ECCV 2020; Lecture Notes in Computer Science; Bartoli, A., Fusiello, A., Eds.; Springer: Cham, Switzerland, 2020; Volume 12538. [Google Scholar] [CrossRef]
Figure 1. Overview of the PlantCR network for plant image classification.
Figure 2. The detailed structure of PlantCR.
Figure 3. Example images of TherVisDb. From left to right: images of Alexandra, Belvedere, Elvis, and Fellowship. (a) Visible light images and (b) thermal images.
Figure 4. Loss curves and accuracy curves of PlantCR. (a) Training loss curves and validation loss curves; (b) training accuracy curves and validation accuracy curves.
Figure 5. Variants of the classification network conducted in the ablation study. (a–g) Methods 1–7. The meaning of each color bar is explained in Figure 2.
Figure 6. Confusion matrix obtained using Method 1 (Ther).
Figure 7. Confusion matrix obtained using Method 2 (Vis).
Figure 8. Confusion matrix obtained using proposed PlantCR (Th&V).
Figure 9. Example of error cases on TherVisDb. From left to right, the images of alexandra, charm of paris, cleopatra, and grand classes. (a) Thermal images; (b) corresponding visible light images.
Figure 10. Example of error cases on TherVisDb. Images of (a) duftrausch, (b) rose gaujard, and (c) rosenau.
Table 1. Summary and comparison of our and previous methods.

Visible light image-based | Based on automatic feature extraction | Binary and multiclass classification [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
- Advantages: provides high-resolution and high-quality images in day or high illumination environments; provides color information; extracts features automatically based on a big database
- Disadvantages: provides dark images in night or low illumination environments; provides low-quality images in day or high illumination environments owing to shadows, illumination changes, and ambient light and its reflections

Thermal image-based | Based on manual feature extraction | Diagnosis [23], identification [24]
- Advantages: provides thermal information
- Disadvantages: sensitive to atmospheric temperature and humidity, and captures low-quality images owing to rainwater and heat from surrounding objects; very difficult to extract suitable features

Thermal image-based | Based on automatic feature extraction | Detection method [22]
- Advantages: provides thermal information; automatically extracts features based on a big database
- Disadvantages: sensitive to atmospheric temperature and humidity, and captures low-quality images owing to rainwater and heat from surrounding objects

Thermal and visible light image-based | Based on manual feature extraction | Binary classification [25]
- Advantages: provides high-resolution and high-quality images in day or high illumination environments; provides color and thermal information
- Disadvantages: very difficult to extract suitable features; does not consider multiclass problems; computationally expensive owing to three camera sensors

Thermal and visible light image-based | Based on automatic feature extraction | Multiclass classification (proposed method)
- Advantages: provides high-resolution and high-quality images in day or high illumination environments; provides color and thermal information; extracts features automatically based on a big database; considers multiclass problems
- Disadvantages: computationally expensive owing to two camera sensors
Table 2. Structure of the PlantCR network.

Layer# | Times | Layer Type | Filter# | Parameter# | Layer Connection
1 | 1 | input layer_1 and 2 | 0 | 0 | input
2 | 2 | group_1 | 64/128 | 1,735,872 | input layer_1
3 | 2 | group_2 | 64/128 | 1,735,872 | input layer_2
4 | 1 | concat | 0 | 0 | group_1 and group_2
5 | 2 | group_3 | 128 | 3,100,160 | concat
6 | 1 | GAP | 28 | 0 | group_3
7 | 1 | FC (softmax) | class# | 3612 | GAP
Total number of parameters: 6,575,516
Table 3. Structure of a group layer.

Times | Layer Type | Layer Connection
1 | input layer | input
2 | conv2d | input layer
1 | max_pool | conv2d
4 | res_block | max_pool
Table 4. Structure of a residual block.

Layer Type | Layer Connection
input layer | input
conv2d_1 | input layer
prelu | conv2d_1
conv2d_2 | prelu
add | conv2d_2 and input layer
Table 5. Description of TherVisDb.

Class Index | Class Names | Thermal Image # | Visible Light Image # | Train Set | Test Set | Validation Set
1 | Alexandra | 120 | 120 | 168 | 48 | 24
2 | Belvedere | 48 | 48 | 68 | 20 | 8
3 | Blue river | 136 | 136 | 192 | 56 | 24
4 | Charm of Paris | 136 | 136 | 192 | 56 | 24
5 | Cleopatra | 152 | 152 | 214 | 62 | 28
6 | Cocktail | 112 | 112 | 158 | 46 | 20
7 | Duftrausch | 176 | 176 | 248 | 72 | 32
8 | Echinacea sunset | 64 | 64 | 90 | 26 | 12
9 | Eleanor | 144 | 144 | 202 | 58 | 28
10 | Elvis | 224 | 224 | 314 | 90 | 44
11 | Fellowship | 208 | 208 | 292 | 84 | 40
12 | Goldeise | 144 | 144 | 202 | 58 | 28
13 | Goldfassade | 184 | 184 | 258 | 74 | 36
14 | Grand classe | 264 | 264 | 370 | 106 | 52
15 | Just Joey | 72 | 72 | 102 | 30 | 12
16 | Kerria japonica | 104 | 104 | 146 | 42 | 20
17 | Margaret | 112 | 112 | 158 | 46 | 20
18 | Oklahoma | 312 | 312 | 438 | 126 | 60
19 | Pink perfume | 120 | 120 | 168 | 48 | 24
20 | Queen Elizabeth | 120 | 120 | 168 | 48 | 24
21 | Rose gaujard | 312 | 312 | 438 | 126 | 60
22 | Rosenau | 304 | 304 | 426 | 122 | 60
23 | Roseraie du chatelet | 352 | 352 | 494 | 142 | 68
24 | Spiraea salicifolia L. | 64 | 64 | 90 | 26 | 12
25 | Stella de oro | 48 | 48 | 68 | 20 | 8
26 | Twist | 288 | 288 | 404 | 116 | 56
27 | Ulrich brunner fils | 120 | 120 | 168 | 48 | 24
28 | White symphonie | 280 | 280 | 392 | 112 | 56
Total | | 4720 | 4720 | 6628 | 1908 | 904
Table 6. Selected values from search spaces of hyperparameters.

Parameters | Epochs | Learning Rate Decay (for SGD) | Batch Size | Learning Rate | Momentum (for SGD) | Optimizer
Search Space | 1~300 | 0.000001, 0.00001, 0.0001 | 1, 4, 8, 16 | 0.0001, 0.001, 0.01, 0.1 | 0.9, 0.8, 0.7 | "SGD", "Adam"
Selected Value | 260 | 0.00001 | 8 | 0.00001 | 0.9 | "Adam"
Table 7. Accuracies obtained by using the proposed PlantCR network and variants of PlantCR.

Methods | TPR | PPV | F1-Score | ACC
Method 1 (thermal image) | 80.4 | 81.69 | 80.28 | 98.58
Method 2 (visible light image) | 81.65 | 78.79 | 79.32 | 98.52
Method 3 (combined) | 84.5 | 83.47 | 83.17 | 98.69
Method 4 (combined) | 86.97 | 86.11 | 86.32 | 99.04
Method 5 (combined) (proposed PlantCR) | 90.26 | 90.42 | 90.05 | 99.28
Method 6 (combined) | 87.58 | 86.63 | 86.68 | 99.08
Method 7 (combined) | 86.67 | 83.99 | 84.7 | 98.87
Table 8. Detailed accuracy of each class using Methods 1, 2, and PlantCR (using thermal image (Ther), visible light image (Vis), and both images (Th&V), respectively).

# | Class Names | TPR (Ther/Vis/Th&V) | PPV (Ther/Vis/Th&V) | F1-Score (Ther/Vis/Th&V) | ACC (Ther/Vis/Th&V)
1 | Alexandra | 86.36/93.33/100 | 79.17/58.33/91.67 | 82.61/71.79/95.65 | 99.16/98.85/99.79
2 | Belvedere | 57.14/100/83.33 | 80/60/100 | 66.67/75/90.91 | 99.16/99.58/99.79
3 | Blue river | 100/85.19/83.87 | 64.29/82.14/92.86 | 78.26/83.64/88.14 | 98.95/99.06/99.27
4 | Charm of Paris | 69.57/80/85.19 | 57.14/57.14/82.14 | 62.75/66.67/83.64 | 98.01/98.32/99.06
5 | Cleopatra | 72.41/70.83/94.74 | 67.74/54.84/58.06 | 70/61.82/72 | 98.11/97.8/98.53
6 | Cocktail | 85/100/91.3 | 73.91/56.52/91.30 | 79.07/72.22/91.3 | 99.06/98.95/99.58
7 | Duftrausch | 82.05/66.67/97.14 | 88.89/83.33/94.44 | 85.33/74.07/95.77 | 98.85/97.8/99.69
8 | Echinacea sunset | 100/76.47/92.31 | 92.31/100/92.31 | 96/86.67/92.31 | 99.9/99.58/99.79
9 | Eleanor | 100/84.85/100 | 89.66/96.55/100 | 94.55/90.32/100 | 99.69/99.37/100
10 | Elvis | 87.23/76.6/85.71 | 91.11/80/93.33 | 89.13/78.26/89.36 | 98.95/97.9/98.95
11 | Fellowship | 74.47/71.74/86.36 | 83.33/78.57/90.48 | 78.65/75/88.37 | 98.01/97.69/98.95
12 | Goldeise | 64.71/69.44/81.82 | 75.86/86.21/93.10 | 69.84/76.92/87.1 | 98.01/98.43/99.16
13 | Goldfassade | 87.18/84.21/87.18 | 91.89/86.49/91.89 | 89.47/85.33/89.47 | 99.16/98.85/99.16
14 | Grand classe | 73.58/61.67/86 | 73.58/69.81/81.13 | 73.58/65.49/83.5 | 97.06/95.91/98.22
15 | Just Joey | 70.59/78.57/76.47 | 80/73.33/86.67 | 75/75.86/81.25 | 99.16/99.27/99.37
16 | Kerria japonica | 74.07/95/100 | 95.24/90.48/100 | 83.33/92.68/100 | 99.16/99.69/100
17 | Margaret | 86.36/82.61/86.36 | 82.61/82.61/82.61 | 84.44/82.61/84.44 | 99.27/99.16/99.27
18 | Oklahoma | 85.45/81.97/86.57 | 74.60/79.37/92.06 | 79.66/80.65/89.23 | 97.48/97.48/98.53
19 | Pink perfume | 57.89/78.26/90.91 | 91.67/75/83.33 | 70.97/76.6/86.96 | 98.11/98.85/99.37
20 | Queen Elizabeth | 67.65/86.36/95.83 | 95.83/79.17/95.83 | 79.31/82.61/95.83 | 98.74/99.16/99.79
21 | Rose gaujard | 83.93/76.92/85.29 | 74.60/95.24/92.06 | 78.99/85.11/88.55 | 97.38/97.8/98.43
22 | Rosenau | 86.76/84.62/91.04 | 96.72/90.16/100 | 91.47/87.3/95.31 | 98.85/98.32/99.37
23 | Roseraie du chatelet | 91.84/72.13/95.16 | 63.38/61.97/83.1 | 75.00/66.67/88.72 | 96.86/95.39/98.43
24 | Spiraea salicifolia L. | 85.71/80/100 | 92.31/92.31/100 | 88.89/85.71/100 | 99.69/99.58/100
25 | Stella de oro | 90/90/90.91 | 90/90/100 | 90/90/95.24 | 99.79/99.79/99.9
26 | Twist | 88/84.21/94.64 | 75.86/82.76/91.38 | 81.48/83.48/92.98 | 97.9/98.01/99.16
27 | Ulrich brunner fils | 68.97/85.00/86.36 | 83.33/70.83/79.17 | 75.47/77.27/82.61 | 98.64/98.95/99.16
28 | White symphonie | 74.19/89.66/92.86 | 82.14/92.86/92.86 | 77.97/91.23/92.86 | 97.27/98.95/99.16
Average | | 80.40/81.65/90.26 | 81.69/78.79/90.42 | 80.28/79.32/90.05 | 98.58/98.52/99.28
Table 9. Comparison between previous methods and Method 1 (Ther).

Methods | TPR | PPV | F1 | ACC
Rhamadiyanti et al., 2021 [21] | 79.65 | 80.22 | 79.93 | 96.73
Hussain et al., 2020 [7] | 80.16 | 81.57 | 80.26 | 97.08
Franczyk et al., 2020 [6] | 79.12 | 80.44 | 79.77 | 97.16
Method 1 | 80.40 | 81.69 | 80.28 | 98.58
Table 10. Comparison between previous methods and Method 2 (Vis).

Methods | TPR | PPV | F1 | ACC
Rhamadiyanti et al., 2021 [21] | 81.50 | 78.42 | 79.23 | 98.09
Hussain et al., 2020 [7] | 81.46 | 78.37 | 79.28 | 97.97
Franczyk et al., 2020 [6] | 80.80 | 78.30 | 79.24 | 97.66
Method 2 | 81.65 | 78.79 | 79.32 | 98.52
Table 11. Comparison between previous methods and proposed PlantCR (Th&V).

Methods | TPR | PPV | F1 | ACC
Rhamadiyanti et al., 2021 [21] | 88.67 | 90.30 | 89.47 | 98.26
Hussain et al., 2020 [7] | 89.52 | 89.67 | 89.59 | 98.42
Franczyk et al., 2020 [6] | 88.33 | 90.28 | 89.29 | 98.29
PlantCR | 90.26 | 90.42 | 90.05 | 99.28
Table 12. Processing time per image.

Methods | Processing Time
Method 1 (Ther) | 51.85 ms
Method 2 (Vis) | 51.97 ms
PlantCR (Th&V) | 55.25 ms
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

