Uzair Aslam Bhatti, Sibghat Ullah Bazai, Shumaila Hussain, Shariqa Fakhar, Chin Soon Ku, Shah Marjan, Por Lip Yee and Liu Jing
1 College of Information and Communication Engineering, Hainan University, Haikou, 570228, China
2 Department of Computer Engineering, Balochistan University of Information Technology, Engineering, and Management Sciences (BUITEMS), Quetta, Pakistan
3 Department of Computer Science, Sardar Bahadur Khan Women’s University, Quetta, Pakistan
4 Department of Computer Science, Universiti Tunku Abdul Rahman, Kampar, 31900, Malaysia
5 Department of Software Engineering, Balochistan University of Information Technology, Engineering, and Management Sciences (BUITEMS), Quetta, Pakistan
6 Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, 50603, Malaysia
ABSTRACT Crop diseases have a significant impact on plant growth and can lead to reduced yields. Traditional methods of disease detection rely on the expertise of plant protection experts, which can be subjective and dependent on individual experience and knowledge. To address this, digital image recognition technology and deep learning algorithms have emerged as a promising approach for automating plant disease identification. In this paper, we propose a novel approach that uses a convolutional neural network (CNN) model in conjunction with Inception v3 to identify plant leaf diseases. The research focuses on developing a mobile application that uses this mechanism to identify diseases in plants and provide recommendations for overcoming specific diseases. The models were trained using a dataset of 80,848 images representing 21 different plant leaves categorized into 60 distinct classes. After training and evaluation, the proposed system achieved an accuracy of 99%. The mobile application serves as a convenient advisory tool, providing early detection and guidance in real agricultural environments. The significance of this research lies in its potential to improve plant disease detection and management practices. By automating the identification process through deep learning algorithms, the proposed system reduces the subjectivity of expert-based diagnosis and the dependence on individual expertise. The integration of mobile technology further improves accessibility and enables farmers and agricultural practitioners to identify diseases in their crops swiftly and accurately.
KEYWORDS Plant disease; Inception v3; CNN; crop diseases
Agriculture is the backbone of Pakistan’s economy [1]. The sector has the potential to supply both the domestic market and exports. However, the contribution of agriculture to GDP has gradually declined to 19.3 percent over recent decades due to frequently occurring plant diseases and a lack of awareness about preventive and protective measures against them. This can have a detrimental influence on the economies of nations like Pakistan, where agriculture is the primary source of income. To prevent crop damage and improve harvest quality, it is essential to detect and identify infections at an early stage. In 2020, Pakistan’s agricultural sector contributed 22.69 percent to the GDP. The agricultural sector’s contribution to the GDP in 2020 decreased from 22.04 percent to 19.3 percent due to conventional farming practices and a lack of awareness regarding preventing and protecting plants from disease [2].
Many crops are cultivated in Pakistan during different seasons. According to the UN, almost 2,000 tons of cherries are produced annually in Pakistan. On a commercial basis, export-quality cherries are grown on about 897 hectares in Balochistan (mainly in Quetta, Ziarat, and Kalat), resulting in an annual production of 1,507 tonnes [3]. Similarly, potatoes (Solanum tuberosum L.) are one of the world’s most extensively grown and consumed tuberous crops, and around 1300 kha of potato is planted in Pakistan [4].
Approximately 75.4 million tons of apples were produced globally in 2013. Pakistan is one of the largest apple producers, with production concentrated primarily in Khyber Pakhtunkhwa, Punjab, and Balochistan. Balochistan has the largest apple crop, covering 45,875 hectares of land annually and producing 589,281 tons of apples [5]. Similarly, in 2013–2014, strawberry was grown on 236 ha in Pakistan [6]. Berry is an emerging exotic fruit crop in subtropical regions of Pakistan. It remained unnoticed until it began to be produced commercially in Khyber Pakhtunkhwa. Its unique, desirable traits and profit potential have attracted attention [7]. Capsicum is cultivated over 61,600 ha in Pakistan, yielding 110,500 tons per year [8].
World tomato production increased steadily throughout the 20th century. Pakistan is one of the thirty-five biggest producers of potatoes [9]. Grapes (Vitis vinifera), of the family Vitaceae, are among the most popular fruits in the world. In Pakistan, the province of Balochistan contributes 98 percent of the country’s grape production. Grapes of several varieties are produced in the province’s upland areas. The vast majority of the well-known commercial varieties are grown in the districts of Quetta, Pishin, Killa Abdullah, Mastung, Kalat, Loralai, and Zhob [10]. Pakistan’s central peach-growing region is Swat. It has a total area of 14,700 acres and produces 55,800 tons yearly [11].
A key issue in Pakistan is farmers’ limited knowledge about crop diseases. Farmers still rely on the traditional method of assessing crop condition by physically inspecting the produce, using their experience to monitor and analyze their harvests with the naked eye. This traditional approach has severe flaws. If the farmer is unfamiliar with the type of infection, the crop will either go undiagnosed or be treated with the wrong disease control method, which can reduce the yield or ruin the entire crop. Disease control is an important guarantee of the safety of plant production, and it can also effectively improve the yield and quality of crops. The premise of prevention and control is the ability to detect diseases in a timely and accurate manner and to identify their types and severity [12].
In plant disease identification, research objects are generally taken from parts of plants with apparent characteristics, such as stems, leaves, fruits, and branches. Plant leaves are easier to obtain than other parts and have prominent disease characteristics. From the perspective of botanical research, the shape, texture, and color of diseased leaves can be used as the basis for classification. After a pathogen infects a plant, the external characteristics and internal structure of the diseased leaves undergo subtle changes. Externally, these changes appear mainly as fading, rolling, rot, and discoloration. Internally, they are reflected in water and pigment content. However, the symptoms of different diseases are ambiguous, complex, and similar to one another. Limited scientific training among farmers in Pakistan makes it difficult to diagnose the stage and development of a plant disease accurately and in time [13]. Large doses of chemicals are typically sprayed only once the damage is severe enough to be seen with the naked eye. This delay causes a significant reduction in crop yields and causes pollution. Therefore, accessible, accurate, and prompt plant disease identification, together with an assessment of the degree of damage that provides practical information for disease control, has become an essential issue in crop production.
With the continuous development of computer technology and mobile applications, smartphones have become essential for people to connect, and taking pictures and videos has become a standard smartphone function. Deep learning is widely used to develop, enhance, and expand the use of mobile applications in different areas. With the development of network technology, people share data through the network, which not only enriches the available data but also makes images obtainable at low cost, providing a large amount of data for training convolutional neural networks. With the rapid development of storage technology and the continuous updating of the Internet, the computing power of mobile phone CPUs has also steadily increased, laying a foundation for building lightweight image recognition models on the mobile phone. In recent years, artificial intelligence features have begun to appear on mobile terminals. For example, with the development of intelligent voice and recognition technology, the mobile phone now resembles an intelligent robot. This has laid the equipment foundation for this kind of work on the mobile phone. At present, many image recognition applications are gradually being deployed on mobile phones.
Using technology, the crop disease detection procedure can be automated. Artificial intelligence techniques and computer vision systems are the most widely used approaches for automating disease detection in plants [14–21]. Machine learning has revolutionized computer vision, especially in image-based detection and classification [22]. The convolutional neural network (CNN) is a deep learning approach that is highly promising in agriculture for plant species identification, yield management, weed identification, water control, soil maintenance, harvest yield counting, disease identification, pest detection, and field management [23–32]. This research proposes a deep learning-based technique to automatically identify plant leaf disease. The proposed mechanism uses a CNN and Inception v3 to identify plant leaf disease and provide recommendations to overcome the identified condition. To make it convenient for farmers to use the automated system in a real-time agricultural environment, the research focused on developing a mobile application. The mobile application is capable of capturing an image of the plant leaf, identifying the disease, and providing recommendations to overcome the identified condition.
Deep learning techniques have proven successful in many areas [33–35]. Plant diseases in agriculture can have devastating consequences and cause economic loss. Researchers are focusing on improving automatic plant disease detection and have developed a range of techniques. Convolutional neural networks (CNNs) have shown significant results in image classification, object recognition, and semantic segmentation, and their feature learning and classification capabilities have attracted widespread attention. Using the PlantVillage dataset with 20,639 pictures, Slava et al. [36] tuned the hyperparameters of an existing ResNet50 for disease classification and achieved good accuracy. Brady et al. [37] proposed a hybrid technique based on a convolutional autoencoder (CAE) and convolutional neural networks for disease detection in peach leaves. The proposed model uses few parameters and provides 98.38% test accuracy on the PlantVillage dataset. Agarwal et al. [38] suggested a Conv2D model to determine disease severity in cucumber plants and achieved improved results. Similarly, Shen et al. [39] compared six models for identifying powdery mildew on strawberry leaves. They concluded that ResNet-50 has the highest classification accuracy (98.11%), AlexNet is the fastest, and SqueezeNet-MOD2 has the smallest memory footprint.
VGG16 was used by Jiang et al. [40] to detect diseases in rice and wheat plants. Halil Durmuş [41] developed a plant disease detection system using AlexNet, SqueezeNet, and CNN models. Their dataset contains 18,000 tomato images in 10 categories from the PlantVillage dataset. The overall accuracy of their neural network was 94.3%.
Another study [42] ran a set of tests on a dataset of 552 apple leaves affected by black rot disease. Photos of the disease at four stages were considered: 110 photographs of healthy leaves, 137 images of early-stage disease, 180 images of mid-stage disease, and 125 images of late-stage disease. The authors used the VGG-16 model to analyze the data. Transfer learning helped them improve the model, which reached 90.4 percent accuracy.
The ResNet-50 model was trained on 3750 tomato leaf images from the PlantVillage dataset by Bart et al. [43]. They correctly classified leaf diseases on tomato plants and achieved 99.7% accuracy. Another study [44] chose maize leaves for disease identification, using a CNN model on a collection of 400 maize leaf images, and obtained 92.85% accuracy. Using input images of size 200×200, the VGG-A model (Visual Geometry Group architecture), a CNN with 8 convolutional layers and 2 fully connected layers, has been used to distinguish healthy radishes from those affected by Fusarium wilt disease. Another study classified potato disease using a VGG model with eight trainable layers (five convolutional and three fully connected). The size of the training dataset affected the VGG model’s classification performance, and the model achieved 83% accuracy [45].
Amara et al. [46] used 3700 photos of banana leaves from the PlantVillage collection to conduct their studies. They highlighted the effects of lighting, size, background, pose, and orientation of images on the performance of their model. Yadav et al. [47] used a deep learning technique for automated segmentation and detected the selected diseases in peach leaves. They tested separately under controlled laboratory conditions and in actual cultivation and achieved 98.75 percent overall classification accuracy. Similarly, Sladojevic et al. [48] created a database by downloading 4483 photos from the Internet. These photos are divided into 15 categories: 13 classes for damaged plants, one class for healthy leaves, and one class for the background. The overall accuracy of the experimental outcome using AlexNet was 96.3 percent [49,50]. Table 1 summarizes the benefits and drawbacks of the machine learning techniques used so far for plant leaf disease detection.
Table 1:Benefits and drawbacks of machine learning/deep learning algorithms
The proposed plant leaf disease detection and recommendation system consists of dataset preparation, classification, disease identification, and suggestions to cope with the disease.
We implemented the model in Python using the TensorFlow and OpenCV libraries. Data preprocessing, prediction, and recommendation were run on Google Colab with 16 GB of RAM and eight Tesla P100 GPUs.
The dataset used in this experiment covers several plant leaf diseases drawn from the PlantVillage and PlantDoc datasets, such as scab, black rot, rust, grape leaf black rot, black pox, and leaf blight in apple leaves.
To train our disease detection and recommendation system, we used the PlantVillage and PlantDoc datasets, which include vegetables, fruits, and fruit vegetables. The dataset contains 80,848 images of leaves from 21 crops, which include apples, cherries, corn, grapes, peaches, bell peppers, potatoes, strawberries, tomatoes, oranges, and squash. The dataset contains 60 classes. Almost all leaf diseases that can harm these crops are included in the dataset.
Preprocessing the dataset before applying the deep learning technique can improve performance and accuracy. The data was therefore preprocessed so that it could be analyzed appropriately.
Data augmentation increases the amount of training data by deriving new samples from existing ones in order to improve accuracy. A neural network trained on too little data may fail to generalize, whereas with enough varied data it can fit the task well. The disease identification model in this research is built using image augmentation, which produces a large and diverse set of images from the pictures used for classification, object detection, and segmentation. This study applied the following transformations: random horizontal flipping, rescaling of pixel values by 1/255, and zoom.
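The three transformations named above can be sketched in a few lines. This is a minimal, library-free illustration on a grayscale image stored as a nested list, not the authors' TensorFlow pipeline; the function names, the 50% flip probability, and the `zoom_range` value are assumptions made for the example.

```python
import random


def rescale(image, factor=1.0 / 255.0):
    """Scale every pixel value, e.g. from [0, 255] to [0, 1]."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def zoom_in(image, zoom=0.5):
    """Crop the central region and resize it back with nearest-neighbour sampling.

    zoom is the fraction of each side that is cropped away (0 <= zoom < 1).
    """
    height, width = len(image), len(image[0])
    crop_h = max(1, round(height * (1.0 - zoom)))
    crop_w = max(1, round(width * (1.0 - zoom)))
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    return [
        [image[top + (r * crop_h) // height][left + (c * crop_w) // width]
         for c in range(width)]
        for r in range(height)
    ]


def augment(image, rng, zoom_range=0.2):
    """Apply rescaling, a random horizontal flip, and a random zoom (assumed settings)."""
    out = rescale(image)
    if rng.random() < 0.5:  # assumed flip probability
        out = horizontal_flip(out)
    return zoom_in(out, zoom=rng.uniform(0.0, zoom_range))
```

Calling `augment(image, random.Random(0))` repeatedly with different seeds yields differently flipped and zoomed variants of the same leaf image while keeping its original height and width.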
A neural network must be evaluated to confirm its accuracy. After applying data augmentation, we partitioned the selected dataset into training and testing sets. The model learns from the training set, while the testing set is used to verify its accuracy.
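A shuffled split of this kind can be sketched as follows. The 75% training fraction and the fixed seed are assumptions chosen for illustration; the function name is hypothetical.

```python
import random


def train_test_split(samples, train_fraction=0.75, seed=0):
    """Shuffle the samples and split them into training and testing subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Every sample ends up in exactly one of the two subsets, so the testing images are never seen during training.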
Fig. 1 depicts the flow diagram of the proposed plant disease detection and recommendation system.
The proposed system uses a CNN model with 5 convolutional layers and 5 max pooling layers. The input width (nw) and height (nh) of the first convolutional layer are 128 and 128, respectively. In the first step, the CNN is trained on the selected dataset. In the second step, image segmentation of the leaves is performed; Inception v3, along with the CNN, is used for segmentation and feature extraction. In the third step, the proposed model classifies, or identifies, the type of disease. In the fourth step, the system provides recommendations to overcome the disease. Fig. 2 describes the structural composition and gives a detailed overview of the proposed plant leaf disease detection and recommendation system.
Once the data is augmented, the CNN model is trained on the selected datasets. The CNN is a multilayer structural model in which each layer generates a response and extracts key features from the dataset.
A total of 60,448 images were used to train the model, while around 20,461 crop images were used to validate it. The CNN model, along with Inception v3, is used in the proposed model to detect plant disease and provide recommendations.
The CNN is a deep learning model that is very promising for classifying text and images. Fig. 3 shows the details of the layers used in the CNN model.
Figure 1:Flow diagram of proposed plant disease detection and recommendation system
Convolution is an operation on two real-valued functions. It is defined by the following mathematical expression:
$$ s(t) = (x * w)(t) = \int x(a)\, w(t - a)\, da $$
Here x is the input and w is the kernel; the output s of this function is known as the feature map. In the discrete case, the convolution can be expressed as a matrix multiplication. Fig. 4 illustrates the convolution operation of the CNN model used by the proposed system.
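For images, the discrete two-dimensional form of this operation slides a small kernel over the input and sums the element-wise products at each position. The sketch below is a library-free illustration under stated assumptions (no padding, stride 1, a single channel); the function name is hypothetical and it is not the authors' implementation.

```python
def conv2d_valid(image, kernel):
    """Discrete 2D convolution without padding and with stride 1.

    Returns the feature map as a nested list.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    # Flipping the kernel gives true convolution; deep learning libraries
    # usually skip the flip and compute cross-correlation instead.
    flipped = [row[::-1] for row in kernel[::-1]]
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            row.append(sum(
                image[i + m][j + n] * flipped[m][n]
                for m in range(k_h) for n in range(k_w)
            ))
        feature_map.append(row)
    return feature_map
```

With a k×k kernel and no padding, an n×n input produces an (n−k+1)×(n−k+1) feature map, which is why convolutional layers often pad the input to preserve its size.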
Figure 2:Diagrammatic representation of the proposed model
The pooling layer is a significant component of the CNN model. It progressively shrinks the spatial dimensions of the representation, which reduces the amount of computation and the number of parameters the network has to handle.
Max pooling takes the maximum value within each pooling window. With a 2×2 window and a stride of 2, it reduces the number of values in the feature map by a factor of 4.
Figure 3:Layers in convolutional neural network model
Figure 4:Convolution operation in CNN
Average pooling takes the arithmetic mean of each pooling window, which likewise decreases the amount of data by a factor of 4. Fig. 5 depicts the pooling function of the CNN model, including max pooling.
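Both pooling variants can be illustrated with one small function. This is a minimal sketch that assumes a 2×2 window and a stride of 2, which is the configuration that yields the factor-of-4 reduction; the function name and the `mode` parameter are hypothetical.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Apply max or average pooling over non-overlapping windows."""
    if mode == "max":
        aggregate = max
    else:
        aggregate = lambda window: sum(window) / len(window)
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + m][j + n]
                      for m in range(size) for n in range(size)]
            row.append(aggregate(window))
        pooled.append(row)
    return pooled
```

A 4×4 feature map (16 values) becomes a 2×2 map (4 values) under either mode; only the value kept for each window differs.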
The CNN model is used with 5 convolutional layers and 5 max pooling layers. The input width (nw) and height (nh) of the first convolutional layer are 128 and 128, respectively. The softmax activation function is used in the output layer of the convolutional model so that the outputs are non-negative and sum to one, and can therefore be interpreted as class probabilities. The CNN is responsible for extracting the features. Because we have sixty output categories, the final dense layer in our model has sixty units.
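The effect of the five pooling stages and of the softmax output can be traced with a short sketch. It assumes that each convolution preserves the spatial size (padded convolutions) and that each pooling layer uses a 2×2 window with stride 2; neither detail is stated in the text, and the function names are hypothetical.

```python
import math


def feature_map_sizes(input_size=128, num_blocks=5, pool=2):
    """Spatial size after each conv + max-pool block (assumes padded convolutions)."""
    sizes = [input_size]
    for _ in range(num_blocks):
        sizes.append(sizes[-1] // pool)
    return sizes


def softmax(logits):
    """Turn raw scores into probabilities that sum to one."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Under these assumptions a 128×128 input shrinks to 64, 32, 16, 8, and finally 4 pixels per side before the dense layers, and applying `softmax` to the sixty output scores yields one probability per class.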
The Inception v3 model has 42 layers but relatively few parameters, because its convolutions are factorized. For example, a 5×5 convolution can be replaced by two stacked 3×3 convolutions that cover the same receptive field. This reduces the number of weights from 5×5=25 to 3×3+3×3=18, a 28% drop in the number of parameters. A model with fewer parameters is less prone to overfitting, which can result in higher accuracy.
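The arithmetic behind this factorization can be checked directly. The sketch counts weights for a single input and output channel and ignores biases, which is an assumption made to match the numbers in the text; the helper names are hypothetical.

```python
def conv_weights(kernel_size, count=1):
    """Weights in `count` square kernels (single channel, biases ignored)."""
    return count * kernel_size * kernel_size


def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1 convolutions."""
    return 1 + sum(k - 1 for k in kernel_sizes)


single_5x5 = conv_weights(5)             # one 5x5 kernel
stacked_3x3 = conv_weights(3, count=2)   # two 3x3 kernels applied in sequence
reduction = (single_5x5 - stacked_3x3) / single_5x5
```

Two stacked 3×3 kernels see the same 5×5 region of the input as one 5×5 kernel, so the saving in weights comes without shrinking the receptive field.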
Figure 5:Pooling function with maximum clustering of left and middle right pooling
The research used accuracy, precision, recall, and F1-score as performance evaluation metrics. Because the raw confusion matrix alone can be misleading, we report these derived evaluation criteria.
Accuracy (A) represents the proportion of correctly classified predictions and is calculated as follows:
$$ A = \frac{TP + TN}{TP + TN + FP + FN} $$
Note that TP, TN, FP, and FN represent true positive, true negative, false positive, and false negative, respectively.
Precision (P) is the proportion of positive predictions that are correct and is computed as follows:
$$ P = \frac{TP}{TP + FP} $$
Recall (R) is the proportion of actual positives that were correctly detected, and it is calculated as follows:
$$ R = \frac{TP}{TP + FN} $$
The F1-score is the harmonic mean of precision and recall, and it is calculated as follows:
$$ F1 = \frac{2 \times P \times R}{P + R} $$
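These four metrics follow directly from the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The sketch below uses the standard definitions; the function name and the example counts are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only.
example = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

For a multi-class problem such as this one, the counts are usually taken per class (one class against all others) and the resulting scores are then averaged.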
A mobile application was developed using Android Studio to make user interaction easier. The user captures an image of the plant leaf and uploads it to the application as the input image. After processing the image, the proposed system determines whether the plant is infected, identifies the plant leaf disease, and displays the results. Moreover, the system suggests recommendations to overcome the disease. Fig. 6 depicts the working of the proposed plant leaf disease detection and recommendation system deployed in an Android environment.
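The step from model output to on-screen advice can be sketched as a simple lookup. The logic is shown in Python for consistency with the model code rather than in the app's own language. Everything named here is hypothetical: the class names, the placeholder advice strings, and the `min_confidence` fallback are assumptions for illustration and are not taken from the application.

```python
# Hypothetical class names and placeholder advice; the real lists are not given in the text.
CLASS_NAMES = ["apple_healthy", "apple_scab", "grape_black_rot"]
ADVICE = {
    "apple_healthy": "<message for a healthy leaf>",
    "apple_scab": "<recommendation text for apple scab>",
    "grape_black_rot": "<recommendation text for grape black rot>",
}


def diagnose(probabilities, class_names=CLASS_NAMES, advice=ADVICE,
             min_confidence=0.5):
    """Map the model's class probabilities to a label and a recommendation."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    confidence = probabilities[best]
    if confidence < min_confidence:  # assumed fallback for uncertain predictions
        return {"label": None, "confidence": confidence,
                "advice": "Low confidence; please retake the photo."}
    label = class_names[best]
    return {"label": label, "confidence": confidence, "advice": advice[label]}
```

Keeping the recommendations in a table keyed by class name means the advice can be updated without retraining the model.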
Figure 6:Working of the proposed model
This section reports the results of the proposed model for disease detection in plant leaves and for recommendations to overcome the disease. The classification task is carried out using fully connected layers with the ReLU activation function, and softmax is employed at the final layer. Moreover, Inception v3 is used with the convolutional neural network model to perform feature extraction and classification. The results show that using the Inception v3 architecture along with the CNN gave the highest accuracy among the compared models.
Table 2 reports the precision, recall, and F1-score of the proposed system on the same dataset, indicating the proposed model’s improved performance.
Table 2:Performance evaluation results
Table 3 compares the proposed model with well-known machine learning techniques used for plant leaf disease detection.
Table 3:Comparative analysis of the proposed model using the PlantVillage and Plant Doc dataset
Fig. 7 depicts the training and validation loss of the proposed model.
Figure 7:Loss curve of training and validation
Fig. 8 shows the training and validation accuracy of the proposed model.
Table 4 indicates the performance of the proposed model on different plant leaf datasets. The results suggest that the proposed model outperformed existing techniques. Inception v3 uses various kernel sizes to identify features of varying sizes efficiently. Rather than only stacking more layers, it spreads these kernels across wider layers, and pairing Inception v3 with the CNN improved the performance of the proposed model.
Figure 8:Accuracy comparison with training and validation
Table 4:Comparative analysis of the proposed model on multiple datasets
Although this paper designs a convenient mobile plant disease detection system, there are still some points that can be improved in the future:
• The proposed model is trained on the PlantVillage and PlantDoc datasets, whose images are taken indoors in a controlled environment, whereas the system is designed to work in a real-time environment. Its performance is therefore likely to be affected by external conditions such as sunlight. In follow-up research, we will add images of diseased plant leaves taken under natural conditions to examine the system’s performance.
With the continuous development of intelligent technology, more intelligent devices have begun to enter various fields to replace human labor and reduce costs. Technological advancements in agriculture have improved efficiency, lowered prices, reduced time, and increased production. Technological advances in plant leaf disease identification are still at an early stage: identification is done by visiting the field to capture images with cameras, and these images are then inspected using technology to identify the disease, which is still time-consuming. This paper proposes a novel approach using a convolutional neural network model and Inception v3 to identify plant leaf diseases. The proposed model is capable of working in a real-time environment. This research focused on developing a mobile application that uses the proposed model to identify plant disease and provide recommendations to overcome the identified disease. The model achieved 99% accuracy. The proposed model is a convenient and beneficial advisory or early warning tool for use in a real agricultural environment.
Acknowledgement: The authors thank the reviewers and editors for their suggestions during the review process.
Funding Statement: This study is supported by the Hainan Provincial Natural Science Foundation of China (No. 123QN182) and Hainan University Research Fund (Project Nos. KYQD(ZR)-22064, KYQD(ZR)-22063, and KYQD(ZR)-22065).
Author Contributions: Study conception and design: Uzair Aslam Bhatti, Sibghat Ullah Bazai, Shumaila Hussain, Shariqa Fakhar, Chin Soon Ku, Shah Marjan, Por Lip Yee, Liu Jing; data collection: Sibghat Ullah Bazai, Shumaila Hussain, Shariqa Fakhar, Chin Soon Ku, Shah Marjan, Por Lip Yee, Liu Jing; analysis and interpretation of results: Uzair Aslam Bhatti, Sibghat Ullah Bazai, Shumaila Hussain, Shariqa Fakhar, Chin Soon Ku; draft manuscript preparation: Uzair Aslam Bhatti, Sibghat Ullah Bazai, Shumaila Hussain, Shariqa Fakhar, Chin Soon Ku. All authors reviewed the results and approved the final version of the manuscript.
Availability of Data and Materials: The data will be available on reasonable request from the corresponding author.
Conflicts of Interest: The authors declare they have no conflicts of interest to report regarding the present study.
Computers, Materials & Continua, 2023, Issue 10