DOI: 10.15938/j.jhust.2022.04.004
CLC number: TP391.4
Document code: A
Article ID: 1007-2683(2022)04-0023-09
A Detection Method of Lentinus Edodes Based on Improved YOLOv4 Algorithm
HUANG Ying-lai, LI Da-ming, LÜ Xin, YANG Liu-song
(College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China)
Abstract: To explore mechanical picking of Lentinus edodes cultivated in bags, a recognition algorithm based on an improved YOLOv4 is proposed. The main improvements are: in the PANet (Path Aggregation Network) structure, a feature-map path with a residual attention mechanism is added to improve recognition accuracy for small targets, and the convolution layers in PANet are replaced with depthwise separable convolutions to reduce the parameter count. Focal loss is adopted to improve the original confidence loss function. For data preprocessing, the gamma transform is used to augment and expand the data. During training, transfer learning is applied by loading weights pre-trained on the VOC dataset into the backbone network. Compared with the original YOLOv4, the mAP increases by 4.82 percentage points to 94.39%, and the parameter count drops to 58.13% of the original, making the algorithm more efficient and lightweight and providing visual-algorithm support for mechanical picking.
Keywords: YOLOv4; target detection; gamma transform; transfer learning; Lentinus edodes picking
0 Introduction
Lentinus edodes is a widely consumed edible mushroom, and bag cultivation [1] is currently one of the main sawdust-based cultivation techniques. Field surveys show that picking still relies mainly on manual labor, which is tedious and physically demanding. When growing conditions are favorable, yield surges; if the mushrooms are not picked in time, they grow oversized, which lowers their quality, and oversized mushrooms fetch far lower purchase prices than standard-grade ones, causing economic losses for growers. Mechanical picking [2] not only saves labor, lowers growers' production costs, and raises picking efficiency during peak fruiting, but also helps the industry scale up and develop toward intelligent, modern agriculture, giving it strong practical value.
To explore the development and utilization of edible fungi, computer-vision research on their recognition has continued in recent years. In 2014, Chen et al. [3] proposed a quality-sorting method for Lentinus edodes based on texture features and clustering, with a sorting accuracy of 93.57% for the mushroom-type sorting model. In 2015, Xu et al. [4] proposed a saliency-based segmentation algorithm for automatically extracting impurities from edible fungi, achieving a 99.6% recognition rate even under uneven illumination. In 2016, Liu et al. [5] proposed a near-infrared recognition model for wild mushrooms based on infrared spectroscopy and support vector machines, with a correct recognition rate of 95.3%. In 2020, Lin et al. [6] used image processing to extract and fuse color and shape features of wild mushrooms for species recognition, reaching 90.87%. These studies rely on traditional machine-vision pipelines with hand-crafted features; although some achieve high detection accuracy, the methods lack generality and robustness in complex environments.
Machine-vision recognition based on convolutional neural networks, which learn classification features automatically through supervised training, has achieved high accuracy [7-8]. CNN-based object-detection algorithms (such as SSD [9], EfficientDet [10], Faster RCNN [11], CenterNet, and the YOLO series) likewise deliver high detection precision. Liu et al. [12] proposed an improved-SSD model for pedestrian detection in fields, using MobileNetV2 [13] as the base network and inverted residual blocks combined with dilated convolutions as the basic structure for location prediction, reaching 97.46% accuracy. Xiang et al. [14] proposed an improved Faster RCNN method for detecting surface defects on aluminum, adding a feature pyramid network (FPN) to the backbone to strengthen feature extraction for small defects and replacing the ROI Pooling algorithm with ROI Align to obtain more accurate defect localization; experiments showed a mean average precision of 91.20% for aluminum surface defects. Zhu et al. [15] proposed an improved EfficientDet algorithm for detecting terminal wire cores, clustering core bounding boxes with a multi-dimensional K-means algorithm to generate anchors and reconstructing the loss function with a gradient-equalization mechanism, reaching a mean precision of 96.2%. Deng et al. [16] proposed an improved YOLOv3 [17] method for traffic-sign detection, introducing an improved spatial pyramid pooling structure, optimizing the multi-scale prediction network, and adopting CIoU as the loss function, with an average precision of 94.8%. These improvements to object-detection algorithms also offer new approaches for machine-vision recognition of edible fungi. YOLOv4 [18] shows comparatively high accuracy at a similar parameter count. This paper proposes a YOLOv4-based machine-vision recognition algorithm for Lentinus edodes, providing visual-algorithm support for mechanical picking.
1 Overview of the YOLOv4 Algorithm
1.1 Network Structure
First, CSP (Cross Stage Partial Network) structures with stronger representational capacity are embedded in the backbone network for feature-map extraction, as shown in Fig. 1. Each CSP structure is a residual structure assembled from CBM units, and each CBM unit is a stack of one convolution layer, one batch-normalization layer, and one Mish activation layer.
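Since the CBM unit stacks a convolution, batch normalization, and a Mish activation, the only non-standard component is Mish itself. A minimal numpy sketch, following the standard definition x · tanh(softplus(x)):

```python
import numpy as np

def mish(x):
    """Mish activation used in the CBM unit: x * tanh(softplus(x)),
    where softplus(x) = ln(1 + e^x)."""
    return x * np.tanh(np.log1p(np.exp(x)))
```

Unlike ReLU, Mish is smooth and allows small negative values through, which is often credited with helping gradient flow in deep backbones such as CSPDarknet53.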
Next, the SPP (spatial pyramid pooling) structure pools the feature map through three parallel pooling layers of different sizes and concatenates the results with the input through a residual shortcut. The resulting three groups of feature maps of different sizes are then fed into the PANet structure, which outputs feature maps carrying global semantic information.
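As a sketch of this SPP idea, the following numpy code pools a single-channel map with three stride-1, 'same'-padded max-pooling windows and stacks the results together with the identity branch. The 5×5, 9×9, and 13×13 kernel sizes are the standard YOLOv4 SPP choices and are an assumption here:

```python
import numpy as np

def maxpool_same(x, k):
    """Stride-1 max pooling with 'same' padding on a 2D map,
    so the output has the same height and width as the input."""
    pad = k // 2
    padded = np.pad(x, pad, mode="constant", constant_values=-np.inf)
    h, w = x.shape
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out

def spp(feature_map, kernels=(5, 9, 13)):
    """Concatenate the input with its three pooled versions along
    a new leading (channel) axis, as SPP does before PANet."""
    maps = [feature_map] + [maxpool_same(feature_map, k) for k in kernels]
    return np.stack(maps, axis=0)
```

Because all branches keep the spatial size, the concatenation simply multiplies the channel count, fusing receptive fields of several scales at one network position.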
Finally, prediction is performed on the three groups of output feature maps at different scales. Each grid cell of a feature map (see Fig. 2) predicts three candidate boxes, and each box predicts four location parameters x, y, w, h, one confidence score p, and the per-class probabilities.
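The per-scale output depth follows directly from this description: 3 boxes per cell, each carrying 4 coordinates, 1 confidence, and the class probabilities. A minimal sketch:

```python
def head_channels(num_classes, boxes_per_cell=3):
    """Channel count of each YOLO output map:
    per box, 4 coords (x, y, w, h) + 1 confidence + class probabilities."""
    return boxes_per_cell * (4 + 1 + num_classes)
```

For a single-class detector such as the Lentinus edodes task here this gives 3 × (5 + 1) = 18 channels per scale, while the 20-class VOC setting gives 75.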
1.2 Loss Function
1.3 Training Tricks
2 Improvements to YOLOv4
2.1 Lightweighting with Depthwise Separable Convolution
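The parameter saving from depthwise separable convolution [20] can be sketched by counting weights: a standard k×k convolution needs k·k·C_in·C_out weights, while the depthwise-plus-pointwise factorization needs k·k·C_in + C_in·C_out. The channel sizes in the usage below are illustrative assumptions, not the paper's exact PANet dimensions:

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k=3):
    """Depthwise k x k conv (one k x k filter per input channel)
    followed by a 1 x 1 pointwise conv mixing channels."""
    return k * k * c_in + c_in * c_out
```

For example, with 256 input and output channels and a 3×3 kernel, the standard convolution needs 589,824 weights while the separable version needs 67,840, roughly a factor of 1/9 + 1/C_out, which is the mechanism behind the paper's parameter reduction in PANet.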
2.2 Residual Attention Mechanism and PANet Feature Fusion
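As a sketch of the attention component, the following numpy code implements CBAM-style channel attention [21]: global average- and max-pooled descriptors pass through a shared two-layer MLP, are summed, and gate the channels through a sigmoid. The weight shapes and reduction ratio are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """CBAM-style channel attention.
    feat: (C, H, W) feature map; w1: (C // r, C) and w2: (C, C // r)
    are the shared MLP weights with reduction ratio r."""
    avg = feat.mean(axis=(1, 2))                 # global average pooling -> (C,)
    mx = feat.max(axis=(1, 2))                   # global max pooling -> (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0) # shared MLP with ReLU hidden layer
    weights = sigmoid(mlp(avg) + mlp(mx))        # per-channel gate in (0, 1)
    return feat * weights[:, None, None]
```

In the full CBAM module this channel gate is followed by a spatial-attention gate; wrapping the gated output in a residual addition with the input gives the residual-attention path described in the abstract.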
2.3 Single-Target Loss Function
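The focal loss [22] used to improve the confidence loss down-weights easy examples through the (1 − p_t)^γ factor, so training focuses on hard positives and hard negatives. A minimal numpy sketch of the binary form; the α = 0.25 and γ = 2 defaults follow Lin et al. [22] and are an assumption here, not values stated in this excerpt:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Element-wise binary focal loss.
    p: predicted confidence in (0, 1); y: ground-truth label in {0, 1}.
    Reduces to alpha-weighted cross-entropy when gamma = 0."""
    p_t = np.where(y == 1, p, 1 - p)             # prob assigned to the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha) # class-balance weight
    return -alpha_t * (1 - p_t) ** gamma * np.log(np.clip(p_t, 1e-7, 1.0))
```

A well-classified positive (p = 0.9) thus contributes far less loss than a hard one (p = 0.1), which counteracts the foreground-background imbalance typical of dense detection grids.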
3 Experiments and Analysis of Results
3.1 Experimental Environment
The hardware environment comprises an Intel(R) Core(TM) i5-10300H CPU and an NVIDIA GeForce GTX 1660 Ti GPU, with 16 GB of memory. The software environment is the Windows 10 operating system with the PyTorch 1.2 framework, CUDA 10.1, cuDNN 7.6.4, and Python 3.7.1.
3.2 Methods
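For the gamma-transform augmentation described in the abstract, a minimal numpy sketch may be useful; the 8-bit normalization and clipping are implementation assumptions. Since gamma < 1 brightens an image and gamma > 1 darkens it, varying gamma simulates different illumination conditions when expanding the training set:

```python
import numpy as np

def gamma_transform(img, gamma):
    """Gamma transform on an 8-bit image: out = 255 * (img / 255) ** gamma.
    gamma < 1 brightens mid-tones, gamma > 1 darkens them;
    pure black (0) and pure white (255) are fixed points."""
    norm = img.astype(np.float32) / 255.0
    return np.clip(255.0 * norm ** gamma, 0, 255).astype(np.uint8)
```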
3.3 Comparison and Analysis of Results
4 Conclusion
This paper proposes a machine-vision detection method for Lentinus edodes based on an improved YOLOv4. Adding a feature-map path with a residual attention mechanism, improving the loss function with focal loss, and training with transfer learning raise the network's detection accuracy; in addition, replacing the original convolution layers with depthwise separable convolutions in parts of the non-backbone network reduces the parameter count. Comparative experiments confirm the effectiveness of these methods, and the detection accuracy is higher than that of several current mainstream object-detection methods. Detection speed, however, still has room for improvement, which depends on further refinement of the method and the network structure. Future work will focus on further shrinking the parameter scale and raising detection speed without sacrificing detection accuracy, to ease deployment on embedded devices.
References:
[1] XIAO Deqing, SHI Xiangyang. Cultivation Techniques of Lentinus Edodes in Greenhouse with Bagged Materials[J]. Modern Agricultural Science and Technology, 2013(7): 109.
[2] GAO Wenshuo, SONG Weidong, WANG Jiaoling, et al. Review of Fruit, Vegetable and Fungus Picking Machinery[J]. Journal of Chinese Agricultural Mechanization, 2020, 41(10): 9.
[3] CHEN Hong, XIA Qing, ZUO Ting, et al. Quality Sorting Method of Lentinus Edodes Based on Texture Analysis[J]. Transactions of the Chinese Society of Agricultural Engineering, 2014, 30(3): 285.
[4] XU Zhenchi, JI Lei, LIU Xiaorong, et al. Detection of Impurities in Edible Fungi Based on Saliency Features[J]. Computer Science, 2015, 42(S2): 203.
[5] LIU Yang, WANG Tao, ZUO Yueming. Near-Infrared Recognition Model of Wild Mushrooms Based on Support Vector Machine[J]. Food and Machinery, 2016, 32(4): 92.
[6] LIN Nan, WANG Na, LI Zhuozhi, et al. Research on Feature Extraction and Recognition of Wild Edible Fungi Based on Machine Vision[J]. Journal of Chinese Agricultural Mechanization, 2020, 41(5): 111.
[7] WANG Weibing, WANG Zhuo, XU Qian, et al. Classification of Pulmonary Nodules Based on Three-Dimensional Convolutional Neural Network[J]. Journal of Harbin University of Science and Technology, 2021, 26(4): 87.
[8] BI Rongrong, WANG Jinke. Automatic Lung Parenchyma Segmentation Method Combining RCNN and U-Net in CT Images[J]. Journal of Harbin University of Science and Technology, 2021, 26(3): 74.
[9] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single Shot MultiBox Detector[C]//Computer Vision - ECCV 2016. Cham, 2016: 21.
[10] TAN M, PANG R, LE Q V. EfficientDet: Scalable and Efficient Object Detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2020: 10781.
[11] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137.
[12] LIU Hui, ZHANG Lishuai, SHEN Yue, et al. Real-Time Detection Method of Orchard Pedestrians Based on Improved SSD[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(4): 29.
[13] SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2018.
[14] XIANG Kuan, LI Songsong, LUAN Minghui, et al. Surface Defect Detection Method for Aluminum Based on Improved Faster RCNN[J]. Chinese Journal of Scientific Instrument, 2021, 42(1): 191.
[15] ZHU Shisong, SUN Xiushuai. Wire Harness Terminal Core Detection Algorithm Based on Improved EfficientDet[J/OL]. Laser & Optoelectronics Progress: 1 [2022-01-18]. http://kns.cnki.net/kcms/detail/31.1690.TN.20210806.1548.017.html.
[16] DENG Tianmin, ZHOU Zhenhao, FANG Fang, et al. Research on Traffic Sign Detection Method Based on Improved YOLOv3[J]. Computer Engineering and Applications, 2020, 56(20): 28.
[17] REDMON J, FARHADI A. YOLOv3: An Incremental Improvement[R]. arXiv:1804.02767, 2018.
[18] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: Optimal Speed and Accuracy of Object Detection[J/OL]. arXiv:2004.10934, 2020. http://arxiv.org/abs/2004.10934.
[19] HE K, ZHANG X, REN S, et al. Deep Residual Learning for Image Recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 770.
[20] CHOLLET F. Xception: Deep Learning with Depthwise Separable Convolutions[J]. arXiv preprint arXiv:1610.02357, 2016.
[21] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional Block Attention Module[C]//Proceedings of the 2018 European Conference on Computer Vision. 2018: 3.
[22] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal Loss for Dense Object Detection[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 2999.
(Editor: WEN Zeyu)