Vision pre-positioning method for litchi picking robot under large field of view

Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun※, Chen Mingyou

(1. College of Engineering, South China Agricultural University, Guangzhou 510642, China; 2. Key Laboratory of Key Technology on Agricultural Machine and Equipment, Ministry of Education, South China Agricultural University, Guangzhou 510642, China)
When picking litchi, the robot needs the spatial positions of multiple target litchi clusters so that it can plan an efficient motion trajectory. This paper studies a vision pre-positioning method for a litchi picking robot under a large field of view. First, litchi images were captured with a binocular camera. The original YOLOv3 network was then improved to design the YOLOv3-DenseNet34 litchi cluster detection network; a litchi cluster pairing method based on a same-row order-consistency constraint was proposed; finally, the spatial coordinates of the litchi clusters were computed from binocular stereo vision by triangulation. Test results show that the YOLOv3-DenseNet34 network improves both the detection accuracy and the detection speed of litchi clusters: the mean average precision (mAP) reaches 0.943 and the average detection speed reaches 22.11 frames/s. At a detection distance of 3 m, the binocular pre-positioning method has a maximum absolute error of 36.602 mm, a mean absolute error of 23.007 mm and a mean relative error of 0.836%, which meets the vision pre-positioning requirements of a picking robot under a large field of view and can serve as a reference for the vision pre-positioning of other fruits and vegetables picked under a large field of view.
robots; image processing; object detection; litchi picking; large field of view; convolutional neural network; stereo vision
Developing a litchi picking robot to automate litchi harvesting and make it intelligent is an important way to address the low level of automation in domestic litchi picking operations. The vision system is a key component of a litchi picking robot [1], and machine-vision-based positioning has been widely applied in agriculture in recent years [2-4]. The vision positioning algorithm is the core of the vision system, and its performance directly affects the picking efficiency and quality of the robot. Research on vision positioning for litchi picking is therefore of practical importance.
Professor Zou Xiangjun's team at South China Agricultural University has carried out extensive research on vision-based litchi picking robots [5-8] and has proposed litchi segmentation methods for natural environments [9-12]. Many other researchers in China have studied recognition and positioning for picking various fruits [13-16]. However, the above studies are all based on small fields of view containing only one or two litchi clusters. Drawing on experience with fruit and vegetable picking abroad [17-18], pre-positioning the fruit distribution over the whole litchi tree before the robot reaches its working range can guide the robot to the picking position, after which the picking points are located precisely, thereby improving picking efficiency. A large field of view means that the camera covers a wide area; under this condition, however, many litchi clusters appear in the camera's view, which makes cluster positioning more difficult. It is therefore necessary to study vision pre-positioning for litchi picking robots under a large field of view.
In recent years, with the development of deep learning, convolutional neural networks in particular, many researchers have used convolutional neural networks for classification, segmentation, recognition and detection [19-32]. For example, reference [20] optimized the network structure on the basis of VGGNet to improve feature extraction for the main organs of tomato, generated detection regions with Selective Search, and detected the main organs of tomatoes of different types and maturities. References [28-29] used YOLO algorithms to recognise and locate picking targets and obtained good results. Deep learning methods are therefore helpful for pre-positioning litchi clusters.
The test equipment consists of hardware and software. The hardware mainly includes: a binocular stereo vision system composed of two GigE industrial cameras (Microvision MV-EM200C, resolution 1600×1200 pixels, frame rate 60 frames/s, lens focal length 16 mm); a Bosch GLM50 laser range finder with an effective measurement range of 0.05-50 m and a measurement accuracy of ±1.5 mm; a Microvision high-precision dot calibration board with 9×11 dots and a centre spacing of (30±0.01) mm; and a laptop computer with an i7-7700HQ processor, 16 GB 2 400 MHz memory and a GTX 1060 6 GB graphics card.
The software system was written mainly on the basis of the OpenCV vision library and the DarkNet deep learning framework.
Before image acquisition, the binocular stereo vision system needs to be calibrated. According to the triangulation principle, a longer baseline gives higher measurement accuracy, but it also reduces the common field of view of the two cameras. To keep a large common field of view at relatively high accuracy, a baseline of 110 mm was chosen after repeated adjustment. To ensure image accuracy, calibration was carried out over the large field of view, i.e., with the cameras 2.5-3 m from the target fruit. Before collecting images, the binocular stereo vision system was calibrated with the dot calibration board.
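The calibration step can be sketched with OpenCV as below. This is a minimal illustration rather than the authors' code: the 9×11 circle grid and 30 mm spacing come from the equipment description above, while the file layout and calibration flags are assumptions.

```python
import cv2
import glob
import numpy as np

# 9 x 11 symmetric circle grid, 30 mm centre spacing (Section 1.1)
pattern_size = (9, 11)
spacing_mm = 30.0

# Ideal 3-D positions of the circle centres on the board (Z = 0 plane)
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2) * spacing_mm

obj_points, left_points, right_points = [], [], []
image_size = None

# Hypothetical file layout: paired calibration shots taken at 2.5-3 m
for lf, rf in zip(sorted(glob.glob("calib/left_*.png")),
                  sorted(glob.glob("calib/right_*.png"))):
    gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
    image_size = gl.shape[::-1]
    ok_l, ctr_l = cv2.findCirclesGrid(gl, pattern_size, flags=cv2.CALIB_CB_SYMMETRIC_GRID)
    ok_r, ctr_r = cv2.findCirclesGrid(gr, pattern_size, flags=cv2.CALIB_CB_SYMMETRIC_GRID)
    if ok_l and ok_r:
        obj_points.append(objp)
        left_points.append(ctr_l)
        right_points.append(ctr_r)

# Monocular calibration of each camera, then stereo calibration for R and T
_, K1, d1, _, _ = cv2.calibrateCamera(obj_points, left_points, image_size, None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_points, right_points, image_size, None, None)
_, K1, d1, K2, d2, R, T, _, _ = cv2.stereoCalibrate(
    obj_points, left_points, right_points, K1, d1, K2, d2, image_size,
    flags=cv2.CALIB_FIX_INTRINSIC)

# Rectification transforms used later for epipolar (row-aligned) correction
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, image_size, R, T)
print("baseline (mm):", np.linalg.norm(T))   # should be close to the 110 mm set-up
```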
The test images were taken from June to July 2018 in Zengcheng District and Conghua District of Guangzhou. Litchi images were collected over a large field of view in the field, and the distances of the litchi clusters were measured with the laser range finder for comparison with the results of the proposed algorithm. A total of 250 binocular image pairs were collected. Because such a small sample is prone to overfitting, the original images and the epipolar-rectified images were augmented by small random crops and scaling, giving a final data set of 4 000 images. The data set for the object detection network was then annotated with the open-source tool LabelImg.
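The augmentation described above can be sketched as follows; the crop ratio, scaling range and number of samples generated per image are illustrative assumptions, not the paper's settings.

```python
import random

import cv2

def augment(image, n_samples=8, max_crop=0.1, scale_range=(0.9, 1.1)):
    """Expand one litchi image into several by small random crops and re-scaling.
    max_crop and scale_range are illustrative values, not the paper's settings."""
    h, w = image.shape[:2]
    out = []
    for _ in range(n_samples):
        # Random crop: remove at most max_crop of each border
        x0 = random.randint(0, int(w * max_crop))
        y0 = random.randint(0, int(h * max_crop))
        x1 = w - random.randint(0, int(w * max_crop))
        y1 = h - random.randint(0, int(h * max_crop))
        crop = image[y0:y1, x0:x1]
        # Random scaling within a small range
        s = random.uniform(*scale_range)
        crop = cv2.resize(crop, None, fx=s, fy=s, interpolation=cv2.INTER_LINEAR)
        out.append(crop)
    return out
```

Since annotation with LabelImg was done after augmentation, bounding boxes do not need to be transformed here.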
Under a large field of view the image background is complex, so dense stereo matching over the whole image is inefficient and performs poorly. In addition, as shown by the blue and red boxes in Fig. 1, some litchi clusters do not appear completely in the common field of view of both cameras, which disturbs template matching of litchi cluster images and makes accurate positioning difficult. Therefore, this paper first performs object detection on the left and right images, then proposes a litchi cluster pairing algorithm based on a same-row order-consistency constraint on top of the detection results, and finally computes the three-dimensional coordinates of each litchi cluster from the disparity of its centre according to the triangulation principle.
Note: yellow boxes indicate clusters that are complete in the common field of view; blue boxes indicate clusters that are partially missing from the common field of view; red boxes indicate clusters that are entirely missing from the common field of view.
1.3.1 Litchi cluster detection
Drawing on the YOLOv3 [30] object detection network and the DenseNet [31] classification network, and exploiting the fact that litchi cluster detection involves a single scene (orchard environment only) and a single target class, the network structure was optimized: a 34-layer densely connected convolutional module (hereafter Dense Module) was designed, and the litchi cluster detection network YOLOv3-DenseNet34 was built on this Dense Module.
A convolution layer (Conv), a batch normalization layer (BN) and a leaky ReLU activation layer form a basic component (DarkNet convolution, batch normalization, leaky ReLU; hereafter DBL), shown at the bottom left of Fig. 2, where DBL(1×1) denotes a DBL whose convolution kernel is 1×1. Several DBL layers form a DBL module (bottom right of Fig. 2), and several DBL modules form a Dense Module; the connection pattern between modules is shown in Fig. 2.
Fig.2 Structure of the Dense Module
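The paper implements the network in the DarkNet framework; the PyTorch-style sketch below only illustrates the DBL unit and the dense connection pattern of Fig. 2. Channel counts, growth rate and module numbers are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DBL(nn.Module):
    """Conv + BatchNorm + LeakyReLU basic unit (the paper's DBL)."""
    def __init__(self, in_ch, out_ch, k, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, k, stride, padding=k // 2, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.1, inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class DBLModule(nn.Module):
    """A 1x1 bottleneck DBL followed by a 3x3 DBL, as in one DBL module of Fig. 2."""
    def __init__(self, in_ch, growth):
        super().__init__()
        self.body = nn.Sequential(DBL(in_ch, growth, 1), DBL(growth, growth, 3))
    def forward(self, x):
        return self.body(x)

class DenseModule(nn.Module):
    """Dense connections: each DBL module sees the concatenation of all earlier outputs."""
    def __init__(self, in_ch, growth=32, n_modules=4):
        super().__init__()
        self.modules_list = nn.ModuleList(
            DBLModule(in_ch + i * growth, growth) for i in range(n_modules))
    def forward(self, x):
        feats = [x]
        for m in self.modules_list:
            feats.append(m(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

# Example: a stride-2 DBL downsamples, then a Dense Module extracts features
down = DBL(64, 128, 3, stride=2)
dense = DenseModule(128, growth=32, n_modules=4)
y = dense(down(torch.randn(1, 64, 416, 416)))
print(y.shape)   # torch.Size([1, 256, 208, 208])
```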
The prior box sizes of YOLOv3-DenseNet34 were obtained by K-means clustering of the widths and heights of the litchi clusters in all images of the sample set. Based on the scale distribution of the samples, the number of clusters was set to 6, giving prior boxes of (20, 20), (33, 27), (26, 39), (48, 49), (32, 56) and (57, 95).
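A plain K-means sketch of this prior-box clustering is given below. It clusters box widths and heights with Euclidean distance, mirroring the description above (some YOLO implementations use a 1-IoU distance instead); the sample boxes are illustrative.

```python
import numpy as np

def kmeans_anchors(wh, k=6, iters=100, seed=0):
    """Cluster (width, height) pairs of labelled boxes into k prior boxes."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign every box to its nearest centre (Euclidean distance in w-h space)
        d = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([wh[labels == i].mean(axis=0) if np.any(labels == i)
                                else centers[i] for i in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers[np.argsort(centers.prod(axis=1))]   # sort by area, as YOLO expects

# wh would hold the (width, height) of every litchi cluster box in the training set
wh = np.array([[20, 21], [34, 26], [25, 40], [47, 50], [33, 55], [58, 94], [21, 19]])
print(np.round(kmeans_anchors(wh)))
```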
From the clustering results, the longest prior box side is 95 pixels; with the receptive field of a 3×3 convolution, the required number of downsampling steps of YOLOv3-DenseNet34 is 5.

To avoid losing information from the original data, YOLOv3-DenseNet34 uses stride-2 convolutions instead of max pooling for downsampling. The number of downsampling steps is related to the convolution receptive field and the prior box side length by

$r \cdot 2^{n} \geq l_{\max}$

where $r$ is the receptive field size of the convolution kernel (3 for a 3×3 convolution), $n$ is the number of downsampling steps, and $l_{\max}$ is the longest prior box side. With $l_{\max}=95$, the smallest $n$ satisfying the relation is 5, since 3×2⁵ = 96 ≥ 95.
The structure of the YOLOv3-DenseNet34 object detection network designed in this paper is shown in Fig. 3, where DBL (stride = 2) denotes the convolution layer that replaces pooling for downsampling. The network uses a 34-layer convolutional backbone containing four Dense Modules to extract multi-scale features and makes predictions on three feature maps of different scales, i.e., y1, y2 and y3 in Fig. 3, which are downsampled 5, 4 and 3 times respectively. Each scale predicts 2 outputs, and each output contains 6 values: the positional and scale offsets of the target in the different directions, the confidence, and the one-hot class label; the depth of each prediction output is therefore 12.
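A short sketch of why each prediction map has depth 12, using the same PyTorch-style notation as above; the 256 input channels are an illustrative placeholder.

```python
import torch
import torch.nn as nn

n_anchors_per_scale = 2      # two prior boxes at each of the three scales
n_box_params = 4             # x, y offsets and w, h scale offsets
n_conf = 1                   # objectness confidence
n_classes = 1                # single class: litchi cluster (one-hot of length 1)
out_depth = n_anchors_per_scale * (n_box_params + n_conf + n_classes)   # = 12

# A 1x1 convolution turns a backbone feature map into a prediction map y1/y2/y3
head = nn.Conv2d(256, out_depth, kernel_size=1)
print(head(torch.randn(1, 256, 13, 13)).shape)   # torch.Size([1, 12, 13, 13])
```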
1.3.2 Litchi cluster pre-positioning based on binocular stereo vision
After monocular and binocular calibration, corresponding points in the left and right images must be stereo-matched; the disparity of the matched points is then computed, and finally their three-dimensional coordinates are obtained by triangulation. Dense stereo matching over a whole large-field-of-view litchi image is computationally expensive and prone to mismatches, and even a complete global stereo match would still not give the position of each individual litchi cluster.
Once the litchi clusters in the image have been detected, direct template matching can be used: each cluster detected in the left image is taken as a template and matched over the right image, and the point with the highest matching score is taken as the matched point, giving a sparse stereo match. However, direct template matching has to search the whole right image for every litchi cluster detected in the left image, which is computationally expensive and prone to mismatches, as shown in Fig. 4, where the same number in the left and right images marks regions that the direct template matching algorithm regards as the same litchi cluster. Clusters 5, 6 and 8 are clearly mismatched.
To solve this problem, a sparse stereo matching algorithm based on a same-row order-consistency constraint is proposed on the basis of the litchi cluster detection results. The pairing is carried out after epipolar rectification. Each detection in the left image is used as a template, and the row constraint restricts the search for its match to the same image rows of the right image, reducing the search range. In addition, for a parallel-optical-axis binocular model, the x-coordinate of a spatial point in the right image must be smaller than that in the left image; therefore, if the x-coordinate of the bottom-right corner of the template in the left image is xl, the search range xr in the right image can be limited to 0 to xl, which further narrows the search. The matching method with the same-row order-consistency constraint thus reduces the search range, speeds up matching and reduces mismatches. The matching range under this constraint is shown in Fig. 5.
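A sketch of this constrained matching step, assuming rectified grayscale images and OpenCV template matching; the row margin is an illustrative tolerance, not a value from the paper.

```python
import cv2

def match_in_row(left_img, right_img, box, row_margin=5):
    """Match one left-image detection (x, y, w, h) in the right image under the
    same-row, same-order constraint. Returns the best match's top-left corner."""
    x, y, w, h = box
    template = left_img[y:y + h, x:x + w]

    # Row constraint: after rectification the match lies in (almost) the same rows
    y0 = max(0, y - row_margin)
    y1 = min(right_img.shape[0], y + h + row_margin)

    # Order constraint: x_r <= x_l, so only columns 0 .. x + w are searched
    x1 = min(right_img.shape[1], x + w)
    strip = right_img[y0:y1, 0:x1]
    if strip.shape[0] < h or strip.shape[1] < w:
        return None, 0.0

    score = cv2.matchTemplate(strip, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(score)
    return (max_loc[0], y0 + max_loc[1]), max_val
```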
Fig.3 Structure of the YOLOv3-DenseNet34 network
Note: yellow boxes show the detection of litchi clusters in the left image; purple boxes show the detection of litchi clusters in the right image.
Note: xl is the x-coordinate of the target in the left image; xr is the x-coordinate of the target in the right image.
To remove mismatches produced by the same-row order-consistency matching, the overlap between each candidate match region and the detection results in the right image is computed; for each candidate match region, the right-image detection with the largest overlap is kept as its pairing result, and pairs that do not overlap or whose overlap is very low (IoU < 0.2) are discarded.
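The overlap test can be sketched as follows, assuming boxes are (x, y, w, h) tuples; the 0.2 threshold follows the text.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax0, ay0, ax1, ay1 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx0, by0, bx1, by1 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def pair_matches(candidate_boxes, right_detections, thresh=0.2):
    """Keep, for each candidate match region, the right-image detection with the
    highest IoU; drop pairs whose best IoU is below the threshold."""
    pairs = []
    for cand in candidate_boxes:
        best = max(right_detections, key=lambda det: iou(cand, det), default=None)
        if best is not None and iou(cand, best) >= thresh:
            pairs.append((cand, best))
    return pairs
```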
Finally, the matching results are refined. The overlap region of each pairing in the right image (the white box in Fig. 6b) is taken as a template and matched back onto the left image using the same-row order-consistency constrained matching method, but with a slightly different constraint: if the x-coordinate of the top-left corner of the overlap region in the right image is xr, the sliding-window search in the left image is limited to the same rows and to the range (xr, W), where W is the image width. The refined results are shown by the white boxes in Fig. 6.
Note: yellow boxes are the object detection results; purple boxes show the matches of the left-image detections in the right image; white boxes are the overlap regions of the left- and right-image matching results.
1.3.3 Sub-pixel disparity calculation
After the litchi clusters are paired, matched points must be chosen for disparity calculation. When the paired boxes are the same size, the disparity of the centre points of the left and right boxes equals that of their top-left corners. Because direct computation only gives pixel-level disparity, a sub-pixel disparity calculation method was designed as follows: first compute the similarity and the disparity of the paired boxes; then compute the matching similarities at the neighbouring pixel-level disparities. Together with the original matched point and its similarity, this gives three points in the disparity-similarity plane (points p1, p2 and p3 in Fig. 7), which uniquely determine a quadratic curve (the curve in Fig. 7). Finally, the vertex of this quadratic curve (point t in Fig. 7) is found; the abscissa of the vertex is the disparity at sub-pixel accuracy. With this disparity, the three-dimensional coordinates of the matched point can be computed.
Note: p1, p2 and p3 are the three points in the disparity-similarity plane determined by the original matched point, its neighbouring disparities and their similarities; t is the vertex of the quadratic curve.
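A sketch of the sub-pixel refinement and the subsequent triangulation; the focal length and similarity values in the example are placeholders, while the 110 mm baseline follows the set-up above.

```python
def subpixel_disparity(d, s_prev, s0, s_next):
    """Vertex abscissa of the parabola through (d-1, s_prev), (d, s0), (d+1, s_next),
    where s* are matching similarities; this is the sub-pixel disparity."""
    denom = s_prev - 2.0 * s0 + s_next
    if abs(denom) < 1e-12:          # degenerate (flat) case: keep the integer disparity
        return float(d)
    return d - 0.5 * (s_next - s_prev) / denom

def depth_from_disparity(disparity_px, focal_px, baseline_mm=110.0):
    """Triangulation for the parallel-axis binocular model: Z = f * B / d."""
    return focal_px * baseline_mm / disparity_px

# Example: integer disparity 42 px, neighbouring similarities from template matching
d_sub = subpixel_disparity(42, 0.81, 0.93, 0.88)
print(round(d_sub, 3), "px ->", round(depth_from_disparity(d_sub, 3600.0), 1), "mm")
```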
1.3.4 Pre-positioning error calculation
The coordinates of a matched point in the left camera frame cannot be measured directly, so the errors of the three coordinate values cannot be computed individually. The distance error of the spatial point is therefore used to measure the positioning error, computed as

$e = \left| d_v - d_t \right|, \quad d_v = \sqrt{x^2 + y^2 + z^2}$

where $e$ is the measurement error, mm; $d_v$ is the vision-measured distance, mm; $d_t$ is the laser-measured distance, mm; and $x$, $y$, $z$ are the vision-measured coordinates of the laser point, mm.
The acquisition time, location and equipment for this test are the same as in Sections 1.1 and 1.2. During the test, the binocular stereo vision system was first calibrated. The tripod head was then adjusted so that the laser spot fell on a litchi cluster and locked; after the laser range finder reading stabilised, the laser-measured distance dt was recorded while both cameras captured litchi images simultaneously. This process was repeated until 30 data sets and 30 image pairs had been recorded. For each of the 30 image pairs, a fixed-size region centred on the laser spot in the image was taken as the detection result, matching and disparity calculation were performed with the same-row order-consistency constrained matching method, the three-dimensional coordinates of the matched point were computed, and the distance dv from the litchi fruit to the camera was obtained. Finally, the error between dv and dt was computed.
DarkNet [30] is the framework on which YOLOv3 is built and was used for network construction and training; the training parameter settings of the original YOLOv3 (DarkNet53 backbone) and YOLOv3-DenseNet34 are listed in Table 1.
Table 1 Training parameter settings of the networks
Following previous work [29-31], the Loss value is used to represent the training loss and to assess the correctness and convergence of the networks. During training, the Loss values of the first 1 000 iterations are very large and not meaningful, so the curves are recorded from iteration 1 000 onward, as shown in Fig. 8.
Fig. 8 shows that both networks fit rapidly within the first 2 000 iterations and then stabilise; the Loss of YOLOv3-DenseNet34 drops more slowly than that of the original network, but both eventually converge, indicating that the network structure designed in this paper is sound.
Fig.8 Loss curves of the litchi cluster detection networks during training
The mean average precision (mAP) [33-34] is used to measure litchi cluster detection accuracy; it reflects the recognition ability of an object detection network well and is currently the most widely used metric in object detection. The detection speed of the model is expressed as a frame rate (frames per second, FPS). The mAP is computed as

$P = \dfrac{t_p}{t_p + f_p}, \quad AP = \dfrac{\sum P}{N}, \quad mAP = \dfrac{\sum AP}{C}$

where $P$ is the precision; $t_p$ is the number of positive samples correctly classified as positive; $f_p$ is the number of negative samples incorrectly classified as positive; $AP$ is the average precision; $N$ is the total number of test images; $mAP$ is the mean average precision; and $C$ is the total number of classes.
The mAP, average detection speed and model size obtained in the tests are given in Table 2.
Table 2 Performance comparison of the litchi cluster detection networks
Table 2 shows that the detection speed of YOLOv3-DenseNet34 is about 0.6 times higher than that of the original YOLOv3, reaching 22.11 frames/s; the mAP is 5.6% higher, reaching 0.943; and the model is only 9.3 MB, about 1/26 the size of the original network. The improved litchi cluster detection network YOLOv3-DenseNet34 therefore improves on the original YOLOv3 in detection speed, detection accuracy and model size.
The laser-measured values, vision-measured values and measurement errors of litchi cluster pre-positioning are given in Table 3. From these data, the maximum absolute error of binocular pre-positioning is 33.602 mm, the mean absolute error is 23.007 mm, the standard deviation is 7.434 mm, and the mean relative error is 0.836%, indicating that the proposed method is accurate enough to meet the pre-positioning requirements.
Table 3 Vision measurements and errors of litchi pre-positioning
This paper studied a vision pre-positioning method for litchi picking robots under a large field of view. The original YOLOv3 was improved to design the litchi cluster detection network YOLOv3-DenseNet34; a litchi cluster pairing method with a same-row order-consistency constraint was proposed; finally, the spatial coordinates of litchi clusters were computed from binocular stereo vision by triangulation. Test results show that the YOLOv3-DenseNet34 network improves the detection accuracy and speed of litchi clusters, with an mAP of 0.943 and an average detection speed of 22.11 frames/s. At a detection distance of 3 m, the binocular pre-positioning method has a maximum absolute error of 36.602 mm, a mean absolute error of 23.007 mm and a mean relative error of 0.836%. The proposed method meets the accuracy and speed requirements of vision pre-positioning for picking under a large field of view and can serve as a reference for the vision pre-positioning of other fruits and vegetables picked under a large field of view.
[1] Cheng Xiangyun, Song Xin. A review of research on vision system of fruit and vegetable picking robot[J]. Journal of Zhejiang Agricultural Sciences, 2019, 60(3): 490-493. (in Chinese with English abstract)
[2] Luo Lufeng, Zou Xiangjun, Cheng Tangcan, et al. Design of virtual test system based on hardware-in-loop for picking robot vision localization and behavior control[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(4): 39-46. (in Chinese with English abstract)
[3] Xiong Juntao, He Zhiliang, Tang Linyue, et al. Visual localization of disturbed grape picking point in non-structural environment[J]. Transactions of the Chinese Society for Agricultural Machinery, 2017, 48(4): 29-33, 81. (in Chinese with English abstract)
[4] Zhu Rongjie, Zhu Yinghui, Wang Ling, et al. Cotton positioning technique based on binocular vision with implementation of scale invariant feature transform algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(6): 182-188. (in Chinese with English abstract)
[5] Ye Min, Zou Xiangjun, Luo Lufeng, et al. Error analysis of dynamic localization tests based on binocular stereo vision on litchi harvesting manipulator[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(5): 50-56. (in Chinese with English abstract)
[6] Zou X, Ye M, Luo C, et al. Fault-tolerant design of a limited universal fruit-picking end-effector based on vision-positioning error[J]. Applied Engineering in Agriculture, 2016, 32(1): 5-18.
[7] Zou X, Zou H, Lu J. Virtual manipulator-based binocular stereo vision positioning system and errors modelling[J]. Machine Vision and Applications, 2012, 23(1): 43-63.
[8] Chen Yan, Zou Xiangjun, Xu Dongfeng, et al. Mechanism design and kinematics simulation of litchi picking manipulator[J]. Journal of Machine Design, 2010, 27(5): 31-34. (in Chinese with English abstract)
[9] Xiong Juntao, Zou Xiangjun, Chen Lijuan, et al. Recognition of mature litchi in natural environment based on machine vision[J]. Transactions of the Chinese Society for Agricultural Machinery, 2011, 42(9): 162-166. (in Chinese with English abstract)
[10] Guo Aixia, Zou Xiangjun, Zhu Mengsi, et al. Color feature analysis and recognition for litchi fruits and their main fruit bearing branch based on exploratory analysis[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(4): 191-198. (in Chinese with English abstract)
[11] Xiong Juntao, Zou Xiangjun, Wang Hongjun, et al. Recognition of ripe litchi in different illumination conditions based on Retinex image enhancement[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(12): 170-178. (in Chinese with English abstract)
[12] Peng Hongxing, Zou Xiangjun, Chen Lijuan, et al. Fast recognition of multiple color targets of litchi image in field environment based on double Otsu algorithm[J]. Transactions of the Chinese Society for Agricultural Machinery, 2014, 45(4): 61-68. (in Chinese with English abstract)
[13] Fu Longsheng, Tola Elkamil, Al-Mallahi Ahmad, et al. A novel image processing algorithm to separate linearly clustered kiwifruits[J]. Biosystems Engineering, 2019, 183: 184-195.
[14] Fu Longsheng, Sun Shipeng, Vázquez-Arellano Manuel, et al. Kiwifruit recognition method at night based on fruit calyx image[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(2): 199-204. (in Chinese with English abstract)
[15] Liang Xifeng, Jin Chaoqi, Ni Meidi, et al. Acquisition and experiment on location information of picking point of tomato fruit clusters[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(16): 163-169. (in Chinese with English abstract)
[16] Li Han, Zhang Man, Gao Yu, et al. Green ripe tomato detection method based on machine vision in greenhouse[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(Supp.1): 328-334, 388. (in Chinese with English abstract)
[17] Van Henten E J, Van Tuijl B A J, Hemming J, et al. Field test of an autonomous cucumber picking robot[J]. Biosystems Engineering, 2003, 86(3): 305-313.
[18] Mehta S S, Burks T F. Vision-based control of robotic manipulator for citrus harvesting[J]. Computers and Electronics in Agriculture, 2014, 102: 146-158.
[19] Xue Jinlin, Yan Jia, Fan Bowen. Classification and identification method of multiple kinds of farm obstacles based on convolutional neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(S1): 35-41. (in Chinese with English abstract)
[20] Zhou Yuncheng, Xu Tongyu, Zheng Wei, et al. Classification and recognition approaches of tomato main organs based on DCNN[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(15): 219-226. (in Chinese with English abstract)
[21] Fu Longsheng, Feng Yali, Elkamil Tola, et al. Image recognition method of multi-cluster kiwifruit in field based on convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(2): 205-211. (in Chinese with English abstract)
[22] Chen Fengjun, Wang Chenghan, Gu Mengmeng, et al. Spruce image segmentation algorithm based on fully convolutional networks[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(12): 188-194. (in Chinese with English abstract)
[23] Han Qiaoling, Zhao Yue, Zhao Yandong, et al. Soil pore segmentation of computed tomography images based on fully convolutional network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(2): 128-133. (in Chinese with English abstract)
[24] Gao Yun, Guo Jiliang, Li Xuan, et al. Instance-level segmentation method for group pig images based on deep learning[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(4): 179-187. (in Chinese with English abstract)
[25] Wang Dandan, He Dongjian. Recognition of apple targets before fruits thinning by robot based on R-FCN deep convolution neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 156-163. (in Chinese with English abstract)
[26] Yang Guoguo, Bao Yidan, Liu Ziyi. Localization and recognition of pests in tea plantation based on image saliency analysis and convolutional neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(6): 156-162. (in Chinese with English abstract)
[27] Bi Song, Gao Feng, Chen Junwen, et al. Detection method of citrus based on deep convolution neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(5): 181-186. (in Chinese with English abstract)
[28] Zhao Dean, Wu Rendi, Liu Xiaoyang, et al. Apple positioning based on YOLO deep convolutional neural network for picking robot in complex background[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 164-173. (in Chinese with English abstract)
[29] Xue Yueju, Huang Ning, Tu Shuqin, et al. Immature mango detection based on improved YOLOv2[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(7): 173-179. (in Chinese with English abstract)
[30] Redmon J, Farhadi A. YOLOv3: An incremental improvement[R]. arXiv, 2018.
[31] Huang G, Liu Z, van der Maaten L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, 2017.
[32] Lin G, Tang Y, Zou X, et al. Fruit detection combined with color, depth, and shape information[J/OL]. Precision Agriculture, 2019. https://doi.org/10.1007/s11119-019-09654-w
[33] Liu Ting, Qin Bing, Zhang Yu. Introduction to Information Retrieval Systems[M]. Beijing: China Machine Press, 2008. (in Chinese)
[34] Wu Shengli, McClean Sally. Lecture Notes in Computer Science[M]. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006.
Vision pre-positioning method for litchi picking robot under large field of view
Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun※, Chen Mingyou
(1. College of Engineering, South China Agricultural University, Guangzhou 510642, China; 2. Key Laboratory of Key Technology on Agricultural Machine and Equipment, Ministry of Education, South China Agricultural University, Guangzhou 510642, China)
The litchi picking robot is an important tool for improving the automation of litchi harvesting, and it needs the spatial position information of litchi clusters in order to pick. To guide the robot to the picking position and improve picking efficiency, a vision pre-positioning method for the litchi picking robot under a large field of view was studied in this paper. Firstly, using a calibrated binocular stereo vision system composed of two industrial cameras, 250 pairs of litchi cluster images under a large field of view were taken in litchi orchards in Guangzhou; the spatial positions of key litchi clusters were recorded with a laser range finder and compared with the results obtained in this paper. To expand the sample size, the original images and the epipolar-rectified images were randomly cropped and scaled within a small range, giving a final data set of 4 000 images; the data set of the object detection network was then annotated with LabelImg. Secondly, drawing on the YOLOv3 network and the DenseNet classification network, and exploiting the single-target, single-scene character of the litchi cluster detection task (orchard environment only), the network structure was optimized: a Dense Module with a depth of 34 layers was designed, and the litchi cluster detection network YOLOv3-DenseNet34 was built on it. Thirdly, because the background under a large field of view is complex, dense stereo matching over the whole image is inefficient and performs poorly, and some litchi clusters cannot appear completely in the common view of both images at the same time; therefore, a matching method with a same-row order-consistency constraint was proposed and a sub-pixel disparity calculation method was designed. By solving the quadratic curve formed by disparity and similarity, the sub-pixel disparity was used to compute the spatial position of each litchi cluster. The performance of the proposed network was tested against the original YOLOv3: the YOLOv3-DenseNet34 network improved both the detection accuracy and the detection speed of litchi clusters, with an mAP (mean average precision) of 0.943, an average detection speed of 22.11 frames/s, and a model size of 9.3 MB, about 1/26 of the original YOLOv3. The detection results of the method were then compared with those of the laser range finder: the maximum absolute error of the pre-positioning at a detection distance of 3 m was 36.602 mm, the mean absolute error was 23.007 mm, and the mean relative error was 0.836%. The test results show that the vision pre-positioning method studied in this paper meets the requirements of vision pre-positioning under a large field of view in both accuracy and speed, and can provide a reference for the vision pre-positioning of other fruit and vegetable picking under a large field of view.
robots; image processing; object detection; litchi picking; large field of view; convolutional neural network; stereo vision
Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun, Chen Mingyou. Vision pre-positioning method for litchi picking robot under large field of view[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(23): 48-54. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2019.23.006 http://www.tcsae.org
Received: 2019-06-30
Revised: 2019-11-11
Supported by the National Natural Science Foundation of China (31571568) and the Natural Science Foundation of Guangdong Province (2018A030307067)
Chen Yan, Associate Professor, research interests: agricultural robots, intelligent agricultural equipment, and intelligent design and manufacturing. Email: cy123@scau.edu.cn
※Corresponding author: Zou Xiangjun, Professor and doctoral supervisor, research interests: agricultural robots and machine vision. Email: xjzou1@163.com
doi: 10.11975/j.issn.1002-6819.2019.23.006
CLC number: TP391.41
Document code: A
Article ID: 1002-6819(2019)-23-0048-07
農(nóng)業(yè)工程學(xué)報(bào)2019年23期