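Measurement of dynamic fish dimension based on stereoscopic vision
Li Yanjun1, Huang Kangwei1,2, Xiang Ji3※
(1. Zhejiang University City College, Hangzhou 310015; 2. College of Control Science and Engineering, Zhejiang University, Hangzhou 310027; 3. College of Electrical Engineering, Zhejiang University, Hangzhou 310027)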
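Manual measurement of the growth status of farmed fish is time-consuming and laborious, and it disturbs the normal growth of the fish. To achieve dynamic perception and rapid non-destructive detection of underwater fish information, this study proposes a method for measuring the dimensions of moving fish with stereo vision. Three-dimensional information is obtained with binocular stereo vision, fish are detected and finely segmented with a Mask Region Convolution Neural Network (Mask-RCNN), and a 3D point cloud of the fish surface is then generated to calculate the dimensions of multiple freely moving fish. Experimental results show that the average relative errors of length and width are about 4.7% and 9.2%, respectively. The method meets the need for visual management and non-contact fish size measurement in aquaculture environments and can provide a reference for graded rearing and rational feeding.
fish; machine vision; three-dimensional reconstruction; image segmentation; deep learning; Mask-RCNN; 3D point cloud processing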
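Fish dimension information, especially body length, reflects both the growth status of individual fish and the characteristics of the population, and is therefore important for aquaculture. To obtain it, the traditional method anesthetizes the fish, removes them from the water, and measures them by hand, which is time-consuming and laborious and affects normal growth [1]. Machine vision is now widely used in aquaculture for weight grading [2], species identification [3], counting [4-5], behavior recognition [6], and freshness inspection [7]. However, because fish swim continuously under water, obtaining fish dimensions by vision without contact remains challenging.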
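Researchers in China and abroad have studied fish size measurement based on machine vision [8-10]. Yu Xinjie et al. [11] built a platform for measuring the morphological parameters of large yellow croaker: a camera captured images of fish placed on a glass plane, and feature points were selected manually to calculate body length and width, giving a measurement error of 0.28%. Monkman et al. [12] located fish with a deep learning model and detected marker blocks placed on the fish surface to relate pixel length to true length and calculate fish length, with a measurement error of about 2.2%. These methods are based on monocular vision and recover three-dimensional information only at a fixed depth; they are generally used to measure fish after capture and are not suitable for processing dynamic fish information under water. Methods based on binocular vision can reconstruct the three-dimensional information of the image and can therefore measure free-swimming fish. Torisawa et al. [13] obtained three-dimensional information from binocular images by direct linear transformation, with a coefficient of variation of the length measurement error below 5%. Muñoz-Benavent et al. [14] designed a geometric model of tuna, extracted image features of the head and tail to initialize the model position, and then located the fish and extracted its contour by model matching; the relative error was about 3% for 90% of the results. Pérez et al. [15] studied both a mechanical fish and free-swimming fish in a laboratory environment: after edge extraction and filtering, fish were detected and their dimensions at different angles were calculated by binocular stereo vision, with a measurement error of about 4%. Constrained by the complexity of the underwater environment and the spatial distribution of fish, binocular measurement of free-swimming fish requires 3D reconstruction, fish localization, contour extraction, and pose estimation on the captured images. These steps improve the generalization ability and scope of application of the algorithm and offer an effective means of non-contact fish size measurement in aquaculture.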
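This study uses deep learning and stereo vision to measure the dimensions of different fish species in different scenes rapidly and non-destructively. An aquaculture monitoring system and a fish size calculation program were developed on a self-built measurement platform. Binocular images are reconstructed in 3D through camera calibration, stereo rectification, and stereo matching. A dataset is built to train a Mask Region Convolution Neural Network (Mask-RCNN) model, which is combined with morphological operations and the GrabCut algorithm to detect and segment fish. Fish surface data are extracted from the 3D information of the segmented fish, coordinate transformations unify the orientation and position of the fish, and body length and width are then calculated. The method offers an approach to the rapid, automatic acquisition of the dimensions of freely moving fish.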
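The test samples were 15 spotted sea bass (花鱸) with body lengths of 112.3 to 141.8 mm, 5 pearl groupers (珍珠石斑魚) with body lengths of 233.0 to 247.0 mm, and 5 bass (鱸魚) with body lengths of 245.0 to 290.0 mm. Two scenarios were designed, "multiple fish in one tank" and "one fish in one tank", using a round barrel 1 m in diameter and 1 m high and a rectangular culture box of 880 mm×630 mm×650 mm as containers. The "multiple fish in one tank" scenario, in which several fish share one culture barrel, resembles a real aquaculture environment; it was used to acquire images for the dataset and to verify the detection and segmentation performance of the deep learning model. In the "one fish in one tank" scenario, a single fish whose dimensions had been measured manually was placed alone in the culture box to verify the accuracy of the size measurement algorithm.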
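The self-developed fish dimension measurement system is shown in Fig. 1. The binocular camera is placed in a waterproof housing and transmits underwater video of the fish over a USB cable. The binocular camera has a resolution of 2 560×720 pixels, a video frame rate of 30 Hz, and a baseline length of 6 cm. A Raspberry Pi pushes the video stream. The cloud server is equipped with four Nvidia GTX 1080 Ti graphics cards to provide the computing power for deep learning. The software handles data acquisition, transmission, computation, and result output. The measurement algorithm is written in Python; image operations use the OpenCV computer vision library, and the deep learning model is built and trained with the Keras framework on TensorFlow.
Fig.1 Fish dimension measurement system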
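Unlike existing machine-vision methods for fish size measurement, this study measures fish dimensions with stereo vision [16] and instance segmentation [17-19], makes full use of the three-dimensional information in the images, and is suitable for estimating the dimensions of multiple fish of different species in a real aquaculture environment. In addition, the pose of each point cloud is calculated by plane fitting and ellipse fitting, so fish in different poses can be measured. The algorithm consists of three parts: 3D reconstruction, fish detection and segmentation, and 3D point cloud processing.
1.2.1 3D reconstruction
To obtain three-dimensional information from binocular images, the scene must be reconstructed in 3D. The binocular camera is first calibrated [20]; stereo rectification then brings the images into standard row alignment; finally, stereo matching on the rectified images gives the three-dimensional coordinates of the object point corresponding to each pixel, which completes the 3D reconstruction.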
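First, Zhang's calibration method [21-22] is used to obtain the intrinsic matrices of the two cameras and the relative position between them. The intrinsic matrix $K$ relates the camera-frame coordinates $(x, y, z)$ of an object point to the pixel position $(u, v)$ of its image point, as shown in Eq. (1):

$$ z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \quad K = \begin{bmatrix} f_x & s & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{1} $$

where $(u_0, v_0)$ are the pixel-plane coordinates of the camera principal point; $f_x$ and $f_y$ are the scale factors along the $u$ and $v$ axes; and $s$ reflects the skew between the $u$ and $v$ axes of the pixel plane.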
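The projection in Eq. (1) can be sketched in a few lines of NumPy. The function name `project_point` and all numeric intrinsics below are hypothetical values chosen for illustration; they are not the calibration results of this study.

```python
import numpy as np

def project_point(K, point_cam):
    """Project a 3D point given in camera coordinates (mm) to pixel coordinates."""
    p = K @ np.asarray(point_cam, dtype=float)  # equals z * (u, v, 1)
    return p[:2] / p[2]                         # divide by depth z to get (u, v)

# Assumed intrinsics: f_x = f_y = 1000 px, zero skew, principal point (640, 360)
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

uv = project_point(K, [50.0, -20.0, 500.0])
print(uv)  # pixel position of a point 500 mm in front of the camera
```

With these assumed values, a point 50 mm to the right of the optical axis at a depth of 500 mm lands 100 px to the right of the principal point, since 1000 × 50 / 500 = 100.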
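Next, a stereo rectification algorithm [23] makes the two imaging planes coplanar and row-aligned (standard row alignment). The rectified binocular camera model is shown in Fig. 2.
Note: $p_{left}$ and $p_{right}$ are the left and right imaging planes of the binocular camera; $P$ is the object point; $z$ is the depth of the object point, mm; $(u_l, v_l)$ and $(u_r, v_r)$ are the coordinates of the image points of the object point on the two imaging planes; $O_{left}$ and $O_{right}$ are the principal points of the left and right cameras; $t$ is the baseline length of the binocular camera, mm; $f$ is the focal length after stereo rectification.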
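From the model in Fig. 2, the depth $z$ of the object point $P$ is calculated by Eq. (2):

$$ z = \frac{f t}{d} \tag{2} $$

where the disparity $d = u_l - u_r$ is the difference between the abscissas of the image points $(u_l, v_l)$ and $(u_r, v_r)$ of the object point on the left and right imaging planes; $t$ is the baseline length of the binocular camera, mm; and $f$ is the focal length after stereo rectification.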
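As a worked instance of Eq. (2), take the 60 mm baseline of the camera used here together with an assumed rectified focal length of 1000 px and an assumed disparity of 100 px (both illustrative values, not calibration results):

$$ z = \frac{f t}{d} = \frac{1000 \times 60}{100} = 600\ \text{mm} $$

Halving the disparity to 50 px doubles the depth to 1 200 mm. Because depth is inversely proportional to disparity, a one-pixel disparity error matters more for distant fish than for nearby ones.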
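The Semi-Global Block Matching (SGBM) algorithm [24-25] is used to obtain disparity values, and stereo matching yields a disparity map [26]. During stereo matching, no disparity value is obtained for pixels whose matching point cannot be found within the search range because of occlusion, noise, uniform background, or excessively large disparity. The three-dimensional coordinates of an object point are calculated from the image coordinates of the corresponding image point and its disparity, as shown in Eq. (3):

$$ w \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = Q \begin{bmatrix} u \\ v \\ d \\ 1 \end{bmatrix} \tag{3} $$

where $(x, y, z)$ are the coordinates of the object point; $w$ is a scale factor; $(u, v)$ are the coordinates of the corresponding image point in the left image; and the transformation matrix $Q$ is given by Eq. (4), with $(u_0, v_0)$ the principal point of the rectified left image:

$$ Q = \begin{bmatrix} 1 & 0 & 0 & -u_0 \\ 0 & 1 & 0 & -v_0 \\ 0 & 0 & 0 & f \\ 0 & 0 & 1/t & 0 \end{bmatrix} \tag{4} $$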
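A minimal sketch of this reprojection step, assuming the disparity convention d = u_l − u_r and a rectified pair that shares one focal length and principal point. The helper names `make_q` and `reproject` and all numeric values are hypothetical; pixels without a valid disparity are returned as NaN rather than discarded.

```python
import numpy as np

def make_q(f, u0, v0, t):
    """Reprojection matrix for a rectified pair (f in px, baseline t in mm)."""
    return np.array([[1.0, 0.0, 0.0, -u0],
                     [0.0, 1.0, 0.0, -v0],
                     [0.0, 0.0, 0.0, f],
                     [0.0, 0.0, 1.0 / t, 0.0]])

def reproject(Q, u, v, d):
    """Return an N x 3 array of (x, y, z) in mm and a validity mask."""
    u, v, d = (np.ravel(np.asarray(a, dtype=float)) for a in (u, v, d))
    pix = np.stack([u, v, d, np.ones_like(d)], axis=0)  # 4 x N homogeneous pixels
    X = Q @ pix                                         # rows: w*x, w*y, w*z, w
    valid = d > 0                                       # no match -> no 3D point
    pts = np.full((3, d.size), np.nan)
    pts[:, valid] = X[:3, valid] / X[3, valid]          # divide by scale factor w
    return pts.T, valid

# Assumed values: f = 1000 px, principal point (640, 360), baseline 60 mm
Q = make_q(1000.0, 640.0, 360.0, 60.0)
pts, valid = reproject(Q, [740, 640], [320, 360], [100, 0])
print(pts, valid)
```

The first pixel has a disparity of 100 px and therefore a depth of 1000 × 60 / 100 = 600 mm, in agreement with the depth relation for a rectified pair; the second pixel has no disparity and stays undefined.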
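1.2.2 Detection and segmentation
To measure the dimensions of multiple fish, object detection is needed to locate each fish in the image and obtain its segmentation, so that the 3D point cloud of the fish can be extracted from the 3D reconstruction.
Compared with traditional image segmentation methods, deep-learning-based detection and segmentation are less sensitive to the application environment. This study uses a Mask-RCNN [27] network for fish detection and segmentation, which outputs object bounding boxes and segmentation results. A dataset of 3 712 manually annotated images was built, with 2 662 images in the training set and 750 images in the validation set. Only fish that are complete and unoccluded, with both the base of the tail and the tip of the snout inside the image, are annotated.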
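Some Mask-RCNN segmentation results deviate near the edge of the fish body, so the GrabCut [28-29] interactive segmentation algorithm is used to refine them (Fig. 3). First, the initial segmentation is eroded and the pixels that remain are marked as foreground. Next, the initial segmentation is dilated and the pixels left over inside the bounding box enlarged by a factor of 1.1 are marked as background. Finally, GrabCut uses these labels to segment the remaining unlabeled pixels near the fish edge. Fig. 3 shows that GrabCut effectively improves the segmentation results.
Fig.3 Segmentation results refined by the GrabCut interactive segmentation algorithm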
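The labelling step that precedes GrabCut can be sketched with NumPy alone. The function names, the square structuring element of radius `r`, and the choice to initialise the uncertain band from the original mask are assumptions of this sketch; the label values 0 to 3 follow the mask convention used by OpenCV's GrabCut.

```python
import numpy as np

# Label values of OpenCV GrabCut masks: sure BG, sure FG, probable BG, probable FG
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

def _shifts(mask, r):
    """Stack every translation of a boolean mask inside a (2r+1) x (2r+1) window."""
    h, w = mask.shape
    p = np.pad(mask, r, mode="constant", constant_values=False)
    return np.stack([p[i:i + h, j:j + w]
                     for i in range(2 * r + 1) for j in range(2 * r + 1)])

def erode(mask, r):
    return _shifts(mask, r).all(axis=0)

def dilate(mask, r):
    return _shifts(mask, r).any(axis=0)

def build_trimap(mask, box, r=2, scale=1.1):
    """mask: initial boolean segmentation; box: (x0, y0, x1, y1) bounding box."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    inner, outer = erode(mask, r), dilate(mask, r)
    trimap = np.full((h, w), GC_BGD, dtype=np.uint8)  # outside dilated mask: background
    band = outer & ~inner                             # uncertain pixels near the edge
    trimap[band & mask] = GC_PR_FGD
    trimap[band & ~mask] = GC_PR_BGD
    trimap[inner] = GC_FGD                            # eroded mask: sure foreground
    # Enlarge the bounding box about its centre and clip it to the image
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    hw, hh = scale * (x1 - x0) / 2.0, scale * (y1 - y0) / 2.0
    xa, ya = max(int(np.floor(cx - hw)), 0), max(int(np.floor(cy - hh)), 0)
    xb, yb = min(int(np.ceil(cx + hw)), w), min(int(np.ceil(cy + hh)), h)
    return trimap[ya:yb, xa:xb], (xa, ya, xb, yb)

demo = np.zeros((20, 20), dtype=bool)
demo[5:15, 5:15] = True                               # toy "fish" mask
tri, roi = build_trimap(demo, (5, 5, 15, 15))
print(roi, np.unique(tri))
```

The cropped trimap and the matching image crop could then be handed to `cv2.grabCut` in `cv2.GC_INIT_WITH_MASK` mode, which re-labels only the probable pixels.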
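To evaluate the deep learning model, precision and recall are used to assess detection performance, as shown in Eqs. (5) and (6):

$$ Precision = \frac{TP}{TP + FP} \tag{5} $$

$$ Recall = \frac{TP}{TP + FN} \tag{6} $$

where TP is the number of true positives, i.e., detections that can be matched to an annotated box; FP is the number of false positives, so TP+FP is the number of objects detected by the model; and FN is the number of false negatives, so TP+FN is the number of annotated objects.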
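The two detection metrics can be written out directly. The counts used below are hypothetical and serve only to show the arithmetic; they are not the counts behind the results reported in this paper.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts: 84 matched detections, 12 spurious detections, 16 missed fish
p, r = precision_recall(tp=84, fp=12, fn=16)
print(f"precision={p:.3f}, recall={r:.3f}")
```

Raising the confidence threshold usually removes false positives faster than true positives, so precision tends to rise while recall falls.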
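Segmentation is evaluated by the mean Intersection Over Union (mIOU) [30]. The Intersection Over Union (IOU) is the ratio of the number of pixels in the intersection of the model segmentation and the annotation to the number of pixels in their union, and mIOU is the mean IOU over all predictions.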
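A short sketch of the IOU and mIOU computation on boolean masks. The function names and the toy 4×4 masks are illustrative assumptions.

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection over union of two boolean masks of equal shape."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    return float(np.logical_and(pred, gt).sum() / union) if union else 0.0

def mean_iou(pairs):
    """Mean IOU over (prediction, annotation) mask pairs."""
    return float(np.mean([mask_iou(p, g) for p, g in pairs]))

a = np.zeros((4, 4), dtype=bool)
a[0:2, :] = True        # top two rows, 8 pixels
b = np.zeros((4, 4), dtype=bool)
b[:, 0:2] = True        # left two columns, 8 pixels
print(mask_iou(a, b), mean_iou([(a, a), (a, b)]))
```

The two toy masks share 4 pixels and cover 12 in total, so their IOU is 1/3.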
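1.2.3 3D point cloud processing
Based on the detection and segmentation results, the 3D point cloud of each fish is extracted from the 3D reconstruction. Because the position and angle of a detected fish are random, the point cloud must be normalized by a series of coordinate transformations. Two 3D coordinate transformations move the center of the fish body to the origin and align its length, width, and thickness directions with the three coordinate axes. Body length and width are then calculated from the extent of the point cloud along the abscissa and ordinate axes of the new coordinate system. The transformation procedure is shown in Fig. 4.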
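The fish contour is extracted from the segmentation result by an erosion operation, and a plane is fitted to the 3D point cloud of the contour (Fig. 4a). Let the plane equation be $a_1 x + a_2 y + a_3 z + 1 = 0$, where $a_1$, $a_2$, and $a_3$ are the fitting coefficients. Solving for the three coefficients by the least squares method gives the unit normal vector $\boldsymbol{n}$ of the plane containing the fish contour, as shown in Eq. (7):

$$ \boldsymbol{n} = \frac{(a_1, a_2, a_3)^T}{\sqrt{a_1^2 + a_2^2 + a_3^2}} \tag{7} $$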
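The plane fit can be sketched with `numpy.linalg.lstsq`, solving the overdetermined system P a = −1 for the three coefficients. The function name `fit_plane` and the sample points are assumptions made for illustration.

```python
import numpy as np

def fit_plane(points):
    """Least-squares fit of a1*x + a2*y + a3*z + 1 = 0 to N x 3 points.

    Returns the coefficient vector a and the unit normal n = a / |a|.
    """
    P = np.asarray(points, dtype=float)
    a, *_ = np.linalg.lstsq(P, -np.ones(len(P)), rcond=None)
    return a, a / np.linalg.norm(a)

# Toy contour points lying on the plane z = 500 mm (facing the camera)
pts = np.array([[0, 0, 500], [100, 0, 500], [0, 100, 500],
                [100, 100, 500], [50, 20, 500]], dtype=float)
a, n = fit_plane(pts)
print(a, n)
```

This parametrisation cannot represent a plane through the origin of the camera frame, which is not a practical restriction here because the fish are always in front of the camera.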
Note: The rectangle is the plane fitted to the contour point cloud; the $x$, $y$, and $z$ axes are the coordinate axes before the first transformation, mm; $o'$ and the $x'$, $y'$, and $z'$ axes are the origin and axes after the first transformation, mm; the ellipse is fitted to the projection of the contour point cloud on the fitted plane; $o''$ and the $x''$, $y''$, and $z''$ axes are the origin and axes after the second transformation, mm.
Fig.4 Procedure of coordinate transformations
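All 3D points $(x, y, z)^T$ in the original coordinate system are transformed so that the fitted plane becomes the $x'y'$ plane of the new coordinate system (Fig. 4a). The transformed coordinates $(x', y', z')^T$ are related to the original coordinates $(x, y, z)^T$ by Eq. (8):

$$ \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} \boldsymbol{e}_x^T \\ \boldsymbol{e}_y^T \\ \boldsymbol{n}^T \end{bmatrix} \left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} - o' \right) \tag{8} $$

where $o'$ is any point on the fitted plane $x'y'$, $\boldsymbol{e}_x$ is a unit vector in that plane, and the cross product of $\boldsymbol{n}$ and $\boldsymbol{e}_x$ gives the unit vector $\boldsymbol{e}_y$.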
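A sketch of this first transformation, assuming the in-plane unit vector is built by projecting a coordinate axis onto the fitted plane and the second in-plane axis is taken as n × e_x so that the new frame is right-handed. The function names and this particular way of choosing e_x are assumptions; any unit vector in the plane would serve.

```python
import numpy as np

def plane_basis(n):
    """Return orthonormal (e_x, e_y, n) with e_x, e_y lying in the plane normal to n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    helper = np.eye(3)[np.argmin(np.abs(n))]  # coordinate axis least aligned with n
    e_x = helper - (helper @ n) * n           # project the helper axis onto the plane
    e_x /= np.linalg.norm(e_x)
    e_y = np.cross(n, e_x)                    # completes a right-handed frame
    return e_x, e_y, n

def to_plane_frame(points, origin, n):
    """Express N x 3 points in the frame whose x'y' plane is the fitted plane."""
    R = np.stack(plane_basis(n))              # rows are the new axes
    return (np.asarray(points, dtype=float) - np.asarray(origin, dtype=float)) @ R.T

# A point on the plane x + y + z = 1, expressed relative to another point on it
q = to_plane_frame([[0.0, 1.0, 0.0]], [1.0, 0.0, 0.0], [1.0, 1.0, 1.0])
print(q)  # the z' component vanishes for points on the plane
```

After this step the third coordinate measures signed distance from the fitted plane, so the contour can be treated as a 2D curve for the ellipse fit.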
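A rotated ellipse is fitted to the projection of the contour points of a fish on the plane (Fig. 4b). The rotated ellipse is written as the general conic $A x'^2 + B x' y' + C y'^2 + D x' + E y' + F = 0$, and the coefficients $A$ to $F$ are obtained by an ellipse fitting method [31-32]. The center coordinates $(x_0, y_0)$ of the rotated ellipse and the tilt angle of its major axis ($\theta$, rad) are given by Eqs. (9) to (11):

$$ x_0 = \frac{2CD - BE}{B^2 - 4AC} \tag{9} $$

$$ y_0 = \frac{2AE - BD}{B^2 - 4AC} \tag{10} $$

$$ \theta = \frac{1}{2} \arctan \frac{B}{A - C} \tag{11} $$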
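This gives the center of the rotated ellipse $o'' = (x_0, y_0, 0)^T$, the major-axis direction vector $\boldsymbol{a} = (\sin\theta, -\cos\theta, 0)^T$, and the minor-axis direction vector $\boldsymbol{b} = (\cos\theta, -\sin\theta, 0)^T$.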
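The pose of a fitted ellipse can be recovered from its conic coefficients as sketched below. The center follows from setting the gradient of the conic to zero. For the orientation, this sketch substitutes an eigen-decomposition of the quadratic form for a closed-form arctangent, because the eigenvector approach identifies the major axis without quadrant ambiguity. The function name and the test coefficients are assumptions, and the conic fit itself is not shown.

```python
import numpy as np

def ellipse_pose(A, B, C, D, E):
    """Centre and major-axis angle of A x^2 + B xy + C y^2 + D x + E y + F = 0.

    F does not affect centre or orientation, so it is not needed here.
    The angle is measured from the x axis and returned in [0, pi).
    """
    den = B * B - 4.0 * A * C
    if den >= 0:
        raise ValueError("coefficients do not describe an ellipse")
    x0 = (2.0 * C * D - B * E) / den
    y0 = (2.0 * A * E - B * D) / den
    M = np.array([[A, B / 2.0], [B / 2.0, C]])   # quadratic-form matrix
    w, V = np.linalg.eigh(M)
    major = V[:, np.argmin(np.abs(w))]           # smallest |eigenvalue| -> longest axis
    theta = float(np.arctan2(major[1], major[0]) % np.pi)
    return (x0, y0), theta

# (x - 1)^2 / 4 + (y - 2)^2 = 1 expanded: x^2 + 4y^2 - 2x - 16y + 13 = 0
centre, theta = ellipse_pose(1.0, 0.0, 4.0, -2.0, -16.0)
print(centre, theta)
```

For the axis-aligned test ellipse the center is (1, 2) and the major axis lies along the x axis.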
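All 3D points $(x', y', z')^T$ of the point cloud of this fish are transformed into a new coordinate system whose origin is the ellipse center $o''$, whose $x''$ axis is the major-axis direction of the ellipse, whose $y''$ axis is the minor-axis direction, and whose $z''$ axis is the direction of the plane normal. The transformation is given by Eq. (12):

$$ \begin{bmatrix} x'' \\ y'' \\ z'' \end{bmatrix} = \begin{bmatrix} \boldsymbol{a}^T \\ \boldsymbol{b}^T \\ \boldsymbol{k}^T \end{bmatrix} \left( \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} - o'' \right) \tag{12} $$

where $\boldsymbol{a}$ and $\boldsymbol{b}$ are the major-axis and minor-axis direction vectors of the ellipse and $\boldsymbol{k} = (0, 0, 1)^T$.
The length and width of the fish are then calculated from the extent of the point cloud along the $x''$ axis and the $y''$ axis, respectively.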
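The second transformation and the extent calculation can be sketched as follows. The sketch assumes the major-axis angle θ is measured from the x′ axis, so that the axis vectors are (cos θ, sin θ, 0) and (−sin θ, cos θ, 0); this convention and the function name `fish_size` are assumptions of the example and may differ from the parametrisation used in the text.

```python
import numpy as np

def fish_size(points_plane, centre, theta):
    """Length and width (mm) of a point cloud given in the fitted-plane frame.

    points_plane: N x 3 array; centre: ellipse centre (x0, y0);
    theta: major-axis angle from the x' axis, rad (assumed convention).
    """
    a = np.array([np.cos(theta), np.sin(theta), 0.0])    # assumed major-axis direction
    b = np.array([-np.sin(theta), np.cos(theta), 0.0])   # assumed minor-axis direction
    k = np.array([0.0, 0.0, 1.0])                        # plane normal in this frame
    R = np.stack([a, b, k])                              # rows are the new axes
    o = np.array([centre[0], centre[1], 0.0])
    q = (np.asarray(points_plane, dtype=float) - o) @ R.T
    length = q[:, 0].max() - q[:, 0].min()               # extent along the major axis
    width = q[:, 1].max() - q[:, 1].min()                # extent along the minor axis
    return float(length), float(width)

# Toy cloud: corners of a 120 mm x 30 mm rectangle rotated by 30 degrees about (5, 7)
th = np.pi / 6
ax = np.array([np.cos(th), np.sin(th), 0.0])
bx = np.array([-np.sin(th), np.cos(th), 0.0])
cloud = np.array([np.array([5.0, 7.0, dz]) + sx * 60.0 * ax + sy * 15.0 * bx
                  for sx, sy, dz in [(1, 1, 2.0), (1, -1, 0.0), (-1, 1, -1.0), (-1, -1, 3.0)]])
print(fish_size(cloud, (5.0, 7.0), th))
```

Because the extents are taken as maximum minus minimum, missing points at the snout, tail, or edges of the segmentation shorten the result directly.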
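In summary, the overall framework of the stereo-vision method for measuring the dimensions of moving fish is shown in Fig. 5.
Note: Mask-RCNN is a convolutional neural network for instance segmentation.
Using the camera parameters and the trained Mask-RCNN model, the proposed method extracts the 3D point cloud of each fish and calculates its length and width by coordinate transformation. The results are compared with manual measurements to verify the measurement accuracy.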
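The test results in Fig. 6 show that the relative errors of the method in measuring fish length and width are 4.7% and 9.2%, respectively. The calculated values are also lower than the manual measurements. Analysis indicates that the width error is mainly caused by parts of the segmentation missing near the edges, and the length error is mainly caused by bending of the fish body or by missing parts of the segmentation.
Note: The abscissa of the box plot is the fish number; each box corresponds to several calculation results for the same fish; the central rectangle of a box represents the interquartile range of the results; the horizontal line inside the rectangle represents the median; the top and bottom horizontal lines represent the maximum and minimum of the results.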
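The error statistic can be illustrated as below, assuming the average relative error is the mean of |estimate − manual value| / manual value over the repeated estimates of one fish. The definition, the function name, and the numbers are assumptions for illustration, not data from the experiment.

```python
import numpy as np

def mean_relative_error(estimates, truth):
    """Mean of |estimate - truth| / truth over repeated estimates of one fish."""
    e = np.asarray(estimates, dtype=float)
    return float(np.mean(np.abs(e - truth) / truth))

# Hypothetical length estimates (mm) for a fish measured manually at 120 mm
err = mean_relative_error([118.0, 122.0, 114.0], 120.0)
print(f"{100 * err:.2f}%")
```

A signed mean of (estimate − truth) / truth would additionally expose a systematic underestimate, which the absolute value hides.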
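To verify the performance of the deep learning model, the confidence threshold was set to 0.9. Eqs. (5) and (6) give a precision of 88% and a recall of 84%. After fine segmentation with GrabCut, the mIOU rose from 78% to 81%. The image processing speed was 2.3 Hz. These results show that the trained Mask-RCNN network detects fish well and that the segmentation refinement based on morphological operations and the GrabCut algorithm effectively improves segmentation accuracy.
The average relative errors of size measurement for different fish species are listed in Table 1. The results show that the deep learning model generalizes well and that the method is applicable to size calculation for different fish species.
Table 1 Dimension calculation results for different fish species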
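The angle between the fitted plane and the $xy$ plane of the camera coordinate system represents the angle between the swimming direction of the fish and the imaging plane. The average relative measurement error of the 3D point clouds and the frequency distribution of the angle were computed for each angle interval; the results are shown in Fig. 7.
Fig.7 Histograms of the average relative error of length, the average relative error of width, and the frequency distribution of the angle at different angles
Fig. 7 shows that, with the dataset annotation method used in this study, the angle between the fitted contour plane and the imaging plane lies within 0 to 40° for 85% of the point clouds, and the results in this range are more accurate. Above 40°, the average relative error of the measurement increases considerably. The stereo-vision measurement algorithm is therefore suitable for fish at angles within 40°, and an angle condition can be set to screen the point clouds further and improve the accuracy of the algorithm.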
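Such an angle filter can be sketched by noting that the angle between the fitted plane and the imaging plane equals the angle between the plane normal and the optical (z) axis of the rectified camera frame. The function names and the way the 40° limit is applied as a hard threshold are assumptions of this example.

```python
import numpy as np

def plane_angle_deg(n):
    """Angle in degrees between a plane with normal n and the imaging plane."""
    n = np.asarray(n, dtype=float)
    c = abs(n[2]) / np.linalg.norm(n)   # |cos| makes the sign of n irrelevant
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))

def keep_measurement(n, max_angle_deg=40.0):
    """Accept a point cloud only if its fitted plane is tilted by at most the limit."""
    return plane_angle_deg(n) <= max_angle_deg

print(plane_angle_deg([0.0, 0.0, -1.0]), plane_angle_deg([1.0, 0.0, 1.0]))
print(keep_measurement([0.2, 0.0, 1.0]), keep_measurement([1.0, 0.0, 1.0]))
```

A fish swimming parallel to the imaging plane gives an angle of 0°, while a normal tilted equally toward x and z gives 45° and would be rejected.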
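This study developed and built a fish farming monitoring system based on an underwater binocular camera and proposed a method for measuring the dimensions of moving fish based on stereo vision and deep learning. The method measures multiple fish of different species, sizes, and poses without affecting their free movement. The conclusions are as follows:
1) A total of 3 712 underwater fish images were collected and annotated with a polygon annotation tool to build a fish segmentation dataset for aquaculture environments, and a Mask Region Convolution Neural Network (Mask-RCNN) model was trained for fish detection and segmentation. On the validation set, the model reached a precision of 88% and a recall of 84%. Refinement near the edges with the GrabCut interactive segmentation algorithm raised the mean Intersection Over Union (mIOU) of the segmentation from 78% to 81%, improving segmentation accuracy.
2) A plane is fitted to the 3D point cloud of the fish contour, and the contour point cloud is projected onto the fitted plane for rotated ellipse fitting to obtain the position and pose of the fish. The point cloud is then transformed into a coordinate system whose axes are the length, width, and thickness directions, so that fish at different angles can be measured. The measurements show that accuracy is higher for fish whose fitted plane makes an angle of less than 40° with the imaging plane.
3) Compared with manual measurement, the average relative error is 4.7% for length and 9.2% for width, and the calculation speed is 2 Hz. The proposed method for measuring swimming fish under water is accurate, fast, and generalizes well, and the average relative error is lower for larger fish. It provides a feasible method for non-contact size measurement of swimming fish in aquaculture.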
[1] Maule A G, Tripp R A, Kaattari S L, et al. Stress alters immune function and disease resistance in chinook salmon (Oncorhynchus tshawytscha)[J]. Journal of Endocrinology, 1989, 120(1): 135-142.
[2] Zhang Zhiqiang, Niu Zhiyou, Zhao Siming, et al. Weight grading of freshwater fish based on computer vision[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2011, 27(2): 350-354. (in Chinese with English abstract)
[3] Rathi D, Jain S, Indu S. Underwater fish species classification using convolutional neural network and deep learning[C]//Ninth International Conference on Advances in Pattern Recognition (ICAPR), Bangalore, India, 2017.
[4] Aliyu I, Gana K J, Musa A A, et al. A proposed fish counting algorithm using digital image processing technique[J]. Abubakar Tafawa Balewa University Journal of Science, Technology and Education, 2017, 5(1): 1-11.
[5] Zhang Song, Yang Xinting, Wang Yizhong, et al. Automatic fish population counting by machine vision and a hybrid deep neural network model[J]. Animals, 2020, 10(2): 364-381.
[6] Zhang Jialin, Xu Lihong, Liu Shijing. Classification of Atlantic salmon feeding behavior based on underwater machine vision[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(13): 158-164. (in Chinese with English abstract)
[7] Issac A, Dutta M K, Sarkar B. Computer vision based method for quality and freshness check for fish from segmented gills[J]. Computers and Electronics in Agriculture, 2017, 139: 10-21.
[8] Duan Yan'e, Li Daoliang, Li Zhenbo, et al. Review on visual characteristic measurement research of aquatic animals based on computer vision[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(15): 1-11. (in Chinese with English abstract)
[9] Hao Mingming, Yu Helong, Li Daoliang. The measurement of fish size by machine vision-a review[C]//International Conference on Computer and Computing Technologies in Agriculture, Beijing, China, 2015.
[10] Saberioon M, Gholizadeh A, Cisar P, et al. Application of machine vision systems in aquaculture with emphasis on fish: State-of-the-art and key issues[J]. Reviews in Aquaculture, 2017, 9(4): 369-387.
[11] Yu Xinjie, Wu Xiongfei, Wang Jianping, et al. Rapid detecting method for Pseudosciaena crocea morphological parameters based on the machine vision[J]. Journal of Integration Technology, 2014(5): 45-51. (in Chinese with English abstract)
[12] Monkman G G, Hyder K, Kaiser M J, et al. Using machine vision to estimate fish length from images using regional convolutional neural networks[J]. Methods in Ecology and Evolution, 2019, 10(12): 2045-2056.
[13] Torisawa S, Kadota M, Komeyama K, et al. A digital stereo-video camera system for three-dimensional monitoring of free-swimming Pacific bluefin tuna, Thunnus orientalis, cultured in a net cage[J]. Aquatic Living Resources, 2011, 24(2): 107-112.
[14] Muñoz-Benavent P, Andreu-García G, Valiente-González J M, et al. Enhanced fish bending model for automatic tuna sizing using computer vision[J]. Computers and Electronics in Agriculture, 2018, 150: 52-61.
[15] Pérez D, Ferrero F J, Alvarez I, et al. Automatic measurement of fish size using stereo vision[C]//2018 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Houston, USA, 2018.
[16] Tippetts B, Lee D J, Lillywhite K, et al. Review of stereo vision algorithms and their suitability for resource-limited systems[J]. Journal of Real-Time Image Processing, 2016, 11(1): 5-25.
[17] Lin Tsungyi, Maire M, Belongie S, et al. Microsoft COCO: Common objects in context[C]//European Conference on Computer Vision, New York, USA, 2014.
[18] Garcia R, Prados R, Quintana J, et al. Automatic segmentation of fish using deep learning with application to fish size measurement[J]. International Council for the Exploration of the Sea Journal of Marine Science, 2020, 77(4): 1354-1366.
[19] Deng Ying, Wu Huarui, Zhu Huaji. Recognition and counting of citrus flowers based on instance segmentation[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(7): 200-207. (in Chinese with English abstract)
[20] Yang Shoubo, Gao Yang, Liu Zhen, et al. A calibration method for binocular stereo vision sensor with short-baseline based on 3D flexible control field[J/OL]. Optics and Lasers in Engineering, 2019, 124, [2019-08-20], https://doi.org/10.1016/j.optlaseng.2019.105817.
[21] Zhang Zhengyou. Flexible camera calibration by viewing a plane from unknown orientations[C]// Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 1999.
[22] Chi Dexia, Wang Yang, Ning Liqun, et al. Experimental research of camera calibration based on Zhang's method[J]. Journal of Chinese Agricultural Mechanization, 2015, 36(2): 287-289. (in Chinese with English abstract)
[23] Fetić A, Jurić D, Osmanković D. The procedure of a camera calibration using camera calibration toolbox for MATLAB[C]//2012 Proceedings of the 35th International Convention MIPRO, Opatija, Croatia, 2012.
[24] Hirschmüller H. Stereo processing by semi-global matching and mutual information[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2007, 30(2): 328-341.
[25] Lee Y, Park M G, Hwang Y, et al. Memory-efficient parametric semiglobal matching[J]. IEEE Signal Processing Letters, 2017, 25(2): 194-198.
[26] Ttofis C, Kyrkou C, Theocharides T. A hardware-efficient architecture for accurate real-time disparity map estimation[J]. Association for Computing Machinery Transactions on Embedded Computing Systems (TECS), 2015, 14(2): 1-26.
[27] He Kaiming, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.
[28] Rother C, Kolmogorov V, Blake A. GrabCut: Interactive foreground extraction using iterated graph cuts[J]. Association for Computing Machinery Transactions on Graphics (TOG), 2004, 23(3): 309-314.
[29] Li Yubing, Zhang Jinbo, Gao Pengs, et al. Grab cut image segmentation based on image region[C]//2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC), Chongqing, China, 2018.
[30] Wei Yunchao, Feng Jiashi, Liang Xiaodan, et al. Object region mining with adversarial erasing: A simple classification to semantic segmentation approach[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017.
[31] Fitzgibbon A, Pilu M, Fisher R B. Direct least square fitting of ellipses[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, 21(5): 476-480.
[32] Cho M. Performance comparison of two ellipse fitting-based cell separation algorithms[J]. Journal of Information and Communication Convergence Engineering, 2015, 13(3): 215-219.
Measurement of dynamic fish dimension based on stereoscopic vision
Li Yanjun1, Huang Kangwei1,2, Xiang Ji3※
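(1. Zhejiang University City College, Hangzhou 310015, China; 2. College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China; 3. College of Electrical Engineering, Zhejiang University, Hangzhou 310027, China)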
Fish dimension information, especially length, is important in aquaculture, where it supports grading and the design of feeding strategies. To obtain accurate size information, the traditional measurement method takes the fish out of the water, which is time-consuming and laborious and may affect the growth rate of the fish. This study proposed a dynamic measurement method for fish body dimensions based on stereo vision, which calculates the dimensions of multiple fish simultaneously without restricting their movement. The method was implemented and verified on a self-designed intelligent monitoring system built with attention to hardware compatibility and overall performance. Through this system, videos of underwater fish were captured and uploaded to a remote cloud server for further processing. Three main procedures were developed, namely 3D reconstruction, fish detection and segmentation, and 3D point cloud processing, to acquire the size of fish swimming freely in a real aquaculture environment. In the 3D reconstruction part, 3D information was recovered from binocular images by camera calibration, stereo rectification, and stereo matching in sequence. Firstly, the binocular camera was calibrated with a chessboard to obtain the camera parameters, including the intrinsic matrices and the relative translation and rotation between the left and right cameras. Then, the captured binocular images were rectified to be row-aligned according to the calibrated parameters. Finally, stereo matching based on the semi-global block matching (SGBM) method was applied to the rectified image pairs to extract 3D information and complete the 3D reconstruction.
In the fish detection and segmentation part, a Mask Region Convolution Neural Network (Mask-RCNN) was trained to locate each fish in the image with a bounding box and to extract the fish pixels inside each bounding box as a raw segmentation. The raw segmentation was refined with an interactive segmentation method called GrabCut, combined with morphological processing, to correct the bias around the edges. In the 3D point cloud processing part, two coordinate transformations were carried out to unify the point clouds of fish with various locations and orientations. The transformation parameters were calculated from a three-dimensional plane fit of the contour point cloud and a rotated ellipse fit of the transformed point cloud, respectively. After transformation, the length and width directions of the fish point cloud were parallel to the coordinate axes, so the length and width of the fish were the extents of the point cloud along the abscissa and ordinate axes. Experiments were conducted with the self-designed system, and results covering various species and sizes of fish were compared with manual measurements. The average relative estimation error was about 4.7% for length and about 9.2% for width. In terms of running time, the measurement system processed 2.5 frames per second for fish dimension calculation. The experimental results also showed that the trained Mask-RCNN model achieved a precision of 88% and a recall of 84% with satisfactory generalization performance. After segmentation refinement, the mean intersection over union increased from 78% to 81%, which demonstrated the effectiveness of the refinement method. The results also showed that the longer the fish, the smaller the average relative error of the measurement.
These results demonstrated that the proposed method can measure the dimensions of multiple underwater fish by stereoscopic vision, using deep-learning-based image segmentation and coordinate transformation. This study provides a novel idea for flexible measurement of fish body size and can improve dynamic information perception for rapid and non-destructive detection of underwater fish in aquaculture.
fish; machine vision; three-dimensional reconstruction; image segmentation; deep learning; Mask-RCNN; 3D point cloud processing
Li Yanjun, Huang Kangwei, Xiang Ji. Measurement of dynamic fish dimension based on stereoscopic vision[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(21): 220-226. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2020.21.026 http://www.tcsae.org
Received: 2020-04-09
Revised: 2020-06-16
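Key Research and Development Program of Zhejiang Province (2019C01150)
Li Yanjun, Ph.D., professor, mainly engaged in research on intelligent control and optimization and on complex system modeling. Email: liyanjun@zucc.edu.cn
Xiang Ji, Ph.D., professor, mainly engaged in research on autonomous underwater vehicles and networked control. Email: jxiang@zju.edu.cn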
doi: 10.11975/j.issn.1002-6819.2020.21.026
CLC number: TP391.41
Document code: A
Article ID: 1002-6819(2020)-21-0220-07