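Construction and Application of a Detection Model for Amylose and Amylopectin Content in Sorghum Grain Based on Near-Infrared Spectroscopy
ZHANG BeiJu, CHEN SongShu, LI KuiYin, LI LuHua, XU RuHong, AN Chang, XIONG FuMin, ZHANG Yan, DONG LiLi, REN MingJian
College of Agriculture, Guizhou University/Guizhou Branch of National Wheat Improvement Center, Guiyang 550025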
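【Objective】Sorghum is one of the main raw materials for liquor brewing and feed, and the ratio of amylose to amylopectin content in its grain is closely related to liquor quality and feed quality. Traditional chemical assays of sorghum components are not suited to high-throughput testing. Modified partial least squares (modified PLS) was applied to the near-infrared spectra of sorghum samples, with spectral preprocessing, score processing, and result monitoring, to build prediction models for grain amylose and amylopectin content. The aim was a fast, efficient, low-cost detection method that supports genetic improvement and quality analysis of sorghum.【Method】From 450 sorghum accessions, 112 representative varieties were selected as the calibration and validation sets. The chemical values of amylose and amylopectin content in the grain of these 112 varieties were determined by the dual-wavelength method, and near-infrared spectra were collected over 850-1 048 nm. Scores were computed from the scan data matrix and the chemical data (PL1) to explain differences between spectra, and outlier varieties with a Mahalanobis distance (GH) greater than 3 were removed to reduce modeling error. Models were built by modified PLS regression, and different calibration models were obtained with different scatter corrections and derivative treatments. The best model was chosen by the standard error of cross validation (SECV) and the cross-validation coefficient (1-VR), and its predictive performance was assessed by result monitoring and a non-parametric test.【Result】For amylose, the near-infrared prediction model had SECV = 2.7732, 1-VR = 0.9503, and a coefficient of determination (RSQ) of 0.9688. Bias = 0.229 < 2.7732 (SECV) × 0.6, that is, the bias was less than 0.6 times the SECV of the calibration model; the standard error of prediction (SEP) = 1.266 < 2.7732 (SECV) × 1.3 = 3.60516, that is, SEP was less than 1.3 times the SECV; and 11.01 (SD) - 10.81 (SD) = 0.2 < 11.02 (SD) × 0.2 = 2.204, that is, the difference between the standard deviations (SD) of the chemical data and the near-infrared predictions was less than 20% of the chemical SD. For amylopectin, the model had SECV = 1.7516, 1-VR = 0.8818, and RSQ = 0.9127. Bias = -0.014 < 1.7516 (SECV) × 0.6; SEP = 1.316 < 1.7516 (SECV) × 1.3 = 2.2708; and 5.30 - 5.29 = 0.01 < 5.30 × 0.2 = 1.06, so all three criteria were also met. A two-paired-sample non-parametric test on 30 sorghum samples outside the model showed no significant difference between measured and predicted values for either amylose or amylopectin content (P = 0.262 > 0.05; P = 0.992 > 0.05).【Conclusion】The near-infrared models are accurate and stable, can determine amylose and amylopectin content in sorghum grain quickly, and can be used for genetic improvement and quality testing of sorghum.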
Key words: near-infrared spectroscopy; sorghum; amylose; amylopectin; modified partial least squares
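【Research significance】Sorghum is recognized in China as one of the main raw materials for traditional solid-state liquor fermentation and for feed. Liquor yield and quality, and feed quality, are closely related to the ratio of amylose to amylopectin content in the grain[1]. Given the needs of the market and of breeders, a method for determining sorghum amylose and amylopectin content that is fast, low-cost, simple to operate, and pollution-free is important for research on high-quality liquor and feed. Near-infrared spectroscopy (NIRS) is non-destructive and fast, and has been widely applied in different research fields[2-5]. Building prediction models for grain amylose and amylopectin content from near-infrared spectra by modified partial least squares (modified PLS) can provide an efficient and accurate detection method.

【Previous research progress】A Canadian grain laboratory was an early user of near-infrared spectroscopy for determining oil, glucosinolate, and protein content in rapeseed[6-7]. WANG et al.[8] used partial least squares (PLS) and back propagation (BP) neural networks to build optimal near-infrared prediction models for fat and protein content in soybean seed, enabling rapid quality analysis and accelerating breeding. ZHANG et al.[9] built prediction models for maize seed moisture content from visible and from near-infrared spectra and found that the near-infrared model correlated more strongly with moisture content. LI et al.[10] used the near-infrared spectra of 393 soybean stems to build an efficient, low-cost, pollution-free model for stem chemical composition, applied to selecting lodging-resistant soybean germplasm. Near-infrared analysis of important sorghum traits has also been reported. HUANG et al.[11] built a PLS-based near-infrared prediction model for sorghum proanthocyanidins that accurately identifies varieties with high proanthocyanidin content, giving breeders a method that does not destroy the grain. LIU et al.[12] used Fourier transform near-infrared spectroscopy to build an analytical model for polyphenols in sorghum grain that determines polyphenol content accurately and quickly, which is valuable for breeding and quality analysis. SIMEONE et al.[13] used near-infrared spectroscopy with PLS to determine sucrose, glucose, and fructose in sweet sorghum juice, in order to analyze how sweet sorghum genotype relates to ethanol yield in different environments. Compared with traditional chemical assays, near-infrared spectroscopy is non-destructive, needs little sample preparation, is environmentally friendly and efficient, and determines the chemical composition of many crops with high precision and accuracy[14-15], so it is important for genetic improvement and quality testing of sorghum.

【Entry point of this study】Researchers in China and abroad have used PLS, BP neural networks, and other methods to build detection models for sorghum protein, fat, tannin, moisture, and starch content, but prediction models for grain amylose and amylopectin content built with modified PLS have rarely been reported.

【Key problem to be solved】To advance genetic improvement and quality testing of sorghum, this study applied modified PLS to the near-infrared spectra of 82 calibration-set and 30 validation-set sorghum varieties, with score processing, spectral preprocessing, and result monitoring, to build prediction models for grain amylose and amylopectin content. The models are intended for fast, efficient, pollution-free, low-cost determination of amylose and amylopectin content in sorghum grain, and to support genetic improvement and quality testing of liquor sorghum.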
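Sorghum accessions were sown in 2018 and 2019 at the base of the Guizhou Branch of the National Wheat Improvement Center. Near-infrared spectra of the grain from both years were collected with a Grain Analyzer (Infratec™ 1241, FOSS, Denmark). Based on the Mahalanobis distance (global H, GH), outlier varieties with a value greater than 3 and redundant varieties with a value less than 0.8 were removed, and 112 varieties representing the variability of the collection were selected as the calibration and validation sets for calibration modeling and result monitoring.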
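The 112 dried sorghum varieties were first cleaned with a seed winnowing and purity instrument (CFY-II, Zhejiang Top Cloud-Agri Technology Co., Ltd.) to reduce interference from impurities, and near-infrared spectra were then collected over 850-1 048 nm. Each variety was scanned 10 times and the average spectrum was recorded; each variety was reloaded and scanned 3 times, and these 3 spectra served as its raw spectra.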
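The amylopectin and amylose content of the grain of the 112 sorghum varieties was measured with reference to GB7648-87 and GB/T15683-2008[16], with some steps slightly modified. Using a sorghum amylopectin standard (Solarbio, catalog no. 106A1030) and an amylose standard (Solarbio, catalog no. 1012G104), the measurement and reference wavelengths for amylopectin and amylose were determined on a microplate reader (MULTISKAN Sky, Chengdu Baile Technology Co., Ltd.). The grain of each of the 112 varieties was ground in a high-speed multifunction grinder (SUS 304, Yongkang Boou Hardware Products Co., Ltd.), passed through a 120-mesh sieve, and stored in a zip-lock bag. The ground samples were defatted and desugared in a fat analyzer (SZF-06C, Zhejiang Top Cloud-Agri Technology Co., Ltd.) and dried. Then 0.1000 g of defatted, desugared sample was weighed on an electronic balance (BSA224S, Sartorius Scientific Instruments Co., Ltd.) into a 50 mL beaker, wetted with 450 μL of absolute ethanol, mixed with 10 mL of 0.5 mol·L⁻¹ KOH solution, heated at 80℃ for 10 min to speed dissolution, and made up to 50 mL with double-distilled water. Finally, 2.5 mL of the sample solution was mixed with 25-35 mL of double-distilled water, adjusted to about pH 3 with 0.1 mol·L⁻¹ HCl, and developed with 0.5 mL of iodine reagent. The absorbance read on the microplate reader was used to calculate the amylopectin and amylose content of the grain. Each variety was measured 3 times. The 30 varieties with the most stable amylose and amylopectin values were then chosen as the validation set, and the remaining 82 varieties formed the calibration set; for both sets, the 3 scanned near-infrared spectra of each variety were selected in WinISI[17]. The validation set needs stable chemical values so that the predictive performance of the model can be assessed accurately and with less error.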
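1.4.1 Chemical data and score plotting for the calibration spectra The 3 selected calibration spectrum files of the 82 varieties were opened in WinISI, and the chemical values of amylose and amylopectin content were entered against the corresponding spectra. To reduce operating error, the 3 calibration spectra were averaged into a mean calibration spectrum file. Scores were calculated for this file from the scan data matrix and the chemical data (programming language 1, PL1) to explain differences between spectra. The Mahalanobis distance limit was set to 3 (GH is the distance of each variety from the central point in the three-dimensional score plot). Outlier varieties were removed after math treatment, scatter correction (scatter), and derivative treatment (derivative), and the amylose and amylopectin data were then converted to principal component scores for building the prediction models.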
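The outlier screening described above can be sketched as follows. This is a minimal illustration, not the WinISI implementation: it assumes the score columns are uncorrelated (as principal component scores are) and takes GH as the squared Mahalanobis distance from the mean score divided by the number of score columns. The function names `global_h` and `split_outliers` are hypothetical; only the limit of 3 comes from the text.

```python
def global_h(scores):
    """Return one GH value per sample from a list of score vectors.

    Assumption: score columns are uncorrelated, so the squared Mahalanobis
    distance is the sum of squared standardized scores. Dividing by the
    number of columns is also an assumption of this sketch.
    """
    n = len(scores)
    k = len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(k)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in scores) / (n - 1)
        for j in range(k)
    ]
    return [
        sum((row[j] - means[j]) ** 2 / variances[j] for j in range(k)) / k
        for row in scores
    ]


def split_outliers(scores, gh_limit=3.0):
    """Split sample indices into kept and removed (GH above gh_limit)."""
    gh = global_h(scores)
    kept = [i for i, h in enumerate(gh) if h <= gh_limit]
    removed = [i for i, h in enumerate(gh) if h > gh_limit]
    return kept, removed
```

In this sketch a sample far from the center of the score cloud receives a large GH and is dropped before calibration, which mirrors the role the GH limit of 3 plays in the text.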
1.4.2 Construction and selection of calibration models A global equation was used to build an extensible calibration model that can be upgraded later. To obtain the best near-infrared models for amylose and amylopectin, the principal component score data were modeled by modified PLS regression. Scatter correction used standard normal variate (SNV), detrend only, none, SNV+detrend, multiplicative scatter correction (MSC), inverse multiplicative scatter correction, and weighted multiplicative scatter correction. Derivative treatment used the first derivative and the second derivative, each with one smoothing pass. Together these gave different calibration models[18]. The best calibration model was selected from three statistics: the coefficient of determination (R-squared, RSQ) between the near-infrared predictions and the laboratory reference data; the mean standard error of cross validation (SECV) between near-infrared and chemical values for varieties left out of the calibration; and the mean cross-validation coefficient (1 minus the variance ratio, 1-VR) for the same left-out varieties.
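Two of the treatments named above, SNV and derivative treatment, can be sketched in a few lines. This is an illustration under stated assumptions: the derivative is a plain central difference over a fixed gap of data points, which is simpler than the gap and smoothing settings of a WinISI math treatment, and the function names are hypothetical.

```python
import math


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in spectrum) / (n - 1))
    return [(a - mean) / sd for a in spectrum]


def derivative(spectrum, order=1, gap=1):
    """Central-difference derivative over `gap` points, applied `order` times.

    Each pass shortens the spectrum by 2 * gap points (no edge padding).
    """
    out = list(spectrum)
    for _ in range(order):
        out = [out[i + gap] - out[i - gap] for i in range(gap, len(out) - gap)]
    return out


def snv_then_derivative(spectrum, order):
    """SNV followed by a derivative, e.g. order=2 for SNV + second derivative."""
    return derivative(snv(spectrum), order=order)
```

Because SNV rescales each spectrum by its own mean and standard deviation, a spectrum multiplied by a constant and shifted by an offset gives the same SNV output, which is why it is used to reduce scatter effects between samples.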
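1.4.3 Result monitoring of the calibration models The 30 validation-set varieties were used to verify the predictive performance of the best models, which was evaluated from Bias, SECV, and the standard error of prediction (SEP). A component is considered suitable for near-infrared analysis, with reliable predictive performance, when Bias is less than 0.6 times the SECV of the calibration model, SEP is less than 1.3 times the SECV, and the difference between the SD of the chemical data and the SD of the near-infrared predictions is less than 20% of the chemical SD.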
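The three monitoring criteria can be written as a small check. The sketch below uses the values reported in this paper as inputs; the function name `monitor` is hypothetical, and comparing the absolute value of the bias (rather than the signed bias, as the text does) is an assumption made so that a large negative bias would also fail. The paper prints the chemical SD for amylose as both 11.01 and 11.02; 11.01 is used here.

```python
def monitor(bias, sep, secv, sd_lab, sd_nir):
    """Evaluate the three result-monitoring criteria for one calibration model."""
    return {
        # |Bias| < 0.6 x SECV (absolute value is an assumption of this sketch)
        "bias_ok": abs(bias) < 0.6 * secv,
        # SEP < 1.3 x SECV
        "sep_ok": sep < 1.3 * secv,
        # |SD(lab) - SD(NIR)| < 20% of SD(lab)
        "sd_ok": abs(sd_lab - sd_nir) < 0.2 * sd_lab,
    }


# Values as reported in this paper.
amylose = monitor(bias=0.229, sep=1.266, secv=2.7732, sd_lab=11.01, sd_nir=10.81)
amylopectin = monitor(bias=-0.014, sep=1.316, secv=1.7516, sd_lab=5.30, sd_nir=5.29)
```

With these inputs every criterion evaluates to true for both models, matching the conclusion drawn in the text.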
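To ensure the accuracy of the chemical values, and therefore of the near-infrared analysis, a single operator determined the amylose and amylopectin content of the grain of all 112 sorghum varieties, with 3 replicates averaged per variety so that the procedure was uniform. Table 1 gives the summary statistics of the chemical values. In the calibration set, amylose content averaged 18.23% and ranged from 1.08% to 40.8%; amylopectin content averaged 45.05% and ranged from 26.74% to 67.95%. The wide ranges make the calibration set reasonably representative. The validation set consists of varieties with stable chemical values that also span a wide range, so it is suitable for monitoring.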
Table 1 Summary statistics of the chemical values of amylose and amylopectin content in sorghum varieties
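Fig. 1 shows that the raw spectra of the sorghum varieties follow essentially the same trend and that the scans are complete, so math treatment and scatter correction can be applied. After treatment the spectra share the same features, with clear changes at peaks and troughs, which indicates that near-infrared spectra discriminate well between sorghum samples in amylose and amylopectin. Scores were calculated from the scan data matrix and the chemical data of the treated spectra, and GH was computed by comparing each variety's score with the mean score of the calibration varieties; varieties with GH greater than 3 were treated as outliers and removed. Fig. 2 shows that the three-dimensional scores for amylose are concentrated with no obvious grouping, so grouped calibration is unnecessary. The scores for amylopectin are mostly concentrated, but a few lie at the edge, possibly because of some error in the chemical measurements (within the acceptable range, GH less than 3); to reduce prediction error, these edge varieties were removed before calibration.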
A: Amylose; B: Amylopectin
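The score files were modeled by modified PLS regression with different math treatments and scatter corrections. SECV is the standard error between the near-infrared predictions and the chemical values obtained during cross validation, and gives a rough estimate of the prediction accuracy of a calibration model. 1-VR is the proportion of the concentration variation in the sample set that the model explains during cross validation. For both amylose and amylopectin, a lower SECV and a higher 1-VR indicate a better calibration model. Table 2 shows that the best amylose model was obtained with standard normal variate plus second derivative (SNV+second derivative) treatment (SECV = 2.7732, 1-VR = 0.9503), and the best amylopectin model with standard normal variate plus first derivative (SNV+first derivative) treatment (SECV = 1.7516, 1-VR = 0.8818).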
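The two selection statistics can be computed from cross-validation predictions as in the sketch below. The exact divisors used by WinISI are not stated in the text, so this sketch assumes a root mean squared residual for SECV and one minus the ratio of residual to total sum of squares for 1-VR. The function names are hypothetical.

```python
import math


def secv(reference, cv_predicted):
    """Standard error of cross validation: root mean squared CV residual (assumed form)."""
    n = len(reference)
    ss_res = sum((y - p) ** 2 for y, p in zip(reference, cv_predicted))
    return math.sqrt(ss_res / n)


def one_minus_vr(reference, cv_predicted):
    """1-VR: share of reference variation explained in cross validation (assumed form)."""
    n = len(reference)
    mean = sum(reference) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(reference, cv_predicted))
    ss_tot = sum((y - mean) ** 2 for y in reference)
    return 1.0 - ss_res / ss_tot
```

A model comparison then amounts to computing both values for each treatment combination and preferring the one with the lowest SECV and the highest 1-VR.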
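The predictive performance of the calibration models was checked in WinISI with the 30 validation varieties (Table 3). For the amylose model (Fig. 3), Bias = 0.229 < 2.7732 (SECV) × 0.6, SEP = 1.266 < 2.7732 (SECV) × 1.3 = 3.60516, 11.01 (SD) - 10.81 (SD) = 0.2 < 11.02 (SD) × 0.2 = 2.204, and the external RSQ was 0.987. For the amylopectin model (Fig. 3), Bias = -0.014 < 1.7516 (SECV) × 0.6, SEP = 1.316 < 1.7516 (SECV) × 1.3 = 2.2708, 5.30 - 5.29 = 0.01 < 5.30 × 0.2 = 1.06, and the external RSQ was 0.937. The measured amylose and amylopectin values (Table 3) were also subjected to a one-sample K-S test. The results (Table 4) gave an asymptotic significance of 0.003 < 0.05 for amylose and 0.012 < 0.05 for amylopectin, so neither data set follows a normal distribution and a t-test is not applicable. To compare measured and predicted values, a two-paired-sample non-parametric test (Wilcoxon signed-rank test, Table 5) was therefore used. The asymptotic significance was 0.262 > 0.05 for measured versus predicted amylose and 0.992 > 0.05 for measured versus predicted amylopectin, so measured and predicted values do not differ significantly. The calibration models for amylose and amylopectin therefore predict well and meet the accuracy required for determining amylose and amylopectin content in sorghum.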
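The paired non-parametric comparison above can be sketched as a Wilcoxon signed-rank test with a normal approximation, which corresponds to an asymptotic significance. This is a simplified illustration, not the statistical package used in the study: zero differences are dropped, tied absolute differences receive average ranks, and no tie or continuity correction is applied to the variance. The function name is hypothetical.

```python
import math


def wilcoxon_signed_rank(measured, predicted):
    """Return (W+, W-, z, two-sided p) for paired samples, normal approximation."""
    diffs = [m - p for m, p in zip(measured, predicted) if m != p]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal tail
    return w_plus, w_minus, z, p
```

A p value above 0.05 from such a test, as reported for both components here, means the paired measured and predicted values show no significant systematic difference.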
A: Amylose; B: Amylopectin
Table 2 Main evaluation parameters for sorghum amylose and amylopectin content under different treatments
SEC: standard error of calibration; RSQ: R-squared; SECV: standard error of cross validation; 1-VR: 1 minus the variance ratio
Table 3 Comparison of chemical measurements and near-infrared model predictions
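Near-infrared light penetrates samples well. During measurement, hydrogen-containing groups X-H (X = C, S, N, O) selectively absorb near-infrared light through diffuse reflection, transmission, or reflection, giving a near-infrared spectrum of combination and overtone bands that carries information about the sample; chemometric methods then relate the spectra to the chemical data to build a prediction model[19]. Amylose is an essentially linear α-(1,4)-glucan chain, and amylopectin is built from short chains of α-glucopyranose units joined by α-1,4-glycosidic bonds[20-22]. Both therefore contain many hydroxyl groups and C-H bonds that absorb near-infrared light. Traditional assays are complicated to run, destroy the sample, and carry some safety and environmental risks. By contrast, near-infrared spectroscopy is already used in standards for assessing industrial product quality, has improved production efficiency and product quality, and is widely used in agriculture, molecular biology, and pharmaceuticals[23-25].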
Table 4 One-sample Kolmogorov-Smirnov test
a: Test distribution is normal; b: Calculated from data; c: Lilliefors significance correction
Table 5 Wilcoxon signed-rank test
a: Based on positive ranks
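This study used a Grain Analyzer (Infratec™ 1241, FOSS, Denmark) to build calibration models for fast, efficient, environmentally friendly, low-cost determination of amylose and amylopectin content in sorghum grain, and used the models to evaluate sorghum quality. The Infratec™ 1241 Grain Analyzer holds CE, GIPSA, NTEP, and PTB certifications, runs a self-check at start-up, is stable, and is simple to operate. Monochromatic light passes through the sample to the detector, and the resulting spectral signal is processed by the built-in computer to give the measured value. Near-infrared spectra of 450 sorghum varieties were collected over 2 years, and 112 representative varieties with 0.8 < GH < 3 were selected from the calculated scores for calibration and result monitoring. Unlike the near-infrared models of LIU et al.[26], WANG et al.[27], and WU et al.[28], the samples in this study were representative varieties selected by GH. Their spectra share the same features with clear changes at peaks and troughs, and absorbance differs markedly between varieties at 932 and 972 nm, which indicates that near-infrared spectra discriminate well between sorghum samples in amylose and amylopectin. This selection saves time and helps keep the model stable and the results accurate (Fig. 1). The sample spectra were then subjected to 14 different scatter treatments and score calculation, different models were built from the score files with modified PLS in WinISI, and the best model was selected and monitored. Compared with the near-infrared models of LI et al.[10] and CHEN et al.[5], the models built here with modified PLS had a lower SECV and a higher 1-VR (Table 2). Result monitoring and plotting were done in WinISI, which the authors consider more accurate and reliable than correlation analysis in SPSS; WinISI also allows the slope and intercept of the calibration model to be adjusted, making the results more accurate and stable.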
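LIU et al.[26] used PLS regression with different preprocessing methods and wavelength ranges to build a near-infrared prediction model for rice amylose content. Multiplicative scatter correction (MSC) of the full spectrum worked best; the optimized model had a correlation coefficient of 0.9819, an SEP of 0.1009, and a standard error of calibration (SEC) of 0.831. A paired test of the chemical values against the near-infrared predictions gave P = 0.356 > 0.05 (95% confidence level), showing no significant difference between predicted and chemical values, so rapid near-infrared determination of rice amylose content is feasible. WANG et al.[27] used PLS regression with full cross validation to build near-infrared prediction models for crude protein and moisture content in sorghum. For crude protein, the model from raw spectra preprocessed by first derivative plus multiplicative scatter correction had a relative analysis error (RPD) of 8.41, a cross-validation RPD of 4.97, and an external-validation RPD of 3.32. For moisture, the model from raw spectra preprocessed by first derivative plus straight-line subtraction had an RPD of 12.20, a cross-validation RPD of 7.97, and an external-validation RPD of 5.36. All of these values exceed the evaluation threshold, so the models can accurately assess crude protein and moisture content in sorghum. WU et al.[28] built a near-infrared model for total starch in barley grain to replace the traditional assay, HAN et al.[29] built a prediction model for starch content in maize flour for screening breeding material, and KIM et al.[30] built a near-infrared model for the moisture content of broadleaf litter to predict forest floor litter moisture. These studies show that near-infrared spectroscopy has been used increasingly widely in breeding, quality analysis, and industry in recent years.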
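CHEN et al.[31], LIU et al.[26], and WANG et al.[27] built different models from the near-infrared spectra of calibration samples using different preprocessing methods and principal component analysis, evaluated the models with validation samples, and chose the best model by comparing RSQ, SEP, SECV, 1-VR, and related statistics. This study followed essentially the same steps, but used modified PLS regression for modeling and assessed predictive performance through monitoring in WinISI and a non-parametric test. The best prediction model for amylose content in sorghum used SNV+second derivative treatment, with SECV = 2.7732, 1-VR = 0.9503, and RSQ = 0.9688. Bias = 0.229 < 2.7732 (SECV) × 0.6, that is, Bias was less than 0.6 times the SECV of the calibration model; SEP = 1.266 < 2.7732 (SECV) × 1.3 = 3.60516, that is, SEP was less than 1.3 times the SECV; and 11.01 (SD) - 10.81 (SD) = 0.2 < 11.02 (SD) × 0.2 = 2.204, that is, the difference between the SD of the chemical data and the SD of the near-infrared predictions was less than 20% of the chemical SD. The best prediction model for amylopectin content used SNV+first derivative treatment, with SECV = 1.7516, 1-VR = 0.8818, and RSQ = 0.9127. Bias = -0.014 < 1.7516 (SECV) × 0.6; SEP = 1.316 < 1.7516 (SECV) × 1.3 = 2.2708; and 5.30 - 5.29 = 0.01 < 5.30 × 0.2 = 1.06, so the same three criteria were met. The two-paired-sample non-parametric test showed no significant difference between measured and predicted values (P = 0.262 > 0.05; P = 0.992 > 0.05).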
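A low-cost, environmentally friendly, efficient model for determining amylose and amylopectin content in sorghum grain was built. Result monitoring shows that the model is accurate, stable, and reliable, and that it can replace chemical methods for determining amylose and amylopectin content in sorghum grain.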
[1] JIAO S J, WANG L M, JIANG Y X, YAN H D, SU D F, SUN G Q. A review of the research on the relationship between sorghum and solid liquor. Wine Making, 2015, 42(1): 13-16. (in Chinese)
[2] OMAR J, SLOWIKOWSKI B, BOIX A. Chemometric approach for discriminating tobacco trademarks by near infrared spectroscopy. Forensic Science International, 2019, 294: 15-20.
[3] CHEN H, TAN C, LIN Z, LI H J. Quantifying several adulterants of Noto ginseng powder by near-infrared spectroscopy and multivariate calibration. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 2018, 211: 280-286.
[4] SIRSOMBOON P, POSOM J. On-line measurement of activation energy of ground bamboo using near infrared spectroscopy. Renewable Energy, 2019, 133: 480-488.
[5] CHEN X P, LIU S Y, YIN N W, JING L Y, WEI L J, LIN N, XIAO Y, XU X F, LI J N, LIU L Z. Construction of NIR model of fiber component content and lignin monomer G/S in Brassica napus stalk. Scientia Agricultura Sinica, 2018, 51(4): 688-698. (in Chinese)
[6] TKACHUK R. Oil and protein analysis of whole rapeseed kernels by near infrared reflectance spectroscopy. Journal of the American Oil Chemists' Society, 1981, 58(8): 819-822.
[7] PANFORD J A, WILLIAMS P C, DEMAN J M. Analysis of oilseeds for protein, oil, fiber and moisture by near-infrared reflectance spectroscopy. Journal of the American Oil Chemists' Society, 1988, 65(10): 1627-1634.
[8] WANG C X, CAO J F, GU Z F, XU M X, WU Q Y. Optimized construction of a rapid non-destructive detection model for soybean protein and fat based on near-infrared spectroscopy. Soybean Science, 2019, 38(6): 968-976. (in Chinese)
[9] ZHANG Y M, GUO W C. Moisture content detection of maize seed based on visible/near-infrared and near-infrared hyperspectral imaging technology. International Journal of Food Science and Technology, 2020, 55: 631-640.
[10] LI J J, HONG H L, WAN M Y, CHU L, ZHAO J H, WANG M H, XU Z P, ZHANG Y, HUANG Z P, ZHANG W M, WANG X B, QIU L J. Construction and application of soybean stem chemical composition detection model based on near infrared spectroscopy. Scientia Agricultura Sinica, 2021, 54(5): 887-900. (in Chinese)
[11] HUANG Z H, LU P, YANG N, MENG X J, REN G X. Determination of proanthocyanidins content of sorghum by near infrared spectroscopy. Food Science and Technology, 2008(10): 207-210. (in Chinese)
[12] LIU M X, HUANG Y W, HAN J G. Fourier transform near infrared spectroscopy analysis of polyphenols in sorghum grains. Chinese Journal of Analytical Chemistry, 2009, 37(9): 1275-1280. (in Chinese)
[13] SIMEONE M L F, PARRELLA R A C, SCHAFFERT R E S, DAMASCENO C M B, LEAL M C B, PASQUINI C. Near infrared spectroscopy determination of sucrose, glucose and fructose in sweet sorghum juice. Microchemical Journal, 2017, 134: 125-130.
[14] JI N. Research on the correlation model between lignin and cellulose content of soybean straw and near infrared spectroscopy[D]. Harbin: Northeast Agricultural University, 2016. (in Chinese)
[15] CELIO P. Near infrared spectroscopy: fundamentals, practical aspects and analytical applications. Journal of the Brazilian Society, 2003, 4(1): 198-219.
[16] GB/T15683-2008, Determination of amylose content in rice. Beijing: China Standard Press, 2008. (in Chinese)
[17] GARRIDOVARO A, GARCIAOLMO J, FEARN T. A note on Mahalanobis and related distance measures in WinISI and The Unscrambler. Journal of Near Infrared Spectroscopy, 2019, 27(4): 253-258.
[18] BAI T C, WANG T, CHEN Y Q, MERCATORIS B. Comparison of near-infrared spectrum pretreatment methods for Jujube leaf moisture content detection in the sand and dust area of southern Xinjiang. Spectroscopy and Spectral Analysis, 2019, 39: 1323-1328.
[19] ZHENG Z X. Application of near infrared spectroscopy analysis technology in feed processing industry. Fujian Agricultural Machinery, 2019, 40(1): 24-27. (in Chinese)
[20] VAMADEVAN V, BERTOFT E. Structure-function relationships of starch components. Starch-Stärke, 2015, 67(1/2): 55-68.
[21] LEE J H, YOU S, KWEON D K, CHUNG H J, LIM S T. Dissolution behaviors of waxy maize amylopectin in aqueous-DMSO solutions containing NaCl and CaCl2. Food Hydrocolloids, 2014, 35: 115-121.
[22] BERTOFT E. Understanding starch structure: recent progress. Agronomy, 2017, 7: 1-29.
[23] HAO Y. Research on near infrared spectroscopy microanalysis method[D]. Tianjin: Nankai University, 2009. (in Chinese)
[24] LIANG X Y, JI H Y. Application of near infrared spectroscopy technology in crop quality analysis. Chinese Agricultural Science Bulletin, 2006, 22(1): 366-371. (in Chinese)
[25] WANG J D, ZHOU X Y, JIN T M, HU X N, ZHONG J E, WU Q T. The application of near-infrared spectroscopy detection technology in agriculture and food analysis. Spectroscopy and Spectral Analysis, 2004, 24(4): 447-450. (in Chinese)
[26] LIU H M, XIAO Z W, SHEN T, JIANG P, SHAN S L, ZOU Y B. Establishment of near infrared detection model of rice amylose content. Journal of Hunan Agricultural University, 2019, 45(2): 189-193. (in Chinese)
[27] WANG Y S, LI J, WANG B, ZHANG Y T, GENG J L. Research on the evaluation of crude protein and moisture content in sorghum based on near-infrared spectroscopy technology. Chinese Journal of Animal Nutrition, 2020, 32(3): 1353-1361. (in Chinese)
[28] WU X J, ZENG F R, YUE W H, WANG J M. Construction of a near-infrared rapid non-destructive detection model for the total starch content of barley grains. Zhejiang Agricultural Sciences, 2021, 62(1): 40-41. (in Chinese)
[29] HAN H N, WANG M J, ZHAO X C, LU X, ZHOU Z Q, LI M S, ZHANG D G, HAO C F, WENG J F, YONG H J, LI X H. Establishment and optimization of near-infrared model of corn flour starch content. Maize Science, 2020, 28(6): 81-87. (in Chinese)
[30] KIM G, HONG S J, LEE A Y, LEE Y E, LM S. Moisture content measurement of broadleaf litters using near-infrared spectroscopy technique. Remote Sensing, 2017, 9(12): 1212.
[31] CHEN J M, LI M L, PAN T, PANG L W, YAO L J, ZHANG J. Rapid and non-destructive for the identification of multi-grain rice seeds with near-infrared spectroscopy. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 2019, 219(5): 179-185.
Construction and Application of Detection Model for Amylose and Amylopectin Content in Sorghum Grains Based on Near Infrared Spectroscopy
ZHANG BeiJu, CHEN SongShu, LI KuiYin, LI LuHua, XU RuHong, AN Chang, XIONG FuMin, ZHANG Yan, DONG LiLi, REN MingJian
College of Agriculture, Guizhou University/Guizhou Branch of National Wheat Improvement Center, Guiyang 550025
【Objective】Sorghum is one of the main raw materials for liquor brewing and feed. The ratio of amylose to amylopectin content in its grain is closely related to liquor quality and feed quality. Traditional chemical assays of sorghum components are not suited to high-throughput testing. Modified PLS was applied to the near-infrared spectra of sorghum samples, with spectral preprocessing, score processing, and result monitoring, to establish prediction models for grain amylose and amylopectin content. The aim was a fast, efficient, low-cost detection method that lays a foundation for genetic improvement and quality analysis of sorghum.【Method】From 450 sorghum accessions, 112 representative varieties were selected as the calibration and validation sets. The chemical values of amylose and amylopectin content in the 112 varieties were measured by the dual-wavelength method, and near-infrared spectra were collected over 850-1 048 nm. Scores were calculated from the scan data matrix and the chemical data (PL1) to explain differences between spectra, and outlier varieties with a global H (GH) greater than 3 were removed to reduce modeling error. Modified PLS regression was used for modeling, and different calibration models were established with different scatter corrections and derivative treatments. The best model was chosen by the standard error of cross validation (SECV) and the cross-validation coefficient (1-VR), and its predictive performance was evaluated by result monitoring and a non-parametric test.【Result】For amylose, the near-infrared prediction model had SECV = 2.7732, 1-VR = 0.9503, and RSQ = 0.9688. Bias = 0.229 < 2.7732 (SECV) × 0.6, that is, the bias was less than 0.6 times the SECV of the calibration model; the standard error of prediction (SEP) = 1.266 < 2.7732 (SECV) × 1.3 = 3.60516, that is, SEP was less than 1.3 times the SECV; and 11.01 (SD) - 10.81 (SD) = 0.2 < 11.02 (SD) × 0.2 = 2.204, that is, the difference between the standard deviations (SD) of the chemical data and the near-infrared predictions was less than 20% of the chemical SD. For amylopectin, the model had SECV = 1.7516, 1-VR = 0.8818, and RSQ = 0.9127. Bias = -0.014 < 1.7516 (SECV) × 0.6; SEP = 1.316 < 1.7516 (SECV) × 1.3 = 2.2708; and 5.30 - 5.29 = 0.01 < 5.30 × 0.2 = 1.06, so the same three criteria were met. A two-paired-sample non-parametric test on 30 sorghum samples outside the model showed no significant difference between measured and predicted values of amylose and amylopectin content (P = 0.262 > 0.05; P = 0.992 > 0.05).【Conclusion】The established near-infrared models are accurate and stable, can determine amylose and amylopectin content in sorghum grain quickly, and can be used for genetic improvement and quality testing of sorghum.
Key words: near infrared spectroscopy; sorghum; amylose; amylopectin; modified partial least squares
10.3864/j.issn.0578-1752.2022.01.003
2021-06-01;
2021-07-30
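Special Project for the Construction of the Guizhou Modern Agricultural Industry Technology System for Characteristic Coarse Cereals (Qiancainong [2019] No. 15); Experimental Research on Seed Propagation and Supporting Cultivation Techniques of Liquor Sorghum (700484192124); Research on Breeding of Liquor Sorghum Varieties in Guizhou (GNW2020GD001)
ZHANG BeiJu, E-mail: 743665191@qq.com. Corresponding author REN MingJian, E-mail: rmj72@163.com
(Editor in charge: LI Li)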