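Duan Qingling, Zhang Lei, Wei Fangfang, Xiao Xiaoyan, Wang Liang
(College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China)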
Forecasting model and validation for aquatic product price based on time series GA-SVR
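Abstract: Accurate forecasting of aquatic product prices helps farmers plan aquaculture rationally and guides the development of the industry. Because aquatic product price series are nonlinear, non-stationary and periodic, this paper proposes a price forecasting model that combines time series analysis with support vector regression (SVR) whose parameters are optimized by a genetic algorithm (GA). The model first applies time series analysis to test the stationarity of the price series and to determine the correlation order, which yields the training data set. It then uses the GA to search for the best SVR parameter combination, builds the SVR model with the optimized parameters, and uses that model to forecast. The model was validated on price data for mandarin fish, Metapenaeus ensis and Portunus trituberculatus, with the 2011-2014 data as the training set and the 2015 prices as the forecasting target. The mean absolute percentage errors (MAPE) for the three products were 6.70%, 7.82% and 14.76%, and the root mean square errors (RMSE) were 5.8531, 23.7011 and 13.8580, respectively; both are better than the results of the time-series-based SVR model and the BPANN model. The model can therefore provide a basis for forecasting aquatic product prices.
Keywords: aquaculture; models; support vector machine; price forecast; aquatic product; genetic algorithm; time series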
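China is a major aquaculture country, and fluctuations in aquatic product prices have an important impact on the development of the industry[1]. However, because aquaculture production is poorly coordinated and prices are further affected by preservation and transport, prices fluctuate excessively, and products of high quality and volume sometimes fail to fetch a good price. Price forecasting lets farmers follow market trends in time, plan their farming structure rationally and maximize their returns. It also gives government a scientific basis for setting industry policy, so that resources are fully used and the industry develops in a healthy and sustainable way.
Price forecasting is the analysis and judgment of future price trends based on the laws of the market economy and on scientific methods[2]. The main forecasting models are time series models[3-5], regression analysis models[6-7] and combined models[8-10]. Time series models analyze the relationship between the price series and time and forecast future prices from the regularities of historical data; they perform well on linear problems but handle nonlinearity poorly. Regression analysis models build the forecast from influencing factors that are highly correlated with the target; because price fluctuations are attributed to changes in those factors, the approach is easy to interpret, but selecting the factors and collecting the data are difficult. With the development of intelligent computing, combined models have become a research focus, since they can draw on the strengths of their component models. Zhang et al.[11] built a forecasting support system for aquatic product prices that integrates neural networks, case-based reasoning, moving averages and linear regression; it first analyzes the data series with time series methods and then selects a suitable model according to the characteristics of the series. Li et al.[12] used a wavelet neural network to forecast aquatic product prices and verified its feasibility on perch prices. Ren et al.[13] proposed an agricultural product price forecasting method based on a time series AR_SVR model, which combines time series analysis with support vector regression, and validated it on cucumber prices. Li et al.[14] applied chaos theory and neural network time series to agricultural price forecasting, designed a dynamic chaotic neural network time series model and tested it on potato prices, where it outperformed traditional time series models. Other researchers have applied various analytical models to forecasting problems[15-21]. These models address the selection of data items, the choice of kernel function or the optimization of model parameters, each focusing on one aspect of the problem. Compared with general agricultural products, aquatic products have high price elasticity; because of the particularities of aquaculture, prices are affected by output, weather, transport and other factors, so the price series is nonlinear, non-stationary and periodic. In-depth studies of aquatic product price forecasting are still few, and forecasting accuracy needs to be improved.
Aquatic product price is a key indicator for the aquaculture industry and a concern of both consumers and farmers. As living standards rise and the nutritional value of aquatic products is better understood, consumption keeps increasing. Given the nonlinear, non-stationary and periodic character of aquatic product prices, this paper presents a forecasting model based on time series analysis and genetic algorithm-support vector regression (GA-SVR), and gives solutions for the stationarity of the series, the determination of the correlation order, the choice of kernel function and the optimization of parameters, in order to improve forecasting accuracy. The main contributions are: 1) time series analysis, genetic optimization and support vector regression are combined into a single forecasting model for aquatic product prices; 2) the combined model is used to forecast aquatic product price data and is compared with other forecasting models to verify its accuracy.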
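Given the nonlinear, non-stationary and periodic character of aquatic product prices, this paper proposes a forecasting model based on time series GA-SVR. The raw data series is first preprocessed so that its fluctuations are mapped onto a smaller interval. Time series analysis is then used to make the series stationary and to determine the correlation order, which generates the training data set. The SVR model uses a radial basis kernel function, and its parameters are optimized through the global search capability of the GA to build the forecasting model, which is then used to produce the forecasts.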
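1.1 Time series analysis
Aquatic product price series are periodic and non-stationary, and the choice of order strongly affects forecasting accuracy. The partial autocorrelation coefficients can be used to assess the stationarity of the series and to determine the correlation order[22-23]. This paper analyzes the price series with the partial autocorrelation method from time series analysis so that the series meets the basic requirements of forecasting, which keeps the forecast valid and the results realistic. The calculation proceeds as follows. For a stationary stochastic process the expectation is a constant, denoted μ:
$$E(x_t) = \mu$$
where E(·) is the expectation function, x is the price value and t is its index. The variance σ² of a stationary stochastic process is also a constant:
$$\mathrm{Var}(x_t) = E\left[(x_t - \mu)^2\right] = \sigma^2$$
where Var(·) is the variance function, x is the price value and t is its index. The variance σ² measures how widely the values of the process are dispersed around the mean μ. Let h be the number of periods separating two random variables; the covariance of x_t and x_{t+h} is then the lag-h autocovariance, defined as:
$$\mathrm{Cov}(x_t, x_{t+h}) = E\left[(x_t - \mu)(x_{t+h} - \mu)\right]$$
where Cov(·) is the autocovariance function, μ is the expectation and h is the number of periods between the two variables.
The autocorrelation coefficient is defined as:
$$\rho_h = \frac{\mathrm{Cov}(x_t, x_{t+h})}{\sqrt{\mathrm{Var}(x_t)}\,\sqrt{\mathrm{Var}(x_{t+h})}}$$
where Cov(·) is the autocovariance function and Var(·) is the variance function. Because a stationary process satisfies Var(x_t) = Var(x_{t+h}) = σ², this becomes:
$$\rho_h = \frac{\mathrm{Cov}(x_t, x_{t+h})}{\sigma^2}$$
When h = 0, ρ_0 = 1. The sequence of autocorrelation coefficients indexed by the lag h is called the autocorrelation function.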
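As a rough illustration of these definitions, the sample autocorrelation function can be computed directly from the sample mean and autocovariance. This is a minimal sketch, not the authors' code: the function name `autocorrelation`, the divide-by-n estimator and the price values are all assumptions made for the example.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation rho_0 .. rho_max_lag of a list of floats."""
    n = len(series)
    mean = sum(series) / n
    # Lag-0 autocovariance, i.e. the sample variance (dividing by n).
    var = sum((x - mean) ** 2 for x in series) / n
    rho = []
    for h in range(max_lag + 1):
        # Lag-h autocovariance of x_t and x_{t+h}.
        cov_h = sum((series[t] - mean) * (series[t + h] - mean)
                    for t in range(n - h)) / n
        rho.append(cov_h / var)
    return rho


# Hypothetical monthly prices, for illustration only.
prices = [52.0, 55.0, 61.0, 58.0, 50.0, 47.0,
          53.0, 60.0, 62.0, 57.0, 49.0, 46.0]
rho = autocorrelation(prices, 3)
```

Dividing every lag by n rather than n - h is a common choice because it keeps each coefficient within [-1, 1].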
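The recursion for the partial correlation coefficients is:
$$\phi_{1,1} = \rho_1,\qquad \phi_{h+1,h+1} = \frac{\rho_{h+1} - \sum_{j=1}^{h} \rho_{h+1-j}\,\phi_{h,j}}{1 - \sum_{j=1}^{h} \rho_j\,\phi_{h,j}},\qquad \phi_{h+1,j} = \phi_{h,j} - \phi_{h+1,h+1}\,\phi_{h,h+1-j},\quad j = 1,\dots,h$$
which gives the partial correlation coefficients φ_{h,h} of the series. Statistical inference rests on the fact that, for a stationary series, the autocorrelation coefficients ρ_h and the partial correlation coefficients φ_{h,h} both follow a normal distribution; differencing and seasonal differencing can be used to make the series stationary. Suppose there exist positive integers m_ar and m_ma satisfying:
where n and m are positive integers; the training data set consists of a training subset X and an expected-value subset Y, and x denotes an element of X and Y.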
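The recursion for the partial correlation coefficients and the split into a training subset X and an expected-value subset Y can be sketched as follows. The code uses the standard Durbin-Levinson recursion and a plain sliding window of p lagged values; both function names and the window layout are assumptions for illustration, since the paper's own formulas for the data set are not reproduced here.

```python
def partial_autocorrelation(rho, max_lag):
    """Durbin-Levinson recursion.

    rho: autocorrelations with rho[0] == 1 and at least max_lag + 1 entries.
    Returns [phi_11, phi_22, ..., phi_{max_lag,max_lag}].
    """
    pacf = []
    phi_prev = []  # phi_{h-1,1}, ..., phi_{h-1,h-1}
    for h in range(1, max_lag + 1):
        if h == 1:
            phi_hh = rho[1]
            phi = [phi_hh]
        else:
            num = rho[h] - sum(phi_prev[j] * rho[h - 1 - j] for j in range(h - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(h - 1))
            phi_hh = num / den
            # Update the lower-order coefficients, then append phi_hh.
            phi = [phi_prev[j] - phi_hh * phi_prev[h - 2 - j] for j in range(h - 1)]
            phi.append(phi_hh)
        pacf.append(phi_hh)
        phi_prev = phi
    return pacf


def make_lagged_dataset(series, p):
    """Each row of X holds p consecutive values; Y holds the value that follows."""
    X = [series[t - p:t] for t in range(p, len(series))]
    Y = [series[t] for t in range(p, len(series))]
    return X, Y


# For an AR(1)-like autocorrelation rho_h = 0.5 ** h, only the first
# partial autocorrelation is non-zero, which suggests an order of 1.
pacf = partial_autocorrelation([0.5 ** h for h in range(5)], 4)
X, Y = make_lagged_dataset([1.0, 2.0, 3.0, 4.0, 5.0], 2)
```

A partial autocorrelation that drops to about zero after lag p is the usual signal for choosing p lagged values as inputs.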
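1.2 SVR model
Support vector regression handles the nonlinearity of aquatic product price series well. The data set obtained above is used to fit the series and generate the forecasting model. The ε-SVR time series forecasting problem can be described mathematically as follows[24-26]:
$$f(x) = w^{\mathrm{T}} \varphi(x) + b$$
where φ(·) is the model operator and w and b are the function coefficients. The optimization problem is:
$$\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\lVert w \rVert^2 + C \sum_{i} (\xi_i + \xi_i^*)$$
$$\text{s.t.}\quad y_i - w^{\mathrm{T}}\varphi(x_i) - b \le \varepsilon + \xi_i,\qquad w^{\mathrm{T}}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*,\qquad \xi_i,\ \xi_i^* \ge 0$$
where C is the penalty factor, ε is the insensitive loss parameter, and ξ_i and ξ_i^* are slack variables.
Given the nonlinearity of the price series, this paper uses the radial basis function as the SVR kernel, so the decision function is:
$$\hat{x}_t = \sum_{i} w_i \exp\left(-\gamma \lVert X_i - X_t \rVert^2\right) + b,\qquad X_t = (x_{t-1}, \dots, x_{t-P})$$
where γ is the kernel coefficient, P is the lag order, w and b are the coefficients of the forecasting function, and X_i are the training input vectors. This gives the forecast output; the forecasting process is shown in Fig.1.
Fig.1 Principle of the support vector regression (SVR) model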
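The ingredients of an RBF-kernel ε-SVR can be illustrated in a few lines: the kernel itself, the ε-insensitive loss that ignores errors smaller than ε, and a decision function that sums kernel terms over support vectors. This is only a sketch of the model's form, not a solver; the helper names, support vectors and coefficients below are hypothetical, and in practice they would come from training.

```python
import math


def rbf_kernel(u, v, gamma):
    """Radial basis kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def epsilon_insensitive(y_true, y_pred, epsilon):
    """Loss is zero inside the epsilon tube and linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(x, support_vectors, coefs, b, gamma):
    """Decision function: weighted sum of kernel values plus the bias b."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coefs)) + b


# Hypothetical support vectors (two lagged log-prices each) and coefficients.
support_vectors = [[4.0, 4.1], [4.2, 4.3]]
coefs = [0.5, -0.2]
y_hat = svr_predict([4.0, 4.1], support_vectors, coefs, b=4.1, gamma=1.0)
```

A larger γ makes each support vector's influence more local, which is one reason γ is tuned together with C and ε.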
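1.3 GA optimization algorithm
The genetic algorithm (GA) is a stochastic search algorithm modeled on natural genetic mechanisms and biological evolution. It is efficient, parallel and global in its search and does not depend on a specific problem domain. Following the principle of survival of the fittest, it exchanges information randomly during the search and automatically acquires and accumulates knowledge of the search space in order to find the optimal solution. It applies the principle of biological evolution to a population of coded strings formed from the parameters to be optimized and, according to the chosen fitness function, iterates over the individuals through selection, crossover and mutation until a termination condition is met[27-29]. Cross validation (CV) is introduced into this process to reduce the effect of differences in how the data set is partitioned. Cross validation[30] is a statistical method for assessing model performance. K-fold cross validation randomly divides the training set into K disjoint subsets, uses K-1 of them as the training subset to build the forecasting model, and uses the remaining subset as the validation subset. Over K rounds each subset serves once as the validation subset, and the mean of the K mean square errors (MSE) is taken as the MSE of the K-fold cross validation.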
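The K-fold procedure described above can be sketched as follows. The model is passed in as a pair of hypothetical `fit` and `predict` callables so that the example stays self-contained; a trivial mean predictor stands in for the SVR, and all names are assumptions for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_mse(X, Y, k, fit, predict, seed=0):
    """Mean of the K validation MSEs, each fold serving once as validation set."""
    folds = k_fold_indices(len(X), k, seed)
    mses = []
    for j in range(k):
        val = folds[j]
        train = [i for f in range(k) if f != j for i in folds[f]]
        model = fit([X[i] for i in train], [Y[i] for i in train])
        errors = [(Y[i] - predict(model, X[i])) ** 2 for i in val]
        mses.append(sum(errors) / len(errors))
    return sum(mses) / k


# Stand-in model (an assumption): always predicts the training mean.
def fit_mean(X, Y):
    return sum(Y) / len(Y)


def predict_mean(model, x):
    return model


X_demo = [[float(i)] for i in range(10)]
Y_demo = [3.0] * 10
score = cross_val_mse(X_demo, Y_demo, 5, fit_mean, predict_mean)
```

Fixing the shuffle seed means every candidate parameter set is scored on the same folds, so differences in the score reflect the parameters and not the partition.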
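An SVR with a radial basis kernel has 3 parameters: the kernel coefficient γ, the penalty factor C and the loss parameter ε. The algorithm proceeds as follows:
1) Chromosome representation
Chromosome representation is the way a chromosome is encoded; common schemes include binary and real-number encoding. Real-number encoding is used here, so each chromosome is a string of real numbers. For the parameters γ, C and ε to be optimized, a chromosome is a string of three real numbers, one for each parameter.
2) Fitness function
The fitness function measures the quality of a chromosome. With the parameters carried by an individual, the SVR is trained on the training data and its output is predicted; the sum of the absolute errors between the predicted and expected outputs serves as the individual's fitness. Once K-fold cross validation is introduced, the fitness is the mean of the K mean square errors:
$$f_j = \frac{1}{n}\sum_{i=1}^{n} (y_i - o_i)^2,\qquad F = \frac{1}{K}\sum_{j=1}^{K} f_j$$
where f_j is the value produced by one round of the K-fold cross validation, y_i is the true value, o_i is the predicted output and n is the number of samples in the validation subset; F is the fitness of the chromosome. The smaller F is, the better the chromosome and the higher its probability of being selected.
3) Selection
Selection imitates natural selection and survival of the fittest. To keep evolution moving toward better solutions, selection uses the fitness values to eliminate poorer individuals and retain better ones. Selection operators include roulette wheel and tournament selection; this paper uses roulette wheel selection, in which chromosomes with better fitness are more likely to be selected.
4) Crossover
Crossover corresponds to the recombination of genes in biological inheritance; its purpose is to strengthen the global search ability of the algorithm. Crossover exchanges and recombines parts of two chromosomes to produce new individuals. If a pair of chromosomes is chosen for crossover, one or more crossover points must be determined. Because the chromosomes are real-coded, this paper uses single-point real-number crossover: different chromosomes exchange at a corresponding position with probability P_cross, which usually lies in [0.4, 0.99].
5) Mutation
Mutation strengthens the local search ability of the algorithm, helps it avoid local minima, and maintains population diversity to prevent premature convergence. The mutation probability P_mut should therefore be small, usually chosen in [0.0001, 0.1].
Iteration stops when the maximum number of generations is reached, and the optimal parameter combination is output. The algorithm flow is shown in Fig.2.
Fig.2 Flow of the GA-SVR model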
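The five steps above can be put together in a small real-coded GA. This is a hedged sketch, not the authors' implementation: the inverse-fitness roulette weights, the uniform-resampling mutation, the tracking of the best individual, the values P_cross = 0.7 and P_mut = 0.05, and the toy fitness function are all assumptions. In the paper's setting, the fitness would be the cross-validated MSE of an SVR trained with the chromosome's (γ, C, ε).

```python
import random

# Search ranges for (gamma, C, epsilon), taken from the experimental setup.
BOUNDS = [(0.0, 100.0), (0.0, 100.0), (0.01, 1.0)]


def random_chromosome(rng):
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def roulette_select(population, fitness, rng):
    # Smaller fitness is better, so weight each individual by 1 / F.
    weights = [1.0 / (f + 1e-12) for f in fitness]
    return rng.choices(population, weights=weights, k=len(population))


def crossover(a, b, p_cross, rng):
    # Single-point crossover on the real-coded string.
    if rng.random() < p_cross:
        point = rng.randrange(1, len(a))
        return a[:point] + b[point:], b[:point] + a[point:]
    return a[:], b[:]


def mutate(chromosome, p_mut, rng):
    # Each gene is resampled within its range with probability p_mut.
    return [rng.uniform(lo, hi) if rng.random() < p_mut else g
            for g, (lo, hi) in zip(chromosome, BOUNDS)]


def run_ga(fitness_fn, pop_size=20, generations=200,
           p_cross=0.7, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    best, best_f = None, float("inf")
    for _ in range(generations):
        fitness = [fitness_fn(ch) for ch in population]
        for ch, f in zip(population, fitness):
            if f < best_f:
                best, best_f = ch[:], f
        parents = roulette_select(population, fitness, rng)
        children = []
        for i in range(0, pop_size - 1, 2):
            c1, c2 = crossover(parents[i], parents[i + 1], p_cross, rng)
            children += [mutate(c1, p_mut, rng), mutate(c2, p_mut, rng)]
        if len(children) < pop_size:
            children.append(mutate(parents[-1], p_mut, rng))
        population = children
    return best, best_f


# Toy fitness (hypothetical): squared distance to an arbitrary target point.
def toy_fitness(ch):
    return (ch[0] - 10.0) ** 2 + (ch[1] - 50.0) ** 2 + (ch[2] - 0.1) ** 2


best, best_f = run_ga(toy_fitness)
```

Because plain roulette selection can lose the best individual between generations, the sketch records the best chromosome seen so far and returns it when the maximum number of generations is reached.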
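2.1 Data acquisition
The experimental data are aquatic product prices at the Beijing Xinfadi wholesale market, taken from the China Agricultural Information Network and covering January 2011 to December 2015. Common aquatic products fall into categories such as fish, shrimp and crab. Species were chosen with reference to this classification and with local eating habits and market sales in mind: mandarin fish, Metapenaeus ensis and Portunus trituberculatus, which are widely eaten by residents, were selected from the respective categories. In total 1541 records were crawled for mandarin fish, 1525 for Metapenaeus ensis and 1430 for Portunus trituberculatus, as shown in Fig.3. Price values were parsed from the records, and for each month the mean over the days with records was computed. Data from 2011 to 2014 were used to train the model, which then forecast the 2015 values; these were compared with the actual 2015 data to verify the accuracy of the model.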
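The monthly averaging step can be sketched as a simple grouping of daily records by month. The record layout (an ISO date string and a price) and the sample values are assumptions made for the example; the actual crawled records are not reproduced here.

```python
from collections import defaultdict


def monthly_means(records):
    """records: iterable of ('YYYY-MM-DD', price) pairs.

    Returns {'YYYY-MM': mean price over the days that have a record}.
    """
    buckets = defaultdict(list)
    for date, price in records:
        buckets[date[:7]].append(price)  # 'YYYY-MM' is the month key
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}


# Hypothetical daily records, for illustration only.
records = [("2011-01-03", 60.0), ("2011-01-04", 62.0), ("2011-02-01", 70.0)]
means = monthly_means(records)
```

Averaging only over days with records means that gaps in the crawl shrink the sample for that month without biasing it toward zero.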
Fig.3 Monthly average prices of aquatic products
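2.2 Forecasting process
Because data series differ in range and volatility, the raw series should first be preprocessed, for example by normalization or a logarithmic transform. Since aquatic product prices fluctuate over a wide interval, the logarithm is taken here to map the price series onto a smaller interval, and all subsequent operations are applied to the log-transformed series.
First, the series is made stationary by seasonal differencing or first-order differencing, and its stationarity is judged from the autocorrelation coefficients. After this processing, the partial correlation coefficients are examined to determine the correlation order of each series, from which the training data sets are generated.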
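The preprocessing described here (log transform, then first-order or seasonal differencing) and its inverse can be sketched as follows. The function names, the seasonal lag of 12 for monthly data and the sample prices are assumptions for illustration.

```python
import math


def log_transform(series):
    return [math.log(x) for x in series]


def difference(series, lag=1):
    """lag=1 gives first-order differencing; lag=12 gives seasonal
    differencing for monthly data (an assumed period)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undo_difference(diffed, history, lag=1):
    """Rebuild levels from differences, given the last `lag` values that
    precede the first difference."""
    out = list(history[-lag:])
    for d in diffed:
        out.append(out[-lag] + d)
    return out[lag:]


# Hypothetical prices, for illustration only.
prices = [50.0, 55.0, 60.5, 58.0, 52.0]
logs = log_transform(prices)
d1 = difference(logs, lag=1)
rebuilt = [math.exp(v) for v in undo_difference(d1, logs[:1], lag=1)]
```

Forecasts made on the differenced log series have to be passed back through `undo_difference` and `math.exp` before they can be compared with actual prices.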
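Next, the genetic algorithm is used to optimize the SVR parameters on the training set. The GA is initialized with a population size of 20 and 200 generations; the search ranges for γ, C and ε are [0,100], [0,100] and [0.01,1], respectively, and 5-fold cross validation is used during iteration. When the maximum number of iterations is reached, the optimal combination is output.
The optimized parameter combinations obtained by the GA are listed in Table 1. The model is built with them and used to forecast the 2015 values. The forecasts are shown in Fig.4.
Table 1 Parameter combination
Fig.4 Comparison between forecasted and actual values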
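2.3 Comparative analysis
Relative error (RE), mean absolute percentage error (MAPE) and root mean square error (RMSE) are used to evaluate the accuracy of the model:
$$RE_i = \frac{y_i' - y_i}{y_i} \times 100\%,\qquad MAPE = \frac{1}{n}\sum_{i=1}^{n} \left| \frac{y_i' - y_i}{y_i} \right| \times 100\%,\qquad RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i' - y_i)^2}$$
where y_i is the true value, y_i' is the forecast and n is the number of data items. The smaller the error, the more accurate the model; detailed values are given in Table 2.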
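The three evaluation metrics can be computed as in the sketch below. The signed form of the relative error and the sample values are assumptions; the values are hypothetical and are not taken from Table 2.

```python
import math


def relative_errors(actual, predicted):
    """Signed relative error of each forecast, as a fraction."""
    return [(p - a) / a for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.1 means 10%)."""
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same units as the prices."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical actual and forecast prices.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
re = relative_errors(actual, predicted)
m = mape(actual, predicted)
r = rmse(actual, predicted)
```

MAPE is scale-free and so comparable across species, whereas RMSE is in price units and grows with the price level of the product.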
The time-series-based SVR model and BPANN were also used to forecast the price series; the results are shown in Fig.5. Table 3 gives the forecasting performance of the different models for 2015. For mandarin fish, MAPE and RMSE were 14.32% and 36.76% lower than with the SVR model, and 16.98% and 11.71% lower than with the BPANN model. For Metapenaeus ensis, MAPE and RMSE were 32.64% and 21.55% lower than with the SVR model, and 46.91% and 31.1% lower than with the BPANN model. For Portunus trituberculatus, MAPE and RMSE were 7.51% and 12.64% lower than with the SVR model, and 11.13% and 29.5% lower than with the BPANN model. The proposed model therefore has better forecasting performance.
Field surveys show that the supply of aquatic products is strongly affected by season, weather, transport and other factors, and prices fluctuate irregularly as these factors change. Take Portunus trituberculatus as an example. It is a sea crab farmed mainly along the eastern coast in Liaoning, Shandong, Fujian and Guangdong; since Beijing is in the north, this market is supplied mainly from the Bohai Bay area. May and August are the periods when the crab comes to market each year, and prices swing widely with supply and demand. In May the crab comes to market and prices dip slightly, but supply is not large and the fishing moratorium begins in June, so prices rise sharply for a while afterwards. Around August, as the crab comes to market on a large scale, prices fall sharply. The forecasts agree with this pattern.
Table 2 Comparison between forecasted and actual price values
Fig.5 Comparison of forecasts from the time-series-based BPANN, SVR and GA-SVR models
Price fluctuations are usually driven by many factors, such as species, farming region, market conditions, transport and eating habits. It is hard to infer the drivers for other species from those of one species in one region, and data collection is difficult, which makes aquatic product price forecasting hard. This paper studies aquatic product prices from a technical perspective and draws on results from related fields to find an adaptable forecasting method. As the big data era arrives, data will become richer and more complete, and the model can be validated further.
Table 3 Price forecasting results of different models
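Based on the nonlinear, non-stationary and periodic character of aquatic product price series, this paper proposed a GA-SVR forecasting model for aquatic product prices based on time series analysis. The main conclusions are as follows:
1) Time series analysis, through the partial autocorrelation coefficients together with differencing and seasonal differencing, makes the price series stationary and determines the correlation order, which yields a training data set that meets the requirements of forecasting.
2) To address the nonlinearity and periodicity of aquatic product prices, SVR with a radial basis kernel is chosen as the forecasting model and its parameters are optimized by the GA, giving a combined GA-SVR model for aquatic product price forecasting.
3) The model was validated on price data for mandarin fish, Metapenaeus ensis and Portunus trituberculatus by forecasting the 2015 prices. Compared with the actual values, the mean absolute percentage errors were 6.70%, 7.82% and 14.76%, and the root mean square errors were 5.8531, 23.7011 and 13.8580, respectively. The time-series-based SVR model and the BPANN model were used for comparison, and the proposed model was more accurate. The model can therefore serve as a reference for forecasting aquatic product prices.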
[1] Wang Weiwei, Liang Gefeng, Sun Chen. The price fluctuant characteristics research on China's aquatic products market[J]. Chinese Fisheries Economics, 2015, 33(6): 56-63. (in Chinese with English abstract)
[2] Ren Hongwei. Study on methods of forecasting the farm products prices[J]. Chinese Agricultural Science Bulletin, 2011, 27(26): 209-212. (in Chinese with English abstract)
[3] Tarjei K. A time series spot price forecast model for the Nord Pool market[J]. Electrical Power and Energy Systems, 2014, 61: 20-26.
[4] Rojas I, Valenzuela O, Rojas F, et al. Soft-computing techniques and ARMA model for time series prediction[J]. Neurocomputing, 2008, 71: 519-537.
[5] Daniel B, Carlo M. Forecasting copper prices with dynamic averaging and selection models[J]. North American Journal of Economics and Finance, 2015, 33: 1-38.
[6] Deepak S, Swarup K S. Electricity price forecasting using artificial neural networks[J]. Electrical Power and Energy Systems, 2011, 33: 550-555.
[7] Li Ganqiong, Xu Shiwei, Li Zhemin, et al. Using quantile regression approach to analyze price movements of agricultural products in China[J]. Journal of Integrative Agriculture, 2012, 11(4): 674-683.
[8] Xiong Tao, Li Chongguang, Bao Yukun, et al. A combination method for interval forecasting of agricultural commodity futures prices[J]. Knowledge-Based Systems, 2015, 77: 92-102.
[9] Zhou Shihao, Lin W T, Ni Yansen. Price forecasting approach for initial public offerings using genetic algorithm and neural network[J]. Computer Engineering, 2007, 33(22): 9-11. (in Chinese with English abstract)
[10] Zhu Bangzhu, Wei Yiming. Carbon price forecasting with a novel hybrid ARIMA and least squares support vector machines methodology[J]. Omega, 2013, 41: 517-524.
[11] Zhang Xiaoshuan, Hu Tao, Brian Revell, et al. A forecasting support system for aquatic products price in China[J]. Expert Systems with Applications, 2005, 28: 119-126.
[12] Li Hongwei, Gao Xiaoxiang, Cheng Kejun. The application of wavelet neural network in prediction of the fish price[J]. Applied Mechanics and Materials, 2014, 687-691: 1945-1949.
[13] Ren Haijun, Sun Ruizhi, Liu Guangli. Research of time-series forecasting algorithm based on AR_SVR model[J]. Computer Engineering and Design, 2010, 31(2): 421-424. (in Chinese with English abstract)
[14] Li Zhemin, Xu Shiwei, Cui Liguo, et al. Prediction study based on dynamic chaotic neural network: taking potato time series prices as an example[J]. Systems Engineering-Theory & Practice, 2015, 35(8): 2083-2091. (in Chinese with English abstract)
[15] Peng K L, Wu C H, Goo Y J. The development of a new statistical technique for relating financial information to stock market returns[J]. International Journal of Management, 2004, 21(4): 492-505.
[16] Byeonghwa P, Jae K B. Using machine learning algorithms for housing price prediction: The case of Fairfax County, Virginia housing data[J]. Expert Systems with Applications, 2015, 42: 2928-2943.
[17] Sun Jianming. Pork price forecast based on breeding sow stocks and hog-grain price ratio[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(13): 1-6. (in Chinese with English abstract)
[18] Tu Xingyue, Xue Jiani, Guo Chengkun, et al. Short-term forecast of agricultural products price based on time series and RBF[J]. Journal of Guangdong Agricultural Sciences, 2014(23): 168-173. (in Chinese with English abstract)
[19] Jiajun Zong, Quanyin Zhu. Apply grey prediction in the agriculture production price[C]//International Conference on Multimedia Information Networking and Security, 2012.
[20] Longqin Xu, Shuangyin Liu. Study of short-term water quality prediction model based on wavelet neural network[J]. Mathematical and Computer Modelling, 2013, (58): 807-813.
[21] Li Hongwei, Gao Xiaoxiang, Cheng Junke. Research on price forecasting of fish based on wavelet neural network method[J]. Chinese Fisheries Economics, 2014, 32(4): 61-66. (in Chinese with English abstract)
[22] Cryer J D, Chan K S. Time Series Analysis with Applications in R (Second Edition)[M]. New York: Springer-Verlag, 2008.
[23] Pai P F, Lin C S. A hybrid ARIMA and support vector machines model in stock price forecasting[J]. The International Journal of Management Science, 2005, 33(6): 497-505.
[24] Liu Guangli. Research on Economic Early Warning Methods Based on Support Vector Machine[D]. Beijing: China Agricultural University, 2003. (in Chinese with English abstract)
[25] Wang Kuaini, Zhong Ping, Zhao Yaohong. Application of robust support vector regression in financial time sequence prediction[J]. Computer Engineering, 2011, 37(15): 155-163. (in Chinese with English abstract)
[26] Zhang Hao, Luo Yiyong, Zhang Liting, et al. Cultivated land change forecast based on genetic algorithm and least squares support vector machines[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2009, 25(7): 226-231. (in Chinese with English abstract)
[27] Jiawei H, Micheline K, Jian P. Data mining: concepts and techniques[M]. Third edition. Burlington: Morgan Kaufmann Publishers, 2012.
[28] Wang Haijun, Liu Minyan, Gao Juan. Calculation of theoretical and accessible yields of agricultural land based on genetic algorithm and support vector machine[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(19): 244-252. (in Chinese with English abstract)
[29] Wang Haijun, Liu Minyan, Gao Juan. Calculation of theoretical and accessible yields of agricultural land based on genetic algorithm and support vector machine[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(19): 244-252. (in Chinese with English abstract)
[30] Wang Jue, Qiao Jianzhong. Parameter selection of SVR based on improved k-fold cross validation[J]. Applied Mechanics and Materials, 2014, 462: 182-186.
Forecasting model and validation for aquatic product price based on time series GA-SVR
Duan Qingling,Zhang Lei,Wei Fangfang,Xiao Xiaoyan,Wang Liang
(College of Information and Electrical Engineering,China Agricultural University,Beijing 100083,China)
Fluctuations in aquatic product prices have an important impact on the development of the aquaculture industry. Accurate forecasts enable farmers to keep abreast of changes in the market and to plan aquaculture rationally. Based on the nonlinear, non-stationary and periodic character of aquatic product price series, this paper presents a time-series-based support vector regression (SVR) model optimized by a genetic algorithm (GA) for forecasting aquatic product prices. First, time series analysis was applied to the price series: the autocorrelation function was used to judge stationarity and the partial correlation coefficient was used to determine the data items, which yielded the training data set. The genetic algorithm was then used to optimize the SVR parameters. For an SVR with a radial basis kernel function these parameters are the kernel function coefficient, the penalty factor and the loss parameter, and they were encoded as real-number individuals. The mean square error was used as the fitness function to calculate the fitness value of each individual, and the selection operation retained the individuals with better fitness values. The crossover operation used single-point crossover, exchanging different individuals at a corresponding position with a certain probability. The mutation operation enhanced the local search ability of the algorithm and helped it avoid local minima: each individual was mutated with a certain probability, changing its current value, to generate a new population. 5-fold cross validation was introduced into each iteration to obtain the optimized parameter combination. Finally, the SVR model was established with the optimized parameters to forecast aquatic product prices in the next period.
In this paper, mandarin fish, Metapenaeus ensis and Portunus trituberculatus were selected as the experimental objects. The experimental data were aquatic product prices from January 2011 to December 2015 from the Beijing Xinfadi market website (http://www.xinfadi.com.cn). After crawling the web data, including 1,541 records for mandarin fish, 1,525 records for Metapenaeus ensis and 1,430 records for Portunus trituberculatus, we calculated the monthly average price to represent the price of each period. The proposed model was trained on data from 2011 to 2014 and used to forecast the prices of the following year. Compared with the actual values, the mean absolute percentage errors for mandarin fish, Metapenaeus ensis and Portunus trituberculatus were 6.70%, 7.82% and 14.76%, with corresponding root mean square errors of 5.8531, 23.7011 and 13.8580, respectively. A market survey showed that the forecasts were in line with the actual situation. The time-series-based SVR model and BP neural network model were used for comparison, and the results showed that the proposed model was superior. In view of the characteristics of aquatic product prices, the proposed combined model addresses the determination of the relevant items of the price series, the selection of the kernel function and the optimization of the parameters. The results showed that the proposed model can provide a basis for forecasting aquatic product prices.
aquaculture;models;support vector machine;price forecast;aquatic product;genetic algorithm;time series
doi: 10.11975/j.issn.1002-6819.2017.01.042
CLC number: F304.2; TP301.6
Document code: A
Article ID: 1002-6819(2017)-01-0308-07
Duan Qingling, Zhang Lei, Wei Fangfang, Xiao Xiaoyan, Wang Liang. Forecasting model and validation for aquatic product price based on time series GA-SVR[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(1): 308-314. (in Chinese with English abstract) doi: 10.11975/j.issn.1002-6819.2017.01.042 http://www.tcsae.org
Received: 2016-05-16
Revised: 2016-10-24
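Supported by the Special Fund for Agro-scientific Research in the Public Interest (201203017) and the Ningbo Major (Key) Agricultural Science and Technology Project (2011C11006)
Duan Qingling, female, professor and doctoral supervisor, mainly engaged in research on intelligent agricultural information processing and data mining. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China. Email: dqling@cau.edu.cn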