Convolutional Neural Network Combining Principal Component Analysis and Parallel Mixing
MA Wen-kai
(School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)
Abstract: Convolutional neural networks are prone to overfitting and perform poorly when trained on small samples. To address these problems, we propose a convolutional neural network model that combines principal component analysis (PCA) with parallel mixing (PCA Parallel Mixing CNN, PCA-PMCNN). First, the model initializes the convolutional neural network through unsupervised PCA pre-training, learning an initial filter set that captures the statistical characteristics of the training data; this addresses the problem that the first-layer filters cannot be fully trained. Second, local contrast normalization and a probability-maximization sampling rule are introduced to reduce the feature loss caused by downsampling and to make the feature description more robust. Finally, Rectified Linear Units (ReLU) are used in place of the conventional nonlinear activation function to keep the features sparse and improve training efficiency. Experimental results show that the model achieves a good recognition rate on pedestrian targets and is robust to pedestrian overlap, pose changes, and complex backgrounds.
Key words: convolutional neural network; principal component analysis; local contrast normalization; probability-maximization downsampling; parallel mixing
CLC number: TP391 Document code: A Article ID: 1009-3044(2018)25-0199-02
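An artificial neural network with too many parameters is more likely to overfit, and an excessively deep network is prone to gradient divergence, so that its solution easily becomes trapped in a local optimum. Compared with artificial neural networks, deep CNN features are highly robust, resistant to rotation, and insensitive to illumination changes. LeNet5 uses weight sharing and downsampling layers to reduce the number of network parameters and the feature dimensionality, but the network is shallow, and its classification efficiency and recognition ability leave room for improvement. Alex-Net increases the depth of the convolutional neural network and the number of convolution kernels in the model. Reference [1] uses the Rectified Linear Unit (ReLU) to speed up gradient convergence; reference [2] uses fully connected clustering to improve robustness to non-uniform motion blur in images; reference [3] uses a Dropout layer to increase the randomness of training and prevent overfitting.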
1 Proposed Algorithm
1.1 Unsupervised Pre-training with Principal Component Analysis
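The input to the convolutional neural network is $N$ images of size $m \times n$, and each convolution filter has size $k_1 \times k_2$. Let $X_i$ denote the patch data obtained from image $I_i$, and let $X$ collect the patch data of all images. The filter bank $W_l^1$ that PCA learns for initializing the convolutional neural network can then be expressed as:
$$W_l^1 = \mathrm{mat}_{k_1,k_2}\left(q_l\left(XX^{T}\right)\right)$$ (1)
where $q_l(XX^{T})$ is the $l$-th principal eigenvector of $XX^{T}$ and $\mathrm{mat}_{k_1,k_2}(\cdot)$ reshapes a vector of length $k_1 k_2$ into a $k_1 \times k_2$ matrix.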
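Unsupervised PCA training extracts the principal components of local patches of the input images, and these components represent the local features of the images as fully as possible.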
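The filter learning of Equation (1) can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the function names `extract_patches` and `pca_filters`, the dense stride-1 patch extraction, the per-patch mean removal, and the random test images are all assumptions made for illustration.

```python
import numpy as np

def extract_patches(image, k1, k2):
    """Collect every k1 x k2 patch of a 2-D image as a column vector."""
    m, n = image.shape
    cols = [
        image[r:r + k1, c:c + k2].reshape(-1)
        for r in range(m - k1 + 1)
        for c in range(n - k2 + 1)
    ]
    return np.stack(cols, axis=1)  # shape: (k1*k2, number of patches)

def pca_filters(images, k1, k2, num_filters):
    """Return num_filters filters of size k1 x k2 learned by PCA."""
    blocks = []
    for image in images:
        patches = extract_patches(image, k1, k2)
        # Assumption: remove each patch's mean before PCA.
        patches = patches - patches.mean(axis=0, keepdims=True)
        blocks.append(patches)
    X = np.concatenate(blocks, axis=1)
    # Eigen-decomposition of the symmetric matrix X X^T.
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)
    # Keep the eigenvectors with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:num_filters]
    # Reshape each leading eigenvector into a k1 x k2 filter.
    return eigvecs[:, order].T.reshape(num_filters, k1, k2)

rng = np.random.default_rng(0)
images = [rng.standard_normal((16, 16)) for _ in range(4)]
filters = pca_filters(images, k1=5, k2=5, num_filters=8)
print(filters.shape)  # (8, 5, 5)
```

Because the filters are eigenvectors of a symmetric matrix, they are mutually orthogonal and of unit norm when flattened, so the initial filter bank is decorrelated.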
1.2 Parallel Mixing CNN Model
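Humans observe the world through the binocular optic nerves; the visual information they acquire is mixed via the lateral geniculate body and then passed to the brain for analysis. When the same image is fed into a deep learning network under different encodings, the network can learn feature information of different dimensions. This paper therefore proposes a parallel mixing CNN model in which two CNN data streams take different input data to strengthen the description of image features, as shown in Figure 1.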
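The two-stream idea can be sketched roughly as follows. The text does not specify the two encodings or how the streams are mixed, so everything concrete here is hypothetical: the choice of raw intensities and horizontal differences as encodings, the single convolution-plus-ReLU layer per stream, and mixing by channel concatenation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid-mode 2-D cross-correlation of one image with one kernel."""
    k1, k2 = kernel.shape
    m, n = image.shape
    out = np.empty((m - k1 + 1, n - k2 + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + k1, c:c + k2] * kernel)
    return out

def relu(x):
    """Rectified linear unit: zero out negative responses."""
    return np.maximum(x, 0.0)

def stream(image, kernels):
    """One CNN stream: a bank of filters followed by ReLU."""
    return np.stack([relu(conv2d_valid(image, k)) for k in kernels])

def parallel_mix(image, kernels_a, kernels_b):
    # Hypothetical encodings of the same image: raw intensities and
    # horizontal differences (kept at the original width via prepend).
    encoding_a = image
    encoding_b = np.diff(image, axis=1, prepend=image[:, :1])
    features_a = stream(encoding_a, kernels_a)
    features_b = stream(encoding_b, kernels_b)
    # Assumed mixing rule: concatenate feature maps along the channel axis.
    return np.concatenate([features_a, features_b], axis=0)

rng = np.random.default_rng(0)
image = rng.standard_normal((12, 12))
kernels_a = rng.standard_normal((4, 3, 3))
kernels_b = rng.standard_normal((4, 3, 3))
mixed = parallel_mix(image, kernels_a, kernels_b)
print(mixed.shape)  # (8, 10, 10)
```

Concatenation is only one plausible way to mix the streams; it keeps the features of both encodings available to later layers without forcing them to share filters.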
2 Experimental Setup and Results
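Test experiments were carried out on the public Caltech256 dataset. For training on Caltech256, the Dropout ratio was 0.5; the initial learning rate was 0.005 and was reduced with a polynomial decay schedule with power 0.5; the batch size was 20; and the number of iterations was 200,000. Table 1 lists the classification accuracy of different deep network models on Caltech256; PMCNN also improves Top-1 classification accuracy.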
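The learning-rate schedule can be made concrete with a short sketch. The text gives only the initial rate (0.005), the power (0.5), and the iteration count (200,000); the exact formula below, base_lr * (1 - iteration / max_iter) ** power, is an assumption based on the common "poly" policy, and the function name `poly_learning_rate` is hypothetical.

```python
def poly_learning_rate(iteration, base_lr=0.005, max_iter=200_000, power=0.5):
    """Polynomial decay: base_lr * (1 - iteration / max_iter) ** power."""
    # Clamp so that the rate stays at zero after the last iteration.
    progress = min(iteration, max_iter) / max_iter
    return base_lr * (1.0 - progress) ** power

for it in (0, 100_000, 150_000, 200_000):
    print(it, poly_learning_rate(it))
```

With power 0.5 the rate falls slowly at first and steeply near the end: under this formula it is still half the initial value (0.0025) after 150,000 of the 200,000 iterations.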
3 Conclusion
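Drawing on the principles of human vision, this paper combines principal component analysis with a parallel mixing neural network structure and proposes the PCA-PMCNN model, which improves image classification accuracy while making the network more robust. Local contrast normalization makes the model more robust to target images that contain noise, and therefore to complex backgrounds. Probability-maximization downsampling both improves the noise resistance of image features and reduces the loss of image information, raising the utilization of image information; it also helps keep training from becoming trapped in a local optimum and makes the features sparser. Experiments on a public dataset show that, while preserving network depth, mixing information across multiple feature data streams improves the discriminability of the features.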
References:
[1] Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[2] Ballester P, Araujo R M. On the performance of GoogLeNet and AlexNet applied to sketches[C]// Thirtieth AAAI Conference on Artificial Intelligence. AAAI Press, 2016: 1124-1128.
[3] Sun J, Cao W, Xu Z, et al. Learning a convolutional neural network for non-uniform motion blur removal[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015: 769-777.
[Corresponding editor: DAI Ying]