Xie Ping
Abstract: As mass storage systems keep growing in scale, storage node failures have become commonplace, and reconstruction optimization for storage systems is drawing increasing attention from researchers. This paper surveys the progress and current state of storage system reconstruction techniques at two levels, data layout and data scheduling. It analyzes and summarizes typical reconstruction techniques in terms of their principles and implementation mechanisms, and compares the scenarios to which each technique is suited. Taking into account the complexity of workload characteristics and of application environments in mass storage systems, it also points out future research directions for storage system reconstruction techniques.
Keywords: erasure-coded storage systems; reconstruction techniques; storage reliability; data availability
CLC number: TP311 Document code: A Article ID: 2095-1302(2017)05-0-04
0 Introduction
Society today is in an era of explosive data growth. The networking vendor Cisco forecasts that from 2013 to 2018 global monthly network traffic will grow at an annual rate of 21%, from 51 EB per month in 2013 to 132 EB per month in 2018, roughly 2.6 times the 2013 volume; by 2016, monthly network traffic had already reached 91 EB [1]. Enterprise data centers face the need to store massive amounts of data, so they require data storage systems that are inexpensive, reliable, high-performance, and energy-efficient.
Modern storage systems adopt a fault-tolerance strategy and rely on reconstruction techniques to ensure storage reliability and data availability. On the one hand, when some storage nodes fail, reconstruction restores the failed nodes so that storage remains reliable. On the other hand, given the complex characteristics of network I/O workloads, reconstruction keeps data efficiently available so that user access requests are answered promptly. Reconstruction is an optimization process that starts from the storage system's fault-tolerant data layout, applies an optimized I/O scheduling strategy, and reduces I/O overhead and CPU computation overhead in order to deliver user data reliably and quickly.
1 Reconstruction Optimization Techniques for Erasure-Coded Storage Systems
Figure 1 shows the reconstruction optimization process of a typical erasure-coded storage system. In such a system, k disks holding original data are encoded to produce m redundant disks. When no more than m disks fail, the failed disks are recovered from the surviving data disks and redundant disks according to the encoding/decoding rules of the erasure code. The storage efficiency is k/(k+m). Reconstruction performance is one of the most important design goals of an erasure code. In a real storage environment it is usually measured by the reconstruction time needed to recover a failed disk: the shorter the reconstruction time, the better the reconstruction performance. In theoretical analysis, computation during reconstruction is several orders of magnitude faster than I/O, so its cost can be neglected and the reconstruction performance of a parity array code can instead be measured by the number of blocks accessed. Reconstruction optimization for parity array codes has attracted wide attention in academia and industry, and current work falls into the following research directions.
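The encode-and-recover cycle described above can be illustrated with the simplest possible code, a single XOR parity (m = 1). This is only a minimal sketch: the helper names `xor_blocks` and `storage_efficiency` and the sample block contents are assumptions made for illustration, and codes that tolerate more than one failure need more elaborate parity rules than plain XOR.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def storage_efficiency(k, m):
    """Fraction of raw capacity that holds user data: k / (k + m)."""
    return k / (k + m)


# k = 3 data disks, each represented by one small block (illustrative data).
data = [b"abcd", b"efgh", b"ijkl"]

# Encoding: m = 1 redundant disk holding the XOR of all data blocks.
parity = xor_blocks(data)

# Disk 1 fails; rebuild it from the surviving data disks and the parity disk.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Here rebuilding one block reads k = 3 surviving blocks, which is the block-count measure of reconstruction cost used in the theoretical comparison.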
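1.1 Optimal Reconstruction Chain Length Strategy
The reconstruction performance of MDS codes degrades as the storage system grows. To address this, researchers have proposed many new Non-MDS codes that improve reconstruction performance, such as WEAVER codes [2], HoVer codes [3], Pyramid codes [4], Stepped Combination codes [5], Code-M [6], and V2-Code [7]. Compared with MDS codes, Non-MDS codes use more parity blocks when building parity chains, which reduces the number of data blocks needed to generate each parity block. For a system of the same size, Non-MDS codes therefore have shorter parity chains. During reconstruction this gives a shorter reconstruction chain, so fewer blocks are read to rebuild a failed block and reconstruction performance improves. Moreover, the reconstruction chain length of a Non-MDS code does not change as the storage system grows, so its reconstruction performance is independent of RAID size. For an MDS code, in contrast, the reconstruction chain lengthens as the system grows, and reconstruction performance gradually degrades. Designing new Non-MDS codes with high fault tolerance has thus become one research direction for improving reconstruction performance.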
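1.2 Optimal Reconstruction Data Volume Strategy
In existing multi-fault-tolerant array codes, single-disk failure is the most common case to recover from [8]. The earliest reconstruction strategies used conventional recovery [9,10], in which every failed block is rebuilt using a single type of parity chain. This requires reading all the data blocks and therefore increases the I/O cost. Multi-fault-tolerant codes, however, usually contain several types of parity chain (such as row, diagonal, and anti-diagonal parity chains), and chains of different types share blocks. Maximizing the number of shared blocks reduces the amount of data read for reconstruction and thereby improves reconstruction performance. This approach is called hybrid parity chain reconstruction. For example, the RDOR algorithm gives a fast rebuild scheme for single-disk failure in RDP codes that finds the minimum set of blocks to read [11]; Wang et al. used the same minimum-reconstruction-data idea to achieve fast single-disk reconstruction for EVENODD codes [12]; for single-disk recovery in arbitrary multi-fault-tolerant erasure codes, Khan et al. proposed an enumeration recovery algorithm that searches for the minimum reconstruction data volume [13]; and Zhu et al. proposed a replacement recovery algorithm that speeds up this search and finds a suboptimal minimum reconstruction data volume [14].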
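The benefit of mixing parity chains can be made concrete with a small brute-force search over an RDP-style layout. The sketch below is not the RDOR algorithm itself, only an exhaustive enumeration in the spirit of the approaches above. It assumes the usual RDP geometry: p is prime, there are p - 1 rows, columns 0..p-2 are data disks, column p-1 is the row parity disk, column p is the diagonal parity disk, and block (r, c) lies on diagonal (r + c) mod p, with diagonal p - 1 not stored. All function names are hypothetical.

```python
from itertools import product


def row_chain(p, r, failed):
    """Blocks read to rebuild (r, failed) from its row parity chain."""
    return {(r, c) for c in range(p) if c != failed}


def diag_chain(p, r, failed):
    """Blocks read to rebuild (r, failed) from its diagonal parity chain.

    Returns None when the block lies on the diagonal that has no parity.
    """
    d = (r + failed) % p
    if d == p - 1:
        return None
    blocks = {(d, p)}  # the diagonal parity block itself
    for c in range(p):
        if c == failed:
            continue
        rr = (d - c) % p
        if rr != p - 1:  # row p-1 does not exist in the array
            blocks.add((rr, c))
    return blocks


def conventional_reads(p):
    """Blocks read when every lost block uses its row chain."""
    return (p - 1) * (p - 1)


def min_read_recovery(p, failed):
    """Try every row/diagonal assignment and keep the one with fewest reads."""
    best = None
    for choice in product((0, 1), repeat=p - 1):
        reads = set()
        valid = True
        for r, use_diag in enumerate(choice):
            chain = diag_chain(p, r, failed) if use_diag else row_chain(p, r, failed)
            if chain is None:
                valid = False
                break
            reads |= chain  # shared blocks are counted only once
        if valid and (best is None or len(reads) < len(best[1])):
            best = (choice, reads)
    return best


choice, reads = min_read_recovery(5, 0)
print(len(reads), "blocks read instead of", conventional_reads(5))
```

For p = 5 with data disk 0 failed, the search settles on two rows rebuilt by row parity and two by diagonal parity, reading 12 distinct blocks rather than 16. The exhaustive search grows as 2^(p-1), which is why faster search procedures such as the replacement approach matter for larger arrays.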
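1.3 Optimal Reconstruction Bandwidth Strategy
In distributed storage systems, the main optimization goal during reconstruction is to reduce network I/O, which is the aim of the optimal reconstruction bandwidth strategy, so that the system delivers good network storage performance. Regenerating codes [15] were proposed with optimal reconstruction bandwidth as their objective. For example, Exact Regenerating Codes recover the failed data exactly [16]; Li et al., building on regenerating codes, took heterogeneous network structure and parallel recovery algorithms into account and proposed a fast rebuild scheme [17]; MCR and MBCR use regenerating-code strategies to rebuild multiple failed nodes concurrently [18,19]; and CHR, targeting heterogeneous network clusters, uses per-node heterogeneity weights to accelerate single-disk failure reconstruction in storage systems with RAID-6 codes [20]. PM-RBT, based on MSR regenerating codes, minimizes disk I/O during reconstruction while keeping network traffic, storage efficiency, and reliability optimal, and thereby speeds up reconstruction [21].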
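To give a feel for the bandwidth savings that regenerating codes target, the sketch below evaluates the two extreme points of the storage/bandwidth trade-off as they are usually stated in the network-coding literature cited as [15]. The symbols are assumptions introduced here, not taken from the text above: a file of size M, any k nodes sufficient to recover it, and d helper nodes contacted during a repair (k <= d). The function names are hypothetical.

```python
from fractions import Fraction


def conventional_repair_bandwidth(M):
    """Repairing one node by first rebuilding the whole file downloads M."""
    return Fraction(M)


def msr_repair_bandwidth(M, k, d):
    """Repair traffic at the minimum-storage regenerating (MSR) point."""
    if not k <= d:
        raise ValueError("need k <= d helpers")
    return Fraction(M * d, k * (d - k + 1))


def mbr_repair_bandwidth(M, k, d):
    """Repair traffic at the minimum-bandwidth regenerating (MBR) point."""
    if not k <= d:
        raise ValueError("need k <= d helpers")
    return Fraction(2 * M * d, k * (2 * d - k + 1))


# Illustrative parameters: file of 10 units, k = 10, d = 13 helpers.
M, k, d = 10, 10, 13
print(conventional_repair_bandwidth(M))  # 10
print(msr_repair_bandwidth(M, k, d))     # 13/4
print(mbr_repair_bandwidth(M, k, d))     # 26/17
```

With these illustrative numbers, repair traffic drops from 10 units to 3.25 units at the MSR point, at the same per-node storage as an MDS code, and further to about 1.53 units at the MBR point, which stores more per node.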
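1.4 Load-Balanced Reconstruction Strategy
In a disk array, storage I/O is parallel, so reconstruction performance is determined by the bottleneck disk. Luo et al. therefore proposed balancing the reconstruction I/O: during reconstruction, the amount of data read from each surviving disk should be balanced, so in a RAID-coded array the blocks chosen for reconstruction should be evenly distributed across the surviving disks, which minimizes the number of reconstruction blocks read from any single disk [22]. S2-RAID proposes a new RAID architecture built around parallel data layout, which ensures that the blocks needed for reconstruction can be read from the surviving disks in parallel and in a balanced way, improving reconstruction performance [23]. In addition, Hou et al. proposed a load-balanced rebuild optimization scheme that balances I/O response time against disk rebuild time [24].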
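The bottleneck-disk argument can be sketched as a small selection problem: among the candidate parity chains for each lost block, pick the combination whose busiest disk serves the fewest reads. The code below is a brute-force illustration under assumed inputs, not the scheme of [22]. Blocks are written as hypothetical (disk, offset) pairs, and `balanced_choice` is a name introduced here.

```python
from collections import Counter
from itertools import product


def balanced_choice(candidates):
    """Pick one chain per lost block, minimizing the maximum per-disk reads.

    candidates[i] is a list of alternative chains for lost block i; each
    chain is a set of (disk, offset) blocks to read. Ties on the bottleneck
    are broken by the total number of distinct blocks read.
    """
    best = None
    for choice in product(*(range(len(c)) for c in candidates)):
        reads = set()
        for chains, idx in zip(candidates, choice):
            reads |= chains[idx]
        per_disk = Counter(disk for disk, _ in reads)
        key = (max(per_disk.values()), len(reads))
        if best is None or key < best[0]:
            best = (key, choice)
    (max_load, total), choice = best
    return choice, max_load, total


# Two lost blocks, each recoverable from disks {0, 1} or from disks {2, 3}.
candidates = [
    [{(0, 0), (1, 0)}, {(2, 0), (3, 0)}],
    [{(0, 1), (1, 1)}, {(2, 1), (3, 1)}],
]
choice, max_load, total = balanced_choice(candidates)
print(choice, max_load, total)
```

In this toy layout every combination reads four blocks in total, but using the same pair of disks for both lost blocks puts two reads on each of them, while spreading the chains across all four disks leaves one read per disk. The total data volume is unchanged and only the bottleneck improves.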
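1.5 Reconstruction I/O Optimization Strategy
Optimizing the reconstruction I/O flow, and reducing the impact of the user I/O flow on reconstruction, can effectively speed up reconstruction, so a variety of reconstruction I/O optimization strategies have been proposed. For example, SOR and DOR exploit the parallelism of reconstruction I/O to accelerate reconstruction [25,26]. PRO exploits the access locality of user I/O and proposes a reconstruction algorithm driven by user I/O popularity: the regions accessed most heavily by users are rebuilt first, which keeps user I/O responsive and accelerates reconstruction [27]. WorkOut uses write redirection to separate the user I/O flow from the reconstruction I/O flow, which avoids interference between the two and speeds up reconstruction [28]. In addition, VDF and MICRO use efficient cache strategies to reduce the number of reconstruction I/Os, that is, the number of disk accesses, and so improve reconstruction performance [29,30].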
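The popularity-driven ordering can be sketched in a few lines. This is a deliberately simplified illustration of the "hottest region first" idea, not the PRO implementation: it assumes the array is divided into zones with a known user access count per zone, and `popularity_order` is a hypothetical name.

```python
def popularity_order(access_counts):
    """Return zone indices hottest-first; ties keep ascending zone order."""
    return sorted(range(len(access_counts)),
                  key=lambda zone: (-access_counts[zone], zone))


# Illustrative user access counts for four zones of the failed disk.
access_counts = [3, 40, 0, 17]
order = popularity_order(access_counts)
print(order)  # [1, 3, 0, 2]
```

A rebuild loop would then walk `order` instead of the sequential zone order, so that most user requests land on regions that have already been rebuilt.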
2 Analysis and Discussion of Reconstruction Optimization Techniques
The preceding sections described reconstruction techniques for erasure-coded storage systems at two levels, reconstruction data layout and reconstruction data scheduling. Table 1 compares and summarizes these typical reconstruction techniques.
The comparison in Table 1 shows that:
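(1) Reconstruction code layouts are moving from the earliest MDS erasure codes toward Non-MDS erasure codes with better reconstruction performance, and further toward "Embed-RAID" codes that offer better fault tolerance and reconstruction performance [26], for example the RAID0+1 and RAID1+0 data layouts;
(2) Reconstruction scheduling strategies are shifting from the multi-threaded, parallel SOR and DOR techniques designed for erasure codes toward the PRO and WorkOut strategies, which optimize for the access characteristics of the reconstruction and user I/O flows, and toward the efficient cache strategies MICRO and VDF, which reduce the volume of reconstruction I/O;
(3) Reconstruction strategies such as RDOR, LDF, and CoRec, which build on existing erasure codes and optimize encoding/decoding complexity to reduce the reconstruction data volume, are receiving growing attention;
(4) Erasure codes are moving from parallel array environments toward distributed storage environments, where regenerating-code reconstruction strategies are studied in light of storage heterogeneity and the complexity of network I/O.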
3 Conclusion
This paper has examined erasure-coded storage systems at two levels, data layout and data scheduling. Focusing on the key reconstruction problems posed by complex storage and application environments, and considering storage heterogeneity and complex user I/O access patterns in parallel and distributed storage scenarios, it has analyzed and summarized existing reconstruction optimization techniques. As network storage technology continues to develop, reconstruction techniques for erasure-coded storage systems are expected to evolve mainly in the following directions:
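(1) Reconstruction strategies for cloud storage. As cloud storage develops, new Non-MDS codes with high fault tolerance and high efficiency, together with reconstruction techniques for cloud storage systems that account for spatial locality and storage node heterogeneity, need further study;
(2) Reconstruction strategies for multi-level erasure codes. Existing erasure codes give user data a single level of reliability and do not consider the service level of user I/O. Combining user data access priority with fault tolerance, in a data reconstruction strategy that integrates multi-level erasure codes, would provide better data access service and fault tolerance;
(3) Workload-aware reconstruction strategies. These draw on statistics and data mining to further analyze the characteristics of data sets, and consider the performance of new storage devices as well as sound data layout and scheduling, in order to deliver good service to users;
(4) Reconstruction strategies for multiple failed nodes. Current strategies mainly target a single failed node, but as storage scales up, strategies that rebuild multiple failed nodes simultaneously remain to be developed; optimizing the reconstruction I/O flow across multiple failed nodes will be the main goal.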
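The organization of data has shifted from compute-centric to storage-centric, and storage research is drawing growing attention from researchers. Overall, storage research is trending toward large capacity, networking, high fault tolerance, and high efficiency. Reconstruction research for distributed high-performance computing and storage systems can therefore be expected to become a major direction.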
References
[1] Cisco global network data traffic forecast and analysis: The Zettabyte Era: Trends and Analysis[EB/OL]. http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/VNI_Hyperconnectivity_WP.html
[2] Hafner J L. WEAVER codes: Highly fault tolerant erasure codes for storage systems[C]. In Proceedings of the 4th USENIX Conference on File and Storage Technologies. Berkeley, CA, USA, USENIX, 2005: 16-26.
[3] Hafner J L. HoVer erasure codes for disk arrays[C].2006 International Conference on Dependable Systems and Networks. IEEE, 2006: 217-226.
[4] Huang C, Chen M, Li J. Pyramid codes: Flexible schemes to trade space for access efficiency in reliable data storage systems[J]. 2007 Sixth IEEE International Symposium on Network Computing and Applications. IEEE, 2007,9(1): 79-86.
[5] Greenan K M, Li X, Wylie J J. Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs[J]. Mass Storage Systems and Technologies,2010,2(1): 1-14.
[6] Wan S, Cao Q, Xie C, et al.Code-m: A non-mds erasure code scheme to support fast recovery from up to two-disk failures in storage systems[J].2010 IEEE/IFIP International Conference on Dependable Systems and Networks. IEEE, 2010,23(5): 51-60.
[7] Xie Ping, Huang Jianzhong, Cao Qiang, et al. V2-Code: A new non-MDS array code with optimal reconstruction performance for RAID-6[C].2013 IEEE International Conference on Cluster Computing (CLUSTER13). IEEE, 2013: 1-8.
[8] Pinheiro E, Weber W D, Barroso L A. Failure Trends in a Large Disk Drive Population[C]. In Proceedings of the 2007 USENIX Conference on File and Storage Technologies. USENIX, 2007: 17-23.
[9] Blaum M, Brady J, Bruck J, et al. EVENODD: An efficient scheme for tolerating double disk failures in RAID architectures[J]. Acm Sigarch Computer Architecture News, 1995, 22(2): 245-254.
[10] Corbett P F, English R, Goel A, et al. Row-diagonal parity for double disk failure correction[C].In Proceedings of the 3rd USENIX Conference on File and Storage Technologies. USENIX, 2004: 1-14.
[11] Xiang L, Xu Y, Lui J, et al. A hybrid approach to failed disk recovery using RAID-6 codes: Algorithms and performance evaluation[J]. ACM Transactions on Storage, 2011, 7(3): 11.
[12] Wang Z, Dimakis A G, Bruck J. Rebuilding for array codes in distributed storage systems[C]. 2010 IEEE GLOBECOM Workshops, 2010: 1905-1909.
[13] Khan O, Burns R, Plank J, et al. In search of I/O-optimal recovery from disk failures[C].In Proceedings of the 3rd USENIX conference on Hot topics in storage and file systems. USENIX Association, 2011: 6.
[14] Zhu Y, Lee P P C, Hu Y, et al.On the speedup of single-disk failure recovery in xor-coded storage systems:Theory and practice[J].Mass Storage Systems and Technologies,2012,22(5):1-12.
[15] Dimakis A G, Godfrey P B, Wu Y, et al.Network coding for distributed storage systems[J]. IEEE Transactions on Information Theory, 2010, 56(9): 4539-4551.
[16] Rashmi K V, Shah N B, Kumar P V, et al.Explicit construction of optimal exact regenerating codes for distributed storage[C].Allerton 2009:47th Annual Allerton Conference on Communication, Control, and Computing. IEEE, 2009: 1243-1249.
[17] Li J, Wang X, Li B. Pipelined regeneration with regenerating codes for distributed storage systems[C]. 2011 International Symposium on Network Coding. IEEE, 2011: 1-6.
[18] Hu Y, Xu Y, Wang X, et al. Cooperative recovery of distributed storage systems from multiple losses with network coding[J]. IEEE Journal on Selected Areas in Communications, 2010, 28(2): 268-276.
[19] Kermarrec A M, Le Scouarnec N, Straub G. Repairing multiple failures with coordinated and adaptive regenerating codes[C].2011 International Symposium on Network Coding. IEEE, 2011: 1-6.
[20] Zhu Y, Lee P P C, Xiang L, et al. A cost-based heterogeneous recovery scheme for distributed storage systems with RAID-6 codes[C].2012 42nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2012: 1-12.
[21] K. V. Rashmi, Preetum Nakkiran, Jingyan Wang,et al. Having Your Cake and Eating It Too: Jointly Optimal Erasure Codes for I/O, Storage and Network-bandwidth[C]. In Proceedings of the 2015 USENIX Conference on File and Storage Technologies. USENIX, 2015:81-94.
[22] Luo X, Shu J.Load-Balanced Recovery Schemes for Single-Disk Failure in Storage Systems with Any Erasure Code[C]. 2013 42nd International Conference on Parallel Processing, 2013: 552-561.
[23] Wan J, Wang J, Yang Q, et al.S2-RAID:A new RAID architecture for fast data recovery[C]. 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies, 2010: 1-9.
[24] Hou R Y, Menon J, Patt Y N. Balancing I/O response time and disk rebuild time in a RAID5 disk array[C].In Proceeding of the Twenty-Sixth Hawaii International Conference on System Sciences. IEEE, 1993: 70-79.
[25] Holland M, Gibson G A, Siewiorek D P. Architectures and algorithms for on-line failure recovery in redundant disk arrays[J].Distributed and Parallel Databases, 1994, 2(3): 295-335.
[26] Holland M, Gibson G A, Siewiorek D P. Fast, on-line failure recovery in redundant disk arrays[C].The Twenty-Third International Symposium on Fault-Tolerant Computing, 1993: 422-431.
[27] Tian L, Feng D, Jiang H, et al. PRO: A Popularity-based Multi-threaded Reconstruction Optimization for RAID-Structured Storage Systems[C].In Proceedings of the 2007 USENIX Conference on File and Storage Technologies. USENIX, 2007:277-290.
[28] Wu S, Jiang H, Feng D, et al. WorkOut: I/O Workload Outsourcing for Boosting RAID Reconstruction Performance[C]. In Proceedings of the 2009 USENIX Conference on File and Storage Technologies. USENIX 2009: 239-252.
[29] Shenggang Wan, Xubin He, Jianzhong Huang,et al.An Efficient Penalty-Aware Cache to Improve the Performance of Parity-Based Disk Arrays under Faulty Conditions[J].IEEE Transactions on Parallel Distributed Systems,2013,24(8): 1500-1513.
[30] Xu L, Bruck J. X-code: MDS array codes with optimal encoding[J].IEEE Transactions on Information Theory, 1999, 45(1): 272-276.
[31] Xie T, Wang H. MICRO: A multilevel caching-based reconstruction optimization for mobile storage systems[J].IEEE Transactions on Computers, 2008, 57(10): 1386-1398.
[32] Guide P. Multiple (Nested) RAID Levels[EB/OL]. http://www.pcguide.com/ref/hdd/perf/raid/levels/mult.htm
[33] Jin C, Jiang H, Feng D, et al. P-Code: A new RAID-6 code with optimal properties[C].In Proceedings of the 23rd international conference on Supercomputing. ACM, 2009: 360-369.
[34] Li M, Shu J, Zheng W. GRID codes: Strip-based erasure codes with high fault tolerance for storage systems[J]. ACM Transactions on Storage, 2009, 4(4): 1-22.
[35] Shiyi Li, Xubin He, Shenggang Wan, et al. Exploiting Decoding Computational Locality to Improve the I/O Performance of an XOR-Coded Storage Cluster under Concurrent Failures[J]. SRDS 2014(1): 125-135.
[36] Shiyi Li,Qiang Cao,Shenggang Wan,et al.PPM: A Partitioned and Parallel Matrix Algorithm to Accelerate Encoding/Decoding Process of Asymmetric Parity Erasure Codes[C]. International Conference on Parallel Processing,2015:460-469.
[37] Jianzhong Huang, Er-wei Dai, Changsheng Xie, et al.CoRec: A Cooperative Reconstruction Pattern for Multiple Failures in Erasure-Coded Storage Clusters[C].International Conference on Parallel Processing,2015: 470-479.