A Low-Complexity Parallel FIR Filter Structure Based on the Iterated Short Convolution Algorithm

2014-05-30 11:42:18 Tian Jing-jing, Li Guang-jun

Journal of Electronics & Information Technology, 2014, Issue 5


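Tian Jing-jing*① Li Guang-jun① Li Qiang①②

①(School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu 611731) ②(Department of Engineering, Aarhus University, Aarhus DK-8000)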

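Based on the fast convolution algorithm, this paper proposes a parallel structure for linear-phase FIR filters. The structure uses fast convolution to reduce the number of sub-filters while making as many sub-filters as possible have symmetric coefficients, and then exploits the coefficient symmetry to reduce the number of multipliers in the sub-filter blocks. For FIR filters with symmetric coefficients, the proposed parallel structure saves a large amount of hardware compared with existing parallel FIR structures, and the saving is more pronounced when the filter has many taps. Specifically, for a 4-parallel 144-tap FIR filter, the proposed structure saves 36 multipliers (14.3%), 23 adders (6.6%), and 35 delay elements (11.0%) compared with the improved Fast FIR Algorithm (FFA) structure.

Parallel FIR filter; fast convolution; iterated short convolution; symmetric coefficients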

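1 Introduction

Finite impulse response (FIR) filters are widely used in video and image processing, wireless communications, and many other fields because of their linear-phase property and unconditional stability. In some applications, such as high-speed remote-sensing satellite receivers and 4G communication systems, ever-increasing data rates demand ever-higher FIR filter throughput, while in others, such as mobile phones and handheld medical devices, the power consumption of the FIR filter is strictly constrained.

This paper proposes an improved parallel FIR filter structure. It uses the fast convolution algorithm to reduce the number of sub-filters in the parallel structure while making as many sub-filters as possible have symmetric coefficients, and then exploits the coefficient symmetry to reduce the number of multipliers in the sub-filter blocks. Compared with existing parallel FIR structures, the proposed structure further reduces hardware cost, especially when the number of filter taps is large.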
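As a rough illustration of how fast convolution reduces the number of sub-filters, the sketch below implements the textbook 2-parallel fast FIR decomposition, which needs three half-length sub-filters instead of four. It is not the structure proposed in this paper; the function names and test signal are hypothetical and chosen only for this example.

```python
def convolve(a, b):
    """Full linear convolution of two sequences (plain Python, no libraries)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def fir_direct(h, x):
    """Reference FIR output y[n] = sum_m h[m] x[n-m], for n = 0..len(x)-1."""
    return convolve(h, x)[:len(x)]


def fir_2parallel_fast(h, x):
    """2-parallel fast FIR: three half-length sub-filters instead of four."""
    assert len(h) % 2 == 0 and len(x) % 2 == 0
    h0, h1 = h[0::2], h[1::2]          # even / odd polyphase parts of h
    x0, x1 = x[0::2], x[1::2]          # even / odd input streams
    a = convolve(h0, x0)               # sub-filter H0 applied to X0
    b = convolve(h1, x1)               # sub-filter H1 applied to X1
    c = convolve([p + q for p, q in zip(h0, h1)],
                 [p + q for p, q in zip(x0, x1)])   # sub-filter H0+H1 on X0+X1
    y = []
    for k in range(len(x0)):
        b_prev = b[k - 1] if k > 0 else 0   # one block delay on the H1*X1 branch
        y.append(a[k] + b_prev)             # y[2k]
        y.append(c[k] - a[k] - b[k])        # y[2k+1]
    return y


h = [1, 2, 3, 4, 4, 3, 2, 1]
x = [3, -1, 0, 2, 5, 7, -4, 1, 0, 6]
print(fir_2parallel_fast(h, x) == fir_direct(h, x))
```

The cross term H0·X1 + H1·X0 is obtained from one shared product (H0+H1)(X0+X1), at the cost of a few extra additions before and after the sub-filters.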

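The rest of this paper is organized as follows. Section 2 describes how parallel FIR filter structures are derived from linear convolution, Section 3 presents the proposed parallel FIR filter structure, Section 4 compares hardware resource usage, Section 5 analyzes finite word-length performance, and Section 6 concludes the paper.

2 Parallel FIR Filter Structures Based on Linear Convolution

3 Proposed Low-Complexity Parallel FIR Filter Structure

Figure 2 Implementation of a sub-filter with symmetric coefficients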
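The idea behind a symmetric-coefficient sub-filter can be sketched as follows: the two input samples that share a coefficient are added first and multiplied once, so roughly half the multipliers are needed. This is the generic folded form of a symmetric FIR filter, written under the assumption of a zero initial state; it is not a transcription of Figure 2, and all names are hypothetical.

```python
def fir_direct(h, x):
    """Reference FIR: one multiplication per tap per output sample."""
    return [sum(h[m] * x[n - m] for m in range(len(h)) if n - m >= 0)
            for n in range(len(x))]


def fir_folded(h, x):
    """Symmetric FIR: add the two samples sharing a coefficient, multiply once."""
    n_taps = len(h)
    assert all(h[i] == h[n_taps - 1 - i] for i in range(n_taps)), "h must be symmetric"

    def sample(k):
        return x[k] if k >= 0 else 0    # zero initial state before the first sample

    half = n_taps // 2
    y = []
    for n in range(len(x)):
        acc = sum(h[i] * (sample(n - i) + sample(n - (n_taps - 1 - i)))
                  for i in range(half))
        if n_taps % 2:                   # odd length: the centre tap has no partner
            acc += h[half] * sample(n - half)
        y.append(acc)
    return y


def multipliers_folded(n_taps):
    """Multipliers needed by the folded form: ceil(n_taps / 2)."""
    return (n_taps + 1) // 2


h = [1, 2, 3, 3, 2, 1]
x = [1, 2, 0, -1, 4, 2, 5]
print(fir_folded(h, x) == fir_direct(h, x), multipliers_folded(len(h)))
```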

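3.1 2-Parallel Structure

For a 2-parallel linear-phase FIR filter whose number of taps is an integer multiple of the parallelism level, the set of sub-filters with symmetric coefficients is given by equation (4).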
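To see where symmetric sub-filters come from in the 2-parallel case, the sketch below splits a symmetric filter with an even number of taps into its two polyphase components H0 and H1. H1 turns out to be the mirror image of H0, so H0+H1 is symmetric and H0-H1 is antisymmetric. This is a general property of the polyphase components and is not a reproduction of equation (4); the example coefficients are hypothetical.

```python
def polyphase(h, parallelism):
    """Split h into its polyphase components H0, H1, ... (every L-th tap)."""
    return [h[i::parallelism] for i in range(parallelism)]


def is_symmetric(c):
    return c == c[::-1]


def is_antisymmetric(c):
    return c == [-v for v in reversed(c)]


h = [1, 4, 6, 7, 7, 6, 4, 1]              # symmetric, 8 taps (illustrative values)
h0, h1 = polyphase(h, 2)
h_sum = [a + b for a, b in zip(h0, h1)]    # H0 + H1
h_diff = [a - b for a, b in zip(h0, h1)]   # H0 - H1

print(h0, h1)                              # each alone is not symmetric
print(h_sum, is_symmetric(h_sum))
print(h_diff, is_antisymmetric(h_diff))
```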

3.2 3-Parallel Structure

For a 3-parallel linear-phase FIR filter, the set of sub-filters with symmetric coefficients is given by equation (6).
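For the 3-parallel case, a small search shows which plain sums of polyphase components are symmetric when the filter itself is symmetric and its length is a multiple of 3. The sketch only considers sums (no differences or scaled combinations), so it should be read as an illustration of why only some sub-filters can be made symmetric, not as the content of equation (6). The coefficients and helper names are hypothetical.

```python
from itertools import combinations


def polyphase(h, parallelism):
    """Split h into its polyphase components H0, H1, ... (every L-th tap)."""
    return [h[i::parallelism] for i in range(parallelism)]


def symmetric_sums(h, parallelism):
    """Return index tuples of polyphase components whose sum is symmetric."""
    parts = polyphase(h, parallelism)
    found = []
    for size in range(1, parallelism + 1):
        for idx in combinations(range(parallelism), size):
            total = [sum(vals) for vals in zip(*(parts[i] for i in idx))]
            if total == total[::-1]:
                found.append(idx)
    return found


h = [1, 2, 3, 4, 5, 6, 6, 5, 4, 3, 2, 1]   # symmetric, 12 taps (illustrative values)
print(polyphase(h, 3))
print(symmetric_sums(h, 3))
```

For this example H1 is symmetric on its own and H2 is the mirror image of H0, so H0+H2 and H0+H1+H2 are symmetric while H0+H1 and H1+H2 are not.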

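Table 1 Corresponding sub-filters in the 2-parallel filter structure

Figure 3 Implementation of the proposed 2-parallel FIR filter

Table 2 Corresponding sub-filters in the 3-parallel filter structure

Figure 4 Implementation of the proposed 3-parallel FIR filter

As shown in Figure 5, blocks with a black background denote sub-filter blocks with symmetric coefficients. The proposed 3-parallel FIR filter structure has 5 sub-filters, 2 of which have symmetric coefficients, whereas the 3-parallel structure in [15] has 6 sub-filters, 4 of which have symmetric coefficients.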
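The sub-filter counts above can be turned into multiplier counts with a short calculation. The sketch assumes that each sub-filter has N/3 taps, that a symmetric sub-filter needs one multiplier per coefficient pair, and that a non-symmetric one needs one multiplier per tap; these costing assumptions and the function names are illustrative, not taken from the paper's equations.

```python
def sub_filter_multipliers(sub_taps, symmetric):
    """Multipliers in one sub-filter: half (rounded up) if symmetric, else all taps."""
    return (sub_taps + 1) // 2 if symmetric else sub_taps


def total_multipliers(n_taps, parallelism, n_sub_filters, n_symmetric):
    """Total multipliers for a structure with the given sub-filter mix."""
    sub_taps = n_taps // parallelism
    n_plain = n_sub_filters - n_symmetric
    return (n_symmetric * sub_filter_multipliers(sub_taps, True)
            + n_plain * sub_filter_multipliers(sub_taps, False))


for n_taps in (72, 144):
    proposed = total_multipliers(n_taps, 3, n_sub_filters=5, n_symmetric=2)
    reference = total_multipliers(n_taps, 3, n_sub_filters=6, n_symmetric=4)
    print(n_taps, proposed, reference)
```

Under these assumptions both 3-parallel structures use the same number of multipliers (96 for 72 taps, 192 for 144 taps), which agrees with the 3-parallel rows of Table 4; the saving in the 3-parallel case comes from having one sub-filter fewer, which reduces adders and delay elements.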

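3.3 4-Parallel Structure

For a 4-parallel linear-phase FIR filter, the set of sub-filters with symmetric coefficients is given by equation (8).

Figure 5 Comparison of the sub-filter blocks of the 3-parallel improved FFA structure and the proposed 3-parallel structure

Table 3 Corresponding sub-filters in the 4-parallel filter structure

Figure 6 Comparison of the sub-filter blocks of the 4-parallel improved FFA structure and the proposed 4-parallel structure

3.4 Iterated Structure

4 Complexity Comparison

The total number of delay elements required is given by expression (13).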

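Table 4 compares the hardware resources used by the proposed structure and the improved FFA structure of [15] for different parallelism levels and numbers of taps. The quantities compared are the number of multipliers (M), the multiplier saving (RM), the total number of adders (A), the number of adders in the sub-filter blocks (Sub), the number of adders in the pre- and post-processing matrices (P), the adder saving (RA), and the delay-element saving (RD). Table 5 lists the number of multipliers (M), adders (A), and delay elements consumed by 144-tap 8-parallel and 4-parallel FIR filters under different implementation structures. As Table 4 shows, the proposed 4-parallel structure saves 14.3% of the multipliers, 4.9% to 6.6% of the adders, and 10.9% to 11.0% of the delay elements relative to the structure in [15]. The proposed 8-parallel structure saves 12.8% to 13.0% of the multipliers, -1.1% to 3.9% of the adders, and 10.8% to 10.9% of the delay elements relative to the structure in [15]. The percentage of adders and delay elements saved depends on the number of filter taps: the more taps the filter has, the higher the percentage of resources saved.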

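Table 4 Hardware resource consumption of the proposed structure and the improved FFA structure of [15]

| Parallelism | Taps | Structure | M | RM (%) | A (Sub+P) | RA (%) | RD (%) |
|---|---|---|---|---|---|---|---|
| 3 | 72 | [15] | 96 | 0 | 138+17 | 14.2 | 16.4 |
| 3 | 72 | Proposed | 96 | | 115+18 | | |
| 3 | 144 | [15] | 192 | 0 | 282+17 | 15.4 | 16.5 |
| 3 | 144 | Proposed | 192 | | 235+18 | | |
| 4 | 72 | [15] | 126 | 14.3 | 153+31 | 4.9 | 10.9 |
| 4 | 72 | Proposed | 108 | | 136+39 | | |
| 4 | 144 | [15] | 252 | 14.3 | 315+31 | 6.6 | 11.0 |
| 4 | 144 | Proposed | 216 | | 280+39 | | |
| 8 | 72 | [15] | 211 | 12.8 | 216+134 | -1.1 | 10.8 |
| 8 | 72 | Proposed | 184 | | 192+162 | | |
| 8 | 144 | [15] | 414 | 13.0 | 459+134 | 3.9 | 10.9 |
| 8 | 144 | Proposed | 360 | | 408+162 | | |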
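The multiplier savings quoted in the text can be recomputed from the M column of Table 4. The sketch below assumes that RM is defined as the relative reduction with respect to the structure of [15], rounded to one decimal place; the dictionary and function names are hypothetical.

```python
# (parallelism, taps) -> (multipliers in [15], multipliers in the proposed structure),
# copied from the M column of Table 4.
TABLE4_MULTIPLIERS = {
    (3, 72): (96, 96),
    (3, 144): (192, 192),
    (4, 72): (126, 108),
    (4, 144): (252, 216),
    (8, 72): (211, 184),
    (8, 144): (414, 360),
}


def saving_percent(reference, proposed):
    """Relative saving of the proposed structure, in percent, to one decimal."""
    return round(100.0 * (reference - proposed) / reference, 1)


for key in sorted(TABLE4_MULTIPLIERS):
    ref, prop = TABLE4_MULTIPLIERS[key]
    print(key, ref - prop, saving_percent(ref, prop))
```

For the 4-parallel 144-tap filter this gives 36 multipliers saved, or 14.3%, matching the figure stated in the abstract.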

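5 Finite Word-Length Performance Analysis

Table 6 compares the finite word-length performance of the proposed structure and the filter structure of [15] under the same quantization bit width. The proposed structure has a larger mean squared error, mainly because the constant coefficients in front of its sub-filters have larger denominators that are not powers of 2, which introduces larger quantization error when the filter coefficients are quantized. Considering the substantial hardware savings of the proposed structure, a moderate loss of finite word-length performance is acceptable.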
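The effect of non-power-of-2 denominators can be illustrated with a simple fixed-point rounding experiment. The constants 1/3 and 1/6 and the 8-bit fractional word length below are hypothetical values chosen for illustration; they are not taken from the paper's structures or from Table 6.

```python
def quantize(value, frac_bits):
    """Round value to the nearest multiple of 2**-frac_bits (fixed-point fraction)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def quantization_error(value, frac_bits):
    """Absolute error introduced by quantizing value to frac_bits fractional bits."""
    return abs(value - quantize(value, frac_bits))


FRAC_BITS = 8  # hypothetical word length, chosen only for illustration
for name, value in [("1/2", 1 / 2), ("1/4", 1 / 4), ("1/3", 1 / 3), ("1/6", 1 / 6)]:
    print(name, quantization_error(value, FRAC_BITS))
```

Constants such as 1/2 and 1/4 are represented exactly, whereas 1/3 and 1/6 always carry a rounding error, and that error propagates into every coefficient they scale.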

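Table 5 Hardware resources used by the 144-tap filters

Table 6 Mean squared error of the proposed structure and the improved FFA structure of [15]

6 Conclusion

This paper presents an improved parallel filter structure for linear-phase FIR filters. The proposed structure exploits coefficient symmetry and the fast convolution algorithm to save hardware resources. Compared with existing parallel FIR structures, its finite word-length performance degrades somewhat, but it saves considerable hardware, and the more taps the FIR filter has, the greater the saving.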

[1] Parhi K K. VLSI Digital Signal Processing Systems: Design and Implementation[M]. New York: John Wiley & Sons, 2007: 237-275.

[2] Parker D A and Parhi K K. Low-area/power parallel FIR digital filter implementations[J]. Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, 1997, 17(1): 75-92.

[3] Deng Jun and Yang Yin-tang. Structure of interpolation filter based on parallel pipelining and fast FIR algorithm and its implementation for all digital receiver[J]. Journal of Electronics & Information Technology, 2010, 32(9): 2089-2094 (in Chinese).

[4] Acha J I. Computational structures for fast implementation of L-path and L-block digital filters[J]. IEEE Transactions on Circuits and Systems, 1989, 36(6): 805-812.

[5] Cheng C and Parhi K K. Hardware efficient fast parallel FIR filter structures based on iterated short convolution[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2004, 51(8): 1492-1500.

[6] Cheng C and Parhi K K. Further complexity reduction of parallel FIR filters[C]. Proceedings of IEEE International Symposium on Circuits and Systems, Kobe, 2005: 1835-1838.

[7] Cheng C and Parhi K K. Low-cost parallel FIR filter structures with 2-stage parallelism[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2007, 54(2): 280-290.

[8] Aktan M, Yurdakul A, and Dundar G. An algorithm for the design of low-power hardware-efficient FIR filter[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2008, 55(6): 1536-1545.

[9] Shi D and Yu Y J. Design of discrete-valued linear phase FIR filters in cascade form[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2011, 58(7): 1627-1636.

[10] Park S Y and Meher P K. Low-power, high-throughput, and low-area adaptive FIR filter based on distributed arithmetic[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2013, 60(6): 346-350.

[11] Tsao Y C and Choi K. Hardware-efficient parallel FIR digital filter structures for symmetric convolutions[C]. Proceedings of IEEE International Symposium on Circuits and Systems, Rio de Janeiro, 2011: 2301-2304.

[12] Tsao Y C and Choi K. Hardware-efficient VLSI implementation for 3-parallel linear-phase FIR digital filter of odd length[C]. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS), Seoul, 2012: 998-1001.

[13] Liu Z, Ye F, and Ren J. Low-cost parallel FIR digital filter structures utilizing the coefficient symmetry[C]. IEEE 11th International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Xi’an, 2012: 1-3.

[14] Tsao Y C and Choi K. Area-efficient VLSI implementation for parallel linear-phase FIR digital filters of odd length based on fast FIR algorithm[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2012, 59(6): 371-375.

[15] Tsao Y C and Choi K. Area-efficient parallel FIR digital filter structures for symmetric convolutions based on fast FIR algorithm[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2012, 20(2): 366-371.

[16] Selvakumar J, Narendran S, and Bhaskar V. FPGA based efficient fast FIR algorithm for higher order digital FIR filter[C]. International Symposium on Electronic System Design (ISED), Kolkata, 2012: 43-47.

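Tian Jing-jing: male, born in 1989, master's student. His research interest is VLSI implementation of digital signal processing.

Li Guang-jun: male, born in 1950, professor and doctoral supervisor. His research areas include communication system design, ASIC/SoC design, and signal and information processing.

Li Qiang: male, born in 1979, professor and doctoral supervisor. His research area is mixed-signal integrated circuits.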

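Hardware-efficient Parallel Structures for Linear-phase FIR Digital Filter Based on Iterated Short Convolution Algorithm

Tian Jing-jing① Li Guang-jun① Li Qiang①②

①(School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China) ②(Department of Engineering, Aarhus University, Aarhus DK-8000, Denmark)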

Based on the fast convolution algorithm, improved parallel FIR filter structures are proposed for linear-phase FIR filters in which the number of taps is a multiple of the parallelism level. The proposed structures not only use the fast convolution algorithm to reduce the number of sub-filters, but also exploit the symmetric coefficients of the linear-phase FIR filter to halve the number of multiplications in the sub-filter section, at the expense of additional adders in the pre-processing and post-processing blocks. For filters with symmetric coefficients, the proposed structures save a large amount of hardware compared with previously reported parallel FIR filter structures, especially when the filter is long. Specifically, for a 4-parallel 144-tap filter, the proposed structure saves 36 multipliers (14.3%), 23 adders (6.6%), and 35 delay elements (11.0%) compared with the improved Fast FIR Algorithm (FFA) structure.

Parallel FIR filter; Fast convolution; Iterated short convolution; Symmetric coefficients

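CLC number: TN713.7

Article ID: 1009-5896(2014)05-1151-07

DOI: 10.3724/SP.J.1146.2013.00976

Tian Jing-jing jing.jing.t@163.com

Received 2013-07-08; revised 2013-11-08

Supported by the National Natural Science Foundation of China (61006027) and the Program for New Century Excellent Talents in University (NCET-10-0297)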