Parallel Matched Filtering Algorithm with Low Complexity

To address the redundant computation in the overlap-save algorithm (OSA), this paper presents a low-complexity parallel matched filtering algorithm. The current data are segmented according to the filter order, and the quarter discrete Fourier transform (QDFT) is then used to reduce the amount of computation. The results computed for the previous and current data blocks are added to obtain the block filter output. Analysis and simulation results show that the algorithm effectively reduces the computational complexity, making it better suited to high-speed demodulation with many parallel paths.


Introduction
With the rapid development of information and communication technology, the data transmission rate of communication systems can reach the order of Gsps. Such high-speed data cannot currently be processed by traditional serial demodulation, so parallel processing algorithms are needed [1]. The matched filter is a high-complexity part of the key demodulation algorithms, so studying parallel matched filtering algorithms is of practical significance.
To reduce filtering complexity, [1] optimizes the design of the constant-coefficient multipliers. In [1], fast FIR algorithms (FFAs) are used to obtain a low-complexity filtering algorithm by exploiting the symmetry of the filter coefficients.
In this paper, a low-complexity parallel filtering algorithm is proposed on the basis of the overlap-save filtering algorithm. The overlap-save algorithm is decomposed into two independent processing modules, from which a low-complexity matched filter structure is derived.

Low-complexity frequency-domain filtering algorithm
In the overlap-save method, successive input segments overlap, so the overlapping data are processed twice; eliminating this duplicated computation reduces the complexity of the algorithm. It can be shown that the traditional overlap-save method decomposes into two parts, one for the previous data block and one for the current data block, which can be processed independently. The result for the previous data block is retained from the preceding segment, so only the current data block has to be processed, and the length of the data involved in the computation is halved. Following the decimation-in-frequency (DIF) FFT algorithm, the DFT of each input segment can be decomposed into two shorter parts according to the parity of its vector index.
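The two facts used in this paragraph can be checked numerically. The following is a minimal NumPy sketch, not the paper's implementation: it assumes a block length M equal to the (zero-padded) filter length, and the function names `split_dft`, `overlap_save_block` and `block_contribution` are hypothetical names chosen for illustration.

```python
import numpy as np

def split_dft(x_p, x_c):
    """Even and odd bins of the 2M-point DFT of [x_p; x_c], from two M-point FFTs."""
    M = len(x_p)
    n = np.arange(M)
    even = np.fft.fft(x_p + x_c)                                 # bins 0, 2, 4, ...
    odd = np.fft.fft((x_p - x_c) * np.exp(-1j * np.pi * n / M))  # bins 1, 3, 5, ...
    return even, odd

def overlap_save_block(f, x_p, x_c):
    """Conventional overlap-save: 2M-point circular convolution, keep the last M outputs."""
    M = len(f)
    g = np.concatenate([f, np.zeros(M)])   # g = [f; 0_M]
    x = np.concatenate([x_p, x_c])
    y = np.fft.ifft(np.fft.fft(g) * np.fft.fft(x))
    return y[M:].real

def block_contribution(f, block):
    """Linear convolution of f with ONE block (zero-padded to 2M), split into head and tail."""
    M = len(f)
    full = np.fft.ifft(np.fft.fft(f, 2 * M) * np.fft.fft(block, 2 * M)).real
    return full[:M], full[M:]

# Usage: the overlap-save output equals head(current block) + tail(previous block),
# so the tail of each block can be stored and reused for the next segment.
rng = np.random.default_rng(0)
f, x_p, x_c = rng.standard_normal((3, 16))
head_c, tail_c = block_contribution(f, x_c)   # tail_c would be saved for the next segment
_, tail_p = block_contribution(f, x_p)
print(np.allclose(overlap_save_block(f, x_p, x_c), head_c + tail_p))
```

Here each block is transformed only once; the part that depends on the previous block is reused rather than recomputed, which is the saving the decomposition aims for.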
Here X^e and X^o denote the elements of X^d with even and odd vector indices, respectively, and the transform applied to the odd-indexed part is given by the Kth-order odd discrete Fourier transform (ODFT) matrices. The filter vector is split into two parts in the same way, noting that half of g is f and the rest is the zero vector 0_M of size M. From (1) and these definitions, the circular convolution can be rewritten as equation (3), in which the first half of the right-hand side involves only x_p and the second half involves only x_c.
The low-complexity frequency-domain filtering algorithm is shown in Figure 1, where the J module denotes the computation enclosed by the dashed line and is built from two basic modules. For each successive input segment, the J-module output of the current data block is computed and added to the J-module result of the previous data block to obtain the corresponding matched filter output, and the J-module result of the current data block is saved for the next segment.
These computations can be simplified further, which reduces the complexity of the matched filtering algorithm again. The QDFT multiplies the time series by a rotation factor and then computes the DFT, so it can likewise be computed quickly with the FFT. In the time domain, the first term can be expressed as a cyclic convolution (5), the second corresponds to a skew-circular convolution (6), and the QDFT-domain product can again be expressed as a cyclic convolution (7). The transform in (7) uses the Kth-order quarter discrete Fourier transform (QDFT) matrices, and the superscript q indicates data that have undergone the QDFT. Comparing (5) and (6) with (7) shows that the real part and the imaginary part of the QDFT-based result are equivalent to the two quantities obtained from (5) and (6).
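The relationship between the cyclic, skew-circular and QDFT-based convolutions can be illustrated with a short NumPy sketch. Several details are assumptions rather than taken from the text: the rotation factor is taken to be exp(-jπn/(2M)) for an M-point transform, the filter f and data x are real-valued, and the names `cyclic_and_skew` and `qdft_halves` are hypothetical.

```python
import numpy as np

def cyclic_and_skew(f, x):
    """M-point cyclic and skew-circular convolutions of real sequences f and x."""
    M = len(f)
    v = np.exp(-1j * np.pi * np.arange(M) / M)   # ODFT-style rotation, v**M == -1
    cyc = np.fft.ifft(np.fft.fft(f) * np.fft.fft(x)).real
    skew = (np.fft.ifft(np.fft.fft(f * v) * np.fft.fft(x * v)) * np.conj(v)).real
    return cyc, skew

def qdft_halves(f, x):
    """Both halves of the linear convolution from ONE M-point complex cyclic convolution."""
    M = len(f)
    w = np.exp(-1j * np.pi * np.arange(M) / (2 * M))   # assumed rotation factor, w**M == -j
    c = np.fft.ifft(np.fft.fft(f * w) * np.fft.fft(x * w)) * np.conj(w)
    head = c.real      # outputs 0 .. M-1 of the linear convolution
    tail = -c.imag     # outputs M .. 2M-1 (the last one is zero)
    return head, tail

# Usage: compare against a direct linear convolution.
rng = np.random.default_rng(1)
f, x = rng.standard_normal((2, 32))
head, tail = qdft_halves(f, x)
full = np.append(np.convolve(f, x), 0.0)   # pad the 2M-1 outputs to length 2M
print(np.allclose(head, full[:32]), np.allclose(tail, full[32:]))
```

Under these assumptions the wrapped-around terms pick up a factor of -j instead of +1 (cyclic) or -1 (skew-circular), so for real data they separate cleanly into the imaginary part. The same two halves are the half-sum and half-difference of the cyclic and skew-circular results, but obtained with one transform pair instead of two.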

DQDFT parallel filtering algorithm
In a high-speed demodulation system with many parallel channels, the computational complexity of the QDFT-based frequency-domain matched filtering algorithm is still large, so the J module is subdivided into smaller modules to reduce the complexity further.
Based on the above analysis, a parallel matched filtering algorithm with lower complexity is proposed. The algorithm divides the parallel input data into shorter data blocks and feeds them into several J modules in parallel. The outputs of these J modules are combined with one another and with the retained J-module result of the last data block of the previous input, which achieves block convolution. The improved frequency-domain parallel matched filtering algorithm is equivalent to matched filtering several overlapped data segments at the same time, which reduces the computational complexity and improves real-time processing capability. In block-convolution matched filtering, the data are divided into smaller data blocks and more J modules are used. We define the smallest J module as the unit J module; its input size is a power of 2 determined by the filter length. For example, when the filter length L is 33, the input size of the unit J module is 64. When the current parallel input data are fed into n J modules for QDFT-based matched filtering, we call the method the nQDFT matched filtering algorithm; when all n J modules are unit J modules, we call it the DQDFT matched filtering algorithm.
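The block structure described above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the DQDFT algorithm itself: the unit block size is taken as the smallest power of two not less than the filter length (which gives 64 for L = 33), each "J module" is stood in for by a plain zero-padded FFT convolution rather than a QDFT, and the names `unit_block_size` and `block_filter` are hypothetical.

```python
import numpy as np

def unit_block_size(filter_len):
    """Smallest power of two that is >= filter_len (assumed rule; 33 -> 64)."""
    return 1 << (filter_len - 1).bit_length()

def block_filter(f, x, saved_tail=None):
    """Filter real x with real f using n independent unit-size blocks.

    Returns the filtered samples and the tail to retain for the next call.
    """
    M = unit_block_size(len(f))
    if len(x) % M:
        raise ValueError("input length must be a multiple of the unit block size")
    if saved_tail is None:
        saved_tail = np.zeros(M)              # no history before the first input
    blocks = x.reshape(-1, M)                 # n blocks, one per J module
    F = np.fft.rfft(f, 2 * M)                 # filter zero-padded to 2M
    full = np.fft.irfft(np.fft.rfft(blocks, 2 * M, axis=1) * F, 2 * M, axis=1)
    heads, tails = full[:, :M], full[:, M:]   # per-block results, independent of each other
    out = heads.copy()
    out[1:] += tails[:-1]                     # add the tail of the preceding block
    out[0] += saved_tail                      # tail retained from the previous input
    return out.ravel(), tails[-1]

# Usage: four unit blocks processed together, then checked against direct convolution.
rng = np.random.default_rng(2)
f = rng.standard_normal(33)
x = rng.standard_normal(4 * unit_block_size(33))
y, tail = block_filter(f, x)
print(unit_block_size(33), np.allclose(y, np.convolve(f, x)[:len(x)]))
```

Because each row of `blocks` is transformed independently, the per-block work maps naturally onto parallel hardware paths; only the final additions couple neighbouring blocks.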

Complexity analysis and simulation experiment
To simplify the complexity analysis, M and m are taken to be powers of 4, and the radix-4 FFT, which is widely used in hardware, is assumed. The computational complexity of the radix-4 FFT can be summarized as follows:

The superscripts d and Ɵ denote data that have undergone the DFT and the ODFT, respectively. G^d is divided into G^e and G^o in the same way.

Fig. 1 Schematic diagram of the low-complexity frequency-domain filtering algorithm