Fast Fourier Transform

The Fast Fourier Transform (hereafter referred to as FFT) is a fast algorithm for computing the Discrete Fourier Transform (hereafter referred to as DFT), one of the most basic methods for transforming between the time domain and the frequency domain. The algorithm converts data between the time domain and the frequency domain, which gives data processing an additional dimension to work in. In engineering practice, the direct DFT is rarely implemented because its computational cost, on the order of N² operations for N points, is excessive. The FFT reorganizes the DFT computation to reduce this cost to the order of N log N, a practicable order of magnitude, and this is what makes the discrete Fourier transform widely usable in the engineering field.

Operator Effect

| Time Domain Input Data | Parameter | Frequency Domain Output Data |
| --- | --- | --- |
| image | p_size = HB_HPL_FFT16, normalize = 0, dataType = HB_HPL_DATA_TYPE_I16, imFormat = HB_IM_FORMAT_SEPARATE, numDimensionSize = 1 | image |

Principle

The FFT is an improved method based on the DFT algorithm: it produces the same results as the DFT while optimizing the computational process. If F(n) is the discrete Fourier transform of f(n), the DFT formula is expressed as follows:

$$F(n)=\sum_{k=0}^{N-1}f(k)W_N^{kn},\qquad n=0,1,\cdots,N-1$$

The rotation factor (also called the twiddle factor) is $W_N^{kn}=e^{-j\frac{2\pi}{N}kn}$, and N is the input sequence length.

The IDFT is the inverse of the DFT and is expressed as follows:

$$f(n)=\frac{1}{N}\sum_{k=0}^{N-1}F(k)W_N^{-kn},\qquad n=0,1,\cdots,N-1$$
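The two formulas can be evaluated directly, at a cost of N² complex multiplications. The following is a minimal reference sketch in C++, not the library's implementation: `naive_dft` is a hypothetical helper name, and it works on `std::complex<double>` rather than the fixed-point data types used by the operator.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Direct O(N^2) evaluation of the DFT (inverse = false) or the IDFT (inverse = true).
// Hypothetical reference helper for illustration; not part of the hbFFT1D API.
std::vector<cd> naive_dft(const std::vector<cd>& in, bool inverse) {
    assert(!in.empty());
    const std::size_t N = in.size();
    const double pi = std::acos(-1.0);
    // The DFT uses W_N^{kn} = e^{-j*2*pi*k*n/N}; the IDFT uses W_N^{-kn}.
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cd> out(N);
    for (std::size_t n = 0; n < N; ++n) {
        cd acc(0.0, 0.0);
        for (std::size_t k = 0; k < N; ++k) {
            const double angle =
                sign * 2.0 * pi * static_cast<double>(k * n) / static_cast<double>(N);
            acc += in[k] * std::polar(1.0, angle);
        }
        // Only the IDFT carries the 1/N factor.
        out[n] = inverse ? acc / static_cast<double>(N) : acc;
    }
    return out;
}
```

Calling `naive_dft(naive_dft(x, false), true)` should return the original sequence up to floating-point rounding, which makes this helper a convenient baseline when checking a faster implementation.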

Another fundamental formula is Euler's formula:

$$e^{jx}=\cos(x)+j\sin(x)$$

From Euler's formula, it follows that the rotation factor has the following properties:

$$W_N^{0}=e^{-j\frac{2\pi}{N}0}=e^{-j0}=\cos(0)+j\sin(0)=1$$

$$W_N^{k\frac{N}{2}}=(\cos(-\pi)+j\sin(-\pi))^k=(-1)^k$$

$$W_N^{2kn}=e^{-j(\frac{2\pi}{\frac{N}{2}})kn}=W_{\frac{N}{2}}^{kn}$$
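These properties can be checked numerically. The sketch below is illustrative only: `rotation_factor` and `check_rotation_properties` are hypothetical helper names, and the comparison uses a small tolerance because the values are computed in floating point.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Rotation factor W_N^m = e^{-j*2*pi*m/N}. Hypothetical helper for illustration.
std::complex<double> rotation_factor(int N, int m) {
    assert(N > 0);
    const double pi = std::acos(-1.0);
    return std::polar(1.0, -2.0 * pi * static_cast<double>(m) / static_cast<double>(N));
}

// Returns true when the three properties hold within eps for all k, n in [0, N).
bool check_rotation_properties(int N, double eps = 1e-9) {
    assert(N > 0 && N % 2 == 0);
    // Property 1: W_N^0 = 1
    if (std::abs(rotation_factor(N, 0) - 1.0) > eps) return false;
    for (int k = 0; k < N; ++k) {
        // Property 2: W_N^{k*N/2} = (-1)^k
        const double expected = (k % 2 == 0) ? 1.0 : -1.0;
        if (std::abs(rotation_factor(N, k * N / 2) - expected) > eps) return false;
        // Property 3: W_N^{2kn} = W_{N/2}^{kn}
        for (int n = 0; n < N; ++n) {
            const std::complex<double> lhs = rotation_factor(N, 2 * k * n);
            const std::complex<double> rhs = rotation_factor(N / 2, k * n);
            if (std::abs(lhs - rhs) > eps) return false;
        }
    }
    return true;
}
```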

The N-point DFT formula can be split into its even-indexed and odd-indexed terms as follows:

$$F(n)=\sum_{k=0}^{N-1}f(k)W_N^{kn}=E+O$$

$$E=\sum_{k=0}^{\frac{N}{2}-1}f(2k)e^{\frac{-j2\pi kn}{\frac{N}{2}}}$$

$$O=W_N^n\sum_{k=0}^{\frac{N}{2}-1}f(2k+1)e^{\frac{-j2\pi kn}{\frac{N}{2}}}$$
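The intermediate step behind this split can be written out explicitly. Separating the sum into even indices $2k$ and odd indices $2k+1$, and applying the property $W_N^{2kn}=W_{\frac{N}{2}}^{kn}$, gives:

```latex
\begin{aligned}
F(n) &= \sum_{k=0}^{\frac{N}{2}-1} f(2k)\,W_N^{2kn}
      + \sum_{k=0}^{\frac{N}{2}-1} f(2k+1)\,W_N^{(2k+1)n} \\
     &= \sum_{k=0}^{\frac{N}{2}-1} f(2k)\,W_{\frac{N}{2}}^{kn}
      + W_N^{n}\sum_{k=0}^{\frac{N}{2}-1} f(2k+1)\,W_{\frac{N}{2}}^{kn}
\end{aligned}
```

Each of the two sums is itself an N/2-point DFT, of the even-indexed and the odd-indexed samples respectively, which is why the same split can be applied again to each half.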

The decomposition is shown in the following figure: the overall computational task is halved layer by layer.

HPL5FFT_split

As shown above, the FFT tasks can all be decomposed into single-point computational tasks when N is a power of two.
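The layer-by-layer halving maps naturally onto a recursive function. The sketch below is a floating-point illustration under stated assumptions, not the operator's implementation: `fft_recursive` is a hypothetical name, and the second half of each output relies on the property $W_N^{k\frac{N}{2}}=(-1)^k$ with $k=1$, which gives $W_N^{n+\frac{N}{2}}=-W_N^{n}$.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Recursive radix-2 FFT that splits the input into even and odd samples.
// Hypothetical reference sketch; the input length must be a power of two.
std::vector<cd> fft_recursive(const std::vector<cd>& f) {
    const std::size_t N = f.size();
    assert(N > 0 && (N & (N - 1)) == 0);
    if (N == 1) return f;  // the DFT of a single point is the point itself

    std::vector<cd> even(N / 2), odd(N / 2);
    for (std::size_t k = 0; k < N / 2; ++k) {
        even[k] = f[2 * k];
        odd[k] = f[2 * k + 1];
    }
    const std::vector<cd> E = fft_recursive(even);  // N/2-point DFT of even samples
    const std::vector<cd> O = fft_recursive(odd);   // N/2-point DFT of odd samples

    const double pi = std::acos(-1.0);
    std::vector<cd> F(N);
    for (std::size_t n = 0; n < N / 2; ++n) {
        // w = W_N^n; the half-size DFTs repeat with period N/2 and W_N^{n+N/2} = -W_N^n.
        const cd w = std::polar(1.0, -2.0 * pi * static_cast<double>(n) / static_cast<double>(N));
        F[n] = E[n] + w * O[n];
        F[n + N / 2] = E[n] - w * O[n];
    }
    return F;
}
```

For the input {1, 2, 3, 4}, this should produce {10, -2+2j, -2, -2-2j}, matching a direct evaluation of the DFT formula.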

From Euler's formula, $W_N^{n}$ can be written in the following form, which shows that $W_N^{n}$ has a periodicity and symmetry associated with N.

$$W_N^{n}=e^{-j\frac{2\pi}{N}n}=\cos(-\frac{2\pi}{N}n)+j\sin(-\frac{2\pi}{N}n)=\cos(\frac{2\pi}{N}n)-j\sin(\frac{2\pi}{N}n)$$

According to this property, one obtains $W_N^{\frac{N}{2}+n}=-W_N^{n}$, where $n$ takes the range $[0, N/2)$, so only the rotation factors for the first half of the range need to be computed.

Taking the FFT operation with N = 8 as an example, the specific decomposition process is as follows:

  1. Split 8 points into 4 points
HPL5FFT_bufferfly1
  2. Split 4 points into 2 points
HPL5FFT_bufferfly2
  3. Calculate the single-point FFT
HPL5FFT_bufferfly3

In these diagrams, the computation that combines nodes pairwise along crossing paths is known as the butterfly operation, shown as follows:

HPL5FFT_bufferfly

The specific formulas are as follows:

$$F'_{N/4}(0) = f_N(0)+W^0_{N/4}*f_N(4)$$

$$F'_{N/4}(1) = f_N(0)-W^0_{N/4}*f_N(4)$$
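A single butterfly takes two inputs and one rotation factor and produces a sum and a difference. The sketch below is illustrative: `butterfly` is a hypothetical helper, and the assertion that the rotation factor has unit magnitude is an assumption added here for safety.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <utility>

using cd = std::complex<double>;

// One butterfly: returns (a + w*b, a - w*b) for inputs a, b and rotation factor w.
// Hypothetical helper for illustration; w is expected to lie on the unit circle.
std::pair<cd, cd> butterfly(cd a, cd b, cd w) {
    assert(std::abs(std::abs(w) - 1.0) < 1e-9);
    const cd t = w * b;  // one complex multiplication is shared by both outputs
    return {a + t, a - t};
}
```

With `a = f_N(0)`, `b = f_N(4)` and `w = 1` (the value of a rotation factor with exponent 0), the two returned values correspond to the two formulas above.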

The butterfly diagrams above show a bit-reversal relationship between the index n of F(n) and that of f(n). That is, in the FFT, the indices of the input and output data are bit-reversals of each other: for N = 8, for example, index 1 (binary 001) pairs with index 4 (binary 100). Reordering the data this way lets the butterfly stages run in place, which helps computational efficiency.
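The bit-reversal reordering and the butterfly stages can be combined into an in-place iterative FFT. The following is a minimal floating-point sketch, not the operator's implementation: `bit_reverse` and `fft_in_place` are hypothetical names, and the input is reordered up front so that the output comes out in natural order.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;

// Reverses the lowest `bits` bits of `index`, e.g. with 3 bits: 001 -> 100.
std::size_t bit_reverse(std::size_t index, unsigned bits) {
    std::size_t reversed = 0;
    for (unsigned b = 0; b < bits; ++b) {
        reversed = (reversed << 1) | (index & 1u);
        index >>= 1;
    }
    return reversed;
}

// In-place iterative radix-2 FFT: bit-reversal reordering, then log2(N) butterfly stages.
// Hypothetical reference sketch; the length must be a power of two.
void fft_in_place(std::vector<cd>& a) {
    const std::size_t N = a.size();
    assert(N > 0 && (N & (N - 1)) == 0);
    unsigned bits = 0;
    while ((std::size_t{1} << bits) < N) ++bits;

    // Swap each element with its bit-reversed partner exactly once.
    for (std::size_t i = 0; i < N; ++i) {
        const std::size_t j = bit_reverse(i, bits);
        if (i < j) std::swap(a[i], a[j]);
    }

    const double pi = std::acos(-1.0);
    // Stage sizes 2, 4, ..., N: merge pairs of len/2-point DFTs into len-point DFTs.
    for (std::size_t len = 2; len <= N; len <<= 1) {
        for (std::size_t start = 0; start < N; start += len) {
            for (std::size_t n = 0; n < len / 2; ++n) {
                const cd w = std::polar(
                    1.0, -2.0 * pi * static_cast<double>(n) / static_cast<double>(len));
                const cd u = a[start + n];
                const cd v = w * a[start + n + len / 2];
                a[start + n] = u + v;            // butterfly sum
                a[start + n + len / 2] = u - v;  // butterfly difference
            }
        }
    }
}
```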

The IFFT is computed in a similar way to the FFT and can be derived from the IDFT formula.
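One way to see the similarity is that the same radix-2 recursion serves both directions: only the sign of the rotation factor exponent changes, and the inverse adds the 1/N factor from the IDFT formula. The sketch below assumes this approach for illustration; `radix2`, `fft_radix2` and `ifft_radix2` are hypothetical names, and nothing here describes how the operator itself computes the inverse.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Shared radix-2 recursion. sign = -1 uses W_N^{kn} (FFT);
// sign = +1 uses W_N^{-kn} (IFFT before the 1/N scaling).
std::vector<cd> radix2(const std::vector<cd>& x, double sign) {
    const std::size_t N = x.size();
    assert(N > 0 && (N & (N - 1)) == 0);  // power-of-two length only
    if (N == 1) return x;
    std::vector<cd> even(N / 2), odd(N / 2);
    for (std::size_t k = 0; k < N / 2; ++k) {
        even[k] = x[2 * k];
        odd[k] = x[2 * k + 1];
    }
    const std::vector<cd> E = radix2(even, sign);
    const std::vector<cd> O = radix2(odd, sign);
    const double pi = std::acos(-1.0);
    std::vector<cd> out(N);
    for (std::size_t n = 0; n < N / 2; ++n) {
        const cd w = std::polar(
            1.0, sign * 2.0 * pi * static_cast<double>(n) / static_cast<double>(N));
        out[n] = E[n] + w * O[n];
        out[n + N / 2] = E[n] - w * O[n];
    }
    return out;
}

std::vector<cd> fft_radix2(const std::vector<cd>& f) { return radix2(f, -1.0); }

std::vector<cd> ifft_radix2(const std::vector<cd>& F) {
    std::vector<cd> f = radix2(F, 1.0);
    const double scale = 1.0 / static_cast<double>(F.size());
    for (cd& v : f) v *= scale;  // the 1/N factor of the IDFT
    return f;
}
```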

API Interface

int32_t hbFFT1D(hbUCPTaskHandle_t *taskHandle, hbHPLImaginaryData *dst, hbHPLImaginaryData const *src, hbHPLFFTParam const *param);

For detailed interface information, please refer to hbFFT1D.

Usage

// Include the header
#include "hobot/hb_ucp.h"
#include "hobot/hpl/hb_hpl.h"
#include "hobot/hpl/hb_fft.h"

// init, allocate memory for data
src_length = 1024 * 1024 * 5;
hbUCPMallocCached(&src_re_mem, src_length, 0);
hbUCPMallocCached(&src_im_mem, src_length, 0);
hbHPLImaginaryData src;
src.realDataVirAddr = src_re_mem.virAddr;
src.realDataPhyAddr = src_re_mem.phyAddr;
src.imDataVirAddr = src_im_mem.virAddr;
src.imDataPhyAddr = src_im_mem.phyAddr;
src.numDimensionSize = 1;
src.dataType = HB_HPL_DATA_TYPE_I16;
src.imFormat = HB_IM_FORMAT_SEPARATE;
src.dimensionSize[0] = 16 * 11;

hbUCPMallocCached(&dst_re_mem, src_length, 0);
hbUCPMallocCached(&dst_im_mem, src_length, 0);
hbHPLImaginaryData dst;
dst.realDataVirAddr = dst_re_mem.virAddr;
dst.realDataPhyAddr = dst_re_mem.phyAddr;
dst.imDataVirAddr = dst_im_mem.virAddr;
dst.imDataPhyAddr = dst_im_mem.phyAddr;
dst.numDimensionSize = 1;
dst.dataType = HB_HPL_DATA_TYPE_I16;
dst.imFormat = HB_IM_FORMAT_SEPARATE;
dst.dimensionSize[0] = 16 * 11;

// init param
hbFFTParam param;
param.pSize = HB_FFT_POINT_SIZE_16;
param.normalize = 0;

// init task handle and schedule param
hbUCPTaskHandle_t task_handle{nullptr};
hbUCPSchedParam sched_param;
HB_UCP_INITIALIZE_SCHED_PARAM(&sched_param);
sched_param.backend = HB_UCP_DSP_CORE_0;

// create task
hbFFT1D(&task_handle, &dst, &src, &param);

// submit task
hbUCPSubmitTask(task_handle, &sched_param);

// wait for task done
hbUCPWaitTaskDone(task_handle, 0);

// release task handle
hbUCPReleaseTask(task_handle);

// release memory
hbUCPFree(&src_re_mem);
hbUCPFree(&src_im_mem);
hbUCPFree(&dst_re_mem);
hbUCPFree(&dst_im_mem);