In multithreaded execution on shared memory multiprocessors, the Fast Fourier Transform algorithm (FFT) is partitioned into a number of threads which concurrently run on the available processors of the system. Each thread computes a partition of the data vector and two or more threads communicate with each other through shared memory. The analysis of sequential algorithm points out different types of data dependencies that are removed by insertion of synchronization points between threads. For the parallel algorithm, an estimation of its performance parameters is done in order to determine the performance of the parallel execution.
FFT algorithm, shared memory systems, multithreaded execution, scalability of parallel systems.