“…For this, we manually optimized the parallel execution of computations representing some well-known numerical and vision algorithms. These are: (1) sorting (S) [26], (2) LU decomposition (LU) [26], (3) matrix multiply (MM) [26], (4) cyclic reduction, FFT, and DCT (CR/FFT/DCT) [21], (5) a vision algorithm with odd stride (V-odd) [11], and (6) a vision algorithm with even stride (V-even) [11]. All data arrays used are N × N, where N = 1024.…”