“…Previous approaches have attempted to achieve the minimum completion time for the parallel loop scheduling problem only by distributing the workload as evenly as possible while minimizing the number of synchronization operations required and the communication overhead caused by access to non-local data on shared-memory systems [2,3,5,6,11,18]. Other authors have studied the parallelism across iterations to consider loop-carried dependencies, proposing different techniques to improve this parallelism [4,8,12,13,14,16,17]: cyclo-compaction scheduling, loop pipelining, etc.…”