“…A substantial body of prior research, including data locality-oriented cache optimizations [6,11,22,33,53,57,61], data layout transformations [23,35,37,39,48,59], and near-data computing (NDC) techniques [7,9,19,21,24,29,50], aims to reduce the cost of data accesses in single-core and manycore systems. Among these, NDC is a popular execution paradigm that offloads computations to execute near the data, rather than fetching data to the computation as traditional approaches do.…”