“…Regarding the template detection techniques we found in the literature, some of them used artificial benchmarks [25], while others used real heterogeneous web pages [114,10]. Similarly, some authors selected the input web pages randomly [120,116,50], while others provided the input web pages manually [114,113]. Finally, regarding the block detection techniques, we found authors who used well-known benchmark suites such as the CleanEval benchmark suite [20] [115,99], MSS [85], L3S-GN1 [61], etc.…”