“…First, it should have a good crawling strategy, i.e., a strategy for deciding which pages to download next. Second, it needs a highly optimized system architecture, i.e., one that is robust against crashes, manageable, and considerate of resources and web servers [1]. The performance of the crawling process has been improved by using a parallel web crawler instead of a batch crawler. However, the existing parallel crawler relies on a single-point coordinator with a high chance of data redundancy, causing the same URLs to be crawled multiple times and thus degrading performance [3].…”
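As a minimal sketch of the redundancy problem described above, the following Python fragment shows one way parallel workers can share a deduplicated frontier so that no URL is fetched twice. The `fetch` helper, the worker count, and the seed URL are illustrative assumptions, not details of the cited crawlers.

```python
import threading
from queue import Queue

# Hypothetical sketch: a shared, locked "seen" set lets parallel workers
# skip URLs that any worker has already claimed, avoiding the duplicate
# crawling that an uncoordinated frontier can otherwise allow.
seen = set()
seen_lock = threading.Lock()
frontier = Queue()

def fetch(url):
    # Placeholder for the actual HTTP download and link extraction.
    print(f"fetching {url}")
    return []  # would return the URLs discovered on the page

def worker():
    while True:
        url = frontier.get()
        if url is None:           # sentinel: shut this worker down
            frontier.task_done()
            break
        with seen_lock:
            duplicate = url in seen
            seen.add(url)         # claim the URL atomically
        if not duplicate:
            for link in fetch(url):
                frontier.put(link)
        frontier.task_done()

if __name__ == "__main__":
    frontier.put("http://example.com/")
    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    frontier.join()               # wait until every queued URL is processed
    for _ in threads:
        frontier.put(None)        # release the workers
    for t in threads:
        t.join()
```

In a distributed setting the same idea is usually realized without a single locked set, e.g., by hash-partitioning URLs across crawler nodes so each node is responsible for deduplicating its own partition; the thread-level sketch above only illustrates the deduplication principle.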