Data deduplication eliminates redundant data and reduces storage consumption. As ever more data is generated and stored in the cloud, often redundantly, large volumes of storage are consumed. Data deduplication reduces the disk space and network bandwidth that data occupies, lowering the cost and energy consumption of running storage systems. In a deduplication system, data is broken into small chunks (blocks); a hash ID is computed for each chunk and compared against the hashes of existing chunks to detect duplicates. Chunks may be of fixed or variable size, and variable-size (content-defined) chunking yields better deduplication than fixed-size chunking. The chunking step is therefore the first stage of deduplication and is decisive for achieving an optimal result. In this paper, we discuss various content-defined chunking algorithms and compare their performance on chunking properties such as chunking speed, processing time, and throughput.
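To make the pipeline concrete, the following is a minimal, illustrative Python sketch of content-defined chunking followed by hash-based deduplication. It is not any specific algorithm evaluated in this paper: the Gear-style rolling hash, the table derivation, the 13-bit boundary mask, and the chunk-size limits are all assumptions chosen for the sketch.

```python
import hashlib

# Gear-style rolling-hash table: 256 pseudo-random 64-bit values. Real
# implementations ship a fixed precomputed table; deriving one from SHA-256
# keeps this sketch self-contained.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]

MASK = 0x1FFF                     # cut when the low 13 bits are zero (~8 KiB average)
MIN_SIZE, MAX_SIZE = 2048, 65536  # illustrative chunk-size limits

def content_defined_chunks(data: bytes):
    """Yield variable-size chunks cut at content-defined boundaries."""
    start, fp = 0, 0
    for i, byte in enumerate(data):
        fp = ((fp << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF
        length = i - start + 1
        # Declare a cut point when the fingerprint matches the mask,
        # subject to the minimum/maximum chunk-size limits.
        if (length >= MIN_SIZE and (fp & MASK) == 0) or length >= MAX_SIZE:
            yield data[start:i + 1]
            start, fp = i + 1, 0
    if start < len(data):
        yield data[start:]        # trailing partial chunk

def deduplicate(data: bytes):
    """Store each unique chunk once; the recipe of hash IDs rebuilds the data."""
    store, recipe = {}, []
    for chunk in content_defined_chunks(data):
        h = hashlib.sha256(chunk).hexdigest()  # the chunk's hash ID
        store.setdefault(h, chunk)             # keep only the first copy
        recipe.append(h)
    return store, recipe

def pseudo_random(n: int, seed: bytes = b"demo") -> bytes:
    """Deterministic pseudo-random bytes so the demo is reproducible."""
    out, counter = bytearray(), 0
    while len(out) < n:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

if __name__ == "__main__":
    block = pseudo_random(200_000)
    # One inserted byte shifts the second copy of the data; content-defined
    # boundaries resynchronize, so most of its chunks still deduplicate.
    payload = block + b"X" + block
    store, recipe = deduplicate(payload)
    stored = sum(len(c) for c in store.values())
    print(f"{len(recipe)} chunks, {len(store)} unique, "
          f"{stored} bytes stored of {len(payload)}")
```

The demo illustrates why variable-size chunking outperforms fixed-size blocks: after an insertion shifts the data, fixed-size boundaries all move and nothing deduplicates, whereas content-defined boundaries realign within a chunk or two and the remaining chunks are found in the store.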