In cloud storage systems, keyword-based similarity evaluation works well when searching for similar document files. Finding similar binary files is much harder, because no widely used binary file search system supports similarity evaluation among files. File similarity evaluation is essential in digital forensics and data deduplication. During similarity processing, CPU consumption and memory overhead grow with the number of files, and as files get larger the overhead of metadata management becomes critical. In this paper, we propose a similarity evaluation scheme based on hybrid chunking that reduces the overall processing time of similarity evaluation. Experimental results show that the proposed system effectively reduces both processing time and the required data storage capacity.
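To make chunk-based similarity evaluation concrete, the following is a minimal Python sketch, not the scheme evaluated in this paper. The abstract does not define how the hybrid chunking works, so everything here is an assumption made for illustration: the function names, the size-based policy in `hybrid_chunks` (fixed-size chunks for small files, content-defined chunks for large ones), the additive rolling sum used to find chunk boundaries, the SHA-1 fingerprints, and the Jaccard index used as the similarity score.

```python
import hashlib


def fixed_chunks(data, size=4096):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data, window=16, mask=0x3F,
                           min_size=32, max_size=1024):
    """Cut where a rolling sum over the last `window` bytes matches `mask`.

    Assumption: a plain additive rolling sum stands in for a stronger
    rolling hash such as a Rabin fingerprint.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            # Drop the byte that just slid out of the window.
            rolling -= data[i - window]
        length = i - start + 1
        at_boundary = length >= min_size and (rolling & mask) == mask
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def hybrid_chunks(data, threshold=1 << 20):
    """Hypothetical hybrid policy: choose the chunker by file size."""
    if len(data) < threshold:
        return fixed_chunks(data)
    return content_defined_chunks(data)


def fingerprints(chunks):
    """Return the set of SHA-1 digests of the chunks."""
    return {hashlib.sha1(chunk).hexdigest() for chunk in chunks}


def similarity(a, b, chunker=hybrid_chunks):
    """Jaccard index of the two files' chunk fingerprint sets."""
    fa, fb = fingerprints(chunker(a)), fingerprints(chunker(b))
    if not fa and not fb:
        return 1.0  # two empty files are treated as identical
    return len(fa & fb) / len(fa | fb)
```

For example, with 4096-byte fixed chunks, `b"A" * 8192 + b"B" * 4096` and `b"A" * 8192 + b"C" * 4096` share one of three distinct chunk fingerprints, so `similarity` returns 1/3. Only the fingerprint sets need to be stored and compared, which is where a chunk-based scheme can save processing time and metadata space relative to byte-by-byte comparison.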