Interest in securing big data is growing quickly, driven by the rapid development of cloud computing, big data, and other information technologies associated with the fourth industrial revolution. Continuous data protection (CDP) is an effective way to limit the heavy losses that data loss can cause. Optimized design methods are available in the CDP field, as are studies on establishing a knowledge base for efficient backup data management. In this paper, a knowledge-based smart file-level CDP scheme is proposed. Machine learning is applied to the user's foundation database and context information, so that the large volumes of file log data and context information that accumulate continuously can be stored in a knowledge base using a B+ tree structure. This gives the data protection management system high performance and flexibility. A comparative evaluation across different security risk levels, conducted to validate the scheme, shows that the proposed method performs better in terms of write/query operations and storage overhead.