With the deepening integration of computing units and physical objects, cyber systems and physical systems are increasingly coupled into cyber-physical systems (CPSs). Because CPS data are governed by the laws of the physical system and by the operational workflow, unknown CPS data can be inferred from other, known data. An inferred data leakage threat arises when an accurate inference path connects low-security-domain data to high-security-domain data. In this paper, by analyzing CPS data leakage incidents caused by data inference, paradigms of data leakage threats are proposed along two dimensions: data theft and data inference. Data inference is classified into three problem types: state estimation, parameter identification, and blind source separation. The algorithms for data inference are categorized as model-driven, data-driven, and hybrid data-model-driven methods. In a case study of an electricity market, key parameters of a power system are inferred from public electricity price data, demonstrating that data inference can pose severe CPS data leakage threats. The challenges facing existing data protection methods are also investigated. Finally, future research directions for CPS data inference defense and data security governance are discussed.