The rapid development of cloud computing and virtualization technology has benefited both academia and industry. For any cloud data center at scale, one of the primary challenges is orchestrating a large number of virtual machines (VMs) in a performance-aware and cost-effective manner. A key problem is that performance interference between VMs can significantly undermine the efficiency of cloud data centers, leading to performance degradation and additional operating costs. Extensive studies have investigated this problem from different aspects. In this survey, we comprehensively investigate the causes of VM interference and provide an in-depth review of existing research and solutions in the literature. We first categorize existing studies on interference models according to their modeling objectives, the metrics used, and their modeling methods. We then revisit interference-aware strategies for scheduling optimization as well as co-optimization-based approaches. Finally, we identify open challenges with respect to VM interference in data centers and discuss possible directions for future research in the area.