The Internet of Things (IoT), a typical representation of cyberization, interconnects physical things with the Internet and thereby provides intelligent, advanced services for industrial production and daily life. However, the heterogeneity, complexity, and dynamic nature of IoT also bring new challenges to IoT applications. In particular, it is difficult to determine the source of a given piece of data, which leaves the data vulnerable to attacks introduced by different parties during transmission and processing. To address these issues, data provenance has been introduced: it records the origin of data and the history of its generation and processing, making it possible to trace the sources and causes of any problems. Although some related work has been proposed, the literature still lacks a comprehensive survey on data provenance in IoT. In this paper, we first propose a number of design requirements for data provenance in IoT by analyzing the features of IoT data and applications. We then provide an in-depth review of existing IoT data provenance schemes and use the requirements to discuss their pros and cons. Finally, we summarize a number of open issues to direct future research.