Crowdsourcing provides a distributed way to solve tasks that are difficult for computers and require human intelligence. Because it is fast and inexpensive, crowdsourcing is widely used to collect metadata and data annotations in many fields, such as information retrieval, machine learning, recommender systems, and natural language processing. Crowdsourcing enables the collection of rich, large-scale data, which in turn promotes the development of data-driven research. In recent years, considerable effort has been devoted to crowdsourced data collection to address its challenges, including quality control, cost control, efficiency, and privacy protection. In this paper, we introduce the concept and workflow of crowdsourced data collection. We then review the key research topics and related technologies in this workflow, including task design, task-worker matching, response aggregation, incentive mechanisms, and privacy protection. Finally, we discuss the limitations of existing work and identify future development directions.