As the bandwidth available to average users increases, multimedia data, in particular images and videos, have become the fastest‐growing data type on the Internet. With the popularity of social media in particular, the number of images and videos available on the Web has grown exponentially. Among these huge volumes of images and videos, there exist large numbers of near‐duplicates and copies. Near‐duplicates are both a blessing and a source of redundancy. On one hand, they provide rich visual clues for indexing and summarizing broadcast videos from different channels. On the other hand, the excessive number of near‐duplicates makes browsing web videos streamed over the Internet an extremely time‐consuming task. In this paper, we provide an overview of state‐of‐the‐art research on algorithms and techniques for large‐scale mining of visual near‐duplicates. The overview covers image representation, image feature indexing and matching, pattern modeling, scalable detection, and emerging multimedia applications built upon near‐duplicate detection.