In image and video data, a visual pattern is a recurring composition of visual primitives. Such patterns capture the essence of image and video data and convey rich information. However, unlike frequent patterns in transaction data, visual primitives exhibit considerable variation in visual content and complex spatial structure, which makes the effective discovery of visual patterns a challenging task. Many methods have been proposed over the past decade to address the problem of visual pattern discovery. In this article, we review the major progress in this area. We categorize the existing methods into two groups: bottom-up pattern discovery and top-down pattern modeling. Bottom-up methods start from unordered visual primitives and progressively merge them until larger visual patterns are found. In contrast, top-down methods start by modeling the compositions of visual primitives and then infer the pattern discovery result from the model. We also summarize related applications. Finally, we identify open issues for future research.