Online social platforms such as Twitter, Facebook, LinkedIn, and WeChat have grown rapidly over the last decade and have become some of the most effective channels for people to communicate and share information. Owing to the "word of mouth" effect, information can spread rapidly on these platforms. It is therefore important to study the mechanisms that drive information diffusion and to quantify the consequences of information spread. Considerable effort has been devoted to this problem, both to improve our understanding of diffusion and to achieve higher performance in viral marketing and advertising. Meanwhile, the development of neural networks has blossomed in the last few years, leading to a large number of graph representation learning (GRL) models. Compared with traditional models, GRL methods are often shown to be more effective. In this paper, we present a comprehensive review of recent works that leverage GRL methods for the popularity prediction problem, and we categorize the related literature into two main classes according to the models and techniques they primarily use: embedding-based methods and deep learning methods. Deep learning methods are further divided into six subclasses: convolutional neural networks, graph convolutional networks, graph attention networks, graph neural networks, recurrent neural networks, and reinforcement learning. We compare the performance of these models and discuss their strengths and limitations. Finally, we outline the challenges and future opportunities for the popularity prediction problem.

Impact Statement-Social platforms have exploded in popularity in the last decade, and information diffusion on social networks has attracted widespread attention in both industry and academia. Companies need to estimate whether marketing strategies will succeed before implementing them, in order to save costs and increase profits.
On the other hand, it is vital to predict the diffusion of misinformation before it spreads too far and to block its further diffusion. It is therefore important to understand the underlying diffusion dynamics of a cascade and its popularity trend. Following recent achievements of graph representation learning in artificial intelligence and other fields, many works have leveraged graph representation learning methods for information diffusion modeling and popularity prediction. This article presents a comprehensive review of recent works that use graph representation learning methods for popularity prediction. Reviewing the different methods can help researchers identify the strengths and limitations of existing models and identify challenges and directions for future work on popularity prediction.