“…Most recently, Gilbert et al. presented CPU and GPU optimizations for many graph coarsening algorithms and demonstrated significant performance improvements in graph partitioning [27]. There have been recent attempts to use coarsening for embedding [5], [6], [28], [29], [30]; however, they do not utilize specialized processing units such as GPUs, and they employ computationally expensive coarsening algorithms. Distributed embedding approaches have also been proposed to make embedding faster [10], [31], [32].…”