Predicting pedestrian trajectories in urban scenarios is a challenging task with a wide range of applications, from video surveillance to autonomous driving. The task is difficult because pedestrian behavior is affected by each pedestrian's own path history, by interactions with other pedestrians, and by interactions with the environment. This paper introduces an attention-based, interaction-aware spatio-temporal graph neural network for predicting pedestrian trajectories. The approach is based on two components: a spatial graph neural network (SGNN) for interaction modeling and a temporal graph neural network (TGNN) for motion feature extraction. The SGNN uses an attention mechanism to periodically collect the spatial interactions between all pedestrians. The TGNN also employs an attention mechanism, in this case to collect each pedestrian's temporal motion pattern. Finally, a time-extrapolator convolutional neural network (CNN) is applied along the temporal dimension of the graph features to predict the trajectories. With a smaller variable size (data and model) and better accuracy, the proposed method, D-STGCN, is more compact, more efficient, and more accurate than Social-STGCNN. Moreover, on three video surveillance datasets (ETH, UCY, and SDD), D-STGCN achieves better experimental results in terms of the average displacement error (ADE) and final displacement error (FDE) metrics, in addition to predicting more socially plausible trajectories.
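
For readers unfamiliar with the two metrics, the sketch below shows how ADE and FDE are commonly computed for a single pedestrian: ADE is the mean Euclidean distance between predicted and ground-truth positions over the prediction horizon, and FDE is that distance at the final predicted step. The function name `displacement_errors` and the toy coordinates are illustrative assumptions, not taken from the paper, and the exact evaluation protocol used in the experiments (for example, how errors are aggregated over pedestrians or samples) is not specified here.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one pedestrian.

    pred, truth: equal-length lists of (x, y) positions covering the
    prediction horizon. Hypothetical helper for illustration only.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("pred and truth must be non-empty and equal in length")
    # Per-step Euclidean distance between prediction and ground truth.
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    ade = sum(dists) / len(dists)  # average over all predicted steps
    fde = dists[-1]                # error at the final predicted step
    return ade, fde


# Toy example (made-up coordinates): the prediction moves along the x-axis
# while the ground truth moves diagonally.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
print(f"ADE={ade:.2f}, FDE={fde:.2f}")
```

Lower values are better for both metrics; FDE is always at least as sensitive to long-horizon drift as ADE, since it looks only at the last step.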