With the rapid growth of digital multimedia technology, multimedia data such as digital images and videos are becoming increasingly important in diverse domains. This includes surveillance CCTV footage, which serves as primary evidence in many court cases, including highly sensitive ones. Today, with the wide availability of low-cost image and video editing software, digital images and videos have become highly vulnerable to manipulation attacks; one such class of attack is object-based forgery in surveillance videos. In this paper, we propose a Capsule Network based digital forensic technique for detecting object-based forgery in surveillance videos. The proposed technique computes the motion residual of every video frame, which captures the inherent intra- and inter-frame statistical characteristics of the video sequence, and uses it as the input to the capsule network. Our experimental results indicate that the proposed technique achieves strong performance in detecting authentic, double compressed, and forged frames, irrespective of the group of pictures (GOP) length and the degree of compression of the videos.
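To make the notion of a per-frame motion residual concrete, the following is a minimal sketch. The abstract does not define the residual, so the sketch assumes one common formulation: the absolute difference between a frame and the temporal median of its neighbouring frames. The function name `motion_residual` and the parameter `half_window` are hypothetical and are not taken from the paper.

```python
import numpy as np


def motion_residual(frames, half_window=2):
    """Per-frame motion residual (assumed formulation, not the paper's).

    frames: array of shape (num_frames, height, width), grayscale.
    half_window: hypothetical parameter; number of neighbouring frames
        taken on each side when estimating the static background.
    """
    frames = np.asarray(frames, dtype=np.float64)
    num_frames = frames.shape[0]
    residuals = np.empty_like(frames)
    for k in range(num_frames):
        # Clip the temporal window at the start and end of the sequence.
        start = max(0, k - half_window)
        stop = min(num_frames, k + half_window + 1)
        # The temporal median approximates the static background.
        background = np.median(frames[start:stop], axis=0)
        # What remains after removing the background is the residual.
        residuals[k] = np.abs(frames[k] - background)
    return residuals
```

Under this assumption, a static scene yields an all-zero residual, while a transient object leaves a non-zero response only in the frames and pixels where it appears. Residual maps of this kind would then be passed to the classifier in place of the raw frames.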