In recent years, with the advancements in the Deep Learning realm, it has been easy to create and generate synthetically the face swaps from GANs and other tools, which are very realistic, leaving few traces which are unclassifiable by human eyes. These are known as 'DeepFakes' and most of them are anchored in video formats. Such realistic fake videos and images are used to create a ruckus and affect the quality of public discourse on sensitive issues; defaming one's profile, political distress, blackmailing and many more fake cyber terrorisms are envisioned. This work proposes a microscopic-typo comparison of video frames. This temporal-detection pipeline compares very minute visual traces on the faces of real and fake frames using Convolutional Neural Network (CNN) and stores the abnormal features for training. A total of 512 facial landmarks were extracted and compared. Parameters such as eye-blinking lip-synch; eyebrows movement, and position, are few main deciding factors that classify into real or counterfeit visual data. The Recurrent Neural Network (RNN) pipeline learns based on these features-fed inputs and then evaluates the visual data. The model was trained with the network of videos consisting of their real and fake, collected from multiple websites. The proposed algorithm and designed network set a new benchmark for detecting the visual counterfeits and show how this system can achieve competitive results on any fake generated video or image. INDEX TERMS DeepFakes, Generative Adversarial Network (GANs), Facial landmarks, Convolutional Neural Networks (CNN), Recurrent Neural Network (RNN), Visual Counterfeits.