The trend of learning from videos rather than documents continues to grow. Hundreds or even thousands of videos may exist on a single topic, varying in context, content, and depth. The literature suggests that learners are now less interested in viewing a complete video and prefer to jump to the topics that interest them. This creates a need for topic-wise indexing of video lectures. Manual annotation, or topic-wise indexing, is not new for videos; however, manual indexing is time-consuming given the length of a typical video lecture and its intricate structure. Automatic indexing and annotation therefore offer a more efficient solution. This research addresses the need for automatic video indexing to improve information retrieval and ease navigation to topics within a video. The automatically identified topics are referred to as "Index Points." A custom object detector is built with a 137-layer YOLOv4 Darknet neural network. The model is trained on approximately 6,000 video frames and tested on a suite of 50 videos totaling around 20 hours of run time. Shot boundary detection is performed using structural similarity (SSIM) fused with a binary search pattern, which outperforms the state-of-the-art SSIM technique, reducing processing time to approximately 21% of the baseline while providing around 96% accuracy. The accuracy of the generated index points, in terms of true positives and false negatives, is evaluated using precision, recall, and F1 score, which range between 60% and 80% per video. The results show that the proposed algorithm successfully generates a digital index with reasonable accuracy in topic detection.
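The following is a minimal sketch of the kind of SSIM-plus-binary-search shot boundary detection described above, assuming the usual formulation: endpoints of a fixed window are compared first, and only dissimilar windows are binary-searched for the cut. The function names (`frame_at`, `find_boundary`, `detect_shot_boundaries`) and the `window` and `threshold` parameters are illustrative, not taken from the paper.

```python
import cv2
from skimage.metrics import structural_similarity as ssim


def frame_at(cap, idx):
    """Read and return the grayscale frame at a given frame index."""
    cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
    ok, frame = cap.read()
    if not ok:
        return None
    return cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)


def find_boundary(cap, lo, hi, threshold):
    """Binary-search [lo, hi] for the first dissimilar transition.

    Assumes at most one cut inside the window (an illustrative
    simplification, not a claim from the paper).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # If the left half is still similar, the cut lies in the right half.
        if ssim(frame_at(cap, lo), frame_at(cap, mid)) >= threshold:
            lo = mid
        else:
            hi = mid
    return hi


def detect_shot_boundaries(path, window=30, threshold=0.7):
    """Scan the video in fixed windows, binary-searching only dissimilar ones."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    boundaries = []
    start = 0
    while start + window < total:
        end = start + window
        a, b = frame_at(cap, start), frame_at(cap, end)
        if a is None or b is None:
            break
        # Endpoints similar: assume no cut and skip the whole window.
        # Skipping windows is where the speedup over per-pair SSIM comes from.
        if ssim(a, b) < threshold:
            boundaries.append(find_boundary(cap, start, end, threshold))
        start = end
    cap.release()
    return boundaries
```

Under these assumptions, a window with no cut costs a single SSIM comparison instead of one per frame pair, and a window containing a cut costs O(log window) comparisons, which is the kind of reduction that would account for the reported drop in processing time.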