Detecting objects remains one of the most fundamental and challenging tasks in computer vision and image understanding applications. Significant advances in object detection have been achieved through improved object representation and the use of deep neural network models. This paper examines how object detection has evolved in the deep learning era over the past years. We present a literature review of various state-of-the-art object detection algorithms and the underlying concepts behind these methods. We classify these methods into three main groups: anchor-based, anchor-free, and transformer-based detectors; these approaches differ in how they identify objects in an image. We discuss the insights behind these algorithms and present experimental analyses comparing quality metrics, speed/accuracy trade-offs, and training methodologies. The survey compares the major convolutional neural networks used for object detection, covers the strengths and limitations of each object detector model, and draws significant conclusions. We provide simple graphical illustrations summarising the development of object detection methods under deep learning. Finally, we identify directions for future research.
Detecting objects in images is a crucial step in many image and video analysis applications. Object detection is considered one of the main challenges in the field of computer vision, focusing on identifying and locating objects of different classes in an image. In this paper, we highlight the important role of deep learning, and convolutional neural networks in particular, in the object detection task. We analyze the various state-of-the-art convolutional neural networks serving as backbones in object detection models, test and evaluate them on common, up-to-date datasets and benchmarks, and outline the main features of each architecture. We demonstrate that certain convolutional neural network architectures first yielded very promising state-of-the-art results in image classification and then in object detection. The results have surpassed all traditional methods and, in some cases, outperformed human performance.
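To illustrate how a classification backbone is reused inside a detector, the following is a minimal sketch assuming a PyTorch/torchvision setup; the Faster R-CNN model and ResNet-50 FPN backbone are illustrative choices, not necessarily the specific configurations evaluated in the paper.

```python
# Sketch: an ImageNet-pretrained CNN serving as the backbone of an object detector
# (assumed torchvision >= 0.13 for the `weights` argument).
import torch
import torchvision

# Faster R-CNN with a ResNet-50 FPN backbone; the backbone was originally trained
# for image classification before being reused for detection.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Run detection on a dummy image (3 x H x W tensor with values in [0, 1]).
image = torch.rand(3, 480, 640)
with torch.no_grad():
    predictions = model([image])[0]

# Each prediction contains bounding boxes, class labels, and confidence scores.
print(predictions["boxes"].shape, predictions["labels"].shape, predictions["scores"].shape)
```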
The number of images produced each day has increased significantly. The ability to detect and correct an image's orientation can provide several advantages in computer vision. This paper presents a new framework based on a transfer learning technique for automatically detecting image orientation. To leverage the power of deep neural networks, we apply a convolutional neural network model pretrained on the ImageNet database for feature extraction. We then build a multi-class logistic regression classifier to predict the probabilities of the four image orientations (0° for no rotation, 90°, 180°, and 270°). We tested our model on the SUN-397 dataset, one of the most extensive datasets currently used for image-orientation detection tasks, and conducted a cross-dataset evaluation for in-depth testing and analysis. We also examined our model against older and recent state-of-the-art convolutional neural network (CNN) baselines. We demonstrate that our model yields promising results by combining transfer learning for feature extraction with a one-vs-rest logistic regression classifier. Our proposed model surpassed the state-of-the-art results in terms of accuracy and performance.
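The pipeline described above can be sketched as follows; this is a minimal illustration assuming a ResNet-50 feature extractor and scikit-learn's LogisticRegression, so the exact backbone, preprocessing, and hyperparameters used in the paper may differ, and the arrays below are placeholders standing in for features of rotated training images.

```python
# Sketch: pretrained CNN feature extraction + one-vs-rest logistic regression
# over four orientation classes (assumed components, not the paper's exact setup).
import numpy as np
import torch
import torchvision
from sklearn.linear_model import LogisticRegression

# 1) ImageNet-pretrained CNN used purely as a frozen feature extractor.
backbone = torchvision.models.resnet50(weights="IMAGENET1K_V2")
backbone.fc = torch.nn.Identity()  # drop the classification head, keep 2048-d features
backbone.eval()

def extract_features(batch):
    """batch: (N, 3, 224, 224) normalized images -> (N, 2048) feature vectors."""
    with torch.no_grad():
        return backbone(batch).numpy()

# 2) One-vs-rest logistic regression over the four orientations
#    (0 = upright, 1 = 90 deg, 2 = 180 deg, 3 = 270 deg).
#    Placeholder feature/label arrays for illustration only.
X_train = np.random.randn(400, 2048)
y_train = np.random.randint(0, 4, size=400)

clf = LogisticRegression(multi_class="ovr", max_iter=1000)
clf.fit(X_train, y_train)

# 3) Predict per-orientation probabilities for new images.
X_test = np.random.randn(5, 2048)
print(clf.predict_proba(X_test))  # shape (5, 4): one probability per orientation
```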