When reconstructing a 3D object, it is difficult to obtain accurate 3D geometric information using a single camera. Capturing detailed geometric information of a 3D object inevitably requires increasing the number of cameras. However, the cameras must be synchronized so that frames are captured simultaneously; if they are incorrectly synchronized, many artifacts appear in the reconstructed 3D object. The RealSense RGB-D camera, which is commonly used for obtaining geometric information of a 3D object, provides synchronization modes to mitigate synchronization errors. However, the synchronization modes provided by the RealSense cameras can only synchronize the depth cameras, and the number of cameras that a single host can synchronize is limited by the hardware's ability to transmit data stably. Therefore, in this paper, we propose a novel synchronization method that synchronizes an arbitrary number of RealSense cameras by adjusting the number of hosts to support stable data transmission. Our method establishes a master–slave architecture to synchronize the system clocks of the hosts. While synchronizing the system clocks, the delays introduced by the synchronization process are estimated so that the difference between the system clocks can be minimized. Once the system clocks are synchronized, cameras connected to different hosts can be synchronized based on the timestamps of the data received by the hosts. Thus, our method synchronizes the RealSense cameras to capture accurate 3D information of an object simultaneously, at a constant frame rate, without dropping frames.
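The host-clock synchronization and timestamp-based frame matching described above can be illustrated with a minimal sketch. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses a classic NTP-style request/response exchange to estimate each slave host's clock offset and round-trip delay relative to the master, then groups frames from cameras on different hosts when their offset-corrected timestamps fall inside a tolerance window. All names (`estimate_offset`, `group_synchronized`, the 5 ms tolerance) are hypothetical.

```python
# Hypothetical sketch of master-slave clock-offset estimation and
# timestamp-based frame grouping across hosts (not the paper's code).

from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


def estimate_offset(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """NTP-style estimate from one request/response exchange (all times in ms).

    t1: request sent (slave clock)      t2: request received (master clock)
    t3: response sent (master clock)    t4: response received (slave clock)
    Returns (offset of the slave clock relative to the master, round-trip delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


@dataclass
class Frame:
    camera_id: str
    timestamp_ms: float  # timestamp in the capturing host's own clock


def group_synchronized(frames: List[Frame],
                       offsets_ms: Dict[str, float],
                       tolerance_ms: float = 5.0) -> Optional[List[Frame]]:
    """Return the frames as one synchronized set if, after correcting each
    timestamp by its host's estimated offset, they all fall within the
    tolerance window; otherwise return None and let the caller drop the set."""
    corrected = [f.timestamp_ms - offsets_ms[f.camera_id] for f in frames]
    if max(corrected) - min(corrected) <= tolerance_ms:
        return frames
    return None


if __name__ == "__main__":
    # One exchange: slave sends at 100.0 ms, master receives at 101.2 ms,
    # master replies at 101.3 ms, slave receives the reply at 102.1 ms.
    offset, delay = estimate_offset(100.0, 101.2, 101.3, 102.1)
    print(f"offset={offset:.2f} ms, round-trip delay={delay:.2f} ms")

    frames = [Frame("cam_on_master", 5000.0), Frame("cam_on_slave", 5003.0)]
    offsets = {"cam_on_master": 0.0, "cam_on_slave": offset}
    print("synchronized set:", group_synchronized(frames, offsets))
```

In practice, the offset estimate would be refined over repeated exchanges (e.g., keeping the exchange with the smallest round-trip delay), which is how the delay estimation mentioned in the abstract keeps the clock difference small.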
Deep learning-based object detection is one of the most popular research topics. However, when large-scale datasets are unavailable, training detection models remains challenging due to the data-driven nature of deep learning. Small object detection in infrared images is such a case. To solve this problem, we propose a YOLOv5-based framework with a novel training strategy based on domain adaptation. First, an auxiliary domain classifier is combined with the YOLOv5 architecture to compose a detection framework that can be trained on datasets from multiple domains while keeping the computational cost of the inference stage unchanged. Second, a new loss function based on the Wasserstein distance is proposed to handle small objects, overcoming the sensitivity of the intersection over union (IoU) at small scales. Then, a model training strategy inspired by domain adaptation and knowledge distillation is presented. Using the domain confidence output of the domain classifier as a soft label, a domain confusion loss is backpropagated to force the model to extract domain-invariant features while it is trained on datasets with imbalanced distributions. Additionally, we generate a synthetic dataset in both the visible-light and infrared spectra to overcome the data shortage. The proposed framework is trained on the MS COCO, VEDAI, DOTA, and ADAS Thermal datasets, along with the constructed synthetic dataset, for human detection and vehicle detection tasks. The experimental results show that the proposed framework achieves the best mean average precision (mAP) of 64.7 for human detection and 57.5 for vehicle detection. Additionally, an ablation experiment shows that the proposed training strategy improves performance by training the model to extract domain-invariant features.
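The Wasserstein-distance-based box loss mentioned above can be sketched as follows. The sketch assumes the widely used normalized Gaussian Wasserstein distance formulation for tiny objects, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian; the normalization constant C and the exact loss used in the paper are assumptions, so this is an illustration rather than the authors' definition.

```python
# Hypothetical PyTorch sketch of a normalized Gaussian Wasserstein distance
# (NWD) box loss, in the spirit of the Wasserstein-based loss in the abstract.

import torch


def wasserstein_box_loss(pred: torch.Tensor,
                         target: torch.Tensor,
                         C: float = 12.8) -> torch.Tensor:
    """pred, target: (N, 4) boxes given as (cx, cy, w, h).

    Squared 2-Wasserstein distance between the Gaussians
    N((cx, cy), diag(w^2/4, h^2/4)):
        W2^2 = dcx^2 + dcy^2 + ((w1 - w2) / 2)^2 + ((h1 - h2) / 2)^2
    The loss 1 - exp(-sqrt(W2^2) / C) stays informative even when boxes do not
    overlap, which is where IoU-based losses degrade for small objects.
    """
    dcx = pred[:, 0] - target[:, 0]
    dcy = pred[:, 1] - target[:, 1]
    dw = (pred[:, 2] - target[:, 2]) / 2.0
    dh = (pred[:, 3] - target[:, 3]) / 2.0
    w2_sq = dcx ** 2 + dcy ** 2 + dw ** 2 + dh ** 2
    nwd = torch.exp(-torch.sqrt(w2_sq + 1e-7) / C)  # epsilon for stable sqrt
    return (1.0 - nwd).mean()


if __name__ == "__main__":
    pred = torch.tensor([[10.0, 10.0, 6.0, 6.0]])
    gt = torch.tensor([[12.0, 11.0, 5.0, 5.0]])
    print(wasserstein_box_loss(pred, gt))  # non-zero gradient even at zero IoU
```

A loss of this shape would be combined with the usual YOLOv5 objectness and classification terms, and the domain confusion loss from the domain classifier would be added on top during multi-domain training.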