Deep Convolutional Neural Networks (CNNs) are the state-of-the-art performers for the object detection task. It is well known that object detection requires more computation and memory than image classification. In this work, we propose LCDet, a fully-convolutional neural network for generic object detection that is designed to run on embedded systems. We design and develop an end-to-end TensorFlow (TF)-based model. Detection is performed in a single forward pass through the network. Additionally, we employ 8-bit quantization on the learned weights. As a use case, we choose face detection and train the proposed model on images containing a varying number of faces of different sizes. We evaluate the face detection performance on the publicly available FDDB and Widerface datasets. Our experimental results show that the proposed method achieves accuracy comparable to state-of-the-art CNN-based face detection methods while reducing the model size by 3× and memory bandwidth by 3-4× compared with YOLO [23], one of the best real-time CNN-based object detectors. Our 8-bit fixed-point TF model provides an additional 4× memory reduction while keeping the accuracy nearly as good as that of the floating-point model, and achieves a 20× performance gain compared to the floating-point model. Thus, the proposed model is amenable to embedded implementations and is generic enough to be extended to any number of object categories.
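As a rough illustration of the 8-bit weight quantization mentioned above, the sketch below applies a simple per-tensor affine mapping from float32 to uint8 in NumPy. The function names and the exact scheme are illustrative assumptions; the abstract does not spell out the paper's TensorFlow fixed-point pipeline, only the ~4× memory reduction that any 8-bit representation of float32 weights yields.

```python
# Minimal sketch of per-tensor affine 8-bit weight quantization (assumed scheme,
# not necessarily the paper's exact TF pipeline).
import numpy as np

def quantize_weights(w, num_bits=8):
    """Map float32 weights to uint8 plus (scale, zero_point) metadata."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / (qmax - qmin)
    if scale == 0.0:                      # constant tensor: avoid division by zero
        scale = 1.0
    zero_point = int(np.clip(round(qmin - w_min / scale), qmin, qmax))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_weights(q, scale, zero_point):
    """Recover approximate float weights for accuracy checks."""
    return scale * (q.astype(np.float32) - zero_point)

# Hypothetical conv kernel: float32 -> uint8 gives a 4x smaller weight tensor.
w = np.random.randn(3, 3, 64, 128).astype(np.float32)
q, s, z = quantize_weights(w)
print(w.nbytes / q.nbytes)                                 # 4.0
print(np.abs(w - dequantize_weights(q, s, z)).max())       # error on the order of the scale
```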
Frame rate up conversion (FRUC) methods that employ motion have been proven to provide better image quality compared to non-motion-based methods. While motion-based methods improve the quality of interpolation, artifacts are introduced in the presence of incorrect motion vectors. In this paper, we study the design problem of the optimal temporal interpolation filter for motion-compensated FRUC (MC-FRUC). The optimal filter is obtained by minimizing the prediction error variance between the original frame and the interpolated frame. In FRUC applications, the original frame that is skipped is not available at the decoder, so models for the power spectral density of the original signal and the prediction error are used to formulate the problem. The closed-form solution for the filter is obtained by Lagrange multipliers and statistical motion vector error modeling. The effect of motion vector errors on the resulting optimal filters and prediction error is analyzed. The performance of the optimal filter is compared to nonadaptive temporal averaging filters by using two different motion vector reliability measures. The results confirm that to improve the quality of temporal interpolation in MC-FRUC, the interpolation filter should be designed based on the reliability of the motion vectors and the statistics of the MC prediction error.
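To make the variance-minimizing idea concrete, the following sketch blends forward and backward motion-compensated block predictions with weights from the unit-sum Lagrange-multiplier solution, under the simplifying assumption of uncorrelated zero-mean prediction errors. The paper's closed form additionally relies on power-spectral-density models, which are omitted here; all names and the per-block error variances are illustrative.

```python
# Sketch of a variance-minimizing temporal interpolation filter for MC-FRUC,
# assuming uncorrelated zero-mean errors in the two predictions.
import numpy as np

def optimal_weights(var_fwd, var_bwd, eps=1e-12):
    """Minimize w_f^2*var_fwd + w_b^2*var_bwd subject to w_f + w_b = 1.
    The Lagrange-multiplier solution is w_f = var_bwd / (var_fwd + var_bwd)."""
    w_f = (var_bwd + eps) / (var_fwd + var_bwd + 2 * eps)
    return w_f, 1.0 - w_f

def blend_block(pred_fwd, pred_bwd, var_fwd, var_bwd):
    """Weighted combination of forward and backward motion-compensated predictions."""
    w_f, w_b = optimal_weights(var_fwd, var_bwd)
    return w_f * pred_fwd + w_b * pred_bwd

# Hypothetical 8x8 block: the backward MV is less reliable, so it receives less weight.
pf = np.random.rand(8, 8).astype(np.float32)
pb = pf + 0.1 * np.random.randn(8, 8).astype(np.float32)
block = blend_block(pf, pb, var_fwd=0.01, var_bwd=0.04)
```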
Frame rate up conversion methods that employ motion have been proven to provide better image quality compared to non-motion-based methods. While motion-based methods improve the quality of interpolation, artifacts are introduced in the presence of incorrect motion vectors. In this paper, the effect of motion vector errors on the efficiency of motion-compensated frame rate up conversion (MC-FRUC) and the temporal interpolation filter is analyzed. The problem of MC-FRUC is investigated by combining blocks with different motion vector errors, as opposed to an equal-error assumption. A general expression for the temporal interpolation filter is derived. To improve the efficiency of the prediction, it is shown that the interpolator should be able to switch between uni-directional and bi-directional modes based on motion vector reliability.
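The switching behaviour described above can be sketched as follows, assuming per-block motion-vector reliability scores (for example, motion-compensated residual energies) are already available. The score definition and the threshold are illustrative assumptions, not the reliability measures used in the paper.

```python
# Sketch of per-block switching between uni- and bi-directional interpolation,
# driven by assumed motion-vector reliability scores.
import numpy as np

def interpolate_block_switching(pred_fwd, pred_bwd, err_fwd, err_bwd, thresh=0.05):
    """Pick the interpolation mode for one block based on MV reliability."""
    fwd_ok = err_fwd < thresh                # forward MV deemed reliable
    bwd_ok = err_bwd < thresh                # backward MV deemed reliable
    if fwd_ok and bwd_ok:                    # both reliable: bi-directional averaging
        return 0.5 * (pred_fwd + pred_bwd)
    if fwd_ok:                               # only forward reliable: uni-directional
        return pred_fwd
    if bwd_ok:                               # only backward reliable: uni-directional
        return pred_bwd
    return 0.5 * (pred_fwd + pred_bwd)       # no reliable MV: fall back to averaging

# Hypothetical 8x8 block where the backward MV is unreliable.
pf, pb = np.random.rand(8, 8), np.random.rand(8, 8)
block = interpolate_block_switching(pf, pb, err_fwd=0.02, err_bwd=0.30)
```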