Content-based image retrieval (CBIR) is applied to video search by fine-tuning pre-trained, off-the-shelf features. CBIR is an intuitive approach to image retrieval, but fine-tuning still requires labeled datasets, and annotation is costly. We therefore explored an unsupervised model for extracting features from image content. We used a variational auto-encoder (VAE) with expanded channels in its neural networks and studied the activations of the layer outputs. In this study, the channel-expansion method boosted retrieval capability by exploring more kernels and by selecting the layer whose activations best cover the object region. The experiments compared channel-expansion settings and visualized each layer of the encoder network. The proposed model achieved 52.7% mAP on the MNIST dataset, outperforming the existing VAE (36.5%).
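As an illustration of the channel-expansion idea, here is a minimal PyTorch sketch of a convolutional VAE encoder whose widths are scaled by an expansion factor; the layer sizes, latent dimension, and class name are assumptions for illustration, not the authors' exact configuration.

```python
# Minimal sketch: convolutional VAE encoder with an expansion factor that
# widens every layer (more kernels). Widths/latent size are illustrative.
import torch
import torch.nn as nn

class ExpandedVAEEncoder(nn.Module):
    def __init__(self, base_ch=32, expand=4, latent_dim=16):
        super().__init__()
        ch = base_ch * expand  # channel expansion: explore more kernels
        self.features = nn.Sequential(
            nn.Conv2d(1, ch, 3, stride=2, padding=1),       # 28x28 -> 14x14
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1),  # 14x14 -> 7x7
            nn.ReLU(inplace=True),
        )
        self.fc_mu = nn.Linear(2 * ch * 7 * 7, latent_dim)
        self.fc_logvar = nn.Linear(2 * ch * 7 * 7, latent_dim)

    def forward(self, x):
        h = self.features(x).flatten(1)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return z, mu, logvar

# Intermediate activations of `features` can be inspected per layer to pick
# the one whose responses best cover the object region for retrieval.
z, mu, logvar = ExpandedVAEEncoder()(torch.randn(8, 1, 28, 28))  # MNIST-sized
```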
Recent image completion methods can erase an obstacle and fill the resulting hole realistically, but placing a new object in that hole cannot be solved by existing image completion. To address this problem, this paper proposes an image completion method that fills the hole with a new object generated from a sketch image. The proposed network uses the pix2pix image translation model to generate the object image from the sketch. The image completion network uses gated convolutions to reduce the weight of meaningless pixels during convolution, and the WGAN-GP loss to reduce mode dropping. In addition, a contextual attention layer in the middle of the network lets image completion reference feature values at distant pixels. For training, the Places2 dataset was used as background data for image completion, and the Stanford Dogs dataset was used as training data for pix2pix. In the experiments, a dog image was generated well from the sketch image, and feeding this image into the image completion network produced a realistic result.
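The gated-convolution mechanism can be sketched compactly: a second convolution predicts a per-pixel gate in [0, 1] that scales the features, attenuating meaningless (e.g. hole) pixels. The code below is a generic illustration of that mechanism, not the paper's implementation.

```python
# Sketch of a gated convolution: a learned soft gate down-weights features
# from uninformative pixels (e.g. inside the hole to be completed).
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        self.gate = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)

    def forward(self, x):
        # sigmoid gate in [0, 1] scales each output feature per pixel
        return torch.tanh(self.feature(x)) * torch.sigmoid(self.gate(x))

# Example: masked RGB input with the binary mask stacked as a 4th channel.
out = GatedConv2d(4, 64)(torch.randn(1, 4, 256, 256))  # -> (1, 64, 256, 256)
```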
As deep learning applications in object recognition, object detection, segmentation, and image generation are increasingly needed, related research has been actively conducted. This paper proposes a method that combines segmentation and style transfer to render a desired style in a desired region of real-time video. Two deep neural networks were used to get as close to real time as possible, given the trade-off between speed and accuracy: a modified BiSeNet for segmentation and CycleGAN for style transfer, processed on a desktop PC equipped with two RTX 2080 Ti GPU boards. This enables real-time processing of SD video at a decent level. We obtained good subjective quality when segmenting the road area in city-street video and changing it to a grass style at no less than 6 fps.
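A per-frame pipeline for this kind of system can be sketched as below; the model objects, the road class index, and the compositing step are hypothetical placeholders standing in for the modified BiSeNet and CycleGAN, not the authors' code.

```python
# Per-frame sketch: segment the road, restyle it as grass, and composite.
# `seg_model` and `style_model` are placeholders for the trained networks.
import torch

def process_frame(frame, seg_model, style_model, road_class=0):
    """frame: (1, 3, H, W) float tensor in [0, 1]."""
    with torch.no_grad():
        logits = seg_model(frame)                         # (1, C, H, W)
        road = (logits.argmax(1) == road_class)[:, None]  # (1, 1, H, W) bool
        styled = style_model(frame)                       # grass-styled frame
    m = road.float()
    return styled * m + frame * (1 - m)                   # blend by road mask
```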
In this paper, the DCT (Discrete Cosine Transform) and CAVLC (Context-Adaptive Variable-Length Coding) are implemented as hardware IPs, co-designed with software implementations of the other modules of an H.264/AVC codec. To increase operation speed, a new method using a SHIFT table is proposed, which improves speed by about 16%. The designed hardware IPs are downloaded to a Virtex-4 FX60 FPGA on the ML-410 development board, and H.264/AVC encoding is performed with a MicroBlaze CPU implemented in the FPGA. The software modules were developed as C code derived from the JM 13.2 reference software. The designed hardware IPs were verified by functional simulation in ModelSim. With all hardware IPs and software modules downloaded to the FPGA, processing speed improved by a factor of about 16 for the DCT hardware IP and about 10 for CAVLC compared with software-only processing. Although this paper deals with hardware/software co-design for H.264, the approach can be applied to other embedded system designs.
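The idea of replacing multiplications with shifts can be illustrated on the H.264 4x4 forward integer core transform, whose only constant multiplications are by 2 and so reduce to left shifts; this Python sketch shows the arithmetic in the spirit of the SHIFT-table optimization, not the paper's hardware table layout.

```python
# H.264 4x4 forward integer core transform (butterfly form): the constant
# multiplies by 2 become left shifts, illustrating shift-based speedup.
def forward_transform_4x4(block):
    """block: 4x4 nested list of integer residual samples."""
    def rows(b):
        out = []
        for x0, x1, x2, x3 in b:
            s0, s1 = x0 + x3, x1 + x2
            d0, d1 = x0 - x3, x1 - x2
            out.append([s0 + s1,
                        (d0 << 1) + d1,   # 2*d0 + d1 via shift
                        s0 - s1,
                        d0 - (d1 << 1)])  # d0 - 2*d1 via shift
        return out

    t = [list(c) for c in zip(*rows(block))]  # row pass, then transpose
    return [list(c) for c in zip(*rows(t))]   # column pass, transpose back

coeffs = forward_transform_4x4([[5, -3, 2, 0], [1, 4, -2, 3],
                                [0, 2, 2, -1], [3, -1, 0, 1]])
```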