Video tag annotations have become a useful and powerful feature for facilitating video search in many social media and web applications. The majority of tags assigned to videos are supplied by users, a task that is time-consuming and may result in annotations that are subjective and lack precision. A number of studies have utilized content-based extraction techniques to automate tag generation. However, these methods are compute-intensive and challenging to apply across domains. Here, we describe a complementary approach for generating tags based on the geographic properties of videos. With today's sensor-equipped smartphones, the location and orientation of a camera can be continuously acquired in conjunction with the captured video stream. Our novel technique utilizes this sensor metadata to automatically tag outdoor videos in a two-step process. First, we model the viewable scenes of the video as geometric shapes by means of its accompanying sensor data and determine the geographic objects that are visible in the video by querying geo-information databases with the viewable scene descriptions. Subsequently, we extract textual information about the visible objects to serve as tags. Second, we define six criteria to score tag relevance and rank the obtained tags based on these scores. We then associate the tags with the video and with accurately delimited segments of the video. To evaluate the proposed technique, we implemented a prototype tag generator and conducted a user study. The results demonstrate significant benefits of our method in terms of automation and tag utility.
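The first step described above, modeling a viewable scene as a geometric shape and finding the geographic objects inside it, can be illustrated with a minimal sketch. The abstract does not specify the shape or its parameters, so everything here is an assumption: the pie-slice (circular sector) shape, the `ViewableScene` fields, the fixed maximum visible distance, the flat-earth distance approximation, and the in-memory object list standing in for a geo-information database. It also ignores occlusion and the six relevance criteria.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0


@dataclass
class ViewableScene:
    """Hypothetical pie-slice model of what the camera sees at one instant."""
    lat: float             # camera latitude in degrees
    lon: float             # camera longitude in degrees
    heading_deg: float     # compass direction of the optical axis (0 = north)
    view_angle_deg: float  # horizontal angle of view
    max_dist_m: float      # assumed maximum visible distance


def offset_m(scene, lat, lon):
    """East/north offset in metres from the camera to a point.

    Uses an equirectangular approximation, adequate only for short distances.
    """
    d_north = math.radians(lat - scene.lat) * EARTH_RADIUS_M
    d_east = (math.radians(lon - scene.lon) * EARTH_RADIUS_M
              * math.cos(math.radians(scene.lat)))
    return d_east, d_north


def is_visible(scene, lat, lon):
    """Return True if the point lies inside the sector (occlusion is ignored)."""
    east, north = offset_m(scene, lat, lon)
    dist = math.hypot(east, north)
    if dist > scene.max_dist_m:
        return False
    if dist == 0.0:
        return True
    bearing = math.degrees(math.atan2(east, north)) % 360.0
    # Smallest absolute angle between the bearing and the camera heading.
    diff = abs((bearing - scene.heading_deg + 180.0) % 360.0 - 180.0)
    return diff <= scene.view_angle_deg / 2.0


def visible_tags(scene, objects):
    """objects: iterable of (name, lat, lon); returns names inside the sector."""
    return [name for name, lat, lon in objects if is_visible(scene, lat, lon)]


if __name__ == "__main__":
    scene = ViewableScene(lat=0.0, lon=0.0, heading_deg=0.0,
                          view_angle_deg=60.0, max_dist_m=1000.0)
    landmarks = [("tower", 0.005, 0.0), ("bridge", 0.0, 0.005)]
    print(visible_tags(scene, landmarks))
```

Applying such a test to each sensor sample over time would also indicate in which video segments an object is in view, which is the kind of information needed to attach tags to segments rather than to the whole video.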
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results because the compression process often muddles the semantics between different planes. In addition, these data-driven approaches require massive data annotations, which are laborious and time-consuming to produce. For the first problem, we propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics. DOPNet consists of three modules that are integrated to deliver distortion-free, semantics-clean, and detail-sharp disentangled representations, which benefit the subsequent layout recovery. For the second problem, we present an unsupervised adaptation technique tailored for horizon-depth and ratio representations. Concretely, we introduce an optimization strategy for decision-level layout analysis and a 1D cost volume construction method for feature-level multi-view aggregation, both of which are designed to fully exploit the geometric consistency across multiple perspectives. The optimizer provides a reliable set of pseudo-labels for network training, while the 1D cost volume enriches each view with comprehensive scene information derived from other perspectives. Extensive experiments demonstrate that our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
Peer-to-Peer (P2P) applications generate large amounts of Internet traffic. The wide-reaching connectivity of P2P systems creates resource inefficiencies for network providers. Recent studies have demonstrated that localizing cross-ISP (Internet service provider) traffic can mitigate this challenge. However, bandwidth sensitivity and display quality requirements complicate ISP-friendly design for live streaming systems. To date, although some prior techniques focusing on live streaming systems exist, the correlation between traffic localization and streaming quality guarantees has not been well explored. Additionally, the proposed solutions are often not easy to apply in practice. In this work, we demonstrate that the cross-ISP traffic of P2P live streaming systems can be significantly reduced with little impact on streaming quality. First, we analytically investigate and quantify the tradeoff between traffic localization and streaming quality guarantees, determining the lower bound of the inter-AS (autonomous system) streaming rate below which streaming quality cannot be preserved. Based on this analysis, we further propose a practical ISP-friendly solution, termed IFPS, which requires only minor changes to the peer selection mechanism and can easily be integrated into both new and existing systems. Additionally, the significant opportunity for localizing traffic is underscored by our collected traces from PPLive, which also enabled us to derive realistic parameters to guide our simulations. The experimental results demonstrate that IFPS reduces cross-ISP traffic from 81% up to 98% while keeping streaming quality virtually unaffected.
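The general idea of changing only the peer selection step, while never dropping below a minimum amount of inter-AS connectivity, can be sketched as follows. This is not the IFPS algorithm, which the abstract does not specify. The function name `select_neighbors`, the `min_remote` parameter (a crude stand-in for the analytically derived lower bound on the inter-AS streaming rate), and the candidate format are all hypothetical.

```python
import random


def select_neighbors(candidates, local_as, n_neighbors, min_remote, rng=None):
    """Pick neighbors with a locality bias (illustrative, not IFPS itself).

    candidates:  list of (peer_id, as_number) tuples
    local_as:    AS number of the selecting peer
    n_neighbors: total number of neighbors wanted
    min_remote:  assumed minimum number of inter-AS neighbors to keep
    """
    rng = rng or random.Random()
    local = [p for p in candidates if p[1] == local_as]
    remote = [p for p in candidates if p[1] != local_as]
    rng.shuffle(local)
    rng.shuffle(remote)

    # Reserve the minimum number of inter-AS neighbors first.
    n_remote = min(min_remote, len(remote), n_neighbors)
    chosen = remote[:n_remote]

    # Fill the remaining slots with same-AS peers to localize traffic.
    chosen += local[: n_neighbors - n_remote]

    # If the local AS has too few peers, top up with further remote peers.
    shortfall = n_neighbors - len(chosen)
    if shortfall > 0:
        chosen += remote[n_remote : n_remote + shortfall]
    return chosen


if __name__ == "__main__":
    peers = [(f"L{i}", 1) for i in range(10)] + [(f"R{i}", 2) for i in range(10)]
    picked = select_neighbors(peers, local_as=1, n_neighbors=8, min_remote=2,
                              rng=random.Random(0))
    print(picked)
```

In a real system the reserved inter-AS share would be expressed as a streaming rate rather than a neighbor count, and would be set from the kind of analysis the abstract describes.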
Estimating the depth of omnidirectional images is more challenging than estimating that of normal field-of-view (NFoV) images because the varying distortion can significantly twist an object's shape. Existing methods suffer from this distortion when estimating the depth of omnidirectional images, leading to inferior performance. To reduce the negative impact of distortion, we propose a distortion-tolerant omnidirectional depth estimation algorithm using a dual-cubemap. It comprises two modules: a Dual-Cubemap Depth Estimation (DCDE) module and a Boundary Revision (BR) module. In the DCDE module, we present a rotation-based dual-cubemap model to estimate accurate NFoV depth, reducing the distortion at the cost of boundary discontinuity in the omnidirectional depths. The boundary revision module is then designed to smooth the discontinuous boundaries, which contributes to precise and visually continuous omnidirectional depths. Extensive experiments demonstrate the superiority of our method over other state-of-the-art solutions.