Non-professional video, commonly known as User Generated Content (UGC), has become very popular in today's video sharing applications. However, traditional metrics used in compression and quality assessment, such as BD-Rate and PSNR, are designed for pristine originals. Thus, their accuracy drops significantly when they are applied to non-pristine originals (the majority of UGC). Understanding the difficulties of compression and quality assessment in the UGC scenario is important, but few public UGC datasets are available for research. This paper introduces a large-scale UGC dataset (1500 20-second video clips) sampled from millions of YouTube videos. The dataset covers popular categories such as Gaming and Sports, and new features such as High Dynamic Range (HDR). Besides a novel sampling method based on features extracted from encoding, challenges for UGC compression and quality evaluation are also discussed. Shortcomings of traditional reference-based metrics on UGC are addressed. We demonstrate a promising way to evaluate UGC quality with no-reference objective quality metrics, and evaluate the current dataset with three no-reference metrics (Noise, Banding, and SLEEQ).
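To make the point about reference-based metrics concrete, the following is a minimal sketch of the standard PSNR definition over two equally sized lists of 8-bit pixel values. The function name `psnr` and the flat-list input format are assumptions for illustration, not something taken from the paper.

```python
import math

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Illustrative helper (hypothetical name); `peak` assumes 8-bit samples.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the distorted signal.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the score only measures distance from `reference`, any flaw already present in the reference (noise, banding, earlier compression) is treated as signal to be preserved, which is why such a metric becomes less informative when the original is not pristine.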
Adaptive bit rate (ABR) streaming is one enabling technology for video streaming over modern throughput-varying communication networks. A widely used ABR streaming method is to adapt the video bit rate to channel throughput by dynamically changing the video resolution. Since videos have different rate-quality performance at different resolutions, such an ABR strategy can achieve a better rate-quality trade-off than single-resolution ABR streaming. The key problem for resolution-switched ABR is to work out the appropriate bit rate at each resolution. In this paper, we investigate optimal strategies to estimate this bit rate using both quantitative and subjective quality assessment. We use the design of bit rates for 2K and 4K resolutions as an example of the performance of this strategy. We introduce strategies for selecting an appropriate corpus for subjective assessment and find that at this high resolution there is good agreement between quantitative and subjective analysis. The optimal switching bit rate between 2K and 4K resolutions is 4 Mbps.
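One way to picture a switching bit rate is as the point where the rate-quality curve of the higher resolution overtakes that of the lower one. The sketch below finds that crossover by linear interpolation between sampled points. The function name `crossover_bitrate` and the sampled-curve input format are assumptions, and this is not presented as the estimation strategy used in the paper.

```python
def crossover_bitrate(bitrates, q_low, q_high):
    """Bit rate at which the higher-resolution curve first overtakes the lower one.

    bitrates: increasing sample points (e.g. in Mbps).
    q_low, q_high: quality scores of the lower and higher resolution at those points.
    Returns None if no crossover occurs inside the sampled range.
    """
    for k in range(1, len(bitrates)):
        d0 = q_high[k - 1] - q_low[k - 1]  # quality gap at the previous sample
        d1 = q_high[k] - q_low[k]          # quality gap at the current sample
        if d0 < 0 <= d1:
            # Linear interpolation to the point where the gap reaches zero.
            t = -d0 / (d1 - d0)
            return bitrates[k - 1] + t * (bitrates[k] - bitrates[k - 1])
    return None
```

The quality scores could come from an objective metric or from subjective ratings; the interpolation itself does not depend on which is used.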
Location-based services are proving to be the next driving factor for growth in smartphones. While GPS solves the problem of accurate localization in outdoor environments, indoor localization is still an area of active research. The emergence of new-generation smartphones with low-cost sensors has provided an effective way of performing indoor localization by pedestrian dead reckoning (PDR). We propose a robust mechanism for detecting a person's steps and estimating the step length. Our system is independent of the location and orientation of the device. Our system is shown to perform 45% better than the traditional PDR systems proposed in prior art. Another important problem in PDR systems is determining the orientation of the mobile device and the direction of user motion. Many systems assume the device to be oriented in the direction of the user motion. Some recent systems use accelerometer and magnetometer patterns together with PCA to detect the direction of user orientation. We propose a system which uses map matching and particle filtering to determine the direction of user motion. We tabulate our findings on the feasibility of such a system.
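For readers unfamiliar with PDR, the following is a minimal sketch of the conventional pipeline: detect steps as peaks in accelerometer magnitude, then advance the position by one step length along the current heading. It is the textbook baseline, not the robust mechanism proposed in the abstract. The function names, the threshold of 11.0 m/s², and the minimum gap between peaks are all hypothetical.

```python
import math

def detect_steps(accel_mag, threshold=11.0, min_gap=3):
    """Return sample indices of step-like peaks in accelerometer magnitude.

    threshold (m/s^2) and min_gap (samples) are illustrative values only.
    """
    steps = []
    last = -min_gap
    for k in range(1, len(accel_mag) - 1):
        # A local maximum above the threshold, far enough from the last step.
        is_peak = accel_mag[k] > accel_mag[k - 1] and accel_mag[k] >= accel_mag[k + 1]
        if is_peak and accel_mag[k] > threshold and k - last >= min_gap:
            steps.append(k)
            last = k
    return steps

def dead_reckon(start, headings, step_length):
    """Advance from `start` by one step of `step_length` per heading (radians)."""
    x, y = start
    path = [(x, y)]
    for heading in headings:
        x += step_length * math.cos(heading)
        y += step_length * math.sin(heading)
        path.append((x, y))
    return path
```

Errors in step length and heading accumulate with every step in this scheme, which is why corrections such as map matching and particle filtering are attractive.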
Given the proliferation of consumer media recording devices, events often give rise to a large number of recordings. These recordings are taken from different spatial positions and do not have reliable timestamp information. In this paper, we present two robust graph-based approaches for synchronizing multiple audio signals. The graphs are constructed atop the over-determined system resulting from pairwise signal comparison using cross-correlation of audio features. The first approach uses a Minimum Spanning Tree (MST) technique, while the second uses Belief Propagation (BP) to solve the system. Both approaches can provide excellent solutions and robustness to pairwise outliers; however, the MST approach is much less complex than BP. In addition, an experimental comparison of audio-feature-based synchronization shows that spectral flatness outperforms the zero-crossing rate and signal energy.
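The MST idea can be sketched as follows: treat each recording as a node, each pairwise offset estimate as an edge with a reliability cost, keep only the edges of a minimum spanning tree, and propagate offsets from a root recording. This is a minimal sketch using Prim's algorithm. The input format, the cost values, and the name `sync_offsets_mst` are assumptions, and the paper's actual graph construction and features may differ.

```python
import heapq

def sync_offsets_mst(n, pairwise):
    """Estimate each recording's offset relative to recording 0.

    pairwise: {(i, j): (offset, cost)}, where offset estimates t_j - t_i
    and a lower cost means a more trusted estimate (hypothetical format).
    Returns {node: offset} for every node reachable from node 0.
    """
    adj = {k: [] for k in range(n)}
    for (i, j), (off, cost) in pairwise.items():
        adj[i].append((cost, j, off))
        adj[j].append((cost, i, -off))  # the reverse direction flips the sign

    offsets = {0: 0.0}
    heap = [(cost, 0, j, off) for cost, j, off in adj[0]]
    heapq.heapify(heap)
    while heap and len(offsets) < n:
        cost, i, j, off = heapq.heappop(heap)
        if j in offsets:
            continue  # already reached through a cheaper edge
        offsets[j] = offsets[i] + off
        for c, k, o in adj[j]:
            if k not in offsets:
                heapq.heappush(heap, (c, j, k, o))
    return offsets
```

Because only the lowest-cost edges enter the tree, an unreliable pairwise estimate is ignored whenever a more trusted path between the same recordings exists.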