Traditionally, audio quality and video quality are evaluated separately in subjective tests. Best practices within the quality assessment community were developed before many modern mobile audiovisual devices and services came into use, such as internet video, smartphones, tablets, and connected televisions. These devices and services raise unique questions that require jointly evaluating the audio and the video within a subjective test. However, audiovisual subjective testing is a relatively under-explored field. In this paper, we address the question of how best to conduct audiovisual subjective testing over a wide range of audiovisual quality. Six laboratories from four countries conducted a systematic study of audiovisual subjective testing. The stimuli and scale were held constant across experiments and labs; only the environment of the subjective test was varied. Some subjective tests were conducted in controlled environments and some in public environments (a cafeteria, patio, or hallway). The audiovisual stimuli spanned a wide range of quality. Results show that these audiovisual subjective tests were highly repeatable from one laboratory and environment to the next. The number of subjects was the most important factor. Based on this experiment, 24 or more subjects are recommended for Absolute Category Rating (ACR) tests; in public environments, 35 subjects were required to obtain the same Student's t-test sensitivity. The second most important variable was individual differences between subjects. Other environmental factors, such as language, country, lighting, background noise, wall color, and monitor calibration, had minimal impact. Analyses indicate that Mean Opinion Scores (MOS) are relative rather than absolute. Our analyses show that the results of experiments done in pristine laboratory environments are highly representative of these devices in actual use, in a typical user environment.
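The abstract refers to MOS computation and Student's t-test sensitivity without detail. The sketch below (not from the paper; all data, sample sizes, and variable names are invented) illustrates the basic ACR analysis it describes: averaging 1-5 ratings into a Mean Opinion Score and comparing two test environments with a t-test.

```python
# Minimal sketch of an ACR analysis, assuming hypothetical rating data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical ACR ratings (1-5 scale): rows = subjects, columns = stimuli.
lab_ratings = rng.integers(3, 6, size=(24, 10))     # 24 subjects, controlled lab
public_ratings = rng.integers(2, 6, size=(35, 10))  # 35 subjects, public setting

# MOS per stimulus: the mean rating across subjects.
lab_mos = lab_ratings.mean(axis=0)
public_mos = public_ratings.mean(axis=0)

# Two-sample t-test on one stimulus's ratings, as one might run to test
# whether the two environments yield different scores.
t, p = stats.ttest_ind(lab_ratings[:, 0], public_ratings[:, 0])
print(f"MOS (lab) = {lab_mos[0]:.2f}, MOS (public) = {public_mos[0]:.2f}, p = {p:.3f}")
```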
We explore the quality impact when audiovisual content is delivered to different mobile devices. Subjects were shown the same sequences on five different mobile devices and a broadcast-quality television. Factors influencing quality ratings include video resolution, viewing distance, and monitor size. Analysis shows how subjects' perception of multimedia quality differs when content is viewed on different mobile devices. In addition, quality ratings from laboratory and simulated living-room sessions were statistically equivalent.
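The abstract does not name the test behind the equivalence claim; a common choice for such claims is the two one-sided tests (TOST) procedure, sketched here with invented data and an assumed equivalence margin of 0.5 MOS points.

```python
# Hedged sketch of a TOST equivalence check, assuming hypothetical data;
# the paper's actual statistical procedure may differ.
import numpy as np
from statsmodels.stats.weightstats import ttost_ind

rng = np.random.default_rng(1)
lab = rng.normal(3.8, 0.6, size=30)          # hypothetical per-subject MOS, lab
living_room = rng.normal(3.9, 0.6, size=30)  # hypothetical, simulated living room

# Declare equivalence if the mean difference lies within +/- 0.5 MOS points.
p, _, _ = ttost_ind(lab, living_room, low=-0.5, upp=0.5)
print(f"TOST p-value = {p:.3f}")  # p < 0.05 => means equivalent within bounds
```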
We focus on electronic theses and dissertations (ETDs), aiming to improve access and expand their utility: more than 6 million are publicly available, and they constitute an important corpus for research and education across disciplines. The corpus is growing as new born-digital documents are added and as millions of older theses and dissertations are converted to digital form for electronic dissemination in institutional repositories. In ETDs, as in other scholarly works, figures and tables can communicate a large amount of information concisely. Although methods have been proposed for extracting figures and tables from born-digital PDFs, they do not work well with scanned ETDs. Our assessment of state-of-the-art figure extraction systems is that they perform poorly on scanned PDFs because they have been trained only on born-digital documents. To address this limitation, we present ScanBank, a new dataset containing 10 thousand scanned page images, manually labeled by humans for the 3.3 thousand figures and tables found therein. We use this dataset to train a deep neural network model based on YOLOv5 to accurately extract figures and tables from scanned ETDs. We pose and answer research questions aimed at finding better methods for figure extraction from scanned documents, including whether data augmentation techniques applied to born-digital training documents yield models better suited to scanned documents. To the best of our knowledge, ScanBank is the first manually annotated dataset for figure and table extraction from scanned ETDs. A YOLOv5-based model, trained on ScanBank, outperforms existing comparable open-source and freely available baseline methods by a considerable margin.
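To make the detection step concrete, the sketch below shows how a YOLOv5 model is typically loaded and run on a scanned page image via the standard ultralytics/yolov5 hub interface. The weights file, image path, and class names are hypothetical; the paper's actual ScanBank-trained weights and label set may differ.

```python
# Hedged sketch of YOLOv5 inference on a scanned ETD page; paths are invented.
import torch

# Load a custom YOLOv5 model through the ultralytics/yolov5 hub interface.
model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="scanbank_yolov5.pt")  # hypothetical weights file

# Run inference on a scanned page image (hypothetical path).
results = model("scanned_etd_page.png")

# Each detection row: [xmin, ymin, xmax, ymax, confidence, class index].
for *box, conf, cls in results.xyxy[0].tolist():
    label = results.names[int(cls)]  # e.g. "figure" or "table" (assumed labels)
    print(f"{label}: bbox={box}, confidence={conf:.2f}")
```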