The amount of text people need to read and understand grows daily. Software defaults, designers, or publishers often choose the fonts people read in. However, matching individuals with a faster font could help them cope with information overload. We collaborated with typographers to (1) select eight fonts designed for digital reading to systematically compare their effectiveness and to (2) understand how font and reader characteristics affect reading speed. We collected font preferences, reading speeds, and characteristics from 252 crowdsourced participants in a remote readability study. We use font and reader characteristics to train FontMART, a learning-to-rank model that automatically orders a set of eight fonts per participant by predicted reading speed. FontMART's fastest font prediction shows an average increase of 14–25 WPM compared to other font defaults, without hindering comprehension. This encouraging evidence provides motivation for adding our personalized font recommendation to future interactive systems.
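As a minimal sketch of the per-participant learning-to-rank setup the abstract describes, the snippet below uses LightGBM's LambdaMART-style `LGBMRanker`, treating each participant's set of eight fonts as one query group. The features, relevance labels, and random data are illustrative assumptions, not the study's actual features or model.

```python
# Sketch: rank fonts per participant with a LambdaMART-style model.
# All features/labels here are synthetic placeholders.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
n_participants, n_fonts = 252, 8

# One row per (participant, font) pair: reader traits + font traits.
X = rng.normal(size=(n_participants * n_fonts, 6))
# Relevance label: observed reading speed (WPM), binned into grades 0-4.
y = rng.integers(0, 5, size=n_participants * n_fonts)
# Each query group is one participant's set of eight fonts.
group = [n_fonts] * n_participants

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=100)
ranker.fit(X, y, group=group)

# Order one participant's eight fonts by predicted reading speed.
scores = ranker.predict(X[:n_fonts])
fastest_first = np.argsort(-scores)
print("predicted font order (fastest first):", fastest_first)
```

Grouping by participant is what makes the ranking personalized: the model learns how reader characteristics interact with font characteristics rather than a single global font ordering.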
Human Intelligence Tasks (HITs) allow people to collect and curate labeled data from multiple annotators. These labels are then aggregated to create an annotated dataset suitable for supervised machine learning tasks. The most popular label aggregation method is majority voting, where each item in the dataset is assigned the most common label from the annotators. This approach is optimal when annotators are unbiased domain experts. In this paper, we propose Debiased Label Aggregation (DLA), an alternative method for label aggregation in subjective HITs, where cross-annotator agreement varies. DLA leverages annotators' voting behavior patterns to weight labels. Our experiments show that DLA outperforms majority voting on several performance metrics, e.g., an increase of 20 percentage points in the F1 measure before data augmentation and of 35 percentage points in the same measure after data augmentation. Since DLA is conceptually simple, we hope it will help researchers tackle subjective labeling tasks. CCS Concepts: • Information systems → Crowdsourcing; • Human-centered computing → Collaborative and social computing design and evaluation methods; • Computing methodologies → Supervised learning.
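The sketch below contrasts plain majority voting with a weighted aggregation in the spirit of DLA. The specific weighting scheme shown, weighting each annotator by how often they agree with the per-item majority, is an illustrative assumption about "voting behavior patterns", not the paper's exact method; the item and annotator names are hypothetical.

```python
# Sketch: majority voting vs. behavior-weighted label aggregation.
# The agreement-based weights are an illustrative assumption.
from collections import Counter

# labels[item][annotator] -> label (e.g., binary subjective judgments)
labels = {
    "item1": {"a1": 1, "a2": 1, "a3": 0},
    "item2": {"a1": 0, "a2": 1, "a3": 0},
}

def majority_vote(votes):
    # Most common label across annotators for one item.
    return Counter(votes.values()).most_common(1)[0][0]

def annotator_weights(labels):
    # Weight each annotator by their rate of agreement with the
    # per-item majority: one simple voting-behavior signal.
    agree, total = Counter(), Counter()
    for votes in labels.values():
        maj = majority_vote(votes)
        for annotator, label in votes.items():
            total[annotator] += 1
            agree[annotator] += label == maj
    return {a: agree[a] / total[a] for a in total}

def weighted_vote(votes, weights):
    # Sum annotator weights per label; highest-weight label wins.
    score = Counter()
    for annotator, label in votes.items():
        score[label] += weights[annotator]
    return score.most_common(1)[0][0]

weights = annotator_weights(labels)
print({item: weighted_vote(v, weights) for item, v in labels.items()})
```

Under majority voting every annotator counts equally; the weighted variant lets systematic voting behavior shift an item's aggregated label, which matters precisely in the subjective, low-agreement HITs the abstract targets.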
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.