Abstract: Google Books has been criticized for faulty metadata, problems with search functionality, copyright infringement, and poor legibility of scanned texts. This paper explores only the legibility of texts scanned by Google Books. A review of 2,500 pages from 50 randomly selected books was undertaken. The results show that less than 1% of pages had errors that affected their legibility. This paper concludes that while Google Books is not perfect, the majority of texts sampled were legible.
This article reports on a study of error rates found in the metadata records of texts scanned by the Google Books digitization project. A review of the author, title, publisher, and publication year metadata elements for 400 randomly selected Google Books records was undertaken. The results show 36% of sampled books in the digitization project contained metadata errors. This error rate is higher than one would expect to find in a typical library online catalog.
Structured Abstract:
Purpose - This article reports on a quantitative study of the coverage of Hawaiian and Pacific books in Google Books, a massive digital library (MDL).
Design/methodology/approach - A total of 1,500 books were randomly selected from the University of Hawai'i at Mānoa's Hawaiian, Pacific, and general stacks collections. Their level of access in Google Books was then determined by observing whether the books had a metadata record, whether they were full-text searchable, and whether they were available in snippet, preview, or full-text view.
Findings - Results show that Google Books has a sizable number of metadata records for Hawaiian and Pacific books, but has only a limited number available for full-text searching. In contrast, a larger number of books from the general stacks were available for full-text searching.
Research limitations/implications - Because of the small sample size, margins of error remain large. The field would benefit from a larger collection sample. The scope of the project is also limited to Google Books and does not investigate other book digitization projects.
Practical implications - Diversity in librarianship is a major concern for libraries both within the United States, as in the case of historically underrepresented groups, and in non-English-speaking countries.
Social implications - Diversity in librarianship also concerns the central mission of libraries to provide the basic human right of access to information. Digital libraries must be held to the same standards.
Originality/value - Massive digital libraries such as Google Books need to be more carefully examined; this study contributes to this need.