This paper shows how information in digital collections that have been catalogued using high-quality metadata can be retrieved more easily by users of search engines such as Google. The research and proposals described arose from an investigation into the observed phenomenon that pages from the Glasgow Digital Library (gdl.cdlr.strath.ac.uk) were regularly appearing near the top of Google search results shortly after publication, without any deliberate effort to achieve this. The reasons for this phenomenon are now well understood and are described in the second part of the paper. The first part provides context with a review of the impact of Google and a summary of recent initiatives by commercial publishers to make their content more visible to search engines. The literature research provides firm evidence of a trend amongst publishers to ensure that their online content is indexed by Google, in recognition of its popularity with Internet users. The practical research demonstrates how search engine accessibility can be compatible with use of established collection management principles and high-quality metadata. The concept of data shoogling is introduced, involving some simple techniques for metadata optimisation. Details of its practical application are given, to illustrate how those working in academic, cultural and public-sector organisations could make their digital collections more easily accessible via search engines, without compromising any existing standards and practices.
For many years metadata has been recognised as a significant component of the digital information environment. Substantial work has gone into creating complex metadata schemes for describing digital content. Yet increasingly Web search engines, and Google in particular, are the primary means of discovering and selecting digital resources, although they make little use of metadata. This article considers how digital libraries can gain more value from their metadata by adapting it for Google users, while still following well-established principles and standards for cataloguing and digital preservation. This article introduces the concepts of functional and variable metadata, and explains why they may be of value to users and managers of digital libraries that rely on Web searching as a significant means of resource discovery. Functional means something that works, so "functional metadata" is used here to mean metadata that fulfils its primary function of assisting information retrieval. Not all metadata does this in a Web-based world. Variable means something that may change, so "variable metadata" is used to refer to metadata that may vary according to context. This is not the same as "dynamic metadata", which has been used to describe educational metadata that can influence the behaviour of multimedia learning objects (El Saddik, 2000). In order to consider why functionality and variability might be useful qualities for metadata, it is necessary to acknowledge the dominance of the Web and of Google as means of access and resource discovery for digital libraries. The current pre-eminence of Google extends well beyond the Web: a recent survey, drawing on users from 85 countries, rated Google as the world's number one brand name, above Apple, Mini, Coca-Cola, Samsung, Ikea and Nokia (Brandchannel.com, 2004). One might think information professionals would be delighted that an information retrieval tool had become the world's leading brand.
However, some librarians have been known to denigrate Google because it "doesn't work". Given that it can search millions of documents for thousands of users simultaneously, and deliver useful results within seconds, this is clearly a specialist interpretation of "doesn't work". Yet one can understand the sentiment. Many of the things that librarians take for granted simply are not possible with Google. Trying to find articles written by Tony Blair, as opposed to those written about him, is difficult. A library catalogue system would make this easy, as "Blair, Tony" would be entered in the author field. But in a library system items are not normally catalogued at the article level, so the search might produce zero hits even though the retrieval system worked perfectly. This basic problem illustrates the need for functional metadata (and the value of article-level retrieval). In the past cataloguers have been able to concentrate on capturing the metadata of an
This version is available at https://strathprints.strath.ac.uk/2319/
The role of DDC in the ongoing HILT (High-level Thesaurus) project is discussed. A phased initiative, funded by JISC in the UK, HILT addresses an issue of likely interest to anyone serving users wishing to cross-search or cross-browse groups of networked information services, whether at regional, national or international level: the problem of subject-based retrieval from multiple sources using different subject schemes for resource description. Although all three phases of HILT to date are covered, the primary concern is with the subject interoperability solution piloted in phase II, and with the use of DDC as a spine in that approach.
This article examines the responsibilities of libraries and librarians as Internet information publishers, in view of the popularity of Google amongst users. It argues that librarians should think explicitly about Google users whenever they publish on the web, and should be prepared to update their policies and procedures accordingly. Drawing on experience and practical examples of publishing ebooks and other collections within the Glasgow Digital Library, the article describes procedures that libraries can adopt to ensure that their publications are optimised for access by users of Google and other web search engines. The aim of these procedures is to enhance resource discovery and information retrieval, and to enhance the reputation of libraries as valued custodians of published information, as well as exemplars of good practice in information management.
The BUBL Information Service has recently moved to a new location and undergone major reorganisation and enhancement. This article outlines the main components of the new service and highlights some of its distinctive features. "In the hyperspeed reality of the Web world, a venerable institution is anything that's been around for longer than a year." This quote is from a recent British newspaper article (Guardian Online, 23 January 1997). The venerable BUBL has now been operating for six years, as outlined by Joanne Gold recently in The Serials Librarian. In fact BUBL is now so old that it has just had a facelift, undergone major surgery and moved to a new home in a new country. It is now entirely based at Strathclyde University in Scotland. This sadly means saying goodbye to the much-loved bubl.bath.ac.uk, but the new address is much shorter:
Alan Dawson is BUBL Information Service Manager, Andersonian Library,