Infrastructures are not inherently durable or fragile, yet all are fragile over the long term. Durability requires care and maintenance of individual components and the links between them. Astronomy is an ideal domain in which to study knowledge infrastructures, due to its long history, transparency, and accumulation of observational data over a period of centuries. Research reported here draws upon a long-term study of scientific data practices to ask questions about the durability and fragility of infrastructures for data in astronomy. Methods include interviews, ethnography, and document analysis. As astronomy has become a digital science, the community has invested in shared instruments, data standards, digital archives, metadata and discovery services, and other relatively durable infrastructure components. Several features of data practices in astronomy contribute to the fragility of that infrastructure. These include different archiving practices between ground-and space-based missions, between sky surveys and investigator-led projects, and between observational and simulated data. Infrastructure components are tightly coupled, based on international agreements. However, the durability of these infrastructures relies on much invisible work -cataloging, metadata, and other labor conducted by information professionals. Continual investments in care and maintenance of the human and technical components of these infrastructures are necessary for sustainability.
We analyze the people and infrastructure involved in the building, sustaining, and curation of large astronomy sky surveys. Our research assesses what new infrastructures, divisions of labor, knowledge, and expertise are necessary for the proper care of data. Between May 2011-February 2012, we conducted fourteen interviews employing Sloan Digital Sky Survey (SDSS) data use as the focus. SDSS is a multi-faceted, multi-phased data-driven telescope project with hundreds of collaborators and thousands of users of the open data. The Follow the Data interview protocol identifies a single publication authored by each interviewee and uses it as a lens looking backward and forward to identify data uses leading into and out of the publication.The interviews revealed the ways these astronomers discover, locate, retrieve, and store external data for their research. Any given astronomy research project may employ multiple methods to discover, locate, retrieve, and store multiple datasets. Our research finds that informal and formal methods are used to discover and locate data, including person-to-person contact. Data retrieval and storage methods are often determined by the size of the dataset and the amount of infrastructure available to the researcher. Astronomy research practices are evolving rapidly with access to more data and better tools. The poster presentation will report further on how those data are used and reused in astronomy.
The promise of technology-enabled, data-intensive scholarship is predicated upon access to knowledge infrastructures that are not yet in place. Scientific data management requires expertise in the scientific domain and in organizing and retrieving complex research objects. The Knowledge Infrastructures project compares data management activities of four large, distributed, multidisciplinary scientific endeavors as they ramp their activities up or down; two are big science and two are small science. Research questions address digital library solutions, knowledge infrastructure concerns, issues specific to individual domains, and common problems across domains. Findings are based on interviews (n=113 to date), ethnography, and other analyses of these four cases, studied since 2002. Based on initial comparisons, we conclude that the roles of digital libraries in scientific data management often depend upon the scale of data, the scientific goals, and the temporal scale of the research projects being supported. Digital libraries serve immediate data management purposes in some projects and long-term stewardship in others. In small science projects, data management tools are selected, designed, and used by the same individuals. In the multi-decade time scale of some big science research, data management technologies, policies, and practices are designed for anticipated future uses and users. The need for library, archival, and digital library expertise is apparent throughout all four of these cases. Managing research data is a knowledge infrastructure problem beyond the scope of individual researchers or projects. The real challenges lie in designing digital libraries to assist in the capture, management, interpretation, use, reuse, and stewardship of research data.
No abstract
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.