The revolution in scientific publishing that has been promised since the 1980s is about to take place. Scientists have always read strategically, working with many articles simultaneously to search, filter, scan, link, annotate, and analyze fragments of content. An observed recent increase in strategic reading in the online environment will soon be further intensified by two current trends: (i) the widespread use of digital indexing, retrieval, and navigation resources and (ii) the emergence within many scientific disciplines of interoperable ontologies. Accelerated and enhanced by reading tools that take advantage of ontologies, reading practices will become even more rapid and indirect, transforming the ways in which scientists engage the literature and shaping the evolution of scientific publishing.
Markup practices can affect the move toward systems that support scholars in the process of thinking and writing. Whereas procedural and presentational markup systems retard that movement, descriptive markup systems accelerate it by simplifying mechanical tasks and allowing authors to focus their attention on the content.
The integration of heterogeneous data in varying formats and from diverse communities requires an improved understanding of the concept of a dataset, and of key related concepts, such as format, encoding, and version. Ultimately, a normative formal framework of such concepts will be needed to support the effective curation, integration, and use of shared multi-disciplinary scientific data. To prepare for the development of this framework, we reviewed the definitions of dataset found in technical documentation and the scientific literature. Four basic features can be identified as common to most definitions: grouping, content, relatedness, and purpose. In this summary of our results, we describe each of these features and indicate the directions a more formal analysis might take.
The way in which text is represented on a computer affects the kinds of uses to which it can be put by its creator and by subsequent users. The electronic document model currently in use is impoverished and restrictive. The authors argue that text is best represented as an ordered hierarchy of content objects (OHCO), because that is what text really is. This model conforms with emerging standards such as SGML and offers advantages for the writer, publisher, and researcher. The authors then describe how the hierarchical model can allow future use and reuse of the document as a database, hypertext, or network.
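The OHCO idea can be illustrated with a minimal sketch. It assumes a hypothetical `ContentObject` class and invented object kinds (`article`, `section`, `paragraph`); none of these names come from the abstract, and the sketch is not the authors' implementation. It shows only that, once text is held as an ordered hierarchy, the same document can be read linearly or queried like a database.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class ContentObject:
    """Hypothetical node type: a named object whose children are ordered."""
    kind: str
    children: List[Union["ContentObject", str]] = field(default_factory=list)

    def text(self) -> str:
        """Concatenate all text beneath this object in document order."""
        return "".join(
            child if isinstance(child, str) else child.text()
            for child in self.children
        )

    def find(self, kind: str) -> Iterator["ContentObject"]:
        """Yield descendants of the given kind in document order."""
        for child in self.children:
            if isinstance(child, ContentObject):
                if child.kind == kind:
                    yield child
                yield from child.find(kind)


# An illustrative document; the content is invented for the example.
doc = ContentObject("article", [
    ContentObject("title", ["What is text?"]),
    ContentObject("section", [
        ContentObject("heading", ["Models"]),
        ContentObject("paragraph", ["Text is an ordered hierarchy."]),
        ContentObject("paragraph", ["Each object nests inside another."]),
    ]),
])

# Query the document by object kind rather than by position or appearance.
paragraphs = [p.text() for p in doc.find("paragraph")]
```

Because each object records what it is rather than how it looks, the query over `paragraph` objects needs no knowledge of fonts or layout, which is the kind of reuse the hierarchical model is meant to allow.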