The management of electronic document collections is fundamentally different from the management of paper documents. The ephemeral nature of some electronic documents means that the document address (i.e., reference details of the document) can become incorrect some time after coming into use, so that references, such as index entries and hypertext links, fail to correctly address the document they describe. A classic case of invalidated references occurs on the World Wide Web: links that point to a named resource fail when the domain name, file name, or any other aspect of the addressed resource is changed, resulting in the well-known Error 404. Other errors also arise from changes to document collections.
This paper surveys the strategies used both in World Wide Web software and in other hypertext systems for managing the integrity of references and hence the integrity of links. Some strategies are preventative, not permitting errors to occur; others are corrective, discovering reference errors and sometimes attempting to correct them; while the last strategy is adaptive, because references are calculated on a just-in-time basis, according to the current state of the document collection.