The effectiveness of information retrieval technology in electronic discovery (E-discovery) has become the subject of judicial rulings and practitioner controversy. The scale and nature of E-discovery tasks, however, have pushed traditional information retrieval evaluation approaches to their limits. This paper reviews the legal and operational context of E-discovery and the approaches to evaluating search technology that have evolved in the research community. It then describes a multi-year effort carried out as part of the Text Retrieval Conference to develop evaluation methods for responsive review tasks in E-discovery. This work has led to new approaches to measuring effectiveness in both batch and interactive frameworks, large data sets, and some surprising results for the recall and precision of Boolean and statistical information retrieval methods. The paper concludes by offering some thoughts about future research in both the legal and technical communities toward the goal of reliable, effective use of information retrieval in E-discovery. (Note: The first three sections of this article draw upon material in the introductory sections of two papers presented at events associated with the 11th and 12th International Conferences on Artificial Intelligence and Law (ICAIL) (Baron and Thompson 2007; Zhao et al. 2009) as well as material first published in (Baron 2008), with permission.)
The paper is structured to facilitate the conference presentation by Anne Thurston on how the lessons coming out of the recent US e-recordkeeping initiative may be applied on an international basis, with a focus on lower-resource countries. What follows is a summary of US policy, with open-ended questions ("Discussion Points") aimed at facilitating additional dialogue.
Lawyers and their large institutional clients increasingly face the enormous problem of how to search efficiently and effectively for relevant documents in large heterogeneous electronic data sets in order to respond to litigation demands. Past research indicates that lawyers greatly overestimate their true rate of recall in civil discovery. The unprecedented size, scale, and complexity of electronically stored data now potentially subject to routine capture in litigation, for purposes of preservation, access, and review, present information retrieval researchers with a series of important challenges to overcome. This paper describes the current context of e-discovery and discusses the potential for IR and AI research to address the challenges of conducting e-discovery. The TREC Legal Track is presented as a forum for evaluating e-discovery research, and one new evaluation measure, elusion, is described; it has potential for addressing the problems of measuring recall.
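The abstract above names recall and elusion without defining them. As an illustrative sketch only, assuming the commonly used definition of elusion as the fraction of non-retrieved documents that are in fact relevant, the three measures can be computed from the four cells of a review outcome as follows. The class name, field names, and the example counts are hypothetical and do not come from the papers.

```python
from dataclasses import dataclass


@dataclass
class ReviewCounts:
    """Hypothetical document counts for one review; names are illustrative."""
    retrieved_relevant: int      # relevant documents the search returned
    retrieved_irrelevant: int    # irrelevant documents the search returned
    missed_relevant: int         # relevant documents left in the discard pile
    discarded_irrelevant: int    # irrelevant documents left in the discard pile


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 for an empty denominator instead of raising ZeroDivisionError.
    return numerator / denominator if denominator else 0.0


def recall(c: ReviewCounts) -> float:
    # Share of all relevant documents that were retrieved.
    return _ratio(c.retrieved_relevant, c.retrieved_relevant + c.missed_relevant)


def precision(c: ReviewCounts) -> float:
    # Share of retrieved documents that are relevant.
    return _ratio(c.retrieved_relevant, c.retrieved_relevant + c.retrieved_irrelevant)


def elusion(c: ReviewCounts) -> float:
    # Assumed definition: share of NON-retrieved documents that are relevant.
    return _ratio(c.missed_relevant, c.missed_relevant + c.discarded_irrelevant)


# Made-up example: 10,000 documents, 200 of them relevant, 100 retrieved.
counts = ReviewCounts(
    retrieved_relevant=80,
    retrieved_irrelevant=20,
    missed_relevant=120,
    discarded_irrelevant=9780,
)
print(f"recall={recall(counts):.3f}")
print(f"precision={precision(counts):.3f}")
print(f"elusion={elusion(counts):.4f}")
```

Under this assumed definition, elusion can be estimated by sampling only the discard pile, whereas estimating recall directly requires knowing the total number of relevant documents in the whole collection.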