Robert Keeling scite author profile

Robert Keeling

5 Publications

57 Citation Statements Received

55 Citation Statements Given


Affiliations

Sidley Austin, Fujikura (United States)

Publications


Explainable Text Classification in Legal Document Review A Case Study of Explainable Predictive Coding

Chhatwal, Gronvall, Huber-Fliflet, et al. 2018


In today's legal environment, lawsuits and regulatory investigations require companies to embark upon increasingly intensive data-focused engagements to identify, collect and analyze large quantities of data. When documents are staged for review, where they are typically assessed for relevancy or privilege, the process can require companies to dedicate an extraordinary level of resources, both in human resources and in the use of technology-based techniques to intelligently sift through data. Companies regularly spend millions of dollars producing 'responsive' electronically-stored documents for these types of matters. For several years, attorneys have been using a variety of tools to conduct this exercise, and most recently, they are accepting the use of machine learning techniques like text classification (referred to as predictive coding in the legal industry) to efficiently cull massive volumes of data to identify responsive documents for use in these matters. In recent years, a group of AI and Machine Learning researchers have been actively researching Explainable AI. In an explainable AI system, actions or decisions are understandable to humans. In typical legal 'document review' scenarios, a document can be identified as responsive as long as one or more of the text snippets (small passages of text) in the document are deemed responsive. In these scenarios, if predictive coding can be used to locate these responsive snippets, then attorneys could easily evaluate the model's document classification decision. When deployed with defined and explainable results, predictive coding can drastically enhance the overall quality and speed of the document review process by reducing the time it takes to review documents. Moreover, explainable predictive coding provides lawyers with greater confidence in the results of that supervised learning task.
The authors of this paper propose the concept of explainable predictive coding and simple explainable predictive coding methods to locate responsive snippets within responsive documents. We also report our preliminary experimental results using the data from an actual legal matter that entailed this type of document review. The purpose of this paper is to demonstrate the feasibility of explainable predictive coding in the context of professional services in the legal space.
Keywords: machine learning, text categorization, explainable AI, predictive coding, explainable predictive coding, legal document review
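
The snippet idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not detail. It assumes a document is split into overlapping word windows, that each window is scored by some model (here a hypothetical keyword-weight scorer, `toy_score`, standing in for a trained classifier), and that a document counts as responsive when its best snippet reaches a threshold. The window size, stride, weights, and threshold are all illustrative assumptions.

```python
from typing import Callable, List, Tuple

# Hypothetical keyword weights standing in for a trained text classifier.
TOY_WEIGHTS = {"merger": 0.6, "pricing": 0.5, "confidential": 0.3}


def split_into_snippets(text: str, size: int = 8, stride: int = 4) -> List[str]:
    """Split text into overlapping windows of `size` words, stepping by `stride`."""
    words = text.split()
    snippets = []
    for start in range(0, len(words), stride):
        snippets.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return snippets


def toy_score(snippet: str) -> float:
    """Assumed scorer: sum of keyword weights, capped at 1.0."""
    tokens = [w.strip(".,;:!?").lower() for w in snippet.split()]
    return min(1.0, sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens))


def explain_document(
    text: str,
    score: Callable[[str], float] = toy_score,
    threshold: float = 0.5,
) -> Tuple[bool, str, float]:
    """Return (is_responsive, best_snippet, best_score) for one document."""
    scored = [(score(s), s) for s in split_into_snippets(text)]
    if not scored:
        return False, "", 0.0
    best_score, best_snippet = max(scored, key=lambda pair: pair[0])
    return best_score >= threshold, best_snippet, best_score


doc = ("Lunch is at noon today. Please keep the merger pricing "
       "discussion strictly between us for now.")
responsive, snippet, value = explain_document(doc)
print(responsive, repr(snippet), value)
```

The returned snippet is what a reviewer would read to check the classification, which is the sense in which the decision becomes explainable.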


Empirical evaluations of preprocessing parameters' impact on predictive coding's effectiveness

Chhatwal, Huber-Fliflet, Keeling, et al. 2016


Empirical evaluations of active learning strategies in legal document review

Chhatwal, Huber-Fliflet, Keeling, et al. 2017


One type of machine learning, text classification, is now regularly applied in legal matters involving voluminous document populations because it can reduce the time and expense associated with the review of those documents. One form of machine learning, Active Learning, has drawn attention from the legal community because it offers the potential to make the machine learning process even more effective. Active Learning, applied to legal documents, is considered a new technology in the legal domain and is continuously applied to all documents in a legal matter until an insignificant number of relevant documents are left for review. This implementation differs slightly from traditional implementations of Active Learning, in which the process stops once the model achieves acceptable performance. The purpose of this paper is twofold: (i) to question whether Active Learning actually is a superior learning methodology and (ii) to highlight the ways that Active Learning can be most effectively applied to real legal industry data. Unlike other studies, our experiments were performed against large data sets taken from recent, real-world legal matters covering a variety of areas. We conclude that, although these experiments show the Active Learning strategy popularly used in legal document review can quickly identify informative training documents, it becomes less effective over time. In particular, our findings suggest this most popular form of Active Learning in the legal arena, where the highest-scoring documents are selected as training examples, is in fact not the most efficient approach in most instances. Ultimately, a different Active Learning strategy may be best suited to initiate the predictive modeling process but not to continue through the entire document review.
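
To make the distinction between selection strategies concrete, here is a minimal sketch of two ways to pick the next training documents from model scores. The first follows the abstract's description of the popular approach (select the highest-scoring documents). The second, uncertainty sampling, is a commonly used alternative included here as an assumption; the abstract does not name the strategies it compared. The document IDs and scores are invented for illustration.

```python
from typing import Dict, List


def select_top_scoring(scores: Dict[str, float], k: int) -> List[str]:
    """Pick the k documents the model rates most likely to be relevant."""
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)[:k]


def select_most_uncertain(scores: Dict[str, float], k: int) -> List[str]:
    """Pick the k documents whose scores lie closest to 0.5 (assumed alternative)."""
    return sorted(scores, key=lambda doc_id: abs(scores[doc_id] - 0.5))[:k]


# Hypothetical model scores for five unreviewed documents.
scores = {"doc_a": 0.97, "doc_b": 0.91, "doc_c": 0.52, "doc_d": 0.45, "doc_e": 0.08}

print(select_top_scoring(scores, 2))
print(select_most_uncertain(scores, 2))
```

Top-scoring selection surfaces likely relevant documents for review, while uncertainty sampling targets documents the model cannot yet separate. The two rules therefore pick different training examples from the same scores.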


Empirical Comparisons of CNN with Other Learning Algorithms for Text Classification in Legal Document Review

Keeling, Chhatwal, Huber-Fliflet, et al. 2019


Research has shown that Convolutional Neural Networks (CNN) can be effectively applied to text classification as part of a predictive coding protocol. That said, most research to date has been conducted on data sets with short documents that do not reflect the variety of documents in real-world document reviews. Using data from four actual reviews with documents of varying lengths, we compared CNN with other popular machine learning algorithms for text classification, including Logistic Regression, Support Vector Machine, and Random Forest. For each data set, classification models were trained with different training sample sizes using different learning algorithms. These models were then evaluated using a large randomly sampled test set of documents, and the results were compared using precision and recall curves. Our study demonstrates that CNN performed well, but that no single algorithm performed best across the combination of data sets and training sample sizes. These results will help advance research into the legal profession's use of machine learning algorithms that maximize performance.
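
As a rough illustration of how a precision and recall curve is derived from a scored test set, the sketch below ranks documents by model score and computes precision and recall at each cutoff. The labels and scores are invented, the function name is hypothetical, and tied scores are not given special treatment. The study's actual evaluation code is not described in the abstract.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(
    labels: Sequence[int], scores: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (precision, recall) after reviewing the top 1, 2, ..., n documents."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_relevant = sum(labels)
    points = []
    true_positives = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        precision = true_positives / rank
        recall = true_positives / total_relevant if total_relevant else 0.0
        points.append((precision, recall))
    return points


# Hypothetical test set: 1 = responsive, 0 = not responsive.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]

for precision, recall in precision_recall_points(labels, scores):
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Plotting these points for each algorithm and training sample size gives curves that can be compared at a chosen recall level.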


An Empirical Study of the Application of Machine Learning and Keyword Terms Methodologies to Privilege-Document Review Projects in Legal Matters

Gronvall, Huber-Fliflet, Zhang, et al. 2018


