The vast volume of documents available in legal databases demands effective information retrieval approaches that take into consideration the intricacies of the legal domain. Relevant document retrieval is the backbone of the legal domain, yet the concept of relevance in this domain is complex and multi-faceted. In this work, we propose a novel approach to concept-based similarity estimation among court judgments. We use a graph-based method to identify prominent concepts present in a judgment and extract sentences representative of these concepts. The sentences and concepts so mined are used to express and visualize the likeness among concepts between a pair of documents from different perspectives. We also propose to aggregate the different levels of matching so obtained into one measure quantifying the level of similarity between a judgment pair. We employ the ordered weighted average (OWA) family of aggregation operators to obtain the similarity value. The experimental results suggest that the proposed approach of concept-based similarity is effective in the extraction of relevant legal documents and performs better than other competing techniques. Additionally, the proposed two-level abstraction of similarity enables informative visualization for deeper insights into case relevance.
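The OWA aggregation step mentioned in this abstract can be illustrated with a minimal sketch. An OWA operator sorts its inputs in descending order and takes a weighted sum, so each weight attaches to a rank position rather than to a particular input. The function name `owa`, the example scores, and the weight vectors below are illustrative assumptions; the abstract does not state which weights or how many matching levels the authors use.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to rank positions, not to inputs.

    `values` are hypothetical concept-level match scores for one judgment pair;
    `weights` must be non-negative and sum to 1.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    # Sort scores from largest to smallest, then pair each rank with its weight.
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Assumed example scores for three matched concepts.
scores = [0.2, 0.9, 0.5]
strongest_only = owa(scores, [1.0, 0.0, 0.0])   # behaves like max
weakest_only = owa(scores, [0.0, 0.0, 1.0])     # behaves like min
plain_mean = owa(scores, [1 / 3, 1 / 3, 1 / 3])  # behaves like the arithmetic mean
```

Choosing the weight vector moves the operator between the maximum and the minimum of the scores, which is why OWA is described as a family of operators rather than a single one.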
The advent of technology has led to a rise in the data being captured, stored, and analyzed. Improving computational models while managing this voluminous data is a primary concern. The transition of High Performance Computing from traditional problems to newer domains such as finance and healthcare necessitates a joint analytical model that includes Big Data. The rise of Big Data, and subsequently of Big Data analytics, has changed the entire perspective on data and data handling. The ever-growing analytical needs of Big Data can be satisfied with extremely high performance computing models. As a result of extensive research in this field, recent years have seen the emergence of diverse paradigms for Big Data analytics. With the spread of Big Data analytics into varied domains, newer concerns regarding the effectiveness of analytical paradigms have also been observed. This paper highlights the major analytical models, concerns, and challenges in High Performance Data Analytics.
Information retrieval (IR) is an automatic mechanism for extracting required information from a collection of unstructured or semi-structured data. IR systems minimize the effort a user needs to locate information that meets their requirements. Clustering of documents is carried out as a preprocessing step for filtering irrelevant information in an IR system. The legal domain is a producer as well as a consumer of a huge amount of information, which also contains invaluable legal knowledge and its interpretation. Knowledge-based legal information retrieval systems are the need of the day. Citation analysis is a technique for finding hidden relationships between documents; it is used for understanding knowledge transfer across various domains and hence is very important in the legal domain. In this study, similarities among documents are analyzed by applying data clustering to citation data from court judgments.
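As a rough illustration of clustering judgments by their citations, the sketch below compares the sets of authorities each judgment cites and groups judgments whose sets overlap strongly. The abstract does not name the similarity measure or clustering algorithm used in the study, so Jaccard similarity, threshold-based single-link grouping, the function names, and the judgment identifiers are all assumptions made for this example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two citation sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_citations(citations, threshold=0.5):
    """Group judgments whose citation sets have Jaccard similarity >= threshold.

    `citations` maps a judgment id to the set of authorities it cites.
    Grouping is single-link: judgments joined by a chain of similar pairs
    end up in the same cluster (tracked with a small union-find).
    """
    parent = {doc: doc for doc in citations}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(citations, 2):
        if jaccard(citations[a], citations[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for doc in citations:
        groups.setdefault(find(doc), set()).add(doc)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical judgments and the authorities they cite.
example = {
    "J1": {"A", "B", "C"},
    "J2": {"A", "B", "D"},
    "J3": {"X", "Y"},
    "J4": {"X", "Y", "Z"},
}
clusters = cluster_by_citations(example, threshold=0.5)
```

With these assumed inputs, J1 and J2 share two of four distinct authorities and J3 and J4 share two of three, so each pair forms its own cluster while the two pairs stay separate.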