PRILJ: an efficient two-step method based on embedding and clustering for the identification of regularities in legal case judgments

Martino, Graziella De; Pio, Gianvito; Ceci, Michelangelo

doi:10.1007/s10506-021-09297-1

Cited by 14 publications

(1 citation statement)

References 28 publications

“…The numbers represent some information about each word in the text, for example, the term frequency (TF) (Baeza-Yates & Ribeiro-Neto, 1999). Beyond BOW model, there are word embeddings (Pennington, Socher & Manning, 2014; Bojanowski et al., 2016), topic modeling (Blei, Ng & Jordan, 2003; Kherwa & Bansal, 2017), and many others (Devlin et al., 2018; Peters et al., 2018; Brown et al., 2020; Pittaras et al., 2020; Dhanani, Mehta & Rana, 2022; Martino, Pio & Ceci, 2021; Chalkidis et al., 2020).…”

Section: Regression Applied To Text Data (mentioning)

confidence: 99%

Regression applied to legal judgments to predict compensation for immaterial damage

Pont, Sabo, Hübner, et al. 2023

PeerJ Computer Science

Immaterial damage compensation is a controversial matter in the judicial practice of several legal systems. Due to a lack of criteria for its assessment, the judge is free to establish the value based on his/her conviction. Our research motivation is that knowing the estimated amount of immaterial damage compensation at the initial stage of a lawsuit can encourage an agreement between the parties. We thus investigate text regression techniques to predict the compensation value from legal judgments in which consumers had problems with airlines and claim compensation for immaterial damage. We start from a simple pipeline and create others by adding natural language processing (NLP) and machine learning (ML) techniques, which we call adjustments. The adjustments include N-Grams Extraction, Feature Selection, Overfitting Avoidance, Cross-Validation and Outliers Removal. A special adjustment, Addition of Attributes Extracted by the Legal Expert (AELE), is proposed as a complementary input to the case text. We evaluate the impact of adding these adjustments to the pipeline in terms of prediction quality and execution time. N-Grams Extraction and Addition of AELE have the biggest impact on prediction quality. In terms of execution time, Feature Selection and Overfitting Avoidance have significant importance. Moreover, we notice that some pipelines with subsets of adjustments achieved better prediction quality than a pipeline with them all. The result is promising since the prediction error of the best pipeline is acceptable in the legal environment. Consequently, the predictions will likely be helpful in a legal environment.
