Classificação de Páginas de Petições iniciais Utilizando Redes Neurais Convolucionais Multimodais

Mota, Caio; Lima, Andressa; Nascimento, André; Miranda, Péricles; Mello, Rafael de

doi:10.5753/eniac.2020.12139

Cited by 4 publications

(7 citation statements)

References 10 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…Textual and visual content are two of the four document aspects listed by Chen and Blostein [4] as possible feature sources. Image features range from fixed descriptors such as pixel density at different locations and scales [20] to approaches based on convolutional neural networks [13,8,1,30,17] such as VGG-16 [24] and MobileNetV2 [22]. Text features range from traditional methods such as latent semantic analysis [6] to pretrained word embeddings (e.g.…”

Section: Related Workmentioning

confidence: 99%

“…Engin et al [8] explore late and early fusion for the classification of Turkish banking documents, finding that both outperform unimodal methods. Mota et al [17] investigate multimodal classification of Brazilian court documents, concluding that multimodal approaches compare favourably with unimodal ones.…”

Section: Related Workmentioning

confidence: 99%

“…Therefore, our goal is to explore and evaluate methods that automatically classify document pages by combining different sources of information. Though previous work [16,17] has examined Brazilian legal document classification, we are the first to combine visual, textual, and sequential data to train better performing models. Our main contributions are:…”

Section: Introductionmentioning

confidence: 99%

See 2 more Smart Citations

Sequence-aware multimodal page classification of Brazilian legal documents

Araujo

Almeida

Braz

et al. 2022

IJDAR

View full text Add to dashboard Cite

The Brazilian Supreme Court receives tens of thousands of cases each semester. Court employees spend thousands of hours to execute the initial analysis and classification of those cases-which takes effort away from posterior, more complex stages of the case management workflow. In this paper, we explore multimodal classification of documents from Brazil's Supreme Court. We train and evaluate our methods on a novel multimodal dataset of 6,510 lawsuits (339,478 pages) with manual annotation assigning each page to one of six classes. Each lawsuit is an ordered sequence of pages, which are stored both as an image and as a corresponding text extracted through optical character recognition. We first train two unimodal classifiers: a ResNet pretrained on ImageNet is fine-tuned on the images, and a convolutional network with filters of multiple kernel sizes is trained from scratch on document texts. We use them as extractors of visual and textual features, which are then combined through our proposed Fusion This preprint, which was originally written on 8 April 2021, has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in the International Journal on Document Analysis and Recognition (IJDAR), and

show abstract

Section: Related Workmentioning

confidence: 99%

Section: Related Workmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Sequence-aware multimodal page classification of Brazilian legal documents

Araujo

Almeida

Braz

et al. 2022

IJDAR

View full text Add to dashboard Cite

show abstract

“…A disponibilização desses modelos pré-treinados são importantes pois o custo computacional para o treinamento deles do zero poderia inviabilizar muitas pesquisas. O que confirma essa importância, é que trabalhos analisados na Seção 6, especialmente [Silva and Maia 2020] e [Mota et al 2020], lançam mão dos modelos de word embedding gerados em [Hartmann et al 2017] dentro da arquitetura de Classificação Textual aplicada.…”

Section: Modelos De Linguagem Estado Da Arte Em Classificação Textual (Q1)unclassified

“…Dentre as aplicações no contexto jurídico, há a classificação de documentos para fins de migração dos processos físicos para eletrônicos [Silva and Maia 2020], ou ainda para direcionamento correto de demandas peticionadas eletronicamente, tal qual analisado em [Noguti et al 2020] e [Mota et al 2020]. Essas atividades requerem recursos humanos e horas de trabalho que podem ser destinadas a outras áreas com necessidades mais cognitivas, possibilitando a promoção de maior celeridade no Sistema de Justiça Brasileiro.…”

Section: Introductionunclassified

Text Classification in Law Area: a Systematic Review

Martins¹,

Silva²

2021

Anais Do IX Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2021)

View full text Add to dashboard Cite

Automatic Text Classification represents a great improvement in law area workflow, mainly in the migration of physical to electronic lawsuits. A systematic review of studies on text classification in law area from January 2017 up to February 2020 was conducted. The search strategy identified 20 studies, that were analyzed and compared. The review investigates from research questions: what are the state-of-art language models, its application of text classification in English and Brazilian Portuguese datasets from legal area, if there are available language models trained on Brazilian Portuguese, and datasets in Brazilian law area. It concludes that there are applications of automatic text classification in Brazil, although there is a gap on the use of language models when compared with English language dataset studies, also the importance of language model in domain pre-training to improve results, as well as there are two studies making available Brazilian Portuguese language models, and one introducing a dataset in Brazilian law area.

show abstract

Classification of Brazilian Supreme Federal Court Documents: A Comparative Study

Martins,

Silva

2024

2024 6th International Conference on Natural Language Processing (ICNLP)

View full text Add to dashboard Cite

Classificação de Páginas de Petições iniciais Utilizando Redes Neurais Convolucionais Multimodais

Cited by 4 publications

References 10 publications

Sequence-aware multimodal page classification of Brazilian legal documents

Sequence-aware multimodal page classification of Brazilian legal documents

Text Classification in Law Area: a Systematic Review

Classification of Brazilian Supreme Federal Court Documents: A Comparative Study

Contact Info

Product

Resources

About