PyEuroVoc: A Tool for Multilingual Legal Document Classification with EuroVoc Descriptors

Avram, Andrei; Pais, V.; Tufiș, Dan

doi:10.26615/978-954-452-072-4_012

Cited by 6 publications

(4 citation statements)

References 13 publications

(8 reference statements)


“…Although not directly comparable due to the use of different datasets/annotation schemas, the performance of our models is in line with current state-of-the-art approaches to legal text classification (Chalkidis et al, 2019; Avram et al, 2021), which however consider the full text of legal acts.…”

Section: Evaluation and Error Analysis (supporting)

confidence: 59%

“…In the legal domain, text classification has an established tradition, both in the monolingual (Šarić et al, 2014; Papaloukas et al, 2021) and in the multi-lingual setting (Steinberger et al, 2006, 2012; Chalkidis et al, 2019; Avram et al, 2021; Chalkidis et al, 2021). Moreover, the large availability of legal data, produced by national and supranational public institutions, set the stage for the development of domain-adapted models (Chalkidis et al, 2020; Douka et al, 2021; Masala et al, 2021; Licari and Comandè, 2022).…”

Section: Related Work (mentioning)

confidence: 99%


Italian Legislative Text Classification for Gazzetta Ufficiale

Rovera, Palmero Aprosio, Greco et al. 2023

Proceedings of the Natural Legal Language Processing Workshop 2023


This work introduces a novel, extensive annotated corpus for multi-label legislative text classification in Italian, based on legal acts from the Gazzetta Ufficiale, the official source of legislative information of the Italian state. The annotated dataset, which we released to the community, comprises over 363,000 titles of legislative acts, spanning over 30 years from 1988 until 2022. Moreover, we evaluate four models for text classification on the dataset, demonstrating how using only the acts' titles can achieve top-level classification performance, with a micro F1-score of 0.87. Also, our analysis shows how Italian domain-adapted legal models do not outperform general-purpose models on the task. Models' performance can be checked by users via a demonstrator system provided in support of this work.


“…In terms of the former, KB-BERT has now been utilised by medical researchers seeking to develop new lifestyle treatments for diabetes patients; in attempts to automatically identify the presence of implants (ie. pacemakers or stents) in heart patients prior to MRI scans; and for the classification of legal documents (Dwibedi et al, 2022; Jerdhaf et al, 2020; Avram et al, 2021). In terms of the latter, the lab's models have been put to work in automating and streamlining the information handling processes of various public authorities, including local councils, the Swedish Tax Agency (Skatteverket), the Swedish Courts (Domstolsverket) and most recently, the support function of State administration (Statens servicecenter).…”

Section: The Value Of Collections-based Models In Practice (mentioning)

confidence: 99%

Transfiguring the Library as Digital Research Infrastructure: Making KBLab at the National Library of Sweden

Börjeson, Haffenden, Malmsten et al. 2023

Preprint


This article provides an account of the making of KBLab, the data lab at the National Library of Sweden (KB). The first part of the article offers an evaluative discussion of the work involved in establishing KBLab as both a physical and a digital site for researchers to use KB’s digital collections at previously unimaginable scales. Beyond explaining how the lab aligns with KB’s broader mission as a national library, we also elaborate upon the design of the technical setup and the processes of research coordination that the operation of a library lab presumes. The second part discusses how KBLab has deployed the library’s collections as data to produce high quality Swedish AI models, which constitute a significant new form of digital research infrastructure. We situate this development work in the context of uneven AI coverage for smaller languages, and consider how the lab’s models have contributed to the making of important AI infrastructure for the Swedish language. The conclusion raises the possibilities and challenges involved in continuing the type of library-based AI development we have initiated at KBLab.


“…and comprises 6,883 concepts. In practice, this thesaurus is used, in particular, to index documents in the document management systems of European institutions, as well as to classify legal documents [Caled et al, 2019; Avram et al, 2021].…”

Section: Ontological Resources for Regional Governance Tasks (unclassified)

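Formative Artificial Intelligence: New Opportunities for Information Support of Regional Governance (Формирующий Искусственный Интеллект: Новые Возможности Информационной Поддержки Регионального Управления)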

Шишаев, Пимешков, Никонорова et al. 2023

ЭИ


Intelligent information systems are finding ever wider application in regional governance. One of the modern foundational concepts for organizing them is formative artificial intelligence. This article analyzes the content of this concept and relates it to existing technologies for building intelligent information systems. It also reviews existing experience in using formative artificial intelligence technologies, such as generative adversarial neural networks and ontologies, in various applied tasks related to regional governance. The article concludes that the necessary environment for implementing formative artificial intelligence is an information system with agent properties. It also concludes that formative artificial intelligence offers broad opportunities for information support of regional governance on the one hand, while the full potential of modern intelligent information technologies remains underused on the other.

Acknowledgments: The study was carried out under the state assignment of ИИММ КНЦ РАН from the Ministry of Science and Higher Education of the Russian Federation, research topic registration number 122022800551-0.
