Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law 2021
DOI: 10.1145/3462757.3466088
When does pretraining help?

Cited by 75 publications (54 citation statements). References 24 publications.
“…The research on legal AI is undergoing a fundamental change at the moment: instead of many small models each for one specific task, researchers have started to build and utilize one big model for many different tasks. Such a big model, aka foundation model [5], for the legal domain, is a large language model either pre-trained on legal corpora from scratch or adapted from a general model with further pretraining on legal corpora [6,25], which we call a Large Legal Language Model (L³M).…”
Section: Present (mentioning; confidence: 99%)
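The adaptation route described in this statement, continuing the pretraining of a general checkpoint on legal text, can be illustrated with a short sketch using the Hugging Face Transformers and Datasets libraries. The corpus file name, starting checkpoint, and hyperparameters below are illustrative assumptions, not the setup used by any of the cited papers.

```python
# Minimal sketch of domain-adaptive pretraining: continue masked-language-model
# training of a general BERT checkpoint on a legal corpus.
# Assumptions: "legal_corpus.txt" holds one document per line; hyperparameters
# are placeholders for a real pretraining run.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "bert-base-uncased"  # general-domain starting point
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Load the raw legal corpus as plain text.
raw = load_dataset("text", data_files={"train": "legal_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic token masking, as in BERT's masked-language-model objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="legal-bert-dapt",
    per_device_train_batch_size=16,
    num_train_epochs=1,      # real runs use far more data and steps
    learning_rate=5e-5,
)

Trainer(model=model, args=args, train_dataset=tokenized["train"],
        data_collator=collator).train()
```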
“…On one hand, the massive scale of complex text data enables or facilitates the (self-supervised) pre-training of L³M. On the other hand, the few-shot prompting (i.e., in-context learning) or zero-shot prompting capability of L³M for downstream tasks can greatly alleviate or even avoid the high labeling cost, while the flexibility of L³M to accommodate ambiguity and idiosyncrasies can help to meet the challenges of thoroughness and specialized knowledge. It is not surprising that with L³Ms such as LEGAL-BERT [6] and Lawformer [21], we are seeing new heights achieved in legal text classification and other tasks [25,15].…”
Section: Present (mentioning; confidence: 99%)
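The few-shot (in-context) prompting this statement refers to amounts to prepending a handful of labeled examples to the input and letting the model complete the label. The sketch below is illustrative only: the model name, the clauses, and the label set are assumptions, and any instruction-following language model could be substituted.

```python
# Minimal sketch of few-shot (in-context) prompting for a legal classification task.
# "gpt2" is a placeholder model; the clauses and labels are invented examples.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

few_shot_prompt = """Classify each contract clause as 'indemnification' or 'other'.

Clause: "The Supplier shall indemnify the Buyer against all third-party claims."
Label: indemnification

Clause: "This Agreement is governed by the laws of New York."
Label: other

Clause: "Vendor will hold Customer harmless from any losses arising from Vendor's negligence."
Label:"""

# The model is asked to continue the prompt with the label; no task-specific
# fine-tuning or labeled training set is required.
print(generator(few_shot_prompt, max_new_tokens=5)[0]["generated_text"])
```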
“…There are several BERT-based pretrained models in the legal domain. Well-known ones are LEGAL-BERT by (Chalkidis et al, 2020) and (Zheng et al, 2021). For legal domain adaptation in pretraining, (Chalkidis et al, 2020) demonstrate the effectiveness of domain pretraining, while (Zheng et al, 2021) address when domain pre-training helps.…”
Section: Related Work (mentioning; confidence: 99%)
“…32,33,34,35,36 Despite sometimes being characterized as general models, it is still an open question as to how much uptake or utility such core developments in NLP might offer when directed at complex domain-specific problems. While general models have shown real progress on legal tasks in the zero shot context, 1,8,37,38,39,40 there are still strong reasons 41,42,43 to believe that some combination of domain-specific pre-training, prompt engineering, prompt composition or chaining, hyper-parameter optimization, and other model tuning efforts will yield improved results in many substantive use cases. In other words, general NLP models will likely not eclipse the performance of an otherwise equally-sized large language model that has been well trained on the legal domain.…”
Section: Introduction (mentioning; confidence: 99%)
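As a rough sketch of the domain-specific route this last statement argues for, the following fine-tunes a legal-domain checkpoint on a downstream classification task. The data files, label count, and hyperparameters are assumptions; "nlpaueb/legal-bert-base-uncased" is the publicly released LEGAL-BERT checkpoint, but any legal-domain model could be swapped in.

```python
# Minimal sketch of fine-tuning a legal-domain checkpoint on a downstream
# classification task. Assumptions: "train.csv" and "test.csv" have a "text"
# column and an integer "label" column with two classes.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

checkpoint = "nlpaueb/legal-bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

data = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

data = data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="legal-bert-finetuned",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

# Passing the tokenizer lets the Trainer pad batches dynamically.
trainer = Trainer(model=model, args=args, tokenizer=tokenizer,
                  train_dataset=data["train"], eval_dataset=data["test"])
trainer.train()
print(trainer.evaluate())
```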