Proceedings of the Natural Legal Language Processing Workshop 2019
DOI: 10.18653/v1/w19-2203

The Extent of Repetition in Contract Language

Abstract: Contract language is repetitive (Anderson and Manns, 2017), but so is all language (Zipf, 1949). In this paper, we measure the extent to which contract language in English is repetitive compared with the language of other English language corpora. Contracts have much smaller vocabulary sizes compared with similarly sized non-contract corpora across multiple contract types, contain 1/5th as many hapax legomena, pattern differently on a log-log plot, use fewer pronouns, and contain sentences that are about 20% m…
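As a rough illustration of the measures named in the abstract, the sketch below counts tokens, vocabulary size, and hapax legomena (words occurring exactly once), and builds the rank-frequency pairs that a log-log plot would display. The regex tokenizer, the function names, and the sample sentence are assumptions made for illustration only; they are not taken from the paper, whose preprocessing may differ.

```python
import math
import re
from collections import Counter


def vocabulary_stats(text):
    """Return (token count, vocabulary size, hapax count) for a text."""
    # Assumption: lowercase alphabetic tokens, with an optional apostrophe suffix.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(tokens)
    # A hapax legomenon is a word type that occurs exactly once.
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return len(tokens), len(counts), hapaxes


def log_rank_frequency(text):
    """Return (log rank, log frequency) pairs, most frequent word first."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    freqs = sorted(Counter(tokens).values(), reverse=True)
    return [(math.log(rank), math.log(freq))
            for rank, freq in enumerate(freqs, start=1)]


# Hypothetical sample text, not drawn from any corpus in the paper.
sample = "The party shall notify the other party. The other party shall respond."
n_tokens, vocab_size, n_hapaxes = vocabulary_stats(sample)
points = log_rank_frequency(sample)
```

Comparing these numbers fairly requires corpora of similar size, as the abstract notes, because vocabulary size and hapax counts both grow with the number of tokens.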

Cited by 9 publications (7 citation statements). References 15 publications.

“…Recently, studies grow increasing attention on CCE to extract clauses, which are complete units in contracts, and carefully select a large number of clause types worth human attention (Borchmann et al., 2020; Wang et al., 2021b; Hendrycks et al., 2021). Due to the repetition of contract language that new contracts usually follow the template of old contracts (Simonson et al., 2019), existing methods tend to incorporate structure information to tackle CCE. For example, Chalkidis et al. (2017) assign a fixed extraction zone for each clause type and limit the clauses to be extracted from corresponding extraction zones.…”
Section: Related Work (mentioning)
confidence: 99%
“…Prior NLP work on contract language is extensive, including summarization (Manor and Li, 2019; Keymanesh et al., 2020) information extraction and understanding (Anish et al., 2019; Borchmann et al., 2020; Agarwal et al., 2021), as well as corpus studies looking at intrinsic properties of contracts (Curtotti and McCreath, 2011; Simonson et al., 2019) or providing new annotations over contract language (Funaki et al., 2020).…”
Section: Prior Work (mentioning)
confidence: 99%
“…For coreference resolution, the system employs a heuristic technique. Since contract language tends to avoid 3rd person anaphora (Simonson et al., 2019), the system avoids some of the more challenging coreference problems and targets primarily noun phrases headed by common nouns and proper nouns. First, all 1st and 2nd person pronouns are joined into chains unique to each pronoun since their intrinsic deictic properties make them unambiguous.…”
Section: Pre-processing (mentioning)
confidence: 99%
“…The fundamental units of discourse in contracts consist of "clauses" that are paragraphs of text that outline the terms and conditions of various types or topics (e.g., severability, benefits) (Table 1). Legal clauses can be characterized by their high inter-sentence similarity, and topic-specific content (Simonson et al., 2019). For example, Zhong et al. (2020) showed that the sentences in legal corpora are almost 20% similar to each other.…”
Section: Introduction (mentioning)
confidence: 99%