In the prose style transfer task, a system is given input text and a target prose style and produces output that preserves the meaning of the input but alters its style. These systems require parallel data for evaluation and usually make use of parallel data for training. Currently, there are few publicly available corpora for this task. In this work, we identify a high-quality source of aligned, stylistically distinct text in different versions of the Bible. We provide a standardized split of the public domain versions in our corpus into training, development, and testing data. The corpus is highly parallel because it includes many Bible versions, and sentences are aligned by the chapter and verse numbers present in all versions of the text. In addition to the corpus, we present the results, as measured by the BLEU and PINC metrics, of several models trained on our data, which can serve as baselines for future research. While we present these data as a style transfer corpus, we believe that it is of unmatched quality and may be useful for other natural language tasks as well.
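As a rough illustration of the two ideas in this abstract, the sketch below aligns two versions by verse reference and scores a candidate rewrite with PINC, the fraction of candidate n-grams that do not appear in the source, averaged over n-gram orders. It is not the authors' code: the function names, the `(book, chapter, verse)` key format, whitespace tokenization, and the choice to skip n-gram orders the candidate is too short for are all assumptions made for the example.

```python
def align_by_verse(version_a, version_b):
    """Pair up verses that share a (book, chapter, verse) key.

    Both arguments are dicts mapping a hypothetical key such as
    ("Genesis", 1, 1) to the verse text in that version.
    """
    shared_keys = sorted(version_a.keys() & version_b.keys())
    return [(version_a[key], version_b[key]) for key in shared_keys]


def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def pinc(source, candidate, max_n=4):
    """PINC: mean over n of (1 - share of candidate n-grams found in source).

    Higher values mean the candidate is lexically further from the source.
    Orders for which the candidate has no n-grams are skipped (an assumption).
    """
    src_tokens = source.lower().split()
    cand_tokens = candidate.lower().split()
    scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand_tokens, n)
        if not cand_ngrams:
            continue  # candidate shorter than n tokens
        overlap = len(cand_ngrams & ngrams(src_tokens, n))
        scores.append(1.0 - overlap / len(cand_ngrams))
    return sum(scores) / len(scores) if scores else 0.0
```

PINC alone rewards any change to the wording, including changes that destroy meaning, which is why it is reported alongside BLEU against a reference in the target style.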
Social scientists recognize a complex and iterative relationship between the built environment and social identities. Here, we explore the extent to which household and settlement remains may be used as archaeological correlates of collective identities among the Stó:lō-Coast Salish peoples of the Fraser River Valley. Using data from six recently tested archaeological sites, we begin with the household and explore expressions of identity at various social-spatial scales. The sites span the period from 4200 cal B.C. to the late A.D. 1800s and include settlements with semi-subterranean houses of different forms as well as aboveground plank houses. Across this timeframe, we see both change and continuity in settlement location, layout, size, and house form. Our data suggest that although group identities have changed over the millennia, selected social units have persisted through many generations and can be linked to present-day identities of the Stó:lō-Coast Salish.
This Article presents the results of a quantitative analysis of writing style for the entire corpus of US Supreme Court decisions. The basis for this analysis is the measure of frequency of function words, which has been found to be a useful "stylistic fingerprint" and which we use as a general proxy for the stylistic features of a text or group of texts. Based on this stylistic fingerprint measure, we examine temporal trends on the Court, verifying that there is a "style of the time" and that contemporaneous Justices are more stylistically similar to their peers than to temporally remote Justices. We examine potential "internal" causes of stylistic changes, and conduct an in-depth analysis of the role of the modern institution of the judicial clerk in influencing writing style on the Court. Using two different measures of stylistic consistency, one measuring intra-year consistency on the Court and the other examining inter-year consistency for individual Justices, we find evidence that the writing styles of individual Justices have become less consistent as clerks have taken on a greater role on the Court.
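The function-word "fingerprint" mentioned in this abstract can be sketched as a vector of relative frequencies, with stylistic similarity measured as a distance between two such vectors. The word list below is a short illustrative sample, and the tokenizer and the L1 distance are assumptions chosen for simplicity; the article's actual word list and distance measure are not given here.

```python
import re
from collections import Counter

# Illustrative sample only; a real study would use a much longer list.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is", "it", "for")


def fingerprint(text):
    """Relative frequency of each function word among all tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return [0.0] * len(FUNCTION_WORDS)
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in FUNCTION_WORDS]


def style_distance(fp_a, fp_b):
    """L1 distance between two fingerprints; smaller means more similar style."""
    return sum(abs(a - b) for a, b in zip(fp_a, fp_b))
```

Under these assumptions, comparing the fingerprints of two opinions written in the same year against two written decades apart would be one simple way to probe for a "style of the time".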
For decades, researchers have studied the relationship between the political leanings of judges and the outcomes of appellate litigation in the United States. The primary source of data for this research has been published judicial opinions that describe cases and their outcomes. However, only a relatively small number of cases result in published opinions, and this sample of cases may be subject to serious biases. Based on computational text analysis of over 150,000 published opinions issued by federal appellate courts in the years 1970–2010, we find strong evidence of data bias based on relationships between the party affiliations of judges on appellate court panels and the characteristics of cases that result in published opinions. These relationships imply that the inferential model that underlies much of the judicial politics literature can lead to biased or spurious findings concerning the causal influence of judicial attributes on case outcomes.