The starting point of this article is the question "How to retrieve fingerprints of rhythm in written texts?" We address this problem in the case of Brazilian and European Portuguese. These two dialects of Modern Portuguese share the same lexicon, and most of the sentences they produce are superficially identical. Yet they are conjectured, on linguistic grounds, to implement different rhythms. We show that this linguistic question can be formulated as a problem of model selection in the class of variable length Markov chains. To carry out this approach, we compare texts from European and Brazilian Portuguese. The texts are first encoded according to some basic rhythmic features of the sentences that can be retrieved automatically. This is an entirely new approach from the linguistic point of view. Our statistical contribution is the introduction of the smallest maximizer criterion, a constant-free procedure for model selection. As a by-product, this provides a solution to the problem of the optimal choice of the penalty constant when using the BIC to select a variable length Markov chain. Besides proving the consistency of the smallest maximizer criterion when the sample size diverges, we also carry out a simulation study comparing our approach with both the standard BIC selection and the Peres-Shields order estimation. Applied to the linguistic sample constituted for our case study, the smallest maximizer criterion assigns different context-tree models to the two dialects of Portuguese. The features of the selected models are compatible with current conjectures discussed in the linguistic literature.
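To make the role of the penalty constant concrete, the following is a minimal sketch of penalized-likelihood (BIC-style) order selection for a fixed-order Markov chain. It is a deliberate simplification: it is not the smallest maximizer criterion and it does not select a context tree. The function names, the binary test chain, and the constant `c` are assumptions made for illustration only.

```python
import math
import random
from collections import Counter, defaultdict


def simulate_chain(n, flip_prob=0.9, seed=0):
    """Hypothetical test data: a binary order-1 chain that flips state with flip_prob."""
    rng = random.Random(seed)
    seq = [0]
    for _ in range(n - 1):
        prev = seq[-1]
        seq.append(1 - prev if rng.random() < flip_prob else prev)
    return seq


def log_likelihood(seq, order, start):
    """Maximized log-likelihood of an order-`order` chain, scored from position `start`."""
    counts = defaultdict(Counter)
    for i in range(start, len(seq)):
        context = tuple(seq[i - order:i])  # empty tuple when order == 0
        counts[context][seq[i]] += 1
    ll = 0.0
    for next_counts in counts.values():
        total = sum(next_counts.values())
        for c in next_counts.values():
            ll += c * math.log(c / total)
    return ll


def bic_score(seq, order, max_order, alphabet_size, c):
    """Log-likelihood minus c * (free parameters / 2) * log(n)."""
    n = len(seq) - max_order  # same scored positions for every candidate order
    ll = log_likelihood(seq, order, start=max_order)
    n_params = (alphabet_size ** order) * (alphabet_size - 1)
    return ll - c * n_params / 2 * math.log(n)


def select_order(seq, max_order, c=1.0):
    """Return the order in 0..max_order with the highest penalized score."""
    alphabet_size = len(set(seq))
    return max(
        range(max_order + 1),
        key=lambda k: bic_score(seq, k, max_order, alphabet_size, c),
    )


seq = simulate_chain(5000)
print(select_order(seq, max_order=3, c=1.0))
print(select_order(seq, max_order=3, c=1000.0))
```

The two calls differ only in `c`, yet an overly large constant pushes the selection toward a smaller model. This sensitivity to the constant is the issue that a constant-free procedure is meant to avoid.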
In this paper we study the syntax of clitic pronoun placement in texts by Portuguese authors born from 1542 to 1836. The motivation for the research was to ask: what is the pattern of enclisis (V-cl) and proclisis (cl-V) variation in those texts; is it indicative of linguistic change; and, if so, when in the timeline can the change be located? Drawing on the empirical results, we analyse the syntax of clitic placement in those texts as representative of a grammatical change that should be located in the first half of the 18th century. Our empirical arguments and structural analysis sustain that in texts up to the 18th century, enclisis is strictly a Verb-First phenomenon (even so, we will argue, in constructions that are superficially non-verb initial). We maintain that the effects of this syntax on clitic placement cease to be noticed in texts written by authors born after 1700.

1 This predominance was shown by several studies; among others, cf. Lobo, 1992; Ribeiro, 1995.

Our results, cf. Figure 1 to come, point to an even stronger contrast, since we find proclisis in 98% of the cases. The discrepancy with Britto's study is due to some differences in the set of phenomena considered.

3 It must be noted that we depart from these analyses not only because we have much more data at our disposal, but also because we adopt the view defended by Kroch (1989) that when two forms compete over time, the grammatical change should be located not at the end of this competition, but at its beginning; cf. Final Remarks.
This paper argues that the variation in the placement of clitic pronouns in European and
Introduction

This article proposes an analysis of clitic placement in European Portuguese and Brazilian Portuguese (henceforth EP and BP, respectively) from a comparative perspective. In the first part of the article, we use as a comparative corpus the original text of Paulo Coelho's novel O Alquimista and the adapted version of the Portuguese edition in order to illustrate the differences between the two varieties of Portuguese.