Belgian Dutch (BD) and Netherlandic Dutch (ND) are known to exhibit phonetic and lexical differences, but national variation in the syntax of Dutch has often been claimed to be virtually non-existent. This view is rooted in the fact that both laypersons and researchers
are oblivious to national divergences in the grammar of Dutch (unless these are categorical and/or heavily mediatized), and also in the undisputed belief that BD and ND are different surface manifestations of ‘the same grammatical motor’. As a result, only a few syntactic phenomena
have hitherto been shown to be sensitive to national constraints. In this paper we illustrate a computational bottom-up approach (pioneered in Bannard & Callison-Burch 2005) that casts the net as widely as possible in the search for such phenomena. Building on statistical machine translation and a parallel corpus of Dutch
translations of English subtitles, we identify plausible mappings between English n-grams and their Dutch translations. We do this to obtain paraphrases, i.e., stretches of interchangeable Dutch text that carry approximately the same meaning. In a first case study, we found
corroborating evidence among the discovered paraphrases for many syntactic variables that have previously been attested in Dutch, including complementizer variation, existential er-variation, word order phenomena, and inflection variation. Crucially, we also discovered a number of alternations
we had not anticipated as interesting variables. In order to detect national constraints on the newly found variables, we carried out a second experiment with a smaller corpus of Belgian and Netherlandic subtitles: the two variables we investigated in this light, deictic strength
variation and subordination variation, did indeed manifest national sensitivity.
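
The paraphrase step described above can be illustrated with a minimal sketch of pivot-based paraphrase scoring in the spirit of Bannard & Callison-Burch (2005), in which two Dutch n-grams count as candidate paraphrases when they translate the same English n-gram, with p(d2 | d1) approximated as the sum over English pivots e of p(d2 | e) · p(e | d1). This is not the implementation used in the study: the function name `paraphrase_probs`, the toy alignment counts in `TOY_PAIRS`, and the use of simple relative frequencies as translation probabilities are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical aligned phrase counts (dutch, english, count); invented for
# illustration, not taken from the subtitle corpus.
TOY_PAIRS = [
    ("omdat", "because", 3),
    ("want", "because", 1),
    ("omdat", "since", 1),
    ("sinds", "since", 3),
]


def paraphrase_probs(phrase_pairs):
    """Score Dutch paraphrase candidates by pivoting through English.

    Returns a nested dict: scores[d1][d2] = sum_e p(d2 | e) * p(e | d1),
    with both probabilities estimated as relative frequencies.
    """
    pair_count = defaultdict(float)
    dutch_total = defaultdict(float)
    english_total = defaultdict(float)
    for dutch, english, count in phrase_pairs:
        pair_count[(dutch, english)] += count
        dutch_total[dutch] += count
        english_total[english] += count

    # Group the Dutch phrases that share each English pivot.
    by_english = defaultdict(list)
    for dutch, english in pair_count:
        by_english[english].append(dutch)

    scores = defaultdict(lambda: defaultdict(float))
    for (d1, english), count in pair_count.items():
        p_e_given_d1 = count / dutch_total[d1]
        for d2 in by_english[english]:
            if d2 == d1:
                continue  # a phrase is not a paraphrase of itself
            p_d2_given_e = pair_count[(d2, english)] / english_total[english]
            scores[d1][d2] += p_e_given_d1 * p_d2_given_e

    return {d1: dict(candidates) for d1, candidates in scores.items()}


if __name__ == "__main__":
    for source, candidates in paraphrase_probs(TOY_PAIRS).items():
        print(source, candidates)
```

In a setting like the one described here, high-scoring pairs would only be candidate variables; whether a given pair reflects a genuine syntactic alternation would still require manual inspection.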