Steps
I made the trees using the Arethusa software on the Perseids website [13]. Original text files were obtained from the Perseus Project [14] (Tufts Univ.) and from the Pedalion Project (KU Leuven). I followed the rules of dependency syntax, employing the standard AGDT 1.1 tagset [2] and refining it according to the discussion of dependency syntax offered by Pinkster [15]. I have not used the 2.0 tagset based on Smyth and developed by Celano [4], for two reasons. First, its level of specificity markedly increases the subjectivity of annotation decisions, which often come to rely more on semantics than on syntax (what is the difference between a partitive genitive and a genitive of material in the phrase 'piece of pie'?). Second, the tagset is specific to Greek, which makes linguistic comparison between languages more difficult.

Sampling strategy
While no formal statistical sampling methods were used, I chose to annotate at least 20,000 tokens each from a variety of Greek prose authors. This figure is roughly the size of an average 'book' for many authors, and it yields a dataset large enough for sampling algorithms to produce meaningful results. I have included works from the Classical, Hellenistic, and Roman periods: Aeschines, Antiphon, Appian, Athenaeus, Demosthenes, Dionysius of Halicarnassus, Herodotus, Josephus, Lysias, Plutarch, Polybius, Thucydides, and Xenophon.

Quality Control
The relation labeling follows the general instructions for the AGDT 1.1 tagset given in Bamman and Crane [2]. For major linguistic phenomena not covered there, I have created more detailed annotation instructions in the 'Treebanking Tips' file within this dataset, relying heavily on the parallel interpretation of dependency syntax offered for Latin by Pinkster [15].
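As an illustration of how the 20,000-token threshold and the distribution of relation labels might be checked, the following is a minimal sketch in Python. It assumes that the trees are stored in the usual AGDT-style XML layout, with `sentence` elements containing `word` elements that carry `form`, `relation`, and `head` attributes. The sample sentence, the function names, and the decision to count punctuation as tokens are assumptions made for this example and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical one-sentence sample in AGDT-style XML (not from the dataset).
SAMPLE = """<treebank>
  <sentence id="1">
    <word id="1" form="ὁ" relation="ATR" head="2"/>
    <word id="2" form="ἄνθρωπος" relation="SBJ" head="3"/>
    <word id="3" form="γράφει" relation="PRED" head="0"/>
    <word id="4" form="." relation="AuxK" head="0"/>
  </sentence>
</treebank>"""


def count_tokens(xml_text: str) -> int:
    """Count every <word> element, punctuation included (an assumption)."""
    root = ET.fromstring(xml_text)
    return sum(1 for _ in root.iter("word"))


def relation_counts(xml_text: str) -> Counter:
    """Tally the relation label of every <word> element."""
    root = ET.fromstring(xml_text)
    return Counter(w.get("relation", "") for w in root.iter("word"))


def meets_threshold(n_tokens: int, threshold: int = 20_000) -> bool:
    """Return True if a token count reaches the per-author minimum."""
    return n_tokens >= threshold


if __name__ == "__main__":
    n = count_tokens(SAMPLE)
    print(n, meets_threshold(n))
    print(relation_counts(SAMPLE).most_common())
```

To apply the sketch to a real file, its contents could be read into a string and passed to `count_tokens`, with the counts summed over all files belonging to one author before calling `meets_threshold`.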