“…Input: I aint neva did dat befo
Output: (I, <PRP>), (aint, <VBP>), (neva, <NN>), (did, <VBD>), (dat, <JJ>), (befo, <NN>)

To address these issues, we aim to empirically study predictive bias (see Swinton (1981) for a definition), i.e., whether POS tagger models make predictions that depend on demographic language features, and we attempt a dynamic approach to collecting non-standard spellings and lexical items. To examine the behaviors of AAE speakers and their language use, we first collect variable (morphological and phonological) rules of AAE language features from the literature (Labov, 1975; Bailey et al., 1998; Green, 2002; Bland-Stewart, 2005; Stewart, 2014; Blodgett et al., 2016; Elazar and Goldberg, 2018; Baugh, 2008; Green, 2014) (see Appendix C). Then, we employ 5 trained sociolinguist Amazon Mechanical Turk (AMT) annotators who identify as bi-dialectal dominant AAE speakers to address the issue of lexical, semantic, and syntactic ambiguity of tweets (see Appendix B for annotation guidelines).…”
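
To make the mis-tagging pattern in the example above concrete, the following is a minimal sketch (not the paper's pipeline) of running an off-the-shelf POS tagger trained on standard English over the AAE sentence; here NLTK's averaged perceptron tagger is used as an assumed stand-in, and the tags it actually produces may differ from those shown in the example.

```python
# Sketch: tagging an AAE tweet with a standard-English POS tagger (NLTK).
# The tagger and the expected failure modes noted in comments are
# illustrative assumptions, not the paper's reported setup or results.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tweet = "I aint neva did dat befo"
tokens = nltk.word_tokenize(tweet)
print(nltk.pos_tag(tokens))
# Non-standard spellings such as "neva" (never, RB), "dat" (that, DT/PRP),
# and "befo" (before, RB/IN) are out-of-vocabulary for a tagger trained on
# standard English and tend to be mis-tagged, e.g. as nouns or adjectives,
# which is the kind of predictive bias the study examines.
```
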