Although Chinese Americans set up Chinese heritage language schools as early as 1848 to preserve the heritage language and to promote a sense of ethnic identity among their American-born children (Chao, 1997), there is strong evidence that language shift to English is taking place rapidly within Chinese communities across the U.S. Data from the 2006 American Community Survey (ACS) show that while only 34.1 percent of first-generation (i.e. foreign-born) Chinese Americans reported speaking ‘English very well’, the percentages rise dramatically for those who are American-born (i.e. second generation and beyond) or who were born overseas but arrived in the U.S. before the age of 16 (i.e. the 1.5 generation). In particular, 70.4 percent of the 1.5 generation and 93.8 percent of American-born Chinese Americans (ABCs) reported speaking ‘English very well’. Additionally, only about 27.6 percent of ABCs were estimated to speak their heritage language at home. Taken together, these estimates suggest that the rate of shift from Chinese to English is accelerating. Jia (2008) finds that even among first-generation Chinese Americans, Chinese language skills continue to decline with increasing English immersion. Rapid language shift to English means that many ABCs speak English as one of their native languages, if not the only one. This raises interesting sociolinguistic questions concerning the characteristics of the English spoken by ABCs and how ABCs use varieties of English to construct and negotiate differences with respect to each other and vis-à-vis the larger social structure.
A set of shared coding conventions for speaker ethnicity is necessary for open‐source data sharing and cross‐study compatibility between linguistic corpora. However, ethnicity, like many other aspects of speaker identity, is continually negotiated and reproduced in discourse, and is therefore a challenge to code representatively. This paper discusses some of the challenges facing researchers who want to use, create, or contribute to existing corpora that are annotated for the ethnic identity of a speaker. We specifically problematize the macro‐social label ‘Asian American’ and propose that researchers should consider different levels and types of specificity of ‘Asianness’ in order to ensure that the corpora best represent the reality of ethnic identity in the community sampled. This is particularly important given the limited incorporation of different Asian groups in most existing linguistic research. We argue that more rigorous coding for Asian American ethnicities in corpora will improve the utility of archived corpora and enhance sociolinguistic research on language variation and ethnic identity.