This paper presents a full procedure for the development of a segmented, POS-tagged and chunk-parsed corpus of Old Tibetan. As an extremely low-resource language, Old Tibetan poses non-trivial problems in every step towards the development of a searchable treebank. We demonstrate, however, that a carefully developed, semi-supervised method of optimising and extending existing tools for Classical Tibetan, as well as creating specific ones for Old Tibetan, can address these issues. We thus also present the very first Tibetan Treebank in a variety of formats to facilitate research in the fields of NLP, historical linguistics and Tibetan Studies.
Some issues carried the title in Tibetan as Bod Sog zla reʼi gsar ʼgyur. 3 Barnett discovered 16 issues in the market in Chengdu, which are now in the library of Columbia University. A number of issues of the Mengzang yuebao are held by: Stanford (9 issues, https://searchworks.stanford.edu/view/14055358); Harvard (11 issues, https://hollis.harvard.edu/primo-explore/fulldisplay?docid=01HVD_ALMA212010380950003941&context=L&vid=HVD2&lang=en_US&search_scope=default_scope&adaptor=Local%20Search%20Engine&tab=books&query=any,contains,meng%20zang%20yue&offset=0); Cornell (4 issues, https://newcatalog.library.cornell.edu/catalog/398440); Library of Congress (14 issues, https://catalog.loc.gov/vwebv/holdingsInfo?searchId=12644&recCount=25&recPointer=0&bibId=13847187); and Chinese University of Hong Kong (15 issues, https://julac.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=CUHK_IZ21767439280003407&context=L&vid=CUHK&lang=en_US&search_scope=All&adaptor=Local%20Search%20Engine&tab=default_tab&query=any,contains,meng%20zang%20yue&offset=0). 4 The Uyghur translation given for the title of the Mengzang yuebao differs in different issues, even within the same year. These three versions of the title are found in issues from the first six months of 1940: Manġul vä Täbät qomitäsi aylıq mäǰmūʿäsi, Manġul vä Tibät qomitäsi ṭäräfindän čıqarılġan aylıq mäǰmūʿä; Manġul vä Tibät qomitäsi ṭäräfindän čıqarılġan aylıq mäǰmūʿäsi; and Manġul vä täbät qomitäsi ayliq mäǰmūʿäsi / Manġul vä tübät qomitäsi ṭäräfindän čıqarılġan ayliq mäǰmūʿä. 5 These included the Xizang baihuabao (西藏白話報, Bod kyi phal skad gsar ʼgyur, "Tibet Vernacular News") from 1907 to 1910 and the Menghuabao (蒙話報, Mongolyn sonin bichig, "Mongolian Colloquial Newspaper") from 1908 (Zhang 2016, Bai 2018), both of which were bilingual (Chinese-Tibetan and Chinese-Mongolian).
Bai 2018 says that a publication called the Mengwenbao (蒙文報, "Mongolian Language News") began in 1907, and He Jiani says publication of the Mengwen baihuabao (蒙文白話報, Mongγol yerü üge-yin sedkül, "Mongolian Vernacular News") was proposed before the fall of the dynasty in 1911, but implies that the proposal was not implemented (He 2018: 144). Freeman notes that the Chinese-language Yili baihua bao (伊犂白話報, "Ili Vernacular News") was published for a year