The boom in teaching Chinese as a foreign language (TCFL) and in personalized learning has sharply increased the demand for Chinese-language reading material. Numerous Chinese reading materials are available for foreign students and learners to read and evaluate. High-quality, well-sequenced TCFL reading materials help learners with different levels of reading comprehension and interpretation ability master the language more quickly. This study therefore carries out an automatic readability assessment of books in Chinese as a foreign language. Building on existing readability assessment research, the paper comprehensively considers the factors that affect the difficulty of reading materials from the perspective of Chinese ontology. Natural language processing and a database management system are used to extract features from books in Chinese as a foreign language, and text readability is evaluated with a statistical machine learning algorithm. The model is optimized through feature selection, using ranking-based feature selection, and wrapper ("packaging") feature selection is introduced to optimize algorithm performance. The feature sets and each individual feature in the three dimensions of word meaning, part of speech, and discourse were optimized with the machine learning regression model according to selected evaluation indexes. The results indicate that the regression model is effective at identifying and recommending simpler textbooks for learners who struggle with difficult foreign-language materials. For highly proficient learners, the approach significantly improves the performance and efficiency of measuring the readability of books.
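As a rough illustration of the wrapper-style feature selection mentioned above, the following minimal Python sketch greedily adds whichever feature most reduces the cross-validated error of a regression model. Everything in it is an assumption made for illustration: the feature names (`avg_word_level`, `avg_sentence_len`, `random_noise`), the six toy texts, and the 1-nearest-neighbour regressor, which stands in for the unspecified regression model of the study. It is not the authors' implementation.

```python
# Hypothetical toy data: one row per text, columns follow FEATURES.
# Feature names and values are invented for illustration only.
FEATURES = ["avg_word_level", "avg_sentence_len", "random_noise"]
X = [
    [1, 1, 3],
    [2, 1, 1],
    [3, 3, 2],
    [4, 3, 3],
    [5, 1, 1],
    [6, 1, 2],
]
y = [1, 1, 2, 2, 3, 3]  # assumed difficulty level of each text


def loo_mse(X, y, cols):
    """Leave-one-out mean squared error of a 1-nearest-neighbour
    regressor that only looks at the feature columns in `cols`."""
    total = 0.0
    for i in range(len(X)):
        best_j, best_d = None, None
        for j in range(len(X)):
            if j == i:
                continue  # hold out the text being predicted
            d = sum((X[i][c] - X[j][c]) ** 2 for c in cols)
            if best_d is None or d < best_d:
                best_j, best_d = j, d
        total += (y[i] - y[best_j]) ** 2
    return total / len(X)


def forward_select(X, y, names):
    """Greedy forward wrapper selection: repeatedly add the feature
    that lowers the leave-one-out error most; stop when none helps."""
    selected, best = [], float("inf")
    remaining = list(range(len(names)))
    while remaining:
        scored = [(loo_mse(X, y, selected + [c]), c) for c in remaining]
        score, col = min(scored)
        if score >= best:
            break  # no candidate improves the model
        selected.append(col)
        remaining.remove(col)
        best = score
    return [names[c] for c in selected], best


selected, error = forward_select(X, y, FEATURES)
print(selected, error)
```

On this toy data the search keeps `avg_word_level` and `avg_sentence_len` and leaves out `random_noise`. A wrapper method scores feature subsets by actually refitting the model, which is costlier than ranking features one at a time but captures interactions such as the one between the two retained features here.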