Abstract. The World Wide Web offers a wide range of heterogeneous educational resources, which suggests that such materials can be reused to compose courses. In this paper we address this issue by proposing an architecture for composing teaching courses from "the best parts" of the heterogeneous educational materials available on the Web. Course composition relies on a simple but effective evaluation methodology that reproduces the techniques teachers actually use when composing and improving classroom courses. The final goal of this work is to help the teacher construct a course iteratively until it reaches a stable form. We present our initial work and discuss future developments.
Identifier names (e.g., of packages, classes, methods, and variables) are one of the most important sources of information for software comprehension. They need to be analyzed in order to support collaborative software engineering and source code reuse, because they convey the domain concepts of the software. For instance, "getMinimumSupport" would be associated with the association rule concept in data mining software, while other names are difficult to recognize, for example when they mix parts of words (e.g., "init-FeatSet"). We thus propose methods for assisting automatic software understanding by classifying identifier names into domain concept categories. We propose an innovative solution based on data mining algorithms that aims to learn the character patterns of identifier names. The main challenges are (1) to automatically split identifier names into relevant constituent subnames and (2) to build a model associating such a set of subnames with predefined domain concepts. For this purpose, we propose a novel way of splitting identifiers into their constituent words and use n-gram-based text classification to predict the related domain concept. In this article, we report the theoretical method and the algorithms we propose, together with experiments run on real software source code that show the value of our approach.
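The two steps named above, splitting an identifier into subnames and deriving character n-gram features from them, can be illustrated with a minimal sketch. It is not the splitting method the abstract proposes, which is not described here; it assumes a conventional heuristic (splitting on underscores, hyphens, and case changes), and the function names `split_identifier` and `char_ngrams` as well as the `^`/`$` boundary markers are hypothetical choices made for illustration.

```python
import re


def split_identifier(name):
    """Split an identifier into lowercase subnames.

    Assumption: a conventional heuristic (separators plus case changes),
    not the splitting method proposed in the abstract.
    """
    tokens = []
    # First cut on explicit separators: underscores, hyphens, whitespace.
    for part in re.split(r"[_\-\s]+", name):
        # Then cut on case changes: acronyms, capitalized words, digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens]


def char_ngrams(token, n=3):
    """Return the character n-grams of a token, with hypothetical
    boundary markers '^' and '$' so prefixes and suffixes are distinct."""
    padded = f"^{token}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


print(split_identifier("getMinimumSupport"))  # ['get', 'minimum', 'support']
print(split_identifier("init-FeatSet"))       # ['init', 'feat', 'set']
print(char_ngrams("set"))                     # ['^se', 'set', 'et$']
```

The resulting n-grams could then serve as features for any standard text classifier; the choice of n = 3 is an assumption, not a value taken from the abstract.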
Supervised classification has been extensively addressed in the literature because it has many applications, especially in text categorization and web content mining, where data are organized in a hierarchy. The automatic analysis of brand names can be viewed as a special case of text processing, although such names are very different from classical textual data: they are often neologisms and cannot easily be handled by existing NLP tools. In our framework, we aim to analyze such names automatically and to determine to what extent they are related to a set of hierarchically organized concepts. The system is based on character n-grams. It is meant to help determine automatically, for instance, whether a name sounds related to ecology.
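To make the idea of relating a name to concepts through character n-grams concrete, here is a minimal sketch under stated assumptions. It is not the system described in the abstract: the seed words, the concept labels, the trigram size, and the use of cosine similarity are all hypothetical choices, and the concept hierarchy is not modelled (the concepts are treated as a flat set).

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Character n-grams with hypothetical boundary markers '^' and '$'."""
    padded = f"^{text.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def profile(words, n=3):
    """Aggregate the n-gram counts of several words into one profile."""
    counts = Counter()
    for word in words:
        counts.update(char_ngrams(word, n))
    return counts


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical seed vocabulary per concept, for illustration only.
SEED_WORDS = {
    "ecology": ["eco", "green", "bio", "natural"],
    "technology": ["tech", "digital", "cyber", "data"],
}


def score_name(name, seeds=SEED_WORDS, n=3):
    """Score how strongly a name's n-grams overlap each concept profile."""
    name_profile = profile([name], n)
    return {concept: cosine(name_profile, profile(words, n))
            for concept, words in seeds.items()}


scores = score_name("biogreen")
print(max(scores, key=scores.get), scores)
```

Because the comparison works on character fragments and not on dictionary words, a neologism such as the made-up name "biogreen" can still be scored even though no lexicon contains it, which is the property the abstract relies on.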