Background: Text classification is an important task in information retrieval. Its objective is to assign new text documents to a set of predefined classes using supervised algorithms. Objectives: We focus on text classification of Albanian news articles using two approaches. Methods/Approach: In the first approach, the words in a collection are treated as independent components, each assigned a corresponding vector in the vector space. Here we used nine classifiers from the scikit-learn package, training them on 80% of the news articles and testing their accuracy on the remaining articles. In the second approach, text classification treats words based on their semantic and syntactic similarities, assuming that a word is formed by character n-grams. In this case, we used fastText, a hierarchical classifier that considers local word order as well as sub-word information. We measured the accuracy of each classifier separately and also analyzed training and testing time. Results: Our results show that the bag-of-words model outperforms fastText when the text dataset is not large. fastText performs better when classifying multi-label text. Conclusions: News articles can serve as a benchmark for testing classification algorithms for Albanian texts. The best results are achieved with the bag-of-words model, with an accuracy of 94%.
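As a rough illustration of the two representations described above, the following sketch builds a bag-of-words classifier with an 80/20 train/test split in scikit-learn and shows how a word can be decomposed into character n-grams. The toy English corpus, the labels, the choice of `MultinomialNB`, and the helper `char_ngrams` are assumptions made for this example only; the abstract does not name the nine classifiers used, and the n-gram helper only mimics the sub-word idea rather than calling fastText itself.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical placeholder corpus; NOT the Albanian news dataset from the study.
docs = [
    "the team won the football match",
    "the striker scored two goals",
    "the coach praised the players",
    "the club signed a new goalkeeper",
    "fans celebrated the championship title",
    "parliament passed the new budget law",
    "the minister announced tax reforms",
    "voters elected a new president",
    "the government debated the election rules",
    "the opposition criticized the prime minister",
]
labels = ["sport"] * 5 + ["politics"] * 5

# 80% of the documents for training, the rest for testing (as in the abstract).
X_train, X_test, y_train, y_test = train_test_split(
    docs, labels, test_size=0.2, random_state=0, stratify=labels
)

# Bag of words: each word is an independent dimension holding its count.
# MultinomialNB is an assumed stand-in for one of the nine classifiers.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"bag-of-words accuracy on the toy split: {accuracy:.2f}")


def char_ngrams(word, n=3):
    """Split a word into character n-grams, with < and > marking its boundaries."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


print(char_ngrams("where", 3))
```

On a corpus this small the accuracy figure is meaningless; the point is only the shape of the pipeline. Swapping `MultinomialNB` for another scikit-learn estimator leaves the rest of the code unchanged, which is one way several classifiers could be compared on the same split.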
The development of the Text-to-Speech (TTS) field is not at the same level in all countries and for all languages. This is conditioned by differences in how languages are spoken and written, and also by the economic justification for research at the national level. As a result, progress in this field for local languages has lagged well behind that for English. In this context, this paper explores the possibility of using existing TTS converters designed for English [3] for the Albanian language. In particular, eSpeak is examined as a compact open-source speech synthesizer for English that uses formant synthesis and supports the synthesis of words in some other languages. The research focuses on two aspects: determining how well the words generated by this system are understood, and determining the feasibility of using eSpeak to generate speech in Albanian.
This research study investigates the development approach and analysis of an m-learning course management system developed as a case study. Its contribution is the analysis of a task-based instructional strategy and a conceptual definition of assessment methods for the level of mobile learning outcomes, together with an assessment of different software engineering aspects of developing a mobile application for Windows Phone 8. The analysis also covers development with a primary focus on mobile learning using ASP.NET MVC 5 technology, as well as design issues for responsive techniques. Finally, insights are stated, results of the analysis are provided, and recommendations and a discussion of the approach are presented.