Language model pre-training architectures have proven useful for learning language representations. Bidirectional Encoder Representations from Transformers (BERT), a recent deep bidirectional self-attention representation learned from unlabelled text, has achieved remarkable results in many natural language processing (NLP) tasks through fine-tuning. In this paper, we demonstrate the efficiency of BERT for a morphologically rich language, Turkish. Traditionally, morphologically difficult languages require heavy language pre-processing steps before the data are suitable for machine learning (ML) algorithms. In particular, tokenization, lemmatization or stemming, and feature engineering are needed to obtain an efficient data model and to overcome data sparsity or high dimensionality. In this context, we selected five Turkish NLP research problems from the literature: sentiment analysis, cyberbullying identification, text classification, emotion recognition, and spam detection. We then compared the empirical performance of BERT with that of baseline ML algorithms. We found that BERT improved on the baseline ML algorithms in the selected NLP problems while eliminating heavy pre-processing tasks.
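The data sparsity that the abstract attributes to rich morphology can be illustrated with a toy example. The sketch below is not the authors' pipeline: the suffix list, the `naive_stem` helper, and the `vocabulary` helper are hypothetical and were hand-picked for seven inflected forms of the Turkish word "ev" (house). It only shows why a bag-of-words vocabulary grows with inflection and how a stemming step collapses it.

```python
# Toy illustration only: a hand-picked suffix list (an assumption, not a real
# Turkish stemmer) applied to seven inflected forms of "ev" (house).
SUFFIXES = ("lerimizde", "lerimiz", "lerde", "ler", "de", "im")


def naive_stem(token: str, min_stem_len: int = 2) -> str:
    """Strip the longest matching suffix from the hand-picked list."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem_len:
            return token[: -len(suffix)]
    return token


def vocabulary(tokens, normalize=None):
    """Return the set of distinct word types, optionally after normalization."""
    return {normalize(t) if normalize else t for t in tokens}


tokens = "ev evler evde evlerde evim evlerimiz evlerimizde".split()

raw_vocab = vocabulary(tokens)                  # every surface form is its own feature
stemmed_vocab = vocabulary(tokens, naive_stem)  # all forms collapse to one stem

print(len(raw_vocab), len(stemmed_vocab))
```

Without normalization, each surface form becomes a separate feature dimension; a realistic pipeline would need a proper morphological analyzer or stemmer, which is the kind of pre-processing the abstract says fine-tuning BERT avoids.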
The main aim of software projects is to develop software that meets functional and non-functional requirements within the project budget and schedule. The greatest challenge in reaching this goal is the errors found in the software. The most basic technique for addressing software errors is testing the software according to the methods in the literature. These tests are conducted mainly by software developers, although the verification and validation methods differ according to size, experience, and the techniques or tools used. When software is tested, it is important that errors are found in the early phases. Software error estimation is a method of proven effectiveness and validity that increases software quality and reduces the cost of software development. In this study, software error estimation was carried out with software developed for this purpose, using machine learning algorithms and software metrics.