Purpose
This paper aims to model a technique that categorizes text from very large document collections. Advances in internet technologies have made documents far more accessible, so the number of documents available online has grown enormously. These text documents comprise research articles, journal papers, newspapers, technical reports and blogs. Such large collections are valuable for real-time applications and are used in several retrieval methods. Text classification plays a vital role in information retrieval and is considered an active field for large-scale applications. The aim of text classification is to assign large documents to different categories on the basis of their contents. Numerous text-related tasks, such as user profiling, sentiment analysis and spam identification, are treated as supervised learning problems and are addressed with a text classifier.
Design/methodology/approach
First, the input documents are pre-processed using stop word removal and stemming so that the input is suitable for feature extraction. In the feature extraction step, features are extracted using the vector space model (VSM), and feature selection then picks the most relevant features for text categorization. Once the features are selected, text categorization is performed using a deep belief network (DBN). The DBN is trained using the proposed grasshopper crow optimization algorithm (GCOA), which integrates the grasshopper optimization algorithm (GOA) and the crow search algorithm (CSA). Moreover, a hybrid weight bounding model is devised using the proposed GCOA and range degree. Thus, the proposed GCOA + DBN is used for classifying the text documents.
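The pre-processing and VSM steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop word list, the suffix-stripping `crude_stem` helper (a stand-in for a real stemmer) and the TF-IDF weighting are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical stop word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on"}


def crude_stem(word):
    """Strip a few common English suffixes (stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words, then stem."""
    tokens = [t for t in text.lower().split() if t.isalpha()]
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Map each document to a TF-IDF vector over a shared vocabulary."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(t for doc in tokenized for t in set(doc))
    idf = {t: math.log(n_docs / doc_freq[t]) for t in vocab}
    vectors = []
    for doc in tokenized:
        term_freq = Counter(doc)
        vectors.append([term_freq[t] * idf[t] for t in vocab])
    return vocab, vectors
```

Calling `tfidf_vectors(["The documents are classified", "A document about solar power"])` returns the shared vocabulary and one weight vector per document. Terms that occur in every document get weight zero under this weighting, which is one simple reason a feature selection step can discard them.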
Findings
The performance of the proposed technique is evaluated using accuracy, precision and recall, and is compared with existing techniques: naive Bayes, k-nearest neighbors, support vector machine, deep convolutional neural network (DCNN) and Stochastic Gradient-CAViaR + DCNN. The proposed GCOA + DBN shows improved performance, with values of 0.959, 0.959 and 0.96 for precision, recall and accuracy, respectively.
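For reference, the three metrics named above can be computed from predicted and true labels as in the sketch below. It assumes a binary (one class versus the rest) setting; the abstract does not say how the scores are averaged across categories, so the `binary_metrics` helper is illustrative only.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return precision, recall and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}
```

Precision is the fraction of predicted positives that are correct, recall is the fraction of actual positives that are found, and accuracy is the fraction of all predictions that are correct.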
Originality/value
This paper proposes a technique that categorizes text from massive document collections. The findings show that the proposed GCOA-based DBN effectively classifies the text documents.
Oral cancer is a significant health problem throughout the world. It is very important to detect such cancers at an early stage rather than at a later stage, when treatment becomes unsuccessful. Early detection helps surgeons provide the necessary therapeutic measures, which benefits patients. In this paper, a technique is proposed to detect cancers in the mouth from an orthopantomogram. A novel mathematical morphological watershed algorithm is proposed to preserve edge details, including the less prominent ones, in order to identify tumors in dental radiographs. Applying the watershed transform to images leads to oversegmentation even when the image is preprocessed. To avoid this, marker-controlled watershed segmentation is used to segment tumors. The method was tested, and the results obtained are quite good.
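The idea behind marker-controlled watershed can be illustrated with a simplified priority-flood sketch. This is not the authors' morphological algorithm: it assumes the input is a small 2D grid of intensities (for example a gradient magnitude image) and that seed markers are already given, and the function name `marker_watershed` is hypothetical. Because regions grow only from the supplied markers, the number of segments equals the number of markers, which is how the approach avoids oversegmentation.

```python
import heapq


def marker_watershed(image, markers):
    """Flood a 2D intensity grid from labeled seeds.

    image:   2D list of numbers (low values are flooded first).
    markers: 2D list of ints, 0 = unlabeled, >0 = seed label.
    Returns a 2D list in which every reachable pixel carries a seed label.
    """
    rows, cols = len(image), len(image[0])
    labels = [row[:] for row in markers]
    heap = []
    counter = 0  # tie-breaker: equal intensities are processed in push order
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] > 0:
                heapq.heappush(heap, (image[r][c], counter, r, c))
                counter += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        # Grow the region into unlabeled 4-connected neighbors.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (image[nr][nc], counter, nr, nc))
                counter += 1
    return labels
```

With two markers placed on either side of a high-intensity ridge, the two regions grow toward each other and meet at the ridge, so the boundary between labels follows the strongest edge.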
Solar power is generated using photovoltaic (PV) systems all over the world. Because the output power of PV systems fluctuates and depends strongly on environmental conditions, solar power sources are unpredictable in nature. Irradiance, humidity, PV surface temperature and wind speed are only a few of these conditions. Because of this unpredictability in photovoltaic generation, it is crucial to plan ahead, so solar power forecasting is required for the electric grid. Since solar power generation is weather-dependent and unpredictable, this forecasting is complex and difficult. The impacts of various environmental conditions on the output of a PV system are discussed. Machine learning (ML) algorithms have shown great results in time series forecasting and so can be used to anticipate power with weather conditions as model inputs. Multiple machine learning, deep learning and artificial neural network techniques can be used to perform solar power forecasting. Here, three regression models from machine learning are compared: a support vector machine regressor, a random forest regressor and a linear regression model. The random forest regressor outperformed the other two models by a wide margin in accuracy.
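As a minimal sketch of the simplest of the three models, the code below fits a one-variable linear regression (for example, irradiance as the input and PV power as the target) by ordinary least squares and scores it with root mean squared error. The single-feature setup, the helper names and the choice of RMSE as the error measure are assumptions for illustration; the abstract does not state the features or the metric used for the comparison.

```python
import math


def fit_linear(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the inputs are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))
```

The same `rmse` function can score any regressor on held-out data, which is one way a comparison between the support vector, random forest and linear models could be set up.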