Our aim is to develop a sentiment analysis tool for public health officials to monitor spreading epidemics in a given region and time period. Analyzing public concerns and emotions about health-related matters is important for understanding how a disease spreads. This work focuses on sentiment classification of Twitter messages to measure the public's Degree of Concern (DOC) about a spreading disease. To this end, disease-related tweets are extracted by time and geographical location, and a novel two-step sentiment classification is applied to identify personal negative tweets. First, a clue-based algorithm separates personal tweets from non-personal tweets using subjectivity clues. Next, a lexicon-based algorithm and a Naïve Bayes classifier classify the personal tweets as negative or non-negative. The personal negative tweets are then used to measure the Degree of Concern. A Public Health Surveillance System (PHSS) is also developed, using visualization techniques such as maps, graphs, and charts to display the DOC of epidemic-related Twitter data. These visual concern graphs and charts can help health specialists monitor the progression and peaks of public health concerns about a disease in a particular space and time, so that public health officials can take the necessary preventive actions. Negation handling and Laplacian smoothing are used with the lexicon-based and Naïve Bayes classifiers to improve performance.
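The two-step pipeline described above can be sketched in a few lines of Python. This is an illustrative toy, not the paper's implementation: the subjectivity clues, sentiment lexicon, and negation word lists below are hypothetical stand-ins, and the DOC ratio shown (personal negative tweets over all personal tweets) is one plausible reading of the measure.

```python
# Toy sketch of the two-step tweet classification: a clue-based subjectivity
# filter (step 1), then a lexicon-based negative/non-negative step with simple
# negation handling (step 2). All word lists are hypothetical placeholders.

SUBJECTIVITY_CLUES = {"i", "my", "me", "feel", "worried", "scared", "think"}
NEGATIVE_LEXICON = {"sick", "fever", "worried", "scared", "outbreak"}
NEGATION_WORDS = {"not", "no", "never"}

def is_personal(tweet: str) -> bool:
    """Step 1: a tweet is 'personal' if it contains any subjectivity clue."""
    return any(w in SUBJECTIVITY_CLUES for w in tweet.lower().split())

def is_negative(tweet: str) -> bool:
    """Step 2: lexicon match with negation handling -- a negation word
    directly before a sentiment term flips its polarity."""
    words = tweet.lower().split()
    score = 0
    for i, w in enumerate(words):
        if w in NEGATIVE_LEXICON:
            negated = i > 0 and words[i - 1] in NEGATION_WORDS
            score += -1 if negated else 1
    return score > 0

def degree_of_concern(tweets):
    """DOC as the share of personal tweets that are negative (one plausible ratio)."""
    personal = [t for t in tweets if is_personal(t)]
    if not personal:
        return 0.0
    negative = [t for t in personal if is_negative(t)]
    return len(negative) / len(personal)
```

A news-style tweet with no subjectivity clue ("CDC reports new cases") is filtered out in step 1, while "I am not worried at all" survives step 1 but is scored non-negative in step 2 because the negation flips "worried".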
Facial Expression Recognition (FER) has grown in popularity with the recent advancement and adoption of human-computer interface technologies. Because images vary in brightness, background, pose, and other factors, it is challenging for current machine learning and deep learning models to identify facial expressions, and these models do not perform well when the database is small. Feature extraction is crucial for FER: if the extracted features are well separated, even a straightforward classifier can perform very well, whereas automated feature extraction in deep learning can let irrelevant features conflict with important ones. In this paper, we deal with limited data and extract only useful features from images. To enlarge the data and extract only the important facial features, we propose novel face cropping, rotation, and simplification procedures, and we use transfer learning to construct a DCNN for a highly accurate FER system. A pre-trained DCNN model is adopted by replacing its dense top layer(s) with FER-specific layers, and the model is then fine-tuned on facial expression data. Training the dense layer(s) is followed by fine-tuning each of the pre-trained DCNN blocks in turn. This pipeline progressively increases FER accuracy. Experiments on the CK+ and JAFFE datasets were run to assess the proposed methodology: for 7-class studies, high average recognition accuracies of 99.49% (CK+) and 98.58% (JAFFE) were obtained.
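The staged fine-tuning schedule (train the new dense head first, then unfreeze the pre-trained blocks one at a time and retrain after each step) can be sketched in framework-neutral Python. This is a sketch of the scheduling logic only: `Block`, `train_stage`, the block names, and the epoch counts are illustrative placeholders, not the paper's or any framework's actual API.

```python
# Framework-neutral sketch of the staged fine-tuning pipeline: first train only
# the replaced dense head, then unfreeze the pre-trained DCNN blocks one at a
# time (top-most first), briefly training after each unfreezing step.

class Block:
    def __init__(self, name, trainable=False):
        self.name = name
        self.trainable = trainable

def train_stage(blocks, epochs):
    """Stand-in for a real training loop: reports which blocks get updated."""
    return [b.name for b in blocks if b.trainable]

# A pretrained backbone (frozen) plus a freshly added FER head (trainable).
backbone = [Block("conv_block1"), Block("conv_block2"), Block("conv_block3")]
head = [Block("dense_head", trainable=True)]
model = backbone + head

schedule = []
# Stage 0: only the new dense head is trained; the backbone stays frozen.
schedule.append(train_stage(model, epochs=10))
# Later stages: unfreeze blocks one at a time, from the top of the backbone down.
for block in reversed(backbone):
    block.trainable = True
    schedule.append(train_stage(model, epochs=5))
```

In a real framework the same idea maps onto per-layer `trainable` flags (e.g. freezing layers of a pretrained backbone and unfreezing them block by block between fit calls).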
Document clustering is a technique used to split a collection of textual content into clusters or groups. In the machine learning domain, spectral clustering is now widely used. Using a selection of text mining algorithms, the diverse features of unstructured content are captured to produce rich descriptions. The main aim of this article is to improve unstructured text data clustering with a newly developed natural language processing technique. The proposed model has three stages: preprocessing, feature extraction, and clustering. First, the unstructured data is preprocessed with techniques such as punctuation and stop-word removal, stemming, and tokenization. The features are then extracted by word2vec using the continuous Bag-of-Words (CBOW) model and by term frequency-inverse document frequency (TF-IDF). The unstructured features are then grouped by hierarchical clustering, with the cut-off distance optimized by the fitness-improved sensing area-based electric fish optimization (FISA-EFO). A tuned deep neural network, tuned by the same algorithm, is used to improve the clustering model. The results reveal that the model provides better clustering accuracy than other clustering techniques when handling unstructured text data.

KEYWORDS: fitness improved sensing area-based electric fish optimization, hierarchical clustering, tuned deep neural network, unstructured text data clustering

INTRODUCTION

Generally, speech and text data are easily read by humans, but machine learning and statistical modeling applications receive some unstructured data, so the coded input feature sets require some alteration. [1] Data clustering is a technique for splitting data elements into groups so that the elements in the same group have the highest similarity, while, based on the clusters' attributes, the elements in other groups are diverse.
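Of the two feature types in the pipeline above, TF-IDF is simple enough to sketch directly (the word2vec CBOW embeddings are not shown). This is a minimal pure-Python version using a common weighting variant, assuming the documents have already been preprocessed into token lists.

```python
# Minimal TF-IDF feature extraction over preprocessed (tokenized, lowercased)
# documents. Uses tf = count / doc length and idf = log(N / df), one common
# variant; libraries often add smoothing terms.
import math

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    vectors = []
    for doc in docs:
        counts = {}
        for term in doc:
            counts[term] = counts.get(term, 0) + 1
        vectors.append({
            term: (c / len(doc)) * math.log(n / df[term])
            for term, c in counts.items()
        })
    return vectors
```

A term that appears in every document (df = N) gets weight zero, which is exactly the behavior that makes TF-IDF useful for clustering: only discriminative terms contribute to inter-document distances.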
The major aim of clustering techniques is to obtain centroids, or cluster centers, that characterize the entire cluster. Clustering techniques have been classified from different perspectives, such as "density-based methods, grid-based methods, partitioning methods, and hierarchical methods." [2,3] Moreover, a data set may be categorical or numerical. The primary statistical features of numeric data are used to define the distance function between data elements, whereas categorical data derives from qualitative and quantitative data, with the descriptions obtained from counts. [4] Using a "textual virtual schematic model" (TVSM), textual data is assigned to clusters in three steps. First, unstructured data is extracted from the data source and converted into structured data. [5] Clustering is then performed on the structured data. Finally, documents are compared to improve query accuracy. Day-to-day life generates a huge amount of unstructured text...
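The role of the cut-off distance in hierarchical clustering can be made concrete with a small sketch: agglomerative clustering merges the closest pair of clusters until the closest remaining pair is farther apart than the cut-off. This toy uses single linkage and Euclidean distance as illustrative choices; in the article the cut-off is what FISA-EFO tunes, whereas here it is simply a fixed parameter.

```python
# Sketch of cut-off-based hierarchical (single-linkage agglomerative)
# clustering: repeatedly merge the two closest clusters, stopping once the
# closest pair is farther apart than the cut-off distance.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def single_linkage(c1, c2):
    """Distance between clusters = distance of their closest member pair."""
    return min(euclidean(p, q) for p in c1 for q in c2)

def hierarchical_clustering(points, cutoff):
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: single_linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        if single_linkage(clusters[i], clusters[j]) > cutoff:
            break  # No pair is closer than the cut-off: stop merging.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

The cut-off directly controls the number of clusters: a small cut-off leaves many tight clusters, a large one merges everything, which is why treating it as an optimizable hyperparameter (as the article does) is natural.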