Social media is a rich platform that combines a pool of information with the people who use it. Organizations can analyze this data to improve their results. The objective of this paper is to study data analytics and how it can be applied to social media text analysis. Natural language processing is used for tasks such as text analysis and information extraction. This survey focuses on understanding the need for data analytics in text analysis across domains such as governance, politics, and rural development. Classification and clustering techniques derived from machine learning help extract important information through various algorithms, so that companies, organizations, and governing bodies can understand their audiences properly and work according to their needs. The paper presents a detailed analysis of clustering and classification algorithms over a wide variety of domains and compares their results on performance metrics. According to the authors' research, a fine-tuned BERT classifier gives the highest accuracy among the classification algorithms compared, which include the Naïve Bayes classifier, decision tree, and support vector machine (SVM). The study also shows that clustering has been used to find sets of similar words, sentences, word senses, and so on. These concepts can be applied to problems in e-governance, where authorities can deploy such methods to understand their people and work according to their needs.