In natural language processing, text summarization is an important application for extracting desired information from large bodies of text. Existing studies use keyword-based algorithms to group text, which do not capture a document's actual theme. Our proposed dynamic corpus creation mechanism combines metadata with summarized extracted text. The approach analyzes a mesh of multiple unstructured documents and generates a linked set of weighted nodes by applying multistage clustering, and adjacency graphs are generated to link the clusters of the various document collections. The approach comprises ten steps: pre-processing, building multiple corpora, first-stage clustering, creating sub-corpora, interlinking sub-corpora, creating a PageRank keyword dictionary for each sub-corpus, second-stage clustering, path creation among the clusters of sub-corpora, text processing by forward and backward propagation, and results generation. The outcome is a set of sub-corpora interlinked through their clusters. We applied the approach to a news dataset; the interlinked corpus is processed through step-by-step clustering to find the most relevant parts of the corpus at lower cost and time and with improved content detection. During experimentation we applied six different metadata-processing combinations over multiple text queries to compare results. The text-satisfaction comparison shows that PageRank keywords yield 38% related text, single-stage clustering 46%, two-stage clustering 54%, and the proposed technique 67%. Furthermore, the approach retrieves the relevant data ranging from most to least relevant content and provides a systematic query-relevant corpus-processing mechanism that automatically selects the most relevant sub-corpus through dynamic path selection. We used the SHAP model to evaluate the proposed technique, and the evaluation showed that the proposed mechanism improves text processing. Moreover, combining text-summarization features produced satisfactory results compared with summaries generated by general abstractive and extractive summarization models.
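To make the described pipeline concrete, the Python sketch below illustrates the two-stage clustering and sub-corpus linking idea under several assumptions: TF-IDF with k-means for the first-stage clustering, a TextRank-style word co-occurrence graph ranked with networkx PageRank as the keyword dictionary, and keyword overlap as the edge weight of the adjacency graph. All function names and parameter values are illustrative and are not taken from the paper.

```python
# Hypothetical sketch of the two-stage clustering + linked sub-corpus idea.
# Function names and parameters are illustrative assumptions, not the
# authors' reference implementation.
from collections import defaultdict
from itertools import combinations

import networkx as nx
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def first_stage_clusters(docs, n_clusters=5):
    """Stage 1: group raw documents into sub-corpora with k-means on TF-IDF."""
    X = TfidfVectorizer(stop_words="english").fit_transform(docs)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    subcorpora = defaultdict(list)
    for doc, lab in zip(docs, labels):
        subcorpora[lab].append(doc)
    return subcorpora

def keyword_dict(docs, top_k=10):
    """PageRank-style keyword dictionary: rank the words of a sub-corpus by
    running PageRank on a word co-occurrence graph (TextRank-like)."""
    g = nx.Graph()
    for doc in docs:
        words = [w.lower() for w in doc.split() if w.isalpha() and len(w) > 3]
        for u, v in zip(words, words[1:]):          # adjacent-word co-occurrence
            g.add_edge(u, v)
    if g.number_of_nodes() == 0:
        return []
    scores = nx.pagerank(g)
    return [w for w, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]]

def link_subcorpora(subcorpora):
    """Adjacency graph over sub-corpora, weighted by shared top keywords."""
    keywords = {cid: set(keyword_dict(docs)) for cid, docs in subcorpora.items()}
    g = nx.Graph()
    g.add_nodes_from(subcorpora)
    for a, b in combinations(subcorpora, 2):
        overlap = len(keywords[a] & keywords[b])
        if overlap:
            g.add_edge(a, b, weight=overlap)
    return g, keywords

def most_relevant_subcorpus(query, keywords):
    """Route a query to the sub-corpus whose keyword dictionary matches it best."""
    q = set(query.lower().split())
    return max(keywords, key=lambda cid: len(q & keywords[cid]))
```

In this sketch a query would be routed with most_relevant_subcorpus before any summarization step; the paper's second-stage clustering and forward/backward propagation over cluster paths are not reproduced here.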
Document clustering is a technique for splitting a collection of textual content into clusters or groups. Nowadays, spectral clustering is widely used in the machine-learning domain. Using a selection of text-mining algorithms, the diverse features of unstructured content are captured to yield rich descriptions. The main aim of this article is to enhance the clustering of unstructured text data with a newly developed natural language processing technique. The proposed model has three stages: preprocessing, feature extraction, and clustering. Initially, the unstructured data is preprocessed by punctuation and stop-word removal, stemming, and tokenization. Then, features are extracted with word2vec using the continuous bag-of-words model and term frequency-inverse document frequency. Hierarchical clustering is then applied to the extracted features, with the cut-off distance optimized by the fitness-improved sensing area-based electric fish optimization (FISA-EFO); a tuned deep neural network, tuned by the same algorithm, is used to improve the clustering model. The results reveal that the model provides better clustering accuracy than other clustering techniques when handling unstructured text data.

KEYWORDS: fitness-improved sensing area-based electric fish optimization, hierarchical clustering, tuned deep neural network, unstructured text data clustering

INTRODUCTION: Generally, speech and text data are easily read by humans, but machine-learning and statistical-modeling applications receive some data in unstructured form, so alterations to the coded input feature sets are necessary. 1 Data clustering is a technique for splitting data elements into groups so that elements within the same group have the highest similarity, while, depending on the clusters' attributes, different elements fall into other groups. The major aim of clustering techniques is to obtain centroids or cluster centers that characterize the entire cluster. Clustering techniques have been developed and classified from different perspectives, such as "density-based methods, grid-based methods, partitioning methods, and hierarchical methods." 2,3 Moreover, a data set may be categorical or numerical. The primary statistical features of numeric data are used to define the distance function between data elements, whereas categorical data is derived from qualitative and quantitative data, with descriptions obtained from counts. 4 Using a "textual virtual schematic model" (TVSM), textual data are assigned to clusters in three steps. First, unstructured data is extracted from the data source and converted into structured data. 5 Then, clustering is performed on the structured data. Finally, documents are compared to improve query accuracy. Day-to-day life generates a huge amount of unstructured text...
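A minimal Python sketch of the three-stage pipeline described above is given below, assuming NLTK for preprocessing, gensim for the CBOW word2vec embeddings, and SciPy agglomerative clustering cut at a distance threshold. The FISA-EFO optimizer and the tuned deep neural network are not reproduced; the cut-off value here is a fixed placeholder where the optimized distance would be plugged in.

```python
# Sketch: preprocessing -> word2vec (CBOW) + TF-IDF features -> hierarchical
# clustering cut at a distance threshold. The cut-off is a placeholder for the
# value the paper optimizes with FISA-EFO. Requires nltk.download("stopwords").
import re

import numpy as np
from gensim.models import Word2Vec
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.feature_extraction.text import TfidfVectorizer

stemmer = PorterStemmer()
stops = set(stopwords.words("english"))

def preprocess(doc):
    """Punctuation/stop-word removal, stemming, and tokenization."""
    tokens = re.findall(r"[a-z]+", doc.lower())
    return [stemmer.stem(t) for t in tokens if t not in stops]

def features(docs):
    """Concatenate averaged CBOW word2vec embeddings with TF-IDF vectors."""
    tokenized = [preprocess(d) for d in docs]
    w2v = Word2Vec(tokenized, vector_size=50, sg=0, min_count=1)  # sg=0 => CBOW
    emb = np.array([
        np.mean([w2v.wv[t] for t in toks], axis=0) if toks else np.zeros(50)
        for toks in tokenized
    ])
    tfidf = TfidfVectorizer(tokenizer=preprocess, lowercase=False).fit_transform(docs)
    return np.hstack([emb, tfidf.toarray()])

def cluster(docs, cutoff=1.5):
    """Agglomerative (Ward) clustering cut at `cutoff`, the quantity FISA-EFO tunes."""
    Z = linkage(features(docs), method="ward")
    return fcluster(Z, t=cutoff, criterion="distance")
```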
Ambidexterity is an important driver of organizational success to meet the future needs of fast‐changing markets. Building on the organizational ambidexterity (OA) literature, this study investigates the concept of OA from the emerging market perspective. It adopts semantic network analysis and meta‐analysis to identify the factors affecting OA in emerging markets. Semantic network analysis measures degree and eigenvector centralities to understand how the studied words are connected to OA in emerging markets. Meta‐analysis summarizes the factors affecting OA in the emerging markets and categorizes them as homogenous or heterogeneous. The results reveal the homogeneity in factors such as firm age, firm size, research and development intensity, top management team (TMT) size, environment instability, ownership, competitive intensity, risk aversion, and international experience, and the heterogeneity in factors such as innovation, firm performance, technological turbulence, new product development, slack resources, TMT social, and market orientation. Future research directions and managerial implications are discussed.
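For readers unfamiliar with the two centrality measures named above, the toy sketch below shows how degree and eigenvector centrality can be computed with networkx over a small co-occurrence network; the terms and edge weights are invented for illustration and do not come from the study.

```python
# Toy semantic network: nodes are concepts, edges mean the terms co-occur in
# the analyzed texts (the weights here are invented co-occurrence counts).
import networkx as nx

g = nx.Graph()
g.add_weighted_edges_from([
    ("ambidexterity", "innovation", 5),
    ("ambidexterity", "firm performance", 4),
    ("ambidexterity", "firm size", 2),
    ("innovation", "firm performance", 3),
    ("firm size", "firm age", 1),
])

degree = nx.degree_centrality(g)                       # share of direct links
eigen = nx.eigenvector_centrality(g, weight="weight")  # links to well-linked nodes

for term in g.nodes:
    print(f"{term:18s} degree={degree[term]:.2f} eigenvector={eigen[term]:.2f}")
```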