Vectorization is essential for processing textual data in natural language processing applications: it enables machines to interpret textual content by converting it into meaningful numerical representations. The proposed work aims to identify unifiable news articles for multi-document summarization. A framework is introduced that identifies news articles related to top trending topics/hashtags and performs multi-document summarization of unifiable news articles on those topics, in order to capture the diversity of opinion on them. Text clustering is applied to the corpus of news articles related to each trending topic to obtain smaller unifiable groups. The effectiveness of various text vectorization methods, namely bag-of-words representations with tf-idf scores, word embeddings, and document embeddings, is investigated for clustering news articles using k-means. The paper presents a comparative analysis, in terms of purity, of the different vectorization methods on documents from the DUC 2004 benchmark dataset.
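Purity, the evaluation measure named above, can be illustrated with a minimal sketch. It uses the standard definition (each cluster is credited with its most frequent reference class, and the credited counts are divided by the number of documents). The function name `purity` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, class_labels):
    """Return clustering purity in [0, 1].

    cluster_ids  -- cluster assignment per document (e.g. from k-means)
    class_labels -- reference topic label per document
    Both names are hypothetical; the paper does not specify an API.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")

    # Count how often each reference class occurs inside each cluster.
    counts_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        counts_by_cluster[cluster][label] += 1

    # Credit each cluster with the size of its majority class.
    majority_total = sum(
        max(counts.values()) for counts in counts_by_cluster.values()
    )
    return majority_total / len(cluster_ids)


# Toy example: six documents, two clusters, two reference topics.
clusters = [0, 0, 0, 1, 1, 1]
topics = ["a", "a", "b", "b", "b", "a"]
score = purity(clusters, topics)  # 4 of 6 documents match their cluster's majority topic
```

A purity of 1.0 means every cluster contains documents from a single reference topic. Because purity tends to rise as the number of clusters grows, comparisons between vectorization methods are most meaningful at a fixed k.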
This paper presents a detailed methodology for processing AMSR-E soil moisture data to generate average soil moisture maps of any desired duration (weekly, monthly, or yearly) over a large geographical area such as a continent or subcontinent. The paper also explores the utility of the AMSR-E soil moisture product (AE_Land 3 product) for understanding soil moisture variations over the Indian subcontinent by analysing daily soil moisture data for the entire calendar year 2009. To demonstrate the methodology, a total of 730 AMSR-E daily scenes from 2009 (365 each for ascending and descending passes) were processed and analysed. The absolute values of soil moisture derived from AMSR-E do not agree well with soil moisture conditions on the ground, owing to the large variability in soil moisture within the coarse-resolution cell offered by passive sensors [1]; in general, however, AMSR-E derived soil moisture values are well explained by rainfall data and the agricultural practices adopted in different states of the Indian subcontinent. The soundness of the proposed methodology is supported by comparing the variations in AMSR-E derived soil moisture with seasonal variations and rainfall data: the soil moisture variations are in line with both the seasonal changes and the rainfall variations.
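The idea of compositing daily scenes into period averages can be sketched as follows. This is only an illustration under stated assumptions, not the paper's processing chain: each daily scene is assumed to be already gridded as a list of rows, cells without a retrieval are assumed to be marked `None` (the actual product's fill value is not given here), and the function names `average_grids` and `monthly_means` are hypothetical.

```python
from collections import defaultdict
from datetime import date


def average_grids(grids):
    """Cell-wise mean of equally shaped grids, ignoring missing (None) cells."""
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    mean_grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            values = [g[r][c] for g in grids if g[r][c] is not None]
            # A cell with no valid retrieval in the period stays missing.
            row.append(sum(values) / len(values) if values else None)
        mean_grid.append(row)
    return mean_grid


def monthly_means(dated_grids):
    """Group (date, grid) pairs by calendar month and average each group."""
    groups = defaultdict(list)
    for day, grid in dated_grids:
        groups[(day.year, day.month)].append(grid)
    return {month: average_grids(grids) for month, grids in groups.items()}


# Toy example: three 1x2 "scenes" with made-up soil moisture values.
scenes = [
    (date(2009, 1, 1), [[0.10, None]]),
    (date(2009, 1, 2), [[0.30, None]]),
    (date(2009, 2, 1), [[0.20, 0.40]]),
]
monthly = monthly_means(scenes)
```

Weekly or yearly composites follow the same pattern with a different grouping key, for example `day.isocalendar()[:2]` for ISO weeks or `day.year` for annual maps.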