Next-generation sequencing machines such as Illumina and Solexa can generate millions of short reads from a given genome sequence in a single run. Aligning these reads to a reference genome is a core step in next-generation sequencing data analysis, for example in genetic variation studies and genome re-sequencing. There is therefore a need for a new approach that is efficient in both memory and time when aligning this enormous number of reads to the reference genome. Existing techniques such as MAQ, Bowtie, BWA, BWBBLE, Subread, Kart, and Minimap2 require a large amount of memory for whole-reference-genome indexing and read alignment. The gapped-alignment versions of these techniques are also 20–40% slower than their respective normal versions. This paper proposes WIT, an efficient approach to reference genome indexing and read alignment that uses the Burrows–Wheeler Transform (BWT) and the Wavelet Tree (WT). It supports both exact and approximate alignment. Experiments show that WIT performs best for protein sequence indexing. For indexing the reference genome, WIT requires 0.6 × N space (where N is the size of the reference genome), whereas the existing techniques BWA, Subread, Kart, and Minimap2 require between 1.25 × N and 5 × N. Experiments also show that, even with such a small index, the alignment time of WIT is comparable to that of BWA, Subread, Kart, and Minimap2. The other alignment parameters, accuracy and confidentiality, are also shown experimentally to be better than those of Minimap2. The source code of WIT is available at http://www.algorithm-skg.com/wit/home.html.
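As a rough illustration of the kind of BWT-based exact read alignment the abstract refers to, the sketch below builds a Burrows–Wheeler Transform of a toy reference and locates a read by backward search. It is not the WIT implementation: all function names are hypothetical, the suffix array is built by naive sorting, and the `rank` query is answered by a linear scan where an index such as a wavelet tree would answer it over a compressed representation.

```python
def bwt_index(text):
    """Return the suffix array and BWT of text, after appending a '$' sentinel."""
    text += "$"  # assumption: '$' sorts before every symbol in the alphabet
    sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive construction
    bwt = "".join(text[i - 1] for i in sa)  # symbol preceding each sorted suffix
    return sa, bwt


def count_table(bwt):
    """Map each symbol c to the number of symbols in the text smaller than c."""
    table, total = {}, 0
    for c in sorted(set(bwt)):
        table[c] = total
        total += bwt.count(c)
    return table


def rank(bwt, c, i):
    """Occurrences of c in bwt[:i]; a wavelet tree would avoid this linear scan."""
    return bwt.count(c, 0, i)


def exact_match(pattern, sa, bwt, table):
    """Return sorted start positions of pattern in the text via backward search."""
    lo, hi = 0, len(bwt)  # current suffix-array interval [lo, hi)
    for c in reversed(pattern):
        if c not in table:
            return []
        lo = table[c] + rank(bwt, c, lo)
        hi = table[c] + rank(bwt, c, hi)
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


reference = "ACGTACGTTACG"  # toy reference, not real genomic data
sa, bwt = bwt_index(reference)
table = count_table(bwt)
print(exact_match("ACG", sa, bwt, table))  # → [0, 4, 9]
```

The search touches the index once per read symbol, so its cost depends on the read length and the speed of `rank`, which is why a compact rank structure matters for index size.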
In this era of growing digital media, the volume of text data from various sources increases day by day and may include entire documents, books, articles, etc. Much of this text may be insignificant or redundant, and some of it may carry no meaningful information. We therefore need techniques and tools that can automatically summarize enormous amounts of text data and help us decide whether the text is useful. Text summarization is a process that generates a brief version of a document in the form of a meaningful summary. It can be classified into abstractive and extractive text summarization. Abstractive text summarization generates an abstract-style summary of the given document. Extractive text summarization builds a summary from the crucial sentences of the given document. Many authors have proposed techniques for both types of text summarization. This paper presents a survey of graph-based techniques for extractive text summarization, focusing specifically on unsupervised and supervised techniques. It reviews recent work and advances in these techniques and presents the strengths and weaknesses of previous works in tabular form. Finally, it discusses the measures used to evaluate summaries.
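To make the idea of graph-based extractive summarization concrete, here is a minimal unsupervised sketch in the spirit of TextRank-style methods. It is an illustration under assumptions, not a technique taken from the survey: sentences are nodes, edge weights are Jaccard word overlap (an assumed similarity), scores come from a PageRank-like power iteration, and all function names are hypothetical.

```python
import re


def split_sentences(text):
    """Split on whitespace that follows ., ! or ? (a deliberately crude rule)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Jaccard overlap between the lowercased word sets of two sentences."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank-style power iteration."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out = [sum(row) for row in weights]  # total edge weight leaving each node
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                scores[j] * weights[j][i] / out[j] for j in range(n) if out[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


doc = (
    "Graphs model sentences as nodes. Cats sleep all day. "
    "Graphs connect similar sentences with weighted edges. "
    "Ranking nodes in graphs selects key sentences."
)
print(summarize(doc, k=3))
```

Because the summary is assembled from sentences copied out of the document, this is extractive rather than abstractive; a sentence that shares no words with the rest receives only the baseline score and is dropped first.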