Arabic Text Copy Detection using Full, Reduced and Unique Syntactical Structures

Elhadi, Mohamed

doi:10.5120/ijca2016912088

Cited by 1 publication

(1 citation statement)

References 19 publications

“…The combined use of syntactical POS tagging and text processing methods for the purpose of text similarity calculations and its applications was used in this recent work [72]-[77]. It was based on the intuition that similar (exact) documents would have similar (exact) syntactical structures.…”

Section: Related Work (mentioning)

confidence: 99%

Extractive Summarization Using Structural Syntax, Term Expansion and Refinement

Elhadi

2017

IJIS

This paper investigates a procedure, and reports on experiments performed, to study the utility of combining a structural property of a text's sentences with term expansion using WordNet [1] and a local thesaurus [2] to select the most appropriate extractive summary for a particular document. Sentences were tagged and normalized, then subjected to the Longest Common Subsequence (LCS) algorithm [3] [4] to select the most similar subset of sentences. Similarity was calculated from the LCS of pairs of sentences that make up the document. A normalized score was calculated and used to rank sentences. A selected top subset of the most similar sentences was then tokenized to produce a set of important keywords or terms. These terms were further expanded into two subsets using 1) WordNet and 2) a local electronic dictionary/thesaurus. The three sets obtained (the original and the two expanded sets) were then recycled to further refine and expand the list of selected sentences from the original document. The process was repeated a number of times in order to find the best representative set of sentences. A final set of the top (best) sentences was selected as candidate sentences for summarization. To verify the utility of the procedure, a number of experiments were conducted using an email corpus. The results were compared to those produced by human annotators as well as to results produced using a basic sentence similarity calculation method. The results were very encouraging and compared well to those of human annotators and Jaccard sentence similarity.
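
The LCS-based ranking step described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the normalization formula, so dividing the LCS length by the longer sequence's length is an assumption, as is averaging each sentence's score over all other sentences. The function names (`lcs_length`, `normalized_lcs`, `rank_sentences`) and the sample POS tag sequences are hypothetical.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(a, b):
    """LCS length scaled to [0, 1].

    Assumption: normalize by the longer sequence's length; the source
    abstract does not state which normalization was used.
    """
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def rank_sentences(tagged):
    """Rank sentence indices by mean normalized LCS against every other sentence.

    `tagged` is a list of POS tag sequences, one per sentence.
    Assumption: the mean over all pairs is used as the ranking score.
    """
    totals = [0.0] * len(tagged)
    for i, j in combinations(range(len(tagged)), 2):
        score = normalized_lcs(tagged[i], tagged[j])
        totals[i] += score
        totals[j] += score
    others = max(len(tagged) - 1, 1)
    scores = [t / others for t in totals]
    order = sorted(range(len(tagged)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical tag sequences for three sentences.
tagged = [["DT", "NN", "VB"], ["DT", "NN", "VB", "RB"], ["IN", "PRP"]]
order, scores = rank_sentences(tagged)
```

In this sketch the two structurally similar sentences score higher than the third, so they would be the candidates passed on to the tokenization and term-expansion steps.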