“…These include plagiarism (Braumoeller, 2001; Monostori, 2002; Cook, 2002; Hoad, 2003; Gilbert, 2003; Pecorari, 2003; Chen, 2004; Bao, 2004), duplicate/redundant publication (Doherty, 1996; Jefferson, 1998; Schein, 2001; Bailey, 2002; Von Elm, 2004; Mojon-Azzi, 2004; Gwilym, 2004), text/document clustering (Maderlechner, 1997; Atlam, 2003; Dobrynin, 2004; Shin, 2004; Bansal, 2004), and information retrieval (Salton, 1991; Hui, 2004; Leuski, 2004; Muresan, 2004; Chang, 2004). These studies have shown that, in general, identifying similar documents through concept matching is quite difficult.…”