“…Finally, a third limitation of most ETD systems is their reliance on a single, static text corpus drawn from human-maintained indexed databases, such as INSPEC, Topic Detection and Tracking (TDT) or COMPENDEX (Nowell et al, 1997; Lent et al, 1997; Swan and Jensen, 2000; Wong et al, 2000; Kumaran and Allan, 2004; Mei and Zhai, 2005; Zhang et al, 2007; Subasic and Berendt, 2010; Chen and Chundi, 2011). A closed, static corpus offers limited diversity and richness and must be refreshed periodically, which brings major drawbacks such as limited data coverage and the indexer effect (Alexa, 1997; Zweigenbaum et al, 2001; Banko and Brill, 2001; Keller and Lapata, 2003).…”