“…These segments of scholarly documents, commonly referred to in the literature as logical structure, can be extracted (Anjewierden, 2001; Bounhas & Slimani, 2010; Burget, 2007; Councill, Giles, & Kan, 2008; Hagen, Harald, Ngen, & Petra Saskia, 2004; K.H. Lee, Choy, & Cho, 2003; Li & Ng, 2004; Luong, Nguyen, & Kan, 2010; Nguyen & Luong, 2010; Ratté, Njomgue, & Ménard, 2007; Stoffel, Spretke, Kinnemann, & Keim, 2010; Wang, Jin, Wang, Wang, & Gao, 2005; Witt et al., 2010; K. Zhang, Wu, & Li, 2006), and can be used to improve document indexing (Bounhas & Slimani, 2010), to represent the semantic content of scientific publications (Luong, Nguyen, & Kan, 2010; Ratté et al., 2007), to extract key phrases and terminologies (Bounhas & Slimani, 2010; Nguyen & Luong, 2010), and to improve document summarization (Teufel & Moens, 2002). To improve indexing of semistructured documents, for instance, terms are weighted according to their structural occurrences (that is, their positions in different segments of the document), instead of being weighted over the whole document as in flat weighting methods (Bounhas & Slimani, 2010; de Moura, Fernandes, Ribeiro‐Neto, da Silva, & Gonçalves, 2010).…”
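
The contrast between flat and structural term weighting described in the passage can be illustrated with a minimal sketch. It does not reproduce the method of any cited paper: the segment names (`title`, `abstract`, `body`), the factors in `SEGMENT_WEIGHTS`, the tokenizer, and the sample document are all assumptions chosen for illustration. Flat weighting counts a term once per occurrence anywhere in the document, while the structural variant scales each occurrence by a factor tied to the segment in which it appears.

```python
import re
from collections import Counter

# Hypothetical per-segment importance factors (illustrative values only,
# not taken from the cited papers).
SEGMENT_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def flat_weights(segments):
    """Flat weighting: count each term over the whole document."""
    counts = Counter()
    for text in segments.values():
        counts.update(tokenize(text))
    return dict(counts)


def structural_weights(segments, segment_weights=SEGMENT_WEIGHTS, default=1.0):
    """Structural weighting: scale each occurrence by its segment's factor."""
    scores = Counter()
    for name, text in segments.items():
        # Segments without a configured factor fall back to `default`.
        factor = segment_weights.get(name, default)
        for term in tokenize(text):
            scores[term] += factor
    return dict(scores)


# Toy document, already split into logical segments (assumed input format).
doc = {
    "title": "Logical structure extraction",
    "abstract": "We extract logical structure from documents.",
    "body": "Documents contain segments. Segments carry structure.",
}

flat = flat_weights(doc)
structural = structural_weights(doc)
```

In this toy document, "logical" and "segments" each occur twice, so flat weighting cannot tell them apart. Under the assumed factors, "logical" scores higher because it appears in the title and the abstract, whereas "segments" appears only in the body.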