The context surrounding hyperlinked semi-structured documents, externally in the form of citations and internally in the form of hierarchical structure, contains a wealth of useful but implicit evidence about a document's relevance. These rich sources of information should be exploited as contextual evidence. This paper proposes several methods of accumulating evidence from the context and measures the effect of contextual evidence on retrieval effectiveness for document and focused retrieval of hyperlinked semi-structured documents. We propose a re-weighting model to contextualize (a) evidence from citations in a query-independent and query-dependent fashion (based on Markovian random walks) and (b) evidence accumulated from the internal tree structure of documents. The in-links and out-links of a node in the citation graph are used as external context, while the internal document structure provides internal, within-document context. We hypothesize that documents in a good context (having strong contextual evidence) should be good candidates to be relevant to the posed query, and vice versa. We tested several variants of contextualization and verified notable improvements in comparison with the baseline system and gold standards in the retrieval of full documents and focused elements.
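The abstract does not state the re-weighting formula, so the following is only a minimal sketch of the general idea, assuming a simple linear interpolation between a document's own retrieval score and the mean score of its citation neighbors (in-links and out-links). The function name `contextualize`, the `weight` parameter, and the interpolation itself are illustrative assumptions, not the paper's model, which is based on Markovian random walks.

```python
def contextualize(scores, links, weight=0.3):
    """Re-weight retrieval scores with evidence from the citation context.

    scores: dict mapping document id -> baseline retrieval score
    links:  iterable of (source, target) citation pairs
    weight: share of the final score taken from the context (assumed value)
    """
    # Collect each document's external context: documents it cites
    # (out-links) and documents that cite it (in-links).
    neighbors = {doc: set() for doc in scores}
    for src, dst in links:
        if src in neighbors and dst in neighbors and src != dst:
            neighbors[src].add(dst)
            neighbors[dst].add(src)

    result = {}
    for doc, own in scores.items():
        context = neighbors[doc]
        if context:
            context_score = sum(scores[n] for n in context) / len(context)
            result[doc] = (1.0 - weight) * own + weight * context_score
        else:
            # No contextual evidence: keep the baseline score unchanged.
            result[doc] = own
    return result
```

Under this assumption, a weakly scored document that is linked to highly scored documents is promoted, and an isolated document keeps its baseline score.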
Extrapolations are techniques from linear algebra that require little additional infrastructure to incorporate into existing query-dependent Link Analysis Ranking (LAR) algorithms. Extrapolation in LAR settings relies on prior knowledge of the iterative process that created the existing data points (iterates) to compute a new, improved data point, which periodically leads to the desired solution faster than the original method. In this study, the author presents novel approaches that use extrapolation techniques to speed up the convergence of query-dependent iterative link-analysis ranking methods, in which hyperlink structure is used to determine the relative importance of a document in the network of interconnections. The author uses the framework defined by HITS and SALSA and proposes different extrapolation techniques to improve these algorithms for faster ranking. With the proposed approaches, it is possible to accelerate the iterative ranking algorithms by reducing the number of iterations and increasing the rate of convergence.
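The abstract does not say which extrapolation techniques are used. As an illustration of the general mechanism only, the sketch below applies componentwise Aitken delta-squared extrapolation (one well-known choice, assumed here) to the HITS authority iteration: every few steps, the last three iterates are combined into an estimate of the limit, and the iteration continues from that estimate. The names `hits_authority` and `extrapolate_every`, the L1 normalization, and the clamping of negative estimates are assumptions made for this example.

```python
def _normalize(vec, fallback):
    """Scale vec to unit L1 norm; return a copy of fallback if vec is all zeros."""
    total = sum(abs(v) for v in vec)
    return [v / total for v in vec] if total > 0 else list(fallback)


def _hits_step(adj, auth):
    """One HITS update: hubs = A * auth, then auth = A^T * hubs, normalized."""
    n = len(adj)
    hubs = [sum(adj[i][j] * auth[j] for j in range(n)) for i in range(n)]
    new_auth = [sum(adj[i][j] * hubs[i] for i in range(n)) for j in range(n)]
    return _normalize(new_auth, fallback=auth)


def _aitken(x0, x1, x2, eps=1e-12):
    """Componentwise Aitken delta-squared estimate from three successive iterates."""
    estimate = []
    for a, b, c in zip(x0, x1, x2):
        denom = c - 2.0 * b + a
        # Fall back to the latest iterate when the denominator is near zero.
        value = c if abs(denom) < eps else a - (b - a) ** 2 / denom
        estimate.append(max(value, 0.0))  # authority scores are non-negative
    return _normalize(estimate, fallback=x2)


def hits_authority(adj, tol=1e-10, max_iter=1000, extrapolate_every=None):
    """Return (authority vector, iterations used) for adjacency matrix adj.

    adj[i][j] is 1 if page i links to page j. If extrapolate_every is set,
    an Aitken step is applied every that many iterations.
    """
    n = len(adj)
    auth = [1.0 / n] * n
    recent = [auth]
    for it in range(1, max_iter + 1):
        new_auth = _hits_step(adj, auth)
        change = sum(abs(p - q) for p, q in zip(new_auth, auth))
        auth = new_auth
        if change < tol:
            return auth, it
        recent = (recent + [auth])[-3:]
        if extrapolate_every and it % extrapolate_every == 0 and len(recent) == 3:
            auth = _aitken(*recent)
            recent = [auth]  # restart the history from the extrapolated point
    return auth, max_iter
```

Calling `hits_authority(adj)` and `hits_authority(adj, extrapolate_every=5)` on the same graph should yield the same ranking; whether the extrapolated run needs fewer iterations depends on the graph and on how often the extrapolation is applied.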
The textual context of an element, viewed structurally, contains traces of evidence. Utilizing this context in scoring is called contextualization. In this study we hypothesize that the context of an XML element, originating from its preceding and following elements in the sequential ordering of a document, improves the quality of retrieval. In the tree form of the document's structure, kinship contextualization means contextualization based on the horizontal and vertical elements in the kinship tree, that is, elements ranging from a closer to a wider structural kinship. We have tested several variants of kinship contextualization and verified notable improvements in comparison with the baseline system and gold standards in the retrieval of focused elements.
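The abstract gives no scoring formula, so the sketch below is one possible reading of kinship contextualization, stated under explicit assumptions: the vertical context is an element's parent, the horizontal context is its immediately preceding and following siblings, and both are mixed into the element's own score with fixed weights. The function name `kinship_contextualize` and the parameters `vertical` and `horizontal` are hypothetical.

```python
def kinship_contextualize(scores, parent, vertical=0.2, horizontal=0.2):
    """Mix each element's score with scores of its structural kin.

    scores: dict mapping element id -> baseline score
    parent: dict mapping element id -> parent id (None for the root),
            listed in document order so sibling order is preserved
    """
    # Rebuild ordered child lists from the parent map.
    children = {}
    for elem, par in parent.items():
        children.setdefault(par, []).append(elem)

    result = {}
    for elem, own in scores.items():
        par = parent[elem]
        context = 0.0
        used = 0.0

        # Vertical kinship: the parent element, if any.
        if par is not None:
            context += vertical * scores[par]
            used += vertical

        # Horizontal kinship: the preceding and following siblings.
        siblings = children[par]
        i = siblings.index(elem)
        adjacent = [scores[s] for s in siblings[i - 1:i] + siblings[i + 1:i + 2]]
        if adjacent:
            context += horizontal * sum(adjacent) / len(adjacent)
            used += horizontal

        # The element keeps whatever weight its context did not use.
        result[elem] = (1.0 - used) * own + context
    return result
```

Widening the kinship (grandparents, more distant siblings) would follow the same pattern with additional weighted terms.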