2010
DOI: 10.1016/j.jbi.2009.09.003

Reflective Random Indexing and indirect inference: A scalable method for discovery of implicit connections

Abstract: The discovery of implicit connections between terms that do not occur together in any scientific document underlies the model of literature-based knowledge discovery first proposed by Swanson. Corpus-derived statistical models of semantic distance such as Latent Semantic Analysis (LSA) have been evaluated previously as methods for the discovery of such implicit connections. However, LSA in particular is dependent on a computationally demanding method of dimension reduction as a means to obtain meaningful indir…

Citations: cited by 83 publications (92 citation statements)
References: 20 publications
“…Unlike methods that first construct a VSM at its original high dimension and conduct a dimensionality reduction afterwards, a category of RP-based methods, such as TopSig and RI as well as its variants, e.g. [10], avoid the construction of the original high-dimensional VSM. Instead, using the distributive property of matrix multiplication, these methods combine the construction of a vector space and the dimensionality reduction process (i.e.…”
Section: Dimension Reduction (mentioning)
confidence: 99%
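
The point made in the statement above, that Random Indexing (RI) folds the projection into the construction of the vector space, can be illustrated with a minimal sketch. It is not taken from the cited papers: the toy corpus, the function names, and the parameters `dim` and `nonzeros` are all assumptions chosen for illustration. Each document receives a sparse random index vector, and a term vector is the sum of the index vectors of the documents containing that term. Because matrix multiplication distributes over addition, this incremental sum should equal the explicit term-document count matrix multiplied by the random projection matrix, which the last lines check.

```python
import numpy as np

def random_index_vectors(n, dim, nonzeros, rng):
    """Sparse ternary index vectors: a few +1/-1 entries, zeros elsewhere."""
    vectors = np.zeros((n, dim))
    for row in vectors:  # each row is a view, so assignment edits `vectors`
        positions = rng.choice(dim, size=nonzeros, replace=False)
        row[positions] = rng.choice([-1.0, 1.0], size=nonzeros)
    return vectors

def random_indexing(docs, dim=64, nonzeros=4, seed=0):
    """Build reduced term vectors without forming the term-document matrix."""
    rng = np.random.default_rng(seed)
    vocab = sorted({term for doc in docs for term in doc})
    term_id = {term: i for i, term in enumerate(vocab)}
    doc_index = random_index_vectors(len(docs), dim, nonzeros, rng)
    term_vectors = np.zeros((len(vocab), dim))
    for d, doc in enumerate(docs):
        for term in doc:
            # One occurrence of `term` in document d adds d's index vector.
            term_vectors[term_id[term]] += doc_index[d]
    return vocab, doc_index, term_vectors

# Hypothetical toy corpus: each document is a list of tokens.
docs = [["alpha", "beta", "beta"], ["beta", "gamma"], ["gamma", "delta"]]
vocab, doc_index, term_vectors = random_indexing(docs)

# Explicit route, for comparison only: full count matrix, then projection.
counts = np.zeros((len(vocab), len(docs)))
for d, doc in enumerate(docs):
    for term in doc:
        counts[vocab.index(term), d] += 1
projected = counts @ doc_index

print(np.allclose(term_vectors, projected))
```

In practice only the incremental route would be run; the explicit count matrix is built here solely to show that both routes give the same reduced vectors.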
“…Examples of such methods are Latent Semantic Analysis (LSA) and Random Indexing (RI). The latter is considered more scalable and is used to discover implicit connections from large corpora such as in [19]. However, most of distributional measures are calculated based on text analysis and mining the relationships based on the distribution of words in text.…”
Section: State Of the Art (mentioning)
confidence: 99%
“…The assumption behind this and other statistical semantics methods is that words which appear in the similar context (with the same set of other words) are synonyms. Synonyms tend not to co-occur with one another directly, so indirect inference is required to draw associations between words used to express the same idea [19]. This method has been shown to approximate human performance in many cognitive tasks such as the Test of English as a Foreign Language (TOEFL) synonym test, the grading of content-based essays and the categorisation of groups of concepts (see [19]).…”
Section: Structure-based Statistical Semantics Similarity (mentioning)
confidence: 99%
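
Indirect inference, as described in the statement above, can be illustrated with a small sketch. This is an assumption-laden toy, not the procedure of the paper under discussion: the corpus, the names `term_a`, `bridge`, and `term_c`, the dimensionality, and the omission of vector normalisation and term weighting are all choices made here for brevity. Two terms never share a document but both co-occur with a bridging term. After one pass of document-based Random Indexing their vectors are nearly unrelated; after a second, reflective-style pass, in which document vectors are rebuilt from the term vectors and the term vectors are rebuilt from those, the two terms become similar.

```python
import numpy as np

def index_vectors(n, dim, nonzeros, rng):
    """Sparse ternary random vectors with a few +1/-1 entries."""
    vectors = np.zeros((n, dim))
    for row in vectors:
        positions = rng.choice(dim, size=nonzeros, replace=False)
        row[positions] = rng.choice([-1.0, 1.0], size=nonzeros)
    return vectors

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def terms_from_docs(docs, doc_vectors, term_id, dim):
    """Term vector = sum of the vectors of the documents containing the term."""
    term_vectors = np.zeros((len(term_id), dim))
    for d, doc in enumerate(docs):
        for term in doc:
            term_vectors[term_id[term]] += doc_vectors[d]
    return term_vectors

def docs_from_terms(docs, term_vectors, term_id, dim):
    """Document vector = sum of the vectors of the terms it contains."""
    doc_vectors = np.zeros((len(docs), dim))
    for d, doc in enumerate(docs):
        for term in doc:
            doc_vectors[d] += term_vectors[term_id[term]]
    return doc_vectors

# Hypothetical corpus: term_a and term_c never appear in the same document.
docs = [["term_a", "bridge"], ["bridge", "term_c"]]
dim = 512
rng = np.random.default_rng(0)
vocab = sorted({term for doc in docs for term in doc})
term_id = {term: i for i, term in enumerate(vocab)}

# Pass 1: term vectors built directly from random document index vectors.
doc_index = index_vectors(len(docs), dim, 8, rng)
terms_1 = terms_from_docs(docs, doc_index, term_id, dim)
before = cosine(terms_1[term_id["term_a"]], terms_1[term_id["term_c"]])

# Pass 2 (reflective step): rebuild documents from terms, then terms again.
doc_vectors = docs_from_terms(docs, terms_1, term_id, dim)
terms_2 = terms_from_docs(docs, doc_vectors, term_id, dim)
after = cosine(terms_2[term_id["term_a"]], terms_2[term_id["term_c"]])

print(f"similarity before: {before:.2f}, after: {after:.2f}")
```

The second pass lets information about the bridging term flow into both document vectors, so the two terms that never co-occur end up sharing components. A realistic implementation would normally normalise vectors between passes and apply term weighting, both of which are left out of this sketch.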
“…This ability has been termed indirect inference and it has been argued that it is essential to LSA's human-like performance on a number of cognitive tasks [5]. Indirect inference is also a fundamental concern of the field of Literature-based Discovery (LBD), which aims to promote scientific discovery by identifying meaningful connections between terms, and concepts, in the scientific literature that have not yet occurred together in any published document [6], and several authors have explored the ability of distributional models to facilitate discoveries of this nature [7][8][9]. A limitation of the use of these models for LBD is that they capture general relatedness between terms or concepts only, without encoding the nature of the relationships concerned.…”
Section: Introduction (mentioning)
confidence: 99%