1995
DOI: 10.1137/1037127
Using Linear Algebra for Intelligent Information Retrieval

Abstract: Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users' requests and those in or assigned to documents in a database. Because of the tremendous diversity in the words people use to describe the same document, lexical methods are necessarily incomplete and imprecise. Using the singular value decomposition (SVD), one can take advantage of the implicit higher-order structure in the association of terms with documents by determ…
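
The approach sketched in the abstract (latent semantic indexing) factors a term-by-document matrix with the SVD and retrieves in the reduced space. A minimal NumPy sketch of that idea follows; the toy vocabulary, documents, and query are invented for illustration and are not data from the paper.

```python
import numpy as np

# Toy term-by-document matrix A (terms x documents); entries are raw term counts.
# The vocabulary and documents are invented purely for illustration.
terms = ["matrix", "algebra", "retrieval", "query", "music"]
A = np.array([
    [2.0, 1.0, 0.0, 0.0],   # matrix
    [1.0, 2.0, 0.0, 0.0],   # algebra
    [0.0, 1.0, 2.0, 1.0],   # retrieval
    [0.0, 0.0, 1.0, 2.0],   # query
    [0.0, 0.0, 0.0, 1.0],   # music
])

# SVD of A; keep the k largest singular triplets (the implicit higher-order structure).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# In the reduced space, document j is represented by column j of Vtk
# (since A is approximately Uk @ diag(sk) @ Vtk).
doc_vecs = Vtk

# A query vector over the same terms is "folded in": q_k = diag(1/sk) @ Uk.T @ q.
q = np.zeros(len(terms))
q[terms.index("algebra")] = 1.0
q_k = (Uk.T @ q) / sk

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Rank documents by cosine similarity to the query in the k-dimensional space.
scores = sorted(((cosine(q_k, doc_vecs[:, j]), j) for j in range(A.shape[1])), reverse=True)
print(scores)
```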

Year Published: 1996–2017

Cited by 1,092 publications (757 citation statements)
References: 11 publications
“…The root is the part of the word that uniquely identifies the lexeme. This operation reduces the amount of data to be analyzed [21].…”
Section: Text Pre-processing (mentioning)
confidence: 99%
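
A toy illustration of the pre-processing this statement describes: reducing inflected word forms to a shared root shrinks the vocabulary and hence the data to analyze. The suffix list and example tokens below are assumptions made for illustration, not the stemmer used in the cited work.

```python
# Crude suffix-stripping "stemmer": maps inflected forms to a shared root so the
# vocabulary (and hence the data to analyze) shrinks. Real systems would use a
# proper algorithm such as Porter's; this is only a sketch.
SUFFIXES = ("ings", "ing", "edly", "ed", "es", "s")

def crude_stem(word: str) -> str:
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[: -len(suf)]
    return word

tokens = "indexing indexes indexed retrieval retrieving retrieved".split()
stems = [crude_stem(t) for t in tokens]
print(stems)                                    # ['index', 'index', 'index', 'retrieval', 'retriev', 'retriev']
print(len(set(tokens)), "->", len(set(stems)))  # 6 distinct forms -> 3 distinct roots
```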
“…Assuming the cluster-quality threshold q* = 80%, the appropriate cluster number is 2, because … There are some efficient algorithms which can compute the SVD of a large sparse matrix very quickly [15]. To save time further, we can run the SVD only on the top-n items in the search results returned by search engines, then "fold in" the rest of the documents incrementally [1]. Because search engines usually place high-quality documents at the top of the result list, this approximation would not seriously hurt the clustering quality.…”
Section: Following the Idea of Latent Semantic Indexing (LSI) (mentioning)
confidence: 99%
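
A rough sketch of the strategy this statement describes: run a truncated SVD on only the top-n documents, then fold the remaining documents into the same k-dimensional space. The matrix sizes, density, and the choice of scipy.sparse.linalg.svds are assumptions made for illustration; [1] and [15] are the citing paper's own references.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Fake sparse term-by-document matrix: 1000 terms x 200 documents (illustrative sizes).
A = sparse_random(1000, 200, density=0.02, random_state=0, format="csc")

top_n, k = 50, 5
A_top = A[:, :top_n]                 # pretend the first top_n columns are the top-ranked results

# Truncated SVD on the top-n documents only (svds handles large sparse matrices).
U, s, Vt = svds(A_top, k=k)
order = np.argsort(s)[::-1]          # svds does not guarantee descending order of singular values
U, s, Vt = U[:, order], s[order], Vt[order, :]

# Fold in the remaining documents: d_k = diag(1/s) @ U.T @ d, done for all columns at once.
A_rest = A[:, top_n:].toarray()
rest_coords = (U.T @ A_rest) / s[:, None]    # shape (k, n_docs - top_n)

print(Vt.shape, rest_coords.shape)   # both sets of documents now live in the same k-dim space
```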
“…One way is to simply specify the first k or last n − k singular triplets (σ_i, u_i, v_i) needed to capture the most relevant information in the data matrix A for the particular application. This approach is used, for example, in information retrieval [4]. A difficult aspect of this approach is that ad hoc procedures are often used to choose k.…”
Section: Numerical Rank and Singular Subspaces (mentioning)
confidence: 99%
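
One common (and, as the statement notes, ad hoc) way to choose k is to keep enough of the largest singular triplets to capture a fixed fraction of the total squared singular values. A small sketch of that heuristic; the 90% threshold and the synthetic matrix are assumptions, not choices from the cited source.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data matrix with rapidly decaying singular values.
A = rng.standard_normal((300, 80)) @ np.diag(np.geomspace(1.0, 1e-3, 80))

# Full SVD; numpy returns singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Choose k so the first k triplets (sigma_i, u_i, v_i) capture >= 90% of sum(sigma_i^2).
energy = np.cumsum(s**2) / np.sum(s**2)
k = int(np.searchsorted(energy, 0.90) + 1)

A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # best rank-k approximation of A
print(k, np.linalg.norm(A - A_k) / np.linalg.norm(A))
```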
“…Certain applications give rise to low-rank matrices A in which k ≪ n. For instance, low-rank matrices arise in information retrieval using latent semantic indexing (LSI) [4], where the elements of the m × n matrix A provide an incomplete connection between n documents which define the database, and m key words pertaining to the database. The parameter k is typically 0.1% of min(m, n), thus relatively few factors are adequate for the LSI approach.…”
Section: Low-rank Algorithms (mentioning)
confidence: 99%
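
To make the "k is typically 0.1% of min(m, n)" remark concrete, a back-of-the-envelope sketch with hypothetical LSI-scale dimensions, comparing the storage of A against its rank-k factors:

```python
import math

# Hypothetical LSI-scale term-by-document matrix: m terms x n documents.
m, n = 200_000, 500_000
k = math.ceil(0.001 * min(m, n))          # "0.1% of min(m, n)"  ->  k = 200

dense_entries  = m * n                    # storing A explicitly
factor_entries = k * (m + n + 1)          # U_k (m x k), V_k (n x k), k singular values

print(f"k = {k}")
print(f"A:      {dense_entries:,} entries")
print(f"rank-k: {factor_entries:,} entries")
print(f"ratio:  {dense_entries / factor_entries:.0f}x smaller")
```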