2007
DOI: 10.1186/1471-2105-8-s9-s6

Extracting unrecognized gene relationships from the biomedical literature via matrix factorizations

Abstract: Background: The construction of literature-based networks of gene-gene interactions is one of the most important applications of text mining in bioinformatics. Extracting potential gene relationships from the biomedical literature may be helpful in building biological hypotheses that can be explored further experimentally. Recently, latent semantic indexing based on the singular value decomposition (LSI/SVD) has been applied to gene retrieval. However, the determination of the number of factors k used in the r…

Cited by 15 publications (10 citation statements)
References 22 publications

“…The current state of data visualization revolves around the representation of data matrices—an ensemble of data points. For example, when data is sparse and slight information loss is not an issue, a common practice is to perform Principal Components Analysis (PCA) to extract three or fewer components and then graph the resulting data points in a three-dimensional scatterplot (Wold et al, 1987; Kim et al, 2007). Other methods have been developed for even larger data sets.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…Kim et al attempted to retrieve unrecognized gene relationships by using LSI along with Non-Negative Matrix Factorization (NMF), another matrix factorization method (Kim et al, 2007). Gene retrieval was evaluated on manually created test sets based on precision and recall, showing that LSI- and NMF-based methods vastly outperformed co-occurrence methods.…”
Section: Latent Links For Literature-based Biomedical Discovery
Citation type: mentioning
confidence: 99%
“…Second, LSI's ability to reduce dimensionality allows for a better visualization of high-dimensionality points that exceed the realm of physical space. For example, LSI can be used to reduce the number of dimensions in vector space to one, two, or three so that each point is graphable in three-dimensional space (Kim et al, 2007). A major disadvantage to this method is that three dimensions is typically not an optimal value for k , so information loss will be significant.…”
Section: Visualization Of High-dimensional Data
Citation type: mentioning
confidence: 99%
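
The trade-off described in the statement above (a small k makes the points graphable but discards information) can be illustrated with a minimal sketch. It assumes a plain truncated SVD computed with NumPy on a random, hypothetical term-by-document matrix; the function name `lsi_project` and the data are illustrative only and are not taken from Kim et al. (2007).

```python
import numpy as np

def lsi_project(term_doc, k):
    """Project documents (columns of term_doc) into a k-dimensional latent space.

    Returns the document coordinates (one row per document) and the fraction
    of squared singular-value mass retained by the first k factors.
    """
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    coords = (np.diag(s[:k]) @ Vt[:k]).T            # shape: (n_docs, k)
    retained = float(np.sum(s[:k] ** 2) / np.sum(s ** 2))
    return coords, retained

# Hypothetical data: 50 terms x 20 documents (assumption, for illustration).
rng = np.random.default_rng(0)
A = rng.random((50, 20))

coords, retained = lsi_project(A, 3)   # k = 3 gives points plottable in 3-D
print(coords.shape, round(retained, 3))
```

The `retained` value gives a rough sense of how much structure survives at k = 3; on real term-by-document data a value well below 1 would reflect the information loss the statement warns about.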
“…There are many physical problems that the observations are formed by addition of non-negative components. Some of the examples include photon counting processes (Woolfe et al, 2011), gene relations (Kim et al, 2007), text mining (Park & Kim, 2006), and some computer vision applications (Lee & Seung, 1999).…”
Section: Non-negative Matrix Factorization
Citation type: mentioning
confidence: 99%
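
The idea of observations formed by adding non-negative components can be sketched as follows. This is a minimal example, assuming the standard multiplicative-update rule for the Frobenius-norm objective associated with Lee & Seung; it is not necessarily the variant used in any of the cited papers, and the function name `nmf`, its parameters, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def nmf(V, k, iters=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V as W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Frobenius error; eps guards
    against division by zero.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update component activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update component basis
    return W, H

# Synthetic data built as a sum of two non-negative components (assumption).
rng = np.random.default_rng(1)
V = rng.random((30, 2)) @ rng.random((2, 12))

W, H = nmf(V, k=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, rel_err)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what allows each column of V to be read as an additive mixture of the columns of W.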