Proceedings of the 2007 SIAM International Conference on Data Mining 2007
DOI: 10.1137/1.9781611972771.33
Less is More: Compact Matrix Decomposition for Large Sparse Graphs

Abstract: Given a large sparse graph, how can we find patterns and anomalies? Several important applications can be modeled as large sparse graphs, e.g., network traffic monitoring, research citation network analysis, social network analysis, and regulatory networks in genes. Low-rank decompositions, such as SVD and CUR, are powerful techniques for revealing latent/hidden variables and associated patterns from high-dimensional data. However, those methods often ignore the sparsity property of the graph, and hence usually…

Cited by 85 publications (78 citation statements)
References 29 publications
Citing publications span 2009–2024
“…We compare LS-DCUR against CUR-L2 (Euclidean-norm based selection), CUR-SL (statistical-leverage based selection), and Greedy (a recent deterministic selection method shown to exceed various other methods [7]). To provide a fair comparison, we incorporate several extensions into the importance-sampling based methods: both CUR-L2 and CUR-SL use the extensions proposed for CMD [26], and, in both cases, we sample exactly the same number of unique rows and columns as for LS-DCUR and Greedy (double selections do not count as a selected row or column). For methods requiring computation of the top-k singular vectors (CUR-SL, Greedy), we specify a reasonable k. As setting it to the actual number of sampled rows and columns is not advisable, we follow the suggestion of [22] and over-sample k; various experimental runs show that setting k to ≈ 4/5 of the number of row and column samples provides a convenient tradeoff between run-time performance and approximation accuracy; note that LS-DCUR does not require any additional parameters apart from the number of desired rows and columns.…”
Section: Methods
confidence: 99%
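The norm-based importance sampling behind CUR-L2, combined with the "count only unique rows and columns" convention described in the statement above, can be sketched as follows. This is a minimal illustration under my own naming (the function and its signature are not from the cited papers): columns are drawn with probability proportional to their squared Euclidean norms, and sampling repeats until the target number of distinct columns is collected.

```python
import numpy as np

def sample_unique_columns(A, c, rng=None):
    """Draw columns of A with probability proportional to their
    squared Euclidean norms, repeating until c *unique* column
    indices have been collected (duplicate draws are not counted)."""
    rng = np.random.default_rng(rng)
    norms = np.sum(A * A, axis=0)          # squared column norms
    probs = norms / norms.sum()            # sampling distribution
    chosen = set()
    while len(chosen) < c:
        chosen.add(int(rng.choice(len(probs), p=probs)))
    idx = sorted(chosen)
    return idx, A[:, idx]

# Tiny example: pick 2 unique columns from a 3x4 matrix.
A = np.arange(12, dtype=float).reshape(3, 4)
idx, C = sample_unique_columns(A, 2, rng=0)
```

The same routine applied to rows of A.T gives the row selection, which is how the experiments above keep the number of unique selections identical across methods.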
“…Various extensions to this method have been proposed. For example, the approach in [26] further reduces computation time by avoiding repeated sampling of the same row (or column).…”
Section: CUR Decomposition
confidence: 99%