Proceedings of the Fourth ACM Conference on Digital Libraries 1999
DOI: 10.1145/313238.313257

Scalable collection summarization and selection

Abstract: Information retrieval over the Internet increasingly requires the filtering of thousands of information sources. As the number and variety of sources increases, new ways of automatically summarizing, discovering, and selecting sources relevant to a user's query are needed. Pharos is a highly scalable distributed architecture for locating heterogeneous information sources. Its design is hierarchical, thus allowing it to scale well as the number of information sources increases. We demonstrate the feas…

Cited by 17 publications (9 citation statements)
References 15 publications
“…Methods relying on manual classification such as used by CiteLine Professional and InvisibleWeb would be impractical due to the lack of such classification over WT2g documents. Due to time constraints even methods based on automatic classification such as Pharos [5] are not evaluated here.…”
Section: Server Selection Methods (mentioning)
confidence: 99%
“…where T = (p(w|D)·|D|) / (p(w|D)·|D| + 50 + 150·cw(D)/mcw). [Footnote 8] We adapted the technique in [13] slightly so that each database is classified under exactly one category. [Footnote 9] Experiments using shrinkage together with ReDDE [27], a promising, recently proposed database selection algorithm, remain as interesting future work.…”
Section: Database Selection Algorithms (mentioning)
confidence: 99%
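
The T term quoted in the citation statement above can be evaluated directly. The following is a minimal sketch, not the citing authors' implementation. The quoted fragment does not define its symbols, so the readings used here are assumptions: p(w|D) is taken as the fraction of documents in database D that contain word w, |D| as the number of documents in D, cw(D) as the number of words in D, and mcw as the mean of cw over all databases. The function and parameter names are hypothetical.

```python
def t_component(p_w_given_d: float, db_size: float, cw: float, mcw: float) -> float:
    """Evaluate T = df / (df + 50 + 150 * cw / mcw), with df = p(w|D) * |D|.

    Assumed readings (not defined in the quoted fragment):
      p_w_given_d -- fraction of documents in D containing word w
      db_size     -- number of documents in D, written |D|
      cw          -- number of words in D, written cw(D)
      mcw         -- mean cw over all databases
    """
    if mcw <= 0:
        raise ValueError("mcw must be positive")
    if p_w_given_d < 0 or db_size < 0 or cw < 0:
        raise ValueError("inputs must be non-negative")
    # Estimated number of documents in D that contain w.
    df = p_w_given_d * db_size
    # The constants 50 and 150 are taken from the quoted formula.
    return df / (df + 50 + 150 * cw / mcw)


if __name__ == "__main__":
    # Hypothetical database: 1000 documents, 10% contain w, average word count.
    print(t_component(0.1, 1000, 200, 200))
```

Under these readings, T grows toward 1 as the estimated document frequency rises, and it is damped for databases whose word count exceeds the mean.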
“…However, there are approaches that incorporate additional information; for example, considering both textual and nontextual information [Dolin et al. 1997, 1998, 1999] or set different goals, for example, efficiency [Moffat and Zobel 1995], minimizing the cost of retrieving documents [Fuhr 1999] or accessing collections [Hawking and Thistlewaite 1999]. Craswell et al. [2000] argued that the retrieval performance at a collection should be incorporated into the collection selection step.…”
Section: Additional Considerations (mentioning)
confidence: 99%