2007
DOI: 10.1504/ijmso.2007.016807
Tracking and modelling information diffusion across interactive online media

Abstract: Information spreads rapidly across Web sites, Web logs and online forums. This paper describes the research framework of the IDIOM Project (Information Diffusion across Interactive Online Media), which analyzes this process by identifying redundant content elements, mapping them to an ontological knowledge structure, and tracking their temporal and geographic distribution. Linguists define "idiom" as an expression whose meaning is different from the literal meanings of its component words. Similarly, the stu…


Cited by 7 publications (7 citation statements)
References 54 publications
“…The simulations use two weekly snapshots from the IDIOM Media Watch on Climate Change (Hubmann-Haidvogel et al, 2009;Scharl et al, 2007) database comprising approximately one million documents. The IDIOM Media Watch on Climate Change's media corpus draws upon a list of 156 news media sites from five English-speaking countries (Liu et al, 2005).…”
Section: Discussion
confidence: 99%
“…Large-scale Semantic Web projects such as the IDIOM Media Watch on Climate Change (Hubmann-Haidvogel et al, 2009;Scharl et al, 2007), which process hundreds of thousands of pages a week, demonstrate the importance of these guidelines. Querying GeoNames to geo-tag all the documents mirrored by IDIOM's architecture would add days of processing time.…”
Section: Introduction
confidence: 99%
“…Before the computation begins, the raw textual data to be analyzed is gathered via a web crawler, then converted, annotated, and stored in the content repositories, following previous research [16], [32]. We draw on our experience with webLyzard, an established and scalable media monitoring and Web intelligence platform (www.weblyzard.com), to generate the document keyword relevance table (1A) and the document word frequency table (1B).…”
Section: Document-Term Matrix Generation
confidence: 99%
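The document word frequency table (1B) described above can be sketched in a few lines. The function and data below are illustrative, not the webLyzard implementation: each crawled document is tokenized and its per-document token counts are recorded.

```python
from collections import Counter
import re

def word_frequency_table(documents):
    """Build a per-document word-frequency table (cf. table 1B):
    maps each document id to a Counter of lower-cased tokens."""
    table = {}
    for doc_id, text in documents.items():
        tokens = re.findall(r"[a-z]+", text.lower())
        table[doc_id] = Counter(tokens)
    return table

docs = {
    "d1": "Climate change coverage spreads across news media.",
    "d2": "Media monitoring tracks climate news weekly.",
}
freq = word_frequency_table(docs)
print(freq["d1"]["climate"])  # 1
```

A keyword relevance table (1A) would typically be derived from such raw counts, e.g. by weighting them against corpus-wide frequencies.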
“…A chi-square test of significance with Yates' correction determines over-represented terms. Term co-occurrence analysis, based on a pattern-matching algorithm, together with trigger phrases based on regular expressions, identifies frequently appearing text fragments within the same sentences and within the same documents [16], [32]. Redundant singular and plural noun forms and synonyms in the resulting list of labels are removed using a combination of regular expression queries and WordNet library lookups.…”
Section: Landscape Creation and Peak Labeling
confidence: 99%
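The chi-square test with Yates' continuity correction mentioned above can be sketched for a 2x2 contingency table that compares a term's frequency in a target corpus against a reference corpus. The function name and the example counts are illustrative, not taken from the paper:

```python
def yates_chi_square(term_count, corpus_size, ref_count, ref_size):
    """2x2 chi-square statistic with Yates' continuity correction:
    gauges whether a term is over-represented in the target corpus
    relative to a reference corpus."""
    a, b = term_count, corpus_size - term_count   # target: term / other
    c, d = ref_count, ref_size - ref_count        # reference: term / other
    n = a + b + c + d
    # Yates' correction subtracts n/2 from |ad - bc|, floored at zero.
    num = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den if den else 0.0

# A term seen 30 times in 1,000 target documents but only 10 times in
# 2,000 reference documents yields a statistic well above the 3.84
# critical value for p < 0.05 with one degree of freedom.
print(yates_chi_square(30, 1000, 10, 2000) > 3.84)  # True
```

Terms whose statistic exceeds the chosen critical value would then be kept as candidate labels.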
“…Large-scale Semantic Web projects, like the IDIOM media watch on climate change [11], process hundreds of thousands of pages a week. Querying GeoNames to geo-tag that many documents would add days of processing time to the IDIOM architecture.…”
Section: Cost Functions
confidence: 99%
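The cost argument above suggests avoiding one remote gazetteer query per document. A minimal sketch of one common mitigation is to memoize lookups so each distinct place name is resolved only once; `lookup_geonames` below is a hypothetical stand-in for the actual web-service call, and the tiny in-memory gazetteer is invented for illustration:

```python
import functools

def lookup_geonames(place_name):
    # Hypothetical stand-in for a remote GeoNames query; in practice
    # this would be an HTTP call with per-request network latency.
    gazetteer = {"Vienna": (48.21, 16.37), "London": (51.51, -0.13)}
    return gazetteer.get(place_name)

@functools.lru_cache(maxsize=100_000)
def cached_lookup(place_name):
    # Each distinct place name hits the service once; repeated mentions
    # across large document collections are served from memory.
    return lookup_geonames(place_name)
```

Since place-name mentions in news corpora are highly repetitive, such caching can cut the number of remote queries by orders of magnitude, though it does not address the initial cold-cache cost.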