Classifying information sender of web documents

Kato, Yukie; Kurohashi, Sadao; Inui, Kentaro

doi:10.1108/10662240810862248

Cited by 2 publications

(1 citation statement)

References 7 publications

“…Page gathering and individual analysis modules based on natural language processing need to be automated and improved in the future. Currently, collecting Japanese pages (NICT, 2006) and sender classifications (Kato et al., 2007) are becoming automated. Other modules and data are scheduled to be updated accordingly.…”

Section: Discussion (mentioning)

confidence: 99%

Evaluation data and prototype system WISDOM for information credibility analysis

et al. 2008

Self Cite

Purpose - The purpose of this paper is to describe evaluation data and a prototype system named WISDOM used for analyzing information credibility based on natural language processing.

Design/methodology/approach - The authors started the Information Credibility Criteria project in April 2007, mainly to analyze the credibility of information (text) on the web. The project proposes to capture information credibility based on four criteria (content, sender, appearance, and social valuation) and aims to analyze and organize them logically using natural language processing based on predicate argument structure.

Findings - The evaluation data described in this paper were developed as learning and verifying data for these various analysis modules and are composed of manually annotated data based on each evaluation criterion for several pre-selected topics such as current events and medical issues. The prototype system WISDOM was developed to provide information credibility from different perspectives.

Originality/value - Users will be able to find credible information more reliably by browsing information using different evaluation criteria and conditions provided by the system.

Introduction

The development of computers and computer networks has made a variety of information available via the web. Using the web to research topics and learn new words has become a familiar daily occurrence.

However, although a multitude of information is available on the web, its quality is a mixture of good and bad. Some information is truly useful to our daily lives, and some has no basis in truth. While conventional search engines collect information, general users have difficulty assessing the credibility and reliability of the information in the search results provided by these engines.

One of the problems with conventional search engines is that the search results are based on a single criterion. For example, when looking for information on a health food
