2008
DOI: 10.1109/tkde.2007.190624

A Statistical Language Modeling Approach to Online Deception Detection

Abstract: Online deception is disrupting our daily life, organizational process, and even national security. Existing approaches to online deception detection follow a traditional paradigm by using a set of cues as antecedents for deception detection, which may be hindered by ineffective cue identification. Motivated by the strength of statistical language models (SLMs) in capturing the dependency of words in text without explicit feature extraction, we developed SLMs to detect online deception. We also addressed the da…

Cited by 84 publications (24 citation statements)
References 23 publications
“…We used 4 phishing dataset files as presented in Table 3. These phishing datasets have been used in phishing detection research including work by [3], [4], [5], [6], [9], [10], [12], and [16]. In order to provide non-phishing datasets, we used the SpamAssassin Project [17] from the easy ham directory.…”
Section: Methods (mentioning)
confidence: 99%
“…Features from the three approaches just introduced are used to train Naïve Bayes and Support Vector Machine classifiers, both of which have performed well in related work (Jindal and Liu, 2008; Mihalcea and Strapparava, 2009; Zhou et al, 2008).…”
Section: Classifiers (mentioning)
confidence: 97%
“…Under (2), both the NB classifier used by Mihalcea and Strapparava (2009) and the language model classifier used by Zhou et al (2008) are equivalent. Thus, following Zhou et al (2008), we use the SRI Language Modeling Toolkit (Stolcke, 2002) to estimate individual language models, Pr( x | y = c), for truthful and deceptive opinions.…”
Section: Classifiers (mentioning)
confidence: 99%
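The statement above describes classifying a document by comparing class-conditional language models, Pr(x | y = c). A minimal sketch of that idea follows, assuming unigram models with add-one smoothing and equal class priors. It does not use the SRI Language Modeling Toolkit, and the function names (`train_unigram_lm`, `log_prob`, `classify`) and the toy sentences are hypothetical, invented only for illustration.

```python
import math
from collections import Counter


def train_unigram_lm(docs):
    """Count token occurrences over a list of tokenized documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc)
    return counts


def log_prob(doc, counts, vocab_size, alpha=1.0):
    """Log Pr(doc | class) under a unigram model with add-alpha smoothing."""
    denom = sum(counts.values()) + alpha * vocab_size
    return sum(math.log((counts[tok] + alpha) / denom) for tok in doc)


def classify(doc, models, vocab_size):
    """Return the class whose language model gives the document the highest
    log-likelihood (equal class priors are assumed, so the prior is omitted)."""
    return max(models, key=lambda c: log_prob(doc, models[c], vocab_size))


# Toy training data, invented for illustration only.
truthful = [["the", "room", "was", "clean"], ["staff", "was", "friendly"]]
deceptive = [["my", "husband", "and", "i", "loved", "it"],
             ["i", "loved", "my", "stay"]]

models = {
    "truthful": train_unigram_lm(truthful),
    "deceptive": train_unigram_lm(deceptive),
}

# Shared vocabulary across both classes, plus one slot for unseen tokens.
vocab = {tok for docs in (truthful, deceptive) for doc in docs for tok in doc}
vocab_size = len(vocab) + 1

label_a = classify(["i", "loved", "my", "stay"], models, vocab_size)
label_b = classify(["the", "room", "was", "clean"], models, vocab_size)
```

With unigram models and a shared smoothing scheme, this decision rule coincides with a multinomial Naïve Bayes classifier, which is one way to read the equivalence the statement mentions. Higher-order n-gram models would break that equivalence.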
“…The study aims to detect deception in text-based synchronous CMC by leveraging structural features extracted from online social networks [10], in addition to linguistic features [2], [11].…”
Section: Feature Selection For Deception Detection (mentioning)
confidence: 99%