2017
DOI: 10.1145/3090057

Data Quality Challenges in Social Spam Research

Cited by 6 publications
(6 citation statements)
References 9 publications
“…In response to the web-spamming challenge [12], different researchers proposed various methods for identifying content-based spamdexing. A group of researchers [3] suggested a few critical content features.…”
Section: Literature Review (mentioning)
confidence: 99%
“…These models are applied to a web page's content for ranking the web page, so content-based web spamming is very popular among web spammers. To manipulate the spam web pages' content, web spammers utilise vulnerabilities of these models [12]. For example, they might use famous keywords many times on a spam web page to increase the keywords' frequencies, copy legitimate website's content, produce the content for spam web pages using machine-generated techniques, and add dictionary's words on a spam web page, giving those words the colour of the background so that these words cannot be seen on the spam web page by the user and are only visible to search engine spiders.…”
Section: Introduction (mentioning)
confidence: 99%
“…In the most recent years, several content-based spamdexing identification methods are proposed by the researchers during the spamdexing challenge [13]. Ntoulas et.…”
Section: Literature Review (mentioning)
confidence: 99%
“…This spamdexing technique is popular among the web spammers because of several SEs are using the information retrieval models (IRM) for instance, statistical language model [10], vector space model [11], BM25 [12] and probabilistic model which are applied to the web page's content for ranking the web page. Spammers try to utilize the vulnerabilities of these models for manipulating the content of target web page [13]. For instance, using important keywords several times on a target web page and increasing the keywords frequencies, copying the content from various good web pages, using the machine-generated content on target web page, and putting all dictionary words on the page and then changing the color of text similar to the background color so users can not see the dictionary words on target page and only visible to SEs spiders are some methods which web spammers are using for getting higher page rank in the SERPs [14].…”
Section: Introduction (mentioning)
confidence: 99%
“…Twitter has defined spamming behavior and the rules of tweet freedom [2,10,11]. We established our definition of a spam tweet.…”
Section: Motivation (mentioning)
confidence: 99%