Unsolicited Bulk Email (UBE) has become a large problem in recent years, and the number of mass mailers in existence is increasing dramatically. Automatically detecting UBE has therefore become a vital area of current research. Many email clients (such as Outlook and Thunderbird) already have junk filters built in, but mass mailers are continually evolving and overcoming some of these filters, so the need for research in the area is ongoing. Many existing techniques appear to choose the features used for classification arbitrarily. This paper aims to address this issue by investigating the utility of over 40 features that have been used in recent literature. Information gain for these features is calculated over Ham, Spam and Phishing corpora.
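As an illustration of the measure named above, the following is a minimal sketch of information gain for a single discrete feature over labelled messages. It uses the standard definition (class entropy minus the entropy remaining after splitting on the feature). The function names, the `contains_url` feature and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after
    partitioning the messages by the value of one discrete feature."""
    total = len(labels)
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)
    conditional = sum(
        (len(subset) / total) * entropy(subset)
        for subset in partitions.values()
    )
    return entropy(labels) - conditional


# Toy data (hypothetical): one label per message and one binary feature.
labels = ["ham", "ham", "spam", "spam", "phishing", "phishing"]
contains_url = [0, 0, 1, 1, 1, 1]  # assumed feature, for illustration only

gain = information_gain(contains_url, labels)
print(f"information gain of contains_url: {gain:.3f} bits")
```

A feature that separates the classes well leaves little entropy after the split and so scores a high gain, while a feature that is independent of the class scores close to zero. Continuous features would need to be discretised first, which this sketch does not attempt.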
Despite the many potential benefits to its users, social networking appears to provide a rich setting for criminal activities and other misdeeds. In this paper we consider whether the risks of social networking are unique and novel to this context. Having considered the nature and range of applications to which social networks may be put, we conclude that there are no exploits or fundamental threats inherent to the social networking setting. Rather, the risks and associated threats arise because this communicative and social context acts as an enabler for existing, long-established and well-recognised exploits and activities.
Recent developments in the field of data fusion have focused on
techniques that use training queries to estimate the probability that various
documents are relevant to a given query, and then use those estimates to assign
the scores by which the documents are subsequently ranked. This paper
introduces SlideFuse, which builds on these techniques by introducing a sliding
window to compensate for situations where little relevance information
is available to aid in the estimation of probabilities.
SlideFuse is shown to perform favourably in comparison with CombMNZ, ProbFuse
and SegFuse. CombMNZ is the standard baseline technique against which data
fusion algorithms are compared, whereas ProbFuse and SegFuse represent the
state of the art for probabilistic data fusion methods.
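
To make the baseline concrete, the following is a minimal sketch of CombMNZ as it is commonly described: each document's normalised scores are summed across the input result sets, and the sum is multiplied by the number of result sets that returned the document. The min-max normalisation step, the function names and the toy scores are assumptions made for illustration and are not taken from the paper. The sketch does not implement SlideFuse, ProbFuse or SegFuse.

```python
def min_max_normalise(scores):
    """Rescale one result set's scores to the range [0, 1].
    (Assumed normalisation; other schemes are also used with CombMNZ.)"""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every returned document as top-scored.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_sets):
    """Fuse result sets given as {doc_id: score} dictionaries.
    Returns (doc_id, fused_score) pairs, best first."""
    summed = {}  # sum of normalised scores per document
    hits = {}    # number of result sets that returned the document
    for scores in result_sets:
        if not scores:
            continue
        for doc, s in min_max_normalise(scores).items():
            summed[doc] = summed.get(doc, 0.0) + s
            hits[doc] = hits.get(doc, 0) + 1
    fused = {doc: summed[doc] * hits[doc] for doc in summed}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Toy input (hypothetical): two systems' raw scores for the same query.
runs = [
    {"d1": 3.0, "d2": 1.0, "d3": 2.0},
    {"d1": 10.0, "d3": 0.0},
]
ranking = comb_mnz(runs)
print(ranking)
```

Multiplying by the hit count rewards documents returned by several systems, which is why a document appearing in both toy runs can outrank one that appears in only a single run.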