"Thanks to my solid academic training, today I can write hundreds of words on virtually any topic without possessing a shred of information, which is how I got a good job in journalism."
Abstract

Escuela Politécnica Superior
MSc in Big Data and Data Science
Natural language processing for scam detection. Classic and alternative analysis techniques

by Ignacio PALACIO MARÍN

Over the past decades we have seen an overwhelming increase in the volume of information being generated, distributed and shared, especially, but not only, through social media networks. This has led to exponential growth in the importance of data in the decision-making processes of most industries and economic sectors, which makes it critical to ensure the quality of the information we gather and use.

Whilst most of this information is, or at least is intended to be, true, a non-negligible portion of it is false. Misinformation campaigns have played an important role in recent critical decision-making processes such as the Brexit referendum and the 2016 U.S. presidential election.

The current spread of incorrect information poses a significant risk to the management of information systems. The problem becomes even greater when automatic decision-making algorithms are involved. At the same time, social media and open access to data may offer a way to break the information asymmetry that has traditionally affected areas such as the financial industry.

This paper proposes different natural language processing techniques, from the more traditional ones to a brief look at more recently developed deep learning techniques. They are all intended to enable the automatic classification of texts in different discussion forums and to build procedures for classifying users or groups of users, opening the door to attribution procedures for information sources.
Contents

Abstract
Acknowledgements