a b s t r a c tBig data has become an important issue for a large number of research areas such as data mining, machine learning, computational intelligence, information fusion, the semantic Web, and social networks. The rise of different big data frameworks such as Apache Hadoop and, more recently, Spark, for massive data processing based on the MapReduce paradigm has allowed for the efficient utilisation of data mining methods and machine learning algorithms in different domains. A number of libraries such as Mahout and SparkMLib have been designed to develop new efficient applications based on machine learning algorithms. The combination of big data technologies and traditional machine learning algorithms has generated new and interesting challenges in other areas as social media and social networks. These new challenges are focused mainly on problems such as data processing, data storage, data representation, and how data can be used for pattern mining, analysing user behaviours, and visualizing and tracking data, among others. In this paper, we present a revision of the new methodologies that is designed to allow for efficient data mining and information fusion from social media and of the new applications and frameworks that are currently appearing under the "umbrella" of the social networks, social media and big data paradigms. (D. Camacho). petabytes (and even exabytes) in size, and the massive sizes of these datasets extend beyond the ability of average database software tools to capture, store, manage, and analyse them effectively.The concept of big data has been defined through the 3V model, which was defined in 2001 by Laney [5] as: "high-volume, highvelocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making". More recently, in 2012, Gartner [6] updated the definition as follows: "Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization". Both definitions refer to the three basic features of big data: Volume, Variety, and Velocity. Other organisations, and big data practitioners (e.g., researchers, engineers, and so on), have extended this 3V model to a 4V model by including a new "V": Value [7]. This model can be even extended to 5Vs if the concepts of Veracity is incorporated into the big data definition.Summarising, this set of * V-models provides a straightforward and widely accepted definition related to what is (and what is not) a big-data-based problem, application, software, or framework. These concepts can be briefly described as follows [5,7]:• Volume: refers to large amounts of any kind of data from any different sources, including mobile digital data creation devices and digital devices. The benefit from gathering, processing, and analysing these large amounts of data generates a number http://dx.
h i g h l i g h t s • A methodology to detect discussion communities on vaccination is proposed. • Vaccine opinions in twitter can affect the decision-making about vaccination. • The most relevant and influential users are identified analysing the communities. • The collective sentiment on vaccination has been studied for the detected groups. • Results provide useful information to improve immunization strategies.
Due to recent booming of UAVs technologies, these are being used in many fields involving complex tasks. Some of them involve a high risk to the vehicle driver, such as fire monitoring and rescue tasks, which make UAVs excellent for avoiding human risks. Mission Planning for UAVs is the process of planning the locations and actions (loading/dropping a load, taking videos/pictures, acquiring information) for the vehicles, typically over a time period. These vehicles are controlled from Ground Control Stations (GCSs) where human operators use rudimentary systems. This paper presents a new Multi-Objective Genetic Algorithm for solving complex Mission Planning Problems (MPP) involving a team of UAVs and a set of GCSs. A hybrid fitness function has been designed using a Constraint Satisfaction Problem (CSP) to check if solutions are valid and Pareto-based measures to look for optimal solutions. The algorithm has been tested on several datasets optimizing different variables of the mission, such as the makespan, the fuel consumption, distance, etc. Experimental results show that the new algorithm is able to obtain good solutions, however as the problem becomes more complex, the optimal solutions also become harder to find.
El acceso a la versión del editor puede requerir la suscripción del recurso Access to the published version may require subscriptionInternational Journal of Neural Systems, Vol. 0, No. 0 (April, 2000) The graph clustering problem has become highly relevant due to the growing interest of several research communities in social networks and their possible applications. Overlapped graph clustering algorithms try to find subsets of nodes that can belong to different clusters. In social-based applications it is quite usual for a node of the network to belong to different groups, or communities, in the graph. Therefore, algorithms trying to discover, or analyse, the behaviour of these networks need to handle this feature, detecting and identifying the overlapped nodes. This paper shows a soft clustering approach based on a genetic algorithm where a new encoding is designed to achieve two main goals. First, the automatic adaptation of the number of communities that can be detected. Second, the definition of several fitness functions that guide the searching process using some measures extracted from graph theory. Finally, our approach has been experimentally tested using the Eurovision contest dataset, a well-known social-based data network, to show how overlapped communities can be found using our method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.