As the world becomes ever more interconnected, our everyday objects join the Internet of Things and our lives are increasingly mirrored in the virtual world, where every piece of information, including misinformation, fake news and malware, can spread very fast and practically anonymously. To suppress such uncontrolled spread, efficient computer systems and algorithms capable of tracking down such malicious information spread have to be developed. Currently, the most effective methods for source localization are based on sensors that report the times at which they detect the spread. We investigate the problem of the optimal placement of such sensors in complex networks and propose a new graph measure, called Collective Betweenness, which we compare against four other metrics. Extensive numerical tests are performed on different types of complex networks over wide ranges of sensor density and signal stochasticity. In these tests, we discovered a clear difference in the comparative performance of the investigated optimal placement methods between real or scale-free synthetic networks on the one hand and narrow-degree-distribution networks on the other. The former show a clear region of dominance for any given method, in contrast to the latter, whose performance maps are less homogeneous. We find that while choosing the best method is highly network- and spread-dependent, two methods consistently stand out: High Variance Observers do very well for spread with low stochasticity, whereas Collective Betweenness, introduced in this paper, thrives when the spread is highly unpredictable.
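A minimal sketch of betweenness-based observer placement is given below, using standard betweenness centrality as a stand-in ranking: the Collective Betweenness measure introduced in the paper is a new metric whose definition is not reproduced in this abstract, so the scoring function, graph generator, and parameters here are purely illustrative.

```python
# Minimal sketch: choosing k observer (sensor) nodes by a betweenness-style
# ranking. NOTE: standard betweenness centrality is used as a stand-in; the
# paper's Collective Betweenness is a new measure not defined in the abstract.
import networkx as nx

def place_observers(G: nx.Graph, k: int) -> list:
    """Return k observer nodes ranked by betweenness centrality."""
    scores = nx.betweenness_centrality(G)
    return sorted(scores, key=scores.get, reverse=True)[:k]

if __name__ == "__main__":
    # Hypothetical scale-free test network (Barabasi-Albert model).
    G = nx.barabasi_albert_graph(200, 3, seed=42)
    print(place_observers(G, k=10))
```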
We investigate the problem of locating the source of diffusion in complex networks without complete knowledge of the nodes' states. Some currently known methods assume that information travels via a single shortest path, which is by assumption the fastest route. We show that such methods overestimate propagation time in synthetic and real networks, where multiple shortest paths as well as longer paths between vertices exist. We propose a new method of source estimation, based on the maximum-likelihood principle, that takes into account the existence of multiple shortest paths. It achieves up to 1.6 times higher accuracy in synthetic and real networks.
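A minimal sketch of shortest-path-based source scoring, assuming observers report detection times: each candidate source is scored by a least-squares fit between the observed delays and shortest-path distances. The paper's maximum-likelihood estimator additionally corrects the expected propagation time for the existence of multiple shortest paths; that correction is not reproduced here, so this is only the single-path baseline the paper improves upon.

```python
# Minimal sketch of shortest-path-based source estimation. The paper's ML
# estimator replaces the raw distance d(s, o) with an expected propagation
# time accounting for multiple shortest paths; that step is omitted here.
import networkx as nx

def estimate_source(G, observations):
    """observations: dict mapping observer node -> detection time.
    Returns the candidate source whose shortest-path distances best
    explain the observed detection times (least squares)."""
    best, best_score = None, float("inf")
    for s in G.nodes:
        dist = nx.single_source_shortest_path_length(G, s)
        if any(o not in dist for o in observations):
            continue  # s cannot reach every observer
        # fit t_o ~ t0 + d(s, o) with a free, unknown start time t0
        residuals = [t - dist[o] for o, t in observations.items()]
        t0 = sum(residuals) / len(residuals)
        score = sum((r - t0) ** 2 for r in residuals)
        if score < best_score:
            best, best_score = s, score
    return best

if __name__ == "__main__":
    G = nx.erdos_renyi_graph(100, 0.05, seed=1)
    # Hypothetical noiseless detections at observers 3, 7, 42 for source 0.
    obs = {o: nx.shortest_path_length(G, 0, o) for o in (3, 7, 42)}
    print(estimate_source(G, obs))  # ideally recovers node 0
```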
Learning is a complex cognitive process that depends not only on an individual's capability for knowledge absorption but can also be influenced by various group interactions and by the structure of the academic curriculum. We have applied methods of statistical analysis and data mining (principal component analysis and maximal spanning trees) to anonymized students' scores at the Faculty of Physics, Warsaw University of Technology. A slight negative linear correlation exists between the mean and the variance of course grades, i.e., courses with higher mean scores tend to have lower score variance. Some courses play a central role: their scores are highly correlated with other scores, and they lie at the centre of the corresponding maximal spanning trees. Other courses contribute significantly to the variance of students' scores as well as to the first principal component, and they are responsible for the differentiation of students' scores. Correlations of the first principal component with courses' mean scores and score variances suggest that this component can be used for assigning ECTS points to a given course. The analysis is independent of the declared curricula of the considered courses. The proposed methodology is universal and can be applied to the analysis of students' scores and academic curricula at any faculty.
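A minimal sketch of the described pipeline on a synthetic students-by-courses score matrix (the real, anonymized data is not available here): PCA on standardized scores, plus a maximal spanning tree built on course-course grade correlations, whose high-degree nodes correspond to the "central" courses mentioned above. All sizes and distributions are hypothetical.

```python
# Minimal sketch, assuming a (students x courses) score matrix; synthetic
# data stands in for the anonymized grades used in the paper.
import numpy as np
import networkx as nx
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
scores = rng.normal(3.5, 0.7, size=(120, 8))   # hypothetical grades

# Principal component analysis on standardized scores.
pca = PCA()
pca.fit(StandardScaler().fit_transform(scores))
print("variance explained by PC1:", pca.explained_variance_ratio_[0])

# Maximal spanning tree of the course-course correlation graph.
corr = np.corrcoef(scores, rowvar=False)
G = nx.Graph()
n = corr.shape[0]
for i in range(n):
    for j in range(i + 1, n):
        G.add_edge(i, j, weight=corr[i, j])
mst = nx.maximum_spanning_tree(G)
# Courses with high tree degree play a "central" role in the correlations.
print(sorted(mst.degree, key=lambda kv: kv[1], reverse=True)[:3])
```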