Abstract. In feature selection algorithms, "stability" is the sensitivity of the chosen feature set to variations in the supplied training data. As such it can be seen as an analogous concept to the statistical variance of a predictor. However unlike variance, there is no unique definition of stability, with numerous proposed measures over 15 years of literature. In this paper, instead of defining a new measure, we start from an axiomatic point of view and identify what properties would be desirable. Somewhat surprisingly, we find that the simple Pearson's correlation coefficient has all necessary properties, yet has somehow been overlooked in favour of more complex alternatives. Finally, we illustrate how the use of this measure in practice can provide better interpretability and more confidence in the model selection process.
Deep neural networks have been widely adopted in recent years, exhibiting impressive performances in several application domains. It has however been shown that they can be fooled by adversarial examples, i.e., images altered by a barely-perceivable adversarial noise, carefully crafted to mislead classification. In this work, we aim to evaluate the extent to which robot-vision systems embodying deeplearning algorithms are vulnerable to adversarial examples, and propose a computationally efficient countermeasure to mitigate this threat, based on rejecting classification of anomalous inputs. We then provide a clearer understanding of the safety properties of deep networks through an intuitive empirical analysis, showing that the mapping learned by such networks essentially violates the smoothness assumption of learning algorithms. We finally discuss the main limitations of this work, including the creation of real-world adversarial examples, and sketch promising research directions. 1
Abstract. Although diversity in classifier ensembles is desirable, its relationship with the ensemble accuracy is not straightforward. Here we derive a decomposition of the majority vote error into three terms: average individual accuracy, "good" diversity and "bad diversity". The good diversity term is taken out of the individual error whereas the bad diversity term is added to it. We relate the two diversity terms to the majority vote limits defined previously (the patterns of success and failure). A simulation study demonstrates how the proposed decomposition can be used to gain insights about majority vote classifier ensembles.
Group-living species frequently pool individual information so as to reach consensus decisions such as when and where to move, or whether a predator is present. Such opinion-pooling has been demonstrated empirically, and theoretical models have been proposed to explain why group decisions are more reliable than individual decisions. Behavioural ecology theory frequently assumes that all individuals have equal decision-making abilities, but decision theory relaxes this assumption and has been tested in human groups. We summarise relevant theory and argue for its applicability to collective animal decisions. We consider selective pressure on confidence-weighting in groups of related and unrelated individuals. We also consider which species and behaviours may provide evidence of confidence-weighting, paying particular attention to the sophisticated vocal communication of cooperative breeders.
MotivationThe identification of biomarkers to support decision-making is central to personalized medicine, in both clinical and research scenarios. The challenge can be seen in two halves: identifying predictive markers, which guide the development/use of tailored therapies; and identifying prognostic markers, which guide other aspects of care and clinical trial planning, i.e. prognostic markers can be considered as covariates for stratification. Mistakenly assuming a biomarker to be predictive, when it is in fact largely prognostic (and vice-versa) is highly undesirable, and can result in financial, ethical and personal consequences. We present a framework for data-driven ranking of biomarkers on their prognostic/predictive strength, using a novel information theoretic method. This approach provides a natural algebra to discuss and quantify the individual predictive and prognostic strength, in a self-consistent mathematical framework.ResultsOur contribution is a novel procedure, INFO+, which naturally distinguishes the prognostic versus predictive role of each biomarker and handles higher order interactions. In a comprehensive empirical evaluation INFO+ outperforms more complex methods, most notably when noise factors dominate, and biomarkers are likely to be falsely identified as predictive, when in fact they are just strongly prognostic. Furthermore, we show that our methods can be 1–3 orders of magnitude faster than competitors, making it useful for biomarker discovery in ‘big data’ scenarios. Finally, we apply our methods to identify predictive biomarkers on two real clinical trials, and introduce a new graphical representation that provides greater insight into the prognostic and predictive strength of each biomarker.Availability and implementationR implementations of the suggested methods are available at https://github.com/sechidis.Supplementary information Supplementary data are available at Bioinformatics online.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.