Ensemble-based noise detection: noise ranking and visual performance evaluation

Sluban, Borut; Gamberger, Dragan; Lavrač, Nada

doi:10.1007/s10618-012-0299-1

Cited by 70 publications

(67 citation statements)

References 25 publications

“…In the first three steps of the workflow depicted in Figure 4 the outlier documents are identified and extracted (instead of OntoGen we use NoiseRank [25] as implemented in TextFlows). The goal of this phase is to extract a set of outlier documents from the whole corpus of input documents.…”

Section: New Methodology and Its Implementation As A Repeatable Workf (mentioning)

confidence: 99%

“…In contrast to the approach for outlier detection with OntoGen, described in Methods section, NoiseRank component implements a different strategy [25]. Here, classifiers are used to detect atypical documents in categorized document corpora, which can be considered as outliers of their own document category.…”

Section: New Methodology and Its Implementation As A Repeatable Workf (mentioning)

confidence: 99%

“…The main purpose of NoiseRank component as implemented in TextFlows (widget 2.10. in Figure 5) is to support domain experts in identifying noisy, outlier or erroneous data instances [25].…”

Section: New Methodology and Its Implementation As A Repeatable Workf (mentioning)

confidence: 99%

Reducing the Search Space in Literature-Based Discovery by Exploring Outlier Documents: a Case Study in Finding Links Between Gut Microbiome and Alzheimer’s Disease

Cestnik, Fabbretti, Gubiani, et al. 2017

Genomics Comput Biol

Self Cite

Literature-based discovery tools have been often used to overcome the problem of fragmentation of science and to assist researchers in their process of cross-domain knowledge discovery. In this paper we propose a methodology for cross-domain literature-based discovery that focuses on outlier documents to reduce the search space of potential cross-domain links and to improve search efficiency. In a previous study, literature mining tools OntoGen for document clustering and CrossBee for cross-domain bridging term exploration were combined to search for hidden relations in scientific papers from two different domains of interest, where the utility of the approach was demonstrated in a study involving PubMed papers about Alzheimer's disease and gut microbiome. This paper extends the approach by proposing a methodology, implemented as a repeatable workflow in a web-based text mining platform TextFlows, which enables easy access and execution of the methodology for the interested researcher.

“…For each chart type we created a specific template including individual functionality available for a certain type of performance visualization. For example, in PR space charts we included the novel F-isoline evaluation approach [10] which enables to simultaneously visually evaluate algorithm performance in terms of recall, precision and the F-measure. As additional novelty, for ROC curve charts a corresponding PR curve chart can be created (and vice versa), since PR curves give a more informative picture of the algorithm's performance when dealing with highly skewed datasets, which provides additional insight for algorithms design, as discussed in [2].…”

Section: The Vipercharts Platform (mentioning)

confidence: 99%

ViperCharts: Visual Performance Evaluation Platform

Sluban, Lavrač 2013

Advanced Information Systems Engineering

Self Cite

Abstract. The paper presents the ViperCharts web-based platform for visual performance evaluation of classification, prediction, and information retrieval algorithms. The platform enables to create interactive charts for easy and intuitive evaluation of performance results. It includes standard visualizations and extends them by offering alternative evaluation methods like F-isolines, and by establishing relations between corresponding presentations like Precision-Recall and ROC curves. Additionally, the interactive performance charts can be saved, exported to several formats, and shared via unique web addresses. A web API to the service is also available.
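
The F-isolines and the ROC/PR correspondence mentioned above can be illustrated with a short sketch. It assumes the balanced F-measure, F = 2PR / (P + R), since neither the abstract nor the citation statement specifies a beta weighting. The function names are hypothetical and are not part of the ViperCharts platform or its web API.

```python
def f_measure(precision, recall):
    """Balanced F-measure (assumed here; the source does not state a beta)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def precision_on_f_isoline(f_value, recall):
    """Precision that lies on the isoline F = f_value at the given recall.

    Solving F = 2PR / (P + R) for P gives P = F * R / (2R - F).
    Returns None where the isoline does not pass through PR space
    (2R <= F, or the resulting precision would exceed 1).
    """
    denom = 2.0 * recall - f_value
    if denom <= 0.0:
        return None
    precision = f_value * recall / denom
    return precision if precision <= 1.0 else None


def roc_point_to_pr_point(tpr, fpr, n_pos, n_neg):
    """Map one ROC point to (recall, precision), given the class counts.

    Recall equals the true positive rate. Precision depends on the class
    balance, which is why the two views differ on skewed datasets.
    Precision is returned as None when nothing is predicted positive.
    """
    tp = tpr * n_pos
    fp = fpr * n_neg
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    return tpr, precision


if __name__ == "__main__":
    # Trace the F = 0.5 isoline at a few recall values.
    for r in (0.4, 0.5, 0.75, 1.0):
        print(r, precision_on_f_isoline(0.5, r))
    # The same ROC point under a balanced and a skewed class distribution.
    print(roc_point_to_pr_point(0.8, 0.1, 100, 100))
    print(roc_point_to_pr_point(0.8, 0.1, 100, 900))
```

In this sketch the same ROC point (TPR 0.8, FPR 0.1) yields a much lower precision when negatives outnumber positives nine to one, which is the kind of effect the quoted statement attributes to highly skewed datasets.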

“…These inconsistencies can be either errors, absent information or unknown values [2]. Whereas noise needs to be identified and treated, secure data in a dataset must be preserved [3]. The term secure data usually refers to instances that are core of the knowledge necessary to build accurate learning models.…”

Section: Introduction (mentioning)

confidence: 99%