2007
DOI: 10.1029/2006ja012136

Data mining in space physics: MineTool algorithm

Abstract: A novel data mining method called MineTool is introduced which, by virtue of automating the modeling process and model evaluations, makes it more accessible to nonexperts. The technique aggregates the various stages of model building into a four-step process consisting of (1) data segmentation and sampling, (2) variable preselection and transform generation, (3) predictive model estimation and validation, and (4) final model testing. Optimal strategies are chosen for each modeling step. However, the modula…
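The four-step process named in the abstract can be illustrated with a minimal sketch. This is not MineTool itself: the abstract does not say which algorithms it uses at each step, so the correlation ranking, the threshold classifier, the 60/20/20 split, and all function names below are assumptions chosen only to show how the stages hand data to one another. Transform generation is omitted.

```python
import random

def split_data(rows, fractions=(0.6, 0.2), seed=0):
    """Step 1: shuffle, then cut into train / validation / test segments."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_valid = round(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def preselect(train, keep):
    """Step 2: rank variables by |correlation| with the label, keep the top few."""
    labels = [y for _, y in train]
    n_vars = len(train[0][0])
    scored = sorted(
        ((abs(correlation([x[j] for x, _ in train], labels)), j)
         for j in range(n_vars)),
        reverse=True)
    return [j for _, j in scored[:keep]]

def fit_threshold(train, j):
    """Step 3a: stand-in model, a cut halfway between the two class means."""
    pos = [x[j] for x, y in train if y == 1]
    neg = [x[j] for x, y in train if y == 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return (mean_pos + mean_neg) / 2, mean_pos > mean_neg

def accuracy(model, j, rows):
    """Fraction of rows whose predicted class matches the label."""
    threshold, positive_above = model
    hits = sum(((x[j] > threshold) == positive_above) == (y == 1)
               for x, y in rows)
    return hits / len(rows)

# Synthetic data (an assumption): only variable 1 carries information.
rng = random.Random(42)
rows = []
for _ in range(300):
    y = rng.randint(0, 1)
    rows.append(([rng.gauss(0, 1), rng.gauss(2 * y, 0.5), rng.gauss(0, 1)], y))

train, valid, test = split_data(rows)
candidates = preselect(train, keep=2)
models = {j: fit_threshold(train, j) for j in candidates}
# Step 3b: validation data picks among the fitted candidates.
best_j = max(candidates, key=lambda j: accuracy(models[j], j, valid))
# Step 4: the test segment is touched once, for the final estimate.
test_accuracy = accuracy(models[best_j], best_j, test)
print(best_j, round(test_accuracy, 3))
```

The point of keeping three segments is that the test rows play no part in either variable selection or model choice, so the final accuracy is not inflated by those decisions.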

Cited by 10 publications (13 citation statements)
References 7 publications
“…Automated techniques have also been used to identify magnetopause flux transfer events (FTEs) via bipolar field deflections [e.g., Kawano and Russell, 1996], albeit still requiring manual confirmation. Karimabadi et al. [2007] introduced a data mining technique (MineTool) and later applied it to locate FTEs [Karimabadi et al., 2009], using a combination of magnetic field and plasma data. Recently, Malaspina and Gosling [2012] refined a technique to identify rotational discontinuities in the STEREO solar wind data using the gradient of the magnetic field (the method was initially developed by Vasquez et al. [2007]).…”
Section: Introduction (mentioning)
confidence: 99%
“…(I) Clustering the unlabeled treatment data (67% of all the data, 33% was set aside for testing) using a Gaussian mixture model (using Expectation Maximization algorithm), (II) Adding a new labeled variable cluster with a label equals the optimal number of clusters found in the previous step, (III) Feeding this newly labeled data into a classification method called MineTool [61][62][63] resulting in a predictive model of medical error (PMME) (IV) Dynamically updating the model by repeating the previous three steps (I)-(III) on 100 percent of the data, whereby simulating new data instances being collected through the oncology information system.…”
Section: Smart-tool For Anomaly Detection In Radiotherapy Treatme… (mentioning)
confidence: 99%
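Steps (I) and (II) of the statement above can be sketched as follows, under several assumptions that the quoted text does not confirm: the data are one-dimensional and synthetic, the "optimal number of clusters" is chosen by the Bayesian information criterion (BIC), and each point is labeled with its most probable component. The classification step (III) is not shown because MineTool's interface is not described here. All names are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def mixture_density(x, weights, means, variances):
    return sum(w * normal_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

def fit_gmm(xs, k, iters=200, var_floor=1e-4):
    """Fit a 1-D Gaussian mixture with k components by EM."""
    n = len(xs)
    ordered = sorted(xs)
    # Quantile initialization (an assumption; keeps the sketch deterministic).
    means = [ordered[int((c + 0.5) / k * n)] for c in range(k)]
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, var_floor)
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            if mass < 1e-9:
                continue  # leave an empty component unchanged
            weights[c] = mass / n
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / mass
            spread = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs))
            variances[c] = max(spread / mass, var_floor)
    loglik = sum(math.log(max(mixture_density(x, weights, means, variances),
                              1e-300)) for x in xs)
    return weights, means, variances, loglik

def bic(loglik, k, n):
    """BIC for a 1-D mixture: k means, k variances, k - 1 free weights."""
    return -2 * loglik + (3 * k - 1) * math.log(n)

# Synthetic stand-in for the unlabeled data: two well-separated groups.
rng = random.Random(7)
data = ([rng.gauss(0.0, 0.5) for _ in range(150)]
        + [rng.gauss(5.0, 0.5) for _ in range(150)])
rng.shuffle(data)
n_train = round(0.67 * len(data))          # 67% for clustering, 33% held out
train, held_out = data[:n_train], data[n_train:]

fits = {k: fit_gmm(train, k) for k in (1, 2, 3)}
best_k = min(fits, key=lambda k: bic(fits[k][3], k, len(train)))
weights, means, variances, _ = fits[best_k]

def assign(x):
    """Index of the most probable component for x."""
    scores = [w * normal_pdf(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    return scores.index(max(scores))

labeled_train = [(x, assign(x)) for x in train]  # cluster label as new variable
print(best_k, [round(m, 2) for m in sorted(means)])
```

The labeled rows in `labeled_train` are what a downstream classifier would consume; the held-out third stays untouched until that classifier is evaluated.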
“…Kadous achieved fairly to very accurate classification results in the domains of ECG (electrocardiograph) diagnosis (72% accuracy) and the sign language recognition (98% accuracy). We build on TClass for the physics domain application by (1) improving the feature collection and segmentation paradigms, (2) adding more features, and (3) utilizing MineTool [Karimabadi et al., 2007] as the classification method of choice, which has the capacity to outperform standard data mining tools such as decision trees, artificial neural networks [Ripley, 1996] and support vector machines [Cortes and Vapnik, 1995].…”
Section: MineTool‐TS Algorithm (mentioning)
confidence: 99%
“…One is that potentially useful candidates get overlooked simply because there are too many variables to evaluate. Another is that if a systematic routine for evaluating and including variables is used, it can lead to overfit [Karimabadi et al., 2007]. Finally, many candidate variables are likely to be redundant, which can cause difficulties for the estimation routines.…”
Section: MineTool‐TS Algorithm (mentioning)
confidence: 99%
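The redundancy problem raised in the statement above can be made concrete with a small sketch. It shows one generic remedy, a greedy correlation filter, and is not claimed to be how MineTool or MineTool‐TS handles redundant variables. The 0.95 threshold, the variable names, and the values are all hypothetical.

```python
def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def drop_redundant(columns, threshold=0.95):
    """Keep a variable only if it is not nearly collinear with one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(correlation(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical candidate variables; "bx_rescaled" is a linear copy of "bx".
columns = {
    "bx": [1, 2, 3, 4, 5, 6],
    "bx_rescaled": [3, 5, 7, 9, 11, 13],
    "density": [2, -1, 0, 3, -2, 1],
}
kept = drop_redundant(columns)
print(kept)  # ['bx', 'density']
```

A filter like this is order dependent: whichever of two collinear variables appears first is the one retained, so candidates would normally be sorted by some relevance score beforehand.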