2018
DOI: 10.1016/j.csi.2017.05.004

Intelligent assistance for data pre-processing

Abstract: A data mining algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes rather than with categorical attributes, or the other way around. Typically, a dataset needs to be pre-processed before being mined. Taking into account all the possible pre-processing operators, there exists a staggeringly large number of alternatives. As a consequence, non-experienced users become overwhelmed with pre-processing alternatives. In thi…

Cited by 41 publications (34 citation statements)
References 19 publications
“…Nguyen et al (2014) construct new pipelines using a beam search focussed on components recommended by a metalearner, and is itself trained on examples of successful prior pipelines. Bilalli et al (2018) predict which pre-processing techniques are recommended for a given classification algorithm. They build a meta-model per target classification algorithm that, given the t new meta-features, predicts which preprocessing technique should be included in the pipeline.…”
Section: Pipeline Synthesis (mentioning)
confidence: 99%
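The idea in the statement above, one meta-model per target classification algorithm that maps dataset meta-features to a recommended pre-processing operator, can be illustrated with a minimal sketch. This is not the method of Bilalli et al. (2018): the two meta-features, the operator names, the toy history, and the nearest-neighbour learner are all assumptions chosen only to keep the example self-contained.

```python
import math
from collections import Counter


class NearestNeighbourMetaModel:
    """Hypothetical meta-model: recommends the operator that worked best
    on the k most similar previously seen datasets."""

    def __init__(self, k=1):
        self.k = k
        self.examples = []

    def fit(self, meta_feature_vectors, best_operators):
        # Store (meta-features, best operator) pairs from past experiments.
        self.examples = list(zip(meta_feature_vectors, best_operators))
        return self

    def predict(self, meta_features):
        # Rank past datasets by Euclidean distance in meta-feature space.
        nearest = sorted(
            self.examples,
            key=lambda example: math.dist(example[0], meta_features),
        )[: self.k]
        # Majority vote over the operators of the k nearest datasets.
        votes = Counter(operator for _, operator in nearest)
        return votes.most_common(1)[0][0]


# Assumed meta-features: (fraction of numeric attributes, fraction of missing values).
# Assumed history: for each target algorithm, the operator that helped most per dataset.
history = {
    "naive_bayes": (
        [(0.9, 0.0), (0.7, 0.2), (0.1, 0.0)],
        ["discretize", "discretize", "none"],
    ),
    "knn": (
        [(0.9, 0.0), (0.2, 0.4), (0.1, 0.0)],
        ["normalize", "impute_missing", "none"],
    ),
}

# One meta-model per target classification algorithm.
meta_models = {
    algorithm: NearestNeighbourMetaModel(k=1).fit(vectors, operators)
    for algorithm, (vectors, operators) in history.items()
}

new_dataset = (0.85, 0.05)  # meta-features of an unseen dataset
for algorithm, model in meta_models.items():
    print(algorithm, "->", model.predict(new_dataset))
```

The same unseen dataset can receive a different recommendation for each target algorithm, which is the reason for training a separate meta-model per algorithm rather than a single shared one.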
“…A significant amount of services that run on the QMP and Guifi.net network are network-intensive (i.e., bandwidth and delay sensitive), transferring large amounts of data between the network nodes [8,30]. The performance of such kind of services depends not just on computational and disk resources but also on the network bandwidth between the nodes on which they are deployed.…”
Section: Bandwidth Characterization (mentioning)
confidence: 99%
“…PeerStreamer, 8 an open source live P2P video streaming service, has been paradigmatically established as the live streaming service in Cloudy. This service is based on chunk diffusion, where peers offer a selection of the chunks they own to some peers in their neighborhood.…”
Section: Live-video Streaming Service (mentioning)
confidence: 99%
“…When data analyst wants to do an analysis of electricity load consumption and pricing trends, then takes dataset of specific electricity company and performs some statistical analysis to get meaningful information. To keep balance between consumption and generation, many researchers are working on electricity load and price forecasting [4]. There are three types of forecasting: Short-Term Load Forecasting (STLF), Medium-Term Load Forecasting (MTLF) and Long-Term Load Forecasting (LTLF).…”
Section: Introduction (mentioning)
confidence: 99%