Proceedings of the 18th International Conference on Enterprise Information Systems 2016
DOI: 10.5220/0005829001350141

Towards a Synthetic Data Generator for Matching Decision Trees

Abstract: It is popular to use real-world data to evaluate or teach data mining techniques. However, using real-world data for such purposes has some disadvantages. First, real-world data in most domains is difficult to obtain for several reasons, such as budgetary, technical, or ethical constraints. Second, the use of many real-world data sets is restricted or, in the case of data mining, those data sets either do not contain specific patterns that are easy to mine for teaching purposes or the data needs special pr…

Cited by 3 publications (3 citation statements)
References 6 publications
“…In other words, data sets do not contain purposeful models, or it requires particular preparation to find the pattern inside. Obtaining experimental results by producing artificial data or in other words synthetic data can overcome these disadvantages (Peng and Hanke 2016).…”
Section: Production Of Artificial Data (mentioning)
confidence: 99%
“…In (Peng and Hanke, 2016), the authors generate new synthetic data sets by means of decision trees, using a modification of the ID3 (Iterative Dichotomiser 3) algorithm. By using decision trees, the authors create interdependence among the data in the generated data sets, with the aim of obtaining generic data sets with which to test any machine learning application.…”
Section: Related Work (unclassified)
“…Synthetic data are used as a tool for testing developed methods and models in various scientific fields, such as pattern recognition and generation (Jiang et al., 2009), data mining (Peng and Hanke, 2016), machine learning (Ekbatani et al., 2017), etc.…”
Section: Introduction (unclassified)