Integration of useful links in distributed databases using decision tree classification

Mehenni, Tahar

doi:10.1109/isei.2015.7358717

Cited by 7 publications

(4 citation statements)

References 18 publications

“…Moreover, our system performs better in finding the most useful joins across the data sources, thanks to the regression model used in predicting the link usefulness [4,5,6]. To perform the classification task, we use the decision tree classification algorithm that exploits the joins discovered automatically across the databases [4,5,6]. Experiments performed on five real databases were very satisfactory and show that the proposed system succeeded in achieving a fully automatic classification across multiple heterogeneous databases.…”

Section: Introduction (mentioning)

confidence: 93%

“…Putting all the data from the relevant databases into a single data set can destroy some important information that reflects the individuality of the different databases. 6. And the important limitation is the heterogeneity problem, where the aggregation of all the heterogeneous databases to obtain a whole single database could be simply an unfeasible solution.…”

Section: Related Work (mentioning)

confidence: 99%

“…To solve the heterogeneity problem, we partially follow the work of [3], where the author presented a framework that automatically identifies approximate foreign-key joins in the multiple heterogeneous databases. Moreover, our system performs better in finding the most useful joins across the data sources, thanks to the regression model used in predicting the link usefulness [4,5,6]. To perform the classification task, we use the decision tree classification algorithm that exploits the joins discovered automatically across the databases [4,5,6].…”

Section: Introduction (mentioning)

confidence: 99%

SCATTER: Fully Automated Classification System across Multiple Databases

Mehenni

2019

IJCDS

Self Cite

Data mining approaches performed recently use data coming from a single table and are not adapted to multiple tables. Moreover, computer network expansion and data sources diversity require new data mining systems handling databases heterogeneity in multi-database systems. In this paper, we propose SCATTER: a fully automated classification system from multiple heterogeneous databases. SCATTER is composed of three components. The first component uses schema matching techniques to find foreign-key links across the multi-database system. The second component tries to find the most useful links that are critical for producing accurate classes across multiple databases. The last component is a decision tree classification algorithm which exploits the useful links discovered automatically across the databases. Experiments performed on real databases were very satisfactory with an average accuracy of 86.5% and showed that SCATTER system succeeded in achieving a fully automated classification from multiple heterogeneous databases.

“…The major advantage of decision tree method lies in identifying solutions (Mehenni, 2015). In certain situations when we confront a large sample space, this approach can make data preparation much easier and more understandable for users without technical knowledge compared with remaining methods (Mehenni, 2015).…”

Section: Introduction (mentioning)

confidence: 99%

A novel hybrid support vector machine with decision tree for data classification

Gashti

2017

Int. j. adv. appl. sci.

The purpose of this paper is to increase the accuracy of a proposed support vector machine model using hybrid model of SVM and ID3. Then the hybrid approach based on SVM and ID3 tree will be evaluated focusing on analyzing the impact of ID3 on SVM performance. The evaluation process was carried out on the global dataset and Adult reference extracted from KEEL dataset repository. The obtained results demonstrate higher classification accuracy (0.9125) of the proposed model compared to SVM and ID3.
