A Comparison Framework of Machine Learning Algorithms for Mixed-Type Variables Datasets: A Case Study on Tire-Performances Prediction

Gutiérrez-Gómez, Leonardo; Petry, Frank; Khadraoui, Djamel

doi:10.1109/access.2020.3041367

Cited by 13 publications

(5 citation statements)

References 40 publications

“…The discovery of the race is the main goal of machine learning, in addition to making smart decisions. There are a lot of algorithms for machine learning, and they are mainly classified into two types and supervised, and the second category is unsupervised, and there is a class between them called semisupervised [ 1 ]. When there is big data, machine learning will be expanded by algorithms.…”

Section: Introduction (mentioning)

confidence: 99%

Effective of Smart Mathematical Model by Machine Learning Classifier on Big Data in Healthcare Fast Response

Al‐Khasawneh, Bukhari, Khasawneh 2022

Computational and Mathematical Methods in Medicine

In the past few years, big data related to healthcare has become more important, due to the abundance of data, the increasing cost of healthcare, and healthcare privacy. The challenge is to create, analyze, and process large and complex data that cannot be processed by traditional methods. The proposed method is based on classifying data into several classes using the data weight derived from the features extracted from the big data. Three important criteria were used to evaluate the study and to benchmark it against previous studies on a standard dataset.

“…Naïve Bayes, support vector machines (SVM), and boost algorithms are used for supervised learning [57]. Wavelet coefficients of natural images are relatively sparse models implemented as a wavelet coefficient for natural image processing [58], Shannon source coding theorem is used for uniform coding in tree construction [59], sensing data modeling [60], and applications available for data transformation, projection of objects, as well as in learning algorithms. A sample of data could represent the concept of overall information, and normalization can also be applied for better visualization of multiple features in a single frame.…”

Section: Discussion (mentioning)

confidence: 99%

“…[57]. Semi-supervised learning can work on labeled or unlabeled datasets for clustering and classification [59]. In the real world, labeled data are limited and a semi-supervised model is more practical for work on unlabeled datasets [61] for better performance.…”

Section: Machine Learning (mentioning)

confidence: 99%

A Ranking Learning Model by K-Means Clustering Technique for Web Scraped Movie Data

et al. 2022

Business organizations experience cut-throat competition in the e-commerce era, where a smart organization needs to come up with faster innovative ideas to enjoy competitive advantages. A smart user decides based on the review information of an online product. Data-driven smart machine learning applications use real data to support immediate decision making. Web scraping technologies supply sufficient, relevant, up-to-date, well-structured data from unstructured data sources such as websites. Machine learning applications generate models for in-depth data analysis and decision making. The Internet Movie Database (IMDB) is one of the largest movie databases on the internet. IMDB movie information is applied for statistical analysis, sentiment classification, genre-based clustering, and rating-based clustering with respect to movie release year, budget, etc., for repository datasets. This paper presents a novel clustering model with respect to two different rating systems of IMDB movie data. This work contributes to three areas: (i) the “grey area” of web scraping to extract data for research purposes; (ii) statistical analysis to correlate the required data fields and to understand the purpose of applying machine learning; and (iii) k-means clustering applied to the movie critics’ rank (Metascore) and the users’ star rank (Rating). Different Python libraries are used for web data scraping, data analysis, data visualization, and k-means clustering. After cleaning, only 42.4% of the records in the extracted dataset were accepted for research purposes. Statistical analysis showed that votes, ratings, and Metascore have a linear relationship, while random characteristics are observed for the income of the movie. On the other hand, experts’ feedback (Metascore) and customers’ feedback (Rating) are negatively correlated (−0.0384) due to the bias of additional features like genre, actors, budget, etc. Both rankings have a nonlinear relationship with the income of the movies. Six optimal clusters were selected by the elbow technique, and the calculated silhouette score is 0.4926 for the proposed k-means clustering model; we found that only one cluster reflects a logical relationship between the two ranking systems.

“…In the top right corner of Figure 2, we display a score between 0 and 1 measuring the similarity of the current set of inputs to the training data based on a kernel method [92]. The lower the score, the more cautious should the expert be regarding the prediction.…”

Section: Input Space Visualization (mentioning)

confidence: 99%

Interact: A visual what-if analysis tool for virtual product design

Ciorna, Melançon, Petry et al. 2023

Information Visualization

Self Cite

Virtual prototyping is increasingly used by businesses to streamline operations, cut costs, and enhance daily operations. This often involves a variety of modeling techniques, including complex, black-box models. The path from model development to utilization in applied contexts is still long. Domain experts need to be convinced of the validity of the models and to trust their predictions. To be used in the field, model capabilities need to be affordable, that is, allow rapid and interactive scenario building, even for non-experts. Complex relations governed by statistical interactions must be unveiled for users to understand unexpected predictions. We propose Interact, a model-agnostic, visual what-if tool for regression problems, supporting (1) the visualization of statistical interactions between features, (2) the creation of interactive what-if scenarios using predictive models, (3) the evaluation of model quality and building trust, and (4) the externalization of knowledge through model explainability. While the approach applies in various industrial contexts, we validate the application purpose and design with a detailed case study and a qualitative user study with engineers in the tire industry. By unraveling statistical interactions between features, the Interact tool proves useful for increasing the transparency of black-box machine learning models. We also reflect on lessons learned concerning the development of visual what-if tools for virtual product development and beyond.
