2019
DOI: 10.21105/joss.01075

Yellowbrick: Visualizing the Scikit-Learn Model Selection Process

Abstract: Discussions of machine learning are frequently characterized by a singular focus on algorithmic behavior. Be it logistic regression, random forests, Bayesian methods, or artificial neural networks, practitioners are often quick to express their preference. However, model selection is more nuanced than simply picking the "right" or "wrong" algorithm. In practice, the workflow includes multiple iterations through feature engineering, algorithm selection, and hyperparameter tuning, summarized by Kumar et al. as a …

Citations: cited by 111 publications (66 citation statements)
References: 6 publications
“…To select the appropriate number of clusters the criterion of maximizing the silhouette coefficient was adopted. All calculations were made using own computer codes written in Python using the Scikit-learn [ 94 ] and the Yellowbrick [ 95 ] libraries.…”
Section: Results (mentioning)
confidence: 99%
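
The selection criterion quoted above (choose the number of clusters that maximizes the silhouette coefficient) can be illustrated with a minimal, standard-library-only sketch. The points and the candidate labelings below are hypothetical; in practice the labelings would come from a clustering algorithm such as k-means, and the citing authors' own code is not shown in the source.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # by convention a singleton cluster contributes 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != labels[i]
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hypothetical labelings for k = 2 and k = 3 (assumed, not computed here).
candidate_labelings = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 2, 2, 2],
}

scores = {k: silhouette_score(points, labels) for k, labels in candidate_labelings.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Here the two-cluster labeling scores higher because splitting the first group leaves one point alone and another equidistant from two clusters.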
“…The vertical dotted line in the elbow method plot indicates the optimal number of clusters which was determined by using the knee point detection algorithm. The knee point detection algorithm finds the point of maximum curvature, which in a well-behaved clustering problem also represents the pivot of the elbow curve (see Bengfort and Bilbro [ 95 ] and Satopaa et al [ 99 ]). We found that for all the investigated periods the elbow method points towards the same clustering as the method based on the silhouette coefficients.…”
Section: Results (mentioning)
confidence: 99%
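
As a rough illustration of the knee point idea in the statement above, the sketch below picks the point on a normalized elbow curve that lies farthest from the straight line joining the curve's endpoints. This is a simplified stand-in for a maximum-curvature search, not the algorithm of Satopaa et al. and not Yellowbrick's implementation; the function name `find_elbow` and the distortion values are hypothetical.

```python
def find_elbow(ks, scores):
    """Return the k whose point is farthest from the chord between the endpoints.

    Both axes are rescaled to [0, 1] first so that the result does not
    depend on the units of the score.
    """
    if len(ks) != len(scores) or len(ks) < 3:
        raise ValueError("need at least three (k, score) pairs of equal length")

    k_span = (ks[-1] - ks[0]) or 1.0
    s_min, s_max = min(scores), max(scores)
    s_span = (s_max - s_min) or 1.0
    xs = [(k - ks[0]) / k_span for k in ks]
    ys = [(s - s_min) / s_span for s in scores]

    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]
    # The absolute cross product is proportional to the perpendicular
    # distance from (x, y) to the chord, which is enough for an argmax.
    distances = [
        abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0))
        for x, y in zip(xs, ys)
    ]
    best = max(range(len(ks)), key=distances.__getitem__)
    return ks[best]


# Hypothetical distortion values that drop sharply until k = 4, then flatten.
ks = [1, 2, 3, 4, 5, 6, 7, 8]
distortions = [1000.0, 600.0, 300.0, 100.0, 90.0, 82.0, 76.0, 70.0]

elbow_k = find_elbow(ks, distortions)
print(elbow_k)
```

On a noisy or nearly linear curve this heuristic can return an arbitrary point, which is consistent with the quoted caveat that the knee marks the elbow only in a well-behaved clustering problem.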
“…The number of clusters is determined with the elbow method. A quick running of the Yellowbrick [25] implementation of the elbow method suggests that the number of clusters at 20 gives a balance between more variation and over-fitting.…”
Section: Parameter Settings (mentioning)
confidence: 99%
“…DNN models were built with supervised machine learning in a Python-3 environment 19 . Python libraries used for model building are Tensorflow 20 , Keras 21 , Pandas 22 , NumPy 23 , Scikit-learn 24 , Matplotlib 25 , Seaborn 26 and Yellowbricks 27 . Web-app was built with Streamlit python package (https://www.streamlit.io/, 2020; Online; accessed 30-11-2020).…”
Section: Dataset Preparation and Preprocessing (mentioning)
confidence: 99%