2020
DOI: 10.48550/arxiv.2012.00058
Preprint

PMLB v1.0: An open source dataset collection for benchmarking machine learning methods

Citations: cited by 10 publications (11 citation statements)
References: 0 publications
“…Flags [20] and Ask Ubuntu [2]. We provide detailed comparative results on both the training and test accuracy of all models, and further report the model sizes, runtimes and optimality gaps of our MIP and PBO models.…”
Section: Results (mentioning)
confidence: 99%
“…We trained our models over 20, 60, 100, 200, 300, 500, 700 and 1000 training instances, and reported test accuracy results over the remaining dataset. The Flags [20] dataset consists of 43 features and 5 classes with 143 instances. The dataset describes the attributes of the flags of various countries.…”
Section: Experimental Domains and Setup (mentioning)
confidence: 99%
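
The evaluation protocol quoted above (train on a fixed number of instances, test on the remainder) can be sketched as follows. This is only an illustration and not the citing authors' code: the helper name `split_by_train_size`, the shuffling, the seed, and the rule of skipping sizes that leave no test instances are all assumptions. The instance count 143 is the figure quoted for the Flags dataset.

```python
import random

# Training-set sizes quoted in the citation statement above.
TRAIN_SIZES = [20, 60, 100, 200, 300, 500, 700, 1000]


def split_by_train_size(n_instances, train_sizes=TRAIN_SIZES, seed=0):
    """Yield (size, train_idx, test_idx) for each size smaller than the dataset.

    Hypothetical helper: the shuffle, the seed, and skipping sizes that
    would leave no test instances are assumptions of this sketch.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    for size in train_sizes:
        if size >= n_instances:
            continue  # nothing would remain for testing
        # First `size` shuffled indices train the model; the rest are held out.
        yield size, indices[:size], indices[size:]


if __name__ == "__main__":
    # 143 is the instance count quoted for the Flags dataset.
    for size, train_idx, test_idx in split_by_train_size(143):
        print(size, len(train_idx), len(test_idx))
```

With 143 instances, only the sizes 20, 60 and 100 leave a non-empty test set, so the larger sizes would apply only to bigger datasets under this sketch's skipping rule.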
“…All the tests described in this section are based on ten publicly available data sets from the UCI Machine Learning Repository [45] and PMLB (Penn Machine Learning Benchmarks) [46]. The data we use for the tests come from different domains and have been selected ensuring a wide diversity in terms of data size, dimensionality, and number of classes.…”
Section: Rule Set Evaluation (mentioning)
confidence: 99%
“…In order to establish common datasets, we extended PMLB, a repository of standardized regression and classification problems [13,36], by adding 130 SR datasets with known model forms. PMLB provides utilities for fetching and handling data, recording and visualizing dataset metadata, and contributing new datasets.…”
Section: Srbench (mentioning)
confidence: 99%
“…Each dataset is stored using Git Large File Storage and PMLB is planned for long-term maintenance. PMLB is available under an MIT license, and is described in detail in Romano et al [36]. The authors bear all responsibility in case of violation of rights.…”
Section: A4 Additional Dataset Information (mentioning)
confidence: 99%