2019 15th International Conference on eScience (eScience)
DOI: 10.1109/escience.2019.00018
dislib: Large Scale High Performance Machine Learning in Python

Cited by 19 publications (14 citation statements)
References 24 publications
“…Extensions of task-based programming to distributed programming, such as PyCOMPSs [23], [24], Dask [25], Ray [26], Parsl [27], and Pygion [28] are gaining popularity for scientific data analysis for the mix of performance and simplicity they offer. They provide a Python interface and often the transparent parallelization of some classical APIs (or part of them) like Numpy or Pandas.…”
Section: Related Work
confidence: 99%
“…Dislib [13] is a distributed machine learning library built on top of the PyCOMPSs programming model. In essence, dislib is a collection of PyCOMPSs applications exposed through two main APIs: an estimator-based interface and a data handling interface.…”
Section: Dislib
confidence: 99%
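The two-API split described in the snippet above (a data-handling layer plus scikit-learn-style estimators) can be illustrated with a minimal sketch. This is NOT dislib's actual code: the class names `BlockedArray` and `MeanCentroid` are invented for illustration, and everything runs sequentially here, whereas dislib would execute each block as a PyCOMPSs task on the cluster.

```python
# Sketch of the interface pattern only: a data-handling object that splits
# rows into blocks, and an estimator exposing a scikit-learn-style fit().
# Hypothetical names; not dislib's API. dislib would process each block as
# a distributed PyCOMPSs task instead of the sequential loop used here.

class BlockedArray:
    """Minimal stand-in for a distributed array: rows split into blocks."""
    def __init__(self, rows, block_size):
        self.blocks = [rows[i:i + block_size]
                       for i in range(0, len(rows), block_size)]

class MeanCentroid:
    """Toy estimator: fit() computes the per-feature mean, block by block."""
    def fit(self, x):
        n_features = len(x.blocks[0][0])
        sums = [0.0] * n_features
        count = 0
        for block in x.blocks:      # in dislib, one task per block
            for row in block:
                for j, value in enumerate(row):
                    sums[j] += value
                count += 1
        self.centroid_ = [s / count for s in sums]
        return self

data = BlockedArray([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]],
                    block_size=2)
model = MeanCentroid().fit(data)
print(model.centroid_)  # [4.0, 5.0]
```

The point of the pattern is that user code only touches `fit()` and the blocked container; how blocks are scheduled across nodes stays hidden inside the library.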
“…However, as scientific data sets grow in size, a need emerges for distributed machine learning libraries that can run on traditional computational-science platforms such as HPC clusters. To this end, some machine learning libraries, such as MLlib [11], Dask-ML [12], dislib [13], and TensorFlow [14], have addressed scikit-learn's limitations by running on multiple computers. Among these libraries, dislib is one of the best suited for HPC clusters, as it provides better performance and scalability than similar libraries when processing large data sets in these environments [13].…”
Section: Introduction
confidence: 99%
“…Another area of application of the new features presented in this paper has been the dislib library [4], a distributed computing machine learning library parallelized with PyCOMPSs. Some machine learning algorithms are iterative: convergence is checked at every iteration step to decide whether the next iteration is necessary.…”
Section: Machine Learning Algorithms
confidence: 99%
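The iteration-with-convergence-check structure that the quoted passage refers to can be sketched generically. This is a plain sequential illustration, not dislib or PyCOMPSs code: in the distributed setting, the update step would run as parallel tasks and only the scalar convergence value would be synchronized back to the main program each iteration.

```python
# Generic convergence-checked loop: repeat an update step until the change
# falls below a tolerance or an iteration cap is reached. Illustrative
# sketch only; in dislib/PyCOMPSs the update runs as distributed tasks and
# the per-iteration convergence test is what forces a synchronization.

def iterate_until_convergence(update, x0, tol=1e-9, max_iter=100):
    x = x0
    for i in range(max_iter):
        x_new = update(x)
        if abs(x_new - x) < tol:    # convergence test, every iteration
            return x_new, i + 1
        x = x_new
    return x, max_iter

# Example: Newton's method fixed-point iteration for sqrt(2).
root, n_iters = iterate_until_convergence(lambda x: 0.5 * (x + 2.0 / x), 1.0)
print(round(root, 6))  # 1.414214
```

Because the loop cannot decide whether to continue until the convergence value is known, each iteration implies a synchronization point, which is exactly why runtime support for cheap convergence checks matters in task-based frameworks.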