2018
DOI: 10.25080/majora-4af1f417-002

Equity, Scalability, and Sustainability of Data Science Infrastructure

Abstract: We seek to understand the current state of equity, scalability, and sustainability of data science education infrastructure in both the U.S. and Canada. Our analysis of the technological, funding, and organizational structure of four types of institutions shows an increasing divergence in the ability of universities across the United States to provide students with accessible data science education infrastructure, primarily JupyterHub. We observe that generally liberal arts colleges, community college…

citations
Cited by 4 publications
(4 citation statements)
references
References 0 publications
“…A second obstacle integrating software tools into scientific practice is that software-based learning requires additional education infrastructure. [SNLT18] document the challenges in providing JupyterHub with automatic grading extensions at universities and colleges; they find that many institutions do not have the resources or deep IT expertise necessary to build and maintain this infrastructure. The growing necessity of cloud-based computational notebooks for assignments and exploration in scientific education therefore raises concerns about social equity.…”
Section: Computing In Education and Science
Citation type: mentioning
confidence: 99%
“…Data 8 utilizes ok.py, a Berkeley developed solution that has a plethora of features for large and diverse computer science and data science classes. However, this comes with a complexity cost for instructors who only need a subset of these features and sysadmins operating an okpy server installation [Suen18]. On the other hand, Data 100, the upper division core data science course, utilizes nbgrader, an open source grading solution built for Jupyter Notebooks.…”
Section: Setting Campus Wide Educational Cyber-infrastructure
Citation type: mentioning
confidence: 99%
“…2 While server-based solutions have some costs associated, for students with internet access they can increase equity of education as all students have equal computational power regardless of their own computing hardware. 1,3 These technology challenges are magnified for those teaching data science focused Massively Open Online Courses (MOOCs). MOOCs are typically offered to thousands of learners across the globe completely asynchronously, increasing the number of unique computing environments and reducing instructor contact for individual-level support.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…4,5 Others use technology embedded in the platform's learning management system (e.g., shared JupyterHub). 3,6 Importantly, these data science MOOCs have not had a particular domain focus and thus can use openly available or non-sensitive data in their courses.…”
Section: Introduction
Citation type: mentioning
confidence: 99%