2019
DOI: 10.21203/rs.2.4295/v1
Preprint

Shared Data Science Infrastructure for Genomics Data

Abstract: Background: Creating a scalable computational infrastructure to analyze the wealth of information contained in data repositories is difficult due to significant barriers in organizing, extracting, and analyzing relevant data. A shared data science infrastructure like Boa_g is needed to efficiently process and parse data contained in large data repositories. The main features of Boa_g are inspired by existing languages for data-intensive computing and can easily integrate data from biological data repositories.…

Cited by 1 publication (1 citation statement)
References 13 publications (13 reference statements)
“…Focusing on the similar kind of goal of collective program analysis [56,57] as well as PASS in terms of reducing the data that is sent to downstream analysis and using the semantics of task that is conducted during downstream analysis to reduce the data, can we determine task-dependent program similarity for finding analogous programs in the realm of data science? Another goal might be to realize PASS as part of the shared infrastructure such as Boa [3,6,13,14,25] and understand performance improvements and computation savings that might result from it.…”
Section: Discussion (citation type: mentioning)
confidence: 99%