Seven Bottlenecks to Workflow Reuse and Repurposing

Goderis, Antoon; Sattler, Ulrike; Lord, Phillip; Goble, Carole

doi:10.1007/11574620_25

Cited by 70 publications

(57 citation statements)

References 30 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…With increasing repository sizes, new challenges arise for managing these collections of scientific workflows and for using the information collected in them as a source of expertsupplied knowledge [10,19]. Challenges include the detection of functionally equivalent workflows, grouping of work-flows into functional clusters, workflow retrieval, or the use of existing workflows in the design of novel workflows [36,34,33,4,18].…”

Section: Introductionmentioning

confidence: 99%

Similarity search for scientific workflows

et al. 2014

View full text Add to dashboard Cite

With the increasing popularity of scientific workflows, public repositories are gaining importance as a means to share, find, and reuse such workflows. As the sizes of these repositories grow, methods to compare the scientific workflows stored in them become a necessity, for instance, to allow duplicate detection or similarity search. Scientific workflows are complex objects, and their comparison entails a number of distinct steps from comparing atomic elements to comparison of the workflows as a whole. Various studies have implemented methods for scientific workflow comparison and came up with often contradicting conclusions upon which algorithms work best. Comparing these results is cumbersome, as the original studies mixed different approaches for different steps and used different evaluation data and metrics. We contribute to the field (i) by disecting each previous approach into an explicitly defined and comparable set of subtasks, (ii) by comparing in isolation different approaches taken at each step of scientific workflow comparison, reporting on an number of unexpected findings, (iii) by investigating how these can best be combined into aggregated measures, and (iv) by making available a gold standard of over 2000 similarity ratings contributed by 15 workflow experts on a corpus of almost 1500 workflows and re-implementations of all methods we evaluated.

show abstract

Section: Introductionmentioning

confidence: 99%

Similarity search for scientific workflows

et al. 2014

View full text Add to dashboard Cite

show abstract

“…The development of "rigid" workflow modeling and design frameworks has recently been identified as a major bottleneck for scientific workflow reuse and repurposing [5]. We have found that this lack of flexibility is often due to the use of control-flow within workflows for managing, integrating, and analyzing inherently complex life-science data.…”

Section: Resultsmentioning

confidence: 99%

Collection-Oriented Scientific Workflows for Integrating and Analyzing Biological Data

McPhillips

Bowers

Ludäscher

2006

Lecture Notes in Computer Science

View full text Add to dashboard Cite

Abstract. Steps in scientific workflows often generate collections of results, causing the data flowing through workflows to become increasingly nested. Because conventional workflow components (or actors) typically operate on simple or application-specific data types, additional actors often are required to manage these nested data collections. As a result, conventional workflows become increasingly complex as data becomes more nested. This paper describes a new paradigm for developing scientific workflows that transparently manages nested data collections. Collection-oriented workflows have a number of advantages over conventional approaches including simpler workflow designs (e.g., requiring fewer actors and control-flow constructs) that are invariant under changes in data nesting. Our implementation within the Kepler scientific workflow system enables the explicit representation of collections and collection schemas, concurrent operation over collection contents via multi-level pipeline parallelism, and allows collection-aware actors to be composed readily from conventional actors.

show abstract

“…In this case it was word of mouth within one institution; this barrier needs to be overcome. So, we have a situation of workflows as reusable knowledge commodities, but with potential barriers to the exchange and propagation of those scientific ideas that are captured as workflow [11].…”

Section: The Workflow As a First Class Citizenmentioning

confidence: 99%