A curated corpus of Simulink models for model-based empirical studies

Chowdhury, Shafiul Azam; Varghese, Lina Sera; Soumik, Mohian; Johnson, Taylor T.; Csallner, Christoph

doi:10.1145/3196478.3196484

Cited by 22 publications

(6 citation statements)

References 6 publications


“…We took several countermeasures to rule out potential errors in these calculations. For those metrics already reported in the study of Chowdhury et al [15], which are based on a model corpus that overlaps with ours, we have checked the plausibility and were able to reproduce their results. Further, we used the Matlab/API to parse the Simulink models, which prevents errors introduced by other custom-built Simulink parsers.…”

Section: Internal Validity (mentioning)

confidence: 64%

“…Later on, they apply this framework to explore the impact of software evolution on the behavior of three controllers designed with Simulink, focusing on the mismatches that arise between control models and the corresponding control software. Chowdhury et al [15] reported on a large set of freely available Simulink models that they crawled from various sources on the Internet. They analyzed these models in terms of content and reported basic measures such as the number of blocks and connections.…”

Section: Empirical Studies On Model Characteristics (mentioning)

confidence: 99%

“…This study can be classified as a quantitative and qualitative non-probability sample study [59]. Our sample is based on the largest [14] set of publicly available Simulink models, collected by Chowdhury et al [15]. The set by Chowdhury et al comprises a smaller Simulink model collection [34], a Stateflow model collection by the CoCo-Sim-Team [9], and many other projects from Matlab Central, SourceForge, GitHub, and other sources such as web sites of Although critical open-source repository sites could have been missed, the set by Chowdhury et al covers a wide range of sources.…”

Section: Study Subjects (mentioning)

confidence: 99%

“…The set by Chowdhury et al comprises a smaller Simulink model collection [34], a Stateflow model collection by the CoCo-Sim-Team [9], and many other projects from Matlab Central, SourceForge, GitHub, and other sources such as web sites of Although critical open-source repository sites could have been missed, the set by Chowdhury et al covers a wide range of sources. Instead of using the provided dataset as it is, we re-collected a current snapshot in August 2020 consisting of all constituent Simulink models based on the information provided in the meta-data of the corpus of Chowdhury et al. The main motivation for this new snapshot arises from several inconsistencies we found between the actual corpus and the results presented in [15]. According to personal correspondence with the authors, these inconsistencies may originate from only a subset of the entire corpus models being used in their study.…”

Section: Study Subjects (mentioning)

confidence: 99%

“…As a step to overcome this situation, we investigate a set of 1,734 freely available Simulink models from 194 projects, originally collected by Chowdhury et al [15] and updated in terms of our study. The set comprises projects from Matlab Central, SourceForge, GitHub, and other web pages, as well as two smaller sets [9,34].…”

Section: Introduction (mentioning)

confidence: 99%


Characteristics, potentials, and limitations of open-source Simulink projects for empirical research

et al. 2021


Simulink is an example of a successful application of the paradigm of model-based development into industrial practice. Numerous companies create and maintain Simulink projects for modeling software-intensive embedded systems, aiming at early validation and automated code generation. However, Simulink projects are not as easily available as code-based ones, which profit from large publicly accessible open-source repositories, thus curbing empirical research. In this paper, we investigate a set of 1734 freely available Simulink models from 194 projects and analyze their suitability for empirical research. We analyze the projects considering (1) their development context, (2) their complexity in terms of size and organization within projects, and (3) their evolution over time. Our results show that there are both limitations and potentials for empirical research. On the one hand, some application domains dominate the development context, and there is a large number of models that can be considered toy examples of limited practical relevance. These often stem from an academic context, consist of only a few Simulink blocks, and are no longer (or have never been) under active development or maintenance. On the other hand, we found that a subset of the analyzed models is of considerable size and complexity. There are models comprising several thousands of blocks, some of them highly modularized by hierarchically organized Simulink subsystems. Likewise, some of the models expose an active maintenance span of several years, which indicates that they are used as primary development artifacts throughout a project’s lifecycle. According to a discussion of our results with a domain expert, many models can be considered mature enough for quality analysis purposes, and they expose characteristics that can be considered representative for industry-scale models. Thus, we are confident that a subset of the models is suitable for empirical research. 
More generally, using a publicly available model corpus or a dedicated subset enables researchers to replicate findings, publish subsequent studies, and use them for validation purposes. We publish our dataset for the sake of replicating our results and fostering future empirical research.


Advanced discovery mechanisms in model repositories

Indamutsa, Di Rocco, Almonte et al. 2024

Softw Pract Exp


Summary: As model‐driven engineering gains traction and poses as the new paradigm for software engineering, it raises a need for efficient approaches and tools to manage, discover, and retrieve relevant modeling artifacts. Hence, industry and academia are conceiving effective ways to store, search, and retrieve heterogeneous model artifacts that employ advanced discovery mechanisms. This paper presents MDEForge‐Search, a novel approach to discovering heterogeneous model artifacts over MDEForge, a distributed cloud‐based model repository. We designed advanced discovery mechanisms that retrieve heterogeneous artifacts within their context (megamodel) and reuse them across model management services. In addition, a domain‐specific approach has been proposed to formulate queries in terms of keywords, search tags, conditional operators, quality model assessment services and a transformation chain discoverer. Finally, the applicability of our approach was assessed in a recommender system modeling framework, which, thanks to the operated integration, can rely on the availability of more than 5000 model artifacts currently persisted in our cloud‐based model repository.


On the Replicability of Experimental Tool Evaluations in Model-Based Development

Boll, Kehrer 2020

Communications in Computer and Information Science
