2021
DOI: 10.3233/shti210183

Automatic Detection of Metadata Errors in a Registry of Clinical Studies Using Shapes Constraint Language (SHACL) Graphs

Abstract: Registries of clinical studies such as ClinicalTrials.gov are an important source of information. However, manually entering metadata is error-prone, which impedes use of the metadata and thereby the overall usefulness of the registry. In this work, we propose a generic approach to detecting errors in the metadata, using the Shapes Constraint Language to define rule templates covering constraints on value type and cardinality. We developed a Python 3 algorithm for the automatic vali…
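To make the idea of rule templates for value type and cardinality concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm and it does not use SHACL or any RDF library; it only mirrors, under stated assumptions, the kinds of checks that SHACL constraints such as sh:datatype, sh:minCount, sh:maxCount and sh:in express. The class FieldRule, the function validate_record, the field names (study_id, gender, study_type, enrollment) and the allowed values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical rule template, loosely analogous to a SHACL property shape."""
    path: str                                  # metadata field name (assumed)
    value_type: type = str                     # analogue of sh:datatype
    min_count: int = 0                         # analogue of sh:minCount
    max_count: Optional[int] = None            # analogue of sh:maxCount
    allowed: Optional[FrozenSet[Any]] = None   # analogue of sh:in


def validate_record(record: Dict[str, Any], rules: List[FieldRule]) -> List[str]:
    """Return one human-readable message per constraint violation."""
    violations: List[str] = []
    for rule in rules:
        raw = record.get(rule.path, [])
        # Treat a single value as a one-element list so counts are uniform.
        values = raw if isinstance(raw, list) else [raw]

        # Cardinality checks.
        if len(values) < rule.min_count:
            violations.append(
                f"{rule.path}: expected at least {rule.min_count} value(s), "
                f"found {len(values)}"
            )
        if rule.max_count is not None and len(values) > rule.max_count:
            violations.append(
                f"{rule.path}: expected at most {rule.max_count} value(s), "
                f"found {len(values)}"
            )

        # Value type and allowed-value checks.
        for value in values:
            if not isinstance(value, rule.value_type):
                violations.append(
                    f"{rule.path}: {value!r} is not of type "
                    f"{rule.value_type.__name__}"
                )
            elif rule.allowed is not None and value not in rule.allowed:
                violations.append(f"{rule.path}: {value!r} is not an allowed value")
    return violations


# Illustrative rules and records; every name and value here is an assumption.
RULES = [
    FieldRule(path="study_id", min_count=1, max_count=1),
    FieldRule(path="gender", min_count=1, max_count=1,
              allowed=frozenset({"All", "Female", "Male"})),
    FieldRule(path="study_type", min_count=1, max_count=1,
              allowed=frozenset({"Interventional", "Observational"})),
    FieldRule(path="enrollment", value_type=int, max_count=1),
]

GOOD = {"study_id": "EXAMPLE-001", "gender": "All",
        "study_type": "Observational", "enrollment": 50}
BAD = {"study_id": ["EXAMPLE-001", "EXAMPLE-002"], "gender": "Unknown",
       "enrollment": "fifty"}

for message in validate_record(BAD, RULES):
    print(message)
```

In an actual SHACL workflow the rules would be written as shapes in an RDF graph and evaluated by a SHACL processor against the registry data graph; the dictionary-based records above stand in for that only to keep the sketch self-contained.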

Cited by 2 publications (3 citation statements)
References 7 publications

“…As stated earlier, SHACL is a relatively new standard [1] (developed in 2017) and has mainly been tested on generic data graphs like DBpedia and WikiData [11,13]. Very few initiatives exist in creating SHACL shapes for biomedical data graphs, and they mainly focus on data graphs representing Electronic Health Record (EHR) models [15] and patient information [16]. Other clinical use cases for SHACL shapes include validating medical guidelines to integrate Fast Healthcare Interoperability Resources (FHIR) into decision-making systems [17], validating medical reports to identify missing data [18], and validating clinical trial study data to detect missing values, wrong cardinalities, and incorrect values that do not adhere to a predefined set [16].…”
Section: Related Work (mentioning)
confidence: 99%
“…Thus, the nature of SHACL shape creation methods employed in these studies is similar to generic data graphs, which include Ontology Design Patterns (ODPs) and existing clinical reference model constraints being converted to SHACL shape constraints. For example, the authors of [16] derived SHACL shapes to regulate the values for fields like gender, study ID, and study type in a clinical trial report. Developing SHACL shapes for such fields is mainly focused on constraining the data types and values for such fields rather than ensuring the presence of missing properties in a class based on the semantics of the class name, which is the case with biomedical ontology data graphs.…”
Section: Related Work (mentioning)
confidence: 99%
“…certain policies, such as GDPR requirements [35,2]. Other applications of SHACL include type checking program code [27] and detecting metadata errors in clinical studies [22]. SHACL is also used by the European Commission to facilitate data sharing, for example by validating metadata about public services against the recommended vocabularies [46].…”
Section: Adoption of SHACL (mentioning)
confidence: 99%