2023
DOI: 10.1099/mgen.0.000908

The DataHarmonizer: a tool for faster data harmonization, validation, aggregation and analysis of pathogen genomics contextual information

Abstract: Pathogen genomics is a critical tool for public health surveillance, infection control, outbreak investigations as well as research. In order to make use of pathogen genomics data, they must be interpreted using contextual data (metadata). Contextual data include sample metadata, laboratory methods, patient demographics, clinical outcomes and epidemiological information. However, the variability in how contextual information is captured by different authorities and how it is encoded in different databases pose…

Cited by 5 publications (5 citation statements)
References 19 publications

“…Meanwhile, more and more technical tools for harmonization are being made available to help data scientists, especially in the natural sciences, engage in data harmonization without having to reinvent the wheel. Some examples of tools to aid in data harmonization include: the DataHarmonizer, a standardized browser-based spreadsheet editor which is geared toward genomics data [129] and HarmonizeR, an R package which makes available an algorithm can deal with missing data in omics datasets [130]. Researchers, especially in epidemiology, may further benefit from making use Rmonize [131], an R package which provides functions to support retrospective data harmonization, evaluation and documentation based on the guidelines developed by Fortier et al.…”
Section: What Tradeoffs Should Be Considered When Harmonizing Data?
Citation type: mentioning
confidence: 99%
“…3. Laboratories and data systems should provide tools that map varying, but similar, metadata requirements across different systems [61].…”
Section: Summary Of Needs
Citation type: mentioning
confidence: 99%
“…These different implementations are supported by expert curators to validate terms and ensure compliance. Alternatively, the NMDC uses a specialized data submission tool called DataHarmonizer [30], which provides real-time validation to users and aims to lower barriers for metadata submission (Fig. 3).…”
Section: How To Use the MIxS Standard For Data Submission
Citation type: mentioning
confidence: 99%
“…Fig. 3 Screenshot of the NMDC Submission Portal (https://data.microbiomedata.org/submission/home), which uses DataHarmonizer [30]. EnvO terms are shown as dropdowns for value sets along with validation checks for terms with measurement fields…”
Section: How To Use the MIxS Standard For Data Submission
Citation type: mentioning
confidence: 99%