2021
DOI: 10.1108/ijwis-03-2021-0026

Data quality for federated medical data lakes

Abstract: Purpose: Medical research requires biological material and data collected through biobanks in reliable processes with quality assurance. Medical studies based on data with unknown or questionable quality are useless or even dangerous, as evidenced by recent examples of withdrawn studies. Medical data sets consist of highly sensitive personal data, which has to be protected carefully and is available for research only after the approval of ethics committees. The purpose of this research is to propose an architec…

citations
Cited by 12 publications
(3 citation statements)
references
References 45 publications
“…This, combined with the dynamic evolutionary capacity of malware, allows it to assume semantically similar but structurally dissimilar forms, thereby causing novel damages such as connection blockages, system corruption, password theft, and more, posing threats to users, organizations, and systems. Recent history demonstrates how malware has become the primary tool for initiating large-scale attacks, resulting in severe damages and economic losses [11][12][13][14][15][16]. Traditionally, various techniques have been used for malware detection, including signature-based approaches (utilizing regular expressions, file names, etc.…”
Section: Related Work (mentioning)
confidence: 99%
“…Data warehouses perform these functions through ETL (Extract, Transform, Load) processes, which entails high costs for creating, maintaining, and managing these environments. This is one of the reasons data lakes are gaining ground as an alternative for storing large amounts of information in its original form and format without high data management costs (Eder & Shekhovtsov, 2021).…”
Section: Health Information Repositories - Data Lakes (unclassified)
“…In essence, a data lake is a flexible, scalable data storage and management system, which ingests and stores raw data from heterogeneous sources in their original format, and provides maintenance, query processing and data analytics in an on-the-fly manner, with the help of rich metadata [116], [138], [142], [143]. Data lakes are proposed to store and manage data in many real-life use cases: Internet of things (IoT) and smart city [99], manufacturing [112], medicine [42], [55], [114], mobility service (e.g., Uber) [50], biology [23], smart grids [20], [103], air quality control [145], flights data [96], disease control, labor markets and products [13].…”
Section: Introduction (mentioning)
confidence: 99%