Data quality for federated medical data lakes

Publication by partner Alpen Adria University

Medical research requires biological material and data of high quality, such as those held in biobanks. Medical studies based on data of unknown or questionable quality are useless or even dangerous.


The authors of this paper propose an IT architecture that helps researchers efficiently and effectively identify relevant collections of material and data with documented quality for their research projects, while observing strict privacy rules.


They describe the landscape of biobanks as federated medical data lakes, such as the collections of samples and their annotations in the European federation of biobanks BBMRI-ERIC, and develop a conceptual model that captures schema information with quality annotations.


Johann Eder, Vladimir A. Shekhovtsov: Data quality for federated medical data lakes. International Journal of Web Information Systems, Vol. 17 No. 5, 2021, pp. 407-426. Emerald Publishing Limited, ISSN 1744-0084, DOI 10.1108/IJWIS-03-2021-0026