From Framework to Metric: Developing a Quantitative Data Risk Score for Scientific Collections

Yan Hua Dong¹
¹Department of Computer Science, The University of York, UK
DOI: https://doi.org/10.71448/bcds2453-2
Published: 30/09/2024
Cite this article as: Yan Hua Dong. From Framework to Metric: Developing a Quantitative Data Risk Score for Scientific Collections. Bulletin of Computer and Data Sciences, Volume 5 Issue 3. Page: 21-34.

Abstract

Background: The qualitative data risk assessment matrix proposed by Mayernik et al. (2020) provides a crucial foundation for identifying threats to scientific data preservation. However, its qualitative nature limits its utility for cross-collection comparison and systematic resource allocation.

Objective: This paper presents a methodology for transforming the qualitative risk matrix into a quantitative Data Risk Score (DRS), enabling objective prioritization of data preservation efforts.

Methods: We employed a two-stage Delphi method with 30 international data stewardship experts to assign weights to the 21 risk factors and 10 categorization methods from the original framework. These weights were integrated into a scoring algorithm.

Results: The resulting DRS was validated against three case studies: a modern genomic repository, a legacy social science archive, and a distributed ecological network. The score effectively discriminated risk levels between collections and provided a transparent basis for prioritization.

Conclusion: The Data Risk Score operationalizes the conceptual risk framework, providing repositories, funders, and institutions with an actionable metric to guide preservation strategy and investment.

Keywords: data risk assessment, quantitative metric, data preservation, digital stewardship, prioritization, Delphi method

Copyright © 2024 Yan Hua Dong. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
