As part of Open Access Week, we want to recognize the critical role of open data in research and innovation. While the importance of sharing data is well acknowledged, reusing open data and incorporating it into your research can be equally critical.
Why reuse data?
1. Efficiency and Time-Saving – Reusing open data can accelerate the research process by reducing the time and resources spent on collecting new data.
2. Replicability and Validation – Reusing open data enables the replication of studies and the validation of research findings, which helps foster transparency and trust in scholarly work.
3. Ethics and Reducing Oversampling Burden – Certain populations, particularly marginalized and vulnerable ones, are sometimes oversampled in research. Reusing open data helps reduce that burden.
4. Cross-Disciplinary Insights – Researchers can blend data from different disciplines, encouraging cross-disciplinary collaboration.
How to approach data reuse?
1. Assess Quality & Compatibility – Check for completeness, biases, and representation. Consider whether the scope, variables, and collection methods align with your research objectives (e.g. if you were going to collect your own data, would you do it the same way?).
2. Review Documentation – Part of quality assessment is reviewing the metadata and understanding the collection methods and any cleaning processes the data underwent. You should also examine the licensing terms and any ethical concerns around privacy and consent.
3. Data Integration – Open data might not always be a perfect fit for your research question, but it can serve as supplementary or contextual data that makes your primary research data more robust.
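As a rough illustration of the first and third steps, here is a minimal Python sketch that checks how complete each column of an open dataset is and then attaches it as context to primary research data. Everything in it is hypothetical and made up for the example: the inline CSV, the column names (region, year, median_income, respondents), and the helper functions. A real assessment would also need to look at bias, representation, documentation, and licensing, which code like this cannot judge for you.

```python
import csv
import io

# Hypothetical open dataset, kept inline so the sketch runs as-is.
OPEN_DATA_CSV = """region,year,median_income
north,2020,41000
south,2020,
east,2020,38500
"""

# Hypothetical primary research data that shares the "region" key.
PRIMARY_ROWS = [
    {"region": "north", "respondents": 120},
    {"region": "south", "respondents": 95},
    {"region": "west", "respondents": 60},
]


def load_rows(csv_text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def completeness(rows):
    """Return the share of non-empty values for each column."""
    if not rows:
        return {}
    total = len(rows)
    return {
        col: sum(1 for row in rows if (row[col] or "").strip()) / total
        for col in rows[0]
    }


def add_context(primary, open_rows, key):
    """Attach open-data columns to primary rows that share the same key.

    Primary rows without a match are kept and flagged, so gaps in the
    open data's coverage stay visible instead of silently vanishing.
    """
    lookup = {row[key]: row for row in open_rows}
    merged = []
    for row in primary:
        match = lookup.get(row[key])
        combined = dict(row)
        if match is not None:
            for col, value in match.items():
                if col != key:
                    combined[col] = value
        combined["has_context"] = match is not None
        merged.append(combined)
    return merged


if __name__ == "__main__":
    open_rows = load_rows(OPEN_DATA_CSV)
    for col, share in completeness(open_rows).items():
        print(f"{col}: {share:.0%} complete")
    for row in add_context(PRIMARY_ROWS, open_rows, "region"):
        print(row)
```

Keeping unmatched rows and flagging them, rather than dropping them, is a deliberate choice here: it shows where the open data does not cover your own sample, which is itself part of judging compatibility.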