Chapter 1: Overview of Real-World Data

Linking Real-World Data Sets to Answer Key Questions

Linking RWD sets allows investigators to enhance the depth and breadth of the data in an effort to capture integrated, longitudinal care and health outcomes.16,17 Linkages across RWD sets can be done via clear-text matching or privacy-preserving methodologies, depending on the necessary privacy framework. Large linked data sets can improve the statistical precision of estimates relative to smaller data sets, but biases that arise from missing information do not dissipate as the sample size grows. Data linkages can help address missing information, thereby reducing measurement bias in large RWD analyses. Moreover, triangulating data points across multiple data sources not only mitigates the missingness inherent to RWD but can also facilitate data validation and enhance accuracy. Although linkages can be advantageous, the need to evaluate the aggregated RWD as fit-for-purpose remains. Care must be taken to assess the reliability and validity of the data linkage itself. For example, linkages may create further issues with selection bias; reduced geographic, demographic, or clinical representation; missingness; and harmonization problems that could affect analysis.9
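The difference between clear-text matching and a privacy-preserving linkage can be illustrated with a minimal sketch. It assumes a simple deterministic approach in which each data holder replaces identifiers with a keyed hash (HMAC-SHA256) before records are joined; this is one possible technique, not a method prescribed by this chapter. The field names, the sample records, and the shared key are all hypothetical.

```python
import hashlib
import hmac


def normalize(first, last, dob):
    """Lowercase and trim identifiers so trivially different spellings match."""
    return "|".join(part.strip().lower() for part in (first, last, dob))


def token(first, last, dob, key):
    """Keyed hash (HMAC-SHA256) of the normalized identifiers."""
    message = normalize(first, last, dob).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def tokenize(records, key):
    """Replace identifiers with a token and keep only the remaining fields."""
    identifiers = ("first", "last", "dob")
    out = {}
    for rec in records:
        tok = token(rec["first"], rec["last"], rec["dob"], key)
        out[tok] = {k: v for k, v in rec.items() if k not in identifiers}
    return out


def link(tokens_a, tokens_b):
    """Inner join: keep only tokens that appear in both sources."""
    shared = tokens_a.keys() & tokens_b.keys()
    return {t: {**tokens_a[t], **tokens_b[t]} for t in shared}


# Hypothetical key that both data holders agree on out of band.
SHARED_KEY = b"hypothetical-shared-secret"

# Hypothetical claims and EHR extracts (illustrative values only).
claims = [
    {"first": "Ana", "last": "Lopez", "dob": "1980-02-14", "n_claims": 7},
    {"first": "Ben", "last": "Wu", "dob": "1975-09-30", "n_claims": 2},
]
ehr = [
    {"first": " ana ", "last": "LOPEZ", "dob": "1980-02-14", "hba1c": 6.9},
    {"first": "Cara", "last": "Singh", "dob": "1990-01-05", "hba1c": 5.4},
]

linked = link(tokenize(claims, SHARED_KEY), tokenize(ehr, SHARED_KEY))

# Share of claims records that found a partner in the EHR extract.
match_rate = len(linked) / len(claims)
```

In this sketch only records present in both sources survive the join, so the match rate and the characteristics of the unmatched records are natural starting points for assessing linkage reliability and the selection bias and reduced representation noted above. Exact matching on tokens also fails whenever identifiers differ beyond case and whitespace, which is one way a linkage can itself introduce missingness.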