Chapter 3: Methods in Real-World Evidence Generation - Sources of Error

1. Confounding

Author: Li H


Confounding is the distortion of the treatment-outcome association that arises when the groups being compared differ with respect to variables that influence the outcome. This is a particular challenge for real-world evidence (RWE) studies, which are conducted in observational settings where treatments are not randomly assigned. Structurally, confounding occurs when a variable is a common cause of the exposure and the outcome, as illustrated in Figure 3.1. In other words, a confounder must be distributed unequally among the groups being compared, must be associated with the outcome of interest, and must not be an intermediate factor between the exposure and the outcome.1 As a note of caution, authors sometimes use the term “selection bias” to refer to structural confounding, in the sense that treatments might be “selected” for patients based on characteristics that are also risk factors for the outcome. Selection bias is the result of differential selection into one or more of the study groups; it may induce confounding if the characteristic on which differential selection is based is associated with both the exposure and the outcome of interest. Nevertheless, these are distinct biases, as illustrated in Figure 3.1 and as described below in Section 2 of this chapter on selection bias.
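The common-cause structure described above can be illustrated with a small numerical sketch. The counts below are hypothetical and chosen only for illustration (they are not taken from any study): disease severity influences both who is treated and who experiences the outcome, while the treatment itself has no effect within either severity stratum. The Mantel-Haenszel risk ratio is used here as one possible stratified summary.

```python
# Hypothetical counts, for illustration only (not from any study).
# n1/n0: number treated/untreated; a/c: outcome events among treated/untreated.
strata = {
    "severe": {"n1": 800, "a": 160, "n0": 200, "c": 40},
    "mild": {"n1": 200, "a": 10, "n0": 800, "c": 40},
}


def risk_ratio(a, n1, c, n0):
    """Risk in the treated divided by risk in the untreated."""
    return (a / n1) / (c / n0)


# Crude analysis: collapse over severity, ignoring the confounder.
crude_rr = risk_ratio(
    sum(s["a"] for s in strata.values()),
    sum(s["n1"] for s in strata.values()),
    sum(s["c"] for s in strata.values()),
    sum(s["n0"] for s in strata.values()),
)

# Stratum-specific risk ratios: compare like with like.
stratum_rr = {
    name: risk_ratio(s["a"], s["n1"], s["c"], s["n0"])
    for name, s in strata.items()
}

# Mantel-Haenszel summary risk ratio across severity strata.
mh_num = sum(s["a"] * s["n0"] / (s["n1"] + s["n0"]) for s in strata.values())
mh_den = sum(s["c"] * s["n1"] / (s["n1"] + s["n0"]) for s in strata.values())
mh_rr = mh_num / mh_den

print(f"crude RR: {crude_rr:.3f}")
print(f"stratum-specific RRs: {stratum_rr}")
print(f"Mantel-Haenszel RR: {mh_rr:.3f}")
```

In this constructed example the crude risk ratio is well above 1 even though the risk ratio is 1 within each severity stratum, because severely ill patients are both more likely to be treated and more likely to have the outcome.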

In comparing drug treatments, confounding by indication is a common issue. It occurs because patients are treated with a particular drug for a specific condition2 or because patients have conditions that are contraindications to treatment with the medication, and these conditions or contraindications also affect the outcome of interest. For example, patients with a history of gastric problems may preferentially receive cyclooxygenase-2 inhibitors (e.g., celecoxib) rather than traditional non-steroidal anti-inflammatory agents such as ibuprofen, which will result in a higher baseline risk of gastric ulcer or bleeding in the celecoxib group, independent of any drug effect.3 To the extent that confounders (variables that are associated with both exposure and outcome) are measured in real-world data, observational studies can address potential bias due to confounding. However, many real-world data sources lack information on important prognostic variables, including confounders.

Restriction, such as to patients with evidence of a particular condition, is a common strategy to reduce confounding in real-world evidence studies. However, even a study restricted to patients with the same condition might be susceptible to strong confounding if those patients differ in disease severity. For example, patients with COVID-19 infection can experience a wide range of clinical manifestations, from no symptoms to critical illness; furthermore, patients with certain underlying comorbidities are at a higher risk of progressing to severe COVID-19 or mortality. The prognosis of COVID-19 infection differs by underlying disease severity.4 When comparing treatments for COVID-19, it is therefore critical to account for disease severity and prognosis at the time of treatment initiation. In addition, when RWE studies compare different lines of therapy or dose levels, it is important to recognize that second- or third-line therapy or higher doses are likely given to patients with more severe disease. This can result in confounding by severity, in which the observed association reflects the underlying disease severity rather than the effect of the study drug.5

1: Confounding by indication/contraindication

Paranjpe et al. published a retrospective cohort study examining the association between therapeutic anticoagulation and in-hospital mortality among patients with COVID-19.6 Several commentators have raised concerns about potential confounding by indication, alongside other concerns including unmeasured confounding and immortal time bias.7 One of the concerns was that clinicians prescribe anticoagulation for some patients hospitalized with COVID-19 based on evidence of clotting and clinical judgment of their medical needs, while patients who do not receive anticoagulation typically do not have a clinical indication or may have contraindications such as advanced age, prior hemorrhage, or other bleeding risk factors. Without a full accounting of the treatment indications and contraindications that might influence treatment decisions and affect mortality, uncontrolled confounding is likely to affect the results.

2: Confounding by disease severity

Schultze et al. examined the association between use of inhaled corticosteroids and COVID-19-related death in patients with chronic obstructive pulmonary disease (COPD) or asthma using the OpenSAFELY platform. To the extent that asthma and COPD severity affect both corticosteroid use and COVID-19-related death, confounding by severity is possible.8 The investigators sought to measure and adjust for underlying health conditions that may differ between individuals prescribed inhaled corticosteroids and those using other medications for asthma and COPD. Confounding by severity was further evaluated using a negative control outcome. A harmful association between inhaled corticosteroid use and non-COVID-19-related death was also observed, which could be related to the severity of the disease requiring treatment. This suggests that the real-world data source did not capture all markers of disease severity, resulting in a distorted exposure-outcome association. Negative control outcomes can be used to detect9 and correct for otherwise unaddressed sources of confounding.10

Why is it a problem?

Confounding is one of the major threats to observational studies and is frequently cited as an important difference in internal validity between observational studies and randomized trials. Confounding by indication and severity can be particularly problematic in real-world evidence studies conducted in the context of a rapidly evolving pandemic, as treatment decisions might vary across clinical, functional, and behavioral patient characteristics. Treatment decisions might also vary over time, because the determinants of treatment change frequently as evidence on outcome risk factors and on treatment effectiveness evolves. Physicians prescribe drugs based on the most current diagnostic and prognostic information available at the time of treatment decision-making, and in the context of current, but potentially rapidly changing, practice patterns. Regional and temporal patterns of infection transmission and the waxing and waning of different SARS-CoV-2 variants associated with different degrees of outcome severity can also be a source of confounding in the pandemic. Investigators must carefully consider and account for how these factors might vary over the course of a study period.11

How to handle it

Schneeweiss et al. outline a framework for addressing measured and unmeasured confounding.12

Measured confounders can be addressed either in the design, by restriction or matching, or in the analysis, through standardization (or weighting), stratification, or multivariable regression modeling. Propensity scores are a commonly used tool to implement many of these strategies. Variables that are not measured in the available data but can be obtained for a subset of patients could be addressed via two-stage sampling or external adjustment approaches. Unmeasured variables can sometimes be addressed by certain design and analysis strategies. For example, a self-controlled design is not subject to between-person confounding, although it is susceptible to time-varying confounding. Instrumental variable analysis, which relies on leveraging a variable that is associated with the exposure, but not with the outcome except through its association with the exposure, can yield valid results even in the presence of unmeasured confounding of the treatment-outcome association.
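As a minimal sketch of the weighting approach for a measured confounder, the simulation below applies inverse probability of treatment weighting (IPTW) with a propensity score. All values are assumptions made for illustration: a single binary confounder (`severe`), treatment probabilities of 0.7 and 0.2 by severity, and a true risk difference of -0.05 for treatment. With one binary confounder the propensity score can be estimated as the observed proportion treated in each severity level; real analyses would typically model many covariates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumed data-generating process (illustrative values, not from any study).
severe = rng.binomial(1, 0.4, n)                 # measured confounder
p_treat = np.where(severe == 1, 0.7, 0.2)        # sicker patients treated more often
treated = rng.binomial(1, p_treat)
p_death = 0.10 + 0.20 * severe - 0.05 * treated  # true risk difference = -0.05
death = rng.binomial(1, p_death)

# Crude comparison ignores severity and is confounded.
crude_rd = death[treated == 1].mean() - death[treated == 0].mean()

# Propensity score: estimated probability of treatment given severity.
ps = np.where(
    severe == 1,
    treated[severe == 1].mean(),
    treated[severe == 0].mean(),
)

# Inverse probability of treatment weights.
weights = np.where(treated == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# Weighted risks estimate what would be seen if severity were balanced.
risk_treated = np.average(death[treated == 1], weights=weights[treated == 1])
risk_untreated = np.average(death[treated == 0], weights=weights[treated == 0])
iptw_rd = risk_treated - risk_untreated

print(f"crude risk difference: {crude_rd:+.3f}")
print(f"IPTW risk difference:  {iptw_rd:+.3f}")
```

Under these assumptions the crude risk difference has the wrong sign (treatment appears harmful), whereas the weighted estimate is close to the simulated effect of -0.05. The sketch addresses only the measured confounder; it offers no protection against confounders absent from the data.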

In observational studies, confounding by indication and severity may be strongest and most difficult to adjust for when comparing treated with untreated persons, since individuals who require and use a medication may be clinically different from those who do not.13 Comparing patients exposed to a particular treatment with patients exposed to an active comparator addresses confounding by both measured and unmeasured factors to the extent that the treatments are used interchangeably. Using appropriate active comparators has been shown to reduce the impact of confounding by indication by balancing some unmeasured patient characteristics.14,15

In certain clinical contexts, important confounders may be unmeasured and should be addressed by available methods.16 Quantitative bias analysis, including the E-value,17 allows researchers to correct or bound their estimates by making certain assumptions about the direction and magnitude of potential unmeasured confounding. The E-value represents the minimum strength of association that an unmeasured confounder would need to have with both the exposure and the outcome, conditional on the measured covariates, to fully explain away an observed association between an exposure and an outcome. Quantitative bias analysis methods, including empirical calibration using negative control outcomes, can also be used to correct for unmeasured confounding.
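For an estimate on the risk ratio scale, the E-value has a closed form: for RR > 1 it equals RR + sqrt(RR × (RR − 1)), and for RR < 1 the reciprocal of RR is used first. The sketch below implements this formula; the function names and the example estimate (RR 2.0, 95% CI 1.3 to 3.1) are hypothetical and chosen only for illustration. Odds ratios or hazard ratios for common outcomes would need to be converted to an approximate risk ratio before applying it.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a point estimate on the risk ratio scale."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective estimates: work with the reciprocal
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_ci(lower: float, upper: float) -> float:
    """E-value for the confidence limit closest to the null (RR = 1)."""
    if lower <= 1.0 <= upper:
        return 1.0  # interval already includes the null
    closest = lower if lower > 1.0 else upper
    return e_value(closest)


# Hypothetical estimate, for illustration only.
rr, ci_low, ci_high = 2.0, 1.3, 3.1
print(f"E-value for the point estimate: {e_value(rr):.2f}")
print(f"E-value for the confidence limit: {e_value_ci(ci_low, ci_high):.2f}")
```

A large E-value means that only a strong unmeasured confounder could account for the observed association; a value near 1 means that weak unmeasured confounding would suffice.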