I2. Identify Estimand(s)

Once the research task is clear, specify the target quantity or pattern your study is trying to estimate. This is the estimand. A well-defined estimand clarifies the population, outcome, time frame, treatment or exposure conditions, summary measure, and other features needed to interpret the result.

Description

  1. Purpose or intended use of the description (e.g., disease burden estimation, resource allocation, health service planning, public health surveillance) 

  2. Target population (including the population definition and sampling frame)

  3. Health state, event, exposure, or practice of interest

  4. Summary measure and its time frame (e.g. point prevalence on a specified date, cumulative incidence over a specified period, rate per population per year, proportion or coverage at a specified time point)

  5. Auxiliary variables and their role (e.g. age-sex standardisation, or stratification by country)
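
As an illustration of items 4 and 5, the sketch below computes a directly age-standardised rate: a weighted average of stratum-specific rates, with weights taken from a standard population. The function name, age bands, counts, and weights are all hypothetical and chosen only to make the arithmetic easy to follow; a real analysis would take its weights from a published standard population.

```python
def directly_standardised_rate(stratum_events, stratum_person_years, standard_weights):
    """Weighted average of stratum-specific rates using standard-population weights."""
    total_weight = sum(standard_weights.values())
    rate = 0.0
    for stratum, weight in standard_weights.items():
        # Crude rate within this stratum (events per person-year)
        stratum_rate = stratum_events[stratum] / stratum_person_years[stratum]
        rate += (weight / total_weight) * stratum_rate
    return rate


# Hypothetical data: events and person-years by age band
events = {"0-39": 10, "40-64": 40, "65+": 90}
person_years = {"0-39": 10000, "40-64": 8000, "65+": 3000}
# Hypothetical standard-population weights (need not sum to 1; they are normalised)
weights = {"0-39": 0.5, "40-64": 0.3, "65+": 0.2}

rate = directly_standardised_rate(events, person_years, weights)
print(f"{rate * 1000:.1f} per 1,000 person-years")
```

Stating the standard population alongside the summary measure matters because the same data give a different standardised rate under different weights.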

Signal Discovery

  1. Purpose or intended use of the signals (e.g., hypothesis generation, prioritisation of candidates for confirmatory studies, regulatory safety actions, drug repurposing, biomarker identification)

  2. Target population (including the population definition and sampling frame)

  3. Definition of the feature space and outcome space being screened (e.g. all genotyped variants and a defined phenotype in GWAS; a defined variant and all available phenotypes in PheWAS; all drug exposures and a defined adverse event in drug-wide signal detection)

  4. Summary measures (e.g. odds ratio, beta coefficient, log fold change, hazard ratio)

  5. Signal detection criteria (e.g. genome-wide significance threshold or false discovery rate target)

  6. Replication strategy (pre-specified plan for independent replication, including the replication cohort or data source, significance threshold for replication, and criteria for declaring a validated versus exploratory signal; if independent replication is not planned, this must be explicitly stated)
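
To make item 5 concrete, the following is a minimal sketch of one common way to apply a false discovery rate target, the Benjamini-Hochberg step-up procedure. It is shown only as an example of a pre-specifiable detection rule, not as the required method; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr_target=0.05):
    """Return a list of booleans, True where a test is flagged as a signal."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr_target
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr_target:
            max_rank = rank
    # Flag every test ranked at or below that cut-off
    flagged = [False] * m
    for idx in order[:max_rank]:
        flagged[idx] = True
    return flagged


# Hypothetical p-values from five screened features
p_values = [0.20, 0.001, 0.041, 0.008, 0.039]
signals = benjamini_hochberg(p_values, fdr_target=0.05)
print(signals)  # [False, True, False, True, False]
```

Fixing the threshold and the procedure before screening is what allows a flagged signal to be interpreted against the stated error-rate target.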

Factual Prediction

  1. Clinical decision, action, or policy to be informed by the prediction (e.g., whether to initiate treatment, whether to refer for further investigation, risk-stratified screening or resource allocation)

  2. Target population (including the population definition and sampling frame)

  3. Outcome definition (including the prediction horizon where appropriate, e.g. 5-year risk of heart attack)

  4. Intended deployment context, including proposed user (e.g. by a primary care doctor during a routine appointment at age 40 years)

  5. Reference time point (i.e. landmark time) when the prediction will be made

  6. Handling of treatments (e.g. whether predictions reflect outcomes regardless of treatment received, prior to treatment initiation, or with treatment as part of a composite outcome) 

  7. Handling of competing events (e.g. whether competing events such as death are handled via cause-specific or subdistribution approaches)
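
As an illustration of why item 7 needs an explicit choice, the sketch below estimates the absolute risk of one event type by a horizon while treating other event types as competing rather than as censoring. It uses the nonparametric Aalen-Johansen estimator, which is one option among several and is not prescribed by this step. The function name, the coding of event types (0 for censored), and the data are assumptions made for the example.

```python
from itertools import groupby


def cumulative_incidence(times, event_types, cause, horizon):
    """Aalen-Johansen estimate of the risk of `cause` by `horizon`.

    event_types: 0 means censored; any other integer is an event cause.
    """
    records = sorted(zip(times, event_types))
    at_risk = len(records)
    event_free = 1.0  # probability of no event of any cause before the current time
    incidence = 0.0
    for t, group in groupby(records, key=lambda r: r[0]):
        if t > horizon:
            break
        types = [e for _, e in group]
        n_cause = sum(1 for e in types if e == cause)
        n_any = sum(1 for e in types if e != 0)
        # Risk added at time t: still event-free, then the event of interest occurs
        incidence += event_free * n_cause / at_risk
        # Update the event-free probability using events of any cause
        event_free *= 1 - n_any / at_risk
        at_risk -= len(types)
    return incidence


# Hypothetical follow-up data: 1 = event of interest, 2 = competing death, 0 = censored
times = [1, 2, 3, 4, 5]
event_types = [1, 2, 0, 1, 0]
risk = cumulative_incidence(times, event_types, cause=1, horizon=5)
print(round(risk, 3))
```

Censoring the competing deaths instead would target a different quantity (risk in a hypothetical setting without the competing event), which is why the handling has to be stated in the estimand.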

Counterfactual Prediction

  1. Clinical decision, action, or policy to be informed by the prediction (e.g., which intervention to provide, optimal dose or treatment sequence, personalised treatment selection)

  2. Target population (including the population definition, sampling frame, and eligibility criteria for the specified intervention)

  3. Outcome definition (including the prediction horizon where appropriate, e.g. 5-year risk of heart attack)

  4. Intended deployment context, including proposed user (e.g. by a primary care doctor during a routine appointment at age 40 years)

  5. Reference time point (i.e. landmark time) when the prediction will be made

  6. Hypothetical treatment strategies (a precise definition of each treatment strategy under which the outcome is predicted, e.g. if all individuals initiated statins at the reference time point) 

  7. Handling of competing events (e.g. whether competing events such as death are handled via cause-specific or subdistribution approaches)

Causal Effect Estimation

  1. Clinical decision, action, or policy to be informed by the effect estimate (e.g., treatment or policy guideline development, regulatory approval, health technology assessment, post-marketing safety evaluation)

  2. Target population (including the population definition and eligibility criteria)

  3. Start of follow-up (the specific event or decision point that marks the start of follow-up, e.g. date of diagnosis)

  4. Treatment conditions (including definitions of all treatments and comparator, and whether treatment is defined as a point intervention or a sustained or dynamic strategy)

  5. Endpoint (outcome definitions and end of follow-up)

  6. Summary measure (e.g. risk ratio, risk difference)

  7. Handling of intercurrent events* (definitions of all intercurrent events and descriptions of how they will be handled, e.g. deaths handled using a treatment policy strategy)

For heterogeneous causal effects and causal mediation analyses, see the supplementary estimand specifications.
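
To show how the two example summary measures in item 6 differ, the sketch below computes a risk difference and a risk ratio from the same pair of risks. The function name and counts are hypothetical, and the calculation is a crude, unadjusted contrast: it only has a causal interpretation under assumptions (for example randomisation or adequate confounding adjustment) that this snippet does not address.

```python
def risk_measures(events_treated, n_treated, events_comparator, n_comparator):
    """Crude risks in each arm and their contrast on two scales."""
    risk_treated = events_treated / n_treated
    risk_comparator = events_comparator / n_comparator
    return {
        "risk_treated": risk_treated,
        "risk_comparator": risk_comparator,
        # Absolute scale: difference in risks
        "risk_difference": risk_treated - risk_comparator,
        # Relative scale: ratio of risks
        "risk_ratio": risk_treated / risk_comparator,
    }


# Hypothetical counts: 30 events among 1,000 treated, 50 among 1,000 comparators
result = risk_measures(30, 1000, 50, 1000)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The same data yield a risk difference of about -0.02 and a risk ratio of about 0.6, so the estimand should name the scale rather than leave it to be inferred from the analysis.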

By the end of this step, you should have:

  • Defined the target population

  • Specified the health state, event, exposure, treatment, intervention, feature, or outcome of interest

  • Identified the relevant time frame or prediction horizon

  • Selected the summary measure or target quantity

  • Clarified how competing events, treatments, interventions, or intercurrent events will be handled where relevant

  • Written a complete estimand statement
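
One way to check that an estimand statement is complete is to record each component as a named field and confirm none is empty. The sketch below does this for the Causal Effect Estimation components listed above. The class name, field names, and example values are hypothetical placeholders, not a prescribed template.

```python
from dataclasses import dataclass, fields


@dataclass
class CausalEffectEstimand:
    """Hypothetical container mirroring the seven causal effect components."""
    decision: str
    target_population: str
    start_of_follow_up: str
    treatment_conditions: str
    endpoint: str
    summary_measure: str
    intercurrent_events: str

    def missing_components(self):
        """Names of components that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative draft with one component still unspecified
draft = CausalEffectEstimand(
    decision="Treatment guideline development",
    target_population="Adults meeting the stated eligibility criteria",
    start_of_follow_up="Date of diagnosis",
    treatment_conditions="Initiate statins at start of follow-up vs. do not initiate",
    endpoint="Heart attack within 5 years of start of follow-up",
    summary_measure="Risk difference",
    intercurrent_events="",
)
print(draft.missing_components())  # ['intercurrent_events']
```

A blank field is a prompt to return to the relevant item in the lists above before moving to the next step.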

RIGOROUS