What, When and How to Measure Post COVID-19 Condition: The New Core Outcome Set for Clinical Trials

Introduction and Context

Post COVID-19 condition (PCC), often called “long COVID,” has become a persistent public-health and clinical challenge since the acute phase of the pandemic. Patients report multi-system symptoms — fatigue, breathlessness, cognitive difficulties, pain, sleep disturbance and mental health problems — that can persist for months and impair quality of life and function. Clinical trials and intervention studies aiming to treat PCC have proliferated, but heterogeneity in which outcomes are measured, when they are assessed, and how they are measured has made it difficult to compare results, synthesize evidence, and accelerate effective care strategies.

In response, Pang et al. convened a multi-stakeholder effort and published “A Core Outcome Set for Clinical Trials on Post COVID-19 Condition: ‘What,’ ‘When,’ and ‘How’ to Measure” (J Evid Based Med. 2025). The principal product of that work is COS‑PCC: a consensus-derived, pragmatic set of core outcomes and recommended measurement instruments for PCC trials. This article summarizes the COS‑PCC, explains why it was needed, details the chosen outcomes and instruments, and discusses implications and remaining controversies for clinicians, trialists, and policy-makers.

Key contextual references that frame COS development include the World Health Organization’s clinical case definition of PCC (Delphi consensus, 2021) and national guidance such as the NICE guideline on managing the long-term effects of COVID-19 (NG188). COS development followed the accepted core-set methodology promoted by the COMET Initiative, together with the associated development and reporting standards (e.g., COS‑STAD and COS‑STAR).

References: WHO (2021); NICE (2020/2021); COMET Initiative; Kirkham et al., Trials 2016; Williamson et al., Trials 2012.

New Guideline Highlights

Pang et al. used a systematic approach: literature review and measurement-method mapping; surveys of clinicians and patients; two Delphi rounds with multi-stakeholder participants; and a final consensus meeting. From an initial inventory of 52 outcomes across seven categories and 206 measurement methods, the group prioritized four domains and nine core outcomes with specified measurement instruments. Consensus also supported 16 optional measurement methods to supplement trials.

The nine core outcomes and their recommended primary instruments are:

– Dyspnea — modified Medical Research Council (mMRC) dyspnea scale
– Cough — Leicester Cough Questionnaire (LCQ)
– Exercise capacity — 6‑minute walk test (6MWT)
– Fatigue — Fatigue Severity Scale (FSS)
– Pain — Numerical Rating Scale (NRS)
– Sleep disturbance — Pittsburgh Sleep Quality Index (PSQI)
– Anxiety — Generalized Anxiety Disorder Scale‑7 (GAD‑7)
– Depression — Patient Health Questionnaire‑9 (PHQ‑9)
– Health status / quality of life — 36‑item Short Form Health Survey (SF‑36)
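As a minimal sketch of how a trial team might check a draft protocol against this list, the snippet below encodes the nine outcome-instrument pairs and reports any that are missing. The function name `missing_core_instruments` and the example protocol are hypothetical; only the pairs themselves come from the list above.

```python
# Core outcomes and primary instruments as listed above (COS-PCC).
COS_PCC_CORE = {
    "Dyspnea": "mMRC",
    "Cough": "LCQ",
    "Exercise capacity": "6MWT",
    "Fatigue": "FSS",
    "Pain": "NRS",
    "Sleep disturbance": "PSQI",
    "Anxiety": "GAD-7",
    "Depression": "PHQ-9",
    "Health status / quality of life": "SF-36",
}


def missing_core_instruments(planned_instruments):
    """Return the core outcomes whose primary instrument is not in the protocol."""
    # Compare case-insensitively so "mmrc" and "mMRC" are treated alike.
    planned = {name.strip().upper() for name in planned_instruments}
    return {
        outcome: instrument
        for outcome, instrument in COS_PCC_CORE.items()
        if instrument.upper() not in planned
    }


# Hypothetical draft protocol that omits the cough and sleep instruments.
protocol = ["mMRC", "6MWT", "FSS", "NRS", "GAD-7", "PHQ-9", "SF-36"]
print(missing_core_instruments(protocol))
```

A check of this kind only confirms that the instruments are named in the protocol; it says nothing about whether they are administered at appropriate timepoints.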

Major themes and takeaways

– Pragmatism and feasibility drove instrument selection: widely used, validated tools that are feasible in outpatient trial settings were favored.
– Patient and clinician input was central: both symptom burden and functional impact informed outcome choice.
– The COS focuses on measurable domains that are common, important to patients, and likely to change with interventions.
– The COS does not preclude additional outcomes; it sets a minimum standard for comparability across trials.

Updated Recommendations and Key Changes

What this COS adds compared with prior guidance

– Prior to COS‑PCC, guidance (WHO, NICE) described definitions and recommended clinical assessment approaches but did not specify a minimum, standardized set of trial outcomes with instruments. COS‑PCC fills that gap by detailing “what, when and how” to measure in PCC trials.

– COS‑PCC emphasizes both symptom severity and functional status (e.g., 6MWT and SF‑36), reflecting a shift toward outcomes that patients and regulators care about: function and health-related quality of life in addition to symptoms.

– Selected instruments are validated and globally used, which should improve data harmonization and cross-trial meta-analysis compared with the diverse, often single-study-specific measures previously used.

Evidence driving the updates

– The selection was informed by the literature on symptom frequency and impact in PCC cohorts, established psychometric properties of candidate instruments, and stakeholder preferences. The WHO clinical case definition (symptoms usually present 3 months from the onset of COVID-19 and lasting at least 2 months) provided a temporal framework for when to assess many outcomes.

Topic-by-Topic Recommendations

Methodology and stakeholder input

– Inventory: 52 outcomes, 206 measurement methods identified by literature review and surveys.
– Delphi: Two rounds; 60 participants (patients, clinicians, researchers, methodologists, and regulators) completed round 1, and 41 completed round 2.
– Consensus meeting: 36 representatives finalized the COS.
– Standards: Development followed accepted approaches for COS development (COMET principles) and reporting guidance.
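To make the Delphi step concrete, the sketch below classifies one outcome from a set of importance ratings and computes the round-to-round retention reported above. The thresholds are an assumption: they follow a convention often used in COS development (a 9-point scale, with "consensus in" when at least 70% rate 7 to 9 and fewer than 15% rate 1 to 3). The summary above does not state which rule Pang et al. applied, and the ratings shown are invented for illustration.

```python
def classify_outcome(ratings, in_share=0.70, low_share=0.15):
    """Classify one outcome from 1-9 importance ratings.

    The default thresholds are an assumed convention, not the rule
    reported for COS-PCC.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not (1 <= r <= 9) for r in ratings):
        raise ValueError("ratings must lie between 1 and 9")
    n = len(ratings)
    critical = sum(r >= 7 for r in ratings) / n      # share rating 7-9
    unimportant = sum(r <= 3 for r in ratings) / n   # share rating 1-3
    if critical >= in_share and unimportant < low_share:
        return "consensus in"
    if unimportant >= in_share and critical < low_share:
        return "consensus out"
    return "no consensus"


# Invented ratings for a single outcome from ten panelists.
example_ratings = [9, 8, 7, 7, 9, 8, 6, 7, 9, 5]
print(classify_outcome(example_ratings))

# Retention between rounds, using the counts reported above.
retention_pct = round(41 / 60 * 100, 1)
print(f"Round 2 retention: {retention_pct}%")
```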

Core outcomes, instruments and rationale (detailed)

– Dyspnea — mMRC dyspnea scale
– Rationale: Simple, validated, widely used scale that correlates with functional limitation. Practical for large outpatient trials and consistent with respiratory clinical practice (the mMRC descriptors are adapted from the dyspnea rating methods evaluated by Mahler and Wells, 1988).

– Cough — Leicester Cough Questionnaire (LCQ)
– Rationale: Symptom-specific, validated health status measure for cough with responsiveness to change.

– Exercise capacity — 6‑minute walk test (6MWT)
– Rationale: Objective functional measure widely used in pulmonary and rehabilitation trials; provides distance walked as a concrete endpoint (ATS 6MWT statement provides standardized procedures).

– Fatigue — Fatigue Severity Scale (FSS)
– Rationale: Validated scale used across chronic conditions and sensitive to clinically meaningful change in fatigue.

– Pain — Numerical Rating Scale (NRS)
– Rationale: Simple, validated, and widely used for pain intensity; easy to administer repeatedly.

– Sleep disturbance — Pittsburgh Sleep Quality Index (PSQI)
– Rationale: Global measure of sleep quality and disturbances over the preceding month; validated and commonly used in research.

– Anxiety — GAD‑7; Depression — PHQ‑9
– Rationale: Brief, validated screeners that quantify symptom severity and facilitate both clinical and research comparisons.

– Health status — SF‑36
– Rationale: A broad measure of health-related quality of life covering physical and mental domains, enabling comparisons across conditions and economic evaluations.
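Several of the questionnaires above are scored by simple aggregation, which a data pipeline can validate at entry. The sketch below is illustrative only: the item counts, response ranges, and PHQ‑9 severity bands follow the instruments' original publications (Kroenke et al. 2001; Spitzer et al. 2006; Krupp et al. 1989) rather than anything specified by COS‑PCC, and the function names and example responses are hypothetical.

```python
def sum_score(responses, n_items, low, high):
    """Validate item responses and return their sum."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(responses)}")
    if any(not (low <= r <= high) for r in responses):
        raise ValueError(f"item responses must lie in [{low}, {high}]")
    return sum(responses)


def phq9_total(responses):
    """PHQ-9: nine items scored 0-3, total 0-27."""
    return sum_score(responses, 9, 0, 3)


def gad7_total(responses):
    """GAD-7: seven items scored 0-3, total 0-21."""
    return sum_score(responses, 7, 0, 3)


def fss_mean(responses):
    """FSS: nine items scored 1-7, summarized here as the item mean."""
    return sum_score(responses, 9, 1, 7) / 9


def phq9_severity(total):
    """Map a PHQ-9 total to the severity bands of Kroenke et al. (2001)."""
    if not (0 <= total <= 27):
        raise ValueError("PHQ-9 total must be between 0 and 27")
    bands = [(4, "minimal"), (9, "mild"), (14, "moderate"),
             (19, "moderately severe"), (27, "severe")]
    for upper, label in bands:
        if total <= upper:
            return label


# Invented responses for one participant at baseline.
phq9 = phq9_total([1, 2, 1, 3, 2, 0, 1, 1, 0])
print(phq9, phq9_severity(phq9))
print(gad7_total([2, 1, 1, 0, 2, 1, 0]))
print(fss_mean([5, 6, 4, 5, 6, 5, 4, 6, 4]))
```

Rejecting out-of-range or incomplete responses at scoring time, rather than silently summing them, keeps pooled datasets from different trials comparable.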

Optional measurement methods

– The panel agreed on 16 optional instruments to supplement the core set; these include more granular cognitive tests, autonomic function assessments, PROMIS measures, and objective biomarkers or activity-monitoring approaches. These optional measures are recommended when trial scope, resources, or specific interventions warrant deeper phenotyping.

Timing: “When” to measure

– COS‑PCC emphasizes measurement in both the short-term and long-term phases, with timepoints chosen for clinical relevance and consistency with the WHO definition of PCC. The working group prioritized assessments that capture early persistence (around 3 months from acute infection) and longer-term trajectories (6–12 months or beyond), but recognized that exact timepoints should align with the trial’s hypothesis and intervention timing.

– Practical recommendation: include baseline (pre-intervention), an early post-intervention check, a primary timepoint aligned with expected effect (commonly 3–6 months), and longer-term follow-up when feasible (≥12 months) to capture persistence or recovery.
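One way to turn the practical recommendation into a per-participant visit schedule is sketched below. The offsets are assumptions chosen for illustration: the 4-week early check is hypothetical, and 13 and 52 weeks stand in for roughly 3 and 12 months. A real protocol would set these from the trial's hypothesis and add visit windows.

```python
from datetime import date, timedelta

# Assumed visit offsets in weeks from baseline (illustrative only).
VISIT_OFFSETS_WEEKS = {
    "baseline": 0,
    "early post-intervention check": 4,   # hypothetical choice
    "primary timepoint": 13,              # roughly 3 months
    "long-term follow-up": 52,            # roughly 12 months
}


def build_schedule(baseline, offsets=VISIT_OFFSETS_WEEKS):
    """Return a mapping of visit name to planned date for one participant."""
    return {
        visit: baseline + timedelta(weeks=weeks)
        for visit, weeks in offsets.items()
    }


schedule = build_schedule(date(2025, 1, 6))
for visit, planned in schedule.items():
    print(f"{visit}: {planned.isoformat()}")
```

Using whole weeks keeps every visit on the same weekday as baseline, which can simplify clinic scheduling; calendar-month offsets would need a different calculation.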

Special populations and adaptations

– COS‑PCC was developed with broad applicability in mind, but the authors note the need for cultural and language adaptation of instruments and possible alternative or additional measures for children, pregnant people, and those with pre-existing disability.

Expert Commentary and Insights

Committee perspectives

– Strengths: The COS balances clinical relevance, patient priorities, and methodological rigor. It prioritizes validated, practical measures that are likely to be feasible across diverse trial settings.

– Cautions: The group acknowledged limitations — some important PCC manifestations such as post-exertional symptom exacerbation (PESE) or autonomic dysfunction are difficult to capture with existing brief instruments and may require specialized assessments that remain optional rather than core.

– Flexibility: COS‑PCC sets a minimal common standard; trials can and should collect additional outcomes tailored to the intervention and the pathophysiology under study.

Key controversies

– Cognitive dysfunction: Despite the prominence of “brain fog” in patient reports, cognitive outcomes are challenging to standardize across trials. The COS included anxiety/depression and a global health instrument (SF‑36) but left detailed cognitive testing as optional.

– Objective biomarkers and remote monitoring: The working group was divided on including objective physiologic markers or digital biomarkers as core measures due to variability in availability, standardization, and interpretability. These remain promising supplements for future versions of the COS.

– Global applicability: Instruments like SF‑36 and PSQI are validated across many languages but not universally. Implementation in low-resource settings may require translation, cultural adaptation, or selection of abbreviated instruments.

Future trends and research needs

– Harmonization of measurement timing, and establishment of minimal clinically important differences (MCIDs) for the core instruments in PCC populations.
– Development and validation of standardized cognitive and autonomic measures sensitive to PCC-specific phenotypes.
– Integration of digital and physiological markers (e.g., activity monitors, cardiopulmonary exercise testing) as optional but protocolized supplements.

Practical Implications

For trialists

– Minimum dataset: Incorporate the nine core outcomes and instruments into trial protocols to enable comparability and pooling of results.
– Sample size and endpoints: Use the chosen instruments’ measurement properties and established MCIDs (when available) to inform power calculations.
– Reporting: Report core outcomes consistently at the prespecified timepoints and follow COMET/COS‑STAR reporting guidance.
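The power-calculation point can be illustrated with the standard normal-approximation formula for comparing two means with equal allocation, n per arm = 2 × ((z for 1 − α/2) + (z for power))² × SD² / MCID². The sketch below is an assumption-laden example, not a recommendation: the 30 m difference and 80 m standard deviation for the 6MWT are placeholder values, not MCIDs established for PCC, and a real trial would also adjust for attrition and the planned analysis model.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(mcid, sd, alpha=0.05, power=0.80):
    """Participants per arm for a two-sided comparison of two means.

    Normal approximation with equal allocation and a common SD.
    """
    if mcid <= 0 or sd <= 0:
        raise ValueError("mcid and sd must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_beta = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / mcid) ** 2)


# Placeholder inputs: target difference of 30 m in 6MWT distance, SD of 80 m.
print(n_per_arm(mcid=30.0, sd=80.0))
```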

For clinicians and guideline developers

– Evidence synthesis: Future systematic reviews and guideline updates will be more robust if primary studies use the COS‑PCC, enabling synthesis of effects on function and quality of life as well as symptoms.
– Clinical trials and practice alignment: Selected instruments are already familiar to many clinicians (e.g., PHQ‑9, GAD‑7, mMRC), which eases translation of trial findings into practice.
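The benefit for evidence synthesis can be sketched with a fixed-effect, inverse-variance pooled estimate, which is only meaningful when trials report the same instrument on the same scale. The function name and trial results below are invented for illustration; a real review would also assess heterogeneity and might prefer a random-effects model.

```python
from math import sqrt


def pool_fixed_effect(estimates):
    """Pool (mean_difference, standard_error) pairs by inverse-variance weighting.

    Returns the pooled estimate, its standard error, and a 95% interval.
    """
    if not estimates:
        raise ValueError("at least one trial estimate is required")
    weights = [1 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * diff for w, (diff, _) in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    interval = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, interval


# Invented between-arm differences in 6MWT distance (metres) from three trials.
trials = [(25.0, 12.0), (40.0, 15.0), (18.0, 10.0)]
pooled, pooled_se, interval = pool_fixed_effect(trials)
print(f"pooled difference: {pooled:.1f} m (SE {pooled_se:.1f})")
```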

For patients and funders

– Patient-centered measurement: The COS reflects patient priorities (symptom burden and function), improving the likelihood that trial results will be meaningful to people living with PCC.
– Funding decisions: Funders can require or strongly encourage inclusion of the COS in PCC trial proposals to ensure comparability and value for money.

Sample Vignette: Applying the COS in a Trial

John Davis, age 48, developed COVID‑19 six months ago and continues to have exertional breathlessness, fatigue, and poor sleep. He is eligible for a randomized trial of a rehabilitation program. Using COS‑PCC, the trial collects baseline mMRC, LCQ, 6MWT, FSS, NRS for pain, PSQI, GAD‑7, PHQ‑9 and SF‑36; repeat assessments are scheduled at 3 months (primary endpoint) and 12 months (long-term outcome). These standardized measures will allow the rehabilitation trial to be compared and combined with other trials using COS‑PCC instruments.

Conclusions

COS‑PCC is a pragmatic, consensus-driven minimum outcome set for clinical trials of post COVID-19 condition. By specifying nine core outcomes and recommending validated, feasible instruments, it tackles a major obstacle to evidence synthesis: heterogeneity in outcome selection and measurement. Implementation of COS‑PCC across future trials should accelerate knowledge about effective interventions, improve trial comparability, and help ensure that results focus on outcomes that matter to patients — symptoms, function, and quality of life.

The authors acknowledge that COS‑PCC is an initial framework: as PCC science advances, new biomarkers, cognitive assessments, and digital measures may be added and the set will need updating — a process that should remain iterative and inclusive.

References

– Pang B, Wang K, Liu Q, et al. A Core Outcome Set for Clinical Trials on Post COVID-19 Condition: “What,” “When,” and “How” to Measure. J Evid Based Med. 2025 Nov 22:e70082. doi: 10.1111/jebm.70082.
– World Health Organization. A clinical case definition of post COVID-19 condition by a Delphi consensus, 6 October 2021. WHO. https://www.who.int/publications/i/item/WHO-2019-nCoV-Post_COVID-19_condition-Clinical_case_definition-2021.1
– National Institute for Health and Care Excellence (NICE). COVID‑19 rapid guideline: managing the long-term effects of COVID‑19. NG188. 2020 (updated). https://www.nice.org.uk/guidance/ng188
– COMET Initiative. Core Outcome Measures in Effectiveness Trials. https://www.comet-initiative.org
– Kirkham JJ, Gorst S, Altman DG, et al. Core Outcome Set–STAndards for Reporting: The COS‑STAR statement. Trials. 2016;17:345.
– Williamson PR, Altman DG, Blazeby JM, et al. Developing core outcome sets for clinical trials: issues to consider. Trials. 2012;13:132.
– Mahler DA, Wells CK. Evaluation of clinical methods for rating dyspnea. Chest. 1988;93(3):580‑586. (mMRC dyspnea scale adaptation)
– Birring SS, Prudon B, Carr AJ, et al. Development of a symptom‑specific health status measure for patients with chronic cough: the Leicester Cough Questionnaire (LCQ). Thorax. 2003;58(4):339‑343.
– ATS Committee on Proficiency Standards for Clinical Pulmonary Function Laboratories. ATS statement: guidelines for the six‑minute walk test. Am J Respir Crit Care Med. 2002;166(1):111‑117.
– Krupp LB, LaRocca NG, Muir‑Nash J, Steinberg AD. The Fatigue Severity Scale. Arch Neurol. 1989;46(10):1121‑1123.
– Buysse DJ, Reynolds CF 3rd, Monk TH, Berman SR, Kupfer DJ. The Pittsburgh Sleep Quality Index: a new instrument for psychiatric practice and research. Psychiatry Res. 1989;28(2):193‑213.
– Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD‑7. Arch Intern Med. 2006;166(10):1092‑1097.
– Kroenke K, Spitzer RL, Williams JB. The PHQ‑9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606‑613.
– Ware JE Jr, Sherbourne CD. The MOS 36‑item Short‑Form Health Survey (SF‑36). Med Care. 1992;30(6):473‑483.