Standardizing How Investigators Identify Hidradenitis Suppurativa Lesions: Key Findings from a Modified Delphi Consensus

Introduction and Context

Hidradenitis suppurativa (HS) is a chronic, relapsing inflammatory skin disease characterized by painful nodules, abscesses, sinus tracts (tunnels), and scarring that most commonly affect the axillae, groin, and anogenital regions. Clinical trials in HS increasingly drive changes in care, so inconsistently measured trial endpoints can produce misleading results. Many commonly used trial endpoints in HS (for example, the Hidradenitis Suppurativa Clinical Response, HiSCR, and the International Hidradenitis Suppurativa Severity Score System, IHS4) rely on accurate and reproducible lesion counts. Small differences in how investigators define and count lesions can alter responder designation and trial conclusions.

Recognizing this measurement vulnerability, Garg et al. convened a multi-expert panel and used a modified Delphi process to create consensus-based morphological definitions for HS lesions and practical guidance for investigator assessments in clinical trials. Their JAMA Dermatology publication (2025) reports the consensus development, the final recommendations, and areas where unanimity could not be reached. The consensus is aimed at clinical trial sponsors, principal investigators, and trial raters — particularly those newer to HS evaluation.

Why this consensus matters now

– HS therapeutic research has matured rapidly: biologics and targeted agents have advanced to late-phase trials, raising stakes for clear, reproducible endpoints.
– Existing outcome measures (HiSCR, IHS4) depend on lesion phenotyping whose reliability between raters is imperfect.
– Training materials and lesion atlases have been heterogeneous across trials; a standardized set of morphological definitions and rater guidance promises to reduce interrater variability, improve data quality, and increase the validity of trial conclusions.

New Guideline Highlights

– The panel achieved consensus on morphological definitions for 11 lesion types commonly seen in HS and on 16 practical guidance statements to standardize how lesions are assessed in trials. Nine lesion definitions reached ≥90% agreement, and 16 of 18 guidance statements met the prespecified consensus threshold (≥70% agreement).
– The two guidance topics that failed consensus concerned (1) how to count tunneled plaques with multiple openings and (2) how to handle scalp lesions — underscoring real-world complexity in some anatomical scenarios.
– Key themes of the guidance: precise lesion morphology (inflammatory vs noninflammatory), when to count vs exclude lesions for endpoint calculations, standardized palpation and visual inspection techniques, and mandatory rater training with photographic atlases and calibration exercises.

Key takeaways for clinicians and trialists
– Use the consensus morphological definitions and rater guidance as a minimum standard in HS trial protocols.
– Include rater certification, centralized photo review or adjudication procedures, and periodic recalibration to sustain reliability.
– Recognize and explicitly prespecify handling of complicated presentations (e.g., tunneled plaques with multiple openings, scalp disease) in trial protocols when they are expected in the study population.

Methods in Brief (why the results are robust)

– The group used a modified Delphi approach with an initial image-assessment questionnaire and qualitative feedback, followed by two electronic Delphi rounds and a virtual group discussion to inform re-voting.
– Participants were health professionals with HS measurement expertise (predominantly dermatologists) plus novice raters. Response rates were high across stages (preliminary 84.7%, round 1 86.0%, round 2 90.9%).
– The prespecified consensus threshold was ≥70% agreement. The process emphasized real clinical photos, rater reasoning, and iterative refinement of definitions and practical guidance.

Updated Recommendations and Key Changes

Note: This document is a consensus focused specifically on lesion morphology and rater guidance for clinical trials rather than a treatment guideline. Key changes relative to prior practice are pragmatic standardizations rather than new therapeutic recommendations.

What this consensus adds or clarifies compared to prior practice
– Formal, consensus-driven definitions for 11 lesion morphologies, reducing ambiguity in common clinical descriptions (e.g., how to separate an inflammatory nodule from an abscess or a fibrotic nodule).
– Explicit guidance on lesion counting rules tied to commonly used trial endpoints (e.g., when to count a lesion for HiSCR or IHS4).
– Operational recommendations on rater training, photographic documentation, and standard palpation/inspection techniques to improve interrater reliability.

Summary of major changes
– Standardized lesion definitions: introduced and agreed; replace variable local definitions.
– Rater training recommendations: calibration exercises and a photographic atlas are required; a new formalization.
– Handling of complex lesions (tunnels with multiple openings): highlighted as contested and requiring trial-specific rules; a new recognition.

Topic-by-Topic Recommendations

The consensus covers lesion morphology, counting rules, rater conduct, and special circumstances in trial contexts.

1) Morphologic definitions (consensus on 11 lesion types)
The panel produced concise morphological definitions for the lesion types most relevant to trial endpoints. These standardized definitions were intended to be used by investigators performing physical examinations and to be embedded in rater training materials and photographic atlases. While the paper lists the full definitions and photographic examples, the core lesion categories include:
– Inflammatory nodule: tender, deep-seated, firm inflammatory lesion without frank fluctuation.
– Abscess (fluctuant inflammatory lesion): tender, often fluctuant collection consistent with localized suppuration.
– Draining tunnel/sinus tract: a subcutaneous tract with one or more external openings that may exude pus or serous drainage.
– Plaque (including tunneled plaque): an elevated, flat-topped area representing confluent nodules or tethered inflammatory tissue.
– Pustule: small, superficial collection of pus visible within or upon the epidermis.
– Papule: small, raised, solid lesion without purulence.
– Cyst (epidermal or follicular): circumscribed saclike lesion, often with a palpable wall, that may or may not be inflamed.
– Fibrotic or indurated nodule/scar: firm, noninflammatory lesion representing fibrosis or healed disease.
– Open comedo (double-comedo): dilated follicular opening characteristic of HS-prone areas.
– Hypertrophic or bridging scar: postinflammatory scarring that may connect adjacent areas.
– Mixed or composite lesion: complex presentations where more than one morphology coexists.

(Each of these categories was accompanied in the published work by pictorial examples and precise clinical descriptors to distinguish overlapping lesions.)

2) Counting rules and endpoints
– For endpoints that rely on lesion counts (HiSCR, IHS4), the panel provided guidance to ensure consistent identification of what constitutes an “abscess,” an “inflammatory nodule,” and a “draining tunnel.”
– Recommendation: Count clearly inflammatory lesions (nodules and abscesses) and draining tunnels per the panel definitions; exclude purely fibrotic scars from inflammatory lesion counts.
– Draining tunnels should be identified by visible openings and/or expressible drainage; palpation and gentle compression can help distinguish tunnels from other lesions.
– Important: IHS4 weights lesion types differently (each inflammatory nodule counts 1 point, each abscess 2, and each draining tunnel 4), so misclassifying a tunnel as a nodule, or the reverse, can materially change severity scores.
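
To make the effect of misclassification concrete, the following minimal sketch computes IHS4 from lesion counts using the published weights (nodules x 1, abscesses x 2, draining tunnels x 4) and the published severity bands (mild 3 or less, moderate 4 to 10, severe 11 or more). The names LesionCounts, ihs4_score, and ihs4_category, and the example participant, are hypothetical and are not taken from the consensus paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LesionCounts:
    """Per-visit counts of the three lesion types that enter IHS4."""
    nodules: int
    abscesses: int
    draining_tunnels: int


def ihs4_score(counts: LesionCounts) -> int:
    """IHS4 = nodules x 1 + abscesses x 2 + draining tunnels x 4."""
    return counts.nodules * 1 + counts.abscesses * 2 + counts.draining_tunnels * 4


def ihs4_category(score: int) -> str:
    """Published IHS4 bands: mild <= 3, moderate 4-10, severe >= 11."""
    if score <= 3:
        return "mild"
    if score <= 10:
        return "moderate"
    return "severe"


# Hypothetical participant: the same four lesions, read two ways.
rater_a = LesionCounts(nodules=2, abscesses=1, draining_tunnels=1)
# Rater B reads one of the nodules as a draining tunnel.
rater_b = LesionCounts(nodules=1, abscesses=1, draining_tunnels=2)

for name, counts in (("rater A", rater_a), ("rater B", rater_b)):
    score = ihs4_score(counts)
    print(name, score, ihs4_category(score))
```

In this illustration a single reclassified lesion moves the score from 8 (moderate) to 11 (severe), which is why the panel's distinction between nodules and draining tunnels matters for severity-based endpoints.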

3) Rater conduct, training, and documentation
– Mandatory rater training with a standardized photographic atlas containing the consensus definitions and representative images across Fitzpatrick skin types.
– Certification/calibration sessions (pretrial and periodically during trials) with interrater reliability testing.
– Use of standard approaches to observation and palpation: adequate lighting, removal of dressings, targeted palpation to detect fluctuance or subcutaneous tracts.
– High-quality standardized photography (consistent angles, distance, scales) and centralized adjudication for ambiguous or endpoint-determining lesions where possible.
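
One way a calibration session might quantify interrater reliability is Cohen's kappa, which corrects raw agreement between two raters for agreement expected by chance. The consensus summary here does not name a specific statistic, so the choice of kappa is an assumption, and the function name and lesion labels below are hypothetical. A minimal sketch:

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labelling the same lesions."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must label the same non-empty set of lesions")
    n = len(rater_a)
    # Observed proportion of lesions given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical calibration set of 10 atlas photographs.
rater_a = ["nodule"] * 4 + ["abscess"] * 3 + ["tunnel"] * 3
rater_b = ["nodule"] * 4 + ["abscess"] * 2 + ["nodule"] + ["tunnel"] * 2 + ["abscess"]

print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.69
```

Here the raters agree on 8 of 10 lesions (80%), but after correcting for chance agreement of 35% the kappa is about 0.69. A trial would need to prespecify its own certification cutoff; none is implied by this example.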

4) Special populations and anatomical sites
– The panel provided guidance for common sites (axillae, groin, perineum) and recommended that trial protocols prespecify how anatomically complex areas will be handled.
– Scalp lesions: no consensus was reached on a single rule; investigators should prespecify scalp rules in protocols if scalp involvement is anticipated.

5) Handling complex lesions: tunneled plaques with multiple openings
– This scenario did not reach consensus: should each opening be counted as a separate tunnel, or should the entire plaque be counted as one tunneled lesion? The panel was split, and the authors recommend that trials prespecify the rule they will use and consider sensitivity analyses exploring alternative counting rules.
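
A sensitivity analysis of this kind can be sketched by evaluating the same visits under both counting rules. The example below uses the published HiSCR definition (at least a 50% reduction from baseline in the combined abscess and inflammatory nodule count, with no increase in abscess count and no increase in draining tunnel count). The Visit structure, the rule names "per_lesion" and "per_opening", and the example data are hypothetical illustrations, not rules endorsed by the panel.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Visit:
    nodules: int
    abscesses: int
    # One entry per tunneled lesion, giving its number of external openings.
    tunnel_openings: Tuple[int, ...] = ()


def tunnel_count(visit: Visit, rule: str) -> int:
    """Count draining tunnels under a prespecified counting rule."""
    if rule == "per_lesion":
        return len(visit.tunnel_openings)
    if rule == "per_opening":
        return sum(visit.tunnel_openings)
    raise ValueError(f"Unknown counting rule: {rule}")


def hiscr_responder(baseline: Visit, followup: Visit, rule: str) -> bool:
    """HiSCR: >=50% AN reduction, no increase in abscesses or draining tunnels."""
    an_base = baseline.nodules + baseline.abscesses
    an_follow = followup.nodules + followup.abscesses
    if an_base == 0:
        raise ValueError("HiSCR is not defined when the baseline AN count is zero")
    return (
        2 * an_follow <= an_base
        and followup.abscesses <= baseline.abscesses
        and tunnel_count(followup, rule) <= tunnel_count(baseline, rule)
    )


# Hypothetical participant: one tunneled plaque develops a second opening.
baseline = Visit(nodules=4, abscesses=2, tunnel_openings=(1,))
followup = Visit(nodules=2, abscesses=1, tunnel_openings=(2,))

for rule in ("per_lesion", "per_opening"):
    print(rule, hiscr_responder(baseline, followup, rule))
```

Under the per-lesion rule this participant is a responder, while under the per-opening rule the extra opening counts as a new tunnel and responder status is lost. Reporting both results side by side is one simple form of the sensitivity analysis the authors suggest.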

Expert Commentary and Insights

– Panel perspective: investigators stressed the importance of clarity because lesion misclassification directly affects participant responder classification, safety monitoring (identifying new abscesses), and trial comparability.
– On the two nonconsensus areas, experts noted legitimate clinical variability: some experienced raters treat a tunneled plaque with multiple openings as one biological process and count it once, while others count individual external ostia when endpoints weight tunnels heavily. The lack of consensus highlights the need for protocol-level decisions and for research comparing counting strategies.
– Many panelists emphasized that newer raters, especially those from general dermatology or nondermatology backgrounds, benefit most from a calibrated atlas and supervised training sessions; the consensus definitions aim to be usable across levels of experience.

Areas of controversy and future research needs
– Direct validation studies are needed to quantify how much standardized definitions and rater training reduce interrater variability and alter trial outcomes.
– Imaging correlation (ultrasound) may provide an objective standard for tunneling but is not yet practical as a universal trial standard; the consensus recommends further research linking clinical definitions to ultrasound findings.
– Scalp HS and atypical anatomic presentations warrant focused study and extension of the atlas to include diverse skin tones and rare sites.

Practical Implications for Clinical Trials and Practice

– Trial design: Protocols should embed the consensus definitions, rater certification criteria, photographic documentation standards, and prespecified rules for tricky situations (tunneled plaques with multiple openings; scalp disease).
– Data quality: Consistent lesion definition and counting will reduce noise in efficacy endpoints and increase the statistical power and interpretability of trial results.
– Clinician training: The consensus provides a ready-made structure for training investigators and site raters; adoption by academic and industry sponsors can improve cross-study comparability.

Illustrative vignette
Emily, a 29-year-old woman enrolled in a biologic trial for moderate-to-severe HS, had two axillary inflammatory nodules, one unilateral draining tunnel with two superficial openings, and multiple old bridging scars. Before the trial adopted the consensus guidance, one rater counted the draining tunnel as two tunnels (one per opening), while another counted it as a single tunnel. This discrepancy would change Emily’s IHS4 score (10 versus 6 under the published weights of 1 per nodule, 2 per abscess, and 4 per draining tunnel) and potentially her HiSCR responder status. With a protocol built on the consensus guidance, which prespecified how a tunnel with multiple openings is counted, and with rater calibration against the photographic atlas, both raters counted the lesion consistently, preventing misclassification.

Conclusions and Next Steps

The modified Delphi consensus led by Garg et al. represents a timely, practical step toward harmonizing lesion assessment in HS clinical trials. By providing clear morphological definitions for key lesion types and pragmatic guidance for rater conduct and training, the consensus aims to improve measurement reliability — a foundational need as HS therapeutics advance. Two contentious areas (tunneled plaques with multiple openings and scalp lesions) remain unresolved; the authors and participating experts recommend that trials prespecify rules for these scenarios and that further research (including imaging correlations and reliability studies) be prioritized.

Adoption of this consensus by sponsors, investigative sites, and regulatory reviewers will help ensure that measured treatment effects reflect true biological change rather than variability in lesion interpretation.

References

– Garg A, Strunk A, Midgette B, et al. Standardization of Lesion Classification and Assessment by Investigators in Clinical Trials for Hidradenitis Suppurativa: A Consensus Exercise Using a Modified Delphi Approach. JAMA Dermatol. 2025 Nov 26. doi:10.1001/jamadermatol.2025.4652. PMID: 41296358.
– Jemec GBE. Hidradenitis suppurativa. N Engl J Med. 2012 Jan 19;366(2):158-64. doi:10.1056/NEJMcp1014163.
– Zouboulis CC, et al. Development and validation of the International Hidradenitis Suppurativa Severity Score System (IHS4), a novel dynamic scoring system to assess HS severity. Br J Dermatol. 2017. (Contextual outcome measure; consult the original IHS4 publication and the HiSCR methodology papers for endpoint-specific details.)

Note: Readers and trial designers should consult the full consensus paper (Garg et al., JAMA Dermatology, 2025) for the detailed photographic atlas, the exact wordings of each morphological definition, and the complete set of guidance statements and voting results.