Advancing Early Pediatric Sepsis Detection: Machine Learning Models Predicting Onset Within 48 Hours

Highlights

  • Machine learning models accurately predict pediatric sepsis risk within 48 hours of emergency department (ED) presentation using early clinical data.
  • Gradient tree boosting models achieved an area under the receiver operating characteristic curve (AUROC) of 0.94 for sepsis, and septic shock models reached AUROCs of 0.92 or greater.
  • Key predictive features included the emergency severity index, age-adjusted vital signs, and medical complexity, all extracted from EHR data in the initial 4 hours of ED care.
  • Fairness analysis showed consistent model performance across race, ethnicity, and sex, with better performance in Medicaid-insured patients than in those with commercial insurance.

Study Background

Sepsis remains a leading cause of morbidity and mortality in the pediatric population globally and poses significant clinical challenges due to its heterogeneity and rapid progression. Early recognition and timely treatment substantially improve outcomes. However, identifying children at imminent risk of developing sepsis and septic shock remains difficult, especially in the emergency department setting where early signs may be subtle and nonspecific. Existing diagnostic criteria and clinical judgment alone have limited sensitivity and specificity, and prior predictive models have not consistently enhanced early diagnosis. There is thus a critical unmet need for robust, data-driven tools that can support frontline clinicians by estimating individualized risk of sepsis development during the earliest stages of ED evaluation.

Study Design

This multisite cohort study utilized data from five health systems within the Pediatric Emergency Care Applied Research Network (PECARN), encompassing ED visits from January 2016 through February 2020 for model development and January 2021 through December 2022 for temporal validation. Eligible patients were children aged 2 months to under 18 years, excluding those who died or were transferred during the ED visit, had trauma diagnoses, or had sepsis already present within the predictive feature data window. Using electronic health records (EHRs), patient demographics and physiologic parameters were extracted from the first four hours of ED care. The primary outcome was sepsis development within 48 hours, defined by suspected infection plus a Phoenix Sepsis Criteria (PSC) score ≥2 or death.
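The cohort assignment and exclusion rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the exact first and last calendar days of each window are assumed from the reported months, and the study's actual data pipeline is not described in this summary.

```python
from datetime import date

# Month ranges are from the article; the exact boundary days are an assumption.
DEV_START, DEV_END = date(2016, 1, 1), date(2020, 2, 29)
VAL_START, VAL_END = date(2021, 1, 1), date(2022, 12, 31)


def assign_cohort(visit_date):
    """Return 'development', 'validation', or None for an ED visit date."""
    if DEV_START <= visit_date <= DEV_END:
        return "development"
    if VAL_START <= visit_date <= VAL_END:
        return "validation"
    return None  # visit falls outside both reported windows


def is_eligible(age_months, died_or_transferred_in_ed, trauma_dx,
                sepsis_in_feature_window):
    """Apply the eligibility rules summarized above (hypothetical field names)."""
    age_ok = 2 <= age_months < 18 * 12  # 2 months to under 18 years
    excluded = died_or_transferred_in_ed or trauma_dx or sepsis_in_feature_window
    return age_ok and not excluded
```

Splitting by calendar period rather than at random means the validation cohort tests whether a model built on earlier visits still performs on later ones.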

The study compared two machine learning algorithms, logistic regression with ridge regularization and gradient tree boosting, for predicting sepsis and septic shock. Model reporting adhered to the TRIPOD-AI guidelines, and data analysis continued through July 2025.

Key Findings

The large-scale dataset comprised 1,604,422 eligible encounters in the training cohort and 719,298 in the test cohort. Predictive performance was robust, with the gradient tree boosting models outperforming logistic regression.

For predicting sepsis, the area under the receiver operating characteristic curve (AUROC) was 0.92 (95% CI, 0.92-0.93) for logistic regression and 0.94 (95% CI, 0.93-0.94) for gradient tree boosting. Models predicting septic shock demonstrated AUROCs of 0.92 or greater, denoting excellent discriminative ability.
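For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen patient who develops sepsis receives a higher risk score than a randomly chosen patient who does not. The Python sketch below computes it by direct pairwise comparison on toy data. The function name, labels, and scores are illustrative assumptions and do not come from the study, whose evaluation code is not described here.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. The double loop is quadratic in the
    number of patients, so this is for illustration, not for large cohorts.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data, not from the study: 1 = developed sepsis, 0 = did not.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
toy_auc = auroc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.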

Positive likelihood ratios (LR+) for gradient tree boosting ranged from 4.67 to 6.18 for sepsis and from 4.16 to 5.83 for septic shock, indicating a considerable increase in post-test probability when the model predicts high risk.
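To see what a likelihood ratio in this range means for an individual patient, it helps to work through the standard conversion: LR+ equals sensitivity divided by (1 minus specificity), and post-test odds equal pre-test odds multiplied by LR+. The sketch below applies that arithmetic in Python. The 1% pre-test probability is a purely hypothetical value chosen for illustration, since this summary does not report the cohort's sepsis incidence; only the LR+ values come from the text above.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical pre-test probability; NOT a figure reported by the study.
ASSUMED_PRE_TEST = 0.01

# Lower and upper LR+ reported for sepsis with gradient tree boosting.
post_low = post_test_probability(ASSUMED_PRE_TEST, 4.67)
post_high = post_test_probability(ASSUMED_PRE_TEST, 6.18)
```

Under that assumed 1% starting point, a high-risk prediction would raise the probability of sepsis to roughly 4.5% to 5.9%. When the outcome is rare, a positive result multiplies risk several-fold while the absolute probability stays modest.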

Important predictive features identified included the emergency severity index (triage acuity), age-adjusted vital signs such as heart rate and respiratory rate, and complexity of the patient’s medical history. These multidimensional inputs provide a nuanced risk stratification beyond static clinical criteria.

The study also assessed model fairness across demographic groups. AUROCs and likelihood ratios were consistent across race, ethnicity, and sex. Notably, however, the models performed better for patients with Medicaid insurance than for those with commercial insurance. This may reflect differential data capture or population characteristics that merit further exploration.

Expert Commentary

This rigorous investigation demonstrates the potential for machine learning approaches to transform pediatric sepsis prediction in emergency settings by leveraging large-scale, multisite EHR data. The high AUROCs and positive likelihood ratios affirm that combining routinely collected clinical information with advanced analytics can yield accurate early warnings.

Strengths include the geographically diverse patient population, validation in a temporally distinct cohort, and adherence to transparent reporting standards. The use of easily obtainable variables facilitates practical implementation.

Nevertheless, challenges remain. The study excluded certain high-risk populations, such as those with trauma or pre-existing sepsis, which limits applicability to these subgroups. Potential biases related to insurance status warrant further analysis to avoid unintended disparities in clinical decision support deployment. Additionally, integration with clinical workflow and prospective assessment of impact on patient outcomes are key future steps.

Biologically, the predictive features align with known sepsis pathophysiology, where altered vital signs reflect early systemic inflammatory responses, and higher medical complexity may predispose to infection complications.

Conclusion

This study provides compelling evidence that machine learning models based on early ED clinical data can reliably predict pediatric sepsis and septic shock within 48 hours. The gradient tree boosting approach, with excellent discrimination and positive likelihood ratios of roughly 4 to 6, offers a promising tool to augment clinician judgment and potentially enable earlier intervention.

Future research should emphasize prospective validation, integration into clinical decision support systems, and evaluation of effects on treatment timeliness and outcomes. Addressing disparities related to insurance and other social determinants is also critical to equitable care. As pediatric sepsis remains a major public health challenge, such data-driven predictive models have the potential to significantly improve early recognition and reduce morbidity and mortality in this vulnerable population.

Funding and ClinicalTrials.gov

The study was conducted under the auspices of the Pediatric Emergency Care Applied Research Network (PECARN). Specific funding sources were not reported in the available data. This observational derivation and validation study utilized retrospective registry data; no clinical trial registration was indicated.

References

Alpern ER, Scott HF, Balamuth F, Chamberlain JM, Depinet H, Bajaj L, Simon NE, Carter CP, Elsholz C, Webb M, Campos D, Deakyne Davies SJ, Cook LJ, Ungar L, Grundmeier R; PECARN PED Screen Study Group. Derivation and Validation of Predictive Models for Early Pediatric Sepsis. JAMA Pediatr. 2025 Oct 13:e253892. doi: 10.1001/jamapediatrics.2025.3892. Epub ahead of print. PMID: 41082207; PMCID: PMC12519407.
