Predicting Post-Hepatectomy Liver Failure with the PILOT Architecture: Integrating Liver Regeneration Biomarkers and Time-Phased Machine Learning

Introduction: The Challenge of Post-Hepatectomy Liver Failure

Post-hepatectomy liver failure (PHLF) remains the most formidable complication following major hepatic resection, serving as a primary driver of postoperative morbidity and mortality. Despite advancements in surgical technique and perioperative care, the incidence of PHLF in patients undergoing major hepatectomy for malignancy continues to pose a significant clinical challenge. The fundamental pathophysiology of PHLF lies in the imbalance between the volume/function of the future liver remnant and the metabolic demands of the body, often exacerbated by impaired liver regeneration.

Traditional predictive models, such as the Child-Pugh score, MELD score, and various preoperative liver function tests (e.g., indocyanine green clearance), provide a snapshot of hepatic reserve but frequently fail to capture the dynamic, time-sensitive nature of liver regeneration and the physiological shifts occurring during the intraoperative and early postoperative periods. To bridge this gap, the PILOT (Perioperative Integrated Liver-regeneration Optimized Time-phased) architecture was developed, utilizing machine learning to integrate novel biomarkers with longitudinal clinical data.

Highlights of the PILOT Architecture

The study introduces several pivotal advancements in the field of hepatobiliary surgery and predictive analytics:

1. Integration of Novel Biomarkers: The model incorporates specific liver regeneration-associated biomarkers, including GATA3, RAMP2, VEGFA, and PEDF, which reflect the molecular state of the liver remnant.
2. Time-Phased Predictive Modeling: By categorizing data into preoperative, intraoperative, and postoperative phases, the system allows for continuous risk reassessment.
3. Superior Predictive Accuracy: The PILOT models achieved Area Under the Curve (AUC) values up to 0.904, compared with 0.502 to 0.644 for traditional scores such as MELD and ALBI.
4. Early Clinical Actionability: The framework enables high-precision risk stratification within the first six hours post-surgery, providing a critical window for intervention.

Study Design and Methodology

This retrospective multicenter study (ClinicalTrials.gov: NCT05779098) analyzed data from 1,071 patients who underwent major hepatectomy across three high-volume centers between 2019 and 2024. The cohort was divided into a training set (n = 623) and two independent external validation cohorts (n = 206 and n = 242) to test the generalizability of the findings.

The researchers evaluated 55 perioperative variables. A unique aspect of this study was the inclusion of four novel biomarkers identified through previous transcriptomic research as critical to liver regeneration: GATA3 (GATA binding protein 3), RAMP2 (Receptor activity modifying protein 2), VEGFA (Vascular endothelial growth factor A), and PEDF (Pigment epithelium-derived factor). These variables were organized into three distinct datasets:

1. PILOT-Pre: Preoperative clinical data and biomarkers.
2. PILOT-Intra: Integration of preoperative data with intraoperative factors (e.g., blood loss, operative time).
3. PILOT-Post: Inclusion of early postoperative data (within 6 hours).
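
One way to picture the three datasets is as cumulative feature sets, where each phase adds to what was known before. The sketch below is illustrative only: the feature names, the `PHASE_FEATURES` table, and the helper functions are hypothetical and are not the study's variable list, and it assumes PILOT-Post also carries forward the earlier phases.

```python
from typing import Dict, List

# Hypothetical feature names for illustration only;
# the study's actual 55 variables are not listed in this article.
PHASE_FEATURES: Dict[str, List[str]] = {
    "pre": ["age", "albumin", "bilirubin", "ramp2_gata3_ratio"],
    "intra": ["blood_loss_ml", "operative_time_min"],
    "post": ["lactate_6h", "inr_6h"],
}
PHASE_ORDER = ["pre", "intra", "post"]


def features_for_phase(phase: str) -> List[str]:
    """Return every feature available by the end of the given phase."""
    if phase not in PHASE_ORDER:
        raise ValueError(f"unknown phase: {phase!r}")
    cutoff = PHASE_ORDER.index(phase)
    selected: List[str] = []
    # Assumption: each later phase keeps all features from earlier phases.
    for name in PHASE_ORDER[: cutoff + 1]:
        selected.extend(PHASE_FEATURES[name])
    return selected


def build_row(patient: Dict[str, float], phase: str) -> List[float]:
    """Extract one patient's feature vector for the model of that phase."""
    return [patient[name] for name in features_for_phase(phase)]
```

Under these assumptions, `features_for_phase("intra")` returns the preoperative features followed by the intraoperative ones, so a model for a later phase never sees less than an earlier one.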

Thirteen different machine learning algorithms were tested, including LightGBM and XGBoost, with SHAP (SHapley Additive exPlanations) analysis used to interpret the contribution of individual features to the model’s predictions.
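
To make the idea behind SHAP concrete, here is a minimal sketch of exact Shapley values for a tiny model. It is not the `shap` library and not the study's pipeline: it enumerates every feature coalition by brute force against a single baseline patient, which is only feasible for a handful of features. The `toy_risk` function, its coefficients, and the example values are made up for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, Set

Features = Dict[str, float]


def shapley_values(model: Callable[[Features], float],
                   x: Features, baseline: Features) -> Dict[str, float]:
    """Exact Shapley attribution of model(x) - model(baseline) to each feature."""
    names = list(x)
    n = len(names)

    def value(coalition: Set[str]) -> float:
        # Features in the coalition take the patient's value, others the baseline's.
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return model(mixed)

    phi: Dict[str, float] = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk(f: Features) -> float:
    """Made-up risk score with an interaction term; not a clinical model."""
    return (0.2 * f["phosphorus"] + 0.001 * f["blood_loss_ml"]
            + 0.0005 * f["phosphorus"] * f["blood_loss_ml"])


patient = {"phosphorus": 3.0, "blood_loss_ml": 800.0}
reference = {"phosphorus": 2.0, "blood_loss_ml": 400.0}
contributions = shapley_values(toy_risk, patient, reference)
```

A useful sanity check is that the contributions sum to the difference between the patient's prediction and the baseline prediction. Tree-based explainers in practice reach the same kind of attribution with far more efficient algorithms.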

Key Findings and Performance Metrics

The PILOT architecture showed good discriminative ability across the perioperative journey, with performance improving as data from each phase were added.

Model Discrimination and Validation

In the training cohort, the PILOT-Pre and PILOT-Intra models (utilizing LightGBM with 10 and 15 features, respectively) achieved AUCs of 0.754 and 0.787. The PILOT-Post model, which utilized XGBoost with 20 features, reached an AUC of 0.904. These results were successfully replicated in the external validation cohorts, with AUCs ranging from 0.740 to 0.895. In contrast, traditional models such as the MELD score or ALBI (Albumin-Bilirubin) grade showed significantly lower predictive power, with AUCs between 0.502 and 0.644 (P < 0.050).
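
For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen PHLF patient receives a higher predicted risk than a randomly chosen non-PHLF patient. The sketch below computes it directly from that definition by comparing every positive-negative pair; the function name `roc_auc` and the example data are illustrative and do not come from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = PHLF, 0 = no PHLF.
example_labels = [1, 0, 1, 0, 0]
example_scores = [0.9, 0.3, 0.4, 0.5, 0.1]
example_auc = roc_auc(example_labels, example_scores)  # 5 of 6 pairs ranked correctly
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values near 0.5 for a conventional score indicate little discriminative value.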

Risk Stratification and Precision

A consensus risk-stratification framework was developed by integrating PILOT-Pre and PILOT-Intra predictions. This framework categorized patients into high-, intermediate-, and low-risk groups. The consensus high-risk group demonstrated a class-specific precision for PHLF events of 94.4% to 96.6%. Conversely, the consensus low-risk group showed a precision of 92.1% to 95.5% for non-PHLF events. At this level of precision, clinicians can identify with reasonable confidence which patients require intensive monitoring and which can be fast-tracked in their recovery.
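
The article does not spell out the consensus rule, so the following is only a sketch under a stated assumption: a patient is "high" when both models exceed a cutoff, "low" when neither does, and "intermediate" when they disagree. The cutoff of 0.5, the function names, and the example patients are all hypothetical. Class-specific precision is then the share of a group that had the outcome the group label predicts.

```python
from typing import Sequence

HIGH_CUTOFF = 0.5  # hypothetical probability cutoff, not taken from the study


def consensus_group(p_pre: float, p_intra: float,
                    cutoff: float = HIGH_CUTOFF) -> str:
    """Assumed rule: agreement gives high or low, disagreement gives intermediate."""
    pre_high = p_pre >= cutoff
    intra_high = p_intra >= cutoff
    if pre_high and intra_high:
        return "high"
    if not pre_high and not intra_high:
        return "low"
    return "intermediate"


def class_precision(groups: Sequence[str], outcomes: Sequence[int],
                    group: str, outcome: int) -> float:
    """Fraction of patients in `group` whose observed outcome equals `outcome`."""
    members = [y for g, y in zip(groups, outcomes) if g == group]
    if not members:
        raise ValueError(f"no patients in group {group!r}")
    return sum(1 for y in members if y == outcome) / len(members)


# Made-up patients: (PILOT-Pre probability, PILOT-Intra probability, PHLF outcome).
patients = [
    (0.8, 0.7, 1), (0.9, 0.6, 1), (0.7, 0.9, 0),
    (0.2, 0.1, 0), (0.3, 0.2, 0), (0.6, 0.3, 1),
]
groups = [consensus_group(pre, intra) for pre, intra, _ in patients]
outcomes = [y for _, _, y in patients]
high_precision = class_precision(groups, outcomes, "high", 1)  # PHLF among high-risk
low_precision = class_precision(groups, outcomes, "low", 0)    # non-PHLF among low-risk
```

Requiring agreement between the two models trades coverage for confidence: fewer patients land in the high or low groups, but those who do are labeled more reliably, while the rest fall into the intermediate group.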

Identification of Critical Risk Thresholds

SHAP analysis identified several key physiological and molecular thresholds associated with an increased risk of PHLF:

1. Serum Phosphorus: Levels >2.4 mg/dL on postoperative day 3 were identified as a significant predictor. A postoperative fall in serum phosphorus is commonly interpreted as a sign of active regeneration, so the timing and magnitude of phosphorus changes matter when interpreting this threshold.
2. RAMP2-GATA3 Ratio: A ratio threshold of 4.9 was linked to increased PHLF risk, reflecting a pro-inflammatory or anti-angiogenic state that hinders liver recovery.

Expert Commentary: Mechanistic Insights and Clinical Utility

The success of the PILOT architecture lies in its ability to quantify the biological reserve of the liver through the lens of regeneration. The inclusion of RAMP2 and VEGFA is particularly significant; these proteins are central to the sinusoidal endothelial cell signaling required for hepatocyte proliferation. When the RAMP2-GATA3 axis is disrupted, the orchestrated process of vascular and parenchymal growth is compromised, leading to functional failure of the remnant liver.

Furthermore, the time-phased approach acknowledges that surgery is a dynamic stressor. An intraoperative event, such as a prolonged period of ischemia or excessive blood loss, can shift a patient from a low preoperative risk to a high postoperative risk. By providing a prediction within six hours of surgery, PILOT could allow earlier pharmacological intervention, such as growth factors or specialized nutritional support, and optimized fluid management to mitigate the severity of PHLF.

However, limitations must be noted. Because the study is retrospective, there is an inherent risk of selection bias. While the external validation was robust, the integration of tissue-based biomarkers (like RAMP2-GATA3) requires rapid pathological processing, which may not be available in all clinical settings. Future prospective trials are necessary to determine whether interventions based on PILOT predictions directly improve patient survival.

Conclusion

The PILOT architecture represents a significant leap forward in personalized surgical care. By integrating the molecular underpinnings of liver regeneration with advanced machine learning, it provides a practical and highly accurate tool for predicting PHLF. This framework moves the field away from static, one-size-fits-all assessments toward a dynamic, data-driven approach that can identify high-risk patients in the earliest stages of the postoperative period, potentially transforming the management of patients undergoing major liver surgery.

Funding and Registration

This research was supported by the National Natural Science Foundation of China (82403243), the Program for National Postdoctoral Researchers Funding of China (GZC20231943), and the Shanghai Municipal Commission of Science and Technology (23Y11905900). The study is registered with ClinicalTrials.gov under the identifier NCT05779098.

References

1. Shen H, Yuan T, Si A, et al. Liver regeneration-associated machine learning architecture integrating time-phased predictions for post-hepatectomy liver failure. EClinicalMedicine. 2025;90:103661. doi:10.1016/j.eclinm.2025.103661.
2. Reissfelder C, Brand K, Sobotka C, et al. Prediction of posthepatectomy liver failure: A systematic review of existing scoring systems. HPB (Oxford). 2021;23(1):1-12.
3. Rahbari NN, Garden OJ, Padbury R, et al. Posthepatectomy liver failure: a definition and grading by the International Study Group of Liver Surgery (ISGLS). Surgery. 2011;149(5):713-724.
