Machine learning model for early prediction of adverse outcomes in patients with severe fever with thrombocytopenia syndrome

  • Abstract:
    OBJECTIVE  To develop a machine learning model based on clinical markers obtained within 24 h of admission that predicts the risk of adverse in-hospital outcomes in patients with severe fever with thrombocytopenia syndrome (SFTS), enabling early risk stratification.
    METHODS  In a retrospective cohort study, 430 SFTS patients (346 in the improved group and 84 in the poor-prognosis group) admitted to the First Affiliated Hospital of Nanjing Medical University from April 2022 to May 2025 were enrolled. Clinical characteristics at admission and laboratory markers measured within 24 h were collected. LASSO regression was applied to the training set for feature selection, and five algorithms (extreme gradient boosting, gradient boosting machine, random forest, support vector machine and logistic regression) were used to build prediction models, whose performance was evaluated on an independent test set. The SHAP method was used to make the model's predictions interpretable.
    RESULTS  LASSO regression identified seven core predictors: age, SFTS virus nucleic acid load (log10-transformed, lgSFTSV), ferritin (Fer), prothrombin time (PT), serum creatinine (Scr), lactate dehydrogenase (LDH) and procalcitonin (PCT). The support vector machine (SVM) model showed the best overall performance on the test set (AUC=0.865, 95%CI: 0.781–0.948); the other models achieved AUCs of 0.832–0.858. SHAP analysis showed that age and lgSFTSV were the two most influential features in the model's predictions.
    CONCLUSIONS  This study developed and validated an interpretable machine learning model based on the SVM algorithm that predicts the risk of adverse outcomes in SFTS patients from markers obtained within 24 h of admission (test-set AUC=0.865). The model integrates disease-specific markers such as viral load and confirms the predictive value of age, lgSFTSV, and markers of coagulation and organ injury.
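    The abstract reports a test-set AUC with a 95% CI but does not state how the interval was obtained. As a minimal sketch, assuming a percentile bootstrap (the study may have used a different method, such as DeLong's), the calculation could look like the following. The helper names `auc_score` and `bootstrap_auc_ci` are hypothetical, and the labels and scores are toy values for illustration, not study data.

```python
import random


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUC (assumed method)."""
    auc_score(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(round((alpha / 2) * n_boot))]
    hi = stats[int(round((1 - alpha / 2) * n_boot)) - 1]
    return lo, hi


if __name__ == "__main__":
    # Toy data: 1 = adverse outcome, 0 = improved; scores are model outputs.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(auc_score(toy_labels, toy_scores))  # → 0.75
    print(bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200, seed=1))
```

    With only 84 adverse outcomes in the whole cohort, the test set contains few positive cases, which is consistent with the fairly wide interval reported (0.781–0.948); a resampling-based interval makes that uncertainty explicit.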

     
