Construction and efficacy validation of a prediction model for peritoneal dialysis-associated peritonitis

  • Abstract:
    OBJECTIVE  To construct and validate a machine learning-based risk prediction model for peritoneal dialysis-associated peritonitis (PDAP), identify key predictive factors, and provide a decision-making tool for the early clinical identification of high-risk patients.
    METHODS  A total of 271 patients who received peritoneal dialysis at Zhongshan Hospital of Traditional Chinese Medicine between January 2020 and December 2023 were enrolled and divided, according to guidelines, into a PDAP group (73 cases) and a non-PDAP group (198 cases). Demographic characteristics, biochemical indicators and comorbidity data were collected. After centering and scaling, the dataset was randomly split at a ratio of 3:1 into a training set (204 cases) and a test set (67 cases). Hyperparameters were tuned by five-fold cross-validation repeated three times, and five machine learning models were built: support vector machine, random forest, extreme gradient boosting, K-nearest neighbor and least absolute shrinkage and selection operator (LASSO) regression. The area under the curve (AUC) was the primary performance metric. The best model was selected by jointly assessing receiver operating characteristic curves, confusion matrices and decision curves, and was then checked with calibration curves.
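The 3:1 split and the centering and scaling step can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code. It assumes the split was stratified by outcome with the training share rounded up within each group. The abstract does not state this, although the reported 204/67 split is consistent with it. The abstract lists preprocessing before the split. Fitting the scaling parameters on the training rows only, as done here, is a common way to keep test-set information out of training and is this sketch's own choice. The albumin values are simulated placeholders, and `stratified_split` and `standardize` are hypothetical helper names.

```python
import math
import random
from statistics import mean, stdev


def stratified_split(labels, train_frac=0.75, seed=0):
    """Split indices into train/test, sampling each outcome class separately."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Assumption: the training share is rounded up within each class.
        n_train = math.ceil(train_frac * len(idx))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(test_idx)


def standardize(values, mu, sigma):
    """Center by mu and scale by sigma."""
    return [(v - mu) / sigma for v in values]


# Group sizes from the abstract: 73 PDAP (1) and 198 non-PDAP (0).
labels = [1] * 73 + [0] * 198
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 204 67

# Hypothetical albumin values (g/L), simulated only to exercise the code.
rng = random.Random(1)
albumin = [rng.gauss(35.0, 5.0) for _ in labels]

# Scaling parameters come from the training rows and are reused for the test rows.
train_values = [albumin[i] for i in train_idx]
mu, sigma = mean(train_values), stdev(train_values)
z_train = standardize(train_values, mu, sigma)
z_test = standardize([albumin[i] for i in test_idx], mu, sigma)
```

Under these assumptions the test set holds 18 PDAP and 49 non-PDAP cases, which keeps the PDAP proportion close to that of the full cohort (73/271).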
    RESULTS  Compared with the non-PDAP group, the PDAP group had a higher proportion of patients with hypertension and higher blood glucose levels, and lower levels of phosphorus, uric acid, transferrin saturation, potassium, calcium, albumin, alanine aminotransferase, total cholesterol, low-density lipoprotein cholesterol and total protein (P<0.05). In the test set, the LASSO model showed the best overall performance, with an AUC of 0.844, sensitivity of 0.611, specificity of 0.959, positive predictive value of 0.846, negative predictive value of 0.870, F1 score of 0.710 and accuracy of 0.866. The LASSO model also had the highest net benefit on the decision curves across threshold probabilities of 0.1 to 0.8. Calibration gave a slope of 1.001, an intercept of 0.095 and a Brier score of 0.119. The LASSO model selected 10 important variables: comorbid hypertension, total protein, total cholesterol, potassium, alanine aminotransferase, calcium, transferrin saturation, uric acid, phosphorus and albumin.
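The test-set metrics above all derive from a single 2×2 confusion matrix, and the sketch below shows the arithmetic. The abstract does not report the cell counts. The values used here (11 true positives, 2 false positives, 7 false negatives, 47 true negatives) are inferred from the reported rates and the test-set size of 67, and should be read as a reconstruction rather than published data. The `net_benefit` function implements the standard decision-curve formula. The classification cutoff behind the reported matrix is not stated, so the 0.5 threshold in the final line is purely illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics computed from the four cells of a confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def net_benefit(tp, fp, n, threshold):
    """Decision-curve net benefit at a given threshold probability."""
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Inferred cell counts (an assumption, not reported in the abstract).
metrics = classification_metrics(tp=11, fp=2, fn=7, tn=47)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Illustrative only: assumes the counts correspond to a cutoff of 0.5.
nb_example = net_benefit(tp=11, fp=2, n=67, threshold=0.5)
```

Rounded to three decimals, these counts reproduce the reported sensitivity, specificity, predictive values, F1 score and accuracy. The sensitivity of 0.611 corresponds to 7 of 18 PDAP cases in the test set being missed.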
    CONCLUSIONS  The LASSO regression model demonstrated excellent performance and clinical utility in PDAP risk prediction. The 10 key indicators it selected provide a quantitative basis for the early identification of high-risk patients and for early intervention.

     
