Development and evaluation of a machine learning-based prediction model for hospital-acquired colonization by carbapenem-resistant Enterobacteriaceae in the ICU

  • Abstract:
    OBJECTIVE  To develop a machine learning model for predicting hospital-acquired colonization by carbapenem-resistant Enterobacteriaceae (CRE) in intensive care unit (ICU) patients, providing a reference for the prevention and control of hospital-acquired CRE infections.
    METHODS  We collected data on 552 ICU patients screened for CRE at Wenzhou Integrated Traditional Chinese and Western Medicine Hospital from Jan. 2020 to Aug. 2025. Patients whose CRE screening changed from negative (≥1 time) to positive (≥1 time) were assigned to the positive group (n=71), and patients whose CRE screening remained negative (≥3 times) were assigned to the control group (n=57). The dataset was randomly divided into training and testing sets at a 3∶1 ratio. Recursive feature elimination was applied to 25 candidate predictors in the training set, and six machine learning models were built: logistic regression, support vector machine (SVM), artificial neural network (ANN), extreme gradient boosting (XGBoost), decision tree and random forest. Model performance was evaluated with the area under the curve (AUC) and other metrics to determine the optimal model.
    RESULTS  The random forest model performed best, with an AUC of 0.950 in the training set and 0.826 in the testing set. The predictors identified were age, length of ICU stay, albumin level, days of combined antimicrobial use, mechanical ventilation, days of carbapenem use and total days of antimicrobial use. An online calculation tool was developed based on the random forest model.
    CONCLUSIONS  The random forest model performed best among the six models. This model and the online tool can support early and accurate identification of patients at high risk of hospital-acquired CRE colonization, enabling prompt contact isolation.
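
    The 3∶1 random split and the AUC metric named in METHODS can be illustrated with a minimal sketch. It uses only the Python standard library and the group sizes reported in the abstract (71 positive, 57 control). The function names `split_3_to_1` and `auc`, the fixed seed and the label-only data are assumptions made for illustration; this is not the authors' code, and it does not reproduce the study data, the feature selection step or any of the six models.

```python
import random

def split_3_to_1(records, seed=0):
    """Randomly split records into a training set (3/4) and a testing set (1/4)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    n_train = round(len(shuffled) * 3 / 4)
    return shuffled[:n_train], shuffled[n_train:]

def auc(labels, scores):
    """AUC as the probability that a positive case scores above a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Group sizes from the abstract: 71 positive, 57 control (labels only, no patient data).
labels = [1] * 71 + [0] * 57
train, test = split_3_to_1(labels)
print(len(train), len(test))  # 96 32
```

    With 128 patients, a 3∶1 split leaves 96 cases for training and 32 for testing. Calling `auc` on each set's labels and a model's predicted risks would give the training and testing AUC values of the kind reported in RESULTS.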

     
