Prediction of risk factors for pulmonary infection in esophageal cancer patients undergoing chemoradiotherapy based on the random forest algorithm

    Abstract: OBJECTIVE To identify risk factors for pulmonary infection (PI) in patients with esophageal cancer (EC) undergoing chemoradiotherapy, to construct a random forest prediction model, and to formulate prevention and treatment strategies on that basis. METHODS Clinical data were retrospectively collected from 227 EC patients who received chemoradiotherapy at the First Affiliated Hospital of Nanchang University from May 2020 to December 2023. According to whether PI occurred, the patients were divided into an infection group (46 cases) and a non-infection group (181 cases). Risk factors were screened by multivariate logistic regression analysis, and a risk prediction model was constructed from them using R software (version 4.3.2). RESULTS Multivariate logistic regression analysis showed that smoking, age, malnutrition and tumor location were risk factors for PI in EC patients undergoing chemoradiotherapy (P<0.05). In the random forest model, the factors ranked in descending order of importance were tumor location, smoking, age and malnutrition. The AUC of the random forest prediction model was 0.774 (95%CI: 0.714 to 0.827) and that of the logistic regression model was 0.743 (95%CI: 0.681 to 0.798); comparison of the two AUCs gave Z=1.981, P=0.0476. CONCLUSIONS Tumor location, smoking, advanced age and malnutrition are independent risk factors for PI in EC patients undergoing chemoradiotherapy. The random forest prediction model constructed in this study shows relatively high accuracy and can provide a theoretical basis for clinically identifying high-risk patients and formulating targeted prevention and treatment strategies, so as to reduce the incidence of PI.
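As a quick consistency check on the reported model comparison, the sketch below converts the reported Z statistic into a two-sided P value under a standard normal distribution. It is only an illustration: the abstract does not name the test used to compare the two AUCs, so treating Z as (AUC difference) / (standard error), as in a DeLong-type test, is an assumption, and the function name `two_sided_p_from_z` is hypothetical. The study itself used R; Python is used here purely for the arithmetic.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided P value for a standard normal test statistic z."""
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values reported in the abstract
auc_rf = 0.774       # random forest model
auc_lr = 0.743       # logistic regression model
z_reported = 1.981   # reported Z for the AUC comparison

p_value = two_sided_p_from_z(z_reported)

# Assumption (not stated in the abstract): Z = (AUC difference) / SE,
# so the standard error of the difference can be backed out.
implied_se = (auc_rf - auc_lr) / z_reported

print(f"two-sided P = {p_value:.4f}")
print(f"implied SE of AUC difference = {implied_se:.4f}")
```

If the assumption holds, the computed P value should agree with the reported P=0.0476 to rounding, which suggests the reported Z and P are internally consistent; it says nothing about whether the underlying comparison was appropriate for these data.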

     
