Automated En Masse Machine Learning Model Generation Shows Comparable Performance as Classic Regression Models for Predicting Delayed Graft Function in Renal Allografts

Bibliographic Details
Title: Automated En Masse Machine Learning Model Generation Shows Comparable Performance as Classic Regression Models for Predicting Delayed Graft Function in Renal Allografts
Authors: Samer Albahra, Kuang-Yu Jen, Hooman H. Rashidi, Nam K. Tran, Felicia Yen, Junichiro Sageshima, Ling Xin Chen
Source: Transplantation. 105:2646-2654
Publication Information: Ovid Technologies (Wolters Kluwer Health), 2021.
Publication Year: 2021
Subject Terms: Computer science, Delayed Graft Function, 030230 surgery, Machine learning, computer.software_genre, Logistic regression, Machine Learning, 03 medical and health sciences, Naive Bayes classifier, 0302 clinical medicine, Feature (machine learning), Humans, Transplantation, Receiver operating characteristic, business.industry, Bayes Theorem, Regression analysis, Allografts, Kidney Transplantation, Random forest, Support vector machine, Logistic Models, 030211 gastroenterology & hepatology, Artificial intelligence, Gradient boosting, business, computer
Description: BACKGROUND Several groups have previously developed logistic regression models for predicting delayed graft function (DGF). In this study, we used an automated machine learning (ML) modeling pipeline to generate and optimize DGF prediction models en masse. METHODS Deceased donor renal transplants at our institution from 2010 to 2018 were included. Input data consisted of 21 donor features from United Network for Organ Sharing. A training set composed of ~50%/50% split in DGF-positive and DGF-negative cases was used to generate 400 869 models. Each model was based on 1 of 7 ML algorithms (gradient boosting machine, k-nearest neighbor, logistic regression, neural network, naive Bayes, random forest, support vector machine) with various combinations of feature sets and hyperparameter values. Performance of each model was based on a separate secondary test dataset and assessed by common statistical metrics. RESULTS The best performing models were based on neural network algorithms, with the highest area under the receiver operating characteristic curve of 0.7595. This model used 10 out of the original 21 donor features, including age, height, weight, ethnicity, serum creatinine, blood urea nitrogen, hypertension history, donation after cardiac death status, cause of death, and cold ischemia time. With the same donor data, the highest area under the receiver operating characteristic curve for logistic regression models was 0.7484, using all donor features. CONCLUSIONS Our automated en masse ML modeling approach was able to rapidly generate ML models for DGF prediction. The performance of the ML models was comparable with classic logistic regression models.
ISSN: 0041-1337
DOI: 10.1097/tp.0000000000003640
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5a4459ab3c45e97cbb18be4970837add
https://doi.org/10.1097/tp.0000000000003640
Accession Number: edsair.doi.dedup.....5a4459ab3c45e97cbb18be4970837add
Database: OpenAIRE
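
Illustrative sketch (not part of the record): the abstract describes generating many models from combinations of algorithm, feature subset, and hyperparameter values, then ranking them by area under the ROC curve on a separate test set. The Python below is a minimal sketch of that idea only. It is not the authors' pipeline: it uses a single stand-in algorithm (a plain k-nearest-neighbor scorer), a tiny synthetic dataset invented for illustration, and hypothetical names (`roc_auc`, `knn_score`, `search`).

```python
from itertools import combinations


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A positive/negative pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def knn_score(train_X, train_y, x, k):
    """Fraction of positive labels among the k nearest training rows."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    return sum(y for _, y in nearest[:k]) / k


def search(train_X, train_y, test_X, test_y, n_features, ks):
    """Score every (feature subset, k) combination on the held-out test set."""
    results = []
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            tr = [[row[i] for i in subset] for row in train_X]
            te = [[row[i] for i in subset] for row in test_X]
            for k in ks:
                scores = [knn_score(tr, train_y, x, k) for x in te]
                results.append((roc_auc(test_y, scores), subset, k))
    # Best AUC first.
    return sorted(results, reverse=True)


# Synthetic toy data (assumption): a balanced training set with 3 features,
# of which only feature 0 separates the two classes.
train_X = [[1.0, 5.0, 0.2], [1.2, 1.0, 0.9], [0.9, 3.0, 0.5],
           [3.0, 4.8, 0.3], [3.2, 1.1, 0.8], [2.9, 3.1, 0.4]]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [[1.1, 4.0, 0.6], [0.8, 2.0, 0.1], [3.1, 2.2, 0.7], [2.8, 4.1, 0.2]]
test_y = [0, 0, 1, 1]

results = search(train_X, train_y, test_X, test_y, n_features=3, ks=[1, 3])
best_auc, best_subset, best_k = results[0]
print(len(results), "models; best AUC", best_auc, "features", best_subset, "k", best_k)
```

With 3 features and 2 values of k, the search evaluates 7 x 2 = 14 candidate models. Enumerating every subset grows exponentially with the number of features, so a search over 21 features and 7 algorithms would presumably need sampling or pruning rather than this exhaustive loop.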