Academic Journal

Large language models to identify social determinants of health in electronic health records

Bibliographic Details
Title: Large language models to identify social determinants of health in electronic health records
Authors: Guevara, Marco; Chen, Shan; Thomas, Spencer; Chaunzwa, Tafadzwa L.; Franco, Idalid; Kann, Benjamin H.; Moningi, Shalini; Qian, Jack M.; Goldstein, Madeleine; Harper, Susan; Aerts, Hugo J. W. L.; Catalano, Paul J.; Savova, Guergana K.; Mak, Raymond H.; Bitterman, Danielle S.
Source: Guevara, M, Chen, S, Thomas, S, Chaunzwa, TL, Franco, I, Kann, BH, Moningi, S, Qian, JM, Goldstein, M, Harper, S, Aerts, HJWL, Catalano, PJ, Savova, GK, Mak, RH & Bitterman, DS 2024, 'Large language models to identify social determinants of health in electronic health records', npj Digital Medicine, vol. 7, no. 1, 6. …
Publication Year: 2024
Collection: Maastricht University Research Publications
Description: Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71) and Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70). The effect of adding LLM-generated synthetic data to training varied across models and architectures, but it improved the performance of the smaller Flan-T5 models (ΔF1 +0.12 to +0.23). Our best fine-tuned models outperformed ChatGPT-family models in zero- and few-shot settings, except GPT-4 with 10-shot prompting for adverse SDoH. Fine-tuned models were less likely than ChatGPT to change their predictions when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p …)
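A minimal, hypothetical sketch of how an instruction-tuned Flan-T5 model could be prompted to label the six SDoH categories described above; the checkpoint name, prompt wording, and example sentence are illustrative assumptions, not the authors' released code.

# Sketch (Python, Hugging Face transformers): prompting Flan-T5 to label
# SDoH mentions in a clinical sentence. Prompt wording and example sentence
# are assumptions for illustration only.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-xl"  # model family reported as best for any-SDoH mentions
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

sentence = "Patient lives alone and is currently unemployed."  # hypothetical note text
prompt = (
    "List the social determinants of health mentioned in the sentence, "
    "choosing from: employment, housing, transportation, parental status, "
    "relationship, social support, or none.\n"
    f"Sentence: {sentence}\nLabels:"
)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))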
Document Type: article in journal/newspaper
Language: English
Relation: https://cris.maastrichtuniversity.nl/en/publications/8797d845-0164-44b8-b9bb-27688f8590ef
DOI: 10.1038/s41746-023-00970-0
Availability: https://cris.maastrichtuniversity.nl/en/publications/8797d845-0164-44b8-b9bb-27688f8590ef
https://doi.org/10.1038/s41746-023-00970-0
Rights: info:eu-repo/semantics/openAccess
Accession Number: edsbas.C254BDB2
Database: BASE