Investigating Comorbidities from Clinical Texts: A Propensity Score Approach
- Authors: Alessandro Albano, Chiara Di Maria, Mariangela Sciandra, Antonella Plaia
- Year of publication: 2025
- Type: Conference proceedings contribution published in a volume
- OA Link: http://hdl.handle.net/10447/683689
Abstract
Understanding comorbidity patterns is crucial for improving patient outcomes and optimising healthcare strategies. In this study, we propose an approach to detect comorbidities of two diseases from clinical discharge notes. To account for the complexities of textual data, we summarise the information through propensity scores, which represent the probability of receiving a certain diagnosis conditional on the extracted text. These scores are then used as covariates in a logistic regression model to explore the association between diseases. Specifically, we compare models trained on TF-IDF weighted document-term matrices and text embeddings, employing LASSO regression, XGBoost, and multilayer perceptrons (MLP). Our results, obtained by applying this method to study the association between diabetes and Chronic Kidney Disease, demonstrate the potential of Natural Language Processing (NLP) and machine learning techniques in advancing observational healthcare research.
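The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The following points are assumptions made for illustration only: the toy notes and labels are invented, the propensity model is an L1-penalised (LASSO-type) logistic regression on TF-IDF features fitted by proximal gradient descent, the outcome model regresses chronic kidney disease on the diabetes indicator plus the propensity score, and everything is fitted in-sample for brevity. The function names `tfidf`, `fit_logistic` and `predict_proba` are hypothetical helpers, and the sketch uses only the Python standard library.

```python
import math
from collections import Counter

def tfidf(docs):
    """L2-normalised TF-IDF rows with smoothed IDF (one common convention)."""
    tokens = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    n = len(docs)
    df = Counter(t for doc in tokens for t in set(doc))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for doc in tokens:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        rows.append([v / norm for v in vec])
    return rows

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]

def fit_logistic(X, y, l1=0.0, lr=0.5, epochs=2000):
    """Logistic regression by proximal gradient descent.
    l1 > 0 adds a LASSO penalty on the weights (intercept unpenalised)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        err = [pi - yi for pi, yi in zip(predict_proba(X, w, b), y)]
        b -= lr * sum(err) / n
        for j in range(p):
            grad = sum(e * x[j] for e, x in zip(err, X)) / n
            step = w[j] - lr * grad
            # soft-thresholding implements the L1 penalty
            w[j] = math.copysign(max(abs(step) - lr * l1, 0.0), step)
    return w, b

# Invented toy discharge notes and labels (illustration only, not real data).
notes = [
    "insulin therapy elevated glucose reduced renal function",
    "metformin started elevated glucose",
    "insulin adjusted proteinuria dialysis planned",
    "elevated glucose diet counselling",
    "fractured wrist cast applied",
    "pneumonia treated antibiotics",
    "proteinuria reduced renal function dialysis planned",
    "chest pain ruled out discharged",
]
diabetes = [1, 1, 1, 1, 0, 0, 0, 0]
ckd = [1, 0, 1, 0, 0, 0, 1, 0]

# Stage 1: propensity score = estimated P(diabetes = 1 | text).
X_text = tfidf(notes)
w_ps, b_ps = fit_logistic(X_text, diabetes, l1=0.01)
ps = predict_proba(X_text, w_ps, b_ps)

# Stage 2: outcome model with the propensity score as a covariate
# (assumed specification: logit P(CKD) = b0 + b1 * diabetes + b2 * ps).
X_out = [[float(d), p] for d, p in zip(diabetes, ps)]
w_out, b_out = fit_logistic(X_out, ckd)
odds_ratio = math.exp(w_out[0])

print("propensity scores:", [round(p, 3) for p in ps])
print("adjusted odds ratio for diabetes:", round(odds_ratio, 3))
```

With real data, the propensity model would normally be tuned and evaluated out of sample, and the TF-IDF and LASSO steps could be replaced by text embeddings, XGBoost or an MLP, as the abstract describes. The numbers printed by this toy example carry no clinical meaning.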