Title: Machine learning for predicting Chagas disease infection in rural areas of Brazil
Authors: De Rose Ghilardi F; Silva G; Vieira TM; Mota A; Bierrenbach AL; Damasceno RF; Oliveira LCD; Dias Porto Chiavegatto Filho A; Sabino E; Carvalho MS
Year: 2024
Publisher: Public Library of Science (PLoS)
Volume: 18
Pages: 1-11
Keywords: Chagas' disease; Machine learning algorithms; Early detection; Diagnosis; Prevention
URL: https://journals.plos.org/plosntds/article/file?id=10.1371/journal.pntd.0012026&type=printable

Introduction: Chagas disease is a severe parasitic illness that is prevalent in Latin America and often goes unaddressed. Early detection and treatment are critical in preventing the progression of the illness and its associated life-threatening complications. In recent years, machine learning algorithms have emerged as powerful tools for disease prediction and diagnosis.

Methods: In this study, we developed machine learning algorithms to predict the risk of Chagas disease based on five general factors: age, gender, history of living in a mud or wooden house, history of being bitten by a triatomine bug, and family history of Chagas disease. We analyzed data from the Retrovirus Epidemiology Donor Study (REDS) to train five popular machine learning algorithms. The sample comprised 2,006 patients, divided into 75% for training and 25% for testing algorithm performance. We evaluated the model performance using precision, recall, and AUC-ROC metrics.
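
The evaluation described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names (split_indices, precision_recall, auc_roc), the random seed, and the toy labels and scores are all assumptions made for illustration, and the AUC-ROC is computed with the standard pairwise-ranking definition rather than any particular library.

```python
import random

def split_indices(n, test_fraction=0.25, seed=0):
    """Shuffle indices 0..n-1 and return (train, test) index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_test = int(round(n * test_fraction))
    return indices[n_test:], indices[:n_test]

def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = infected, 0 = not infected)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def auc_roc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 75% / 25% split of a sample of 2,006 records.
train_idx, test_idx = split_indices(2006, test_fraction=0.25)

# Toy labels and predicted risks (made up, not study data).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_precision, toy_recall = precision_recall(toy_labels, toy_preds)
toy_auc = auc_roc(toy_labels, toy_scores)
```

Precision and recall depend on the classification threshold (0.5 in the toy example), whereas AUC-ROC summarizes ranking quality across all thresholds, which is why the abstract reports all three.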

Results: The AdaBoost algorithm yielded an AUC-ROC of 0.772, a precision of 0.199, and a recall of 0.612. We simulated the decision boundary using various thresholds and observed that, in this dataset, a threshold of 0.45 resulted in 100% recall. This finding suggests that employing such a threshold could potentially save 22.5% of the cost associated with mass testing for Chagas disease.
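
The threshold analysis can be sketched as follows. This is a hedged illustration, not the study's procedure: it assumes that a person is referred for testing when the predicted risk is at or above the threshold, and that the cost saved is proportional to the share of people below it. The function names, the threshold grid, and the toy data are hypothetical and do not reproduce the reported 0.45 or 22.5% figures.

```python
def recall_at(y_true, scores, threshold):
    """Share of truly infected people whose predicted risk is >= threshold."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    if not positives:
        raise ValueError("recall is undefined without positive cases")
    return sum(1 for s in positives if s >= threshold) / len(positives)

def fraction_not_tested(scores, threshold):
    """Share of all people below the threshold, i.e. tests that would be skipped."""
    return sum(1 for s in scores if s < threshold) / len(scores)

def highest_full_recall_threshold(y_true, scores, grid):
    """Largest threshold on the grid that still catches every positive case."""
    candidates = [t for t in grid if recall_at(y_true, scores, t) == 1.0]
    return max(candidates) if candidates else None

# Toy labels and predicted risks (made up, not study data).
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0]
toy_scores = [0.62, 0.30, 0.48, 0.37, 0.41, 0.55, 0.20, 0.44]

# Candidate thresholds 0.05, 0.10, ..., 0.95 (an assumed grid).
grid = [i / 100 for i in range(5, 100, 5)]

best_threshold = highest_full_recall_threshold(toy_labels, toy_scores, grid)
share_spared = fraction_not_tested(toy_scores, best_threshold)
```

On the toy data the sweep selects the highest threshold that misses no infected person and reports the share of the sample that would not need a confirmatory test. Because such a threshold is chosen on the same data it is evaluated on, it would need to be checked on new data before use.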

Conclusion: Our findings highlight the potential of applying machine learning to improve the sensitivity and effectiveness of Chagas disease diagnosis and prevention. Furthermore, we emphasize the importance of integrating socio-demographic and environmental factors into neglected disease prediction models to enhance their performance.

ISSN: 1935-2735