Article

Benchmarking Machine Learning Models to Assist in the Prognosis of Tuberculosis

1 Programa de Pós-Graduação em Engenharia de Computação (PPGEC), Universidade de Pernambuco, Recife 50720-001, Pernambuco, Brazil
2 Business School, Dublin City University, Dublin 9, Dublin, Ireland
3 Fundação de Medicina Tropical Doutor Heitor Vieira Dourado, Manaus 69040-000, Amazonas, Brazil
* Author to whom correspondence should be addressed.
Academic Editors: Renato Umeton and Gregory Antell
Received: 8 March 2021 / Revised: 8 April 2021 / Accepted: 9 April 2021 / Published: 15 April 2021
(This article belongs to the Special Issue Machine Learning in Healthcare)
Tuberculosis (TB) is an airborne infectious disease caused by organisms in the Mycobacterium tuberculosis (Mtb) complex. In many low- and middle-income countries, TB remains a major cause of morbidity and mortality. Once a patient has been diagnosed with TB, it is critical that healthcare workers make the most appropriate treatment decision given the individual conditions of the patient and the likely course of the disease based on medical experience. Depending on the prognosis, delayed or inappropriate treatment can lead to poor outcomes, including the exacerbation of clinical symptoms, poor quality of life, and increased risk of death. This work benchmarks machine learning models to aid TB prognosis using a Brazilian health database of confirmed cases and deaths related to TB in the State of Amazonas. The goal is to predict the probability of death by TB, thus aiding the prognosis of TB and the associated treatment decision-making process. In its original form, the data set comprised 36,228 records and 130 fields but suffered from missing, incomplete, or incorrect data. Following data cleaning and preprocessing, a revised data set was generated comprising 24,015 records and 38 fields, including 22,876 reported cured TB patients and 1139 deaths by TB. To explore how the data imbalance impacts model performance, two controlled experiments were designed using (1) imbalanced and (2) balanced data sets. The best result for predicting TB mortality is achieved by the Gradient Boosting (GB) model using the balanced data set, and the ensemble model composed of the Random Forest (RF), GB and Multi-Layer Perceptron (MLP) models is the best model for predicting the cure class.
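To make the class imbalance described above concrete, the following minimal sketch computes the death rate from the reported counts and builds a balanced subset. Random undersampling is an assumption chosen for illustration, since the abstract does not state which balancing technique was used; the `undersample` helper and the placeholder labels are hypothetical and contain no real patient data.

```python
import random

# Class counts reported in the abstract.
N_CURED = 22_876
N_DEATHS = 1_139


def undersample(labels, seed=0):
    """Return sorted indices of a balanced subset of `labels`.

    Majority-class rows are dropped at random until every class has as many
    rows as the smallest class. This is an illustrative assumption, not
    necessarily the balancing method used in the article.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    target = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, target))
    return sorted(kept)


# Placeholder labels only (0 = cured, 1 = death by TB).
labels = [0] * N_CURED + [1] * N_DEATHS
death_rate = N_DEATHS / len(labels)

balanced_idx = undersample(labels)
balanced_labels = [labels[i] for i in balanced_idx]

print(f"records: {len(labels)}, death rate: {death_rate:.2%}")
print(f"balanced records: {len(balanced_labels)}")
```

With fewer than 5% of records in the death class, a model can score high accuracy by always predicting "cured", which is one reason to compare results on both the imbalanced and the balanced data sets.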
Keywords: tuberculosis; neglected tropical disease; prognosis; machine learning; ensemble model; imbalanced data sets; feature selection; random search; benchmark
MDPI and ACS Style

Lino Ferreira da Silva Barros, M.H.; Oliveira Alves, G.; Morais Florêncio Souza, L.; da Silva Rocha, E.; Lorenzato de Oliveira, J.F.; Lynn, T.; Sampaio, V.; Endo, P.T. Benchmarking Machine Learning Models to Assist in the Prognosis of Tuberculosis. Informatics 2021, 8, 27. https://0-doi-org.brum.beds.ac.uk/10.3390/informatics8020027
