Predicting COVID-19 disease outcome and post-recovery conditions using machine learning
Abstract
With COVID-19 still running rampant across the world, accurate diagnosis of patients and proper management of medical resources are paramount to delivering
proper care to those who need it most. To this end, prediction models built on
various machine learning algorithms are being developed across the world.
Each deals with certain variables that help predict the disease outcome, such as
comorbidities, symptoms, age, and sex. Some models have also been built to
predict the chances of a COVID-19 patient developing lasting medical conditions
after recovery. The goal of this research, then, is to create a single prediction
model that takes all the aforementioned dimensions into account across
three timelines. The model will predict whether a person has contracted
COVID-19 based on the preliminary symptoms they show (Timeline 1), predict the
chances of a COVID-19 patient developing more serious symptoms based on their
medical history (Timeline 2), and predict the chances of a patient developing
lasting conditions after recovering from COVID-19 (Timeline 3). To
accomplish this, we use three machine learning algorithms: Random Forest, Naïve
Bayes and K-nearest Neighbors. For implementation and testing of the model, data
on COVID-19 patients is split into train and test sets, and the aforementioned algorithms are fit on the training set. Their performance is then evaluated on the test set. Specific features of the
dataset are also analyzed at a deeper level in order to gain a better understanding of
how the virus behaves in certain conditions. Having such a model in place will not
only help us direct medical resources to the patients who need the most attention, but
will also provide a clearer understanding of the nature of the virus and how it affects
a specific patient.
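
The train/test workflow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for patient records, and picks an 80/20 split, Gaussian Naïve Bayes, accuracy as the metric, and all hyperparameters arbitrarily, since the abstract specifies none of these.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for a patient table (NOT real COVID-19 data).
# Columns would correspond to features such as symptoms, age, comorbidities.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Assumed 80/20 split; stratify keeps the class balance similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The three algorithm families named in the abstract; hyperparameters are assumptions.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Naive Bayes": GaussianNB(),
    "K-nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model on the training set and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

In a three-timeline setting, the same loop would presumably be repeated once per target (infection, severity, post-recovery condition), each with its own feature set and labels.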