Explainable artificial intelligence and model calibration for water quality prediction
Date: 2022-08
Publisher: BRAC University
Author: Hellen, Nakayiza
Abstract
Water is a key necessity for the survival and sustenance of all living creatures. In
recent years, the quality of water has been adversely affected by pollutants and other
harmful wastes. This increased pollution degrades water quality, making it
unfit for use and, most importantly, compromising the safety of drinking water
and thus public health. Ecological safety and human health have steadily declined
due to hazardous pollutants such as chemicals and pathogens. By monitoring the
water quality parameters and forecasting them to obtain early warnings, we can
manage water quality across different water sources. Numerous innovative
technologies are gradually replacing manual labor and other established methods
in water quality evaluation. Recently, various machine learning and artificial intelligence techniques have been adopted for water quality modeling and have proven
very beneficial in the assessment and management of water resources. However, they
often suffer from high computational complexity, high prediction error, and
a black-box nature. Another major challenge faced by policy
makers and responsible public health authorities is the lack of a reasonably
generalizable model for predicting the quality of water for public consumption that also provides explanations identifying the most influential water quality parameters.
This work applies an Explainable Artificial Intelligence method, SHAP (SHapley
Additive exPlanations), to transparently assess the most important
water quality parameters that these models rely on when determining potability. We
also build a robust, generalizable, calibrated ensemble machine learning model for
water quality prediction based on water potability and other water quality metrics
from various water samples around the world. We then implement Automated Machine Learning with Stacked Ensembling to compare its results with those
achieved by the Soft Voting Ensemble Model. The simulated results provide
theoretical support to policy makers and should be of interest to water planners
assessing or maintaining water quality, improving sustainable pollution control and water and ecological management plans for water resources, and carrying out
early risk assessment and prevention in water environments in a simple, fast, and
cost-effective way that protects public health.
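
The two core ideas named in the abstract, soft voting and Shapley-based attribution, can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: the feature names, thresholds, probabilities, and the three toy "models" are hypothetical, and the attribution uses a single reference sample as the baseline, whereas practical SHAP implementations typically average over a background dataset and rely on approximations. Probability calibration is not shown.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, thresholds, and probabilities (not from the thesis).
FEATURES = ["ph", "turbidity", "chloramines"]

def model_a(x):
    return 0.9 if 6.5 <= x["ph"] <= 8.5 else 0.2

def model_b(x):
    return 0.8 if x["turbidity"] < 5.0 else 0.3

def model_c(x):
    return 0.7 if x["chloramines"] < 4.0 else 0.4

MODELS = [model_a, model_b, model_c]

def soft_vote(x):
    """Soft voting: average the members' predicted potability probabilities."""
    return sum(m(x) for m in MODELS) / len(MODELS)

def shapley_values(predict, sample, baseline):
    """Exact Shapley values of `predict` at `sample`, relative to `baseline`."""
    n = len(FEATURES)

    def value(subset):
        # Features in `subset` come from the sample, the rest from the baseline.
        mixed = {f: (sample[f] if f in subset else baseline[f]) for f in FEATURES}
        return predict(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

sample = {"ph": 7.2, "turbidity": 6.1, "chloramines": 3.0}    # hypothetical reading
baseline = {"ph": 5.0, "turbidity": 3.0, "chloramines": 6.0}  # hypothetical reference

phi = shapley_values(soft_vote, sample, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the ensemble's prediction for the sample and for the baseline, which is the additivity property that makes Shapley-based explanations easy to read. Exact enumeration grows exponentially with the number of features, so it is only practical for small feature sets like this one.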