Interpretable SHAP-Based Data-Driven Framework for Optimizing Treatment Performance Through Turbidity Dynamics Modelling
DOI:
https://doi.org/10.71170/tecoj.2025.1.3.pp16-35
Keywords:
Turbidity Prediction, Physicochemical Water Quality, Feature Importance Analysis, SHAP Interpretations
Abstract
Reliable prediction of turbidity (Turb) in surface-water treatment systems (WTS) is essential for sustaining safe drinking-water production, particularly in rapidly urbanizing regions where source-water quality (WQ) fluctuates significantly. This study develops a high-precision predictive framework for Turb at the Tamburawa Water Treatment Plant (TWTP) in Kano State, Nigeria, integrating optimized nonlinear models with robust feature-importance diagnostics to improve interpretability and operational usefulness. A comprehensive physicochemical dataset comprising electrical conductivity (EC), pH, hardness, alkalinity (Alk), temperature, alum dosage (Alum), free CO₂, and calcium (Ca) was preprocessed through rigorous screening, normalization, distributional evaluation, and stratified data partitioning. Model development involved systematic tuning of structural and kernel-based hyperparameters, yielding four high-performance model families, each built in two modelling groups (G1 and G2): Neural Network (NN-G1/G2), Bagged Trees (BT-G1/G2), Gaussian Process Regression (GPR-G1/G2), and Support Vector Machine (SVM-G1/G2). Across all configurations, the BT-G1 model delivered the strongest predictive generalization (testing RMSE = 0.0671, MAE = 0.0389), outperforming the NN, GPR, and SVM architectures and demonstrating high stability across both training and validation phases. SHAP analysis revealed EC as the dominant predictor, followed by free CO₂, while parameters such as Alk and pH made comparatively smaller but consistent contributions. The findings show that Turb dynamics at TWTP are strongly linked to ionic strength, flow-driven sediment loading, and chemical treatment behavior, consistent with known hydrochemical patterns.
Beyond model accuracy, the results highlight critical socioeconomic and environmental implications: more precise Turb forecasting can reduce treatment costs, improve the allocation of coagulants, and strengthen resilience against climate-driven fluctuations in raw WQ. The study concludes that interpretable predictive modelling provides a powerful tool for managing WQ risks in northern Nigeria and recommends integrating real-time monitoring and ensemble-learning extensions in future work to enhance operational decision-making.
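To make the SHAP ranking described in the abstract more concrete, the following is a minimal sketch of how Shapley-value attributions split one prediction among input features. It is not the study's implementation: the toy model, its coefficients, the sample values, and the all-zero baseline are hypothetical and chosen only for illustration, and the brute-force enumeration shown here is a stand-in for the far more efficient estimators that SHAP tooling typically uses for tree ensembles.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    A feature that is 'absent' from a coalition takes its baseline value.
    Cost grows as 2^n in the number of features, so this is for toy cases only.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_f = dict(without)
                with_f[name] = x[name]
                # Marginal contribution of `name` to this coalition
                total += weight * (predict(with_f) - predict(without))
        phi[name] = total
    return phi


# Hypothetical stand-in for a fitted turbidity model on normalized inputs.
# The coefficients are invented for illustration, not taken from the study.
def toy_model(s):
    return 0.8 * s["EC"] + 0.3 * s["CO2"] + 0.1 * s["pH"] + 0.2 * s["EC"] * s["CO2"]


baseline = {"EC": 0.0, "CO2": 0.0, "pH": 0.0}  # assumed reference point
sample = {"EC": 1.0, "CO2": 0.5, "pH": 0.2}    # assumed normalized sample

phi = shapley_values(toy_model, sample, baseline)
for feature, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature}: {value:.3f}")
```

The attributions sum to the difference between the prediction for the sample and the prediction at the baseline, and the interaction term between EC and CO2 is shared equally between those two features. Ranking features by the magnitude of such attributions, averaged over many samples, is the general idea behind a SHAP feature-importance ordering like the one reported above.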