Advanced Data Driven Prediction of BOD in the Ganga River Using Multivariate Regression and Nonlinear Bilayered Neural Network Ensembles
DOI:
https://doi.org/10.71170/tecoj.2025.1.3.pp1-15
Keywords:
Biochemical Oxygen Demand, Ganga River, Machine Learning, Neural Network Ensembles, Water Quality
Abstract
Accurate prediction of Biochemical Oxygen Demand (BOD) is essential for understanding pollution dynamics and supporting effective water quality management in the Ganga River. This study develops a data-driven modeling framework that integrates multivariate regression models with neural network ensemble techniques to forecast BOD concentrations from physicochemical and microbial water quality indicators. Four regression models, namely Fine-Tree Linear Regression (FLR), Interactive Linear Regression (ILR), Robust Linear Regression (RLR), and Stepwise Linear Regression (SWLR), were developed using combinations of dissolved oxygen (DO), pH, conductivity, total coliform (TC), and fecal coliform (FC). Correlation analysis revealed weak-to-moderate positive associations of BOD with pH (r = 0.26) and conductivity (r = 0.23) and a near-zero association with dissolved oxygen (r = 0.06), whereas the microbial indicators showed weak negative correlations. These weak linear relationships indicate the need for modeling frameworks that go beyond simple linear regression. Model evaluation based on mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and symmetric mean absolute percentage error (SMAPE) showed that the FLR models outperformed the other regression models, with FLR-4 producing the lowest testing errors (MSE = 0.0043; RMSE = 0.0657) among all linear regressors. Integrating the regression outputs into neural network ensembles substantially improved prediction accuracy. The Bilayered Neural Ensemble (BNE) models consistently performed best, with BNE-RLR (testing MSE = 0.0015; RMSE = 0.0381) and BNE-ILR (testing MSE = 0.0015; RMSE = 0.0392) providing the highest accuracy and stability across all performance indices. The findings demonstrate that coupling multivariate regression with neural network ensemble modeling provides a robust and highly accurate framework for BOD prediction in the Ganga River and similar river systems.
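
As an illustration of the four error indices named in the abstract, the following is a minimal Python sketch of how MSE, RMSE, MAE, and SMAPE could be computed for a set of observed and predicted BOD values. The abstract does not state which SMAPE variant the study uses, so the percentage form with the mean of absolute values in the denominator is an assumption here; the function names and the sample values are hypothetical and are not taken from the study's data.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: 100/n * sum(|t - p| / ((|t| + |p|) / 2)).
    Pairs where both values are zero contribute 0 to the sum.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        if denom > 0:
            total += abs(t - p) / denom
    return 100 * total / len(y_true)

# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]

print("MSE  :", mse(observed, predicted))
print("RMSE :", rmse(observed, predicted))
print("MAE  :", mae(observed, predicted))
print("SMAPE:", smape(observed, predicted))
```

Because RMSE is the square root of MSE, the reported pairs can be cross-checked: the square root of 0.0043 is about 0.0656, which agrees with the reported RMSE of 0.0657 to within the rounding of the MSE.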