Advanced Data Driven Prediction of BOD in the Ganga River Using Multivariate Regression and Nonlinear Bilayered Neural Network Ensembles

Authors

  • Usman U. Aliyu, Sharda University, India
  • Abdulhayat Muhammad Jibrin, King Fahd University of Petroleum and Minerals
  • Abubakar Sabo Baba, Department of Civil Engineering, Federal University Dutsin-Ma, Katsina State
  • Ismail A. Aminu, Department of Physics, Northwest University, Kano State
  • Sukalpaa Chaki, Civil Engineering Department, Sharda University, Greater Noida, India
  • Rakesh Kumar, Civil Engineering Department, Sharda University, Greater Noida, India

DOI:

https://doi.org/10.71170/tecoj.2025.1.3.pp1-15

Keywords:

Biochemical Oxygen Demand, Ganga River, Machine Learning, Neural Network Ensembles, Water Quality

Abstract

Accurate prediction of Biochemical Oxygen Demand (BOD) is essential for understanding pollution dynamics and supporting effective water quality management in the Ganga River. This study develops a data-driven modeling framework that integrates multivariate regression models with neural network ensemble techniques to forecast BOD concentrations from physicochemical and microbial water quality indicators. Four regression models, namely Fine-Tree Linear Regression (FLR), Interactive Linear Regression (ILR), Robust Linear Regression (RLR), and Stepwise Linear Regression (SWLR), were developed using combinations of dissolved oxygen (DO), pH, conductivity, total coliform (TC), and fecal coliform (FC). Correlation analysis revealed weak to moderate positive associations of BOD with pH (r = 0.26), conductivity (r = 0.23), and DO (r = 0.06), whereas the microbial indicators (TC and FC) showed weak negative correlations, indicating the need for modeling frameworks that go beyond simple linear relationships. Model evaluation based on mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and symmetric mean absolute percentage error (SMAPE) showed that the FLR models outperformed the other regression models, with FLR-4 producing the lowest testing errors (MSE = 0.0043; RMSE = 0.0657) among all linear regressors. Integrating the regression outputs into neural network ensembles enhanced prediction accuracy further. The Bilayered Neural Ensemble (BNE) models consistently performed best, with BNE-RLR (testing MSE = 0.0015; RMSE = 0.0381) and BNE-ILR (testing MSE = 0.0015; RMSE = 0.0392) providing the highest accuracy and stability across all performance indices. The findings demonstrate that coupling multivariate regression with neural network ensemble modeling provides a robust and highly accurate framework for BOD prediction in the Ganga River and similar river systems.
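
For readers who want to reproduce the four evaluation indices named in the abstract, the following is a minimal sketch, not the authors' code. The function name, the sample values, and the SMAPE variant used (mean of 2|error| / (|observed| + |predicted|), expressed as a percentage) are assumptions; the abstract does not state which SMAPE definition was applied.

```python
import math


def error_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and SMAPE for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Assumed SMAPE variant: mean of 2|e| / (|t| + |p|), as a percentage.
    # Pairs where both values are zero contribute 0 to the mean.
    smape_terms = [
        0.0 if (abs(t) + abs(p)) == 0 else 2 * abs(p - t) / (abs(t) + abs(p))
        for t, p in zip(y_true, y_pred)
    ]
    smape = 100 * sum(smape_terms) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "SMAPE": smape}


# Hypothetical normalized BOD values, for illustration only (not study data).
observed = [0.20, 0.35, 0.50, 0.65]
predicted = [0.25, 0.30, 0.55, 0.60]
metrics = error_metrics(observed, predicted)
```

Because RMSE is the square root of MSE, the two are redundant as ranking criteria; MAE and SMAPE add information about typical and relative error size, respectively.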

Author Biographies

Abdulhayat Muhammad Jibrin, King Fahd University of Petroleum and Minerals

PhD Scholar

Abubakar Sabo Baba, Department of Civil Engineering, Federal University Dutsin-Ma, Katsina State

PhD research scholar

Rakesh Kumar, Civil Engineering Department, Sharda University, Greater Noida, India

Professor

Published

2025-12-24