Forecasting Dissolved Oxygen in Lakes Using Different AI Models
DOI:
https://doi.org/10.29304/jqcsm.2025.17.11959
Keywords:
Heart disease, features selection, mutual information, Random Forest
Abstract
Accurate prediction of the dissolved oxygen (DO) concentration in lakes is essential for assessing freshwater ecological conditions and for managing water quality. This study applies several Artificial Intelligence (AI) models to predict DO levels from input environmental data. The data are first pre-processed, which includes handling missing values and standardizing the features so that they share a common scale. The pre-processed dataset is then separated into input features and a target variable, the DO concentration. As part of model building, the relationship between each feature and the target variable is analyzed to identify the features that contribute most to prediction. To improve the accuracy of the machine learning models, Standard Scaler is used to transform the features of the training set and the test set to zero mean and unit variance. Several machine learning models are then compared to identify the best predictor of DO concentration. Linear Regression shows high predictive accuracy, with an R² of 0.9974 and a very low Mean Squared Error (MSE) of 9.6146e-05. Support Vector Regression (SVR) attains an R² of 0.9448 and an MSE of 0.0021, so it is usable but less accurate than Linear Regression. An Artificial Neural Network (ANN) with multiple hidden layers is also applied in order to model higher-order relationships in the data. The ANN achieves an R² of 0.995 and an MSE of 0.000161, which is close to the performance of Linear Regression.
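The standardization step and the two reported metrics (R² and MSE) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration: the data are synthetic (not the study's dataset), a single hypothetical feature (water temperature) stands in for the environmental inputs, the scaling statistics are taken from the training split only (a common practice that the abstract does not spell out), and a one-feature least-squares line stands in for the Linear Regression model.

```python
import math
import random


def fit_scaler(column):
    """Return (mean, std) of a column, using the population standard deviation."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    return mean, math.sqrt(variance)


def standardize(column, mean, std):
    """Shift and scale a column with previously fitted statistics."""
    return [(x - mean) / std for x in column]


def fit_line(x, y):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(y_true, y_pred):
    """Mean squared error between observations and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic data (hypothetical): DO falls roughly linearly with temperature.
rng = random.Random(0)
temperature = [rng.uniform(5.0, 30.0) for _ in range(200)]
do_mg_l = [14.0 - 0.25 * t + rng.gauss(0.0, 0.2) for t in temperature]

# 80/20 train/test split (the samples are already in random order).
x_train, x_test = temperature[:160], temperature[160:]
y_train, y_test = do_mg_l[:160], do_mg_l[160:]

# Fit the scaler on the training split, then apply it to both splits.
train_mean, train_std = fit_scaler(x_train)
x_train_s = standardize(x_train, train_mean, train_std)
x_test_s = standardize(x_test, train_mean, train_std)

# Fit on the training split and evaluate on the held-out test split.
slope, intercept = fit_line(x_train_s, y_train)
y_pred = [slope * x + intercept for x in x_test_s]

test_mse = mse(y_test, y_pred)
test_r2 = r2(y_test, y_pred)
print(f"test MSE = {test_mse:.4f}, test R2 = {test_r2:.4f}")
```

The same two metric functions could be applied to the predictions of any other regressor, such as SVR or an ANN, so that the models are compared on an identical test split. The numbers printed by this sketch describe only the synthetic data and are not comparable to the values reported in the abstract.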
License
Copyright (c) 2025 Zahraa Ch. Oleiwi, Karrar Khudhair Obayes, Nagham Kamil Hadi, Rahmah.Q yaseen, Asaad Jabbar Sahib

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.